TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

Database development mistakes made by application developers

154 点作者 phalien超过 14 年前

13 条评论

mcantor超过 14 年前
I never understood the groans of dismay from other developers when the subject of writing SQL comes up. I'm a dyed-in-the-wool application developer, but I <i>love</i> writing SQL.<p>Unlike HTML &#38; CSS, which seem designed to spread into a greater morass of obtuse and interdependant hacks with every trial-and-error iteration, SQL has always felt elegant to me.<p>It takes just as much time to craft an SQL query as it does to translate a design into HTML/CSS, but 99% of the time I can sit back and be proud of my handiwork; even re-using it later.<p>On the other hand, I have never completed a webpage without looking at the resulting markup and thinking, "What an ugly goddamn kluge." SASS and JQuery and HTML5 and CSS3 don't make a difference. Maybe I'm just bad at it, though.<p>I eschew ORMs at every opportunity, and it is mysterious to me why "writes SQL for you" is such a common "feature." I would liken it to a tool which "paints paintings for you" or "designs your app for you." Why would I entrust a computer program to do something which requires the creative spark of a human?
评论 #2001226 未加载
评论 #2001018 未加载
评论 #2001095 未加载
cubicle67超过 14 年前
For what it's worth, here's a few from me<p><i>Indexes</i>. Everybody says it, but I'm still surprised at the number of devs who have next to no idea about index and how they work.<p>Few guides: Every index adds a write penalty, so don't just add them to everything. If you've got a compound index, it will only be used if the query also uses all the fields in the index (If you build an index on surname, firstname and dob, but execute select * from users where firstname="Bob" and surname="Jones", the index won't be used). Benchmark the difference each index makes, and stand back in awe at the improvement a few well places indexes can contribute.<p>[Edit: as pointed out below, my experience is well out of date and it looks like there's a number of database engines where the compound index thing doesn't apply]<p><i>Learn to think in sets</i>. Most devs are used to operating on discrete pieces of data; we're not used to thinking in sets, and so writing efficient SQL doesn't come naturally. Put in the effort and learn it and you'll be well rewarded. SQL is built around sets of data, and can almost assuredly do stuff faster than your code can. Hint: anytime you find yourself getting the result of a query, iterating over the result set and executing another query for each result <i>you're doing it wrong</i><p><i>Learn how your ORM works</i>. If you're using an ORM (and most of us do), learn how to use it efficiently. This is where having a good working knowledge of SQL helps, because you can then find out how to help your ORM create good SQL. If you're using ActiveRecord, understand how to correctly use :conditions, :include etc, and what they're doing behind the scenes. Ideally you want your ORM to make one call to the DB, and to retrieve everything you need in that one call (calls are expensive, even for very small queries). Poorly written code causes the DB to get hit constantly. Not good,
评论 #2000215 未加载
评论 #2000277 未加载
评论 #2000890 未加载
评论 #2000376 未加载
评论 #2000201 未加载
评论 #2000284 未加载
CWIZO超过 14 年前
Don't know if this was mentioned in the answers (haven't read them all):<p>Don't assume that once you make your query run perfectly it will stay that way for ever. Once row count grows, your query might start to perform very poorly. Happened to me just recently, when a query was running in the 0.0001s range, but over night crawled to a halt and took 1 minute(!) to execute. Always pay attention to your querys and monitor if anyone of them is starting to slow down.
评论 #2000840 未加载
评论 #2000174 未加载
thibaut_barrere超过 14 年前
Also: don't make assumptions about how you believe a database works, just because you've used another one (or another version of the same).<p>And measure every optimization you make, too :) I saw someone doing an optimization that should have accelerated the processing, only to discover once in production it was 4x slower.
IgorPartola超过 14 年前
My addition to the already great list: using temporary tables to serve content. Temp tables are great when you have a complex report, and breaking things up into stages gives you a performance boost (think you need to do multiple lookups against a list of users that is small but takes a long time to build). So if you generate this report at midnight via a cron job, temp tables work great. However, generating a bunch of them at once is a performance hit. Instead, use indecies or (in the worst case) caching with triggers.
wenbert超过 14 年前
Very good points. I have bookmarked the Stackoverflow post. I am glad that I had a good teacher for this subject in college. I really can't stress enough how important it is to design your database properly. I personally think that designing the database structure is one of the most difficult things to do when starting out a project. It is also the most fun :D Also, the satisfaction of optimizing your queries with just a few tweaks and seeing large amounts of improvements feels very good.<p>I love SQL and find it easy to solve problems with it. But I feel the opposite for ORMs. It adds another "learning barrier" for me. I hate it because sometimes I know how to solve a problem in SQL but have a really hard time solving it using the ORMs syntax. I have my doubts on it because at the start of the project you are supposed to define what DB you will be using.<p>Also, am I the only one who wasted countless hours trying to figure out how to execute a query using an ORM? While being absolutely sure how to do it with SQL?<p>This is my current situation. I am not that experienced and have a lot to learn. That is why I will be spending more time learning the ORMs I will be using and more hours figuring it out :)
mdoyle超过 14 年前
I started in database development in 1995ish. Surprised to see an article like this detailing the exact same problems we had back then. Our solution was to have classroom sessions with developers, to provide them with the education they wanted/needed. Biggest problem back then was the use of embedded SQL rather than stored procedures.
评论 #2002100 未加载
jfb超过 14 年前
#1 on my list: don't assume your application is the only one that will ever need access to your data. PUT YOUR DATA IN YOUR DATABASE.<p>#2: don't put stuff that aren't true facts into your database -- just because it's a reliable cross-process data store does NOT mean it's appropriate for all cross-process data sharing.<p>#3: don't fall in love with SQL. It fucking sucks.
iaskwhy超过 14 年前
A sort of hack I use sometimes (clearly documented) is to avoid joins when dealing with large data, mainly if caching is not an option (it should be but, hey, I found out it's not always like that!).<p>Say you have a table for people and another for cities where they currently live at. If you were to do this for a small country you won't deal with many different cities so a join won't waste much time. But if you're using all the countries in the world then the number of cities is really big. When that happens, use two different and simple queries: one to get all the people, another to get all the cities. Then just check if both entries exist on your sets and use them on a while or for-each loop. It can be a big improvement.<p>Note: This is a really simple example which isn't the best case scenario for this tip. Think large sets with lots of joins between them.
评论 #2001178 未加载
alexeyklyukin超过 14 年前
A suggestion to use surrogate keys instead of natural ones doesn't seem right to me. IMO proper choice of natural keys leads to better mapping from a knowledge domain into the corresponding relational model; if you are unable to find natural keys maybe there's something wrong with your database schema?
评论 #2000736 未加载
评论 #2000858 未加载
评论 #2000588 未加载
评论 #2002073 未加载
DrJokepu超过 14 年前
Interesting list, have a few good points, unfortunately some points are personal opinions rather than universally agreed practices.
评论 #2000203 未加载
评论 #2000207 未加载
ergo98超过 14 年前
<i>6. Not simplifying complex queries through views</i><p>Number 6 needs to be used with discretion.<p>When you reference a complex view, you are in essence making use of a stored query snippet -- there is no magic to the view, and unless you've materialized it (which is very rare), there is no efficiency advantage. All it really is is code sharing.<p>So what's the problem then?<p>The problem then is that the query analyzer has to assume that you actually care about every element of said view, every returned column, every spurious join. I've dealt with too many performance problems where people made use of an overly generous view that invalidated most indexes, removed the potential for covering indexes (the #1 most important performance element of a database), and did a lot of joins that were irrelevant to the usage.
评论 #2000813 未加载
projectileboy超过 14 年前
Given the rise of NoSQL solutions and the amazing lack of query runtime optimization performed by most relational DBs, looking at this list made me ask "why even bother?"
评论 #2000943 未加载
评论 #2001492 未加载