NoSQL is What?

137 points by timf almost 14 years ago

14 comments

fauigerzigerk almost 14 years ago
Clearly, we have to identify the qualities of NoSQL that are not related to scaling or performance for the debate to make any sense. I don't think it is possible in general to define those qualities, because NoSQL systems don't have much in common. Using a negation to name the category is telling in itself.

You mention schemaless, but none of the BigTable-derived systems are schemaless. Key-value stores are schemaless, but an RDBMS can do key-value storage just fine, as can file systems.

I think this whole debate boils down to whether or not you need to normalize data. If you normalize, you need joins, and that's the weak spot of most NoSQL systems. Doing joins in procedural code requires all data to be transferred into application process memory, which is only viable for modest amounts of data. (I'm not saying that only an RDBMS can ever do proper joins, just that the popular NoSQL solutions in use today don't.)

Normalization is also what mandates ACID, because normalization means you're losing what I would call the "physical unit of consistency". Normalization, joins and ACID go together. It's all or nothing. (Of course, pragmatically it's never all or nothing, but it's useful to highlight the general point.)

So, my conclusion is this: use an RDBMS or don't normalize (much). All the debates about RDBMS or NoSQL being simpler or more complicated turn out to be implicit debates about the need for normalization. When some people say this or that model is simpler, they are either assuming or not assuming a need for normalization.

In my view, whether or not you need to normalize depends primarily on whether the data is single-purpose or multi-purpose. If it's one app and its own private data island, then not normalizing often makes sense for simplicity and performance reasons.

If the data has its own separate life cycle, independent of any individual app, then not normalizing is a terrible mistake that brings down everyone's productivity, no matter how simple it may appear initially.

Having worked on data integration and analytics projects for many years, I'm leaning towards the view that most data is multi-purpose even if it's not initially expected to be. But that may well be survivorship bias, as apps that die young never cause integration issues. That doesn't mean they haven't fulfilled their original purpose.
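To make the point about joins in procedural code concrete, here is a minimal sketch (not from the thread) using Python's built-in sqlite3 module. The `users` and `orders` tables and all function names are made up for illustration. It contrasts letting the database perform the join with doing the join in application code, which has to pull both tables into process memory first.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database; the schema is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
        INSERT INTO users  VALUES (1, 'ann'), (2, 'bob');
        INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0);
    """)
    return conn


def join_in_database(conn):
    """The database does the join; only the result rows reach the app."""
    rows = conn.execute("""
        SELECT u.name, SUM(o.total)
        FROM users u JOIN orders o ON o.user_id = u.id
        GROUP BY u.name
        ORDER BY u.name
    """)
    return [(name, total) for name, total in rows]


def join_in_application(conn):
    """Procedural join: both tables are fetched in full, then combined in memory."""
    users = dict(conn.execute("SELECT id, name FROM users"))
    totals = {}
    for user_id, total in conn.execute("SELECT user_id, total FROM orders"):
        name = users[user_id]
        totals[name] = totals.get(name, 0.0) + total
    return sorted(totals.items())


conn = make_db()
db_result = join_in_database(conn)
app_result = join_in_application(conn)
```

Both functions return the same per-user totals here. The difference is where the work happens: the second version's memory use grows with the size of both tables rather than with the size of the result, which is the limitation described above.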
haberman almost 14 years ago
"threw up in my mouth a little." "Gee, let me get this straight." "Bullshit." "Seriously?"

I have to say that one of my regrets about growing up in programmer circles is seeing stuff like this held up as an acceptable example of how adults communicate with other adults.

It took me a long time to realize that this style of communication is not necessary, is not effective, and reflects poorly on the speaker. C'mon, this guy appears to be in his 30s or 40s and has written books, so why does he write like he's an angsty teen? (I know Linus does it. I think it's lame when he does it too.)

There's still room for humor and snark. Here are three of my favorite blog postings/articles ever, all very snarky, but not embarrassingly juvenile:

http://wanderingbarque.com/nonintersecting/2006/11/15/the-s-stands-for-simple/

http://diveintomark.org/archives/2004/01/14/thought_experiment

http://www.info.ucl.ac.be/~pvr/decon.html
justin_vanw almost 14 years ago
Ok, whenever someone opens with something to the effect of 'I use MySQL, so I have experience with relational databases and can make a comparison with NoSQL', all credibility is lost.

MySQL is a 'relational database', but one in which JOIN is so expensive and poorly optimized that you almost have to use it as a key-value store, looking everything up directly with synthetic primary keys.

I've had this discussion several times. Some startup guys say 'we should look at NoSQL', and I ask questions to get to the bottom of why they think that. They will say something like 'we have this huge join we have to do, but it's too expensive, so we pre-compute it'. I ask more questions, and the 'huge join' is not huge at all. In fact, it is just a reasonable join, something that you could expect to do on every page view without difficulty. Well, except they are using MySQL, and it can't join for shit. The MySQL query planner is disgusting.

So, although I don't expect to persuade the world to stop using MySQL (to be honest, I love that it is the go-to thing; those of us who use a decent database like PostgreSQL end up with a huge competitive advantage: better performance, more features, more scalable, amazing query planner, top-shelf performance analysis), I think we should at least admit that in practice, to get any performance out of it, you have to effectively use it as a key-value store anyway. And when comparing MySQL, which is a shitty key-value store, against real key-value stores, you can make a case for some NoSQL thing.
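For readers unfamiliar with the "pre-compute the join" pattern described above, here is a rough sketch, again using Python's built-in sqlite3 and json modules. The `posts`, `comments` and `page_cache` tables and the function names are invented for illustration. It only shows the shape of the pattern: a join result stored as a document and fetched by primary key. It says nothing about how MySQL or any other engine actually performs.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts      (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments   (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
    CREATE TABLE page_cache (post_id INTEGER PRIMARY KEY, doc TEXT);
    INSERT INTO posts    VALUES (1, 'NoSQL is What?');
    INSERT INTO comments VALUES (1, 1, 'first'), (2, 1, 'second');
""")


def page_via_join(post_id):
    """Build the page with a join on every request (assumes at least one comment)."""
    rows = conn.execute("""
        SELECT p.title, c.body
        FROM posts p JOIN comments c ON c.post_id = p.id
        WHERE p.id = ?
        ORDER BY c.id
    """, (post_id,)).fetchall()
    return {"title": rows[0][0], "comments": [body for _, body in rows]}


def precompute(post_id):
    """Run the join once and store the result as a JSON document under its key."""
    doc = json.dumps(page_via_join(post_id))
    conn.execute("INSERT OR REPLACE INTO page_cache VALUES (?, ?)", (post_id, doc))


def page_via_key(post_id):
    """Key-value style read: a single primary-key lookup, no join."""
    (doc,) = conn.execute(
        "SELECT doc FROM page_cache WHERE post_id = ?", (post_id,)
    ).fetchone()
    return json.loads(doc)


precompute(1)
```

The trade-off is that `page_cache` has to be refreshed whenever a post or one of its comments changes, which is the extra bookkeeping a well-optimized join would avoid.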
jister almost 14 years ago
I have to agree with one of the comments. All you did was rant, and you didn't say anything useful. Perhaps you could tell your readers about your experiences, so that you can convince them that NoSQL is useful to implement in their projects? (Of course, I am NOT saying it isn't.)
jerrya almost 14 years ago
I did find this point from the original article to be very dubious:

"In fact, I would argue that starting with NoSQL because you think you might someday have enough traffic and scale to warrant it is a premature optimization, and as such, should be avoided by smaller and even medium sized organizations. You will have plenty of time to switch to NoSQL as and if it becomes helpful. Until that time, NoSQL is an expensive distraction you don’t need."

Consider:

- how hard most organizations find it to refactor, rewrite and retest, especially in systems that are online 24x7

- when you would prefer to climb the learning curve with an immature technology: when you are small and starting out, or when you are a large company with a large set of users and under "mission critical" constraints (and possibly stockholders and the like)

My guess is that ongoing companies find it extremely difficult and expensive (and are short of talent) to switch from one SQL database to another, much less switch from SQL to NoSQL.
flocial almost 14 years ago
This is opinion versus opinion. I'm sorry to say there's no real content here. The author went from Yahoo to Craigslist, so there's no such thing as premature optimization at that scale, and with the small staff at CL you can be sure that chasing NoSQL as a fad can ruin the company. Obviously he doesn't fit the bill of the essay he's criticizing, but most devs don't experience the scale of his problems.

You can't do the topic of NoSQL vs SQL justice with an essay because it would just be semantics; we're talking about a different theoretical representation of data structure. You might as well scream "Better taste!", "Less filling!".
LeafStorm almost 14 years ago
One thing that bothers me is people who talk about "SQL databases vs. NoSQL databases." That's like framing a debate on transportation as "Cars vs. Not Cars," where "Not Cars" includes bicycles, planes, buses, subways, boats, zeppelins, etc.

If you take CouchDB, Redis, MongoDB, and all the other "NoSQL" databases and compare them, the only thing they have in common is that they do not use a relational data model or SQL. The way the word "NoSQL" is used, however, implies that they are some kind of united front against SQL databases, which is not the case at all. (It's why I am not a big fan of the term.)

Just like you would not use bicycles, planes, subways, and boats for the same things, you would not use CouchDB, Redis, MongoDB, and Cassandra for the same things. If you're choosing a database just because it's "NoSQL," then you are completely missing the point.
jpterry almost 14 years ago
Firstly, I can attest that migrating the datastore of an application which has scaled to require a NoSQL solution is no trivial task.

Secondly, I believe the author of the original posting really meant that "premature optimization is the root of all evil." As this post points out, NoSQL solutions vary wildly in their abilities and usefulness. A relational database is a good place to start on the path to an MVP. And if you need features that a NoSQL solution can provide, and you understand the problem you're trying to solve, then use a NoSQL solution.
swampthing almost 14 years ago
Obviously this doesn't really have any bearing on the points the author is making, but a small nit for posterity's sake: I think the point Clayton Christensen was making in The Innovator’s Dilemma was not that people should adopt inferior technologies to gain leverage later.

I think the point in that book was more that new technologies are often inferior in many ways to existing technologies when they first start out, and the way these new technologies survive and grow is by appealing to niches that value the existing ways in which the new technology is superior. Then, when the new technology matures a little more, the market to which it appeals grows a little larger, and this repeats.
jhawk28 almost 14 years ago
The problem is that NoSQL is such a broad term for datastores. Some of them are simple (like Redis) and some are more complex (like Cassandra/HBase). They also target different data types. Using one just because it is NoSQL can be a premature optimization, just as using an RDBMS can be a premature optimization. You really need to understand the data and how it will be used. Before you know what you want to build, it is easy to prematurely optimize for something you don't need.

Start simple, then iterate...
tapvt almost 14 years ago
Undertaking "optimization", in this case selecting and developing with a NoSQL datastore early in the process, should only be considered premature if the costs of doing so (which will be mainly represented by developer-hours spent) are greater than the value provided by having a datastore that can accommodate well the needs of the application itself, the development team, and end-users.

Adaptability, flexibility (with regard to schema/key structure migration and maturation), as well as ease of partitioning data intelligently ahead of demand are all hugely important factors that can and often should inform the process of selecting a datastore.

If the datastore selected for use:

- shortens development time,
- provides improved performance for anticipated scale,
- better represents the data model needing to be captured,
- avoids re-work and "post"-mature optimization of data models & datastores,
- or accomplishes any combination of the above ...

... then the selection of that datastore should not be considered premature optimization.

Finding that your traditional RDBMS does not support the data models you have developed well, especially once the product is out of the gate, will not be fun. Having to engage in a refactor and data migration to move to a more appropriate or more performant datastore will be a time- and resource-consuming process.

As soon as the initial synthesis phase of development can begin, it may be well worth the effort to experiment with multiple datastores as a means of evaluating their performance and suitability. Depending on the scope and potential for the project to scale, modularizing distinct pieces of core functionality into separate services, each with its own most-suitable datastore, can also provide great benefit in flexibility of development processes, as well as adaptability of the product to the demands of the end-users.
antirez almost 14 years ago
When there are arguments, like in Jeremy's post, commenting about tone and formal things is a huge FAIL. Everyone is entitled to express himself with the words and tone he wishes, as long as no one is going to be offended (if you are super-sensitive, this is your problem). One thing I always feel is a problem is that the programming community here on HN is a bit too middle-class-ish, and this is annoying: you are off topic, you are not polite, respect the fact I don't understand, blablabla. Hacking is, in my vision, connected with cultural freedom, and not being polite is one of its possible expressions, though not the only one. So reply to arguments and stop being so childish.
Devilboy almost 14 years ago
He's only experienced with MySQL? How can he judge the SQL vs NoSQL battle when he's never used a proper SQL system? NoSQL does not 'save development time' in general; it's just a different tool. A much younger and less refined one at that. Real RDBMSs do a whole lot more than execute your SQL queries for you.
i_crusade almost 14 years ago
"Again, I think we need to talk about the best tool for the job, not the best tool for every job. Relational databases are not the best tool for every data storage job."

Pretty much disqualifies him as a moron. Hell, he doesn't say anything.