I'm sorry, this is complete and total gibberish.<p>* The CAP theorem is a proof that it is impossible to provide both consistency and availability on a network which can lose messages (or on machines which can fail). You can't implement it any more than you can implement general relativity. It's a description of reality.<p>* The CAP theorem is not a data model which competes with, or intersects at all with, relational algebra. Rather, relational algebra is the logical model (allegedly) underlying RDBMSes, which are systems that have historically provided consistency at the expense of availability in the presence of faults (thus obeying the CAP theorem, because they're real systems and not opium dreams).<p>* Scaling horizontally does not imply anything about fault tolerance. It instead describes systems in which resources can be incrementally added to incrementally gain capacity. It's possible to build a horizontally scalable system which is less reliable than a single-machine system; it's also possible to build fantastically fault-tolerant systems which are also horizontally scalable (cf. Dynamo). Doing the latter is considered "a good idea"; doing the former is considered "fucking daft."<p>* Nothing about horizontally scalable systems (or NoSQL, or really anything the author mentions except for Redis) requires that the entire dataset be kept in memory. Systems like Riak (<a href="http://riak.basho.com" rel="nofollow">http://riak.basho.com</a>) or Voldemort (<a href="http://project-voldemort.com/" rel="nofollow">http://project-voldemort.com/</a>) use pluggable storage engines, some of which (e.g. InnoStore and BDB-JE) have excellent performance with 1:10 RAM-to-dataset ratios. By the author's own metric, the Holy Grail has not only been found but the damn things are multiplying.<p>* Neither epoll nor kqueue "scale indefinitely in terms of I/O concurrency." Nothing does. That's horseshit.<p>You're better off huffing glue than reading this thing.
I don't even care what he has to say about Redis. He could have some incredible insights about it, but they'd be completely and totally negated by the incomprehension, misinformation, untruths, and general crazytalk which preceded them.<p>tl;dr: 15+ years of RDBMS experience gives you 0 clues about distributed computing; reading Time Cube (<a href="http://www.timecube.com/" rel="nofollow">http://www.timecube.com/</a>) is preferable to reading this dreck.
"First, scaling horizontally has little to do with the database engine itself - creating a transparent, consistent hash function is the easiest part."<p>That is just so incorrect that it's hard to take the rest of the post seriously.<p>What happens when you want to add a node to your cluster? What happens when a node goes down? What happens when you drop some packets between nodes? What happens when one node ends up with a disproportionate share of the keys?<p>Then, answering each of these questions brings with it many more questions. For example, if the answer to "what happens when your node goes down" is to replicate some data to another node, then how do you deal with inconsistencies between different replicas of the data? What happens if the node you try to replicate to goes down? What if the node you try to replicate to has a different idea of what that data should be?<p>If a database is to be truly horizontally scalable, it will have an answer for all of these questions. And those answers have a <i>lot</i> to do with "the database engine itself."
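To make the "add a node" question concrete, here is a minimal Python sketch of a consistent-hash ring. Everything in it (`HashRing`, `moved_keys`, the `vnodes` parameter, the MD5-based placement) is a hypothetical illustration, not the code of Redis, Dynamo, or any other system mentioned in this thread. It only shows which keys change owner when a node joins; it does nothing about migrating the data, replicating it, or reconciling replicas that disagree.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a position on the ring (a 128-bit integer from MD5)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustration only)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes   # assumed number of ring positions per node
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Place `vnodes` positions for `node` on the ring."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node: str) -> None:
        """Drop every ring position owned by `node`."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key: str) -> str:
        """Return the owner of the first ring position after the key's hash."""
        if not self._points:
            raise LookupError("ring has no nodes")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


def moved_keys(before: HashRing, after: HashRing, keys):
    """Keys whose owner differs between two ring configurations."""
    return [k for k in keys if before.node_for(k) != after.node_for(k)]


keys = [f"key-{i}" for i in range(10_000)]
old_ring = HashRing(["a", "b", "c"])
new_ring = HashRing(["a", "b", "c", "d"])
moved = moved_keys(old_ring, new_ring, keys)
print(f"{len(moved)} of {len(keys)} keys change owner when node d joins")
```

The ring tells you that a slice of the keys now belongs to the new node. Getting the data for those keys onto it while reads and writes keep arriving, and deciding what to do when that transfer fails halfway, is the part the hash function does not answer.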
"In the computer science terminology, an O(N) algorithm is considered “naive”, and in the computer security terminology, it even has a name - “brute force”"<p>WTF?
Redis works great as long as the dataset fits in RAM. After that, the background saving process kicks in, and performance becomes an issue. This caused my company to move away from Redis to Mongo. It's foolish to assume that a product is over-engineered just because it goes beyond storing key/value pairs. Judging by the part of the article claiming that namespaces for keys are not inherent in NoSQL solutions, it seems no research was done beyond what Redis can do. Check out Mongo's collections. That's exactly what they are.
If the guy needs access counters without an ever-growing disk file, he dismissed MongoDB too fast. The ability to repeatedly alter fixed-size data without growing the storage is something MongoDB has and, as far as I know, no other database engine does. (Their design has its own serious disadvantages, but still, if that's the behaviour you need...)
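For anyone unsure what "alter fixed-size data without growing the storage" means in practice, here is a rough Python sketch of the general idea: counters live in preallocated fixed-width slots and each increment overwrites its slot in place. This is not MongoDB's storage engine or API; the file layout and the names `create_counters` and `increment` are assumptions made up for the illustration.

```python
import os
import struct
import tempfile

# One unsigned 64-bit little-endian counter per slot (assumed layout).
SLOT = struct.Struct("<Q")


def create_counters(path: str, count: int) -> None:
    """Preallocate `count` zeroed counter slots."""
    with open(path, "wb") as f:
        f.write(b"\x00" * (SLOT.size * count))


def increment(path: str, index: int, amount: int = 1) -> int:
    """Overwrite slot `index` in place and return its new value."""
    with open(path, "r+b") as f:
        f.seek(index * SLOT.size)
        (value,) = SLOT.unpack(f.read(SLOT.size))
        value += amount
        f.seek(index * SLOT.size)
        f.write(SLOT.pack(value))
    return value


path = os.path.join(tempfile.mkdtemp(), "counters.bin")
create_counters(path, 4)
size_before = os.path.getsize(path)
for _ in range(1000):
    increment(path, 2)
# The file is the same size after 1000 updates as it was before them.
print(size_before, os.path.getsize(path))
```

An append-only log would instead write a new record for every one of those increments and need compaction later, which is the "ever-growing disk file" being complained about.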
I posted my answer here as it is long: <a href="http://www.ceondo.com/ecte/2010/10/challenge-web-scale-integration-architecture-1-10" rel="nofollow">http://www.ceondo.com/ecte/2010/10/challenge-web-scale-integ...</a><p>To save you one click: the real problem with web scale is integration. This is the hard part, this is where things start to break, and this is where one can have fun too.