It's amazing how much the Jepsen posts have raised the bar for distributed database testing. "Passes Jepsen" seems to be rapidly becoming a requirement for open-source distributed databases. It's a great improvement, and I hope Kyle is really proud of the influence he has had. A big part of the success is that the writing itself was engaging, entertaining and accessible without being patronizing.<p>Having said that, Jepsen is far from a full exercise of all of the safety properties of a distributed database. There are many kinds of bugs (both protocol bugs and implementation bugs) that wouldn't be detected by these kinds of tests. Passing Jepsen is necessary, but not sufficient. Even without covering truly Byzantine behaviors, real-world networks have many failure modes that Jepsen doesn't address.
And @aphyr's Twitter response:<p>So I'm delighted by <a href="http://lucidworks.com/blog/call-maybe-solrcloud-jepsen-flaky-networks/" rel="nofollow">http://lucidworks.com/blog/call-maybe-solrcloud-jepsen-flaky...</a> … but tbh "may get a successful response for a document that is lost" is actually a CP failure.<p>-- <a href="https://twitter.com/aphyr/status/542820272626491392" rel="nofollow">https://twitter.com/aphyr/status/542820272626491392</a>
Here's the original Jepsen post on Elasticsearch for reference: <a href="http://aphyr.com/posts/317-call-me-maybe-elasticsearch" rel="nofollow">http://aphyr.com/posts/317-call-me-maybe-elasticsearch</a>