FaunaDB 2.5.4

263 points by aphyr about 6 years ago

12 comments

evanweaver about 6 years ago

So excited to see this; this is the culmination of three months of very hard work by both teams.

FaunaDB 2.5 passed the core linearizability tests for multi-partition transactions immediately. To my knowledge no other distributed system has done this. ZooKeeper was the strongest candidate on initial testing in the early days, but it does not offer multiple partitions at all, as discussed in the FaunaDB report. And Jepsen itself was much less comprehensive at the time.

All other issues affecting correctness were fixed in the course of the analysis, and FaunaDB 2.6 is now available with the improvements.

We're happy to answer questions along with @aphyr. Our blog post is here: https://fauna.com/blog/faunadbs-official-jepsen-results

jatsign about 6 years ago

I hadn't heard of Fauna before. What's the use case?

Looks like it's not open source, and the pricing isn't very clear if I want to host it locally. The "Download" page requires you to provide your contact info first.

Why should I jump through all those hoops?

wmsiler about 6 years ago

From the post, FaunaDB initially had several issues, which they've generally resolved. Jepsen is open source, so I'm curious why a database company wouldn't run Jepsen internally, work out as many problems as they can, and then engage aphyr in order to get the official thumbs up. Given how important data integrity is, I would assume that any database company would be running Jepsen (or something equivalent) regularly in-house. If they are doing that, then how is it that aphyr finds so many previously unknown issues? And if they aren't running Jepsen in-house, why not?

asien about 6 years ago

Tried Fauna once with their « Cloud » version.

I was absolutely shocked by the poor performance of the service.

In my case I prototyped some simple CRUD queries with Node.js, within the same datacenter region. An insert took well over a second to complete, and reading a simple document with one field took about half a second.

I was also unable to do a « join » between documents because of how complex their query language is, and their support basically encouraged me not to use « join » but to use « aggregate » like Mongo... Why offer this feature if I can't use it?

Has it changed since then? It seems very clear to me that Fauna is entirely focused on enterprise customers (after all, this is where the money is); the cloud version seems to be just a gimmick.

georgewfraser about 6 years ago

Fauna's writeup heavily emphasizes the fact that it doesn't rely on atomic clocks. My understanding is that both AWS and GCP have used atomic-clock-based timekeepers since 2017, so it's not like this is some exotic technology.

The primary advantage described in the Calvin papers is that it's the only distributed transaction protocol that can handle high-contention workloads. But Fauna never seems to bring this up. Does that mean that Fauna's current implementation isn't fast under contention?

etaioinshrdlu about 6 years ago

This actually reminds me a lot of how Ethereum transactions are represented as code as well.

Anyone else see a parallel there?

Seems like a good idea, overall. One annoying thing that affects pretty much every database with transactions is that the effort of retrying failed transactions is pushed onto the user, by necessity.

But if your transactions are airtight chunks of code... then the DB can retry them for you and provide a simpler interface to your app code.
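
A minimal sketch of that idea, assuming a toy in-memory store with per-key version numbers. `ToyStore`, `Txn`, and `run_transaction` are hypothetical names invented for illustration and are not FaunaDB's API. Because the transaction body is a self-contained function, the store can simply call it again when a conflict is detected:

```python
from typing import Callable, Dict


class ConflictError(Exception):
    """Raised when a transaction's reads are stale at commit time."""


class ToyStore:
    """Hypothetical in-memory store with a version counter per key."""

    def __init__(self) -> None:
        self.data: Dict[str, int] = {}
        self.versions: Dict[str, int] = {}

    def commit(self, read_versions: Dict[str, int], writes: Dict[str, int]) -> None:
        # Abort if any key the transaction read has changed since it was read.
        for key, seen in read_versions.items():
            if self.versions.get(key, 0) != seen:
                raise ConflictError(key)
        for key, value in writes.items():
            self.data[key] = value
            self.versions[key] = self.versions.get(key, 0) + 1


class Txn:
    """Records the reads and buffers the writes of a single attempt."""

    def __init__(self, store: ToyStore) -> None:
        self._store = store
        self.read_versions: Dict[str, int] = {}
        self.writes: Dict[str, int] = {}

    def get(self, key: str, default: int = 0) -> int:
        self.read_versions[key] = self._store.versions.get(key, 0)
        return self.writes.get(key, self._store.data.get(key, default))

    def put(self, key: str, value: int) -> None:
        self.writes[key] = value


def run_transaction(store: ToyStore, body: Callable[[Txn], int], max_attempts: int = 5) -> int:
    """Run `body` to completion, retrying on conflict so the caller does not have to."""
    for _ in range(max_attempts):
        txn = Txn(store)
        result = body(txn)
        try:
            store.commit(txn.read_versions, txn.writes)
            return result
        except ConflictError:
            continue  # stale reads: rerun the whole body against fresh state
    raise ConflictError(f"gave up after {max_attempts} attempts")
```

The retry is only safe because `body` talks to the outside world exclusively through `txn`; a body with other side effects would repeat them on every attempt.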

buremba about 6 years ago

Looks great, but why did you decide to develop your own query language instead of just using SQL? Even NoSQL transactional database solutions have started to adopt SQL lately, and learning a new language is not really easy for application developers.

twic about 6 years ago

FaunaDB uses Calvin, a transaction protocol developed by Daniel Abadi. Their blog post explains it nicely, after a bit of a slow start:

https://fauna.com/blog/consistency-without-clocks-faunadb-transaction-protocol

But in summary:

1. A 'transaction' is a self-contained blob of code which reads input, does deterministic logic, and writes output (so not like a traditional RDBMS transaction, where the application opens a transaction and then interleaves its own logic between reads and writes)
2. When a transaction arrives, the receiving node runs it, and captures the inputs it read, and the outputs it wrote
3. The transaction, with its captured inputs and outputs, is written to a global stream of transactions - this is the only point of synchronisation between the nodes
4. Each node reads the global stream, and writes each transaction into its persistent state; to do that, it repeats all the reads that the transaction did, and checks that they match the captured input - if so, the outputs are committed, and if not, the transaction is aborted, and retried

The key idea is that because the process is deterministic, the nodes can write transactions to disk independently without drifting out of sync.

It's pretty neat. And it's exactly what Abadi wrote about a couple of months ago:

http://dbmsmusings.blogspot.com/2019/01/its-time-to-move-on-from-two-phase.html

This is also what VoltDB does (which Abadi worked on, along with Michael Stonebraker):

> As an operational store, the VoltDB “operations” in question are actually full ACID transactions, with multiple rounds of reads, writes and conditional logic. If the system is going to run transactions to completion, one after another, disk latency isn’t the only stall that must be eliminated; it is also necessary to eliminate waiting on the user mid-transaction.

> That means external transaction control is out – no stopping a transaction in the middle to make a network round-trip to the client for the next action. The team made a decision to move logic server-side and use stored procedures.

https://www.voltdb.com/product/data-architecture/no-wait-design/

It's also similar to, although categorically more sophisticated than, the idea of object prevalence, which is now so old and forgotten that I can't find any really good references, but:

> Clients communicate with the prevalent system by executing transactions, which are implemented by a set of transaction classes. These are examples of the Command design pattern [Gamma 1995]. Transactions are written to a journal when they are executed. If the prevalent system crashes, its state can be recovered by reading the journal and executing the transactions again. [...] Replaying the journal must always give the same result, so transactions must be deterministic. Although clients can have a high degree of concurrency, the prevalent system is single-threaded, and transactions execute to completion.

https://web.archive.org/web/20170610140344/http://hillside.net/sugarloafplop/papers/5.pdf
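
The four steps in the summary can be sketched in a few lines of Python. This is only an illustration of the comment's description, not Calvin's or FaunaDB's actual implementation; `LogEntry`, `execute`, `apply_entry`, and `replay` are hypothetical names, missing keys are assumed to read as 0, and the retry of aborted transactions is left out:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]
# A transaction is a function that reads through `read` and returns its writes.
TxnFn = Callable[[Callable[[str], int]], Dict[str, int]]


@dataclass
class LogEntry:
    reads: Dict[str, int]   # inputs captured when the transaction first ran
    writes: Dict[str, int]  # outputs captured when the transaction first ran


def execute(state: State, txn: TxnFn) -> LogEntry:
    """Step 2: run the transaction on the receiving node, capturing reads and writes."""
    reads: Dict[str, int] = {}

    def read(key: str) -> int:
        reads[key] = state.get(key, 0)  # assumption: absent keys read as 0
        return reads[key]

    writes = txn(read)
    return LogEntry(reads=reads, writes=writes)


def apply_entry(state: State, entry: LogEntry) -> bool:
    """Step 4: repeat the reads; commit the writes only if every read still matches."""
    for key, seen in entry.reads.items():
        if state.get(key, 0) != seen:
            return False  # aborted; the originating node would retry (not shown)
    state.update(entry.writes)
    return True


def replay(log: List[LogEntry]) -> State:
    """Every node applies the same log in the same order and reaches the same state."""
    state: State = {}
    for entry in log:
        apply_entry(state, entry)
    return state
```

Two increments executed against the same snapshot both capture `x = 0`. Once they are ordered in the log (step 3), every replica commits the first and aborts the second, with no coordination needed beyond the log itself.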

drej about 6 years ago

I recommend first reading bits of the Jepsen report, because the company blog paints quite a different picture.

> We’re excited to report that FaunaDB has passed:

> Additionally, it offers the highest possible level of correctness:

> In consultation with Kyle, we’ve fixed many known issues and bugs

vs.

> However, queries involving indices, temporal queries, or event streams failed to live up to claimed guarantees. We found 19 issues in FaunaDB[.]

jwr about 6 years ago

I know this is slightly off-topic, but I'd be very interested in Jepsen testing FoundationDB. They claim to have developed the database test-first (starting with simulations), and it would be great to be able to compare the claims to reality using an external (by now becoming an industry standard!) testing tool.

anentropic about 6 years ago

That was pretty great. Does anyone have links to tests of FaunaDB write performance?

gigatexal about 6 years ago

If they added a standard SQL layer I’d be on board. Interesting project though.
