FoundationDB: A distributed, unbundled, transactional key value store [pdf]

283 points by wwilson almost 4 years ago

22 comments

monstrado almost 4 years ago

Have nothing but praise for FoundationDB. It has been by far the most rock solid distributed database I have ever had the pleasure of using. I used to manage HBase clusters, and the fact that I have never once had to worry about manually splitting "regions" is such a boon for administration... let alone JVM GC tuning.

We run several FDB clusters using 3-DC replication and have never once lost data. I remember when we wanted to replace all of the FDB hardware (one cluster) in AWS, and so we just doubled the cluster size, waited for data shuffling to calm down, and just started axing the original hardware. We did this all while performing over 100K production TPS.

One thing that makes the above seamless for all existing connections is that clients automatically update their "cluster file" in the event that new coordinators join or are reassigned. That alone is amazing... as you don't have to track down every single client and change / re-roll with new connection parameters.

Anyway, I talk this database up every chance I get. Keep up the awesome work.

- A very happy user.

rubyn00bie almost 4 years ago

Here's one of my favorite articles on FoundationDB, where it (FDB) passes Jepsen first try: https://web.archive.org/web/20150312112556/http://blog.foundationdb.com/foundationdb-vs-the-new-jepsen-and-why-you-should-care

> I ran FoundationDB Key-Value Store through every nemesis in Jepsen - including those that found failures in other databases - and FoundationDB passed all of them with flying colors.

FoundationDB is one of the coolest pieces of technology I've used in the past decade. The tuple keyspace is incredibly useful, and so are the multi-key transactions. I've physically killed the power on an FDB node and on an FDB cluster multiple times (heh, home servers)... and every time the cluster or node just comes back.

ryanworl almost 4 years ago

Two quotes from the paper that I think will motivate people to read it:

"Rigorous correctness testing via simulation makes FDB extremely reliable. In the past several years, CloudKit [59] has deployed FDB for more than 0.5M disk years without a single data corruption event. Additionally, we constantly perform data consistency checks by comparing replicas of data records and making sure they are the same. To this date, no inconsistent data replicas have ever been found in our production clusters."

"For example, early versions of FDB depended on Apache Zookeeper for coordination, which was deleted after real-world fault injection found two independent bugs in Zookeeper (circa 2010) and was replaced by a de novo Paxos implementation written in Flow. No production bugs have ever been reported since."

davgoldin almost 4 years ago

Only great things to say about FoundationDB. We've been using it for about a year now. Got a tiny, live cluster of 35+ commodity machines (started with 3 a year ago), about 5TB capacity and growing. Been removing and adding servers (on the live cluster) without a glitch. We've got another 100TB cluster in testing now. Of all the things, we're actually using it as a distributed file system.

We've tried Ceph, GlusterFS, HDFS, MinIO and some others, and eventually decided on a custom FDB solution. It's a breeze to set up, and seems to eclipse others in performance [0] and reliability. Kyle (aphyr), the author of the Jepsen series on distributed systems correctness, said: "haven't tested foundation in part because their testing appears to be waaaay more rigorous than mine." [1]

The way we use FDB, if anyone is interested, is we simply split files into small chunks (per FDB data design recommendations), and store all file and folder metadata in FDB, such as byte count, create/access/write times, permissions, and a lot more. Folders are handled by the builtin Directory layer [2].

[0] https://apple.github.io/foundationdb/performance.html

[1] https://web.archive.org/web/20150312112552/http://blog.foundationdb.com/call-me-maybe-foundationdb-vs-jepsen

[2] https://forums.foundationdb.org/t/whats-the-purpose-of-the-directory-layer/677
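The chunking scheme described above can be sketched roughly as follows. This is only an illustration, not the commenter's implementation: a plain Python dict stands in for the database, and the key layout `(path, kind, index)`, the `CHUNK_SIZE` value, and the function names `write_file` and `read_file` are all assumptions made for the example.

```python
from typing import Dict, Tuple

# Assumed chunk size in bytes; the comment only says "small chunks".
CHUNK_SIZE = 10_000

# A plain dict stands in for the key-value store (hypothetical key layout).
Store = Dict[Tuple[str, str, int], bytes]


def write_file(store: Store, path: str, data: bytes) -> None:
    """Store a metadata record and fixed-size chunks under tuple keys."""
    # Remove any previous version of the file so stale chunks do not linger.
    for key in [k for k in store if k[0] == path]:
        del store[key]
    # Metadata here is just the byte count; a real layout would hold more.
    store[(path, "meta", 0)] = str(len(data)).encode()
    for index, offset in enumerate(range(0, len(data), CHUNK_SIZE)):
        store[(path, "chunk", index)] = data[offset:offset + CHUNK_SIZE]


def read_file(store: Store, path: str) -> bytes:
    """Reassemble a file by reading its chunks in index order."""
    size = int(store[(path, "meta", 0)].decode())
    chunk_keys = sorted(k for k in store if k[0] == path and k[1] == "chunk")
    data = b"".join(store[k] for k in chunk_keys)
    if len(data) != size:
        raise ValueError(f"size mismatch for {path}: {len(data)} != {size}")
    return data
```

In a real deployment the write would presumably happen inside a transaction so that metadata and chunks change together, and the chunks would be fetched with a single range read rather than a scan over all keys.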

jorangreef almost 4 years ago

Markus Pilman from Snowflake did an awesome talk on FoundationDB's testing at CMU's Quarantine Tech Talks (2020), How I Learned to Stop Worrying and Trust the Database: https://www.youtube.com/watch?v=OJb8A6h9jQQ

georgelyon almost 4 years ago

FDB is an awesome and unique piece of software (I attribute quite a bit of Snowflake's success to FDB). I've also had the pleasure of meeting some folks from the original team and they are true engineers. Does anyone know if/when Redwood (the new storage engine) has landed / will land?

eyelovewe almost 4 years ago

CouchDB 4 is built upon FoundationDB, FWIW.

jwr almost 4 years ago

I just implemented a database with changefeeds using FoundationDB (in Clojure), to eventually replace RethinkDB in my system. Very impressed so far.

Meai almost 4 years ago

Personally I don't understand how you can call a database robust if it can't scale down nodes after you scaled them up once. What am I supposed to do if I ever deploy to 50 nodes and then it turns out that I only need 5? Shut the business down? Pay to run database servers forever that I don't even need anymore? Also the database configuration has a lot of gotchas and is very opaque. You might be waiting for 30 seconds for your CLI to connect to your localhost cluster of two processes and you have no idea what is happening or why it is taking that long. It just never felt as safe and robust as people claim it to be. I don't know, these were just my findings from the brief tests I did with it.

Also you better get familiar with a whole bunch of hidden "knobs" that are apparently configurable and very important somewhere and then get printed out into XML logs, but of course there is no log viewer so you have to write your own. Maybe this isn't a problem for large companies but I'm providing feedback as a single user here.

I also don't understand how people can praise the C++ DSL. They should rewrite that into standard C++ coroutines as soon as possible so their entire build and dev environment isn't so hard to understand. As a user of open source software I generally like to be able to debug through the projects I use and figure out problems I have. It's much harder when a project uses its own custom language. I certainly tried to set it all up correctly but there always seemed to be some problems with IntelliSense within the IDE.

jFriedensreich almost 4 years ago

I am pretty sure that the new Cloudant transaction/storage engine is also based on FoundationDB, which powers a lot of things behind the scenes at IBM. And CouchDB 4 with the FoundationDB storage engine is hopefully not too far out either. Let's see how long this whole transition takes, but I am still hopeful that the mindshare and motivation of Apple, Snowflake, IBM and the Apache community will lead to something great.

z77dj3kl almost 4 years ago

This is a really good document: https://apple.github.io/foundationdb/data-modeling.html

I have been studying these key-value stores with efficient range iteration lately (such as LevelDB, RocksDB, BigTable, FoundationDB, etc). This is a great reference on how to make such a simple abstraction do a lot of useful things.
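To make the "efficient range iteration" abstraction concrete, here is a minimal in-memory sketch of an ordered key-value store with range and prefix scans. It is not how any of the systems named above are implemented; the class `OrderedKV`, the helper `prefix_end`, and the example keys are all hypothetical and exist only to show why keeping keys sorted makes prefix queries cheap.

```python
import bisect
from typing import Dict, Iterator, List, Tuple


def prefix_end(prefix: bytes) -> bytes:
    """Return the smallest key that sorts after every key starting with prefix."""
    stripped = prefix.rstrip(b"\xff")
    if not stripped:
        raise ValueError("prefix must contain a byte other than 0xff")
    return stripped[:-1] + bytes([stripped[-1] + 1])


class OrderedKV:
    """Toy ordered store: a sorted key list plus a dict of values."""

    def __init__(self) -> None:
        self._keys: List[bytes] = []
        self._values: Dict[bytes, bytes] = {}

    def set(self, key: bytes, value: bytes) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)  # keep keys in sorted order
        self._values[key] = value

    def get_range(self, begin: bytes, end: bytes) -> Iterator[Tuple[bytes, bytes]]:
        """Yield pairs with begin <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, begin)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]

    def get_prefix(self, prefix: bytes) -> Iterator[Tuple[bytes, bytes]]:
        """Yield every pair whose key starts with prefix."""
        return self.get_range(prefix, prefix_end(prefix))
```

With keys such as `b"user/alice"`, `b"user/bob"` and `b"order/1"`, calling `get_prefix(b"user/")` visits only the two user records. Encoding related records under a shared prefix is the basic trick that lets an ordered key-value store model tables, indexes, and hierarchies.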

twoodfin almost 4 years ago

Did they ever implement a SQL layer? They seemed like one of the only NoSQL products with the architecture to make it plausible to do so.

AtlasBarfed almost 4 years ago

They got acquihired by Apple, didn't they? Was FDB ever OSS'd?

Is it CP or AP? Comments seem to imply AP.

e12e almost 4 years ago

This seems like a good place to ask - are there any new and exciting FOSS "applications" worth checking out? I recall from the initial publication of the source that there were references to a great SQL layer. I don't know if a FOSS work-alike ever materialized? Other things I'd hoped for were a network filesystem/blob layer, maybe S3/NFS/WebDAV compatible? What are people building on top of FoundationDB today?

Ed: I suppose various document/DB applications, like IMAP, might be a good fit too?

acjohnson55 almost 4 years ago

Am I right that this is like a distributed form of something like LevelDB or RocksDB, which would be the underlying storage engine for a full database product?

And/or would it be comparable to DynamoDB?

one2three4 almost 4 years ago

> In its newest release, CouchDB [2] (arguably the first NoSQL system) is being re-built as a layer on top of FoundationDB.

That is impressive. Like a framework for implementing NoSQL DBs.

dinedal almost 4 years ago

Does anyone know of the "sqlite connector" mentioned in this post? https://opensourceconnections.com/blog/2013/05/06/does-foundationdb-beat-the-cap-conjecture-hackathon-on-friday/

It would be really cool to find it, if it's still out there.

jbverschoor almost 4 years ago

It's unfortunate that they went silent for years after the Apple acquisition. That period was key for database adoption. I have the feeling everybody kind of settled for pgsql.

strangattractor almost 4 years ago

With little or no admin and monitoring tools.

polskibus almost 4 years ago

What is the backup / restore story in FoundationDB? How does it compare to postgresql?

gigatexal almost 4 years ago

Didn't Couchbase move to FDB for their underlying engine?

jtdev almost 4 years ago

I’d love to see a good primer on data models and scenarios that are well suited to FDB.
