For a Cloudflare article, this one is surprisingly light on technical details. And for the product where it most matters.<p>I'm guessing this is a single master database with multiple read replicas. That means it's no longer strongly consistent (consistency in the CAP sense). Obviously reads after a write will see stale data until the write propagates.<p>I'm a bit curious how that replication works. Ship the whole db? Binary diffs of the master? Ship the SQL statements that did the write and reapply them? Lots of performance and other tradeoffs here.<p>What's the latency like? This likely doesn't run in every edge location. Does the database ship out on the first request and get cached with an expiry? Does the request itself move to the database instead of running at the edge - like maybe this runs on a select subset of locations?<p>So many questions, but no details yet.
wow SQLite getting a lot of love these days<p><a href="https://tailscale.com/blog/database-for-2022" rel="nofollow">https://tailscale.com/blog/database-for-2022</a><p><a href="https://fly.io/blog/all-in-on-sqlite-litestream" rel="nofollow">https://fly.io/blog/all-in-on-sqlite-litestream</a><p><a href="https://blog.cloudflare.com/introducing-d1" rel="nofollow">https://blog.cloudflare.com/introducing-d1</a>
BTW R2 is open beta now: <a href="https://blog.cloudflare.com/r2-open-beta/" rel="nofollow">https://blog.cloudflare.com/r2-open-beta/</a>
Wow, this looks potentially very interesting. Since this is sort of fresh in my mind from the recent Fly post about it:<p>* How exactly is the read replication implemented? Is it using litestream behind the scenes to stream the WAL somewhere? How do the readers keep up? Last I saw you just had to poll it, but that could be computationally expensive depending on the size of the data (since I thought you had to download the whole DB), and could potentially introduce a bit of latency in propagation. Any idea what the metrics are for latency in propagation?<p>* How are writes handled? Does it do the Fly thing about sending all requests to one worker?<p>I don't quite know what a "worker" is but I'm assuming it's kind of like a Lambda? If you have it replicated around the world, is that one worker all running the same code, and Cloudflare somehow manages the SQL replicating and write forwarding? Or would those all be separate workers?
First, I'm very excited. Sure, SQLite has some limitations compared to Postgres, esp. regarding the type system and concurrency. But we get ACID compliance and SQL.<p>But it is really hard to get useful information out of this article. I can't even tell if it is not there or just buried in all this marketing hot air.<p>So, what is it really? Is there one write-master that is asynchronously replicated to all other locations? Will writes be forwarded to this master and then replicated back?<p>I'm very curious how it performs in real life, especially considering the locking behavior (SQLite always uses the 'serializable' isolation level, iirc). The more you put in a transaction, or the longer you have to wait for another process to finish its writes, the more likely you are to be dealing with stale data.<p>But overall I'm very excited. Also by the fly.io announcement, of course. Lots of innovation and competition. Good times for customers.
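On the locking point, the practical mitigation with a single writer is to keep each write as short as possible. A minimal sketch, assuming a binding-style API with a batch() call (not confirmed by the post), of collapsing many small writes into one short atomic batch:

```typescript
// Many separate awaited writes keep re-acquiring the single write lock; one
// short batch holds it once, briefly. (D1Database and batch() are assumptions
// based on the binding-style API in the post, not a documented interface.)
async function recordEvents(
  db: D1Database,
  events: { type: string; ts: number }[]
): Promise<void> {
  const insert = db.prepare("INSERT INTO events (type, ts) VALUES (?, ?)");
  // One short atomic batch instead of events.length individual write transactions.
  await db.batch(events.map((e) => insert.bind(e.type, e.ts)));
}
```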
Very cool! Glad to see all the love for SQLite recently.<p>One thing I've noticed is that many commenters assume the only valid model for read-replicated SQLite is having <i>one</i> giant, centralized database with all the data. Let's be honest with ourselves: the vast majority of applications hold personal or B2B data and don't need centralized transactions, and at scale will use multi-tenant primary keys or manual sharding anyway. For private data, a single SQLite database per user / business will <i>easily</i> satisfy the write load of all but the most gigantic corporations. With this model you have unbounded compute scaling for new users, because they very likely don't need online transactions across multiple databases at once.<p>Some questions:<p>Will D1 be able to deliver this design of having many thousands of separate databases for a single application? Will this be problematic from a cost perspective?<p>> since we're building on the redundant storage of Durable Objects, your database can physically move locations as needed<p>Will D1 be able to easily migrate the "primary" at will? CockroachDB described this as a "follow the sun" primary.
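A rough sketch of that database-per-tenant shape from a Worker's point of view. It assumes a Worker could hold one D1 binding per tenant; TENANT_A, TENANT_B and the query methods are all assumptions, not confirmed by the announcement:

```typescript
// Rough sketch of database-per-tenant routing in a Worker. TENANT_A/TENANT_B
// bindings and the prepare/bind/all methods are assumptions; the announcement
// does not say whether a Worker can hold many D1 databases.
interface Env {
  TENANT_A: D1Database;
  TENANT_B: D1Database;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Derive the tenant from the subdomain, e.g. acme.example.com -> "acme".
    const tenant = new URL(request.url).hostname.split(".")[0];

    // Each tenant gets its own small SQLite database; no cross-tenant
    // transactions are ever needed.
    const databases: Record<string, D1Database> = {
      acme: env.TENANT_A,
      globex: env.TENANT_B,
    };
    const db = databases[tenant];
    if (!db) return new Response("unknown tenant", { status: 404 });

    const { results } = await db
      .prepare("SELECT id, total FROM orders WHERE status = ?")
      .bind("open")
      .all();
    return Response.json(results);
  },
};
```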
Love the Northwind Traders reference! However, for a demo, I suggest a slightly larger and more complex data set, data-generator-retail: <a href="https://www.npmjs.com/package/data-generator-retail" rel="nofollow">https://www.npmjs.com/package/data-generator-retail</a><p>The demo is also a bit buggy: orders are duplicated as many times as there are products, but clicking on the various lines of the same order leads to the same record, where the user can only see the first product...<p>I also think the demo would have more impact if it wasn't read-only (although I understand that this could lead to broken pages if visitors mess with the data).<p>Anyway, kudos to the CloudFlare team!
This looks amazing!<p>I see Cloudflare people are on this post; any chance of a comparison of D1 vs Postgres in terms of DB features?<p>Insert ... Returning<p>Stored procedures and triggers<p>Etc.<p>Would be really helpful to get a comparison like CockroachDB did here <a href="https://www.cockroachlabs.com/docs/stable/postgresql-compatibility.html" rel="nofollow">https://www.cockroachlabs.com/docs/stable/postgresql-compati...</a><p>Or even better, a general SQL compatibility matrix like this <a href="https://www.cockroachlabs.com/docs/stable/sql-feature-support.html" rel="nofollow">https://www.cockroachlabs.com/docs/stable/sql-feature-suppor...</a><p>Kudos to the Cloudflare team!
All this recent hype around SQLite...<p>SQLite is a great embedded database and, thanks to use by browsers and on mobile, the most used database in the world by orders of magnitude.<p>But it also comes with lots of limitations.<p>* there is no type safety, unless you run with the new strict mode, which comes with some significant drawbacks (eg limited to a handful of primitive types)<p>* very narrow set of column types and overall functionality in general<p>* the big one for me: limited migration support, requiring quite a lot of ceremony for common tasks (eg rewriting a whole table and swapping it out)<p>These approaches (like fly.io's) with read replication also (apparently?) seem to throw away read-after-write consistency. Which might be fine for certain use cases and even desirable for resilience, but can impact application design quite a lot.<p>With SQLite you have to do a lot more in your own code because the database gives you fewer tools. Which is usually fine because most usage is "single writer, single or a few local readers". Moving that to a distributed setting with multiple deployed versions of code is not without difficulty.<p>This seems to be mitigated/solved here though by the ability to run worker code "next to the database".<p>I'm somewhat surprised they went this route. It probably makes sense given the constraints of Cloudflare's architecture and the complexity of running a more advanced globally distributed database.<p>On the upside: hopefully this usage in domains that are somewhat unusual can lead to funding for more upstream SQLite features.
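For readers unfamiliar with the "rewriting a whole table and swapping it out" ceremony mentioned above, this is roughly what it looks like, expressed here as a hypothetical D1-style batch (the batch/prepare API is an assumption; the SQL is the standard SQLite workaround for unsupported ALTER TABLE operations, abbreviated):

```typescript
// Sketch of the "rewrite and swap" migration ceremony, e.g. for changing a
// column type on `users`. The D1 batch/prepare API shown is an assumption.
async function migrateUsersTable(db: D1Database): Promise<void> {
  await db.batch([
    // 1. Create the new shape of the table under a temporary name.
    db.prepare(
      "CREATE TABLE users_new (id INTEGER PRIMARY KEY, email TEXT NOT NULL, age INTEGER)"
    ),
    // 2. Copy everything across, casting/transforming as needed.
    db.prepare("INSERT INTO users_new SELECT id, email, CAST(age AS INTEGER) FROM users"),
    // 3. Swap the tables and recreate any indexes the old table had.
    db.prepare("DROP TABLE users"),
    db.prepare("ALTER TABLE users_new RENAME TO users"),
    db.prepare("CREATE INDEX idx_users_email ON users (email)"),
  ]);
}
```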
Not clear from reading the post if the SQLite C library is embedded and linked in the Worker runtime (which would mean no network roundtrip) or if each query or batch of queries is converted to a network request to a server embedding the SQLite C library.<p>That's important to understand because that's one of the key advantages of SQLite compared to the usual client-server architecture of databases like PostgreSQL or MySQL:
<a href="https://www.sqlite.org/np1queryprob.html" rel="nofollow">https://www.sqlite.org/np1queryprob.html</a>
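To make the parent's point concrete: with an embedded library, the classic N+1 pattern below is cheap because each statement is a local function call; if every statement instead becomes a network request to a remote server, the same loop pays fifty round trips of latency. (The query interface shown is assumed from the binding-style API in the announcement.)

```typescript
// N+1 pattern: cheap when SQLite is linked into the runtime (each query is a
// local function call), painful if every statement becomes a network round trip.
async function ordersWithItems(db: D1Database): Promise<void> {
  const orders = (
    await db.prepare("SELECT id, customer FROM orders LIMIT 50").all()
  ).results as { id: number; customer: string }[];

  // One extra query per order: 50 local calls, or 50 network round trips.
  for (const order of orders) {
    const items = await db
      .prepare("SELECT product, qty FROM order_items WHERE order_id = ?")
      .bind(order.id)
      .all();
    console.log(order.customer, items.results.length);
  }
}
```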
This is really interesting; it's (basing it on SQLite) exactly what I was expecting CloudFlare to do for their first DB.<p>It's perfect for content-type sites that want search and querying.<p>Anyone from CF here, is it using Litestream (<a href="https://litestream.io" rel="nofollow">https://litestream.io</a>) for its replication, or have you built your own replication system?<p>I assume this first version is somewhat limited on write performance, having a single "main" instance and SQLite lacking concurrent writes? It seems to me that using SQLite sessions[0] would be a good way to build an eventually consistent replication system for SQLite; it would be perfect for an edge-first SQL database, maybe D2?<p>0: <a href="https://www.sqlite.org/sessionintro.html" rel="nofollow">https://www.sqlite.org/sessionintro.html</a>
Have any of the problems that led people to use Postgres instead of SQLite actually been solved? Are we doomed to repeat the same mistakes?<p>Also, any plans to support PATCH x-update-range so SQLite can be used entirely in the browser via SQLite.js?<p>Can someone enlighten me with the types of use cases this would be better for vs say Postgres?
To the person from Cloudflare I complained to in last year's thread about putting your money where your mouth is on serverless databases:<p>You weren't lying, and this is super cool - the SQLite hype train also seems to be in full force.
I'm buying Cloudflare stock right now.<p>Two or three years from now, these services will be so mature and strong that they will be crushing the cloud market.<p>They're turning dreams into reality, one after another.
If SQLite gets you excited, I'm building a firebase alternative based on sqlite.
I'm betting hard on SQLite, so this gets me super excited!!<p><a href="https://javascriptdb.com" rel="nofollow">https://javascriptdb.com</a><p>CF people around, I would love to chat; if anyone is interested please reach out at: jp@javascriptdb.com<p>I'll be applying to this beta for sure!
Any current or planned support for existing ORMs, such as Prisma or TypeOrm?<p>Also, I wonder how hard it will be to migrate existing PostgreSQL databases and SQL statements. Of course, I understand if Cloudflare is focused on greenfield applications.
This is so cool!<p>From the blog post it says read-only replicas are created close to users and kept up to date with the latest data.<p>- How should I think about this in terms of CAP? If there's a write and I query a replica what happens?<p>- How are writes handled? Do they go to a single location or are they handled by various locations?<p>I'm excited to try this. It's so cool to see databases being distributed "on CDNs" for lack of a better term.
<i>"With D1, it will be possible to define a chunk of your Worker code that runs directly next to the database...each request first hits your Worker near your users, but depending on the operation, can hand off to another Worker deployed alongside a replica or your primary D1 instance to complete its work."</i><p>That's interesting to me. It opens the door for Cloudflare to offer something more like a "normal" serverless offering. One that can run containers, or least natively run Python/Golang/Java/etc, like AWS Lambda does. And with this ecosystem described above that can conditionally route between the lighter edge Workers and the heavier central serverless functions. To me, that's the tipping point where they start to threaten larger portions of AWS.
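A minimal sketch of what that split could look like from the application's side, assuming nothing about how Cloudflare actually implements the handoff; the "near the primary" endpoint here is entirely hypothetical:

```typescript
// Sketch of the split described in the quote. The routing mechanism is not
// described in the post; here the handoff is faked with an ordinary fetch to a
// hypothetical endpoint that runs co-located with the primary D1 instance.
export default {
  async fetch(request: Request, env: { DB_REGION_URL: string }): Promise<Response> {
    const url = new URL(request.url);

    if (request.method === "GET") {
      // Cheap, read-only work stays at the edge, near the user.
      return new Response("served from the edge");
    }

    // Chatty or write-heavy work is forwarded to code running next to the
    // primary, so its many queries happen over a local hop rather than the WAN.
    return fetch(new Request(`${env.DB_REGION_URL}${url.pathname}${url.search}`, request));
  },
};
```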
Big fan of Cloudflare but I wish they would stick to descriptive product names.<p>Good: Workers, KV, Durable Objects, Cron Triggers<p>Bad: Spectrum, Zaraz, R2, D1
The API for this is currently the only thing I wish I could grok a bit better. It seems like it would be hard to make it work with existing libraries that can access SQLite, which is kind of a shame.<p>I'm thinking of sqlx in Rust (or any other language binding / ORM for that matter), which has compile time schema safety. This is a nice capability, and because this interface seems non-standard (possibly for good reason), I guess we are being asked to give some of those things up.<p>I am getting a bit ahead of myself on the Rust part (presumably that will eventually be supported as part of workers-rs), but I think the feelings still stand if you consider the JS ecosystem.<p>Edit: I may actually be wrong, but presumably the entire surface isn't covered because there's no file opening, etc.
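For context, this is roughly the shape the binding-style API appears to take from a Worker (method names like first() are assumptions). Notably there is no file path or connection string to open, which is why existing SQLite drivers and ORMs can't simply be pointed at it:

```typescript
// Roughly the shape the binding-style API appears to take from a Worker; the
// method names (prepare/bind/first) are assumptions. There is no file path or
// connection string to open, so existing SQLite drivers and ORMs cannot
// simply be pointed at it unchanged.
export default {
  async fetch(_req: Request, env: { DB: D1Database }): Promise<Response> {
    const user = await env.DB
      .prepare("SELECT id, name FROM users WHERE id = ?")
      .bind(42)
      .first();
    return Response.json(user ?? { error: "not found" });
  },
};
```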
Best Effort Writes[1] are an opportunity here. Non-transactional: write to the local replica (enforcing foreign keys, constraints, valid data, etc...) and then <i>try</i> to write to the main write-enabled DB. Caching should work without changes since the local replica is updated. This could be cheaper (send binary diffs) and more resilient to brief network issues.<p>The key is to let the user decide what really needs ACID and what doesn't. If someone wants to make the next Facebook or Reddit they'll need huge write throughput, and if some votes or updates are lost, that may be a good trade-off.<p>[1] You could add a BEW file (like the WAL file) to SQLite for Best Effort Writes.
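A hypothetical sketch of how that could look to application code, with the local replica and the primary treated as two separate handles purely for illustration (nothing like this is in D1 as announced):

```typescript
// Hypothetical best-effort write: validate and apply against the local replica
// first, then try the authoritative primary, accepting that the write may be
// lost if the primary is briefly unreachable. The local/primary handles and
// the D1 query API are assumptions for illustration only.
async function bestEffortVote(
  local: D1Database,
  primary: D1Database,
  postId: number
): Promise<"confirmed" | "best-effort"> {
  const stmt = "UPDATE posts SET votes = votes + 1 WHERE id = ?";

  // Local replica is updated immediately, so local reads and caches see the vote.
  await local.prepare(stmt).bind(postId).run();

  try {
    // Then forward to the write-enabled primary (the authoritative copy).
    await primary.prepare(stmt).bind(postId).run();
    return "confirmed";
  } catch {
    // Brief network issue: the vote may be lost or reconciled later, which can
    // be an acceptable trade-off for counters and votes.
    return "best-effort";
  }
}
```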
All this hype around SQLite recently, and I am still confused.<p>* How do you replicate it consistently?<p>* Who has the master privilege (or masters, if sharded)? What's the failover story?<p>I am guessing a blob store is involved, but I have gaps in my understanding here.
Not an expert on the DOM or JavaScript, so be kind ;)<p>One thing I hope to see in the future is a better product-filtering experience. When I worked on a jQuery product filter, I realized DOM bloat was the main problem.<p>I wonder if D1 can help devs build instant product-filtering pages that don't require a full page reload the way Micro Center or Newegg do.<p>E.g. <a href="https://www.newegg.com/p/pl?d=hdmi+cable&N=-1&SortType=8" rel="nofollow">https://www.newegg.com/p/pl?d=hdmi+cable&N=-1&SortType=8</a>
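Something like the sketch below is presumably what D1 would enable here: a Worker endpoint that answers filter queries with JSON so the page only re-renders the product list. Table names, columns, and the query API are all assumptions:

```typescript
// Sketch of a filter endpoint: the browser fetches JSON for the current
// filters and swaps out one list in the DOM, with no full page reload.
// Table/column names and the D1 query API are assumptions.
export default {
  async fetch(request: Request, env: { DB: D1Database }): Promise<Response> {
    const params = new URL(request.url).searchParams;
    const category = params.get("category") ?? "%";
    const maxPrice = Number(params.get("max_price") ?? 999999);

    const { results } = await env.DB
      .prepare(
        "SELECT id, name, price FROM products WHERE category LIKE ? AND price <= ? ORDER BY price LIMIT 50"
      )
      .bind(category, maxPrice)
      .all();

    // The front end re-renders only the product list from this JSON.
    return Response.json(results);
  },
};
```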
What write throughput and latency can we expect from this database?<p>Are there any limitations, for example on the number of tables or size of the database?
This is convenient; I've been building an app that uses SQLite, but I want to deploy it to Cloudflare Pages. I expected I was going to have to switch to a hosted Postgres instance somewhere, but this could be perfect.
Unless I missed it by skimming, where are the deets? Is this strongly or eventually consistent? What are max table sizes, and do they become partitioned? Are there cross partition joins?
I was expecting this to be using <a href="https://en.wikipedia.org/wiki/D_(data_language_specification)" rel="nofollow">https://en.wikipedia.org/wiki/D_(data_language_specification...</a> given the name.
First, super excited to have Cloudflare offer an RDBMS (can SQLite be called that?).<p>This enables entirely new classes of applications where everything can now be hosted by Cloudflare.<p>Questions:<p>a. To help with concurrent writes, will Cloudflare be using the WAL2 and BEGIN CONCURRENT branches of SQLite?<p>b. How is Cloudflare replicating the data cross-region? Will it be Litestream.io behind the scenes?<p>c. Will our Worker code need to be written differently to ensure only a single writer is writing to the SQLite database?<p>d. How do data persistence and database file size get factored in? I have to imagine there is a limit to how much storage can be used, whether or not that storage is local to the Worker machine, and whether it's persistent.
Now is <i>this</i> a Cloudflare ($NET) buy signal? I think you know the answer.<p>Maybe they will announce a Hashicorp competitor in their next reveal. Who knows.