Databases at 14.4Mhz

297 points by chton over 10 years ago

15 comments

ChuckMcM over 10 years ago
This is FoundationDB's announcement that they are doing full ACID databases with a capability of 14.4M writes per second. That is insanely fast in the database world. It runs in AWS on 32 c3.8xlarge machines, so basically NSA-level database capability for $150/hr. But perhaps more interesting is that those same machines cost about $225,000 on the open market. That's two racks, one switch, and a transaction rate that lets you watch every purchase made at every Walmart store in the US, in real time. That is assuming the stats are correct [1], and it wouldn't even be sweating (14M customers a day vs 14M transactions per second). Insanely fast.

I wish I was an investor in them.

[1] http://www.statisticbrain.com/wal-mart-company-statistics/
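
As a rough check on the Walmart comparison, here is a small sketch that uses only the two figures quoted above (about 14 million customers a day, 14.4 million writes per second). It assumes one write per customer visit and traffic spread evenly over 24 hours; both are simplifications made purely for illustration.

```python
# Figures quoted in the comment above.
CUSTOMERS_PER_DAY = 14_000_000
WRITES_PER_SECOND = 14_400_000

SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: one write per customer visit, spread evenly over the day.
avg_customers_per_second = CUSTOMERS_PER_DAY / SECONDS_PER_DAY

# How many times the quoted write rate exceeds that average load.
headroom = WRITES_PER_SECOND / avg_customers_per_second

print(f"average customers per second: {avg_customers_per_second:.0f}")
print(f"headroom factor: {headroom:,.0f}x")
```

Under those assumptions the average load is on the order of a couple of hundred purchases per second, so the quoted write rate leaves several orders of magnitude of headroom even if each purchase needed many writes.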
hendzen over 10 years ago
This is very impressive, however...

See this tweet by @aphyr: https://twitter.com/aphyr/status/542755074380791809

(All credit for the idea in this comment is due to @aphyr.)

Basically, because the transactions modified keys selected from a uniform distribution, the probability of contention was extremely low. In other words, this workload is basically a data-parallel problem, which somewhat lessens the impressiveness of the high throughput. It would be interesting to see it with a Zipfian distribution (or even better, a Biebermark [0]).

[0] http://smalldatum.blogspot.co.il/2014/04/biebermarks.html
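
To make the uniform-versus-Zipfian point concrete, here is a toy simulation. It is only a sketch under stated assumptions, not a model of how FoundationDB actually detects conflicts: each "transaction" touches a single key, a batch of transactions is treated as concurrent, and two transactions "contend" if they pick the same key. The key count, batch size, trial count, and Zipf exponent of 1 are all arbitrary illustrative choices.

```python
import random
from collections import Counter
from itertools import accumulate

NUM_KEYS = 100_000    # hypothetical keyspace size
BATCH_SIZE = 100      # hypothetical number of concurrent transactions
TRIALS = 200          # batches to average over


def conflict_fraction(cum_weights, rng):
    """Fraction of transactions in one batch whose key is also
    picked by at least one other transaction in the same batch."""
    keys = rng.choices(range(NUM_KEYS), cum_weights=cum_weights, k=BATCH_SIZE)
    counts = Counter(keys)
    conflicted = sum(c for c in counts.values() if c > 1)
    return conflicted / BATCH_SIZE


def average_conflict_fraction(cum_weights, rng):
    """Average the per-batch conflict fraction over TRIALS batches."""
    total = sum(conflict_fraction(cum_weights, rng) for _ in range(TRIALS))
    return total / TRIALS


rng = random.Random(42)  # fixed seed so runs are repeatable

# Uniform: every key equally likely.
uniform_cum = list(accumulate([1.0] * NUM_KEYS))
# Zipfian with exponent 1: weight of the key at rank r is 1/r.
zipf_cum = list(accumulate(1.0 / rank for rank in range(1, NUM_KEYS + 1)))

uniform_rate = average_conflict_fraction(uniform_cum, rng)
zipf_rate = average_conflict_fraction(zipf_cum, rng)

print(f"uniform keys: {uniform_rate:.4f} of transactions contend")
print(f"zipfian keys: {zipf_rate:.4f} of transactions contend")
```

With uniform keys almost no transactions collide, while the skewed distribution concentrates many of them on a handful of hot keys, which is the situation a benchmark with uniformly chosen keys never exercises.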
jrallison over 10 years ago
We've been using FoundationDB in production for about 10 months now. It's really been a game changer for us.

We continue to use it for more and more data access patterns which require strong consistency guarantees.

We currently store ~2 terabytes of data in a 12 node FDB cluster. It's rock solid and comes out of the box with great tooling.

Excited about this release! My only regret is I didn't find it sooner :)
bsaul over 10 years ago
Just watched the linked presentation about "flow" here: https://foundationdb.com/videos/testing-distributed-systems-with-deterministic-simulation

Is it really the first distributed DB project to have built a simulator?

Because frankly, if that's the case, it seems revolutionary to me. Intuitively, it seems to bring the same kind of quality improvement that unit testing brought to regular software development.

PS: I should add that this talk is one of the best I've seen this year. The guy is extremely smart, passionate, and clear. (I just loved the Hurst exponent part.)
dchichkov over 10 years ago
I'm only familiar with other key-value storage engines, not FoundationDB, but it seems like the goals are: "distributed key-value database, read latencies below 500 microseconds, ACID, scalability".

I remember evaluating a few low-latency key-value storage solutions, and one of these was Stanford's RAMCloud, which is supposed to give 4-5 microsecond reads and 15 microsecond writes, scale up to 10,000 boxes, and provide data durability. https://ramcloud.atlassian.net/wiki/display/RAM/RAMCloud Seems like that would be "Databases at *2000Mhz*".

I've actually studied the code that handles the network, and it is written pretty nicely; as far as I know, it should work over both 10GbE and InfiniBand with similar latencies. And I'm not at all surprised that they could get a pretty clean-looking 4-5us latency distribution with code like that.

How does it compare with FoundationDB? Is it a completely different technology?
felixgallo over 10 years ago
This looks very interesting, and congratulations to the FoundationDB crew on some pretty amazing performance numbers.

One of the links leads to an interesting C++ actor preprocessor called 'Flow'. A table there lists the result of sending a message around a ring for a given number of processes and a given number of messages; Flow appears to be fastest, at 0.075 sec for N=1000 and M=1000, compared with, e.g., Erlang at 1.09 seconds.

My curiosity was piqued, so I threw together a quick microbenchmark in Erlang. On a moderately loaded 2013 MacBook Air (2-core i7) with Erlang 17.1, over 1000 iterations of M=1000 and N=1000, it averaged 34 microseconds per run, which compares pretty favorably with Flow's claimed 75000 microseconds. The Flow paper appears to maybe be from 2010, so it would be interesting to know how it's doing in 2014.
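
For readers unfamiliar with the ring benchmark described above, here is a minimal sketch of its shape. It is neither the Flow nor the Erlang code; it uses Python's asyncio tasks and queues as stand-ins for actors and mailboxes, and the names `forwarder` and `run_ring` are made up for this illustration. Timings from it say nothing about either system; it only shows what "send a message around a ring of N processes, M times" means.

```python
import asyncio
import time


async def forwarder(inbox, outbox):
    """One ring node: pass each token to the next node, counting the hop."""
    while True:
        hops = await inbox.get()
        if hops is None:            # shutdown marker: pass it on and exit
            await outbox.put(None)
            return
        await outbox.put(hops + 1)


async def run_ring(n_nodes, n_laps):
    """Send a token around a ring of n_nodes (>= 2) for n_laps laps.

    Node 0 is the driver; nodes 1..n_nodes-1 are forwarders.
    Returns the total number of hops the token made.
    """
    queues = [asyncio.Queue() for _ in range(n_nodes)]
    tasks = [
        asyncio.create_task(forwarder(queues[i], queues[(i + 1) % n_nodes]))
        for i in range(1, n_nodes)
    ]
    total_hops = 0
    for _ in range(n_laps):
        await queues[1].put(1)                  # driver's send is hop 1
        total_hops += await queues[0].get()     # token returns after n_nodes hops
    await queues[1].put(None)                   # tell the forwarders to stop
    await queues[0].get()                       # wait for the marker to come back
    await asyncio.gather(*tasks)
    return total_hops


# Small sizes so the sketch runs quickly; the comment above uses N=1000, M=1000.
N, M = 100, 100
start = time.perf_counter()
hops = asyncio.run(run_ring(N, M))
elapsed = time.perf_counter() - start
print(f"{hops} hops in {elapsed:.4f} s")
```

Each lap costs N message sends, so a full run is N times M hops; comparing implementations only makes sense when they count the same thing (one lap, one run, or the average over many runs).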
shortstuffsushi over 10 years ago
As someone who has no idea about the cost of high-scale computing like this, is $150/hr reasonable? It seems like an amount that's hard to sustain to me, but I have no idea if that's a steady, all-the-time rate, or a burst rate, or what. Or if it's a setup you'd actually ever even need: judging from the examples they mention (like the tweets), they're above the need by a fair amount. Anyone else in this sort of situation care to chip in on that?
w8rbt over 10 years ago
I thought it was DB connections over radio waves just above 20 meters. Also, it's MHz, not Mhz.
maliki over 10 years ago
This sounds great compared to my anecdotal experience with DB write performance, but is there a collection of database performance benchmarks that this can be easily compared to?

The best source for DB benchmarking I know of is http://www.tpc.org/. The methodology is more complicated there, but the top results are around 8 million transactions per minute on $5 million systems. This FoundationDB result is more like 900 million transactions per minute on a system that costs $1.5 million a year to rent (so, approx. $5 million to buy?).

The USD/transactions-per-minute metric is clear, but without a standard test suite (schema, queries, client count, etc.), comparing claims of database performance makes my head hurt.
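
The rounded figures above can be reproduced from numbers quoted in this thread. The sketch below assumes the 14.4 million writes per second and the $150/hr rental rate mentioned in other comments, and a cluster rented around the clock for 365 days. It treats a FoundationDB write and a TPC transaction as the same unit only for the sake of the arithmetic, which, as the comment notes, they are not.

```python
# Figures quoted elsewhere in this thread.
WRITES_PER_SECOND = 14_400_000
RENT_USD_PER_HOUR = 150

# Figure quoted in the comment above.
TPC_TOP_TRANSACTIONS_PER_MINUTE = 8_000_000

writes_per_minute = WRITES_PER_SECOND * 60
# Assumption: the cluster is rented continuously for a 365-day year.
rent_usd_per_year = RENT_USD_PER_HOUR * 24 * 365
ratio_to_tpc = writes_per_minute / TPC_TOP_TRANSACTIONS_PER_MINUTE

print(f"writes per minute: {writes_per_minute:,}")
print(f"rent per year: ${rent_usd_per_year:,}")
print(f"ratio to top TPC figure: {ratio_to_tpc:.0f}x")
```

That works out to 864 million writes per minute and about $1.3 million a year, consistent with the rounded "900 million" and "$1.5 million" above.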
illumen over 10 years ago
Impressive.

However, I think there's still plenty of room to grow.

320,000 concurrent sessions isn't that much by modern standards. You can get 12 million concurrent connections on one Linux machine, and push 1 gigabit of data.

Also, 167 megabytes per second (116B * 14.4 million) is not pushing the limits of what one machine can do. I've been able to process 680 megabytes per second of data into a custom video database, plus write it to disk, on one 2010 machine. That's while doing heavy processing on the video at the same time, with plenty of CPU to spare.

PCIe over fibre can do many transaction messages per second. You can fit 2TB memory machines in 1U (and more).

Since this is a memory + eventually dump to disk database, I think there is still a lot of room to grow.
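
Multiplying out the two figures in the parenthesis above is worth doing explicitly. Taken at face value, 116 bytes times 14.4 million writes per second comes to about 1,670 MB/s for the whole cluster, roughly ten times the 167 MB/s stated. The per-machine split below uses the 32-machine cluster size quoted elsewhere in this thread and assumes the load is spread evenly, which is an assumption made for illustration.

```python
BYTES_PER_WRITE = 116            # "116B" from the comment above
WRITES_PER_SECOND = 14_400_000   # 14.4 million
MACHINES = 32                    # cluster size quoted elsewhere in the thread

bytes_per_second = BYTES_PER_WRITE * WRITES_PER_SECOND
megabytes_per_second = bytes_per_second / 1_000_000   # decimal megabytes
# Assumption: writes are spread evenly across all machines.
per_machine_mb = megabytes_per_second / MACHINES

print(f"cluster-wide: {megabytes_per_second:,.1f} MB/s")
print(f"per machine:  {per_machine_mb:.1f} MB/s")
```

So the cluster-wide payload rate is higher than a single 680 MB/s machine would carry, while the per-machine rate of roughly 52 MB/s is well below it.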
mariusz79 over 10 years ago
MHz not Mhz.
tuyguntn over 10 years ago
I wish I could work with such great professionals as a junior for 10 years, right after school!!!
oconnor663 over 10 years ago
Any chance of an open source release for the database core? :)
lttlrck over 10 years ago
"Or, as I like to say, 14.4Mhz."

Sorry, I don't like that at all.
imanaccount247 over 10 years ago
Why the deliberately misleading comparisons? If you are doing something genuinely impressive, then you should be able to be honest about it and have it still seem impressive. One tweet is not one write. Comparing tweets per second to writes per second is complete nonsense. How many writes a tweet causes depends on how many followers the person who is tweeting has. The 100 writes per second nonsense is even worse. Do you just think nobody is old enough to have used a database 15 years ago? 10,000 writes per second was no big deal on off-the-shelf hardware of the day, never mind on an actual server.