TechEcho

Building CockroachDB on top of RocksDB

180 points by bandwitch over 6 years ago

8 comments

tschellenbach over 6 years ago
Our in-house DB at Stream also runs on top of RocksDB + Raft. It's amazing just how much faster it is than anything else out there (especially compared to Cassandra). Instagram uses RocksDB as the storage engine for Cassandra, and LinkedIn and Pinterest use RocksDB as well. Once you have the time to build your own DB on RocksDB, you get really fine-grained control over performance.

https://stackshare.io/stream/stream-and-go-news-feeds-for-over-300-million-end-users
peterwwillis over 6 years ago
RocksDB is a fork of LevelDB, which was [in]famous for its ease of corrupting data. Did Facebook ever do anything to ensure data wouldn't corrupt, or is that still a common thing operationally? (You find it more at larger scales.)

Here's an example of how data corruption can suck, with (for example) Riak and LevelDB. The LevelDB data would corrupt often, which would leave you in a predicament. Say you had 10 nodes with a replication factor of 3, and the whole cluster is humming away at a decent clip. Now one node's LevelDB corrupts, and you have to rebuild it. If you have a huge fuckoff dataset, this can take a while. Now another node goes down. Now only 1 node has the data you need, and 2 nodes are down - so now 8 nodes are doing the work of 10, and if you have any more failures, your data might be gone. Now add replication, which will suck performance and bandwidth away from the regular work. And because it would corrupt so easily & often, there needed to be hash trees to quickly identify what data was corrupt, and then you needed to fix it and rebuild your hash trees. This would also suck away performance. Finally, you can't just add new nodes while rebuilding, because the extra load makes the cluster fall over. And the more nodes, the higher the likelihood of failures.
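
To put rough numbers on that scenario, here is a minimal sketch. It assumes a simple ring placement where each range of data lives on three consecutive nodes; that placement is an assumption for illustration, not Riak's actual algorithm, and the function names are made up.

```python
def replica_sets(num_nodes, rf):
    # Assumed placement: range i is stored on nodes i, i+1, ..., i+rf-1 (mod num_nodes).
    return [frozenset((i + k) % num_nodes for k in range(rf)) for i in range(num_nodes)]

def live_replicas(num_nodes, rf, down):
    # Number of surviving copies of each range once the nodes in `down` are offline.
    down = frozenset(down)
    return [len(nodes - down) for nodes in replica_sets(num_nodes, rf)]

def load_factor(num_nodes, down):
    # Relative work per surviving node if traffic is spread evenly.
    return num_nodes / (num_nodes - len(set(down)))

print(live_replicas(10, 3, {0, 1}))          # [1, 2, 3, 3, 3, 3, 3, 3, 2, 1]
print(load_factor(10, {0, 1}))               # 1.25
print(min(live_replicas(10, 3, {0, 1, 2})))  # 0
```

With two adjacent nodes down, two ranges are left with a single copy and every survivor carries 1.25x its normal load before any rebuild traffic is counted. One more adjacent failure takes the last copy of a range offline.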
polskibus over 6 years ago
I noticed that RocksDB is used very often in OLTP scenarios. What is the OLAP world's equivalent of what RocksDB is for OLTP? Apache Parquet? Apache Arrow? What would you use these days to create a high-performance OLAP/OLHybridP engine?
the_duke over 6 years ago
Excellent article, very informative.

I just had to chuckle at this:

> Non-engineers: in a computer, a move is always implemented as a copy followed by a delete

Yeah, that's really gonna help a non-developer understand the article better...
perfmode over 6 years ago
When a SQL implementation is built on a KV storage engine, how do tables, rows, and columns typically map to the underlying KV data model?
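
For concreteness, one layout commonly described for this is a key of the form /table/index/primary-key/column pointing at the column's value. Below is a simplified sketch of that idea. The string keys, the "primary" index name, and the helper names are assumptions for illustration; real systems use order-preserving binary encodings and numeric IDs, and often pack several columns into one value.

```python
def encode_row(table, pk, row):
    # One KV pair per column: /<table>/primary/<pk>/<column> -> value.
    return {f"/{table}/primary/{pk}/{col}": val for col, val in row.items()}

def read_row(store, table, pk):
    # A real engine would do an ordered range scan over the prefix;
    # a plain dict has no key ordering, so this filters every key instead.
    prefix = f"/{table}/primary/{pk}/"
    return {k[len(prefix):]: v for k, v in store.items() if k.startswith(prefix)}

store = {}
store.update(encode_row("users", 1, {"name": "ann", "age": 30}))
store.update(encode_row("users", 2, {"name": "bob", "age": 41}))

print(read_row(store, "users", 1))  # {'name': 'ann', 'age': 30}
```

Because all of a row's keys share a prefix, reading a row becomes a prefix scan. In a sorted store such as an LSM tree those keys sit next to each other.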
dominotw over 6 years ago
> If you surveyed most NewSQL databases today, most of them are built on top of an LSM, namely, RocksDB.

Is this actually true?

Spark, FoundationDB, MemSQL, NuoDB, Citus: I am not sure any of these are built on top of RocksDB.

Which ones are actually built on an LSM?
kureikain over 6 years ago
If you love LevelDB/RocksDB but want to use a pure-Go implementation, I have good things to say about this library:

https://github.com/syndtr/goleveldb
ddorian43 over 6 years ago
Seems like it has only a few features: 1. SSTables as separate files 2. range deletes (which are rare)

compared to LMDB (which is faster & more efficient): https://symas.com/lmdb/technical/

Still, it would be nice to see how LMDB would fare in a complex distributed DBMS (most of them are built on RocksDB-type libraries).

But LMDB is supposed to stay small, so more features are in a fork: https://github.com/leo-yuriev/libmdbx