
Thread-Per-Core Buffer Management for a modern storage system

118 points by arjunnarayan, over 4 years ago

7 comments

bob1029, over 4 years ago
More threads (i.e. shared state) is a huge mistake if you are trying to maintain a storage subsystem with synchronous access semantics.

I am starting to think you can handle all storage requests for a single logical node on just one core/thread. I have been pushing 5-10 million JSON-serialized entities to disk per second with a single managed thread in .NET Core (using a Samsung 970 Pro for testing). This *includes* indexing and sequential integer key assignment. This testing will completely saturate the drive (over 1 gigabyte per second steady-state). Just getting a 64-bit integer incremented more than a million times per second across an arbitrary number of threads is a big ask. This is the difference you can see when you double down on a single-threaded ideology for this type of problem domain.

The technical trick to my success is to run all of the database operations in micro-batches (10-1000 microseconds each). I use LMAX Disruptor, so the batches form naturally based on throughput conditions. Selecting data structures and algorithms that work well in this type of setup is critical. Append-only is a must with flash and makes an orders-of-magnitude difference in performance. Everything else (b-tree algorithms, etc.) follows from this realization.

Put another way, if you find yourself using Task or async/await primitives when trying to talk to something as fast as NVMe flash, you need to rethink your approach. The overhead of multiple threads, task-parallel abstractions, et al. is going to cripple any notion of high throughput in a synchronous storage domain.
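A minimal sketch of the micro-batching pattern described above, assuming a single writer thread that owns an append-only log. This is an illustrative Rust translation, not the commenter's .NET/LMAX Disruptor code; the file name `data.log` and the 500-microsecond poll interval are arbitrary assumptions.

```rust
use std::fs::OpenOptions;
use std::io::{BufWriter, Write};
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

fn main() -> std::io::Result<()> {
    // All storage requests funnel into a single writer thread: one owner
    // of the log, no locks, no shared mutable state.
    let (tx, rx) = mpsc::channel::<Vec<u8>>();

    let writer = thread::spawn(move || -> std::io::Result<()> {
        let file = OpenOptions::new()
            .create(true)
            .append(true) // append-only log
            .open("data.log")?;
        let mut out = BufWriter::new(file);
        let mut next_key: u64 = 0; // sequential key assignment, single-threaded

        loop {
            // Wait briefly for the first item, then drain whatever else has
            // already arrived: batch size adapts naturally to throughput.
            let first = match rx.recv_timeout(Duration::from_micros(500)) {
                Ok(buf) => buf,
                Err(RecvTimeoutError::Timeout) => continue,
                Err(RecvTimeoutError::Disconnected) => break,
            };
            let mut batch = vec![first];
            while let Ok(buf) = rx.try_recv() {
                batch.push(buf);
            }

            // One sequential append (plus one flush) per micro-batch.
            for payload in &batch {
                out.write_all(&next_key.to_le_bytes())?;
                out.write_all(&(payload.len() as u32).to_le_bytes())?;
                out.write_all(payload)?;
                next_key += 1;
            }
            out.flush()?;
        }
        out.flush()?;
        Ok(())
    });

    // Producers just hand serialized entities to the writer.
    for i in 0..10_000u32 {
        tx.send(format!("{{\"id\":{}}}", i).into_bytes()).unwrap();
    }
    drop(tx); // close the channel so the writer exits after draining
    writer.join().unwrap()?;
    Ok(())
}
```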
zinclozenge, over 4 years ago
If anybody's interested, there's a Seastar-inspired library for Rust being developed: https://github.com/DataDog/glommio
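As a rough illustration of the thread-per-core idea behind Seastar and glommio (plain std Rust below, not glommio's actual API): state is sharded so each thread owns its partition outright, and requests are routed by key instead of being synchronized with locks. Real implementations also pin each thread to a physical core, which is omitted here.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

// Requests are routed by key so each shard's map is owned by exactly one
// thread: no locks, no shared state.
enum Msg {
    Put(String, String),
    Shutdown,
}

fn main() {
    let shards = 4; // in a real thread-per-core design, one per physical core
    let mut senders = Vec::new();
    let mut handles = Vec::new();

    for shard_id in 0..shards {
        let (tx, rx) = mpsc::channel::<Msg>();
        senders.push(tx);
        handles.push(thread::spawn(move || {
            // This map is never shared: only this thread ever touches it.
            let mut store: HashMap<String, String> = HashMap::new();
            while let Ok(msg) = rx.recv() {
                match msg {
                    Msg::Put(k, v) => {
                        store.insert(k, v);
                    }
                    Msg::Shutdown => break,
                }
            }
            println!("shard {} holds {} keys", shard_id, store.len());
        }));
    }

    // Route each request to a shard (here simply by index; a real system
    // would hash the key).
    for i in 0..1_000u32 {
        let shard = (i as usize) % shards;
        senders[shard]
            .send(Msg::Put(format!("key-{}", i), format!("value-{}", i)))
            .unwrap();
    }

    for tx in &senders {
        tx.send(Msg::Shutdown).unwrap();
    }
    for h in handles {
        h.join().unwrap();
    }
}
```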
lrossi, over 4 years ago
Disappointed to see that you spent 25% of the article describing in detail all the ways in which computer hardware got faster, then promised to show how your project takes advantage of this, but show no performance measurements at all. Just a very fancy architecture.

Correct me if I'm wrong, but the only number I can find is a guarantee that you do not exceed 500 us of latency when handling a request. And it's not clear whether this is a guarantee at all, since you say only that the system will emit a traceback in case of latency spikes.

I would have liked to see how latency varies under load, how much throughput you can achieve, what the latency long tail looks like on a long-running production load, and comparisons with off-the-shelf systems tuned reasonably.
dotnwat, over 4 years ago
Noah here, developer at Vectorized. Happy to answer any questions.
eis, over 4 years ago
I'd be interested in the write amplification, since Redpanda went pretty low-level in the IO layer. How do you guarantee atomic writes when virtually no disk provides guarantees other than at the page level? A failed write to a page could, at least in theory, destroy already-written data, so one has to resort to writing data multiple times.
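One common way to cope with torn page writes in an append-only log, sketched below, is to frame every record with its length and a checksum and to truncate the log at the first bad frame during recovery, so a partially written tail is never mistaken for valid data. This is a generic technique, not necessarily what Redpanda does, and the FNV-1a hash is used only to keep the example dependency-free.

```rust
use std::convert::TryInto;
use std::io::{self, Write};

// FNV-1a, chosen only to avoid external crates; a real log would use
// CRC32C or a similar checksum.
fn fnv1a(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

// Frame layout: [len: u32][checksum: u64][payload]
fn append_record(log: &mut Vec<u8>, payload: &[u8]) {
    log.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    log.extend_from_slice(&fnv1a(payload).to_le_bytes());
    log.extend_from_slice(payload);
}

// Recovery: scan from the start and return the length of the valid prefix;
// everything after the first incomplete or corrupt frame is discarded.
fn valid_prefix_len(log: &[u8]) -> usize {
    let mut off = 0;
    while log.len() - off >= 12 {
        let len = u32::from_le_bytes(log[off..off + 4].try_into().unwrap()) as usize;
        let sum = u64::from_le_bytes(log[off + 4..off + 12].try_into().unwrap());
        let end = off + 12 + len;
        if end > log.len() || fnv1a(&log[off + 12..end]) != sum {
            break; // torn or corrupt tail: stop here
        }
        off = end;
    }
    off
}

fn main() -> io::Result<()> {
    let mut log = Vec::new();
    append_record(&mut log, b"record one");
    append_record(&mut log, b"record two");

    // Simulate a torn write: only part of a third record reaches the disk.
    let mut torn = log.clone();
    append_record(&mut torn, b"record three");
    torn.truncate(log.len() + 10);

    let keep = valid_prefix_len(&torn);
    writeln!(io::stdout(), "recovered {} of {} bytes", keep, torn.len())?;
    assert_eq!(keep, log.len()); // the torn tail is dropped on recovery
    Ok(())
}
```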
mirekrusin, over 4 years ago
What is the point of talking about performance from thread-per-core if Raft sits in front of it, i.e. only one core will do the work at any time anyway?
matthewtovbin, over 4 years ago
@arjunnarayan Have you evaluated the performance against vanilla Kafka / Confluent Cloud? Where can I see the results?