Writing a Time Series Database from Scratch (2017)

73 points by potomak, nearly 7 years ago

4 comments

rdtsc, nearly 7 years ago
After trying InfluxDB and finding it didn't work well for my use case, I've written a TSDB as well. I suspect that's the joke around these parts: everyone and their cousin has written a TSDB.

Used time-based blocks just like the article says. But they also switched if they got larger than 2GB, because that allowed the index offset to be 4 bytes. That allowed pretty fast searching as well as discarding old data.

The data and index were both written to separate files. Some of the values were not just simple integers/floats but large blobs, which is why the data files were separate.

Decided not to go explicitly with mmap-ing on read. Started that way but abandoned it. A simple pread (read + seek in one syscall) and relying on page caching worked just as well.

Should have used an inverted index for labels, that's a good idea. Maybe just sqlite or something like that... Though my queries were not as free-form and some were more common than others, so I built custom indices for the common queries.

Index files were also partitioned by time but were synchronized to the same partitioning schedule as the main data file. So if the main data file opened a new block file, the indices did the same.

Didn't explicitly handle in-memory vs. not-in-memory blocks when writing, but relied on periodic fsync-ing.

Instead of a WAL, decided that some data that wasn't fsync-ed yet might be lost if power was cut. To deal with corruption on start, data in the main file and extra indices were truncated to the last sane fsync of the main time-based index.
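
The block layout described in the comment above (time-based blocks that also roll over at 2GB so an index offset fits in 4 bytes, separate data and index files, reads via pread) might look roughly like the Python sketch below. The class name, file naming, index entry layout, and two-hour window are all assumptions made for illustration, not details from the comment. It also assumes timestamps arrive in increasing order and a Unix-like system, since `os.pread` is not available on Windows.

```python
import os
import struct

# Assumed index entry layout: timestamp (int64), offset (uint32), length (uint32).
INDEX_ENTRY = struct.Struct("<qII")
MAX_BLOCK_BYTES = 2**31            # roll over at 2 GiB so offsets fit in 4 bytes
BLOCK_SPAN_SECONDS = 2 * 60 * 60   # assumed time window per block


class BlockWriter:
    """Appends blobs to time-partitioned data files, each with a parallel index file."""

    def __init__(self, directory, max_bytes=MAX_BLOCK_BYTES, span=BLOCK_SPAN_SECONDS):
        self.directory = directory
        self.max_bytes = max_bytes
        self.span = span
        self.block_start = None
        self.seq = 0          # keeps file names unique when a window is split by size
        self.base = None
        self.data = None
        self.index = None
        self.size = 0

    def _open_block(self, ts):
        # Data and index roll over together, so they share one partitioning schedule.
        self.close()
        self.block_start = ts - ts % self.span
        self.seq += 1
        self.base = os.path.join(self.directory, f"{self.block_start}-{self.seq}")
        self.data = open(self.base + ".data", "ab")
        self.index = open(self.base + ".idx", "ab")
        self.size = 0

    def append(self, ts, blob):
        """Write one blob and return the base path of the block it landed in."""
        need_new = (
            self.data is None
            or ts >= self.block_start + self.span
            or self.size + len(blob) > self.max_bytes
        )
        if need_new:
            self._open_block(ts)
        self.data.write(blob)
        self.index.write(INDEX_ENTRY.pack(ts, self.size, len(blob)))
        self.size += len(blob)
        return self.base

    def close(self):
        # Flush and fsync both files; no WAL, so unsynced data can be lost on power cut.
        for f in (self.data, self.index):
            if f is not None:
                f.flush()
                os.fsync(f.fileno())
                f.close()
        self.data = self.index = None


def read_blob(base, n):
    """Fetch the n-th blob of a block using positioned reads (no separate seek)."""
    idx_fd = os.open(base + ".idx", os.O_RDONLY)
    data_fd = os.open(base + ".data", os.O_RDONLY)
    try:
        raw = os.pread(idx_fd, INDEX_ENTRY.size, n * INDEX_ENTRY.size)
        ts, offset, length = INDEX_ENTRY.unpack(raw)
        return ts, os.pread(data_fd, length, offset)
    finally:
        os.close(idx_fd)
        os.close(data_fd)
```

Because every index entry has the same size and entries are written in timestamp order, a reader could binary-search the index file by timestamp, and discarding old data amounts to deleting the oldest block's two files. The crash recovery step (truncating to the last sane fsync) is not shown here.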
ahmedalsudani, nearly 7 years ago
Previous discussion: https://news.ycombinator.com/item?id=14177411
latchkey, nearly 7 years ago
When I read this blog post, it convinced me to use Prometheus / Grafana. I'm now monitoring over 1000 Pi-class boxes using a small custom golang agent running on each one. Things have been rock solid. Couldn't be happier with this solution. Thanks for all the hard effort.
marknadal, nearly 7 years ago
This is awesome :) Having written a database myself, I very much encourage others to try it out for fun, learning, and maybe seriously too!