Writing a Time Series Database from Scratch (2017)

73 points by potomak about 7 years ago

4 comments

rdtsc about 7 years ago
After trying InfluxDB and it not working well for my use case, I've written a TSDB as well. I suspect that's the joke around these parts: everyone and their cousin has written a TSDB.

Used time-based blocks just like the article says. But they also switched if they got larger than 2GB, because that allowed the index offset to be 4 bytes. That allowed pretty fast searching as well as discarding old data.

The data and index were written to separate files. Some of the values were not just simple integers/floats but large blobs, which is why the data files were separate.

Decided not to go explicitly with mmap-ing on read. Started that way but abandoned it. A simple pread (read + seek in one syscall) and relying on page caching worked just as well.

Should have used an inverted index for labels, that's a good idea. Maybe just sqlite or something like that... Though my queries were not as free-form and some were more common than others. So I built custom indices for the common queries.

Index files were also partitioned by time but were synchronized to the same partitioning schedule as the main data file. So if the main data file opened a new block file, the indices did the same.

Didn't explicitly handle in-memory vs. not-in-memory blocks when writing, but relied on periodic fsync-ing.

Instead of a WAL, decided that some data that wasn't fsync-ed yet might be lost if power was cut. To deal with corruption on start, data in the main file and extra indices were truncated to the last sane fsync of the main time-based index.
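For readers who want to see the shape of that design, here is a minimal Python sketch of the ideas in the comment above: blocks that roll over on a time window or a 2GB size cap so offsets fit in 4 bytes, reads done with pread, and startup recovery that truncates back to the last usable index entry. It is not rdtsc's code. The class names, the file naming, the index entry layout, and the reading of "sane" as "the entry's data is fully present in the data file" are all assumptions made for illustration. It also assumes a POSIX system (for `os.pread`/`os.pwrite`) and timestamps that increase across block boundaries.

```python
import os
import struct

MAX_BLOCK_BYTES = 2**31  # 2 GiB cap, so every data offset fits in 4 bytes
ENTRY = struct.Struct("<qII")  # hypothetical layout: timestamp, data offset, length


class Block:
    """One time-partitioned block: a data file plus a fixed-width index file."""

    def __init__(self, directory, start_ts):
        base = os.path.join(directory, f"block-{start_ts}")
        self.start_ts = start_ts
        self.data_fd = os.open(base + ".data", os.O_RDWR | os.O_CREAT, 0o644)
        self.index_fd = os.open(base + ".index", os.O_RDWR | os.O_CREAT, 0o644)
        self.recover()

    def _entry(self, n):
        # pread takes the offset directly, so no separate seek is needed.
        return ENTRY.unpack(os.pread(self.index_fd, ENTRY.size, n * ENTRY.size))

    def recover(self):
        """Truncate both files to the last index entry whose data is fully present."""
        data_size = os.fstat(self.data_fd).st_size
        # Integer division ignores a torn, partially written trailing entry.
        count = os.fstat(self.index_fd).st_size // ENTRY.size
        end = 0
        while count > 0:
            _, offset, length = self._entry(count - 1)
            if offset + length <= data_size:
                end = offset + length
                break
            count -= 1  # entry points past the end of the data file: drop it
        os.ftruncate(self.index_fd, count * ENTRY.size)
        os.ftruncate(self.data_fd, end)
        self.count, self.data_size = count, end

    def append(self, ts, payload):
        # Sketch only: assumes each pwrite writes the whole buffer.
        os.pwrite(self.data_fd, payload, self.data_size)
        entry = ENTRY.pack(ts, self.data_size, len(payload))
        os.pwrite(self.index_fd, entry, self.count * ENTRY.size)
        self.data_size += len(payload)
        self.count += 1

    def read(self, n):
        ts, offset, length = self._entry(n)
        return ts, os.pread(self.data_fd, length, offset)

    def sync(self):
        # Data first; recover() still copes with an index that got ahead of its data.
        os.fsync(self.data_fd)
        os.fsync(self.index_fd)

    def close(self):
        self.sync()
        os.close(self.data_fd)
        os.close(self.index_fd)


class Writer:
    """Rolls to a new block when the time window ends or the size cap would be exceeded."""

    def __init__(self, directory, span, max_bytes=MAX_BLOCK_BYTES):
        self.directory, self.span, self.max_bytes = directory, span, max_bytes
        self.block = None

    def append(self, ts, payload):
        b = self.block
        if (b is None or ts >= b.start_ts + self.span
                or b.data_size + len(payload) > self.max_bytes):
            if b is not None:
                b.close()
            self.block = b = Block(self.directory, ts)
        b.append(ts, payload)
```

Anything written after the last `sync()` can be lost on power failure, which is the trade-off the comment describes in place of a WAL. Because index entries are fixed width, the n-th record is found with a single offset calculation, and discarding old data is just deleting a block's two files.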
ahmedalsudani about 7 years ago
Previous discussion: https://news.ycombinator.com/item?id=14177411
latchkey about 7 years ago
When I read this blog post, it convinced me to use prometheus / grafana. I'm now monitoring over 1000 Pi class boxes using a small custom golang agent running on each one. Things have been rock solid. Couldn't be happier with this solution. Thanks for all the hard effort.
marknadal about 7 years ago
This is awesome :) having written a database myself, I very much encourage others to try it out for fun, learning, and maybe seriously too!