TechEcho

7 comments

rektide about 3 years ago

Lovely read. To condense it: there are three node types in the system: writers, compactors, and readers.

> Writers read from Kafka, (briefly) buffer events in memory, upload events to blob storage in our custom file format, and then commit the presence of these new files to our metadata store.... Compactors scan the metadata store for small files generated by the Writers and previous compactions, and compact them into larger files.... The Reader (leaf) nodes run queries over individual files in blob storage and return partial aggregates, which are re-aggregated by the distributed query engine.

And then the metadata supporting the system:

> Husky's metadata store has multiple responsibilities, but its most important one is to serve as the strongly consistent source of truth for the set of files currently visible to each customer. We’ll delve into the details of our metadata store more in future blog posts, but it is a thin abstraction around FoundationDB, which we selected because it was one of the few open source OLTP database systems that met our requirements

There are some nice scalability and isolation benefits in all of this. Having reader nodes read from network storage creates a lot of flexibility and the ability to shift work around on demand. Keeping all the metadata in FoundationDB is exciting, and it sounds like a great use case for its safe transactional updates!
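To make the "partial aggregates, which are re-aggregated" part concrete, here is a minimal sketch of the idea. It is not Husky's code: `PartialAgg`, `scan_file`, `reaggregate`, and the `latency_ms` field are all hypothetical names, and files are modeled as plain in-memory lists of events rather than blobs in a custom format.

```python
from dataclasses import dataclass


@dataclass
class PartialAgg:
    """Mergeable partial state for a count/sum/mean aggregate (hypothetical)."""
    count: int = 0
    total: float = 0.0

    def merge(self, other: "PartialAgg") -> "PartialAgg":
        # Counts and sums combine associatively, so per-file results
        # can be merged in any order.
        return PartialAgg(self.count + other.count, self.total + other.total)

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


def scan_file(events: list[dict], field: str) -> PartialAgg:
    """Leaf step: aggregate the events of a single file."""
    agg = PartialAgg()
    for event in events:
        if field in event:
            agg.count += 1
            agg.total += event[field]
    return agg


def reaggregate(partials: list[PartialAgg]) -> PartialAgg:
    """Query-engine step: combine the partial results from all leaves."""
    result = PartialAgg()
    for partial in partials:
        result = result.merge(partial)
    return result


# Two "files", each scanned independently, then combined.
files = [
    [{"latency_ms": 10}, {"latency_ms": 30}],
    [{"latency_ms": 20}],
]
partials = [scan_file(f, "latency_ms") for f in files]
final = reaggregate(partials)
```

The point of shipping `(count, total)` instead of a per-file mean is that means of means are wrong when files differ in size, while counts and sums merge exactly.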


dikei about 3 years ago

It's remarkable how the data pipelines in almost all companies converge on the same architecture:

* You have services emit data into streams.
* You dump the streams into your storage at high frequency so you can have near real-time results; this process creates many small files.
* Because small files are inefficient, you have compactors that run over the small files and merge them into bigger files, and/or delete records that are obsolete.
* You run a query engine that reads over the small files and large files to get the final result.
* To speed up steps 2, 3, and 4, you store the metadata of the files in memory or in a database.
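The compaction and metadata steps in the list above can be sketched roughly as follows. This is an illustration under loud assumptions, not any particular company's implementation: `FileMeta`, `MetadataStore`, `compact_once`, and the `SMALL_FILE_BYTES` threshold are hypothetical, blob storage is a plain dict, and a `threading.Lock` stands in for the transactional database a real system would use.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    path: str
    size_bytes: int


class MetadataStore:
    """In-memory stand-in for the source of truth on which files are visible."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._visible: set[FileMeta] = set()

    def commit(self, new_file: FileMeta) -> None:
        # Writer path: make a freshly uploaded file visible to queries.
        with self._lock:
            self._visible.add(new_file)

    def swap(self, inputs: list[FileMeta], output: FileMeta) -> bool:
        # Compactor path: replace the inputs with the output in one step,
        # so readers never see both or neither.
        with self._lock:
            if not set(inputs) <= self._visible:
                return False  # another compaction already consumed an input
            self._visible -= set(inputs)
            self._visible.add(output)
            return True

    def visible(self) -> list[FileMeta]:
        with self._lock:
            return sorted(self._visible, key=lambda f: f.path)


SMALL_FILE_BYTES = 1000  # hypothetical threshold for "small"


def compact_once(store: MetadataStore, blobs: dict, out_path: str):
    """Merge all currently visible small files into one larger file."""
    small = [f for f in store.visible() if f.size_bytes < SMALL_FILE_BYTES]
    if len(small) < 2:
        return None  # nothing worth merging
    blobs[out_path] = [rec for f in small for rec in blobs[f.path]]
    output = FileMeta(out_path, sum(f.size_bytes for f in small))
    return output if store.swap(small, output) else None


# Three small writer files get merged into one.
store = MetadataStore()
blobs: dict = {}
for i in range(3):
    path = f"small-{i}"
    blobs[path] = [{"id": i}]
    store.commit(FileMeta(path, 400))
result = compact_once(store, blobs, "big-0")
```

Note that the merged blob is uploaded before the metadata swap, so a failed swap only leaves an orphaned blob behind and never changes what queries can see.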

francoismassot about 3 years ago

Nice article indeed, we ended up implementing the exact same architecture at Quickwit for... log search! :)

https://twitter.com/fulmicoton/status/1526776987553263616
https://github.com/quickwit-oss/quickwit

ovaistariq about 3 years ago

This is a great read, thanks for sharing the architecture. I am glad to see the increase in adoption of FoundationDB. It is a great piece of technology, which is also why we are using it as a core component for Tigris: https://docs.tigrisdata.com/overview/key-concepts

bdcravens about 3 years ago

Has Datadog come up with a new generation of sales approaches? I (and many others, according to the discussion when the topic comes up) have had bad experiences.


peter_l_downs about 3 years ago

Nice read, but I was hoping they’d say that it led to a big improvement in their log searching syntax/UI. It seems impossible to just full-text search for a string and find log lines that have a value containing that text. Drilling down through the “details” pane and clicking filter/match/exclude works well, but general searching is too confusing for me to figure out, if it even works at all.

blaisio about 3 years ago

One could argue that all they did was move most of the complicated logic into the blob store. Not that it's a bad thing.


Husky, Datadog's Third-Generation Event Store
