TechEcho

Building a Streaming Analytics Data Stack

70 points by henridf over 9 years ago

5 comments

rbranson over 9 years ago
FWIW jut.io shut down the day after this was posted. https://twitter.com/PurpleQuark/status/661274501728964608
mring33621 over 9 years ago
I'm working on something similar. So far I like Apache NiFi for ingestion and Apache Flink for processing. Storage choices are plentiful and, IMHO, determined by the use case and available expertise.
teej over 9 years ago
Let's get this out of the way: I love it when companies are open and transparent about their architecture. Sharing intimate details like this is fantastic.

Where I'm struggling is that there are a number of questionable choices here with little justification. For example, why an HTTP front-end? This is fine for webhooks, but I'm not going to let my website's backend open an HTTP connection for every event I want to send out. The decision to store the data in Elasticsearch and Cassandra is equally dubious. In my experience, Elasticsearch has been a maintenance nightmare and has not been a performant and robust reporting solution at scale.
spoonfoe over 9 years ago
I saw these guys at Velocity in NY this year. Pretty impressive product. I felt like the query language they built was easier to work with than setting up queries and filters in Elasticsearch's API.

Really interesting to hear about the innards.

Thanks for the post.
itaifrenkel over 9 years ago
Do you support only transitive aggregation operations? If so, why not push the entire aggregation to Elasticsearch/Cassandra?

How do you plan to scale CPU-wise? ES and streaming engines (I don't know about Cassandra) are CPU hogs compared to MapReduce. I heard at DevOpsDays Tel Aviv that BigPanda decided to provide different SLAs to paying and non-paying customers to balance the costs.