Ask HN: Best way to build a data collection pipeline?

4 points by anacleto about 7 years ago
I'm building a predictive analytics SaaS tool. I need to collect a massive amount of data coming from my own customers. Do you have any advice, or know of an API product that makes collecting event data seamless?

3 comments

asavinov about 7 years ago
Bistro Streams is an open source light-weight stream analytics engine: https://github.com/asavinov/bistro

It is a general-purpose and highly configurable data processing engine which can be applied to many workloads and scenarios including data integration, data migration, ETL, big data processing etc.
pastyboy about 7 years ago
Snowplow, either the open source version or the managed solution ($$$), is worth a look: http://www.snowplowanalytics.com
liberal_098 about 7 years ago
Kafka is probably the best option in this case.
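
For illustration only, here is a minimal sketch of the collection side that the suggestions above address: events are serialized, buffered, and handed to a sink in batches. It uses only the Python standard library. The `EventCollector` class, its `track` and `flush` methods, and the event fields are hypothetical names chosen for this example, not the API of Kafka, Snowplow, or Bistro. In a real pipeline the sink would be something like a Kafka producer or an HTTP collector endpoint.

```python
import json
import time
from typing import Any, Callable, Dict, List


class EventCollector:
    """Hypothetical helper: buffers events in memory and passes them to a sink in batches."""

    def __init__(self, sink: Callable[[List[str]], None], batch_size: int = 100):
        self.sink = sink              # stand-in for a Kafka producer, HTTP endpoint, etc.
        self.batch_size = batch_size
        self.buffer: List[str] = []

    def track(self, customer_id: str, name: str, properties: Dict[str, Any]) -> None:
        """Serialize one event as JSON and flush once the buffer reaches batch_size."""
        event = {
            "customer_id": customer_id,
            "name": name,
            "properties": properties,
            "ts": time.time(),        # client-side timestamp (an assumption of this sketch)
        }
        self.buffer.append(json.dumps(event))
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered to the sink and reset the buffer."""
        if self.buffer:
            batch, self.buffer = self.buffer, []
            self.sink(batch)


# Usage: the sink here just records batches in a list so the flow is visible.
received: List[List[str]] = []
collector = EventCollector(sink=received.append, batch_size=2)
collector.track("acme", "page_view", {"path": "/pricing"})
collector.track("acme", "signup", {"plan": "trial"})    # second event triggers a flush
collector.track("globex", "page_view", {"path": "/"})
collector.flush()                                       # send the remaining partial batch
```

Batching trades a little latency for fewer writes to the downstream system. This sketch keeps the buffer in memory, so unflushed events are lost if the process dies; a durable log such as Kafka is one way to close that gap.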