Warehouses – Load Your Analytics Data into Redshift and Postgres

84 points by schmatz over 9 years ago

7 comments

n2parko over 9 years ago
Hey HN, Segment PM on the project here! Happy to answer questions.

Under the hood we're using NSQ as a queuing layer, S3 for storage and batched uploads, Amazon Aurora (for S3 indexing), DynamoDB for billing and metadata storage, and several distinct Go services that handle batching, transformation, schema updating, deduplication and internal consistency checking.

It's been in beta for several months and we're loading about 10,000 events per second into customers' databases today.
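To make the "deduplication" and "batching" steps mentioned above concrete, here is a minimal sketch in Python. It is not Segment's code (their services are described as Go, and their internals are not shown in this thread). The `message_id` field, the in-memory `seen` set, and the batch size are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Set

Event = Dict[str, object]


def dedupe(events: Iterable[Event], seen: Optional[Set[str]] = None) -> Iterator[Event]:
    """Yield each event once, keyed on a hypothetical 'message_id' field."""
    seen = set() if seen is None else seen
    for event in events:
        message_id = str(event["message_id"])
        if message_id in seen:
            continue  # duplicate delivery, drop it
        seen.add(message_id)
        yield event


def batches(events: Iterable[Event], size: int) -> Iterator[List[Event]]:
    """Group events into lists of at most `size` items for a bulk upload."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[Event] = []
    for event in events:
        batch.append(event)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


if __name__ == "__main__":
    incoming = [
        {"message_id": "a"},
        {"message_id": "b"},
        {"message_id": "a"},  # redelivered by the queue
        {"message_id": "c"},
    ]
    for group in batches(dedupe(incoming), size=2):
        print(group)
```

A real pipeline at this volume would need a bounded or persistent store for seen IDs instead of an unbounded in-memory set.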
buremba over 9 years ago
I think this is the direction analytics is heading, and we'll see similar products in the next few years. The analytics companies realized that they can't answer every question their customers ask, so they started adding this kind of feature to their products: just look at Mixpanel's custom applications, Amplitude's Redshift integration, or Keen.io's S3 integration.

The main reason these companies build these features into their infrastructure is to provide an alternative way to analyze data within their product, so they don't lose existing customers who need more advanced analytics (usually their biggest paying customers). The funny thing is that once you have an analytical database combined with a stream processing application, you can ask almost any question you want and get the answers quickly enough, so their core product becomes less valuable.

I think BI tools such as Periscope and Mode Analytics realized this and started to promote themselves as analytics products rather than applications that create charts from your data.

[Shameless plug] I'm also working on an open-source analytics platform (https://github.com/buremba/rakam) that collects data from clients (web, mobile, or a smartwatch, it doesn't matter), transforms it (IP-to-geolocation, referrer extraction, etc.), and stores it in a database that you specify (currently there are two options: Postgres and an in-house big data solution that uses PrestoDB as the query engine).

Then you can execute SQL queries, pre-aggregate your data for fast reports with continuous queries, and cache query results with materialized views. Once you have these features, you can run all the usual analytical queries, such as funnels, retention, and segmentation, and create your custom analytics service easily.
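As a rough illustration of the point that a funnel is just a SQL query once events sit in a database, here is a small sketch. It uses Python's built-in `sqlite3` with an in-memory database as a stand-in for Postgres or Redshift. The `events` table, its columns, and the `signup` / `purchase` event names are hypothetical and not taken from rakam or Segment.

```python
import sqlite3
from typing import Iterable, Tuple


def funnel_counts(rows: Iterable[Tuple[str, str, int]]) -> Tuple[int, int]:
    """Count users who signed up, and those who purchased after signing up.

    Each row is (user_id, event_name, timestamp); the schema is an assumption.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id TEXT, name TEXT, ts INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    query = """
        SELECT
            COUNT(DISTINCT s.user_id) AS signed_up,
            COUNT(DISTINCT p.user_id) AS purchased
        FROM events AS s
        LEFT JOIN events AS p
            ON p.user_id = s.user_id
           AND p.name = 'purchase'
           AND p.ts > s.ts          -- the purchase must follow the signup
        WHERE s.name = 'signup'
    """
    signed_up, purchased = conn.execute(query).fetchone()
    conn.close()
    return signed_up, purchased


if __name__ == "__main__":
    sample = [
        ("u1", "signup", 1),
        ("u1", "purchase", 2),
        ("u2", "signup", 1),
        ("u3", "purchase", 1),  # purchase before signup, not counted
        ("u3", "signup", 5),
    ]
    print(funnel_counts(sample))
```

The same join should carry over to Postgres with little change. Retention and segmentation queries follow a similar pattern of joining the event table against itself or grouping by a user property.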
sam-mueller over 9 years ago
This is a smart move by Segment, since the industry has been moving in this direction. Looks like mParticle launched support for Redshift a few months ago:

http://blog.mparticle.com/mparticle-launches-next-generation-of-its-customer-data-platform/
TheBiv over 9 years ago
This design looks very similar to Stripe's. Even the dropdown in the header behaves the same way when clicked.

https://stripe.com/relay https://stripe.com/subscriptions
TheLogothete over 9 years ago
I thought they had offered this service for quite some time. What's changed?
jpmw over 9 years ago
I love how the pricing is clearly value-based and not cost-based. I can't imagine that this is massively more complex, but the price is significantly higher (and that's fine); basically, it's more enterprise-y. I love it!

I wonder if/how that will impact their bigger integration plans, which include a feature to replay data for new integrations you add after the fact.
sandGorgon over 9 years ago
Interesting... You compete with Alooma [1].

Your pricing is a bit out of range for most startups, IMHO. You go from 0 to 400. I would love $20, $99, and then $400 tiers.

I would go with Alooma if their pricing is right... and replace most of my other analytics stacks.

Even Amplitude does this in their paid tier, in that you don't need your own Redshift cluster (you get query access to your tables in their DB).

[1] https://news.ycombinator.com/item?id=10651425