Scaling Analytics at Amplitude

67 points by blader almost 10 years ago

9 comments

jallmann almost 10 years ago
What shortcomings of Redis set operations does the in-memory data store address, and how?

Unrelated rant: regardless of its merits, "Lambda" Architecture is probably the most annoying overloaded term in use today, second only to "Isomorphic" JavaScript. Just because something has a passing resemblance to the functional style doesn't grant license to re-appropriate a well-understood term of art.
angryasian almost 10 years ago
Out of curiosity, why weren't products like Druid (http://druid.io/), InfluxDB (https://influxdb.com/), or possibly OpenTSDB taken into consideration?
yresnob almost 10 years ago
"Finally, at query time, we bring together the real-time views from the set database and the batch views from S3 to compute the result"

So how in the heck does this work? At query time you decide what file to get out of S3 (how do you decide this?), parse it, filter it, and merge it with the results from the custom-made, Redis-like real-time database?
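For readers trying to picture the quoted sentence, here is a minimal sketch of what a query-time merge of batch and real-time views could look like. It is not Amplitude's implementation: the (event, day) key scheme, the function names, and the idea that each view is a plain set of user IDs are all assumptions made for illustration, and the sketch skips the S3 file selection and parsing that the question asks about.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical layout: each view maps an (event name, day) key to the set of
# user IDs that performed that event on that day. Not taken from the post.
ViewKey = Tuple[str, str]
View = Dict[ViewKey, Set[int]]


def merge_views(batch: View, realtime: View, key: ViewKey) -> Set[int]:
    """Union the batch and real-time user sets for a single key."""
    return batch.get(key, set()) | realtime.get(key, set())


def users_who_did(batch: View, realtime: View, event: str,
                  days: Iterable[str]) -> Set[int]:
    """Collect every user who performed `event` on any of the given days."""
    users: Set[int] = set()
    for day in days:
        users |= merge_views(batch, realtime, (event, day))
    return users


# Made-up data: the batch view is older and precomputed, while the real-time
# view holds whatever arrived since the last batch run.
batch_view: View = {("login", "2015-08-24"): {1, 2}}
realtime_view: View = {
    ("login", "2015-08-24"): {4},
    ("login", "2015-08-25"): {2, 3},
}

result = users_who_did(batch_view, realtime_view, "login",
                       ["2015-08-24", "2015-08-25"])
print(len(result))  # number of distinct users across both views
```

Because set union is idempotent, a user who appears in both the batch and the real-time view is counted once, which is one reason set-based views are convenient to merge this way.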
paladin314159 almost 10 years ago
Author of the post here. Happy to talk about how we've designed/built our architecture at Amplitude!
ecesena almost 10 years ago
> in-memory database holds only a limited set of data

MemSQL is not just in-memory; it also has a column store (note: I don't know VoltDB). You can think of MemSQL not as "does everything in memory" but as "makes the best use of memory".
msaspence almost 10 years ago
How do you decide what sets of users you pre-aggregate?

It seems like without some limits in place you could end up with a huge number of sets, especially if you are calculating these based on event properties.
hopel almost 10 years ago
Do you store raw data ingested from Kafka directly in S3 or have an intermediate database for hot data?
msaspence almost 10 years ago
It's easy to see how you can calculate segments, retention, and trends using user sets, but how do you calculate funnels?
jchrisa almost 10 years ago
The lambda architecture and the split between the heavy slow processing and the interactive processing reminds me of how a few of our customers are blending Hadoop and Couchbase for similar use cases: http://www.couchbase.com/fr/ad_platforms