科技回声

9 comments

jallmann, over 9 years ago

What shortcomings of Redis set operations does the in-memory data store address, and how?

Unrelated rant: regardless of its merits, "Lambda" Architecture is one of the most annoying overloaded terms in use today, second only to "Isomorphic" JavaScript. Just because something bears a passing resemblance to the functional style doesn't grant license to re-appropriate a well-understood term of art.

angryasian, over 9 years ago

Out of curiosity, why weren't products like Druid (http://druid.io/), InfluxDB (https://influxdb.com/), or possibly OpenTSDB taken into consideration?

yresnob, over 9 years ago

"Finally, at query time, we bring together the real-time views from the set database and the batch views from S3 to compute the result"

So how in the heck does this work? At query time, you decide what file to get out of S3 (how do you decide this?), parse it, filter it, and merge it with the results from the custom-made, Redis-like real-time database?
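For readers trying to picture the merge described in the quote, here is a minimal sketch of one way it could work. It assumes both layers expose sets of user IDs keyed by (metric, day). That keying, the function name `query_users`, and the plain dicts standing in for S3 and the set database are all hypothetical, not details from the post.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical key: (metric name, ISO date string).
ViewKey = Tuple[str, str]
View = Dict[ViewKey, Set[str]]


def query_users(batch: View, realtime: View, metric: str,
                days: Iterable[str]) -> Set[str]:
    """Union the batch and real-time user sets for each requested day.

    `batch` stands in for precomputed views (stored in S3 in the post) and
    `realtime` for the in-memory set database. Both are plain dicts here,
    which is an assumption made for illustration only.
    """
    users: Set[str] = set()
    for day in days:
        key = (metric, day)
        # A day may be covered by either layer, or by both with overlap.
        # Set union removes duplicates, so overlap is harmless.
        users |= batch.get(key, set())
        users |= realtime.get(key, set())
    return users


batch_views = {("active", "2015-08-24"): {"u1", "u2"}}
realtime_views = {
    ("active", "2015-08-24"): {"u2", "u3"},
    ("active", "2015-08-25"): {"u4"},
}
result = query_users(batch_views, realtime_views, "active",
                     ["2015-08-24", "2015-08-25"])
```

Under these assumptions, choosing which batch file to read reduces to a key lookup, and the merge itself is an ordinary set union.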

paladin314159, over 9 years ago

Author of the post here. Happy to talk about how we've designed/built our architecture at Amplitude!

ecesena, over 9 years ago

> in-memory database holds only a limited set of data

MemSQL is not just in-memory; it also has a column store (note: I don't know VoltDB). You can think of MemSQL not as "does everything in memory", but as "uses memory as best it can".

msaspence, over 9 years ago

How do you decide which sets of users you pre-aggregate?

It seems like, without some limits in place, you could end up with a huge number of sets, especially if you are calculating these based on event properties.

hopel, over 9 years ago

Do you store raw data ingested from Kafka directly in S3, or do you have an intermediate database for hot data?

msaspence, over 9 years ago

It's easy to see how you can calculate segments, retention, and trends using user sets, but how do you calculate funnels?

jchrisa, over 9 years ago

The lambda architecture, and the split between the heavy slow processing and the interactive processing, reminds me of how a few of our customers are blending Hadoop and Couchbase for similar use cases: http://www.couchbase.com/fr/ad_platforms

Scaling Analytics at Amplitude
