科技回声

<a href="https://news.ycombinator.com/item?id=18402890" rel="nofollow">https://news.ycombinator.com/item?id=18402890</a>

Most OLAP workloads show that exponential falloff in data use over time. I once extracted all the date strings from recent queries on a data warehouse and found the same distribution.

Zynga once built a kind of time-series database with very similar metric namespace issues: about 24M metrics/minute reducing to about 1M unique names with heavy skew. They did almost everything wrong in implementing it; I considered blogging about it once but let it go.

It turned out that the basic aggregation (the metrics were in a hierarchy, so they had to be rolled up to each level with counts and uniques) could be done in a few seconds with a string sort. But nothing could solve the problem of middle management.
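For anyone wondering what a sort-based rollup could look like, here is a rough sketch. It is not the Zynga implementation, which the comment above does not describe. It assumes dot-separated metric names, reads "uniques" as the number of distinct full names under a prefix, and sorts on the split components so that every hierarchy node ends up contiguous. The function name `rollup` and the sample names are made up for illustration.

```python
def rollup(samples, sep="."):
    """Return {prefix: (count, uniques)} for every level of the hierarchy.

    samples : iterable of metric-name strings, one per reported data point
    count   : data points at or below the prefix
    uniques : distinct full metric names at or below the prefix (assumed meaning)
    """
    # Sorting on the split components puts each subtree in one contiguous run.
    keys = sorted(name.split(sep) for name in samples)
    results = {}
    stack = []  # currently open levels, each as [component, count, uniques]

    def close(depth):
        # Pop levels deeper than `depth`, record them, and fold into the parent.
        while len(stack) > depth:
            comp, count, uniques = stack.pop()
            prefix = sep.join([level[0] for level in stack] + [comp])
            results[prefix] = (count, uniques)
            if stack:
                stack[-1][1] += count
                stack[-1][2] += uniques

    prev = None
    for parts in keys:
        if parts == prev:
            stack[-1][1] += 1  # same name again: one more data point
            continue
        # Length of the path shared with the levels that are still open.
        common = 0
        while (common < len(stack) and common < len(parts)
               and stack[common][0] == parts[common]):
            common += 1
        close(common)
        for comp in parts[common:]:
            stack.append([comp, 0, 0])
        stack[-1][1] += 1  # first data point for this name
        stack[-1][2] += 1  # and one new unique name
        prev = parts
    close(0)
    return results


if __name__ == "__main__":
    demo = ["web.us.login", "web.us.login", "web.us.logout",
            "web.eu.login", "db.reads"]
    for prefix, (count, uniques) in sorted(rollup(demo).items()):
        print(prefix, count, uniques)
```

After the sort, the pass is a single linear scan with a stack as deep as the hierarchy, which is roughly why this kind of job can finish in seconds even with heavy skew in the names.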

I have this great idea for dealing with their scaling problems. Send each client a binary blob and let the client execute it. The only thing your service needs to do is act as a license server.

I call it "edge computing".

Is there a reason why you can't have a deployment/set of pods per client? The article keeps mentioning that every solution failed when the whole dataset hit a certain limit.

Why Not to Build a Time-Series Database (2018)

4 comments
