Streaming Messages from Kafka into Redshift in Near Real-Time

104 points · by shazeline · over 8 years ago

5 comments

prewett · over 8 years ago
The proliferation of software using words with a strong, precise, pre-existing meaning is making some of these headlines difficult to read... My first impression was that there is a space telescope I was unaware of whose copious data was being converted into redshift measurements of galaxies. Sadly, it has nothing to do with space news. Not sure whether to laugh or sigh.
vgt · over 8 years ago
Google BigQuery has a Streaming API specifically for this reason. Up to 100,000 rows per second per table, available immediately for analysis. Interestingly, with BigQuery, batch or stream ingest uses different resources than queries, so your query performance doesn't degrade due to ingest.

(I work on Google Cloud)
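A minimal sketch of what that streaming ingest can look like with the google-cloud-bigquery Python client; the project, dataset, and table names here are hypothetical:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical destination table; replace with your own project.dataset.table.
table_id = "my-project.analytics.events"

rows = [
    {"user_id": "u123", "event": "click", "ts": "2016-10-20T12:00:00Z"},
    {"user_id": "u456", "event": "view", "ts": "2016-10-20T12:00:01Z"},
]

# Streaming insert: rows become available for querying almost immediately.
errors = client.insert_rows_json(table_id, rows)
if errors:
    print(f"Encountered errors while inserting rows: {errors}")
```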
woodcut · over 8 years ago
How would Redshift compare to Yandex's ClickHouse[1] for this kind of architecture?

[1] https://clickhouse.yandex/
jack9 · over 8 years ago
Cut out Kafka by writing directly to S3 and bulk loading from the S3 directory (optimal for Redshift). The article never details what "near real-time" means, which is bothersome.
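A rough sketch of that approach, assuming hypothetical bucket, table, and connection details: land a batch file of newline-delimited JSON in S3, then issue a Redshift COPY, which loads from S3 in parallel.

```python
import json

import boto3
import psycopg2

# Hypothetical bucket and key.
BUCKET = "my-ingest-bucket"
KEY = "events/batch-0001.json"

# 1. Write a batch of newline-delimited JSON records straight to S3.
records = [{"user_id": "u123", "event": "click"}, {"user_id": "u456", "event": "view"}]
body = "\n".join(json.dumps(r) for r in records).encode("utf-8")
boto3.client("s3").put_object(Bucket=BUCKET, Key=KEY, Body=body)

# 2. Bulk load the batch into Redshift with COPY.
conn = psycopg2.connect(host="redshift-cluster.example.com", port=5439,
                        dbname="analytics", user="loader", password="...")
with conn, conn.cursor() as cur:
    cur.execute("""
        COPY events
        FROM 's3://my-ingest-bucket/events/batch-0001.json'
        IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
        FORMAT AS JSON 'auto';
    """)
```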
juskrey · over 8 years ago
At a rate of 300 small messages a second, it was enough for me to write in batches of 10k. I had a small writer script that kept the buffer in memory and did batch inserts.
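A minimal sketch of that buffer-and-flush pattern, assuming a hypothetical events table and psycopg2 for the inserts (Redshift speaks the Postgres protocol):

```python
import psycopg2
from psycopg2.extras import execute_values

# Hypothetical connection and table; batch size matches the comment's 10k.
BATCH_SIZE = 10_000
buffer = []

conn = psycopg2.connect("dbname=analytics user=loader")

def handle_message(msg):
    """Accumulate messages in memory and flush once the batch is full."""
    buffer.append((msg["user_id"], msg["event"]))
    if len(buffer) >= BATCH_SIZE:
        flush()

def flush():
    with conn, conn.cursor() as cur:
        # execute_values expands the rows into one multi-row INSERT statement.
        execute_values(cur, "INSERT INTO events (user_id, event) VALUES %s", buffer)
    buffer.clear()
```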