Streaming Messages from Kafka into Redshift in Near Real-Time

104 points · by shazeline · over 8 years ago

5 comments

prewett · over 8 years ago
The proliferation of software using words with a strong, precise, pre-existing meaning is making some of these headlines difficult to read... My first impression was that there is a space telescope I was unaware of whose copious data was being converted into redshift measurements of galaxies. Sadly, it has nothing to do with space news. Not sure whether to laugh or sigh.
vgt · over 8 years ago
Google BigQuery has a Streaming API specifically for this reason. Up to 100,000 rows per second per table, available immediately for analysis. Interestingly, with BigQuery, batch or stream ingest uses different resources than queries, so your query performance doesn't degrade due to ingest.

(I work on Google Cloud)
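A minimal sketch of what that streaming ingest can look like with the google-cloud-bigquery Python client; the project, dataset, and table names here are hypothetical:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical destination table; replace with your own project.dataset.table.
table_id = "my-project.analytics.events"

rows = [
    {"user_id": "u123", "event": "click", "ts": "2016-10-20T12:00:00Z"},
    {"user_id": "u456", "event": "view", "ts": "2016-10-20T12:00:01Z"},
]

# Streaming insert: rows become available for querying almost immediately.
errors = client.insert_rows_json(table_id, rows)
if errors:
    print(f"Encountered errors while inserting rows: {errors}")
```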
woodcut · over 8 years ago
How would Redshift compare to Yandex's ClickHouse[1] for this kind of architecture?

[1] https://clickhouse.yandex/
jack9 · over 8 years ago
Cut out Kafka by writing directly to S3 and bulk loading from the S3 directory (optimal for Redshift). The article never details what "near real-time" means, which is bothersome.
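A rough sketch of that approach, assuming hypothetical bucket, table, and connection details: land a batch file of newline-delimited JSON in S3, then issue a Redshift COPY, which loads from S3 in parallel.

```python
import json

import boto3
import psycopg2

# Hypothetical bucket and key.
BUCKET = "my-ingest-bucket"
KEY = "events/batch-0001.json"

# 1. Write a batch of newline-delimited JSON records straight to S3.
records = [{"user_id": "u123", "event": "click"}, {"user_id": "u456", "event": "view"}]
body = "\n".join(json.dumps(r) for r in records).encode("utf-8")
boto3.client("s3").put_object(Bucket=BUCKET, Key=KEY, Body=body)

# 2. Bulk load the batch into Redshift with COPY.
conn = psycopg2.connect(host="redshift-cluster.example.com", port=5439,
                        dbname="analytics", user="loader", password="...")
with conn, conn.cursor() as cur:
    cur.execute("""
        COPY events
        FROM 's3://my-ingest-bucket/events/batch-0001.json'
        IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
        FORMAT AS JSON 'auto';
    """)
```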
juskrey · over 8 years ago
At a rate of 300 small messages a second, it was enough for me to write in batches of 10k. I had a small writer script that kept the buffer in memory and did batch inserts.
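A minimal sketch of that buffer-and-flush pattern, assuming a hypothetical events table and psycopg2 for the inserts (Redshift speaks the Postgres protocol):

```python
import psycopg2
from psycopg2.extras import execute_values

# Hypothetical connection and table; batch size matches the comment's 10k.
BATCH_SIZE = 10_000
buffer = []

conn = psycopg2.connect("dbname=analytics user=loader")

def handle_message(msg):
    """Accumulate messages in memory and flush once the batch is full."""
    buffer.append((msg["user_id"], msg["event"]))
    if len(buffer) >= BATCH_SIZE:
        flush()

def flush():
    with conn, conn.cursor() as cur:
        # execute_values expands the rows into one multi-row INSERT statement.
        execute_values(cur, "INSERT INTO events (user_id, event) VALUES %s", buffer)
    buffer.clear()
```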