科技回声

10 comments

iblaine, over 12 years ago

What this test is essentially doing is comparing Postgres against a single node of Redshift. It is not surprising that Postgres is faster, but Redshift is not meant to be used on a single node.

Postgres and Redshift are two different products for two very different problems. Postgres is good for small sets of transactional data (less than 1TB), like orders in a shopping cart system. Redshift is good for big sets of data (greater than 1TB) involving user behavior and clickstream analysis. I would not want to manage clickstream data on a single instance of Postgres, nor would I want to manage an order system in Redshift.

A better test of Redshift would be to see how it compares to Asterdata, particularly with both in AWS. That should be telling.

monstrado, over 12 years ago

I don't think comparing RedShift to Postgres is accurate. RedShift was not designed for transactions; it was designed to store and query billions of rows using a columnar storage format. It's more like an analytic database (Greenplum, Teradata). Also, these databases are designed to scale out, so you usually don't see compelling performance gains until you start adding a few nodes to parallelize the work.

With that being said, I'd be interested to see how RedShift compares to Impala.
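
To make the columnar point concrete, here is a toy sketch of the same data held row-wise and column-wise. It is only a conceptual illustration: the table, the field names, and the in-memory lists are assumptions made up for the example and say nothing about how RedShift actually lays data out on disk.

```python
# Hypothetical three-row table: (id, user, amount).
rows = [
    (1, "alice", 30),
    (2, "bob", 45),
    (3, "carol", 25),
]

# The same data stored column by column (illustrative layout only).
columns = {
    "id": [r[0] for r in rows],
    "user": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def total_amount_row_store(rows):
    """Every full row is visited even though only one field is needed."""
    return sum(row[2] for row in rows)


def total_amount_column_store(columns):
    """Only the 'amount' column is read; the other columns are never touched."""
    return sum(columns["amount"])


print(total_amount_row_store(rows), total_amount_column_store(columns))
```

Both functions return the same total. The difference is how much data each has to walk through, which is why an aggregate over one column of a wide table tends to favor the columnar layout.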

lcampbell, over 12 years ago

I really don't understand what's going on here.

* You're measuring request latency. What part of that (for RedShift) is due to the network? (EDIT: I re-read and saw you're using `SELECT 1` as a gauge for round-trip latency and subtracting it from the results. Are you only doing this for RedShift, or also for local PostgreSQL? To me, that heuristic seems overly broad: it captures not only network latency but also syscall overhead, query parsing, etc.)
* In your tests, PostgreSQL without indices performs on par with RedShift. Does RedShift not support indexing? Is there some metric you're trying to show by not using indices? As designed, this benchmark does not map to any use case I've ever seen.
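
For anyone who wants to see what the `SELECT 1` heuristic amounts to, here is a rough sketch. The helper names `median_seconds` and `adjusted_seconds` are made up for illustration, and an in-memory `sqlite3` database stands in for the real servers purely so the snippet runs anywhere; it is not the benchmark author's code.

```python
import sqlite3
import statistics
import time


def median_seconds(conn, sql, runs=5):
    """Run `sql` several times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql).fetchall()  # fetch all rows so result transfer is timed too
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def adjusted_seconds(conn, sql, runs=5):
    """Subtract the `SELECT 1` baseline from the query's median time.

    The baseline contains parsing, syscall and driver overhead as well as any
    network round trip, so the result is only a rough estimate of query work.
    """
    baseline = median_seconds(conn, "SELECT 1", runs)
    total = median_seconds(conn, sql, runs)
    return max(total - baseline, 0.0)


# Assumption: sqlite3 in memory is a stand-in, not PostgreSQL or RedShift.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10000)])
print(adjusted_seconds(conn, "SELECT SUM(x) FROM t"))
```

Applying the same subtraction to both systems at least keeps the comparison symmetric, but as noted above, the baseline lumps together more than network latency.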

rubyrescue, over 12 years ago

Very interesting. One of the reasons we picked MySQL over Postgres for a very high-volume app is that we have RDS and didn't want to do backups/snapshots/etc. Could we now use RedShift as a Postgres-API RDS?

amalag, over 12 years ago

You need to run this against a column-store database like Infobright. Postgres is more of a transactional database; Infobright is suited to the same kind of large-dataset analytics that this is aimed at.

eduardordm, over 12 years ago

I run 3 large Oracle RDS instances. I wonder if Redshift could be effectively used the same way; we have been thinking about migrating to PostgreSQL.

Whitespace, over 12 years ago

Wouldn't it have been better to do an EXPLAIN ANALYZE for the timing measurements instead of having the results returned locally?

ozgune, over 12 years ago

This is pretty interesting. I wonder how query performance differs between Redshift and local PostgreSQL for other types of benchmarks as well, say TPC-H queries. (And I guess how Redshift scales out as the dataset size increases in TPC-H.)

csummers, over 12 years ago

I'd like to see some more information about the local setup, including hardware and the postgresql.conf. Otherwise, this tells me very little in terms of comparison.

crazydoggers, over 12 years ago

Data warehousing often involves star schemas, which means lots of joins in your queries. I'd love to see how a real-world OLAP tool performs on this.
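
As a small illustration of the kind of query a star schema produces, here is a sketch with one fact table joined to two dimension tables. The table names, columns, and rows are all hypothetical, and `sqlite3` is used only so the example is self-contained; a real warehouse would have far more dimensions and data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales (product_id INTEGER, date_id INTEGER, amount INTEGER);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
INSERT INTO dim_date VALUES (10, 2012), (11, 2013);
INSERT INTO fact_sales VALUES (1, 10, 5), (1, 11, 7), (2, 11, 3), (2, 11, 4);
""")


def sales_by_category_and_year(conn):
    """Join the fact table to each dimension, then aggregate the measure."""
    return conn.execute("""
        SELECT p.category, d.year, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_date AS d ON d.date_id = f.date_id
        GROUP BY p.category, d.year
        ORDER BY p.category, d.year
    """).fetchall()


for row in sales_by_category_and_year(conn):
    print(row)
```

Every extra dimension in the report adds another join against the fact table, which is why a benchmark without joins says little about OLAP workloads.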

Amazon RedShift vs. local PostgreSQL
