Amazon Redshift - What You Need To Know

62 points by kungfooey over 11 years ago

5 comments

monstrado over 11 years ago
These are very different databases: PostgreSQL is a transactional database, while Redshift (aka ParAccel) is an analytical database. Each has made very different design decisions, which improve queries on certain types of workloads.

PostgreSQL is optimized for blazing fast record mutations and inserts, while maintaining adequate query response times on medium to large datasets. Redshift (ParAccel) and other analytical databases like Greenplum, Teradata, or Netezza make optimizations that make sense for queries that touch the majority of a table's data (full table scans).

For example, Redshift stores a table's columns in separate locations, which not only lets you skip reading columns that don't pertain to the query, but also allows for easier disk parallelization. Keeping your columns separate from each other slows down things like record reconstruction, since you're performing n disk seeks, where n = number of columns. This is bad for databases where you need a single record. Databases like Redshift are meant to complement your MySQL and PostgreSQL databases, not replace them.

Shameless plug (I work at Cloudera): for reasons unknown, the article dismissed Hadoop without listing any reasons. If you're interested in having a secondary system designed for doing ad-hoc queries over your large datasets (billions of rows), I suggest trying out Impala. You can run it across a few servers you have sitting around... http://rideimpala.com/
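To make the row-versus-column trade-off above concrete, here is a minimal Python sketch. It is a toy in-memory model, not how Redshift or PostgreSQL actually lay out data on disk; the table, the column names (`id`, `name`, `amount`), and all function names are hypothetical, and "values read" is only a rough stand-in for disk I/O.

```python
from typing import Dict, List, Tuple

# Hypothetical three-column table; the names are illustrative only.
COLUMNS = ("id", "name", "amount")
Row = Tuple[int, str, float]


def to_column_store(rows: List[Row]) -> Dict[str, list]:
    """Split rows into one list per column (the columnar layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}


def sum_amount_row_store(rows: List[Row]) -> Tuple[float, int]:
    """Sum one column in a row layout; every value of every row is read."""
    total = 0.0
    values_read = 0
    for row in rows:
        values_read += len(row)  # the whole record comes along for the ride
        total += row[2]
    return total, values_read


def sum_amount_column_store(store: Dict[str, list]) -> Tuple[float, int]:
    """Sum one column in a columnar layout; only that column is read."""
    column = store["amount"]
    return sum(column), len(column)


def fetch_record_column_store(store: Dict[str, list], index: int) -> Tuple[Row, int]:
    """Rebuild a single record: one lookup ("seek") per column."""
    record = tuple(store[name][index] for name in COLUMNS)
    return record, len(COLUMNS)


rows = [(1, "a", 10.0), (2, "b", 20.0), (3, "c", 30.0)]
store = to_column_store(rows)

print(sum_amount_row_store(rows))           # scan touches all columns
print(sum_amount_column_store(store))       # scan touches one column
print(fetch_record_column_store(store, 1))  # single record needs n lookups
```

In this toy model the aggregate over one column reads a third as many values in the columnar layout, while fetching a single record costs one lookup per column, which is the record-reconstruction penalty described above.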
nemothekid over 11 years ago
In the COUNT(*) example, it looks like you annotated the wrong snippet.

"5 seconds! That’s an improvement. Note that I didn’t make any adjustments to the data: no indexes, no differences in table structure." should be 1.5s.
stevoski over 11 years ago
select count( * ) from dummy_table;

takes 10 minutes on PostgreSQL to return the result 21454134?

With H2, an open source embedded SQL database I use daily, a "select count( * ) from table_name" query returns the result instantly. I assumed therefore that this was the norm...
superails over 11 years ago
> Redshift smelled enough like PostgreSQL

I'm curious. How does something smell like Postgres? Does it use some of the same PG-specific datatypes?
ceyhunkazel over 11 years ago
I think that, besides the familiar PostgreSQL environment, there is not much advantage over SAP HANA. I listed the advantages of HANA in my post: https://news.ycombinator.com/item?id=6466222