This has the potential to really disrupt the enterprise data warehouse sector. All the MPP vendors today (HP Vertica, EMC Greenplum, Teradata) have exorbitant pricing and ridiculous licensing. With Amazon pricing this at $1,000 per TB per year, I would be really worried if I were Teradata (not so much if I were IBM).
Have to say that this is pretty amazing. The price is so low that it's a no-brainer to just give it a try. For the same 2TB capacity, a Vertica license would run between $20-40K, with high annual subscription fees.<p>The bigger question for me is how Amazon has been able to figure out the technical details necessary to run this kind of service at this price. It's just ridiculous. Talk about taking the oxygen out of the market...
Does anyone have insight into how painful it is for non-technical people to <i>query</i> their data warehouses?<p>I'm building a tool that allows business people and non-technical analysts to query their data warehouses using <i>natural language</i>. (Currently, you must ask a technical person to write ad-hoc queries for you, or build you a dashboard. This bogs down your data people.)<p>Does anyone have insight into the demand for such a product?<p>[edit: I'd love to chat with anyone with insight into this topic. Reach me at Joseph at metaoptimize dot com]
I'm curious what technology they are using to power it. The technology described on the website seems very similar to what Cloudera recently open-sourced (Impala), which sits alongside Hadoop and allows ad-hoc MPP-style querying on petabytes of data.<p><a href="https://github.com/cloudera/impala" rel="nofollow">https://github.com/cloudera/impala</a>
It will be interesting to see whether this is a viable competitor to column-oriented SQL engines like Vertica or other OLAP solutions like SAP HANA. It would be nice to have a simple SQL-based OLAP solution that I can spin up for offline reporting and that scales to terabytes of data.
Update! The entire keynote is now available on YouTube:
<a href="http://www.youtube.com/watch?v=8FJ5DBLSFe4" rel="nofollow">http://www.youtube.com/watch?v=8FJ5DBLSFe4</a><p>The discussion about Amazon Redshift begins at 52:50
<a href="http://www.youtube.com/watch?feature=player_detailpage&v=8FJ5DBLSFe4#t=3175s" rel="nofollow">http://www.youtube.com/watch?feature=player_detailpage&v...</a>
It's called Redshift!<p>Wow. I finished reading the sci-fi book "Redshift Rendezvous" by John E. Stith just a few weeks ago. I wonder if this is where the name comes from? In the book, Redshift is the name of the spaceship that runs cargo missions through folded space. The obvious problem is that, since you are traveling within just a few m/s of the speed of light, simply walking on the ship while underway causes a color shift, thus redshift.<p>I read that Stith has a physics degree and worked as an engineer for NORAD at Cheyenne Mountain. That made me really interested in what kind of novel he would come up with.
<a href="http://www.neverend.com/short-bio-john-e-stith" rel="nofollow">http://www.neverend.com/short-bio-john-e-stith</a>
Very cool that this will support regular SQL queries and that queries can be sent using PostgreSQL drivers. PostgreSQL drivers are super stable and supported everywhere. Driver support is usually overlooked with 'Enterprise' data warehousing solutions. I recall that it was really hard to get the Vertica drivers installed and stable under Linux.<p>I took a few screenshots from the keynote and included one showing the mention of PostgreSQL and ODBC/JDBC support. Included here if you want to see for yourself: <a href="http://wp.me/p2sRpx-1e" rel="nofollow">http://wp.me/p2sRpx-1e</a>
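If the PostgreSQL driver claim holds, connecting should look like connecting to any Postgres server. A minimal sketch, not a confirmed recipe: the host, database, user, and password below are made-up placeholders, the port value is an assumption to check against your own cluster, and psycopg2 is just one third-party driver you could use (the call is left commented out so the snippet runs without it).

```python
def warehouse_dsn(host, port, dbname, user, password):
    """Build a libpq-style keyword/value connection string.

    Values containing spaces or quotes would need escaping; the
    placeholder values used here do not.
    """
    parts = {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": user,
        "password": password,
    }
    return " ".join(f"{key}={value}" for key, value in parts.items())


# All values are hypothetical placeholders, not real endpoints or credentials.
dsn = warehouse_dsn(
    host="example-cluster.example.com",
    port=5439,  # assumption: use whatever port your cluster reports
    dbname="analytics",
    user="report_user",
    password="secret",
)
print(dsn)

# With a PostgreSQL driver such as psycopg2 installed, the same string
# can be handed to it directly:
#
#   import psycopg2
#   conn = psycopg2.connect(dsn)
#   cur = conn.cursor()
#   cur.execute("SELECT 1")
#   print(cur.fetchone())
#   conn.close()
```

The point is only that no vendor-specific client should be needed; anything that speaks the Postgres wire protocol (JDBC, ODBC, psql) would take equivalent settings.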
I cannot find information on whether Redshift supports queries in MDX. Lots of DWs today run on Microsoft SQL Server Analysis Services, and its MDX spec is now supported by several DW vendors. MDX support would mean you could swap the DW engine while keeping your visualisation suite (or Excel, what the hell), which would make for an easy switch to the cloud: you'd just pick a different data source in your tool.
Looks impressive and very interesting. I signed up to review it and compare it with Teradata/Netezza.<p>Can we run more complex in-database processes implemented as stored procedures on this platform, or is it going to be limited to pure SQL querying/analytics?<p>And does anyone have an idea how to upload 1 TB of data to this service over an Internet connection from an in-house company server? ;)
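On the 1 TB upload question, a rough back-of-envelope helps frame it. This is only a sketch: the uplink speeds are example values, and the 0.8 efficiency factor (protocol overhead, contention) is an assumption, not a measured figure.

```python
def upload_days(size_tb, uplink_mbps, efficiency=0.8):
    """Estimate days needed to upload size_tb (decimal terabytes)
    over an uplink of uplink_mbps megabits per second.

    efficiency is an assumed fraction of the nominal line rate
    that is actually usable for payload.
    """
    bits = size_tb * 1e12 * 8                      # terabytes -> bits
    usable_bps = uplink_mbps * 1e6 * efficiency    # usable bits per second
    return bits / usable_bps / 86_400              # seconds -> days


# Example uplink speeds (hypothetical office connections).
for mbps in (10, 100, 1000):
    print(f"{mbps:>5} Mbit/s uplink: {upload_days(1, mbps):.2f} days for 1 TB")
```

Under these assumptions a 10 Mbit/s uplink needs well over a week for 1 TB, while a 100 Mbit/s line gets it done in a little over a day, so the answer depends almost entirely on the office connection.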