Quite frankly, for data of this scale, PostgreSQL is more than adequate on a modern machine (although the OP's step-by-step guide is surely helpful for those new to Redshift).<p>For those interested in funny and insightful analyses of the data, check out this blog post: <a href="http://toddwschneider.com/posts/analyzing-1-1-billion-nyc-taxi-and-uber-trips-with-a-vengeance/" rel="nofollow">http://toddwschneider.com/posts/analyzing-1-1-billion-nyc-ta...</a>
Really interesting article, but I wish there were more data on the speed/performance of the Redshift queries. It seems to cover everything but the actual performance metrics!
I've been curious whether this dataset could be exploited to invade privacy, e.g. by targeting rides to or from a sensitive address (something medical-related, a strip club). Or by cross-checking it with other data, like a private detective matching a pick-up recorded on a security camera against this database to look up where the passenger went. I guess this dataset makes that easier to find out.
I really wish the AWS tutorials were this clean. The smiling photo of the author in the left menu gives it a really human feel, and it was easier for me to understand.