Ask YC: 40Tb in a year. Would you use Amazon?

13 points by inovica almost 17 years ago
Hi there. We're building a system which stores voice audio files. We're looking at good compression codecs for it (Speex), but we're looking at 30 million minutes of audio a month, which works out to about 40Tb of data storage a year. These kinds of figures scare me a little! I'm wondering whether you would use Amazon for this or go another route. Just interested to hear from anyone who's doing anything of a similar scale.
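
As a rough sanity check on those figures, here is a minimal sketch of the arithmetic. It assumes decimal units (1 TB = 10^12 bytes), a 12-month year, and that "40Tb" means terabytes; the function names are made up for illustration and the 8 kbit/s rate in the usage lines is just an example value, not something the post states.

```python
# Back-of-envelope check of the figures in the post.
# Assumptions: 1 TB = 10**12 bytes, 12 months per year, "40Tb" means terabytes.
MINUTES_PER_MONTH = 30_000_000
TB_PER_YEAR = 40


def implied_bitrate_kbps(minutes_per_month: float, tb_per_year: float) -> float:
    """Average bitrate (kbit/s) implied by a yearly storage total."""
    total_bits = tb_per_year * 10**12 * 8
    total_seconds = minutes_per_month * 12 * 60
    return total_bits / total_seconds / 1000


def yearly_storage_tb(minutes_per_month: float, bitrate_kbps: float) -> float:
    """Yearly storage (TB) for a given monthly volume and average bitrate."""
    total_seconds = minutes_per_month * 12 * 60
    bytes_per_second = bitrate_kbps * 1000 / 8
    return bytes_per_second * total_seconds / 10**12


if __name__ == "__main__":
    # 40 TB over 360 million minutes is roughly 14.8 kbit/s on average.
    print(f"{implied_bitrate_kbps(MINUTES_PER_MONTH, TB_PER_YEAR):.1f} kbit/s")
    # An example lower rate of 8 kbit/s would give 21.6 TB a year.
    print(f"{yearly_storage_tb(MINUTES_PER_MONTH, 8):.1f} TB/year")
```

So the 40Tb estimate corresponds to an average of about 15 kbit/s per stream, and the yearly total scales linearly with whatever bitrate the codec ends up using.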

6 comments

dmix almost 17 years ago
SmugMug currently hosts 600TB of pictures on Amazon S3.

http://gigaom.com/2008/06/25/structure-08-werner-vogels-amazon-cto/

So yeah, I'd probably use them.
markbao almost 17 years ago
40TB?!

Look into CDNs like Akamai or Limelight. I think the bulk deal you'll get from an established CDN will be better than Amazon's flat rates.
secorp almost 17 years ago
We run a small specialized storage company, and the things that seem to matter most are: storage capacity, availability, reliability, and transfer rates for both current data usage and new data addition.

40Tb can be handled pretty well by S3 and other storage services, and they have pretty good pricing information to model your costs. Note that they don't (yet) provide very specific SLAs for data availability, so keep that in mind when designing your system.

Maintaining your own drives with some sort of redundancy (RAID, automatic copies, etc.) or using something like (bias alert) our open-source project http://allmydata.org, which is effectively a software RAID layer, both require some IT and systems energy, so this has to be bundled into your operational costs if you choose that route.

Just to emphasize what others have mentioned, it is important to incorporate the new data influx rate into your model. If you are successful, 40Tb this year might turn into 120Tb next year, so make sure that your cashflow model can support the underlying cost of whatever system you choose.
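
To make the influx point concrete, here is a small sketch of how stored data piles up when the yearly influx grows and nothing is deleted. It reads "40Tb this year, 120Tb next year" as a 3x growth in new data per year, which is one possible interpretation; the function names and the price of 150 dollars per TB-month are placeholders for illustration, not any provider's real rate.

```python
# Sketch: cumulative storage when yearly influx grows by a constant factor.
# All names and the price below are hypothetical placeholders.

def cumulative_storage_tb(first_year_tb: float, growth_factor: float, years: int) -> list[float]:
    """Total TB held at the end of each year, assuming nothing is deleted."""
    totals = []
    stored = 0.0
    influx = first_year_tb
    for _ in range(years):
        stored += influx          # this year's new data lands on top of the old
        totals.append(stored)
        influx *= growth_factor   # next year's influx is larger
    return totals


def end_of_year_monthly_bill(totals_tb: list[float], price_per_tb_month: float) -> list[float]:
    """Monthly storage bill at the end of each year for a flat per-TB price."""
    return [tb * price_per_tb_month for tb in totals_tb]


if __name__ == "__main__":
    totals = cumulative_storage_tb(40, 3, 3)
    print(totals)                                   # [40.0, 160.0, 520.0]
    print(end_of_year_monthly_bill(totals, 150.0))  # [6000.0, 24000.0, 78000.0]
```

The bill grows faster than the influx because old data keeps costing money every month, which is the part a cashflow model needs to cover.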
ComputerGuru almost 17 years ago
Depending on your projections for future growth and how much cash you have, I'd consider opening my own data center for that kind of storage.....
bigbang almost 17 years ago
I haven't done anything similar, but just my 2c: see if you can make use of existing file sharing systems like RapidShare, Megaupload, etc. and link to them. I believe the HotOrNot guys used free Yahoo Photos hosting and just linked to it to save on bandwidth (it worked then).
nasser almost 17 years ago
Do not underestimate the transfer costs. The information you have here - size of data "stored" - is only one factor. You need to have some estimates about your transfer in and out, and that will tell you whether Amazon makes sense or you have to go with a CDN.
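
A minimal sketch of that comparison, splitting a monthly bill into storage and transfer. Every number here is a made-up placeholder (the `Pricing` values are not Amazon's or any CDN's actual rates, and the 3 TB in / 100 TB out volumes are invented); the point is only to show how the transfer term can outweigh the storage term.

```python
# Sketch: monthly bill split into storage and transfer components.
# Prices and volumes are hypothetical placeholders, not real provider rates.
from dataclasses import dataclass


@dataclass
class Pricing:
    """Placeholder unit prices in dollars."""
    storage_per_tb_month: float
    transfer_in_per_tb: float
    transfer_out_per_tb: float


def monthly_cost(stored_tb: float, in_tb: float, out_tb: float, pricing: Pricing) -> dict:
    """Return the storage, transfer, and total cost for one month."""
    storage = stored_tb * pricing.storage_per_tb_month
    transfer = in_tb * pricing.transfer_in_per_tb + out_tb * pricing.transfer_out_per_tb
    return {"storage": storage, "transfer": transfer, "total": storage + transfer}


if __name__ == "__main__":
    placeholder = Pricing(storage_per_tb_month=150, transfer_in_per_tb=100, transfer_out_per_tb=170)
    # 40 TB stored, 3 TB uploaded, 100 TB downloaded in the month (invented volumes).
    bill = monthly_cost(40, 3, 100, placeholder)
    print(bill)  # {'storage': 6000, 'transfer': 17300, 'total': 23300}
```

With these invented numbers the transfer charge is almost three times the storage charge, so plugging in your own in/out estimates and real quotes is what decides between Amazon and a CDN.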