Assuming 10:1 compression, you have 50 exabytes, and it appears that would be about 500 of the trucks Amazon uses to load large amounts of data. I can't find information on how many they actually have, or whether the capacity has increased from the 100 PB figure mentioned in a lot of places.<p>Amazon's FAQ is funny:<p>"Q: Can I export data from AWS with Snowmobile?<p>Snowmobile does not support data export. It is designed to let you quickly, easily, and more securely migrate exabytes of data to AWS"<p>...you can check out any time you like, but you can never leave.
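For anyone who wants to check the truck count, here is a rough sketch of the arithmetic. It assumes decimal units (1 EB = 1000 PB) and the widely quoted 100 PB per Snowmobile, and it takes the 50 EB compressed figure from the comment above as given. The names are made up for illustration.

```python
import math

PB_PER_EB = 1000  # assumption: decimal units, 1 exabyte = 1000 petabytes


def trucks_needed(data_eb: float, truck_capacity_pb: float) -> int:
    """Number of trucks needed to carry data_eb exabytes, rounded up."""
    return math.ceil(data_eb * PB_PER_EB / truck_capacity_pb)


compressed_eb = 50        # figure from the comment, after 10:1 compression
truck_capacity_pb = 100   # commonly cited Snowmobile capacity (unverified)

print(trucks_needed(compressed_eb, truck_capacity_pb))
```

If the per-truck capacity has grown since the 100 PB figure was published, the count shrinks proportionally.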
> We could also use UTF8, but since we assumed the language is German, we’ll stick to ASCII<p>German cannot be expressed in ASCII[1]. For that matter, neither can Chinese nor Spanish, the two most spoken languages besides English. Also, UTF-8 doesn't even encode all the languages ever spoken. So IMHO this is at least an order of magnitude wrong.<p>[1] <a href="https://news.ycombinator.com/item?id=9222071" rel="nofollow">https://news.ycombinator.com/item?id=9222071</a>
Sometimes I hear someone utter a sentence which I <i>guess</i> has never before been uttered by anybody. I really wish I had a way to verify that, just for fun.
> <i>10 billion words, times an average word length of 11.66, gets us ~4.8 billion individual characters spoken per person per lifetime.</i><p>Am I missing something or is this math very wrong?
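A quick sanity check of the quoted figures, as a sketch only. The two inputs are taken straight from the quote; the variable names are mine, and the "implied" word count is just what the 4.8 billion result would require, not something the article states.

```python
words_per_lifetime = 10_000_000_000   # "10 billion words" from the quote
avg_word_length = 11.66               # average word length from the quote
quoted_chars = 4.8e9                  # "~4.8 billion" from the quote

# Straight multiplication of the two quoted inputs.
chars_per_lifetime = words_per_lifetime * avg_word_length

# Word count that would be needed for the quoted result to hold.
implied_words = quoted_chars / avg_word_length

print(f"computed characters: {chars_per_lifetime:.4g}")
print(f"quoted characters:   {quoted_chars:.4g}")
print(f"implied word count:  {implied_words:.4g}")
```

The product of the two quoted inputs comes out near 116.6 billion characters, roughly 24 times the quoted result, while 4.8 billion characters would correspond to something closer to 400 million words. So either the word count or the result in the quote looks off.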
This reminds me of a very fun and interesting read called "A Short Stay in Hell" by Steven Peck, which provides an entertaining perspective on infinity and very, very large finite time periods. It's about a Mormon who goes to hell (because Zoroastrianism happens to be the One True Religion). Hell does not last forever though. For the main character, it's a library that contains every possible communication that could exist. Once he finds the book that contains the story of his life, he gets out. Very fun read that addresses large but finite values, although it focuses more on time than on space.
Interestingly, no one has mentioned the Library of Babel[0]<p>One could assert that if you were to translate Chinese/Russian/non-UTF characters into UTF, you'd be covering every word ever possibly said.<p>[0] <a href="https://libraryofbabel.info/" rel="nofollow">https://libraryofbabel.info/</a>
Hm... just the words lose so much -- the tone, the emphasis, the pauses. I think we'd have to do at least audio. Though of course expressions, hand movements and bearing count too, so I'm thinking we need a number for video as well.
This would be an interesting dataset to explore! A biographer's dream. Insider information on every corporate & governmental decision in history. Intimate daily-life details from early hominids.