The figure I've heard is that the <i>data</i> generated doubles every year (here, "data" can mean web pages, logs, transactions, etc.). It follows that every year we create roughly as much data as in all the previous years combined (sum_{i=0}^{n} 2^i = 2^(n+1) - 1).<p>If we created X amount of data in 2003, then 7 years later we're creating 128X per year, which works out to roughly X every 3 days.
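For anyone who wants to check the doubling arithmetic above, here is a minimal Python sketch. It assumes exactly one doubling per year and takes the 2003 output as X = 1 unit; the function names are made up for illustration, and the numbers are only as good as the "doubles every year" premise.

```python
# Assumption: yearly data output doubles every year; base is the 2003 output (X).

def yearly_output(years_elapsed, base=1.0):
    """Data created in the year that is `years_elapsed` years after the base year."""
    return base * 2 ** years_elapsed


def cumulative_before(years_elapsed, base=1.0):
    """Total data created in all years strictly before the given year."""
    return sum(yearly_output(i, base) for i in range(years_elapsed))


def days_to_match_base(years_elapsed, base=1.0, days_per_year=365):
    """Days needed, at the later year's rate, to create one base year's worth of data."""
    return days_per_year * base / yearly_output(years_elapsed, base)


if __name__ == "__main__":
    print(yearly_output(7))                  # 128.0 (this year's output, in units of X)
    print(cumulative_before(7))              # 127.0 (all earlier years combined)
    print(f"{days_to_match_base(7):.2f}")    # 2.85 (days to produce X)
```

Note that the current year's output (128X) exceeds the sum of all earlier years (127X) by exactly X, which is where the "as much as all previous years combined" claim comes from.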
<i>Based on the primary sources I’ve been able to piece together, the more accurate (but far less sensational) quote would be:<p>"23 Exabytes of information was recorded and replicated in 2002. We now record and transfer that much information every 7 days."</i><p>Call me crazy, but that sounds every bit as sensational to me. It seems like all this article is doing is getting overly picky about a throwaway, oft-repeated trivia stat. Who cares what the exact numbers are? The point of the statement remains the same.
Not all information is equal. A recording from /dev/random is not valuable information even though it fills up disk space.
The value of information depends very much on the context.
A lot might have happened since 2002. People with digital cameras take a lot of pictures, for example. YouTube is booming. Lots of devices generate automatic data feeds, for example location tracking from mobile phones and clickstreams on the internet.<p>The number might still have been made up, but let's not forget that Schmidt might have some sources of information not available to the public, for example the server stats from Google and YouTube.
How timely! I was actually at a Google recruiting event/tech talk today at my university, where a Google engineer repeated this quote to us. Fittingly, he also misquoted it and said that 5 exabytes of data are created every day, instead of every two days as in the original quote. I looked at him askance for a moment due to the absurdity of the number--thanks for clearing it up!
"We now create and replicate as much data in one week, as we did in one year, just a decade ago."<p>True, not as catchy as the dawn of time, but still mighty impressive. And in fairness to their outgoing CEO, Google didn't cache much data at the dawn of time (or even in the '80s), so it can't have been <i>that</i> important.
My tummy rumbled and I burped at 9:22AM EST this morning. Now that I have posted this: is that a piece of information?<p>My point is that a lot of this "information" is ephemeral and not really all that important in the long run.
Yes, we are creating more data now than in the history of mankind. However, the ratio (quality stored data / total data) has gone down as storage has become easier. Most of the "data" is for entertainment.