Jeff Dean is well known as a superstar within Google, although not so much to the outside world. With the exception of PageRank, he created or had a hand in almost all of the technologies you've heard of as Google's major innovations.<p>I have a Googler friend (a genius coder in his own right) who sometimes wondered whether he wouldn't be more productive just devoting his workday to ensuring that Jeff Dean was properly caffeinated.<p>I find this kind of fascinating because Jeff Dean's academic background was in compiler optimization research, not the obvious choice for building infrastructure for a large website. But perhaps compiler people know how every nanosecond counts, and can see the network as just another high-latency part of a big computation.
Hm, there are some interesting numbers in there.<p>"Map Reduce Usage at Google: 3.5m jobs/year averaging 488 machines each & taking ~8 min ...
Big Table Usage at Google: 500 clusters with largest having 70PB, 30+ GB/s I/O"<p>So to run 3.5 million jobs a year, each averaging 488 machines for ~8 minutes, you need roughly 13.7 billion machine-minutes, which works out to at least 26,069 machines running around the clock just for MapReduce.<p>Similarly, if you assume their largest storage cluster uses their previously described commodity hardware approach, and that the current sweet spot for drives is about 1TB with one or two drives per machine, that is between 36,700 and 73,400 machines in their largest storage cluster. That seems like a lot.
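Spelling that arithmetic out (a back-of-the-envelope sketch; the 52-week year and the 1PB = 1,048.576TB conversion are my assumptions, chosen because they reproduce the figures above):

    # Back-of-the-envelope check of the machine-count estimates above.
    # Assumptions (mine, not from the talk): jobs are packed with no
    # idle time, and a "year" is 52 weeks (364 days).
    jobs_per_year = 3_500_000
    machines_per_job = 488
    minutes_per_job = 8

    machine_minutes = jobs_per_year * machines_per_job * minutes_per_job
    minutes_per_year = 364 * 24 * 60  # 52-week year

    print(f"MapReduce fleet floor: {machine_minutes / minutes_per_year:,.0f}")
    # -> 26,069 machines

    # Storage side: 70PB cluster, 1TB drives, one or two drives per machine.
    # Treating 1 PB as 1,048.576 TB (2^20 GB) reproduces the quoted range.
    cluster_tb = 70 * 1048.576
    print(f"2 drives/machine: {cluster_tb / 2:,.0f} machines")  # ~36,700
    print(f"1 drive/machine:  {cluster_tb:,.0f} machines")      # ~73,400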
> Working on next generation Big Table system called Spanner<p>o Similar to BigTable in that Spanner has tables, families, groups, coprocessors, etc.<p>o But has <i>hierarchical</i> <i>directories</i> rather than rows, fine-grained replication (at directory level), ACLs
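To make the directory model a bit more concrete, here is a minimal sketch of what "fine-grained replication and ACLs at the directory level" could look like. Every class and field name here is invented for illustration; this is not Spanner's actual API, just the shape of the idea (each subtree carries its own placement and access policy, rather than one policy per table):

    # Hypothetical sketch, not Spanner's real data model or API.
    from dataclasses import dataclass, field

    @dataclass
    class ReplicationPolicy:
        # Fine-grained: each directory carries its own placement,
        # instead of one policy for the whole table.
        replicas: list  # e.g. ["us-east", "eu-west"]

    @dataclass
    class Directory:
        name: str
        acl: set                   # principals allowed access
        policy: ReplicationPolicy
        children: dict = field(default_factory=dict)  # name -> Directory
        rows: dict = field(default_factory=dict)      # key -> value

        def subdir(self, name, acl, policy):
            d = Directory(name, acl, policy)
            self.children[name] = d
            return d

    # One user's data lives under a single subtree and is replicated
    # (and access-controlled) as a unit:
    root = Directory("/", {"admins"}, ReplicationPolicy(["us-east"]))
    jane = root.subdir("users/jane", {"jane"},
                       ReplicationPolicy(["us-east", "eu-west"]))
    jane.rows["profile"] = {"lang": "en"}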
Who ignites Google's inspiration for infrastructure innovation? That would have to be Urs Hoelzle. All the star work done by Jeff Dean and others at Google flows from his thoughtful planning ...