Crunchbase seems pretty incomplete before 2004.<p>Several of the communications companies I checked were missing.<p>Edit: Upon further testing, it seems that many of the communications wipeouts ($40M+) from that era are missing. It's almost as if the people who funded them don't want the magnitude of the mistakes on the record.<p>Here's an example: Photonex raised $170M [1] over its lifetime, and there is not a trace on Crunchbase of who the culprits were.<p>Another quick example: Solinet [2] raised a pile of dough (they later renamed themselves Ceyba).<p>[1] <a href="http://www.lightreading.com/ip-convergence/photonex-scores-huge-3rd-round/240045206" rel="nofollow">http://www.lightreading.com/ip-convergence/photonex-scores-h...</a><p>[2] <a href="http://www.lightreading.com/ip-convergence/solinet-systems-scores-93-million/240048732" rel="nofollow">http://www.lightreading.com/ip-convergence/solinet-systems-s...</a>
(I run SeedTable, which also does analytics on Crunchbase data.)<p>The problem with ranking by funding/acquisition amounts is that the data is pretty dirty, and outliers are disproportionately likely to be incorrect (because someone fat-fingered a number to be an order of magnitude larger, or entered a foreign currency amount as USD). The acquisition data is better than the funding data, though.<p>You might also want to pull out sectors like biotech, because it's fundamentally a separate market from software tech.
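To make the outlier point concrete, here is a minimal sketch of one way to flag amounts for manual review before ranking. The field names `company` and `amount_usd`, the 10x threshold, and the sample rows are all assumptions for illustration; they are not taken from the actual Crunchbase dump or from how SeedTable cleans its data.

```python
from statistics import median


def flag_suspect_amounts(rounds, factor=10.0):
    """Return the rounds whose amount is at least `factor` times the median.

    `rounds` is a list of dicts with hypothetical keys "company" and
    "amount_usd"; the real dump may name these columns differently.
    """
    amounts = [r["amount_usd"] for r in rounds if r["amount_usd"] > 0]
    if not amounts:
        return []
    typical = median(amounts)
    return [r for r in rounds if r["amount_usd"] >= factor * typical]


# Made-up sample rows, for illustration only.
sample = [
    {"company": "A", "amount_usd": 5_000_000},
    {"company": "B", "amount_usd": 6_000_000},
    {"company": "C", "amount_usd": 7_000_000},
    {"company": "D", "amount_usd": 500_000_000},
]
for row in flag_suspect_amounts(sample):
    print(row["company"], row["amount_usd"])
```

A flagged row is only a candidate for checking against a second source, not proof of an error; some very large rounds are real. Comparing within a round type or sector would likely be a better baseline than one median over the whole table.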
It's crazy that there are so many huge companies that I've never heard of. Instagram's acquisition was huge news at $1B, but I didn't hear a peep about Ariba at $4.3B or Genzyme at $20B.<p>It's easy to think that consumer-oriented web startups are what's hot, but this data proves otherwise.
T-Mobile is listed as the top acquisition of 2011, but that deal was blocked by the DoJ and later dropped by AT&T. In other words, the charts are based on announcements, not necessarily consummated transactions.
Were some companies acquired multiple times the same year, or is there a bug somewhere? (RazorFish is listed 3 times in 2002, DoubleClick and Skype twice in 2005, Getty Images twice in 2008, Sterling Commerce twice in 2009.)
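One quick way to check for this kind of repetition is to count how often each (company, year) pair appears in the acquisitions table. This is only a sketch: the keys `company` and `year` are assumed names, and the sample rows are made up to mirror the RazorFish case mentioned above.

```python
from collections import Counter


def repeated_acquisitions(rows):
    """Map (normalized company name, year) to its count, keeping counts > 1.

    `rows` is a list of dicts with hypothetical keys "company" and "year".
    Names are lowercased and stripped so that trivial spelling variants
    are counted together.
    """
    counts = Counter((r["company"].strip().lower(), r["year"]) for r in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up sample rows, for illustration only.
sample = [
    {"company": "RazorFish", "year": 2002},
    {"company": "razorfish ", "year": 2002},
    {"company": "RazorFish", "year": 2002},
    {"company": "Skype", "year": 2005},
]
print(repeated_acquisitions(sample))
```

A repeated pair does not say which explanation is right. The rows still need a look to tell a duplicated record from a company that genuinely changed hands more than once in a year.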
Here are some links to CSV files based on miquelcamps's SQL file.<p>Acquisitions
<a href="http://db.tt/h6PoPnCn" rel="nofollow">http://db.tt/h6PoPnCn</a><p>Companies
<a href="http://db.tt/UNYulmJD" rel="nofollow">http://db.tt/UNYulmJD</a><p>Funding
<a href="http://db.tt/SHa45HHc" rel="nofollow">http://db.tt/SHa45HHc</a><p>Words
<a href="http://db.tt/mJMCIREX" rel="nofollow">http://db.tt/mJMCIREX</a>
For funding categories by year, I think it would be more useful to have a time-series line graph by sector over a longer period (5+ years), so the trends are visible.
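The data behind such a graph is just funding totals grouped by sector and year. A rough sketch of that aggregation follows, assuming each funding row carries `sector`, `year`, and `amount_usd` fields (hypothetical names; the actual dump may differ).

```python
from collections import defaultdict


def funding_by_sector_and_year(rounds):
    """Return {sector: {year: total_usd}} with years in ascending order.

    `rounds` is a list of dicts with hypothetical keys "sector", "year",
    and "amount_usd".
    """
    totals = defaultdict(lambda: defaultdict(float))
    for r in rounds:
        totals[r["sector"]][r["year"]] += r["amount_usd"]
    # Convert to plain dicts, sorting each sector's years for plotting.
    return {
        sector: dict(sorted(by_year.items()))
        for sector, by_year in totals.items()
    }
```

Each sector's inner dict can then be passed to any plotting library as one line, with years on the x-axis and totals on the y-axis.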
Cool stuff. How are you grouping data startups within the funding categories section?<p>I'd be interested to see the breakdown in recent years of startups that offer a data product, e.g. data infrastructure, ad optimization, user tracking, etc.