One reason i-banks and academics pay top dollar for data is reliability and consistency. How does Data Marketplace address errors in user-collected data? Do they guarantee its quality? If I spend $10,000 on a quarterly CRSP update and find a systematic mistake, they will fix it without charge. Such mistakes are rare because CRSP has a reputation to maintain. I wonder if users hired to collect data face the same constraints.
Hey, it sounds like these guys would be a perfect customer for ITA's Needle. Needle provides a GUI-based way for nonprogrammers to scrape data, clean it up, and get it into a structured database where they can create views on it and all that databasey stuff. <a href="http://www.needlebase.com/" rel="nofollow">http://www.needlebase.com/</a>
How funny! Right as I start talking about my startup to buy and sell datasets, it turns out YC had something like this in secret the whole time. At least I know I'm doing something worthwhile.
Sounds a little like Microsoft's "Dallas":<p><a href="http://www.microsoft.com/WindowsAzure/dallas/" rel="nofollow">http://www.microsoft.com/WindowsAzure/dallas/</a>
As a pay-for-service approach, it seems reasonable: you say you need some data, the appropriate people come up with it in the format you want, and you pay and get the data. But the Amazon-like angle seems odder to me: How will each of the bits of structured information manage to be sold more than once? Since most of it isn't copyrightable, anyone can buy a copy and then put it up online for free. I could go buy that Wal-Mart Store Locations data set for $30 right now and publish it on my blog. Are you hoping EULAs will be enough to prevent this, or just that in practice it won't happen enough to matter, or that people won't notice the data is free online and will continue buying it anyway?
Is it just me, or are more YC companies launching early this cycle? I can count 5--Etacts, Crocodoc, Cardpool, 140bets, and Data Marketplace--that have already launched.<p>Is this just a product of them having more companies this cycle, something they're stressing this year, or just a coincidence?
Nice - I already made a sale!<p>I didn't see it posted anywhere, but on a $4 sale, I received $3.68, so that's an 8% fee if it's a flat rate. (Or it could be something like $0.28 + 1%)<p>@matthodan, you might want to post that somewhere / make it more obvious.
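For what it's worth, a single sale can't tell those two fee guesses apart. Here is a rough Python sketch of the arithmetic. The 8% rate and the $0.28 + 1% split are only the guesses from the comment above, not published pricing, and the function names and the $10 comparison sale are made up for illustration:

```python
from decimal import Decimal

# Figures reported in the comment above.
SALE = Decimal("4.00")
PAYOUT = Decimal("3.68")


def flat_rate_fee(sale, rate=Decimal("0.08")):
    """Hypothetical model A: a flat percentage of the sale price."""
    return sale * rate


def fixed_plus_percent_fee(sale, fixed=Decimal("0.28"), rate=Decimal("0.01")):
    """Hypothetical model B: a fixed charge plus a small percentage."""
    return fixed + sale * rate


observed_fee = SALE - PAYOUT

# Both guesses reproduce the fee observed on the $4 sale...
print(observed_fee == flat_rate_fee(SALE))
print(observed_fee == fixed_plus_percent_fee(SALE))

# ...but they diverge at any other price, so a second sale would settle it.
other_sale = Decimal("10.00")  # made-up price for comparison
print(flat_rate_fee(other_sale), fixed_plus_percent_fee(other_sale))
```

Decimal is used instead of float so the cent amounts compare exactly. Under these assumptions, one payout at a different price point is enough to see which model (if either) is right.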
I'm getting an error when trying to upload a CSV data set. :( Not very encouraging to be prompted with an error at my first attempt to use the product. Maybe TC jumped on the story a little soon?
Seems like an awfully low signal-to-noise ratio in the requests at the moment. The vast majority of them are very unrealistic, if not outright spam.<p>"Request For Data
List of all Assisted Living care centers in the U.S., including name, address, phone number, website and any photos on their website. There should be about 20,000 of them.
Budget: $20.00
Deadline: March 25, 2010 10:40 PM"<p>Yeah... I'll get right on that...
This looks really interesting, though if one is looking for publicly available data, I think it's an open question of how much of the value is in providing a marketplace vs. actually doing the aggregation, organization and some analysis.<p>Two other interesting startups in this space that are focussed more on the data aggregation end are <a href="http://www.AnythingResearch.com" rel="nofollow">http://www.AnythingResearch.com</a> and <a href="http://www.AggData.com" rel="nofollow">http://www.AggData.com</a> (which has dozens of retail datasets just like the Walmart store example, presumably gathered by screen-scraping).
Such a simple idea, but very compelling and nicely executed.<p>How is the pricing determined? Simply based on what the data seller/aggregator thinks he can get?<p>The "requesting for data" reminds me a bit of a more structured Mechanical Turk.
I'm just about finished with a dissertation in Finance (teaching job starts in August). Obviously I'm very excited about this launch: I have a couple of painfully acquired datasets I wouldn't mind seeing a dollar return from, and it'd be great to save some effort going forward, especially after I have a salary to spend on that text box.
I would love the ability to browse available datasets without having to register. And I'd really love to be able to request a dataset without having to register.<p>Structured datasets are hard, and the hardest part is finding one that updates as time passes and new data is available - how are y'all handling that?
This is every grad student researcher's dream... data finding and cleaning is such a major hassle. If only I had this a couple years ago...<p>This is going to be great especially when the new Census comes out. Lots of data to be sifted through and organized in unique ways that the government never does for us.
I would redesign the site to be more enterprise-like than web-application-like, since this is more B2B. I think I am going to buy or sell some data on this site.