TechEcho
Crowdsourced Data mining, forecasting and bioinformatics via competitions

41 points by pitdesi over 13 years ago

4 comments

mwexler over 13 years ago
This stuff is great. The rewards/prizes for the best model are minimal compared to what it usually costs to build a great model via a consulting contract or by hiring a high-quality data miner.

Similar to Mechanical Turk, we've managed to create a completely different value structure for some amazing work by smart folks, mostly by making it a competition. Great exposure for winners, sure, but these prizes are pretty minimal.

http://www.kaggle.com/c/GiveMeSomeCredit, for example, has a total prize pool of US$5K (only US$3K for first place) for a model predicting credit scores (in this case, likelihood of default or financial distress). Folks I talk to who do this type of work professionally tend to charge far more than that to create these models.

For the company sharing this data, of course, it's a big win: they get a cheap, potentially fantastic new model, and the creator gets some good exposure and some cash. But if these competitions take off, they could really change the economics of how this work gets done.
ap22213 over 13 years ago
How does intellectual property work in competitions like these? Are the entrants allowed to use proprietary methods? Do they give up IP by participating? Do the hosts of competitions gain any rights to the IP or its usage?

It's not immediately clear from skimming the legal terms of service. I couldn't find a FAQ.
zeratul over 13 years ago
Would "kaggle" be able to handle patient data? Would "kaggle" sign data use agreements with hospitals that are interested in a shared task? There is a growing number of medical data mining competitions, e.g.:

https://www.i2b2.org/NLP/Coreference/PreviousChallenges.php

But the systems for delivering data mining challenges in medicine are scattered, mostly because of the inability to create a secure, centralized web service.
stfu over 13 years ago
Very interesting project! Can anyone recommend some good "data mining for dummies" tutorials, books, etc.?