Bestpetting Cheated

23 points by bgraves over 5 years ago

2 comments

75dvtwin over 5 years ago
Main cheater identified as Pavel Pleskov.

According to this presentation:

https://www.slideshare.net/DataFestTbilisi/how-to-win-a-machine-learning-competition-pavel-pleskov

He worked at H2O.ai (but I think he has now been fired). Prior to that (again according to the above):

 - Master of Science from Moscow State University
 - New Economics School (Moscow)
 - Financial Consultant
 - Quantitative Researcher
 - HFT Fund partner

Overall it seems to be an impressive track record, the type that is often mentioned on HN and that the top firms would hire from...

It is completely unclear why he needed to cheat. Are there other sophisticated cheaters out there in these types of competitions?

Maybe there need to be prizes for 'checking' other people's work..?
g82918 over 5 years ago
I don't see much wrong, or how this would be cheating. They produced a winning entry; it should be on the organizer to ensure that their test data set isn't trivially findable. It would be like testing a digit recognizer on the MNIST data set and being surprised when someone just hashes it. The real solution isn't to force open-sourcing; it is to get better metrics. Maybe add a random component, like a GAN, to generate potential test data, and see if anything classifies that correctly. In the real world, when the metric becomes the target, it ceases to be a good metric. So test what you want to test and not just some existing data set.

Edit: I didn't see that the test data was given. See the first reply to this comment.
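For anyone wondering what "just hashes it" looks like in practice, here is a rough sketch of the MNIST analogy only. It does not describe what the team in this story actually did. The toy byte strings, the `weak_model` fallback, and all function names are made up for illustration.

```python
import hashlib


def fingerprint(sample: bytes) -> str:
    """Return a stable hash of the raw sample bytes."""
    return hashlib.sha256(sample).hexdigest()


def build_lookup(samples, labels):
    """Map each known sample's hash to its label (pure memorization)."""
    return {fingerprint(s): y for s, y in zip(samples, labels)}


def predict(sample, lookup, fallback):
    """Answer from the lookup if the sample was seen, else use the fallback model."""
    return lookup.get(fingerprint(sample), fallback(sample))


def accuracy(samples, truth, lookup, fallback):
    """Fraction of samples whose prediction matches the true label."""
    hits = sum(predict(s, lookup, fallback) == y for s, y in zip(samples, truth))
    return hits / len(samples)


# Hypothetical "publicly findable" test set: 10 tiny samples with labels 0..2.
leaked = [bytes([i, i + 1, i + 2]) for i in range(10)]
labels = [i % 3 for i in range(10)]
lookup = build_lookup(leaked, labels)


def weak_model(sample: bytes) -> int:
    """Stand-in for a model that learned nothing: always predicts class 0."""
    return 0


# Scored on the findable test set, the memorizer looks perfect.
leaked_score = accuracy(leaked, labels, lookup, weak_model)

# Scored on samples it has never seen, only the weak fallback is left.
fresh = [bytes([100 + i, i, i]) for i in range(10)]
fresh_score = accuracy(fresh, labels, lookup, weak_model)

print(leaked_score, fresh_score)
```

The gap between the two scores is the whole point: a leaderboard built on data that entrants can look up measures memorization, which is why scoring on freshly held-out or generated data catches it.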
Comment #22046798 not loaded
Comment #22046807 not loaded