Main cheater identified as Pavel Pleskov.<p>According to this presentation:<p><a href="https://www.slideshare.net/DataFestTbilisi/how-to-win-a-machine-learning-competition-pavel-pleskov" rel="nofollow">https://www.slideshare.net/DataFestTbilisi/how-to-win-a-mach...</a><p>He worked at H2O.ai (but I believe he has since been fired).
Prior to that (again according to the above):<p><pre><code> - Master of Science from Moscow State University
 - New Economic School (Moscow)
- Financial Consultant
- Quantitative Researcher
- HFT Fund partner
</code></pre>
Overall it seems to be an impressive track record; this is the type of background that's often mentioned on HN as what top firms would hire from...<p>It's completely unclear why he needed to cheat. Are there other sophisticated cheaters out there in these types of competitions?<p>Maybe there need to be prizes for 'checking' other people's work?
I don't see much wrong here, or how this would be cheating. They produced a winning entry; it should be on the organizer to ensure that their test data set isn't trivially findable. It would be like testing a digit recognizer on the MNIST data set and being surprised when someone just hashes it. The real solution isn't to force open-sourcing; it is to get better metrics. Maybe add a random component, like a GAN that generates potential test data, and see if anything classifies that correctly. In the real world, when the metric becomes the target it ceases to be a good metric. So test what you want to test, not just some existing data set.<p>Edit: I didn't see that the test data was given. See the first reply to this comment.
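<p>For the curious, here's what "just hashing" the test set looks like in practice: a minimal Python sketch (the class name LookupCheat is mine, and random arrays stand in for real MNIST images) that memorises leaked test examples and "predicts" by table lookup.<p><pre><code> # Minimal sketch, assuming the test images and their labels have leaked
 # (e.g. the hidden test set is just a public dataset like MNIST).
 import hashlib
 import numpy as np

 def image_hash(img):
     # Fingerprint the raw pixel bytes; any stable hash of the image works.
     return hashlib.sha256(np.ascontiguousarray(img).tobytes()).hexdigest()

 class LookupCheat:
     def __init__(self, leaked_images, leaked_labels):
         # Build a hash -> label table from the leaked test data.
         self.table = {image_hash(x): y for x, y in zip(leaked_images, leaked_labels)}

     def predict(self, images, fallback=0):
         # "Classify" by lookup; fall back to a dummy label for unseen images.
         return [self.table.get(image_hash(x), fallback) for x in images]

 # Toy demo with random stand-ins for MNIST digits.
 rng = np.random.default_rng(0)
 test_images = rng.integers(0, 256, size=(10, 28, 28), dtype=np.uint8)
 test_labels = rng.integers(0, 10, size=10)

 cheat = LookupCheat(test_images, test_labels)
 preds = cheat.predict(test_images)
 print(all(p == y for p, y in zip(preds, test_labels)))  # True: a perfect "score"
</code></pre>
If a perfect score is achievable by lookup, the metric is measuring data leakage, not model quality.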