Let me try to simplify the underlying problem here (I dabble in this space):<p>A little bit of background: writing pattern-matching signatures is hard; adding a bunch of "known malicious" hashes to your malware database is easy.<p>So, company A, with a staff of folks writing pattern-matching signatures, has its engine added to VirusTotal, and VirusTotal shares/sells the hashes flagged by that engine to folks who pay for its API. Company B, without a staff of engineers writing pattern-matching signatures, signs up for the VirusTotal API and builds its malware database purely from the hashes that other, actual engines produce.<p>One important thing to keep in mind: when this happens at the scale of VirusTotal (basically all real engines participate), the resulting "hash database" is essentially bulletproof in tests, since it's likely that any sample used to test its effectiveness has already been run through VirusTotal.<p>We (I run scanii.com, a malware/content detection API service) run into this all the time with folks either abusing or just not understanding the reason VT exists.
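<p>To make the hash vs. signature distinction concrete, here is a minimal sketch. Everything in it is made up for illustration: the `SIGNATURE` bytes, the sample contents, and the function names are assumptions, and no real engine or the VirusTotal API works exactly like this. It only shows that a hash list matches the exact bytes it has already seen, while a pattern can still match a trivially modified sample.

```python
import hashlib

# Hypothetical byte pattern standing in for a hand-written signature.
SIGNATURE = b"EVIL_PAYLOAD"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the sample as a hex string."""
    return hashlib.sha256(data).hexdigest()


def hash_match(data: bytes, known_hashes: set) -> bool:
    """Hash-list detection: flags only samples whose exact bytes were seen before."""
    return sha256_hex(data) in known_hashes


def pattern_match(data: bytes) -> bool:
    """Toy pattern-matching detection: flags any sample containing the pattern."""
    return SIGNATURE in data


# A made-up "known" sample and a variant with one extra byte appended.
original = b"header" + SIGNATURE + b"footer"
variant = original + b"\x00"

# The hash database only knows about the original sample.
known_hashes = {sha256_hex(original)}

# Each entry is (hash_match result, pattern_match result).
results = {
    "original": (hash_match(original, known_hashes), pattern_match(original)),
    "variant": (hash_match(variant, known_hashes), pattern_match(variant)),
}
print(results)
```

The original sample is caught both ways. The variant slips past the hash list but still trips the pattern, which is why a hash-only database looks great only when the test samples are ones it has already seen.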