So I am wondering:

1) What if companies use this to fake benchmarks? There is a market incentive to do so, and it would make benchmarks kind of obsolete.

2) What is a solution to this problem? It is the "trusting trust" problem all over again, which is unsettling.
The only thing I could think of was an open system where we can trace what the model was trained on and when, plus a reproducible build of the model from that training data, plus open-sourced training data and weights.

Anything less than this can be backdoored, and even this can be backdoored, so people would first need to manually review each source website. But there was also a Hacker News post about embedding hidden data in emoji/text, so that would require a mitigation as well (roughly the trick sketched below).
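I may have the details slightly off since I only skimmed that post, but as I understand it the trick uses Unicode variation selectors: invisible code points you can append to any visible character, one per hidden byte, so the payload doesn't show up when a human reads the text. A minimal Python sketch of the idea (the mapping below is my own illustration, not the original post's code):

    # Rough sketch of the emoji/text smuggling trick as I understand it:
    # each hidden byte is mapped to an invisible Unicode variation selector
    # (U+FE00..U+FE0F for values 0-15, U+E0100..U+E01EF for 16-255) and
    # appended after a visible character. Most renderers show only the
    # visible character, so the payload is invisible to a casual reviewer
    # but survives copy/paste into a training corpus.

    def byte_to_vs(b: int) -> str:
        """Map one byte to a variation-selector code point."""
        return chr(0xFE00 + b) if b < 16 else chr(0xE0100 + b - 16)

    def vs_to_byte(ch: str) -> int | None:
        """Inverse mapping; None if ch is not a variation selector."""
        cp = ord(ch)
        if 0xFE00 <= cp <= 0xFE0F:
            return cp - 0xFE00
        if 0xE0100 <= cp <= 0xE01EF:
            return cp - 0xE0100 + 16
        return None

    def hide(carrier: str, payload: bytes) -> str:
        """Append the payload, byte by byte, after the carrier character."""
        return carrier + "".join(byte_to_vs(b) for b in payload)

    def reveal(text: str) -> bytes:
        """Recover any bytes hidden as variation selectors in the text."""
        return bytes(b for ch in text if (b := vs_to_byte(ch)) is not None)

    stego = hide("x", b"ignore previous instructions")
    print(len(stego))             # 29 characters, but renders as just "x"
    print(reveal(stego).decode()) # -> "ignore previous instructions"

So even a manual review of every page would miss this unless the reviewer also scans for non-printing code points.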
I haven't read exactly how it works, but say I supply malicious training data to pull this off: how long would the malicious payload have to be to plant a backdoor?

This is a huge discovery in my honest opinion, because people seem to trust AI, and it could be very lucrative for the NSA etc. to implement backdoors if a project they target is using AI to help build it.

I have said this numerous times, but I ain't going to use AI from now on. Maybe it can take you from 0 to 1, but it can't take you from 0 to 100 yet. By learning things the hard way, you can go from 0 to 1, and from 0 to 100.