https://i.redd.it/kdndfeqlvv971.png

Source: https://cybre.space/@tindall/106539167944483388

From the same Mastodon thread:

> The model is known to reproduce some code, including GPL-licensed code, verbatim; therefore, it must contain verbatim copies of that code, however it is encoded.
>
> [...]
>
> the snippet in question is clearly, deeply original. it is a cursed coding crime that contains several "magic constants" with high entropy.

So it should be required to be open source now, right?
I wonder how well a company-hosted version of this would work if it only used the company's own code. It would require a large amount of internal code, and the company might need to do some work to restrict it to code that can be reused across the company, but it should work for companies like GitHub's parent, Microsoft.
You should read the other HN threads on this subject. There is a case for fair use, so it will have to be tested in the courts. Expect that to take many years.