This is a really great approach, and much better than "ban all AI-generated content because we can't tell whose work it was derived from".<p>Even if it only finds similar matches rather than true attribution, I actually think that is better. Say I come up with a neat design but I'm not very famous, and later someone more famous comes up with the same design on their own. I don't deserve <i>attribution</i>, but I would argue I deserve <i>recognition</i>. Regardless of whether the popular design was <i>inspired by</i> or <i>derived from</i> the original, having a model like this match the popular design with the original, see that the original was created earlier, and give it recognition would be vindicating.<p>In fact, what if we used a neural network like this one to trace out huge DAGs linking every piece of media with its similar-but-earlier and similar-but-later counterparts? It would show the evolution of culture on a large scale: how various memes and pieces of culture get created, and where "artistic geniuses" likely get their inspiration. It would also function as a great recommendation engine.<p>As for copyright and royalties - the site's intro never mentioned them, just "attribution" and "people's identities". And honestly, I don't think people deserve a cut of AI-generated art that drew on their art unless the result is <i>extremely</i> similar. Most of the time it is not that similar: the AI takes one artist's work (which would not be enough training data on its own) and mixes it with many others' work, as humans do, and I don't believe the two are different in a way that would make the AI's mix retain the original artists' copyright.
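<p>To make the DAG idea above a bit more concrete, here is a rough sketch, not anything the site actually does. It assumes each work already has a creation time and an embedding vector from some similarity model; the names <i>Work</i>, <i>build_similarity_dag</i>, and the 0.9 threshold are all made up for illustration. Edges only ever point from the earlier work to the later one, which is what keeps the graph acyclic.

```python
from dataclasses import dataclass
from itertools import combinations
from math import sqrt


@dataclass(frozen=True)
class Work:
    """Hypothetical record for one piece of media."""
    id: str
    created: int                  # e.g. a year or a Unix timestamp
    embedding: tuple[float, ...]  # assumed to come from a similarity model


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_similarity_dag(works, threshold=0.9):
    """Map each work id to the ids of similar works created strictly later."""
    edges = {w.id: [] for w in works}
    for a, b in combinations(works, 2):
        if a.created == b.created:
            continue  # no clear ordering, so skip the pair to avoid cycles
        if cosine(a.embedding, b.embedding) >= threshold:
            earlier, later = (a, b) if a.created < b.created else (b, a)
            edges[earlier.id].append(later.id)
    return edges


works = [
    Work("obscure_design", 2001, (1.0, 0.0)),
    Work("famous_design", 2010, (0.99, 0.1)),
    Work("unrelated", 2005, (0.0, 1.0)),
]
dag = build_similarity_dag(works)
```

Here the obscure 2001 design ends up with an edge to the famous 2010 one, which is the "recognition" case. Comparing every pair is quadratic, so anything at real scale would need an approximate nearest-neighbour index instead.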