We need to focus on the other direction: how can we have chains of trust for content creation, such as for real video? Content can be faked, but not necessarily easily faked from the same sources that make use of cryptographic signing. Attackers can sign their own work, so you'd need ways to distinguish those cases, but device-level keys, organizational keys, and distribution keys can all provide provenance chains that downstream systems can use to _better_ detect fraud, though not eliminate it.
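A minimal sketch of what such a chain could look like, assuming Ed25519 keys via the pyca/cryptography package; the file name and the key-management details are placeholders, not a real capture pipeline:

```python
# Minimal provenance-chain sketch: a device key signs the content hash,
# and an organizational key countersigns the device's public key, so a
# verifier can walk the chain device -> organization.
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

# Keys (in practice the device key would live in secure hardware,
# the org key with the publisher).
device_key = Ed25519PrivateKey.generate()
org_key = Ed25519PrivateKey.generate()

# Organization vouches for the device by signing its public key.
device_pub = device_key.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)
device_cert_sig = org_key.sign(device_pub)

# Device signs a hash of the captured frame/file at creation time.
content = open("frame_000123.raw", "rb").read()  # hypothetical capture
content_hash = hashlib.sha256(content).digest()
content_sig = device_key.sign(content_hash)

# A downstream verifier checks both links in the chain
# (verify() raises InvalidSignature if either link is broken).
org_key.public_key().verify(device_cert_sig, device_pub)    # org vouches for device
device_key.public_key().verify(content_sig, content_hash)   # device vouches for content
print("provenance chain verified")
```

This doesn't stop an attacker from signing their own fakes with their own keys; it only lets a verifier decide which chains of keys they trust.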
For written text, the problem may be even harder. Identifying the human author of a text is a field called "stylometry", but this result shows that some simple transformations reduce its success rate to random chance [1].<p>Similarly, I suspect watermarking LLM output is probably unworkable. The output of a smart model could be de-watermarked by fine-tuning a dumb open-source model on the initial output and then regenerating the original output token by token, selecting alternate words whenever multiple completions have close probabilities and are semantically equivalent. It would be a bit tedious to dial in perfectly, but I suspect it could be done.<p>And then ultimately, short text selections can carry a lot of meaning with very little entropy to uniquely tag (e.g., covfefe).<p>[1] <a href="https://dl.acm.org/doi/abs/10.1145/2382448.2382450" rel="nofollow noreferrer">https://dl.acm.org/doi/abs/10.1145/2382448.2382450</a><p>Curious if Scott Aaronson solved this challenge...
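A rough sketch of the regeneration attack described above, using Hugging Face transformers. GPT-2 stands in for the fine-tuned open-source surrogate, the probability margin is an arbitrary placeholder, and the semantic-equivalence check is omitted; this is just the token-swapping loop, not a working de-watermarker:

```python
# Walk the watermarked text token by token and, wherever the surrogate model
# thinks an alternative token is nearly as likely, swap it in to scramble
# whatever statistical signature the watermark relies on.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for a fine-tuned open-source surrogate
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

watermarked_text = "The quick brown fox jumps over the lazy dog."
ids = tok.encode(watermarked_text)
margin = 0.05  # arbitrary: swap only when the runner-up is this close in probability

out = ids[:1]  # keep the first token as-is
with torch.no_grad():
    for i in range(1, len(ids)):
        prefix = torch.tensor([out])
        probs = torch.softmax(model(prefix).logits[0, -1], dim=-1)
        orig = ids[i]
        top2 = torch.topk(probs, 2).indices.tolist()
        alt = top2[0] if top2[0] != orig else top2[1]
        # Swap in the alternative only when its probability is close to the original's.
        if abs(float(probs[orig] - probs[alt])) < margin:
            out.append(alt)
        else:
            out.append(orig)

print(tok.decode(out))
```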
It seems it would be much easier to watermark non-AI images instead, i.e., with a cryptographic signature.<p>That would be much harder to evade, but also pretty hard to implement.<p>I guess we will end up in a middle ground where any unsigned image could be AI generated, but for most day-to-day use that's OK.<p>If you want something to be deemed legit (a government press release, a newspaper photo, etc.), then just sign it. Very similar to what we do for web traffic (HTTPS).
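A minimal sketch of the "just sign it" flow, again assuming Ed25519 via pyca/cryptography; the file name is a placeholder. The publisher ships a detached signature alongside the photo and readers verify it against the publisher's published key, much like checking a TLS certificate:

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

publisher_key = Ed25519PrivateKey.generate()   # would live with the news outlet
photo = open("press_photo.jpg", "rb").read()   # hypothetical file
signature = publisher_key.sign(photo)

# Reader side: only the publisher's public key is needed.
public_key = publisher_key.public_key()
try:
    public_key.verify(signature, photo)
    print("signed by the publisher; treat as legit")
except InvalidSignature:
    print("unsigned or tampered; could be AI generated")
```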
People have been trying to watermark digital media for decades, when there was (still is) a very strong financial incentive to get it working. It never worked. I don’t think it ever will work.
Wasn't it obvious from the get-go that this can't work?<p>If AI eventually generates, say, 10k by 10k images, I can resize to 2.001k by 1.999k or similar, and I just don't see how any subtle signal in the pixels can persist through that (a toy demonstration is sketched below).<p>Maybe you could do something at the compositional level, but that seems restrictive on the output. Maybe something like the average color balance of larger regions? But you wouldn't be able to fit many bits in there, especially when you need to avoid triggering it accidentally.<p>Also: here are some play-money markets for whether this will work:<p><a href="https://manifold.markets/Ernie/midjourney-images-can-be-effectivel" rel="nofollow noreferrer">https://manifold.markets/Ernie/midjourney-images-can-be-effe...</a><p><a href="https://manifold.markets/Ernie/openai-images-have-a-useful-and-har" rel="nofollow noreferrer">https://manifold.markets/Ernie/openai-images-have-a-useful-a...</a>
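A toy illustration of the resizing argument, using a deliberately naive least-significant-bit watermark (real schemes are more robust, but face the same pressure):

```python
# Hide one bit per pixel in the LSB of a random image, resize by a
# non-integer factor and back, then check how much of the hidden signal survives.
import numpy as np
from PIL import Image

rng = np.random.default_rng(0)
base = rng.integers(0, 256, (1000, 1000), dtype=np.uint8)
bits = rng.integers(0, 2, base.shape, dtype=np.uint8)
marked = (base & 0xFE) | bits            # embed watermark bits in the LSB

img = Image.fromarray(marked)
resized = img.resize((999, 1001), Image.BICUBIC).resize((1000, 1000), Image.BICUBIC)
recovered = np.asarray(resized) & 1

# Agreement lands close to 0.5, i.e. chance: the LSB signal is gone.
print("LSB agreement after resize:", (recovered == bits).mean())
```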
We already have well-established systems to prove the provenance of images and other sources.<p>At the moment the internet is <i>awash</i> with bullshit images. It's imperative that news outlets are held to a high enough standard to actually prove their provenance.<p>You don't trust some bloke off Facebook asserting that something is true; it's the same for images.
The actual paper seems to be <a href="https://arxiv.org/abs/2310.00076" rel="nofollow noreferrer">https://arxiv.org/abs/2310.00076</a>.
I’ll never get over the “invisible_watermark” Python package being entirely visible to the naked eye: it obviously degrades the image in a way that’s unacceptable, and it’s even easily spottable in any image once you know what it looks like.
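A round trip with the package, with the API as I recall it from its README (treat the call names as an assumption); the file name is a placeholder. The amplified pixel diff makes the artifact the comment above describes easy to see:

```python
import cv2
import numpy as np
from imwatermark import WatermarkEncoder, WatermarkDecoder  # API per the package README, from memory

bgr = cv2.imread("input.png")                      # any test image (placeholder path)

encoder = WatermarkEncoder()
encoder.set_watermark("bytes", b"test")
marked = encoder.encode(bgr, "dwtDct")

decoder = WatermarkDecoder("bytes", 32)            # 32 bits = 4-byte payload
print(decoder.decode(marked, "dwtDct"))            # b'test' if the mark survived

# The visible damage: diff the original against the "invisibly" marked image.
diff = cv2.absdiff(bgr, marked)
print("max per-pixel change:", int(diff.max()))
amplified = np.clip(diff.astype(np.int32) * 16, 0, 255).astype(np.uint8)
cv2.imwrite("diff_amplified.png", amplified)
```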
Who was it, Scott McNealy, who said we need to get over it, there is no privacy? I feel like the same energy applies here to authenticating the human origin of content.