
DeepMind debuts watermarks for AI-generated text

131 points | by ambigious7777 | 6 months ago

26 comments

blintz | 6 months ago
These watermarks are not robust to paraphrasing attacks: AUC ROC falls from 0.95 to 0.55 (barely better than guessing) for a 100-token passage.

The existing impossibility results imply that these attacks are essentially unavoidable (https://arxiv.org/abs/2311.04378) and not very costly, so this line of inquiry into LLM watermarking seems like a dead end.
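A minimal illustration (not from the cited paper) of what those AUC numbers mean: an AUC near 0.5 says the detector's scores on watermarked and human text are essentially indistinguishable. The score distributions below are made up for demonstration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = np.r_[np.zeros(1000), np.ones(1000)]  # 0 = human, 1 = watermarked

# Hypothetical detector scores before paraphrasing: clear separation.
human = rng.normal(0.0, 1.0, 1000)
watermarked = rng.normal(2.3, 1.0, 1000)
print(roc_auc_score(y_true, np.r_[human, watermarked]))   # ~0.95

# After a paraphrasing attack the watermark signal mostly washes out.
paraphrased = rng.normal(0.18, 1.0, 1000)
print(roc_auc_score(y_true, np.r_[human, paraphrased]))   # ~0.55, near chance
```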
bko | 6 months ago
This article goes into it a little bit, but an interview with Scott Aaronson goes into some detail about how watermarking works [0].

He's a theoretical computer scientist, but he was recruited by OpenAI to work on AI safety. He has a very practical view on the matter and is focusing his efforts on leveraging the probabilistic nature of LLMs to provide a digital, undetectable watermark. It nudges certain words to be paired together slightly more often than random, and you can mathematically derive, with some level of certainty, whether an output (or even a section of an output) was generated by the LLM. It's really clever, and apparently he has a working prototype in development.

One workaround he hasn't figured out yet is asking for an output in language X and then translating it into language Y. But that may still eventually be solved.

I think watermarking would be a big step forward for practical AI safety, and ideally this method would be adopted by all major LLMs.

That part starts around 1 hour 25 minutes in.

> Scott Aaronson: Exactly. In fact, we have a pseudorandom function that maps the N-gram to, let's say, a real number from zero to one. Let's say we call that real number ri for each possible choice i of the next token. And then let's say that GPT has told us that the ith token should be chosen with probability pi.

[0] https://axrp.net/episode/2023/04/11/episode-20-reform-ai-alignment-scott-aaronson.html
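A minimal sketch of how such a scheme can work, assuming the selection rule usually attributed to it: pick the token i maximizing r_i^(1/p_i). The hash construction, key, and token dictionaries below are illustrative placeholders, not OpenAI's implementation.

```python
import hashlib

def r_value(ngram: tuple[int, ...], token_id: int, key: bytes) -> float:
    """Keyed pseudorandom map from (context n-gram, candidate token) to [0, 1)."""
    digest = hashlib.sha256(key + str((ngram, token_id)).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def watermarked_pick(ngram: tuple[int, ...], probs: dict[int, float], key: bytes) -> int:
    """probs maps candidate token id -> model probability p_i (assumed nonzero).
    Picking argmax r_i^(1/p_i) still selects token i with probability ~p_i over
    many contexts, but the choice is verifiable by anyone holding the key."""
    return max(probs, key=lambda i: r_value(ngram, i, key) ** (1.0 / probs[i]))

# Detection side (sketch): recompute r for each (n-gram, chosen token) in a
# suspect text; watermarked text yields r-values clustered near 1, while
# human-written text yields roughly uniform ones.
```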
namanyayg | 6 months ago
"An LLM generates text one token at a time. These tokens can represent a single character, word or part of a phrase. To create a sequence of coherent text, the model predicts the next most likely token to generate. These predictions are based on the preceding words and the probability scores assigned to each potential token.

For example, with the phrase “My favorite tropical fruits are __.” The LLM might start completing the sentence with the tokens “mango,” “lychee,” “papaya,” or “durian,” and each token is given a probability score. When there's a range of different tokens to choose from, SynthID can adjust the probability score of each predicted token, in cases where it won't compromise the quality, accuracy and creativity of the output.

This process is repeated throughout the generated text, so a single sentence might contain ten or more adjusted probability scores, and a page could contain hundreds. The final pattern of scores for both the model's word choices combined with the adjusted probability scores are considered the watermark. This technique can be used for as few as three sentences. And as the text increases in length, SynthID's robustness and accuracy increases."

Better link: https://deepmind.google/technologies/synthid/
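This is not SynthID's exact algorithm (its tournament sampling is summarized in a later comment); as a simpler illustration of the general idea of keyed probability nudging plus statistical detection, here is a green-list-style sketch. All names and the hash construction are illustrative assumptions.

```python
import hashlib, math, random

def is_green(context: tuple[int, ...], token_id: int, key: bytes) -> bool:
    """Keyed hash splits the vocabulary into a 'green' half per context window."""
    h = hashlib.sha256(key + str((context[-4:], token_id)).encode()).digest()
    return h[0] % 2 == 0

def nudged_sample(context: tuple[int, ...], logprobs: dict[int, float],
                  key: bytes, delta: float = 2.0) -> int:
    """Add a small bias to green tokens' log-probabilities, then sample as usual."""
    biased = {t: lp + (delta if is_green(context, t, key) else 0.0)
              for t, lp in logprobs.items()}
    z = math.log(sum(math.exp(v) for v in biased.values()))
    tokens, weights = zip(*((t, math.exp(v - z)) for t, v in biased.items()))
    return random.choices(tokens, weights)[0]

def green_fraction(tokens: list[int], key: bytes) -> float:
    """Detector: watermarked text has well over ~50% green tokens, human text ~50%."""
    hits = sum(is_green(tuple(tokens[:i]), t, key)
               for i, t in enumerate(tokens) if i >= 4)
    return hits / max(1, len(tokens) - 4)
```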
ruuda | 6 months ago
Some comments here point at impossibility results, but after screening hundreds of job applications at work, it's not hard to pick out the LLM writing, even without a watermark. My internal LLM detector is now so sensitive that I can tell when my confirmed-human colleagues used an LLM to rephrase something, whenever it's longer than just one sentence. The writing style is just so different.

Maybe if you prompt it right, it can do a better job of masking itself, but people don't seem to do that.
ksaj | 6 months ago
Some of the watermarking is really obvious. If you write song lyrics in ChatGPT, watch for phrases like "come what may" and "I stand tall."

It's not just that they are (somewhat) unusual phrases, it's that ChatGPT comes up with those phrases so very often.

It's quite like how earlier versions always had a "However" in between explanations.
espadrine | 6 months ago
The academic paper: https://www.nature.com/articles/s41586-024-08025-4

They use the last N prefix tokens, hash them (with a keyed hash), and use the random value to sample the next token by doing an 8-wise tournament: they assign random bits to each of the top 8 preferred tokens, make pairwise comparisons, and keep the token with the larger bit. (Yes, it seems complicated, but apparently it increases the watermarking accuracy compared to straightforward nucleus sampling.)

The downside of this approach is that you need to rerun the LLM, so you must keep all versions of all LLMs that you trained, forever.
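A rough sketch of that tournament step, simplified relative to the Nature paper (which uses multiple g-value layers and more careful tie handling); the hash construction and names are illustrative assumptions.

```python
import hashlib, random

def g_value(prefix: tuple[int, ...], token_id: int, key: bytes, layer: int) -> int:
    """Keyed pseudorandom bit for a candidate token, given the recent context."""
    h = hashlib.sha256(key + str((prefix[-4:], token_id, layer)).encode()).digest()
    return h[0] & 1

def tournament_sample(prefix: tuple[int, ...], probs: dict[int, float], key: bytes) -> int:
    """Draw 8 candidates from the model distribution, then run 3 knockout rounds,
    keeping whichever candidate has the larger g-bit (ties broken at random)."""
    tokens, weights = zip(*probs.items())
    candidates = random.choices(tokens, weights, k=8)
    for layer in range(3):                       # 8 -> 4 -> 2 -> 1
        nxt = []
        for a, b in zip(candidates[::2], candidates[1::2]):
            ga, gb = g_value(prefix, a, key, layer), g_value(prefix, b, key, layer)
            nxt.append(a if ga > gb else b if gb > ga else random.choice((a, b)))
        candidates = nxt
    return candidates[0]

# Detection (sketch): recompute g-values for the tokens of a suspect text;
# watermarked text scores systematically higher than human text's ~uniform bits.
```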
samatman | 6 months ago
This is information-theoretically guaranteed to make LLM output worse.

My reasoning is simple: the only way to watermark text is to inject some relatively low-entropy signal into it, which can be detected later. This has to (a) work for "all" output, for some value of all, and (b) have a low false-positive rate on the detection side. The amount of signal involved cannot be subtle, for this reason.

That signal has a subtractive effect on the predictive-output signal. The entropy of the output is fixed by the entropy of natural language, so this is a zero-sum game: the watermark signal will remove fidelity from the predictive output.

This is impossible to avoid or fix.
mateus1 | 6 months ago
Google is branding this in a positive light, but this is just AI text DRM.
fny | 6 months ago
I think we just need to give up on this. What's the harm? It's not like some ground truth is fabricated.

I'm far, far more concerned about photo, video, and audio verification. We need a camera that can guarantee a recording is real.
playingalong | 6 months ago
> the team tested it on 20 million prompts given to Gemini. Half of those prompts were routed to the SynthID-Text system and got a watermarked response, while the other half got the standard Gemini response. Judging by the “thumbs up” and “thumbs down” feedback from users, the watermarked responses were just as satisfactory to users as the standard ones.

Three comments here:

1. I wonder how many of the 20M prompts actually got a thumbs up or down. I don't think people click that a lot, unless the UI enforces it. I haven't used Gemini, so I might be unaware.

2. Judging a single response might not be enough to tell whether watermarking is acceptable. For instance, imagine the watermarking adds "However," to the start of each paragraph. In a single GPT interaction you might not notice it; after 3 or 4 responses it might stand out.

3. Since when is Google happy with measuring by self-declared satisfaction? Aren't they the kings of A/B testing and high-volume analysis of usage behavior?
tokioyoyo | 6 months ago
Correct me if I'm wrong, but wouldn't it simply drive people to use LLMs that are not watermarking their content?
harimau777 | 6 months ago
This strikes me as potentially a bad thing for regular people. For example, corporations can still use AI filtering to force job seekers to jump through hoops, but job seekers won't be able to use AI to generate the cover letters and resumes that those hoops demand.
sharpshadow | 6 months ago
To archive the watermark they store every output they create and let partners check against it. That's how I understand the article.

Then they also store everything the partners upload, to check if it was created by them.

If other AI players would also store everything they create and make it available in a similar way, there could indeed be some working watermark.

If one used a privately run AI to alter the publicly run AI's generated content, there would still be a recognisable percentage of similarity hinting that it might come from one of the public AIs.

Timestamps would become quite relevant, since much content would start to repeat itself at some point and the answers generated might be similar.
matteoraso | 6 months ago
By design, a watermark would make it easy to create a discriminator that distinguishes between LLM content and human content. In that case, just make a discriminator yourself and use regex to find and remove any of the watermarks.
js8 | 6 months ago
I think people are already doing that. I frequently hear people watermarking their speeches with phrases like "are we aligned on this?" or "let's circle back" and similar.
tomxor | 6 months ago
> Such modifications introduce a statistical signature into the generated text

Great, so now people have to worry about being too statistically similar to an arbitrary "watermark".
rany_ | 6 months ago
I really want to be able to try Gemini without the AI watermark. IIRC they've used SynthID from the start, and it makes me wonder if it's the source of all of Gemini's issues.

Obviously Google claims that it doesn't cause any issues, but I'd think that OpenAI and other competitors would have something similar to SynthID if it didn't impact performance.
lowbloodsugar | 6 months ago
I want AI to use just the right word when it’s writing for me. If it’s going to nerf itself to not choose the perfect word so it can be watermarked, then why would I use that product? I’ll go somewhere else. And if it does use just the right word, then how is that different from a great human writer?
nprateem | 6 months ago
Google are obviously pushing this as a way to root out AI blog spam.

If only they can get other providers to use it, because of 'safety' or something, they won't have to change their indexer much. Otherwise PageRank is dead, due to the ease of creating content farms.
ajwin | 6 months ago
Do LLMs always pick the most probable next word? I would have thought that would lead to the same output for every input. How does this deal with the randomness you get from prompting the same thing over and over?
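A minimal illustration of the sampling step this question is about, assuming standard temperature sampling: greedy decoding always takes the argmax, while sampling draws from the (rescaled) distribution, which is where run-to-run variation comes from. The probabilities below reuse the fruit example from the SynthID excerpt and are made up.

```python
import math, random

probs = {"mango": 0.42, "papaya": 0.31, "lychee": 0.18, "durian": 0.09}

greedy = max(probs, key=probs.get)               # greedy decoding: always "mango"

def sample(probs: dict[str, float], temperature: float = 0.8) -> str:
    """Sharpen (T<1) or flatten (T>1) the distribution, then draw one token."""
    scaled = {t: math.exp(math.log(p) / temperature) for t, p in probs.items()}
    z = sum(scaled.values())
    return random.choices(list(scaled), [v / z for v in scaled.values()])[0]

print(greedy, [sample(probs) for _ in range(5)])  # e.g. mango ['mango', 'papaya', ...]
```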
playingalong | 6 months ago
> It has also open-sourced the tool and made it available to developers and businesses, allowing them to use the tool to determine whether text outputs have come from their own large language models (LLMs), the AI systems that power chatbots. However, only Google and those developers currently have access to the detector that checks for the watermark.

These two sentences next to each other don't make much sense, or are misleading.

Yeah, I know. Only the client is open source, and it calls home.
villmann | 6 months ago
To what degree will AI-generated text and watermarking influence how human language evolves... I bet "delve" will become more frequent in spoken language :)
tiffanyh | 6 months ago
OT: The publication (IEEE Spectrum) has some really good content.

It's starting to become a common destination when I want to read about interesting things.
FilipSivak | 6 months ago
How is this supposed to work? By inserting special Unicode characters?

How can you watermark text?
cowmix | 6 months ago
"I hope this message finds you well." --- busted!
matthewmorgan | 6 months ago
Who is going to pay for watermarked output?