A Watermark for Large Language Models

4 points by lnyan, over 2 years ago

1 comment

ggm, over 2 years ago
I don't want to be a negative Nancy, but what are the chances that, by judicious comparison of textual outputs, people can back-calculate the algorithm behind the watermark and then either remove or defeat it?

E.g., if it uses inter-textual spacing, then re-flowing the pagination would erode it. If it uses optional textual marking, a semantic read could replace instances of ";" with other constructs. If it uses dependent word order at sentence end or start, or other stylometric changes, that too is probably statistically detectable.

To function for text fragments it will have to have some bitrate in the word stream, and will therefore be subject to the loss of enough bits to prevent reconstruction of the hash code (or whatever) it is based on.

I want this to exist, but I worry it's a signal that more than just the originator can detect, and that it can therefore potentially be defeated.

I very much hope they aren't proposing security by obscurity. If it's public-private key based, then it will want the security community to go over it with a fine-tooth comb.

Oh dear. I see: "Methods for keeping the watermark algorithm secret but available via API are discussed in Section 5."

So they propose a large set of tokens, and a 50/50 style selection of which have meaning and which don't, to increase the cost of detecting the tokens and "flipping" them. That lets them argue that the watermark may have been found but could not be entirely eroded.

In an earlier HN thread I was (incorrectly) accused of using a GPT to write my responses. Shortly after, I read of others, and a few days after that Scott Aaronson talked about wanting to deploy a watermark/signature method. I do not see myself as a trigger, in that I am just, sui generis, an instance of why he said they need to: it turns out the simplistic "you are a bot" people have a horrendous false positive rate, and a lot of people who write like me are at risk of being labelled bots when we aren't.

(this whine was not produced with ChatGPT)
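
To make the erosion argument above concrete, here is a minimal toy sketch of a keyed 50/50 token split with a statistical detector. It is an illustration under stated assumptions, not the paper's implementation: the vocabulary is just integers, "generation" is uniform random sampling rather than a language model, and `SECRET_KEY`, `VOCAB_SIZE`, `GAMMA`, `is_green`, `generate`, `z_score` and `attack` are all hypothetical names chosen for this example.

```python
import hashlib
import math
import random

VOCAB_SIZE = 1000          # hypothetical toy vocabulary: tokens are ints 0..999
GAMMA = 0.5                # assumed fraction of "green" tokens (the 50/50 split)
SECRET_KEY = b"demo-key"   # hypothetical key; whoever holds it can run detection


def is_green(prev_token: int, token: int) -> bool:
    """Keyed pseudo-random colouring of `token`, depending on the previous token.

    Simplification: each token is coloured independently with probability
    GAMMA instead of partitioning the vocabulary into exact halves.
    """
    data = SECRET_KEY + prev_token.to_bytes(4, "big") + token.to_bytes(4, "big")
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GAMMA


def generate(n: int, rng: random.Random, watermark: bool) -> list[int]:
    """Sample n tokens after a start token; if watermarking, keep only green ones."""
    tokens = [rng.randrange(VOCAB_SIZE)]
    while len(tokens) < n + 1:
        candidate = rng.randrange(VOCAB_SIZE)
        if not watermark or is_green(tokens[-1], candidate):
            tokens.append(candidate)
    return tokens


def z_score(tokens: list[int]) -> float:
    """One-proportion z-test: how far the green count is above chance."""
    t = len(tokens) - 1
    green = sum(is_green(p, c) for p, c in zip(tokens, tokens[1:]))
    return (green - GAMMA * t) / math.sqrt(t * GAMMA * (1 - GAMMA))


def attack(tokens: list[int], fraction: float, rng: random.Random) -> list[int]:
    """Replace roughly `fraction` of the tokens with random ones (crude paraphrase)."""
    out = list(tokens)
    for i in range(1, len(out)):
        if rng.random() < fraction:
            out[i] = rng.randrange(VOCAB_SIZE)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    marked = generate(200, rng, watermark=True)
    plain = generate(200, rng, watermark=False)
    print("watermarked z:", round(z_score(marked), 2))
    print("plain z:      ", round(z_score(plain), 2))
    print("attacked z:   ", round(z_score(attack(marked, 0.5, rng)), 2))
```

In this toy, every one of the 200 watermarked tokens is green, so the detector's z-score is sqrt(200), about 14.1, while unwatermarked text should sit near 0. Randomly replacing tokens pulls the score back toward chance, which is the "loss of sufficient bits" worry in code form. How many edits a real scheme tolerates depends on text length and on how strongly generation is biased, neither of which this sketch models.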