科技回声

1 comment

ggm, over 2 years ago

I don't want to be a negative Nancy, but what chance is there that, by judicious comparison of textual outputs, people can back-calculate the algorithm behind the watermark, and then either remove or defeat it?

E.g. if it uses inter-textual spacing, then re-flowing the pagination would erode it. If it uses optional textual marking, a semantic read could replace ; instances with other constructs. If it uses dependent word order at sentence end or start, or other stylometric changes, that too is probably statistically detectable.

To function for text fragments it will have to have some bitrate in the word stream, and therefore be subject to loss of sufficient bits to prevent reconstruction of the hash code (or whatever) it is based on.

I want this to exist, but I worry it's a signal that more than just the originator can detect, and that it can therefore potentially be defeated.

I very much hope they aren't proposing security by obscurity. If it's public-private key based, then it will want the security community to review it with a fine-tooth comb.

Oh dear. I see: Methods for keeping the watermark algorithm secret but available via API are discussed in Section 5.

So they propose a large set of tokens, and a 50/50 style selection of which have meaning and which don't, to increase the cost of detecting the tokens and "flipping" them, therefore permitting them to argue that the watermark may have been found, but could not be entirely eroded.

In an earlier HN thread I was (incorrectly) accused of using a GPT to write my responses. Shortly after, I read of others, and a few days after that Scott Aaronson talked about wanting to deploy a watermark/signature method. I do not see myself as a trigger; I am just, sui generis, an instance of why he said they need to. It turns out the simplistic "you are a bot" people have a horrendous false positive rate, and a lot of people who write like me are at risk of being labelled bots when we aren't.

(This whine was not produced with ChatGPT.)
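For readers who want something concrete, the 50/50 token selection and the "loss of bits" worry can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the keyed SHA-256 split, the `demo-key` secret, and the names `is_green`, `pick_green`, and `green_z_score` are all hypothetical, and a real system would bias a language model's sampling rather than pick words from a toy vocabulary.

```python
import hashlib
import math
from typing import Sequence


def is_green(prev_token: str, token: str, key: str = "demo-key") -> bool:
    """Assign `token` to the 'green' half with probability ~0.5.

    The split is pseudo-random and depends on the previous token and a
    secret key (both hypothetical choices made for this sketch).
    """
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode("utf-8")).digest()
    return digest[0] % 2 == 0


def pick_green(prev_token: str, vocab: Sequence[str], key: str = "demo-key") -> str:
    """Toy 'generator': return the first green word in `vocab`, if there is one."""
    for word in vocab:
        if is_green(prev_token, word, key):
            return word
    return vocab[0]  # fallback when no candidate happens to be green


def green_z_score(tokens: Sequence[str], key: str = "demo-key") -> float:
    """z-score of the green-token count against the 50% expected by chance."""
    scored = len(tokens) - 1  # the first token has no predecessor
    if scored < 1:
        raise ValueError("need at least two tokens")
    green = sum(is_green(p, t, key) for p, t in zip(tokens, tokens[1:]))
    # Under the null hypothesis each scored token is green with p = 0.5,
    # so the count has mean 0.5 * scored and variance 0.25 * scored.
    return (green - 0.5 * scored) / math.sqrt(0.25 * scored)


# Build a 51-token "watermarked" sequence from a toy vocabulary.
vocab = [f"w{i}" for i in range(64)]
marked = ["start"]
for _ in range(50):
    marked.append(pick_green(marked[-1], vocab))

print(green_z_score(marked))
```

In this sketch, text made only of green tokens scores about the square root of the number of scored tokens, while unmarked text should hover near zero. Rewriting a token changes the green/red status of that position and the one after it, so each edit pulls the score back toward zero, which is the erosion concern raised above; anyone without the key cannot tell which tokens to flip except by comparing many outputs.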


A Watermark for Large Language Models
