
We urgently need the watermark equivalent of a “robots.txt” for Machine Learning

3 points by joshenders, over 2 years ago

1 comment

ksaj, over 2 years ago
I'd call it copyright.txt, since images are not the only thing affected.

I've been trying to determine whether GitHub, which supplies ready-made license files for every repository, enforces those licenses when feeding Copilot. It appears that it does not, which is really baffling, since there is no excuse for ignoring licenses that they supply as part of the process.

One way I tested was to write a Hello World in Common Lisp and other languages. The code Copilot generated very often came directly from a course, with "World" replaced by the name of the site or book the code was listed in. Did the license allow for this use?

You can opt out of your code being used by Copilot, but once it is already in the model, then what? And why require user input when the license file already states whether it is permitted or not?

The other thing is just how much crap code there is on GitHub. What kind of quality should one expect when poor code is copied so much, increasing the likelihood that Copilot will treat it as gospel?

AI has a long way to go yet.
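For what it's worth, here is a rough sketch of how a robots.txt-style opt-out could be checked before any file is pulled into a training set. Everything specific in it is an assumption made for illustration: no copyright.txt standard exists, the `ml-trainer` and `search-indexer` agent names are invented, and the policy text is made up. It simply reuses the robots.txt syntax so that Python's standard `urllib.robotparser` can read it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy file, written in plain robots.txt syntax.
# "ml-trainer" is an invented user-agent token, not a real crawler name.
COPYRIGHT_TXT = """\
User-agent: ml-trainer
Disallow: /

User-agent: *
Allow: /
"""


def may_use(policy_text: str, agent: str, path: str) -> bool:
    """Return True if the policy lets `agent` use the file at `path`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed.
    parser.parse(policy_text.splitlines())
    return parser.can_fetch(agent, path)


# A training crawler would check before ingesting a file.
ml_allowed = may_use(COPYRIGHT_TXT, "ml-trainer", "/src/hello.lisp")
# Any other agent falls through to the "*" rule.
search_allowed = may_use(COPYRIGHT_TXT, "search-indexer", "/src/hello.lisp")
```

With this policy the training agent is refused and every other agent is allowed. Like robots.txt, this only works if the crawler chooses to honor it, and it does nothing about code that is already in a model.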