Judge dismisses DMCA copyright claim in GitHub Copilot suit

381 points by samspenc, 11 months ago

27 comments

munificent, 11 months ago
> Indeed, last year GitHub was said to have tuned its programming assistant to generate slight variations of ingested training code to prevent its output from being accused of being an exact copy of licensed software.

If I, a human, were to:

1. Carefully read and memorize some copyrighted code.

2. Produce new code that is textually identical to that. But in the process of typing it up, I randomly, mechanically tweak a few identifiers or something to produce code that has the exact same semantics but isn't character-wise identical.

3. Claim that as new original code without the original copyright.

I assume that I would get my ass kicked, legally speaking. That reads to me exactly like deliberate copyright infringement with willful obfuscation of my infringement.

How is it any different when a machine does the same thing?

daedrdev, 11 months ago
> The anonymous programmers have repeatedly insisted Copilot could, and would, generate code identical to what they had written themselves, which is a key pillar of their lawsuit since there is an identicality requirement for their DMCA claim. However, Judge Tigar earlier ruled the plaintiffs hadn't actually demonstrated instances of this happening, which prompted a dismissal of the claim with a chance to amend it.

It sounds fair from how the article describes it.

bityard, 11 months ago
This is pretty interesting, and I have conflicted feelings about the (seemingly obvious) outcome of this trial.

I wonder, if MS and OpenAI win, does that mean it will be legal for anyone to take the leaked source code for a proprietary product, train an LLM on it, and then ask the LLM to emit a version of it that is different enough to avoid copyright infringement?

That would be quite the double-edged sword for proprietary software companies.

hn_throwaway_99, 11 months ago
A slight aside, but this is the subtitle:

> A few devs versus the powerful forces of Redmond – who did you think was going to win?

I hate that kind of obnoxious "journalism". Sometimes the little guy is actually wrong. To clarify, I'm not commenting on the specifics of this case; I just hate how fake our online discourse has become, appealing to "big guy evil" before even bringing up the specifics of the case.

mvdtnz, 11 months ago
What were the plaintiffs even thinking when they submitted a claim based on identicality without being able to produce a single instance of Copilot generating a verbatim copy? Even the research they submitted was unable to make a claim any stronger than "it's possible in theory but we've never seen it".

epolanski, 11 months ago
I am not strongly opinionated on this, but the very fact that Microsoft used all the code it could find, bar their own, has always looked suspicious to me.

lumb63, 11 months ago
It seems to me that regardless of the outcome of this case, some developers do not want to have their code used to train LLMs. There may need to be a new license created to restrict this usage of software. Or, maybe developers will simply stop contributing to open source. In today’s day and age, where open source code serves as a tool to pad Microsoft’s pockets, I certainly will not publish any of my software as open source, despite how much I would like to (under GPL) in order to help fellow developers.

If I were Microsoft, I’d really be concerned that I’m going to kill my golden goose by causing a large-scale exodus from GitHub or open source development more generally. Another idea I’ve considered is publishing boatloads of useless or incorrect code to poison their training data.

As I see it, people should be able to restrict how others use something that they gave them. If some people prefer that their code not be used to train LLMs, there should be a way to enforce that.

perlgeek, 11 months ago
From the article:

> The anonymous programmers have repeatedly insisted Copilot could, and would, generate code identical to what they had written themselves, which is a key pillar of their lawsuit since there is an identicality requirement for their DMCA claim. However, Judge Tigar earlier ruled the plaintiffs hadn't actually demonstrated instances of this happening, which prompted a dismissal of the claim with a chance to amend it.

So, the problem is really one of the lack of evidence, which seems... like a pretty basic mistake from the plaintiffs?

They could've taken a screencap video back when Copilot still produced code more verbatim, and used that as evidence, I assume.

bsza, 11 months ago
Should we move to modified versions of FOSS licenses that forbid AI training?

Found this: https://github.com/non-ai-licenses/non-ai-licenses

Legally sound or not, these should at least prevent your code from being included in Copilot's training data, hopefully without affecting any other use case. I'm going to use one of these next time I start a new project.

cellis, 11 months ago
I would like to ask an obvious question to the legally inclined here. How is this any different than remixing a song (lyrics/audio)? It's not "identical", and doesn't output "verbatim" lyrics or audio. What is the distinction between <LLM> and <Singer/Remixer who outputs remixed lyrics/audio>? From a quick Google search, it seems remixes violate copyright.

MagicMoonlight, 11 months ago
The issue I have is that these models are inherently trained to duplicate stuff. You train them by comparing the output to the original.

If I made an “advanced music engine” which rips Taylor Swift files and duplicates them, I would be sued to oblivion. Why does calling it an AI suddenly fix that?

They should have to train them on information they legally own.

snvzz, 11 months ago
All GitHub needs to do to make most happy is offer an opt-out toggle.

It still doesn't.

slicktux, 11 months ago
Yet people keep feeding it their code by using GitHub as their repo… Just like how we use the internet to share information; there’s just no escaping it.

passwordoops, 11 months ago
> The lack of documents from the Windows maker is apparently down to "technical difficulties" in collecting Slack messages

Wait, I'm forced to use Teams at work but Microsoft employees are on Slack?!

yazzku, 11 months ago
> The judge disagreed, however, on the grounds that the code suggested by Copilot was not identical enough to the developers' own copyright-protected work, and thus section 1202(b) did not apply.

How did they reach this conclusion? How can you prove that it never copies a code snippet verbatim, versus just showing that it does for one specific code snippet? The latter is a lot easier to show, but I don't know what exactly the prosecution claimed. I guess the size of the copy also matters in copyright violations?

nashashmi, 11 months ago
Big question: this thing called “training” AI off of data, how much of it is “training” and how much of it is “synthesizing”? It seems like if code is being copied and rephrased, it is synthetic. Not much “learning” and “training” going on here.

loceng, 11 months ago
This kind of argument makes me feel like it also supports the abolition of patents: eventually multiple other people will come up with the same obvious solution, which becomes obvious once a person spends enough time looking at a problem.

purpleblue, 11 months ago
Can you insist, or add instructions, that AIs not train on your code? If they train on your code but don't produce the exact same output, is there any protection you can have from that?

chrismsimpson, 11 months ago
If this is how the law is applied for code, are we to expect this is also how it will be applied for other data (e.g. audio a la Udio and Suno)?

albertTJames, 11 months ago
Looking good! Go Copilot!

rolph, 11 months ago
Copilot was apparently snipping license-bearing comments and applying "semantic" variations to the remaining code.

I would package the entire code as a series of comments (ideally these would be snipped by the plagiarists), leaving only a snippet of example code, one that no one of sound mind would allow to execute, to be proffered by Copilot.

WesternWind, 11 months ago
Wait... So Microsoft doesn't use Microsoft Teams, it uses Slack?

sagarpatil, 11 months ago
Off topic: How does the judiciary decide which judge to choose for such a highly technical case?

Tomte, 11 months ago
That's Matthew Butterick's case.

chidli1234, 11 months ago
Microsoft has deep pockets. Judges aren't objective. More at 11.

nancyp, 11 months ago
Linux/OSS is cancer. Said who? Anything in the public domain is up for grabs by them.

As long as the open tech community is too chicken to boycott their non-open-source products, such as GitHub and LinkedIn, nothing will happen.

pledess, 11 months ago
I thought "the Copilot coding assistant was trained on open source software hosted on GitHub and as such would suggest snippets from those public projects to other programmers without care for licenses" was explicitly allowed by the GitHub Terms of Service: https://docs.github.com/en/site-policy/github-terms/github-terms-of-service

> If you set your pages and repositories to be viewed publicly, you grant each User of GitHub a nonexclusive, worldwide license to use, display, and perform Your Content through the GitHub Service.

In other words, in addition to what's allowed by the LICENSE file in your repo, you are also separately licensing your code "to use ... through the GitHub Service" and this would (in my interpretation) include use by Copilot for training, and use by Copilot to deliver snippets to any other GitHub user.
