Ask HN: License to Protect Training Data?

2 points by alhirzel about 2 months ago
I am developing a system that will be used to inspect some data and identify things within it manually. I expect that, in some cases, these identifications will be used to train machine learning models. Is there an existing license that I can apply to the software that would require the end products of these outputs (i.e. the identifications and model weights) to be made public? Something like the GPL, but to democratize access to training data and models created downstream.

The application is in a niche scientific field and I am not worried about a lack of users, and I expect many users will align with the ethos I am proposing. I am simply wondering if a license or arrangement like this has been created already.

1 comment

pabs3 about 2 months ago
First question to ask yourself: are those identifications copyrightable?

It sounds like they are simply facts about the data, so probably not copyrightable. You could require a contract containing forced disclosure in order to download and use the software, but this would be very non-free and GPL incompatible.

If they were copyrightable, then they would be owned not by you, but by the person using your software to create the identifications. You could require copyright assignment for anyone who uses your software, but that would be very non-free and GPL incompatible.

You might also like to read this:

https://salsa.debian.org/deeplearning-team/ml-policy