This comes with some kind of stupid custom license that, from what I can tell, only lets you use models you train on it for "internal use" (or tries to; it's fair use, so whatever). It's getting really shitty to see everyone trying to control how people can use their "contributions" - if it were for commercial reasons I'd understand, but it's all this silly "AI harms" garbage. Treat collaborators like adults and let them decide how they want to use ostensibly public-domain stuff.
According to the license, AllenAI can just take over (all) the rights and ownership for any derivative works by revoking your usage license, which they can also do at will.

Reasonably speaking, nobody can use this dataset for anything of value.
I really wonder who comes up with these "open-source" products with such licenses, and why they even bother. I guess marketing?