
Nightshade: An offensive tool for artists against AI art generators

590 points, by ink404, over 1 year ago

64 comments

ink404, over 1 year ago
Paper is here: https://arxiv.org/abs/2310.13828
542458, over 1 year ago
This seems to introduce levels of artifacts that many artists would find unacceptable: https://twitter.com/sini4ka111/status/1748378223291912567

The rumblings I'm hearing are that this a) barely works with last-gen training processes, b) does not work at all with more modern training processes (GPT-4V, LLaVA, even BLIP2 labelling [1]), and c) would not be especially challenging to mitigate even should it become more effective and popular. The authors' previous work, Glaze, also does not seem to be very effective despite dramatic proclamations to the contrary, so I think this might be a case of overhyping an academically interesting but real-world-impractical result.

[1]: Courtesy of /u/b3sn0w on Reddit: https://imgur.com/cI7RLAq https://imgur.com/eqe3Dyn https://imgur.com/1BMASL4
gfodor, over 1 year ago
Huge market for snake oil here. There is no way that such tools will ever win, given the requirement that the art remain viewable to human perception; so even if you made something that worked (which it sounds like this doesn't), from first principles it will be worked around immediately.

The only real way for artists, or anyone really, to try to hold back models from training on human outputs is through the law, i.e., leveraging state-backed violence to deter the things they don't want. This won't be a perfect solution either; if anything it will just create more incentives for people to develop decentralized training networks that "launder" the copyright violations that would otherwise allow for prosecutions.

All in all it's a losing battle at a minimum and a stupid battle at worst. We know these models can be created easily, and so they will be, eventually, since you can't prevent a computer from observing images you want humans to be able to observe freely.
minimaxir, over 1 year ago
A few months ago I made a proof-of-concept showing that finetuning Stable Diffusion XL on known bad/incoherent images can actually let it output "better" images if those images are used as a negative prompt, i.e. specifying a high-dimensional area of the latent space that model generation should stay away from: https://news.ycombinator.com/item?id=37211519

There's a nonzero chance that encouraging the creation of a large dataset of known tampered data can ironically *improve* generative AI art models, by allowing a model to recognize tampered data and letting the training process work around it.
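For illustration, here is a minimal sketch of how a negative prompt is supplied through Hugging Face `diffusers`; the model ID, the prompts, and the single "wrong" negative token are illustrative assumptions, not the linked proof-of-concept itself:

```python
# Sketch: steering SDXL away from a region of latent space via a negative
# prompt. If a fine-tune associated a token with "tampered" images, that
# token could be passed as the negative prompt in exactly this way.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="a watercolor painting of a lighthouse at dusk",
    negative_prompt="wrong",  # hypothetical token learned from bad/incoherent images
    num_inference_steps=30,
).images[0]
image.save("out.png")
```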
eigenvalue, over 1 year ago
This seems like a pretty pointless "arms race" or "cat and mouse game". People who want to train generative image models, and who don't care at all about what artists think, can just do some basic post-processing on the images, just enough to destroy the very carefully tuned changes this Nightshade algorithm makes. Something like resampling to a slightly lower resolution and then using a super-resolution model to upsample again would probably destroy these subtle tweaks without making a big difference to a human observer.

In the future, my guess is that courts will generally be on the side of artists because of societal pressures, and artists will be able to challenge any image they find and have it sent to yet another ML model that can quickly adjudicate whether the generated image is "too similar" to the artist's style (which would also need to be dissimilar enough from everyone else's style to give a reasonable legal claim in the first place).

Or maybe artists will just give up on trying to monetize the images themselves and focus only on creating physical artifacts, similar to how independent musicians make most of their money nowadays from touring and selling merchandise at shows (plus Patreon). Who knows? It's hard to predict the future when such huge fundamental changes happen so quickly!
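As a rough illustration of that resample round-trip, here is a sketch using only Pillow, with plain Lanczos resampling standing in for the learned super-resolution model; file names and the scale factor are arbitrary assumptions:

```python
# Sketch: destroy pixel-level adversarial perturbations by round-tripping
# an image through a lower resolution. In practice a super-resolution model
# would do the upsampling step; Lanczos stands in for it here.
from PIL import Image

def resample_roundtrip(path: str, out_path: str, factor: float = 0.5) -> None:
    img = Image.open(path).convert("RGB")
    w, h = img.size
    small = img.resize((int(w * factor), int(h * factor)), Image.LANCZOS)
    restored = small.resize((w, h), Image.LANCZOS)
    restored.save(out_path)

resample_roundtrip("poisoned.png", "cleaned.png")
```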
marcinzm, over 1 year ago
This feels like it'll actually help make AI models better, not worse, once they train on these images. Artists are basically creating, for free, training data that conveys what types of noise do not change the intended meaning of an image to the artist themselves.
r3trohack3r, over 1 year ago
The number of people who are going to be able to produce high-fidelity art with off-the-shelf tools in the near future is unbelievable.

It's pretty exciting.

Being able to find a mix of styles you like and apply them to new subjects to make your own unique, personalized artwork sounds like a wickedly cool power to give to billions of people.
chris-orgmenta, over 1 year ago
I want *progressive fees* on copyright/IP/patent usage, and worldwide government cooperation/legislation (and perhaps even a worldwide ability to use works without obtaining initial permission, although let's not go into that outlandish stuff).

I want a scaling license fee to apply (e.g. a % pegged to revenue; this still has an indirect problem with different industries having different profit margins, but still seems the fairest).

And I want the world (or the EU, with others to follow suit) to slowly reduce copyright to 0 years* after the artist's death if owned by a person, and 20-30 years max if owned by a corporation.

And I want the penalties for not declaring usage** / not paying fees to be incredibly high for corporations... 50% of gross (harder) / net (easier) profit for the year? Something that isn't a slap on the wrist and can't be wriggled out of *quite* so easily, and is actually an incentive not to steal in the first place.

[*] Or whatever society deems appropriate.

[**] Until auto-detection (for better or worse) gets good enough.

IMO that would allow personal use, encourage new entrants to the market, encourage innovation, and incentivise better behaviour from OpenAI et al.
alentred, over 1 year ago
With this "solution" it looks like the world of art enters the cat-and-mouse game ad blockers have been playing for the last decade or two.
brucethemoose2, over 1 year ago
What the article doesn't illustrate is that it destroys fine detail in the image, visible even in the thumbnails of the reference paper: https://arxiv.org/pdf/2310.13828.pdf

Also... maybe I am naive, but it seems rather trivial to work around with a quick prefilter? I don't know if traditional denoising would be enough, but worst case you could run img2img diffusion.
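A sketch of what that img2img prefilter might look like with `diffusers`; the checkpoint, file names, and the low strength value are illustrative assumptions:

```python
# Sketch: lightly re-diffuse an image so that pixel-level perturbations are
# regenerated by the model rather than preserved. A low strength keeps the
# composition while re-synthesizing fine detail.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

init = Image.open("poisoned.png").convert("RGB").resize((512, 512))
out = pipe(prompt="", image=init, strength=0.2).images[0]
out.save("prefiltered.png")
```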
jamesu, over 1 year ago
Long-term, I think the real problem for artists will be corporations generating their own high-quality targeted datasets from a cheap labor pool, completely outcompeting them.
Quanttek, over 1 year ago
This is fantastic. If companies want to create AI models, they should license the content they use for the training data. As long as there are no sufficient legal protections and the EU/Congress do not act, tools like these can serve as a stopgap and maybe help increase the pressure on policymakers.
eddd-ddde, over 1 year ago
Isn't this just teaching the models to better understand pictures as humans do? As long as you feed them content that looks good to a human, wouldn't they improve at creating such content?
GaggiX, over 1 year ago
Methods like Glaze usually work by taking the original image, changing the style or content, and then applying an LPIPS loss on an image encoder. The hope is that if they can deceive a CLIP image encoder, they will also confuse other models with different architectures, sizes, and datasets, while changing the original image as little as possible so it's not too noticeable to the human eye. To be honest, I don't think it's a very robust technique. With this one, they claim that instead of seeing, for example, a cow on grass, the model will see a handbag; if someone has access to GPT-4V, I want to see whether it can deceive actually big image encoders (usually more aligned with human vision).

EDIT: I have seen a few examples with GPT-4V, and as I imagined, it wasn't deceived. I doubt this technique can have any impact on the quality of the models; honestly, the only impact it could potentially have is to make training more robust.
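As a rough, hedged illustration of the kind of optimization described above (not the actual Glaze or Nightshade code), assuming the `lpips` package, a CLIP checkpoint from `transformers`, and made-up file names and loss weight:

```python
# Sketch: find a small perturbation delta such that a CLIP image encoder's
# embedding of (image + delta) moves toward a decoy concept ("handbag"),
# while an LPIPS penalty keeps the perturbation perceptually small.
# CLIP's mean/std input normalization is omitted here for brevity.
import lpips
import torch
from PIL import Image
from torchvision import transforms
from transformers import CLIPVisionModelWithProjection

device = "cuda" if torch.cuda.is_available() else "cpu"
clip = CLIPVisionModelWithProjection.from_pretrained(
    "openai/clip-vit-base-patch32").to(device).eval()
percep = lpips.LPIPS(net="vgg").to(device)

prep = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
orig = prep(Image.open("cow.png").convert("RGB")).unsqueeze(0).to(device)

with torch.no_grad():  # embedding of the decoy concept
    decoy = prep(Image.open("handbag.png").convert("RGB")).unsqueeze(0).to(device)
    target = clip(pixel_values=decoy).image_embeds

delta = torch.zeros_like(orig, requires_grad=True)
opt = torch.optim.Adam([delta], lr=1e-2)
for _ in range(200):
    x = (orig + delta).clamp(0, 1)
    emb = clip(pixel_values=x).image_embeds
    sim = torch.cosine_similarity(emb, target).mean()  # pull toward decoy
    dist = percep(orig * 2 - 1, x * 2 - 1).mean()      # LPIPS expects [-1, 1]
    loss = -sim + 10.0 * dist
    opt.zero_grad(); loss.backward(); opt.step()
```

The weight on the LPIPS term is exactly the multi-objective tension the thread discusses: it trades off how far the embedding moves against how visible the perturbation becomes.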
garg, over 1 year ago
Each time there is an update to the training algorithms, and in response to the poisoning algorithms, artists will have to re-glaze, re-mist, and re-nightshade all their images?

Eventually, I assume, the poisoning artifacts introduced in the images will become very visible to humans as well.
msp26, over 1 year ago
> Like Glaze, Nightshade is computed as a multi-objective optimization that minimizes visible changes to the original image.

It's still noticeably visible.
popohauer, over 1 year ago
I'm glad to see tools like Nightshade starting to pop up to protect the real-life creativity of artists. I like AI art, but I do feel conflicted about its potential long-term effects on a society that no longer values authentic creativity.
peter_d_sherman, over 1 year ago
To protect an individual's image property rights from image-generating AIs -- wouldn't it be simpler for the IETF (or another standards-producing group) to simply create an *AI image exclusion standard*, similar to *"robots.txt"*, which would tell an AI data-gathering web crawler that a given image or set of images is off-limits for use as data?

https://en.wikipedia.org/wiki/Robots.txt

https://www.ietf.org/
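No such IETF standard exists today, but a hypothetical exclusion file in the robots.txt mold might look like this; the file name and every directive are invented purely for illustration:

```text
# Hypothetical /ai.txt -- invented syntax, not a real standard
User-Agent: *
Disallow-AI-Training: /portfolio/
Disallow-AI-Training: /art/*.png
Allow-AI-Training: /public-domain/
```

Like robots.txt, such a file would be purely advisory: it only works if crawlers choose to honor it.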
ang_cire, over 1 year ago
Setting aside the efficacy of this tool, I would be very interested in the legal implications of putting designs in your art that could corrupt ML models.

For instance, if I set traps in my home which hurt an intruder, we are both guilty of crimes (traps are illegal and are never considered self-defense; B&E is illegal).

Would I be responsible for corrupting the AI operator's data if I intentionally include adversarial artifacts to corrupt models, or is that just DRM to legally protect my art from infringement?

Edit: I replied to someone else, but this is probably good context:

DRM is legally allowed to disable or even corrupt the software or media it is protecting, if it detects misuse.

If an adversarial-AI tool attacks the model, it then becomes a question of whether the model, having now incorporated my protected art, is "mine" to disable/corrupt, or whether it is in fact out of bounds for DRM.

So, for instance, a court could say that adversarial-AI methods may only actively prevent the training software from incorporating the protected media into a model, but may not corrupt the model itself.
fennecfoxy, over 1 year ago
I find the AI training topic interesting, because it's really data/information that is involved. Forget the fact that it's images or stories or Reddit posts; it's all data.

We are born and then exposed to the torrent of data from the world around us, mostly fed to us by other humans; this is what models are trying to tap.

Unfortunately our learning process is completely organic and takes decades upon decades; there's no way to put a model through this easily.

Perhaps we need to seed the web with AI agents who converse and learn as much like regular human beings as possible and assemble the dataset that way. Although having an agent browse and find an image to learn to draw from is still going to make people reee, even if that's exactly what a young and aspiring human artist would be doing.

Don't talk about humans being sacred; we already voted to let corporations be people, for the 1% to exist and "lobby", breaking our democracy so that they can get tax breaks and make corrupt under-the-table deals. None of us stopped that from happening...
zirgs, over 1 year ago
Does it survive AI upscaling or img2img? If not, then it's useless. Nobody trains AI models without any preprocessing. This is basically a tool for 2022.
gweinberg, over 1 year ago
For this to work, wouldn't you need an enormous number of artists collaborating to "poison" their images the same way (cow to handbag), while somehow keeping it secret from AI trainers that they were doing this? It seems to me that even if the technology works perfectly as intended, you're effectively just mislabeling a tiny fraction of the training data.
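A back-of-envelope sketch of that "tiny fraction" point, assuming a LAION-5B-scale dataset (~5.85 billion image-text pairs); the artist and per-artist counts are made-up illustrative numbers, and note the paper argues that poisoning targeted at a single concept needs far fewer samples than a uniform fraction would suggest:

```python
# Sketch: what fraction of a LAION-scale dataset would poisoned images be?
dataset_size = 5_850_000_000  # ~LAION-5B image-text pairs
artists = 100_000             # hypothetical participating artists
images_each = 50              # hypothetical poisoned uploads per artist

poisoned = artists * images_each
print(f"{poisoned:,} poisoned images = {poisoned / dataset_size:.4%} of the dataset")
# -> 5,000,000 poisoned images = 0.0855% of the dataset
```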
ThinkBeat, over 1 year ago
Insofar as the anger is about AIs being trained on particular intellectual property:

A made-up scenario¹ is that a person who is training an AI goes to the local library and checks out 600 books on art. The person then lets the AI read all of them, after which they are returned to the library and another 600 books are borrowed.

Then we can imagine the AI somehow visiting a lot of museums and galleries.

The AI will now have been trained on the style and look of a lot of art from different artists.

All the material has been obtained in a legal manner.

Is this an acceptable use? Or can an artist still assert that the AI was trained on their IP without consent?

Clearly this is one of the ways a human would go about learning about styles, techniques, etc.

¹ Yes, you probably cannot borrow 600 books at a time. How does the AI read the books? I don't know; the simplest version is that the researcher takes a photo of each page. This would be extremely slow, but for this hypothetical it is acceptable.
enord, over 1 year ago
I'm completely flabbergasted by the number of comments implying that copyright concepts such as "fair use" or "derivative work" apply to trained ML models. Copyright is for _people_, as are the entailing rights, responsibilities, and exemptions. This has gone far beyond anthropomorphising, and we need to, like, get it together, man!
squidbeak, over 1 year ago
I really don't understand the anxiety of artists towards AI - as if creatives haven't always borrowed and imitated. Every leading artist has had acolytes, and while it's true no artist ever had an acolyte as prodigiously productive as AI will be, I don't see anything different between a young artist looking to Picasso for cues and Stable Diffusion or DALL-E doing the same. Styles and methods haven't ever been subject to copyright, and art would die the moment that changed.

The only explanation I can find for this backlash is that artists are actually worried, just like the rest of us, that pretty soon AI will produce higher-quality, more inventive work faster than they can - which is very natural, but not a reason to inhibit an AI's creative education.
ngneer, over 1 year ago
I love it. This undermines the notion of ground truth. What separates correct information from incorrect information? Maybe nothing! I love how they acknowledge the never-ending attack-versus-defense game, in stark contrast to "our AI will solve all your problems".
ukuina, over 1 year ago
Won't a simple downsample->upsample be the antidote?
gmerc, over 1 year ago
Doing the work to increase OpenAI's moat.
drdrek, over 1 year ago
The only protection is adding giant gaping vaginas to your art; nothing less will deter scraping. If the email spam fight has shown us anything in the last 40 years, it's that no amount of defensive tech measures will work, only financial disincentives.
paul7986, over 1 year ago
Any AI art/video/photography/music/etc. generator company that generates revenue needs to add watermarks to let the public know it's AI-generated. This should be forced via legislation in all countries.

If they don't, then any social network or other service where content can be shared and viewed publicly by large groups, up to millions of people, needs to label it: "We cannot verify the veracity of this content."

I want a real internet. This AI stuff is just increasing the fake crap on the Internet threefold and, in turn, eroding our trust in it!
Duanemclemore, over 1 year ago
For visual artists who don't want visible artifacting in the art they feature online: would it be possible to upload poisoned copies alongside your un-poisoned art, but have them only hanging out in the background? Say, having one proper copy and a hundred poisoned copies on the same server, but only displaying the un-poisoned one?

Might this "flood the zone" approach also have *some* efficacy against human copycats?
tigrezno, over 1 year ago
Do not fight the AI; it's a lost cause. Embrace it.
xg15, over 1 year ago
I wonder how this tool works if it's actually model-independent. My understanding so far was that, in principle, each possible model has *some* set of pathological inputs for which the classification will differ from what a user sees - but that this set is basically different for each model. So did they actually manage to build a "universal" poison? If yes, how?
iLoveOncall, over 1 year ago
I wonder if this is illegal in some countries. In France, for example, there is the following law: "Obstructing or distorting the operation of an automated data processing system is punishable by five years' imprisonment and a fine of €150,000."

If you ask me, this is 100% applicable in this case, so I wonder what a judge would rule.
dist-epoch, over 1 year ago
Remember when the music industry tried to use technology to stop music pirating?

This will work about as well...

Oh, I forgot: fighting music pirating was considered an evil thing to do on HN. "Pirating is not stealing, it's copyright infringement," right? Unlike training neural nets on internet content, which of course is "stealing".
snerc, over 1 year ago
I wonder if we know enough about any of these systems to make such claims. This is all predicated on the assumption that this tool will be in widespread use. And if it is somehow widely used beyond the folks who have seen it at the top of HN, won't the big firms have countermeasures ready to deploy?
k__, over 1 year ago
How long will this work?
nnevatie, over 1 year ago
The intention is good, from an AI-opponent's perspective. I don't think it will work in practice, though: the drawbacks for actual users of the image galleries, plus the level of complexity involved in poisoning the samples, make this unfeasible to implement at the scale required.
TenJack, over 1 year ago
I wonder if the AI companies are already so far ahead that they can use their AI to detect and avoid any poisoning.
matteoraso, over 1 year ago
Too little, too late. There are already very large, high-quality datasets for training AI art generators.
consoleable, over 1 year ago
The naive idea of wanting to protect artists actually protects the monopoly of big companies.

Some projects against this behavior:

https://github.com/syncblob/Obey-AI-Luddites
rvba, over 1 year ago
The opening website is so poor: "What is Nightshade?", then a whole paragraph that says nothing, then another paragraph... then no examples. The whole description should be reworked to be shorter and more to the point.
Zetobal, over 1 year ago
Well, at least for SDXL it's not working, in either LoRA or DreamBooth finetunes.
paulsutter, over 1 year ago
Cute. The effectiveness of any technique like this will be short-lived.

What we really need is clarification of the extent to which copyright protection extends to similar works - most likely from an AI analysis of case law.
24karrotts_, over 1 year ago
If you decrease the quality of art, you give AI all the advantage in the market.
Kuinox, over 1 year ago
> More specifically, we assume the attacker:
> • can inject a small number of poison data (image/text pairs) to the model's training dataset

I think those are bad assumptions; labelling is more and more done by a labelling AI.
aussieguy1234, over 1 year ago
The image generation models are now at the point where they can produce their own synthetic training images, so I'm not sure how big an impact something like this would have.
ultimoo, over 1 year ago
Would it have been that hard to include a sample photo and how it looks with the Nightshade filter, side by side, in a three-page document describing in great detail how it would look?
matt3210, over 1 year ago
Put terms of service on all your content saying "by using my content for AI you agree to pay X per image," and then send them a bill once you see it in an AI.
devmor, over 1 year ago
Baffling to see anyone argue against this technology, when it is a non-issue for any model that simply acquires only training data it has permission to use.
k__, over 1 year ago
What about LLMs trained with public-domain content only?

I would believe there is enough content out there to get reasonably good results.
mmaunder, over 1 year ago
Trying to convince an AI that it sees something a human doesn't is probably a losing battle.
mattszaszko, over 1 year ago
This timeline is getting quite similar to the second season of Pantheon.
Aeolun, over 1 year ago
How is there not a single example on that website?
will5421, over 1 year ago
I think the artists need to agree to stop making art altogether. That ought to get people’s attention. Then the AI people might (be socially pressured or legally forced to) put their tools away.
jdeaton, over 1 year ago
Sounds like free adversarial data augmentation.
arisAlexis, over 1 year ago
Wouldn't this be applicable to text too?
mjfl, over 1 year ago
Another way would be, for every one piece of art you make, to post ten AI-generated pieces, so that the signal-to-noise ratio is really bad.
Albert931, over 1 year ago
Artists are now fully dependent on software engineers to protect the future of their careers, lol.
whywhywhywhy, over 1 year ago
Why are there no examples?
etchalon, over 1 year ago
My hope is that these types of "poisoning tools" become ubiquitous for all content types on the web, forcing AI companies to, you know, license things.
efitz, over 1 year ago
This is the DRM problem again.

However much we might wish it were not true, ideas are not rivalrous. If you share an idea with another person, they now have that idea too.

If you share words on paper, then someone with eyes and a brain might memorize them (or, much more likely, just grasp and retain the ideas conveyed in the words).

If you let someone hear your music, then the ideas (phrasing, style, melody, etc.) in that music are transferred.

If you let people see a visual work, then the stylistic and content elements of that work are potentially absorbed by the audience.

We have copyright to protect specific embodiments, but if you try to share ideas with others without letting them use the ideas you shared, you are in for a life of frustration and an escalating arms race.

I completely sympathize with anyone who had a great idea and spent a lot of effort realizing it. If I invented or created something awesome, I would be hurt and angry if someone "copied" it. But the cold hard reality is that you cannot "own" an idea.
Devasta, over 1 year ago
Delighted to see it. Fuck AI art.
gumballindie, over 1 year ago
This is excellent. We need more tools like this, for text content as well. For software we need GPL 4 with ML restrictions (make your model open source or not at all). Potentially even DRM for text.