I'm really curious to see how this evolves. Hashcash was originally conceived to stop e-mail spam, and a lot has changed since then; namely, compute has become absolutely dirt cheap. Despite that, PoW-based anti-bot measures remain enticing because they don't <i>necessarily</i> harm accessibility the way solutions like Cloudflare or reCAPTCHA can: it should be possible to pass even on a VPN or Tor, on less-used browsers like Ladybird or Servo, and on a weak device, provided you're willing to wait for the PoW to finish. As long as you don't hit all of those conditions at once, you should get an "easy" challenge that completes quickly.<p>The real question is whether this approach actually works at scale. I've played around with a Hashcash implementation myself using WebCrypto (a sketch of the idea is at the end of this comment), but I worry that even WebCrypto is <i>quite</i> a lot slower than hashing in native code. Still, seeing Anubis have <i>some</i> success makes me hopeful. If it gains broad adoption, it might be just enough of a pain for scrapers, while remaining passable for automation that's willing to pay the compute toll (hopefully anything that isn't terribly abusive).<p>On a lighter note, I've found the reception of Anubis, and in particular the anime-style mascot, predictably amusing.<p><a href="https://discourse.gnome.org/t/anime-girl-on-gnome-gitlab/27689" rel="nofollow">https://discourse.gnome.org/t/anime-girl-on-gnome-gitlab/276...</a><p>(Note: I'd suggest not going and replying there. I don't want to encourage brigading of any sort; I just found it mildly amusing.)
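For anyone curious, here's roughly what I mean by a Hashcash-style challenge in WebCrypto. This is a minimal sketch, not Anubis's actual algorithm; the challenge format and the difficulty rule (leading zero hex digits) are my own assumptions. It also shows why WebCrypto is slow for this: every attempt is an async call into crypto.subtle.

    // Hedged sketch of a Hashcash-style PoW solver in the browser.
    // Assumption: the server sends a `challenge` string and the client must
    // find a nonce such that SHA-256(challenge + nonce) starts with
    // `difficulty` zero hex digits. Not Anubis's exact scheme.
    async function solveChallenge(
      challenge: string,
      difficulty: number,
    ): Promise<number> {
      const encoder = new TextEncoder();
      const target = "0".repeat(difficulty);
      for (let nonce = 0; ; nonce++) {
        const data = encoder.encode(challenge + nonce);
        // One async round-trip into crypto.subtle per attempt -- a big part
        // of why this is much slower than a native hashing loop.
        const digest = await crypto.subtle.digest("SHA-256", data);
        const hex = Array.from(new Uint8Array(digest))
          .map((b) => b.toString(16).padStart(2, "0"))
          .join("");
        if (hex.startsWith(target)) return nonce; // submit nonce to the server
      }
    }

Under that leading-zero-hex assumption, difficulty 4 needs about 16^4 ≈ 65,000 attempts on average, which is where the long waits on slow devices come from.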
Is there a way to alter text to poison AI training sets? I know there's Glaze and Nightshade for images but I've heard of nothing to poison text models. To be clear, this wouldn't be a defensive measure to stop scraping; it'd be an offensive honeypot: you'd want to make pages that have the same text but mutated slightly differently each time, so that AI scrapers preferentially load up on your statistically different text and then yield a poisoned model. Ideally the scraper companies will realize what's going on and stop scraping.
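I don't know of an established tool for this, but a crude version of what you describe (the same page, mutated slightly on every request) could be as simple as per-request synonym swapping. Everything below (the word list, the function name) is invented for illustration, and a real honeypot would need mutations that still read naturally to humans:

    // Hypothetical per-request mutation of a honeypot page. Each fetch swaps
    // random words for rough synonyms, so repeated scrapes of the "same" URL
    // yield statistically different text.
    const SWAPS: Record<string, string[]> = {
      quick: ["rapid", "speedy", "swift"],
      lazy: ["idle", "listless", "sluggish"],
    };

    function mutate(text: string): string {
      return text.replace(/[A-Za-z]+/g, (word) => {
        const options = SWAPS[word.toLowerCase()];
        if (!options || Math.random() < 0.5) return word; // keep most words
        return options[Math.floor(Math.random() * options.length)];
      });
    }

    // Every call returns a slightly different variant of the page text.
    console.log(mutate("the quick brown fox jumps over the lazy dog"));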
Ideas:<p>- Make it generate cryptocurrency, so that the work is not wasted: either to offset the server expenses of hosting the content, or for some noble non-profit cause, with all installations collecting the currency into a single account. Wasting the work is worse than both of these options.<p>- An easy way for good crawlers (like the Internet Archive) to authenticate themselves, e.g. TLS client-side authentication, or simply an HTTP request header containing a signature for the request (the signature could be based, for example, on their domain name and the TLS cert for that domain; a sketch of this is below).
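To make the signed-header idea concrete, here is one possible shape for it. The header name and key handling are invented; I'm assuming Ed25519 keys, Node.js, and that the server discovers the crawler's public key via its domain (DNS record or well-known URL):

    // Hedged sketch of a signed-request header for well-behaved crawlers.
    // The crawler publishes a public key under its domain; the server fetches
    // it once and verifies a per-request signature. All names are illustrative.
    import { generateKeyPairSync, sign, verify } from "node:crypto";

    const { privateKey, publicKey } = generateKeyPairSync("ed25519");

    // Crawler side: sign "METHOD path timestamp" and attach it, e.g.
    //   X-Crawler-Signature: domain=archive.org; sig=<base64>
    const message = Buffer.from("GET /some/page 2025-03-20T00:00:00Z");
    const sig = sign(null, message, privateKey).toString("base64");

    // Server side: rebuild the message from the request, look up the public
    // key for the claimed domain, and verify before waiving the PoW.
    const ok = verify(null, message, publicKey, Buffer.from(sig, "base64"));
    console.log(ok ? "good crawler: skip PoW" : "unverified: serve PoW");

Including a timestamp in the signed message is one simple way to stop bad bots from replaying a captured header.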
For a no-JS solution, I think some sort of optical-illusion CAPTCHA could work, especially <a href="https://en.wikipedia.org/wiki/Magic_Eye" rel="nofollow">https://en.wikipedia.org/wiki/Magic_Eye</a> or something like <a href="https://www.youtube.com/watch?v=Bg3RAI8uyVw" rel="nofollow">https://www.youtube.com/watch?v=Bg3RAI8uyVw</a>, which could cleverly hide the CAPTCHA answer within animated noise.<p>However, these methods are not really accessibility-friendly.
Discussion here <a href="https://news.ycombinator.com/item?id=43422929">https://news.ycombinator.com/item?id=43422929</a>
Regarding the problem of how to let "good" bots through:<p>You could use PKI: Drop the PoW if the client provides a TLS <i>client</i> certificate chain that asserts that <publicKey> corresponds to a private key that is controlled by <fullNamesAndAddressesOfThesePeople> (or just by, say, <peopleWhoControlThisUrl>, for Let's Encrypt-style automatable cert signing). This would be a slight hassle for good bot operators to set up, but not a very big deal. The result is that bad bots couldn't spoof good bots to get in.<p>(Obviously this strategy generalises to handling human users too -- but in that case, the loss of privacy, as well as the admin inconvenience, makes it much less palatable.)
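As a rough illustration of the server side of this (assuming Node.js; the file names and CA arrangement are placeholders I made up), the PoW could be waived whenever the client presents a cert chain that verifies against a CA the site trusts for bot-operator identity:

    // Hedged sketch: waive the PoW for clients presenting a verifiable
    // TLS client certificate. File names and CA setup are placeholders.
    import { readFileSync } from "node:fs";
    import { createServer } from "node:https";
    import type { TLSSocket } from "node:tls";

    const server = createServer(
      {
        key: readFileSync("server-key.pem"),
        cert: readFileSync("server-cert.pem"),
        ca: readFileSync("bot-operator-ca.pem"), // CA that vouches for operators
        requestCert: true,          // ask every client for a certificate
        rejectUnauthorized: false,  // humans without certs still get through
      },
      (req, res) => {
        const tls = req.socket as TLSSocket;
        if (tls.authorized) {
          // Identity details are available via tls.getPeerCertificate().
          res.end("verified bot operator: PoW waived\n");
        } else {
          res.end("no valid client cert: serve the PoW challenge\n");
        }
      },
    );
    server.listen(8443);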
It's a great idea, but I fear that if this keeps going viral like it did in the last few days, more bot authors will be motivated to add special handling for it, e.g. changing the user agent to a non-Mozilla one.
The performance on mobile kinda sucks, though: it took about 30 seconds to get through the difficulty-4 PoW on Firefox for Android. In that time I had to resist the urge to switch to doing something else.
This reminded me of an article I printed (yes, on paper) at my college more than 20 years ago, titled Parasitic Computing. I don't remember where it was originally published, but I think I might have stumbled upon it via kuro5hin; a quick search turned up the publication in Nature (though paywalled).<p>- <a href="https://www.nature.com/articles/35091039" rel="nofollow">https://www.nature.com/articles/35091039</a>