Proof-of-work to protect lore.kernel.org and git.kernel.org against AI crawlers

63 points by luu, about 1 month ago

17 comments

cowboylowrez, about 1 month ago
> Difficulty is set at 4 leading zeroes, unless you're coming from US in which case there's also a tariff of 5 more leading zeroes.

Isn't Linux afraid of retaliatory tariffs? Should I stock up on Linuxes just in case? I've already beefed up toilet paper reserves.
perihelions, about 1 month ago
Any chance there's some way, going forwards, to dual-purpose these webserver PoWs, so they solve some socially beneficial compute problem at the same time? I recall reading ideas like that in the early days of cryptocurrency, before humans ruined it.

- Server: here's a bit of a cancer protein
- Client: okay, here's some compute
- Verifier: the compute checks out
- Server: okay, you are authorized to access cat.gif
notherhack, about 1 month ago
See https://news.ycombinator.com/item?id=43556521

wherein a hosting company sees AI bot scans that appear to be coming from millions of unique addresses and thousands of ASNs, many of them residential and often with a single connection per IP. The AI bots are proxying through either hacked IoT devices or apps that pay people pennies to let their phone be used as a proxy.

Likely your proof of work will be distributed to the proxies. It'll just make millions of webcams and phones run a little hotter without slowing down the AI bots at all.
unsnap_biceps, about 1 month ago
It appears they're using https://github.com/TecharoHQ/anubis for the proof-of-work proxy.
xena, about 1 month ago
This is absolutely surreal to see in action! I hope that I can manage to afford to not have to do my day job anymore.
xxprogamerxy, about 1 month ago
I'm a bit skeptical that this will do the trick. These PoW challenges can be parallelized across different websites and may not be as off-putting as intended. Here's some quick back-of-the-napkin math:

DeepMind's MassiveText dataset was sourced from ~2.35B documents. A difficulty of 4 leading zeros requires an expected 16^4 SHA-256 hashes per site. Benchmarks [1] show an H100 at ~12k MH/s, meaning it would take just ~3.5 hours to solve for all 2.35B pages.

[1] https://gist.github.com/Chick3nman/e1417339accfbb0b040bcd0a0a9c6d54#file-h100_pcie_v6-2-6-benchmark-L306
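The arithmetic in the estimate above can be reproduced with a short script. This is only a sketch using the figures quoted in the comment. It assumes one challenge per document, a single GPU, and that "4 leading zeros" means four hexadecimal digits (which is what the 16^4 figure implies). The constant and function names are illustrative and do not come from Anubis.

```python
# Figures quoted in the comment; everything else is derived from them.
DOCUMENTS = 2.35e9       # ~2.35B documents in MassiveText
HEX_ZEROES = 4           # difficulty: 4 leading hex zeroes (assumed hex)
HASH_RATE = 12_000e6     # ~12k MH/s of SHA-256 on one H100


def expected_hashes_per_challenge(hex_zeroes: int) -> int:
    """Each hex digit is zero with probability 1/16, so the mean is 16**n tries."""
    return 16 ** hex_zeroes


def total_gpu_hours(documents: float, hex_zeroes: int, hash_rate: float) -> float:
    """Expected GPU-hours to solve one challenge per document (an assumption)."""
    total_hashes = documents * expected_hashes_per_challenge(hex_zeroes)
    return total_hashes / hash_rate / 3600


hours = total_gpu_hours(DOCUMENTS, HEX_ZEROES, HASH_RATE)
print(f"{expected_hashes_per_challenge(HEX_ZEROES)} expected hashes per challenge")
print(f"{hours:.2f} GPU-hours for all documents")
```

Under these assumptions the result lands close to the ~3.5 hours quoted. Each additional hex zero multiplies the expected work by 16, so the estimate is very sensitive to the difficulty setting.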
skeptrune, about 1 month ago
I am really enjoying seeing this use-case for PoW gain popularity. Hopefully it normalizes the technique and it can start to become more common for anti-spam systems.
chr15m, about 1 month ago
This is infinitely better than using CloudFlare. I hope it works and more people adopt it.
sva_, about 1 month ago
> Difficulty is set at 4 leading zeroes, unless you're coming from US in which case there's also a tariff of 5 more leading zeroes.

> You can see it in action on this recently decommissioned system I'm using for testing purposes: https://ams.source.kernel.org/

Something is seriously wrong with it. When I run it with my normal German/EU home connection, it does ~17k iterations. When I run it with a US Atlanta VPN, it only takes ~6k iterations.
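Iteration counts for a leading-zeroes challenge vary a lot from run to run, which a small sketch can illustrate. It assumes the challenge is "find a nonce such that SHA-256(challenge + nonce) starts with N hex zeroes" and that "iterations" counts hash attempts. Both are assumptions about the scheme, not confirmed details of the deployment, and the challenge string and function names are hypothetical.

```python
import hashlib


def solve(challenge: str, hex_zeroes: int) -> tuple[int, str]:
    """Brute-force a nonce whose SHA-256 hex digest starts with the required zeroes."""
    prefix = "0" * hex_zeroes
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def prob_solved_within(attempts: int, hex_zeroes: int) -> float:
    """Chance that at least one of `attempts` independent hashes meets the target."""
    p = 16.0 ** -hex_zeroes
    return 1.0 - (1.0 - p) ** attempts


# Difficulty 3 keeps the brute-force demo fast; the challenge string is made up.
nonce, digest = solve("example-challenge", 3)
print(f"nonce={nonce} digest={digest[:12]}...")

# How likely is a difficulty-4 solve within the iteration counts reported above?
for attempts in (6_000, 17_000, 65_536):
    print(attempts, f"{prob_solved_within(attempts, 4):.3f}")
```

If those assumptions hold, finishing in about 6k or 17k attempts at difficulty 4 is unremarkable, since the number of attempts follows a geometric distribution with a mean of 65,536. Two single runs therefore cannot show whether the difficulty differed between the two connections.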
sakras, about 1 month ago
Maybe I'm missing something, but why do people expect PoW to be effective against companies whose whole existence revolves around acquiring more compute?
kklisura, about 1 month ago
So, the market/companies refused to regulate themselves (by adhering to robots.txt), so we're now forced to innovate some solutions against them.
abetusk, about 1 month ago
I think these solutions are really novel and interesting, but I'd like to point out that this is literally one of the use cases for cryptocurrency, or microtransactions in general. Cryptocurrencies, at least the PoW ones, offload the proof-of-work so that it doesn't need to be done in real time.

Paying fractions of a penny to view websites has minimal impact on average users but is punishing to spammers.
shanemhansen, about 1 month ago
I wonder how well this will actually work.

The core problem is that a lot of crawlers aren't spending their own money. They are part of a botnet, so they are just spending the victim's money.

But hopefully most of the crawlers aren't botnets or funded by free VC money, so they have an economic incentive to avoid crawling systems requiring proof-of-work.
neurostimulant, about 1 month ago
I thought the Anubis author doesn't want you to remove the anime girl images? I guess kernel.org is exempted. gitlab.gnome.org still has the anime girl though.

https://anubis.techaro.lol/docs/funding
lousken, about 1 month ago
I wonder how much traffic lore.kernel generates, since it's such a basic site, and how it looked before the crawling and after.

Also, where is the Anubis avatar? It's so disappointing not to see it.
hooverd, about 1 month ago
Are they using white-labeled Anubis or stock Anubis?
bhouston, about 1 month ago
I am not sure we want to prevent AI crawlers; rather, we want the crawlers to just not negatively affect the websites.

We want AI automation everywhere and crawling is important.