Updated rate limits for unauthenticated requests

108 points by xena 15 days ago

<a href="https://github.com/orgs/community/discussions/159123">https://github.com/orgs/community/discussions/159123</a>

<a href="https://github.com/orgs/community/discussions/157887">https://github.com/orgs/community/discussions/157887</a>

31 comments

Zdh4DYsGvdjJ 10 days ago

GitHub answered <a href="https://github.com/orgs/community/discussions/159123#discussioncomment-13148279">https://github.com/orgs/community/discussions/159123#discuss...</a>

TheNewsIsHere 10 days ago

I don’t think the publication date (May 8, as I type this) on the GitHub blog article is the same date this change became effective.

From a long-term, clean network I have been consistently seeing these “whoa there!” secondary rate limit errors for over a month when browsing more than 2-3 files in a repo.

My experience has been that once they’ve throttled your IP under this policy, you cannot even reach a login page to authenticate. The docs direct you to file a ticket (if you’re a paying customer, which I am) if you consistently get that error.

I was never able to file a ticket when this happened because their rate limiter also applies to one of the required backend services that the ticketing system calls from the browser. Clearly they don’t test that experience end to end.

gnabgib 15 days ago

60 req/hour for unauthenticated users
5000 req/hour for authenticated - personal
15000 req/hour for authenticated - enterprise org

According to <a href="https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api?apiVersion=2022-11-28" rel="nofollow">https://docs.github.com/en/rest/using-the-rest-api/rate-limi...</a>

I bump into this just browsing a repo's code (unauth).. seems like it's one of the side effects of the AI rush.
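For REST API clients, the remaining quota can be read from the response instead of guessed. A minimal sketch, assuming the `x-ratelimit-*` response headers described in the linked docs; the `parse_rate_limit` helper and the sample values are hypothetical, and this covers only the REST API, not the web UI limits discussed elsewhere in the thread.

```python
from datetime import datetime, timezone
from typing import Mapping, NamedTuple


class RateLimit(NamedTuple):
    limit: int          # requests allowed in the current window
    remaining: int      # requests left in the current window
    reset_at: datetime  # when the window resets (UTC)


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimit:
    """Read the primary rate limit state from REST API response headers."""
    # HTTP header names are case-insensitive, so normalise them first.
    lower = {name.lower(): value for name, value in headers.items()}
    return RateLimit(
        limit=int(lower["x-ratelimit-limit"]),
        remaining=int(lower["x-ratelimit-remaining"]),
        # The reset header is a UTC epoch timestamp in seconds.
        reset_at=datetime.fromtimestamp(
            int(lower["x-ratelimit-reset"]), tz=timezone.utc
        ),
    )


# Made-up sample values for an unauthenticated client.
sample = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "57",
    "X-RateLimit-Reset": "1747000000",
}
state = parse_rate_limit(sample)
print(state.remaining, "of", state.limit, "left until", state.reset_at.isoformat())
```

At 60 req/hour, an unauthenticated client that paces itself evenly gets one request per minute, so checking `remaining` before a burst is cheap insurance.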

PaulDavisThe1st 10 days ago

Several people in the comments seem to be blaming GitHub for taking this step for no apparent reason.

Those of us who self-host git repos know that this is not true. Over at ardour.org, we've passed 1M unique IPs banned due to AI trawlers sucking our repository 1 commit at a time. It was killing our server before we put fail2ban to work.

I'm not arguing that the specific steps GitHub have taken are the right ones. They might be, they might not, but they do help to address the problem. Our choice for now has been based on noticing that the trawlers are always fetching commits, so we tweaked things such that the overall http-facing git repo works, but you cannot access commit-based URLs. If you want that, you need to use our GitHub mirror :)

whitehexagon 9 days ago

If a company the size of MS isn't able to handle the DOS caused by the LLM slurpers, then it really is game over for the open internet. We are going to need government-approved, ID-based logins to even read the adverts at this rate.

But this feels like a further attempt to create a walled garden around 'our' source code. I say our, but the first push to KYC, asking for phone numbers, was enough for me to delete all and close my account. Being on the outside, it feels like those walls get taller every month. I often see an interesting project mentioned on HN and clone the repo, but more and more often that is failing. Trying to browse online is now limited, and they recently disabled search without an account.

For such a critical piece of worldwide technology infrastructure, maybe it would be better run by a not-for-profit independent foundation. I guess, since it is just git, anyone could start this, and migration would be easy.

jorams 10 days ago

> These changes will apply to operations like cloning repositories over HTTPS, anonymously interacting with our REST APIs, and downloading files from raw.githubusercontent.com.

Or randomly when clicking through a repository file tree. The first time I hit a rate limit was when I was skimming through a repository on my phone, and about the 5th file I clicked I was denied and locked out. Not for a few seconds either, it lasted long enough that I gave up on waiting then refreshing every ~10 seconds.

hardwaresofton 9 days ago

Does it seem to anyone like eventually the entire internet will be login only?

At this point knowledge seems to be gathered and replicated to great effect and sites that either want to monetize their content OR prevent bot traffic wasting resources seem to have one easy option.

jrochkind1 9 days ago

This means it's no longer safe to point to GitHub-hosted repos in `git:` or `github:` dependencies in ruby bundler, yes?

I forget because I don't use them, but weren't there some products meant as dependency package repositories that GitHub had introduced at some point, for some platforms? Does this apply to them? (I would hope not unless they want to kill them?)

This rather enormously changes GitHub's potential place in ecosystems.

What with the poor announcement/rollout -- also unusual for what we expect of GitHub, if they had realized how much this affects -- I wonder if this was an "emergency" thing not fully thought out in response to the crazy decentralized bot deluge we've all been dealing with. I wonder if they will reconsider and come up with another solution -- this one and the way it was rolled out do not really match the ingenuity and competence we usually count on from GitHub.

I think it will hurt GitHub's reputation more than they realize if they don't provide a lot more context, with suggested workarounds for various use cases, and/or a rollback. This is actually quite an impactful change, in a way that the subtle rollout seems to suggest they didn't realize?

Animats 9 days ago

Are the scraper sites using a large number of IP addresses, like a distributed denial of service attack? If not, rather than explicit blocking, consider using fair queuing. Do all the requests from IP addresses that have zero requests pending. Then those from IP addresses with one request pending, and so forth. Each IP address contends with itself, so making massive numbers of requests from one address won't cause a problem.

I put this on a web site once, and didn't notice for a month that someone was making queries at a frantic rate. It had zero impact on other traffic.
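The fair queuing idea above can be sketched in a few lines. This is a round-robin variant written for illustration, not necessarily the exact scheme the commenter deployed; the `FairQueue` class and the sample addresses are hypothetical. Each IP gets its own FIFO, and the server takes one request per IP per turn, so a flood from one address only delays that address.

```python
from collections import OrderedDict, deque
from typing import Deque, Tuple


class FairQueue:
    """Round-robin request queue keyed by client IP (illustrative sketch)."""

    def __init__(self) -> None:
        # Maps IP -> FIFO of pending requests; dict order is the turn order.
        self._queues: "OrderedDict[str, Deque[str]]" = OrderedDict()

    def push(self, ip: str, request: str) -> None:
        self._queues.setdefault(ip, deque()).append(request)

    def pop(self) -> Tuple[str, str]:
        if not self._queues:
            raise IndexError("pop from empty FairQueue")
        # Serve the IP whose turn it is, then send it to the back of the line
        # if it still has requests pending.
        ip, pending = next(iter(self._queues.items()))
        request = pending.popleft()
        del self._queues[ip]
        if pending:
            self._queues[ip] = pending
        return ip, request

    def __len__(self) -> int:
        return sum(len(q) for q in self._queues.values())


fq = FairQueue()
for i in range(3):
    fq.push("203.0.113.7", f"scraper-{i}")  # one noisy client
fq.push("198.51.100.4", "human-0")          # one quiet client

order = [fq.pop() for _ in range(len(fq))]
print(order)
```

The quiet client's single request is served second, ahead of the noisy client's backlog. As other commenters note, this only helps when the load comes from a limited number of addresses; a scraper spread over a large proxy pool looks like many quiet clients.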

jrochkind1 10 days ago

Wow, I'm realizing this applies to even browsing files in the web UI without being logged in, and the limits are quite low?

This rather significantly changes the place of GitHub-hosted code in the ecosystem.

I understand it is probably a response to the ill-behaved decentralized bot-nets doing mass scraping with cloaked user-agents (that everyone assumes is AI-related, but I think it's all just speculation and it's quite mysterious) -- which is affecting most of us.

The mystery bot net(s) are kind of destroying the open web, by the counter-measures being chosen.

thih9 10 days ago

What does “secondary” stand for here in the error message?

> You have exceeded a secondary rate limit.

Edit and self-answer:

> In addition to primary rate limits, GitHub enforces secondary rate limits

(…)

> These secondary rate limits are subject to change without notice. You may also encounter a secondary rate limit for undisclosed reasons.

<a href="https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api?apiVersion=2022-11-28#about-secondary-rate-limits" rel="nofollow">https://docs.github.com/en/rest/using-the-rest-api/rate-limi...</a>
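For scripts that hit a secondary limit on the REST API, the linked docs describe how long to back off. A rough sketch of that guidance as I understand it (honour `retry-after` if present, otherwise wait for `x-ratelimit-reset` when the quota is exhausted, otherwise wait at least a minute and back off exponentially); the `seconds_to_wait` helper is hypothetical, and since these limits can change without notice the details should be checked against the current docs.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(
    headers: Mapping[str, str], attempt: int, now: Optional[float] = None
) -> float:
    """Suggest a delay before retrying a rate-limited REST API request.

    `attempt` counts earlier retries of this request, starting at 0.
    """
    lower = {name.lower(): value for name, value in headers.items()}

    # 1. An explicit retry-after (in seconds) takes precedence.
    if "retry-after" in lower:
        return float(lower["retry-after"])

    # 2. Quota exhausted: wait until the reset time (UTC epoch seconds).
    if lower.get("x-ratelimit-remaining") == "0" and "x-ratelimit-reset" in lower:
        current = time.time() if now is None else now
        return max(0.0, float(lower["x-ratelimit-reset"]) - current)

    # 3. No hint at all: at least one minute, doubling on each retry.
    return 60.0 * (2 ** attempt)


print(seconds_to_wait({"Retry-After": "30"}, attempt=0))
print(seconds_to_wait({}, attempt=2))
```

Passing `now` explicitly keeps the function deterministic for testing; in real use it defaults to the current time.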

pogue 10 days ago

I assume they're trying to keep AI bots from strip mining the whole place.

Or maybe your IP/browser is questionable.

jhgg 10 days ago

The truth is this won't actually stop AI crawlers and they'll just move to a large residential proxy pool to work around it. Not sure what the solution is honestly.

croemer 10 days ago

The blog post is tagged with "improvement" - ironic for more restrictive rate limits.

Also, neither the new nor the old rate limits are mentioned.

pdimitar 9 days ago

A take that I'm not seeing in all the "LLM scrapers are heading to our site, run for your lives!" threads is this:

Why can't people harden their software with guards? Proper DDoS protection? Better caching? Rewrite the hot paths in C, Rust, Zig, Go, Haskell etc.?

It strikes me as very odd, the atmosphere of these threads. So much doom and gloom. If my site was hit by an LLM scraper I'd be like "oh, it's on!", a big smile, and I'll get to work right away. And I'll have that work approved because I'll use the occasion to convince the executives of the need. And I'll have tons of fun.

Can somebody offer a take on why we, the forefront of the tech sector, are just surrendering almost without a single shot?

Zdh4DYsGvdjJ 10 days ago

This was announced <a href="https://github.blog/changelog/2025-05-08-updated-rate-limits-for-unauthenticated-requests/" rel="nofollow">https://github.blog/changelog/2025-05-08-updated-rate-limits...</a>

londons_explore 9 days ago

Most of these unauthenticated requests are read-only.

All of public GitHub is only 21TB. Can't they just host that on a dumb cache and let the bots crawl to their heart's content?

jarofgreen 10 days ago

Also <a href="https://github.com/orgs/community/discussions/157887">https://github.com/orgs/community/discussions/157887</a> "Persistent HTTP 429 Rate Limiting on *.githubusercontent.com Triggered by Accept-Language: zh-CN Header" but the comments show examples with no language headers.

I encountered this too once, but thought it was a glitch. Worrying if they can't sort it.

Euphorbium 10 days ago

I remember getting this error a few months ago, so this does not seem like a temporary glitch. They don't want LLM makers to slurp all the data.

trallnag 10 days ago

Good that tools like Homebrew that heavily rely on GitHub usually support environment variables like GITHUB_TOKEN
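A script can follow the same pattern: send a token when one is set and fall back to anonymous requests otherwise. A minimal sketch, assuming the variable is named `GITHUB_TOKEN` as in the comment (individual tools may read a different name); the `github_headers` helper is hypothetical.

```python
import os
from typing import Dict, Mapping, Optional


def github_headers(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build REST API request headers, authenticated if a token is available."""
    env = os.environ if env is None else env
    headers = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    token = env.get("GITHUB_TOKEN")
    if token:
        # Authenticated requests count against the higher per-user limit.
        headers["Authorization"] = f"Bearer {token}"
    return headers


# Pass a dict instead of the real environment to see both cases.
print(sorted(github_headers({"GITHUB_TOKEN": "example-token"})))
print(sorted(github_headers({})))
```

The headers can then be passed to whatever HTTP client the script already uses.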

jrochkind1 10 days ago

Did I miss where it says what the new rate limits are? Or are they secret?

mmsc 10 days ago

Even with authenticated requests, viewing a pull request and adding `.diff` to the end of the URL is currently ratelimited at 1 request per minute. Incredibly low, IMO.

spacephysics 10 days ago

Probably to throttle scraping from AI competitors, and have them pay for the privilege as many other services have been doing

InfiniteLoup 10 days ago

How would this affect Go dependencies?

watermelon0 10 days ago

Time for Mozilla (and other open-source projects) to move repositories to sourcehut/Codeberg or self-hosted Gitlab/Forgejo?

stevekemp 10 days ago

Once again people post in the "community", but nobody official replies; these discussion-pages are just users shouting into the void.

knowitnone 10 days ago

you mean you want to better track users

micw 10 days ago

See also: <a href="https://github.com/orgs/community/discussions/159123">https://github.com/orgs/community/discussions/159123</a>

xnx 10 days ago

It sucks that we've collectively surrendered the urls to our content to centralized services that can change their terms at any time without any control. Content can always be moved, but moving the entire audience associated with a url is much harder.

jarofgreen 10 days ago

<a href="https://github.com/orgs/community/discussions/157887">https://github.com/orgs/community/discussions/157887</a> This has been going on for weeks and is clearly not a simple mistake.

radicality 10 days ago

Just tried it on chrome incognito on iOS and do hit this 429 rate limit :S That sucks, it’s already bad enough when GitHub started enforcing login to even do a simple search.