How I sent 300k emails through Github's API in a matter of minutes

196 points by badlogic, over 11 years ago

12 comments

badlogic, over 11 years ago
That took my server down, I'm afraid. Good day today. Here's a "cached" version.

To all watchers of the libgdx repository: I'm terribly sorry and hope I didn't interfere with your work in any way.

This is meant as a cautionary tale about using Github's API on a repository with quite a few watchers (460 in this case).

Earlier this year we migrated our code from Google Code to Github. We didn't have a good migration plan for the 1200 or so issues back then, so we kept them on Google Code. We now have about 1700 issues on the tracker.

Today I finally wanted to tackle the issue tracker migration, using a Python script [1] I found on Github. The script requires one to specify a Github user account that owns the repository the issues will get migrated to. I did a dry run on a fork of the main repo using my Github account, fixed up some issues in the script, and validated things to the best of my abilities. Things looked good.

Then I ran it on the main repository. Luckily I was watching our IRC channel. After about 4 minutes, people started to scream. They each received 789 e-mails from Github. Every single issue I migrated, and every single comment on each issue, triggered an e-mail notification to all watchers of the main repository.

This wasn't apparent to me during the dry runs, as I used my own Github account. The script posts all issues/comments with the user account I supplied, so naturally, I did not get any notification mails.

I stopped the script after 130 issues (4 minutes), and immediately started sending out apologies and a mail to Github support, to which I haven't received an answer yet. I sent roughly 300k mails through their servers in a matter of minutes. If I hadn't watched IRC, I'd have sent out about 4 million mails to 460 people within an hour.

Let me assure you that I'm extremely sorry about this incident. I know that things like this can interrupt daily workflows quite a bit, even if getting rid of those mails is not a Herculean task. I'd be rather upset if a repo maintainer pulled something like this on me. Please accept my deepest apologies.

The lesson for Github API users: think hard about the implications of automating tasks through the Github API if you have more than a few watchers.

The lesson for Github/API designers: consider safe-guarding against such issues in your API, in case other idiots like me pull off something similar in the future.

[1] https://github.com/tgoyne/google-code-issues-migrator
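The arithmetic behind those figures is worth spelling out, since it is the check that would have flagged the problem before the real run. A rough sketch in Python, assuming every created issue or comment notifies every watcher; the function name `notification_fanout` is made up for illustration and is not part of the migration script:

```python
def notification_fanout(events: int, watchers: int) -> int:
    """Total e-mails sent if every event notifies every watcher."""
    return events * watchers


# Figures from the post: each watcher received 789 e-mails, and there were 460 watchers.
sent = notification_fanout(events=789, watchers=460)
print(sent)  # 362940, in line with "roughly 300k"

# Working backwards from the "about 4 million mails" projection gives the
# number of issues plus comments that estimate implies (not stated in the post).
projected_events = 4_000_000 // 460
print(projected_events)  # 8695
```

Running an estimate like this against the watcher count of the target repository, rather than the fork used for the dry run, makes the scale visible up front.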
chrisacky, over 11 years ago
It's not your fault.

It's actually a PITA to overcome issues like this on a technical level because you have to run something akin to a buffer queue that works similarly to how "debouncing" works.

The best approach I have found is to...

- Rate limit events as they happen. So you might let 5 events through (within 10 minutes), and then start to rate limit them by adding each item to a queue which you merge down every 10 minutes (but that exponentially/incrementally backs off each time you exceed the 5 items, so the next queue takes 30 minutes before it's popped, and then 90 minutes, etc.).

- So for example, you might have an instant pop from the queue where fewer than 10 events have been triggered within 10 minutes.

- Then if more than 10 events have been triggered, you add each item to a queue, and after X minutes, you pop each item off and send a bulk email.

----

It's a real pain to manage such a system, because your "typical" job server, such as Gearman, doesn't let you add a "delay" on jobs...

Ideally, you'd want to make sure that you ignore any new events for at least X minutes... So you are left with the only option of running *another* pseudo-queue system just to catalogue all of your throttled events.

Let's talk strategy. How else do you guys handle instant email notifications without this spamming issue? PS. I'm referring to GitHub implementing this strategy, not the OP, in case there was any confusion.
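The scheme described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how GitHub or any job server actually does it: the class name `DigestThrottle`, its fields, and the choice to pass the current time in explicitly (so no real clock or job queue is needed) are all assumptions made for the example. It lets 5 events through per 10-minute window, queues the overflow into a digest, and triples the digest delay after each flush (10, 30, 90 minutes).

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DigestThrottle:
    """Send the first few events instantly, then batch the rest into digests."""

    burst: int = 5          # events allowed through instantly per window
    window: float = 600.0   # window length in seconds (10 minutes)
    factor: float = 3.0     # backoff multiplier: 10 -> 30 -> 90 minutes
    recent: deque = field(default_factory=deque)  # timestamps of instant sends
    queued: list = field(default_factory=list)    # events awaiting a digest
    delay: float = 0.0      # current digest delay; 0 means "not throttling"
    flush_at: float = 0.0   # time at which the pending digest is due

    def submit(self, event, now):
        """Record an event; return the messages to send right now (maybe none)."""
        out = self.poll(now)
        # Forget instant sends that have fallen out of the window.
        while self.recent and now - self.recent[0] >= self.window:
            self.recent.popleft()
        if self.delay == 0.0 and len(self.recent) < self.burst:
            self.recent.append(now)
            out.append([event])            # instant, single-event message
        else:
            if self.delay == 0.0:          # first overflow: start throttling
                self.delay = self.window
                self.flush_at = now + self.delay
            self.queued.append(event)
        return out

    def poll(self, now):
        """Flush the digest if it is due; meant to be called from a timer."""
        if self.delay == 0.0 or now < self.flush_at:
            return []
        if not self.queued:                # quiet period: stop throttling
            self.delay = 0.0
            return []
        digest, self.queued = self.queued, []
        self.delay *= self.factor          # back off before the next digest
        self.flush_at = now + self.delay
        return [digest]
```

Each returned message is a list of events: a one-element list is an instant notification, a longer list is a bulk e-mail. One throttle instance per (repository, recipient) pair would be needed, and the periodic `poll` call stands in for the "delayed job" that the comment notes a typical job server lacks.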
shortstuffsushi, over 11 years ago
I caused a similar issue while running edits on a Confluence Wiki instance as an intern. I was helping our publications department add some macro to every single page of the site, which I found out they were doing by hand! A bit shocked, I told them I could write something to automate that in a matter of minutes.

Sure enough, several minutes later all of the pages were updated. All 50 or so pages in each of the 15 spaces. And everyone who had ever touched one of those pages got an email for that page.

The nice thing about the Confluence API is that you can specify "minor" updates to prevent exactly this scenario from happening.

I guess since GitHub is built on the git foundation, adding some sort of "silent" flag might not be as easily possible, but certainly it's desirable.
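For illustration, here is roughly what a "minor" update looks like as a request body. The comment does not say which Confluence API was used at the time; this sketch assumes the JSON shape of the later REST content-update call, where, as far as I know, the flag is `minorEdit` inside the `version` object. The helper name `minor_edit_payload` and the sample values are made up, and nothing is sent over the network.

```python
import json


def minor_edit_payload(page_id: str, title: str, body_html: str, next_version: int) -> dict:
    """Build a page-update body that marks the change as a minor edit."""
    return {
        "id": page_id,
        "type": "page",
        "title": title,
        "body": {"storage": {"value": body_html, "representation": "storage"}},
        # Assumption: "minorEdit" is the flag that keeps watchers from being e-mailed.
        "version": {"number": next_version, "minorEdit": True},
    }


payload = minor_edit_payload("12345", "Release notes", "<p>Updated macro</p>", 7)
print(json.dumps(payload, indent=2))
```

Checking the field name against the documentation for the Confluence version in use is advisable before relying on it for a bulk edit.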
adamnemecek, over 11 years ago
I'm surprised that Github has not implemented a feature for repo migration from Google Code and SourceForge.
gexla, over 11 years ago
Following just one busy repo can take over your inbox (ahem, Docker). So I'm sure your people have good filters in place so that they aren't too distracted by a flood of messages from Github.
lnanek2, over 11 years ago
Not his fault, and pretty cool that he was in touch with users of the library enough to catch it and stop it.

Always figured I'd do Cocos2d-x or Unity for any serious game I do next. I've used Cocos2d before and written Unity plugins before. I even have a contractor working on a Unity project right now. Will have to give libgdx a few extra points when deciding in the future, though, for having a caring maintainer.

I actually wrote an OpenGL game engine for Android back before any of the later things came out, like Replica Island, AndEngine, the Cocos2D port, etc. Almost makes me wish I'd open sourced it. It did have some awesome stuff, like batching all the sprites with similar draw states together into one draw call.
russell_h, over 11 years ago
Do people actually let GitHub emails go to their inbox? After they got really aggressive about signing me up as a "watcher" to repos, I found it necessary to just route all GitHub email to a dedicated folder that I never read, then only whitelist repositories I actually care about.
TallboyOne, over 11 years ago
Well, that is a big yikes, but crisis (mostly) averted.
andrewljohnson, over 11 years ago
I did this with a company repo, when I wrote a script to migrate issues from a spreadsheet into GitHub. I only sent 50 issues * 15 people though.
shitlord, over 11 years ago
Anyone have a mirror or cached copy of this page? This site hit the front page of hn/proggit twice this week, and it was down both times.
scottcanoni, over 11 years ago
Site is majorly foobared
heeton, over 11 years ago
Recently: And a bottle of rum (http://www.amazon.co.uk/And-Bottle-Rum-History-Cocktails/dp/0307338622)

I loved it. A history of rum, including all of the politics around it (like the role it played in the slave trade and American independence). Great read :)