科技回声

Why Deleting Sensitive Information from GitHub Doesn't Save You

313 points by jwcrux, over 10 years ago

20 comments

guiambros, over 10 years ago
<i>&gt; In this post, I’m going to show exactly how hackers instantly harvest information committed to public Github repositories...</i><p>A few days ago I published my blog to GitHub with my MailGun API key in the config file (a stupid mistake, I know). In less than 12 hours, spammers had harvested the key and sent a few thousand emails from my account, using up my entire monthly limit.<p>Thankfully I was on the free MailGun account, which is limited to 10,000 emails/month, so there was no material damage. Their tech support was awesome: they blocked the account immediately and notified me, then quickly helped unblock it once I had changed the keys and passwords and made the repo private.<p>I was wondering exactly how they were able to harvest GitHub content so quickly; it couldn't be web scraping or a random search. This article explains well how to drink from GitHub's events firehose and the GHTorrent project, so everything makes sense now. Thanks for posting it.<p>EDIT: This other post[1] describes a similar situation. Some folks are monitoring ALL GitHub commits and grabbing passwords on the fly as they are committed.<p>[1] <a href="http://www.devfactor.net/2014/12/30/2375-amazon-mistake/" rel="nofollow">http://www.devfactor.net/2014/12/30/2375-amazon-mistake/</a>
olefoo, over 10 years ago
There's a fairly straightforward pattern for keeping sensitive credentials out of GitHub. It comes straight from <a href="http://12factor.net/config" rel="nofollow">http://12factor.net/config</a>: store configuration data in the environment.<p>What I do for most projects is keep the tree containing the working directory inside a directory that also holds items that don't belong on GitHub (the project brief, my emacs bookmarks file, random notes related to the project, etc.). In that directory there is a .credentials file containing a set of export statements, somewhat like:<p><pre><code>export AWS_ACCESS_KEY=XXXXXXXXXXXXXXXXXXXX
export AWS_SECRET_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
export AWS_USER_ID=############
</code></pre> If I'm feeling extra paranoid, I'll encrypt that into a blob that I only decrypt when I'm working on said project.<p>Then at startup the app goes looking for its config in the environment. This does create issues for some environments (solving this for Docker is trivial), but you can usually pass environment variables to whatever is executing your code reasonably securely. It's not perfect, and environments can sometimes be revealed externally if an attacker is determined, clever, and focused on your app for some reason.<p>But it does give you a hygienic procedure that keeps credentials, which are equivalent to an open draw on your bank account, out of public repositories.
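A minimal sketch of the startup lookup described in this comment, assuming a Python app. The variable names come from the .credentials example; `load_aws_config` and the decision to fail fast on a missing value are illustrative assumptions, not something the commenter specified.

```python
import os

# Variable names taken from the .credentials example in the comment.
REQUIRED_VARS = ("AWS_ACCESS_KEY", "AWS_SECRET_KEY", "AWS_USER_ID")


def load_aws_config(environ=None):
    """Return the required settings from the environment, or raise if any are missing.

    `load_aws_config` is a hypothetical helper, not part of any library.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        # Fail at startup instead of on the first API call.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

After running `source .credentials` in the shell that launches the app, calling `load_aws_config()` picks up the three values, and nothing secret needs to live in the repository.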
tomphoolery, over 10 years ago
It should be noted that GitHub's article on removing sensitive data is still applicable if you haven't pushed anything back to GitHub yet. Remember that a commit is just an entry in your local repo; it doesn't synchronize with `origin/master` until you tell it to. So if the user has committed in their local Git repo but has not pushed to GitHub yet, they should follow GitHub's guide and not worry about changing any keys.
PhantomGremlin, over 10 years ago
If you ever put <i>anything</i> out on the Internet, not just to GitHub, consider it to be public information. Forever. You might be able to convince archive.org to remove it, but there are hundreds of players out there who aren't as ethical.<p>Ben Franklin figured this out many years ago:<p><pre><code> Three can keep a secret, if two of them are dead.</code></pre>
revelation, over 10 years ago
So many words for one simple principle: if sensitive data has been publicly accessible or transferred in plaintext over the internet, consider it compromised, logged, stored, and abused.<p>The only recourse is to immediately change or revoke access.
nutanc, over 10 years ago
I think this problem is widespread enough, and there are enough idiots out there (me included), that there should be a feature request for GitHub to show a prompt when it detects sensitive information in hosted code.
akerl_, over 10 years ago
To be clear, the guide from GitHub that's linked at the top of this article clearly states that you should consider the sensitive data compromised. Cleaning it out of the repo is a good move, but it's a companion move to rotating out those creds or whatever for new ones.
femto113, over 10 years ago
My advice: USE PRIVATE REPOS! At $7/month, GitHub's micro plan with 5 repos is just $1.40/repo-month. This is the cheapest insurance you can get against the nearly inevitable mistake of committing something sensitive.
xasos, over 10 years ago
Always use environment variables. They are probably the best way to safeguard your API keys.
xasos, over 10 years ago
It always amazes me to see the sheer number of API keys left around in GitHub repositories. You can search for something like "Twilio API Key" and come up with hundreds of thousands of results. I wonder to what extent these keys have been exploited.
baxter001, over 10 years ago
A script that posts random config-like files containing keys to public repos, to waste these guys' bandwidth and light them up on Amazon's blacklist radar, would be a cool idea.
DenisM, over 10 years ago
On MacOS there's Keychain - it's a designated place for storing secrets.<p>On Windows I create a batch file at a fixed location with all the credentials in it. A script simply runs this batch file and reads the env vars to get the values. A compiled program parses the batch file with a regex to find the required values. This works remarkably well for keeping credentials out of the code base.<p>Hope that helps someone.
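A rough sketch of the "parse the batch file with regex" step, written in Python for illustration. The comment does not show the file's layout, so this assumes one `set NAME=value` (or `set "NAME=value"`) statement per line; `parse_batch_credentials` is a hypothetical helper name.

```python
import re

# Matches lines such as `set API_KEY=abc123` or `SET "API_KEY=abc123"`.
# The `set NAME=value` layout is an assumption of this sketch.
SET_LINE = re.compile(
    r'^\s*set\s+"?([A-Za-z_][A-Za-z0-9_]*)=(.*?)"?\s*$',
    re.IGNORECASE,
)


def parse_batch_credentials(text):
    """Extract NAME=value pairs from the `set` lines of a batch file's text."""
    values = {}
    for line in text.splitlines():
        match = SET_LINE.match(line)
        if match:
            values[match.group(1)] = match.group(2)
    return values
```

Lines that are not `set` statements (for example `@echo off` or `rem` comments) are skipped, so the same file can still be executed as a normal batch script.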
icymatter, over 10 years ago
GitHub has a very good cache. In the past, after I deleted a repository, I could still access some diff and commit information from my own activity pages. I had to ask the GitHub team to clear that page manually.
jquast, over 10 years ago
I'm very certain this is a hacker's account configured to follow a great many projects and people (2k projects, 1.3k users) for this very purpose -- a suspicious [redacted] [unknown] profile, <a href="https://github.com/trnsz" rel="nofollow">https://github.com/trnsz</a>
jpetersonmn, over 10 years ago
The first time I tried to use GitHub, I uploaded my Gmail password, which I was using to send myself an email when something failed. I figured there would be bots that would scoop up that information right away. Luckily, I realized what I had done before anyone could get into my Gmail.
jpdlla, over 10 years ago
Thinking of actually working on a tool for this. It will have a blacklist of "searches" that might turn up sensitive data, and perhaps notify the committer by email or create an issue on the repo. Anyone else want to get involved?
godzillabrennus, over 10 years ago
Millions of emails for developers and no one harvesting this info thought it wise to obfuscate it in some way?
tlrobinson, over 10 years ago
Are there any open source git hooks that will scan your code for known credential formats?
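As a sketch of what such a scan might look like (not a pointer to an existing tool), the Python below checks text for two widely recognized formats: AWS access key IDs, which start with `AKIA` followed by 16 uppercase letters or digits, and PEM private key headers. The pattern list is deliberately incomplete, and `find_suspect_lines` is a hypothetical name.

```python
import re

# Illustrative, deliberately incomplete pattern list (an assumption of this sketch).
PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key header": re.compile(
        r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"
    ),
}


def find_suspect_lines(text):
    """Return (line_number, label) pairs for lines matching a known credential format."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits
```

A pre-commit hook could feed it the output of `git diff --cached` and exit with a non-zero status when the returned list is not empty, which makes Git abort the commit.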
morkfromork, over 10 years ago
What would happen if you published a few billion fake credentials?
rilita, over 10 years ago
TLDR: It won't save you because people could have copied the information before you deleted it.<p>Duh?