GitHub Turned into an Enterprise Under Microsoft?

10 points by aliostad over 5 years ago
I was asked to remove a training file from my deep learning language detection repo (only 64 stars, but still). The repo used deep learning to detect the programming language of a file or snippet. The training files and snippets were harvested from public files and snippets on GitHub and Stack Overflow. The repo was taken down even after I removed the file from the git history. More info and screenshots here: https://twitter.com/aliostad/status/1222440190821781506?s=20

3 comments

zegerjan over 5 years ago
The reason you couldn't delete the blob is that someone forked your repository, and GitHub uses git alternates to deduplicate fork networks.

I think you could ask GitHub whether you can recreate your repository without the offending blob, and you should be good again.
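As a rough local illustration of the alternates mechanism mentioned above, the sketch below creates a throwaway repository and a `git clone --shared` copy of it. The names `upstream` and `fork` and the file contents are hypothetical, and this only shows how git itself resolves objects through `objects/info/alternates`; how GitHub arranges its fork networks internally is not shown in this thread.

```shell
#!/bin/sh
# Sketch: one repository reading objects that live in another via alternates.
set -eu
work=$(mktemp -d)
cd "$work"

# "upstream" stands in for the original repository (hypothetical name).
git init -q upstream
cd upstream
git config user.email "demo@example.com"
git config user.name "Demo"
echo "placeholder training snippet" > 1703
git add 1703
git commit -q -m "add file"
blob=$(git rev-parse HEAD:1703)
cd ..

# --shared records upstream's object directory in objects/info/alternates
# instead of copying the objects into the clone.
git clone -q --shared upstream fork
alternates_file=fork/.git/objects/info/alternates

# Path where a loose copy of the blob would live if the clone had its own.
prefix=$(printf %s "$blob" | cut -c1-2)
rest=$(printf %s "$blob" | cut -c3-)
own_copy="fork/.git/objects/$prefix/$rest"

# The clone has no copy of the blob, yet can still read it.
fork_type=$(git -C fork cat-file -t "$blob")
echo "alternates point at: $(cat "$alternates_file")"
echo "object type seen from the clone: $fork_type"
```

The point of the sketch is only that an object can stay readable from a repository that never stored it, as long as some linked object store still holds it.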
tastroder over 5 years ago
github.com was always a corporate entity and subject to DMCA takedowns.

The notice link so others don't have to hand-type it: https://github.com/github/dmca/blob/master/2020/01/2020-01-27-ibm.md

I feel like that other twitter user / BSA / IBM as the originators of that takedown notice are more useful targets of animosity here.
aliostad over 5 years ago
Here is the terminal output of what I did to remove the file from the git history:

    ~/g/aliostad bfg --delete-files 1703 deep-learning-lang-detection.git

    Using repo : /Users/alikheyrollahi/github/aliostad/deep-learning-lang-detection.git

    Found 72811 objects to protect
    Found 2 commit-pointing refs : HEAD, refs/heads/master

    Protected commits
    -----------------

    These are your protected commits, and so their contents will NOT be altered:

     * commit ac12aa68 (protected by 'HEAD') - contains 8 dirty files :
        - data/stackoverflow-snippets/cpp/1703 (3.0 KB)
        - data/stackoverflow-snippets/csharp/1703 (835 B)
        - ...

    WARNING: The dirty content above may be removed from other commits, but as
    the *protected* commits still use it, it will STILL exist in your repository.

    Details of protected dirty content have been recorded here :

    /Users/alikheyrollahi/github/aliostad/deep-learning-lang-detection.git.bfg-report/2020-01-27/22-24-03/protected-dirt/

    If you *really* want this content gone, make a manual commit that removes it,
    and then run the BFG on a fresh copy of your repo.

    Cleaning
    --------

    Found 69 commits
    Cleaning commits: 100% (69/69)
    Cleaning commits completed in 304 ms.

    Updating 1 Ref
    --------------

        Ref                 Before     After
        ---------------------------------------
        refs/heads/master | ac12aa68 | c51406cc

    Updating references: 100% (1/1)
    ...Ref update completed in 13 ms.

    Commit Tree-Dirt History
    ------------------------

        Earliest                                              Latest
        |                                                          |
        .................................................DDDDDDDDDDm

        D = dirty commits (file tree fixed)
        m = modified commits (commit message or parents changed)
        . = clean commits (no changes to file tree)

                                Before     After
        -------------------------------------------
        First modified commit | a4a1bbac | cb32cfbf
        Last dirty commit     | 45322921 | 6b9e8d5d

    Deleted files
    -------------

        Filename   Git id
        ---------------------------------------------------
        1703     | 530293d7 (614 B), 98c9b646 (3.0 KB), ...

    In total, 47 object ids were changed. Full details are logged here:

    /Users/alikheyrollahi/github/aliostad/deep-learning-lang-detection.git.bfg-report/2020-01-27/22-24-03

    BFG run is complete! When ready, run: git reflog expire --expire=now --all && git gc --prune=now --aggressive

    --
    You can rewrite history in Git - don't let Trump do it for real!
    Trump's administration has lied consistently, to make people give up on ever
    being told the truth. Don't give up: https://www.aclu.org/
    --

    ~/g/aliostad cd deep-learning-lang-detection.git
    ~/g/a/deep-learning-lang-detection.git git reflog expire --expire=now --all && git gc --prune=now --aggressive
    Enumerating objects: 89539, done.
    Counting objects: 100% (89539/89539), done.
    Delta compression using up to 8 threads
    Compressing objects: 100% (89537/89537), done.
    Writing objects: 100% (89539/89539), done.
    Total 89539 (delta 28336), reused 61123 (delta 0)
    ~/g/a/deep-learning-lang-detection.git git push
    Enter passphrase for key '/Users/alikheyrollahi/.ssh/id_rsa':
    Enumerating objects: 89539, done.
    Counting objects: 100% (89539/89539), done.
    Delta compression using up to 8 threads
    Compressing objects: 100% (61201/61201), done.
    Writing objects: 100% (89539/89539), 40.83 MiB | 1.01 MiB/s, done.
    Total 89539 (delta 28336), reused 89539 (delta 28336)
    remote: Resolving deltas: 100% (28336/28336), done.
    To github.com:aliostad/deep-learning-lang-detection.git
     + ac12aa680...c51406cc8 master -> master (forced update)
    ~/g/a/deep-learning-lang-detection.git cd ..
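The BFG output reports that HEAD was a protected commit still containing the `1703` files, and it suggests making a manual commit that removes them before running the BFG on a fresh copy. The sketch below reproduces only that first step, using plain git in a throwaway repository. The directory layout and file contents are hypothetical, and the BFG commands appear as comments (copied from the output, not executed) because the tool may not be installed.

```shell
#!/bin/sh
# Sketch: remove the file from HEAD in a normal commit first, so that HEAD
# is no longer a "protected" commit holding the dirty content.
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q repo
cd repo
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical stand-in for the data/stackoverflow-snippets tree.
mkdir -p data/cpp
echo "snippet" > data/cpp/1703
echo "readme" > README
git add .
git commit -q -m "add data"

# Number of paths named 1703 in the HEAD tree before the manual removal.
in_head_before=$(git ls-tree -r --name-only HEAD | grep -c '/1703$' || true)

# The manual commit the BFG output asks for.
git rm -q data/cpp/1703
git commit -q -m "remove 1703"

in_head_after=$(git ls-tree -r --name-only HEAD | grep -c '/1703$' || true)

# Older commits still reference the file; cleaning those is the BFG's job.
in_history=$(git log --all --oneline -- data/cpp/1703 | wc -l | tr -d ' ')
echo "in HEAD before: $in_head_before, after: $in_head_after"
echo "commits touching the path: $in_history"

# Next steps, as quoted in the output above (not run here):
#   bfg --delete-files 1703 deep-learning-lang-detection.git
#   git reflog expire --expire=now --all && git gc --prune=now --aggressive
```

Even with a clean local rewrite, the thread suggests the blob could remain on the hosting side, so this covers only the local half of the problem.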