Git filter-repo: much easier/faster alternative to filter-branch

60 points by dmart, over 4 years ago

6 comments

acemarke, over 4 years ago
I did something similar a couple of years ago. I needed to rewrite an entire repo's history so that I could reformat all the code _and_ modernize all the JS syntax throughout every commit in the repo's history (dating back to 2013).

I started with `filter-branch`, but found it way too slow (especially on a corporate-controlled Windows machine, where starting up additional processes seems to have a lot of overhead). I concluded that I needed to run the entire filtering logic in a single process to avoid that overhead. I started writing my own with `pygit2`, but then found a repo called `pylter-branch` [0], which did most of that same "loop through commits and reprocess" work for me. I just had to add a lot of additional logic on top for the specific reprocessing I wanted to do.

I ended up being able to reprocess about 15K commits in around 4.5 hours. Given the amount of processing I was doing, that was pretty good.

I did an extensive writeup [1] on the problem statement, investigation, and techniques I used, if anyone's interested.

[0] https://github.com/sergelevin/pylter-branch

[1] https://blog.isquaredsoftware.com/2018/11/git-js-history-rewriting/
drothlis, over 4 years ago
> If commit messages refer to other commits by ID (e.g. "this reverts commit 01234567890abcdef", "In commit 0013deadbeef9a..."), those commit messages should be rewritten to refer to the new commit IDs

Finally, a tool that does this! I wish git rebase could do it.
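To make the quoted behavior concrete, here is a minimal sketch of rewriting commit IDs inside a commit message. It is an illustration only, not git-filter-repo's actual implementation: the function name `rewrite_commit_refs` and the `old_to_new` mapping (full old ID to full new ID) are assumptions made up for this example.

```python
import re

# Runs of 7 to 40 hex digits: the usual range for abbreviated or full SHA-1 IDs.
HEX_ID = re.compile(r"\b[0-9a-f]{7,40}\b")


def rewrite_commit_refs(message, old_to_new):
    """Replace old commit IDs found in `message` with their new IDs.

    `old_to_new` is a hypothetical mapping of full old ID -> full new ID,
    such as a history-rewriting tool might record while it runs.
    """
    def replace(match):
        ref = match.group(0)
        # An abbreviated ID may be a prefix of several old IDs;
        # only rewrite when exactly one old ID matches.
        matches = [old for old in old_to_new if old.startswith(ref)]
        if len(matches) != 1:
            return ref
        # Keep the abbreviation length the commit author used.
        return old_to_new[matches[0]][:len(ref)]

    return HEX_ID.sub(replace, message)
```

Anything that merely looks like a hex ID but matches no known commit is left untouched, so ordinary words made of hex letters are only at risk if they happen to be a prefix of a real old ID.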
MeteorMarc, over 4 years ago
Git filter-repo is mainly the work of a single author, newren, who is also a significant contributor to git itself: https://github.com/git/git/graphs/contributors The repo seems open to PRs though, given the many commits from one-time contributors.

[Edited] A question: how would one check the integrity of the resulting git repo (apart from an obvious rsync check of the source files in master)? Say I would like to create a branch from an older commit and rewrite its history with "git rebase -i"; that could fail on a repo without full integrity.
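One possible answer to the integrity question is git's own `git fsck`, which verifies the connectivity and validity of the objects in a repository. The sketch below wraps it in Python; the helper names `fsck_command` and `repo_is_intact` are made up for this example, and it assumes `git` is available on the PATH. It checks the object database only and says nothing about whether the rewritten content is what was intended.

```python
import subprocess


def fsck_command(repo_path):
    """Build the `git fsck` invocation for the repository at `repo_path`."""
    # -C runs git as if started in repo_path; --full also checks packed
    # objects; --strict enables stricter object checks.
    return ["git", "-C", str(repo_path), "fsck", "--full", "--strict"]


def repo_is_intact(repo_path):
    """Return True if `git fsck` reports no errors (exit status 0)."""
    result = subprocess.run(
        fsck_command(repo_path), capture_output=True, text=True
    )
    return result.returncode == 0
```

Calling `repo_is_intact("path/to/rewritten-repo")` before attempting an interactive rebase would catch missing or corrupt objects early.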
fmorel, over 4 years ago
I used this earlier this year to split a monorepo into a submodule + 4 other repos. It made it incredibly easy to rewrite history to move files going to different repos, then remove the history of any files not staying in a particular repo.

And it was *fast*. Also, being able to pass in files containing rewrite rules allowed me to easily do multiple dry-runs of the code split ahead of time so I was able to minimize the code freeze for our team.
tomohawk, over 4 years ago
This looks so much nicer than filter-branch, which is slow beyond belief.

We've used BFG (which at least performs in reasonable time) to filter out files containing viruses or security tokens/passwords, but there are some definite limitations.
bobbydreamer, over 4 years ago
Wow, these are really interesting projects.