Not on Git, but I was curious and grepped through the Siemens S7 repository I maintain at work; we've been using the same comment practice since forever, with the date in ISO 8601 format. (Since before ISO 8601 was even a thing!)<p>Oldest I found?<p>1986-06-17: Trygve glemte å sjekke om vi deler på null. Fikset.<p>(Trygve forgot to check whether we divide by zero. Fixed.)
You don't need to blame every file[1]. Use `git rev-list` to find your oldest commit:<p><pre><code> git rev-list --reverse --date-order HEAD | head -1 # or
git rev-list --reverse --author-date-order HEAD | head -1
</code></pre>
To see the files in that commit:<p><pre><code> git ls-tree -lr <commit-id>
</code></pre>
To see a particular file (the path is relative to the repository root, with no leading slash):<p><pre><code> git show <commit-id>:path/to/file # or
git cat-file -p <commit-id>:path/to/file
</code></pre>
[1] Caveat: I suppose this doesn't account for files which no longer exist or that have been completely re-written.<p><a href="https://git-scm.com/docs/git-rev-list" rel="nofollow">https://git-scm.com/docs/git-rev-list</a><p><a href="https://git-scm.com/docs/git-ls-tree" rel="nofollow">https://git-scm.com/docs/git-ls-tree</a><p><a href="https://git-scm.com/docs/git-show" rel="nofollow">https://git-scm.com/docs/git-show</a><p><a href="https://git-scm.com/docs/git-cat-file" rel="nofollow">https://git-scm.com/docs/git-cat-file</a><p><a href="https://git-scm.com/docs/gitrevisions" rel="nofollow">https://git-scm.com/docs/gitrevisions</a>
I like leaving something like GitLens on so I can see the super old lines ad hoc when I naturally come across them. It's fun to get glimpses of the past.
It's probably almost always going to be a boring config line(s) in the initial commit?<p>A section header in a pylintrc or Cargo.toml, a Django settings.py var, etc. Or even an import/var in a file that's core enough to still exist, import logging and LOGGER = ... for example.
Our code base still has ghost comments about code being just so because the NeXT compiler won't accept it any other way. No one has the heart to remove them.
In our monorepo (of 101470 Java files, according to<p><pre><code> find . -name '*.java' | wc -l
</code></pre>
), I shudder to think how long that would take. For large repos, I imagine you could get quite a bit faster by only considering files created before the oldest date you've found so far.
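A rough sketch of that pruning idea, in case it helps. The function names (`can_skip`, `creation_time`, `oldest_line_time`, `scan_repo`) are made up for illustration, and it assumes author timestamps only increase along history, which rebases and bad clocks can violate. It skips the expensive `git blame` for any file that was first added after the oldest line found so far.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from Git itself.

# True when a file created at time $1 cannot beat the best time $2.
# An empty $2 means nothing has been found yet, so never skip.
can_skip() {
    [ -n "$2" ] && [ "$1" -ge "$2" ]
}

# Author time (Unix seconds) of the commit that first added the file.
# --follow is used so that a rename does not count as the creation.
creation_time() {
    git log --follow --diff-filter=A --format=%at -- "$1" | tail -n 1
}

# Author time (Unix seconds) of the oldest surviving line in the file.
oldest_line_time() {
    git blame --line-porcelain -- "$1" |
        awk '$1 == "author-time" { print $2 }' | sort -n | head -n 1
}

# Walk all tracked files, blaming only those that could still win.
scan_repo() {
    best=""
    list=$(mktemp)
    git ls-files > "$list"
    while IFS= read -r f; do
        created=$(creation_time "$f")
        [ -n "$created" ] || continue
        if can_skip "$created" "$best"; then continue; fi
        t=$(oldest_line_time "$f")
        [ -n "$t" ] || continue
        if [ -z "$best" ] || [ "$t" -lt "$best" ]; then best=$t; fi
    done < "$list"
    rm -f "$list"
    echo "$best"
}
```

Source it and run `scan_repo` from the repository root; it prints the Unix timestamp of the oldest line it found. Whether this is actually faster depends on how expensive `git log --follow` is per file in your repo, so treat it as something to measure rather than a guaranteed win.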
If any of the lines from the repo's first commit happen to be untouched, then that's a huge short-cut: those lines are the oldest. Finding one of those lines manually is a pretty easy task. Enumerating them all accurately, less so.
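One way to spot such lines in a single file, as a minimal sketch: by default `git blame` prefixes the hash of a boundary commit with `^`, and in a full (non-shallow) clone without `--root` or `blame.showRoot` set, the root commit is treated as a boundary. The `root_lines` name is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: reads default `git blame` output on stdin and
# keeps only lines whose commit hash carries the `^` boundary marker.
root_lines() {
    grep '^\^'
}
```

Usage would be something like `git blame -- path/to/file | root_lines`. In a shallow clone the `^` marks the shallow boundary instead, so the result would be misleading there.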
Not sure why all the lines of code. This is much shorter:<p><pre><code> git ls-files|xargs -n 1 git blame --date=format:%Y%m%d -f |grep -Eo '[0-9]{8}.*' |sort | head -n 1 | sed 's/^[^)]*) \t//'
</code></pre>
(on MacOSX)
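If the text-scraping feels fragile, here is a sketch of the same idea on top of `git blame --line-porcelain`, which emits an `author-time` header (Unix seconds) before each tab-prefixed content line. The `oldest_from_porcelain` name is made up; it only parses what it is fed on stdin.

```shell
#!/bin/sh
# Hypothetical parser: reads `git blame --line-porcelain` output on stdin
# and prints "<author-time><TAB><line content>" for the oldest line seen.
oldest_from_porcelain() {
    awk '
        # Remember the timestamp from the most recent header block.
        /^author-time / { t = $2 + 0 }
        # Content lines are the only ones that start with a tab.
        /^\t/ {
            if (!found || t < best) { best = t; line = substr($0, 2); found = 1 }
        }
        END { if (found) printf "%d\t%s\n", best, line }
    '
}
```

Something like `git ls-files -z | xargs -0 -n 1 git blame --line-porcelain -- | oldest_from_porcelain` should then work from the repository root, with the caveat that I have not tried it on repos containing binary files or submodules.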
I wrote a similar, maybe hacky, script using `git blame` on every file. In our main application, we still have a couple of lines from the initial commit in 2011.