科技回声 (Tech Echo)

How short can Git abbreviate? (2013)

55 points by pncnmnp, about 1 year ago

6 comments

jwilk, about 1 year ago
Today's histogram of commit abbreviation lengths:

    $ git rev-list --all --abbrev=0 --abbrev-commit | awk '{ a[length] += 1 } END { for (len in a) print len, a[len] }'
    5 84
    6 692222
    7 527029
    8 43802
    9 2791
    10 181
    11 8

sgbeal, about 1 year ago
Just FYI: showing hash prefix collision counts is a built-in feature of the Fossil SCM. For example, the collisions of one particularly well-known repo can be seen at:

<https://sqlite.org/src/hash-collisions>

and Fossil's own can be seen at:

<https://fossil-scm.org/home/hash-collisions>

banish-m4, about 1 year ago
For grins, I recently wrote a brute-force program to inject a nonce into a shell script such that the nonce is also the CRC32 hash of the resulting modified script, which contains the nonce itself. One script had 2 hash collisions that satisfied this property in the entire search domain.
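
A reduced sketch of the kind of search described above, in Python. The commenter's program is not shown, so everything here is an assumption: the template script, the `{nonce}` placeholder, and in particular the truncation of the CRC-32 to its low 16 bits, which is done only so that the whole search space (65,536 candidates) can be tried quickly in pure Python. A full search would cover all 2^32 nonces. Depending on the template, the search may find zero, one, or several matches.

```python
import zlib

# Hypothetical script template; "{nonce}" marks where the 4-hex-digit nonce goes.
TEMPLATE = "#!/bin/sh\n# checksum: {nonce}\necho 'hello'\n"


def truncated_crc(text: str) -> int:
    """Low 16 bits of the CRC-32 of the UTF-8 text (truncated only to keep the search small)."""
    return zlib.crc32(text.encode("utf-8")) & 0xFFFF


def self_describing_nonces(template: str) -> list[int]:
    """Return every 16-bit nonce whose value equals the truncated CRC of the script containing it."""
    hits = []
    for nonce in range(1 << 16):
        script = template.format(nonce=f"{nonce:04x}")
        if truncated_crc(script) == nonce:
            hits.append(nonce)
    return hits


hits = self_describing_nonces(TEMPLATE)
print(f"{len(hits)} self-describing nonce(s):", [f"{n:04x}" for n in hits])
```
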
pcthrowaway, about 1 year ago
I realize the author probably won't see this since the post is ancient, but it looks like there's a bug in the output: the "how many total objects are ambiguous at that abbreviation [length]" figure isn't quite right. It's actually "how many total possible words of that length can reference 2 or more existing objects (and are therefore insufficient to disambiguate the matching objects of that repo)".
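
The two quantities being distinguished here can be computed side by side. A small Python sketch, using made-up six-character IDs rather than real Git objects; `ambiguity_at` is a name chosen here for illustration, not something from the original post:

```python
from collections import Counter


def ambiguity_at(ids, length):
    """Return (ambiguous_prefixes, ambiguous_objects) for one abbreviation length.

    ambiguous_prefixes: distinct prefixes of that length shared by 2+ objects.
    ambiguous_objects:  objects whose prefix of that length is shared with another object.
    """
    counts = Counter(oid[:length] for oid in ids)
    ambiguous_prefixes = sum(1 for c in counts.values() if c > 1)
    ambiguous_objects = sum(c for c in counts.values() if c > 1)
    return ambiguous_prefixes, ambiguous_objects


# Made-up IDs for illustration only.
ids = ["abc123", "abc456", "abd789", "ffe000"]
for length in range(1, 5):
    print(length, ambiguity_at(ids, length))
# prints:
# 1 (1, 3)
# 2 (1, 3)
# 3 (1, 2)
# 4 (0, 0)
```

At length 1, a single prefix (`a`) is ambiguous, but three objects are affected by it, which is the gap between the two readings of the figure.
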
aronhegedus, about 1 year ago
Reminds me of this project: https://github.com/trichner/gitc0ffee, which takes a commit and appends some header to it to force the hash to start with a chosen prefix, such as `c0ffee`.

> 6 character prefix: less than a second

> 8 character prefix: in the order of one or more minutes

No affiliation with this, but I tested it, and it was fast!
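
To illustrate the idea (this is not gitc0ffee's actual implementation, which is not described in the comment), here is a minimal Python sketch. In a SHA-1 repository, Git names a commit by the SHA-1 of `commit <size>\0` followed by the commit body, so changing any byte of the body changes the ID. The sketch assumes the varying part is an extra header line called `nonce`, a hypothetical name chosen for illustration, and it uses a made-up sample commit.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 object ID as Git computes it: hash of '<kind> <size>\\0' plus the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def find_vanity_commit(body: bytes, prefix: str, max_tries: int = 1_000_000):
    """Try nonce values until the commit ID starts with `prefix`.

    Returns (object_id, new_body), or None if max_tries is exhausted.
    The 'nonce' header name is an assumption made for this sketch.
    """
    headers, sep, message = body.partition(b"\n\n")
    for n in range(max_tries):
        candidate = headers + b"\nnonce " + str(n).encode() + sep + message
        oid = git_object_id("commit", candidate)
        if oid.startswith(prefix):
            return oid, candidate
    return None


# Made-up commit body; the tree ID is Git's well-known empty tree.
sample = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A U Thor <author@example.com> 1700000000 +0000\n"
    b"committer A U Thor <author@example.com> 1700000000 +0000\n"
    b"\n"
    b"Initial commit\n"
)

result = find_vanity_commit(sample, "c0f")
if result is not None:
    print("found", result[0])
```

Each extra hex digit in the prefix multiplies the expected number of attempts by 16, so a six-character prefix such as `c0ffee` needs about 16.7 million tries on average. Pure Python will likely be much slower than the timings quoted above.
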
ramses0, about 1 year ago
I've actually used this as an interview question for several years (and it has been my first/only "interview-GPT" question), as it's a problem I ran into at work, can be solved by someone new to programming, and has lots of headroom and alternates for efficiency, optimization, and alternative implementation discussions.

The case I ran into was an initial + updated "dictionary of values" scenario. Imagine: `{ "foo:bar:baz": 1.23, "bleep:bloop:blah": 4.56, ...etc.. }`.

I wanted to send the first batch as a "full dictionary", and then send updated batches as: `{ "aa": 1.23, "bb": 4.56 }` whereby the client should already be able to reverse: `aa` => `foo:bar:baz`, and `bb` => `bleep:bloop:blah`, since they could md5 the original key and then match that to the "compressed" unique key.

For the interview question, asking "what's the minimum unique prefix length", and exactly in the context of "how short can you abbreviate a git commit", was an excellent, low-friction, easily understandable problem.

I was shocked when the initial GPT-3 nailed the naive implementation, and in retrospect, not surprised when its superficially-correct "optimized" solution was only superficially correct. ;-) Better than 50% of the interviewees I'd asked this question of.
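
A sketch of the problem as described, in Python. The naive version tries each length in turn. The second version is one possible optimization (sort, then compare neighbours, since the pair sharing the longest prefix ends up adjacent after sorting), not necessarily the one the commenter had in mind. All function names are hypothetical, and the inputs are assumed to be equal-length hex digests.

```python
import hashlib
from os.path import commonprefix


def min_unique_prefix_len_naive(hashes):
    """Smallest k such that every distinct hash has a distinct k-character prefix."""
    unique = set(hashes)
    max_len = max((len(h) for h in unique), default=0)
    for k in range(1, max_len + 1):
        if len({h[:k] for h in unique}) == len(unique):
            return k
    return max_len


def min_unique_prefix_len_sorted(hashes):
    """Same answer via sorting: one more than the longest prefix shared by neighbours."""
    ordered = sorted(set(hashes))
    if not ordered:
        return 0
    longest_shared = max(
        (len(commonprefix([a, b])) for a, b in zip(ordered, ordered[1:])),
        default=0,
    )
    return longest_shared + 1


def compress_keys(keys):
    """Map each key's shortest unambiguous MD5 prefix back to the original key."""
    digests = {hashlib.md5(k.encode("utf-8")).hexdigest(): k for k in keys}
    n = min_unique_prefix_len_sorted(digests)
    return {digest[:n]: key for digest, key in digests.items()}


# The client can rebuild this table from the full dictionary it already has.
print(compress_keys(["foo:bar:baz", "bleep:bloop:blah"]))
```

The naive version rebuilds a set of prefixes for every candidate length, while the sorted version does one sort and one pass over adjacent pairs.
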