Version numbers are just part of a name; we can't rely on them, any more than we can rely on package names (e.g. anyone can make a package with the name "aws-sdk"; that doesn't mean they can be trusted with our AWS credentials!)<p>To actually get dependencies for our software, we need two mechanisms:<p>- (a) Some way to precisely specify what we depend on<p>- (b) Some mechanism to fetch those dependencies<p>Many package managers (NPM, Maven, etc.) use a third-party server for both, e.g.<p>- (a) We depend on whatever npm.org returns when we ask for FOO<p>- (b) Fetch dependency FOO by attempting to HTTP GET <a href="https://npm.org/FOO" rel="nofollow">https://npm.org/FOO</a>; fail if it's not 200 OK<p>Delegating so much trust to an HTTP call isn't great; so there's an alternative approach based on "lock files":<p>- (a) We depend on the name FOO with this hash (usually 'trust on first use', where we find the hash by doing an initial HTTP GET, etc. and store the resulting hash)<p>- (b) Fetch dependency FOO by looking in these local folders, or checking out these git repos, or doing an HTTP GET against these caches, or against these mirrors, or leeching this torrent, etc. Fail if we can't find anything which matches our hash.<p>The interesting thing about using lock files and hashes is that our hash of dependency FOO depends on the contents of its lock file; and that content depends on the contents of FOO's dependencies, including <i>their</i> lock files; and so on.<p>Hence a lock file is a Merkle tree, which pins all of the transitive dependencies of a package: changing any of those dependencies (e.g. to update) requires altering all of the lock files in-between that dependency and our package. That, in turn, alters our lock file, and hence our package's hash.<p>The author is complaining that such dependency-cascades require a whole bunch of version numbers to get updated.
I think it's better to keep track of these things separately: use your version number as documentation of major/minor/patch changes; and keep track of dependency trees using a separate, cryptographically-secure hash. The thing is, we <i>already have</i> such hashes: they're called git commit IDs!<p>Other advantages of identifying transitive dependencies with hashes:<p>- They're not sequential. Our package isn't "out of date" just because we're using hash 1234 instead of 1235. All that matters are the version numbers. In other words, we're distinguishing between "real" updates (a version number changed) and "propagation" (version numbers stayed the same, but a dependency hash changed).<p>- They're unstructured; e.g. they give us no information about "major" versus "minor" changes, etc. (and hence no need to decide whether an update is one or the other!)<p>- They can be auto-generated; e.g. we might forget to update our version number, but there's no way we can forget to update our git commit ID!<p>- They're eventually-consistent: it doesn't matter how updates 'propagate' through each package; each sub-tree will converge to the same hash (NOTE: for this to work we must only take the content hash, not the full history like a git commit ID!).<p>For example, take the following ("diamond") dependency tree:<p><pre><code>                    +--> B --+
                    |        |
Our package --> A --+        +--> D
                    |        |
                    +--> C --+
</code></pre>
When D publishes a new version, B and C should update their lock files; then A should update its lock file; then we should update our lock file. However, this may happen in multiple ways:<p>- B and C update; A updates (getting new hashes from B and C)<p>- B updates; A updates; C updates; A updates<p>- C updates; A updates; B updates; A updates<p>Using version numbers (or git commit IDs!) would result in different A packages (one increment versus two increments; or commit IDs with different histories). Using content hashes will give A the same hash/lock file in all three cases. This also means we're free to propagate updates whenever we like, rather than waiting for things to 'stabilise'; and it's safe to use private forks/patches for propagating updates if we like, without fear of colliding version numbers.<p>Note that some of this propagation can be avoided if our build picks a single version of each dependency (e.g. Python requires this for entries in its site-packages directory; and Nixpkgs uses laziness and a fixed-point to defer choosing dependencies until the whole set of packages has been defined).
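<p>To make the convergence claim for the diamond concrete, here's a rough Python sketch. Everything in it is made up for illustration: the `Registry` class, the `package_hash` function, the "a-v1"-style source strings, and the choice of SHA-256 over a JSON dump are my assumptions, not how any real package manager works. Each package's hash covers its own source plus its lock file (the pinned hashes of its direct dependencies), and a per-package revision counter stands in for a history-sensitive identifier like a version increment.

```python
import hashlib
import json

# The diamond from the example: A depends on B and C, which both depend on D.
DEPS = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}


def package_hash(name, source, lock):
    """Content hash: covers the package's source and its lock file only,
    never the history of how the lock file got into its current state."""
    payload = json.dumps(
        {"name": name, "source": source, "lock": lock}, sort_keys=True
    )
    return hashlib.sha256(payload.encode()).hexdigest()


class Registry:
    """Toy model of a set of packages, each with a lock file (hypothetical)."""

    def __init__(self, sources):
        self.sources = dict(sources)
        self.locks = {name: {} for name in DEPS}
        self.revisions = {name: 0 for name in DEPS}
        # Pin everything bottom-up so each lock file starts out consistent.
        for name in ["D", "B", "C", "A"]:
            self.update(name)

    def hash_of(self, name):
        return package_hash(name, self.sources[name], self.locks[name])

    def update(self, name):
        """Re-pin a package's lock file to its dependencies' current hashes."""
        self.locks[name] = {dep: self.hash_of(dep) for dep in DEPS[name]}
        self.revisions[name] += 1  # history-sensitive, like a version bump


def propagate(order):
    """Publish a new D, then update lock files in the given order."""
    reg = Registry({"A": "a-v1", "B": "b-v1", "C": "c-v1", "D": "d-v1"})
    reg.sources["D"] = "d-v2"
    for name in order:
        reg.update(name)
    return reg.hash_of("A"), reg.revisions["A"]


# The three propagation orders listed above.
ORDERS = [
    ["B", "C", "A"],
    ["B", "A", "C", "A"],
    ["C", "A", "B", "A"],
]

for order in ORDERS:
    a_hash, a_revision = propagate(order)
    print(" -> ".join(order), a_hash[:12], "revision", a_revision)
```

<p>All three orders end with the same content hash for A, while A's revision counter differs between the first order and the other two. That's the distinction I'm drawing between content hashes and identifiers that encode history.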