1.5.29-rc2 was tagged on 9 Nov 2023 [1] and, as an example, did not contain "N_("CJK Unified Ideographs Extension I")," in src/ibusunicodegen.h [2].<p>Commit 228f0a77b2047ade54e132bab69c0c03f0f41aae from 28 Feb 2023 introduced this change instead [3]. The same person tagged 1.5.29-rc2 and committed 228f0a77b2047ade54e132bab69c0c03f0f41aae, which is typically an indication that the maintainer tarred their checked-out git folder and accidentally included changes that had not yet been committed.<p>The question raised is whether anyone audits these differences before the checksummed tarballs are added to package repositories.<p>[1] <a href="https://github.com/ibus/ibus/releases/tag/1.5.29-rc2">https://github.com/ibus/ibus/releases/tag/1.5.29-rc2</a><p>[2] <a href="https://github.com/ibus/ibus/blob/0ad8e77bd36545974ad8acd0a5283cf72bc7c8ad/src/ibusunicodegen.h">https://github.com/ibus/ibus/blob/0ad8e77bd36545974ad8acd0a5...</a><p>[3] <a href="https://github.com/ibus/ibus/commit/228f0a77b2047ade54e132bab69c0c03f0f41aae">https://github.com/ibus/ibus/commit/228f0a77b2047ade54e132ba...</a>
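A rough sketch of the kind of audit being asked for: unpack the release tarball next to a checkout of the tag, then diff the two trees by per-file SHA-256. This is a minimal illustration, not ibus's actual process; the directory names and file contents below are hypothetical stand-ins.

```python
import hashlib
import os
import pathlib
import tempfile

def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            with open(path, "rb") as f:
                digests[rel] = hashlib.sha256(f.read()).hexdigest()
    return digests

def diff_trees(checkout_dir, tarball_dir):
    """Report files added, removed, or changed between a git checkout
    of the release tag and the unpacked release tarball."""
    a, b = tree_digests(checkout_dir), tree_digests(tarball_dir)
    return {
        "only_in_checkout": sorted(a.keys() - b.keys()),
        "only_in_tarball": sorted(b.keys() - a.keys()),
        "changed": sorted(p for p in a.keys() & b.keys() if a[p] != b[p]),
    }

# Demo on two throwaway trees standing in for `git checkout 1.5.29-rc2`
# and the unpacked release tarball.
with tempfile.TemporaryDirectory() as tmp:
    checkout = pathlib.Path(tmp, "checkout")
    tarball = pathlib.Path(tmp, "tarball")
    checkout.mkdir()
    tarball.mkdir()
    (checkout / "ibusunicodegen.h").write_text("/* generated from the tag */\n")
    (tarball / "ibusunicodegen.h").write_text("/* generated, plus uncommitted edits */\n")
    report = diff_trees(checkout, tarball)

print(report["changed"])  # the files a release auditor would want to inspect
```

Anything in "changed" or "only_in_tarball" is exactly the class of discrepancy described above: content in the shipped tarball that no commit on the tag accounts for.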
I've enabled what's called "trusted publishing" for Python packages (publishing to the cheeseshop/PyPI).<p>However, what they call trusted publishing is just a configuration where PyPI is told that uploads coming from a particular workflow name on a particular repository are OK, without further tokens. So PyPI "trusts" GitHub Actions and the maintainer is out of the loop.<p>All good? Well, if you trust GitHub!<p>It would be a lot better to me if both the maintainer and GitHub were involved: something like the maintainer signing off on some artifacts, the GitHub action verifying they are exactly reproducible by a workflow, and only <i>then</i> publishing.
Why is this a thing? Can't packages be built from specific tags in the git repo? It seems so incredibly stupid to allow this, throwing out all of the "oh, but it's open source, you can review it" arguments in one go if the source displayed on GitHub is not what actually ends up being used...
This has <i>really</i> been bugging me about npm.<p>Anyone can publish an open source repo and add it to npmjs. Users going to the page on npmjs will see that the repo with the code is github.com/myrepo.<p>But when I do `npm i myrepo` there is no guarantee that what is being pulled is in any way similar to what is in the linked repo. This creates a false feeling that the code <i>could</i> be reviewed.<p>At the very least, Github should not allow this for code on their platform (ie, if npmjs has it listed as the repo of a project, they should check that the project actually builds to the published content, or notify npm, who should have it flagged). Or npmjs should regularly do the same scan, checking that the code you would get by building the repo matches the code being offered.<p>Bear in mind that if a bad actor can get in at the level of the npm user, even if that user is not running with elevated privileges (which it often is, since you need to be a superuser to listen on port 80 or to read SSL certificates, and PM2 with SSL handling is way above the ability of many devs, sigh), they can scan for vulnerabilities and perhaps open themselves a very big hole.
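For concreteness, the integrity guarantee npm <i>does</i> give is narrow: package-lock.json records a Subresource-Integrity-style hash ("sha512-" plus base64) of the registry tarball, which only proves you received the registry's bytes, not that those bytes correspond to the linked repo. A minimal sketch of computing that hash (the tarball here is a hypothetical stand-in):

```python
import base64
import hashlib
import pathlib
import tempfile

def sri_sha512(path):
    """Compute a Subresource-Integrity-style hash ("sha512-<base64>"),
    the format npm records per dependency in package-lock.json."""
    digest = hashlib.sha512(pathlib.Path(path).read_bytes()).digest()
    return "sha512-" + base64.b64encode(digest).decode()

# Demo on a stand-in for a tarball downloaded from the registry.
with tempfile.NamedTemporaryFile(delete=False, suffix=".tgz") as f:
    f.write(b"pretend this is myrepo-1.0.0.tgz from the registry")
    tarball_path = f.name

integrity = sri_sha512(tarball_path)
print(integrity)
```

If this value matches the lockfile you know the download wasn't tampered with in transit; the gap the comment describes is that nothing ties this hash back to any commit in github.com/myrepo.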
What makes maintainers of major distros still rely on questionable tarballs to build packages? It's not as if these essential programs don't have a public, authoritative git repository.<p>Is it because of inertia, because we've been using tarballs since before VCS was a thing? Is it to reduce the burden of package maintainership, by letting upstream do most of the transpiling and autotools plumbing work? Is it because some assets and test data are only included in the tarballs? Why are they not committed, either to the same repo or (if upstream wishes to keep the main repo small) some other repo?<p>People have been calling for reproducible builds for years now, but reproducible builds don't mean anything if they are not built from authoritative sources with immutable commit IDs.
I've noted similar vulnerabilities in every common package management system, including NuGet, Cargo, and NPM. There are giant gaps in security a truck could be driven through.<p>Collectively, we <i>must</i> start taking the "bill of materials" that goes into our software much more seriously, including the chain of custody from source to binary.<p>Start with linking each package directly to the Git commit hash it came from.<p>Then enforce reproducible builds. Don't trust the uploader to package their code. Run this step <i>in the package management system</i> from the provided source hash.<p>Pure functional build systems would be a heck of a lot better, with no side effects of any kind permitted. No arbitrary scripts or arbitrary code during build. No file reads, network access, or <i>API calls of any kind</i> outside of the compiler toolkit itself.<p>Then, I would like to see packages go through specially instrumented compilers that generate an "ingredient label" for each package. Transitive dependencies. Uses of unsafe constructs in otherwise safe languages. Lists of system calls used. Etc...<p>You're about to say that clever hackers can work around that by using dynamic calling, or call-by-name, or some other clever trick.<p>Sure, that's possible, but then the system can <i>report that</i>. If some random image codec or compression library uses any platform API <i>at all</i>, then it's instantly suspect. These things should be as pure as the driven snow: <i>Bytes in, decoded data out.</i>
We <i>could</i> create and hash a tarball from git and compare it to the released tarball, even if it wasn't originally signed. We could even use Merkle trees (perhaps combined with find) to ensure that individual files were unchanged.<p>At least then we could verify that the tarball was derived from that exact git commit.
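The Merkle-tree idea can be sketched in a few lines: hash each (path, content) pair into a leaf, then fold adjacent pairs level by level into a single root. Two trees that differ in any one file produce different roots, and the per-file leaves pinpoint which file changed. This is a toy illustration with hypothetical file contents, not any project's actual scheme:

```python
import hashlib

def merkle_root(leaf_hashes):
    """Fold a list of hex digests into one Merkle root by hashing
    adjacent pairs level by level (odd levels duplicate the last node)."""
    level = list(leaf_hashes)
    if not level:
        return hashlib.sha256(b"").hexdigest()
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [
            hashlib.sha256((level[i] + level[i + 1]).encode()).hexdigest()
            for i in range(0, len(level), 2)
        ]
    return level[0]

def file_leaf(name, data):
    """Leaf = hash over (path NUL content), so renames change the root too."""
    return hashlib.sha256(name.encode() + b"\0" + data).hexdigest()

# Roots for a pristine source tree and one with a single modified file.
pristine = [file_leaf("configure.ac", b"AC_INIT...\n"),
            file_leaf("src/ibusunicodegen.h", b"/* generated */\n")]
tampered = [file_leaf("configure.ac", b"AC_INIT...\n"),
            file_leaf("src/ibusunicodegen.h", b"/* generated + extra */\n")]

root_a, root_b = merkle_root(pristine), merkle_root(tampered)
print(root_a != root_b)  # the roots diverge because one leaf changed
```

In practice `git archive` output for a tag plus a hash like this would let anyone check a release tarball against the commit without trusting the uploader.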
so glad pypi actively removed support for package signatures and even sends you an annoying email if your scripts dare to actually upload a signature anyway
Does Nix/Guix solve this?<p>I have been skeptical of "rewrite everything into rust", but... maybe we should at least rewrite everything into Nix?