I see a lot of people saying things like "this is why package signing is important" and "we need to know who the developers are" and "we need to audit everything." Some of that is true to some degree, but let me ask you this: why do we consider it acceptable that code you install through a package manager implicitly gets to do <i>anything</i> to your system that you can do? That seems silly! Surely we can do better than that?<p>This article from Agoric is extremely relevant here, from a previous such incident (re: the event-stream package on npm): <a href="https://medium.com/agoric/pola-would-have-prevented-the-event-stream-incident-45653ecbda99" rel="nofollow">https://medium.com/agoric/pola-would-have-prevented-the-even...</a><p>Put simply: in many cases, the dependencies you install don't need nearly as much authority as we give them right now. Maybe some of these packages need network access (I see a few named "logger" which might be shipping logs remotely) but do they need unrestricted filesystem access? Probably not! (They don't necessarily even need unrestricted network access either; what they're communicating with is likely pretty well-known.)
This doesn't surprise me. Horrified? Yes.<p>I've noticed more dev teams succumbing to the temptation of the convenience that many modern package managers provide (NPM, Cargo, Ivy, etc.) - especially as someone who has to work with offline systems on a regular basis.<p>Because of that ease there are fewer tools and tutorials out there to support offline package management. There are more for using caches, though these are often along the lines of either 'the package manager will do this for you and it just works (but in case it doesn't, delete node_modules or cargo clean and re-try)' or 'stand up a dependency server on your own machine with these proxy settings' (which has its own security issues and is frequently disallowed by IT cybersecurity policies).<p>As an example, many blog articles I found a while back suggest using yumdownloader from the yum-utils package. This is unfortunately not reliable, as some packages get skipped.<p>I have found I need to script reading a list of dependencies from a file, then for each dependency: create a directory for it and use repotrack to download its RPM and its transitive dependency RPMs into that directory. The script then aggregates all the RPMs into one directory, removes the RPMs the OS already provides, uses createrepo to turn that directory into an RPM repository, and finally makes a UDF ISO image out of the directory for transfer onto the offline system and installation.
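Roughly, the script looks like the sketch below (file names, paths and the ISO label are placeholders, and the step that filters out RPMs the base OS already provides is elided):<p><pre><code>#!/bin/bash
# Sketch of the offline bundling workflow described above; deps.txt lists one package name per line.
set -euo pipefail

mkdir -p rpms
while read -r dep; do
  mkdir -p "downloads/$dep"
  # repotrack fetches the RPM plus its transitive dependency RPMs
  repotrack -p "downloads/$dep" "$dep"
  cp "downloads/$dep"/*.rpm rpms/
done < deps.txt

# ...filter out RPMs the base OS already provides (omitted)...

# turn the directory into a yum-consumable repository
createrepo rpms/

# build a UDF ISO for transfer to the offline system
genisoimage -udf -R -J -V OFFLINE_REPO -o offline-repo.iso rpms/
</code></pre>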
I'm surprised the reverse fully-qualified domain name (FQDN) model used by Java isn't more widely adopted. If you want to upload artifacts to the main repository (Maven Central) you first need to show ownership of a particular domain, for example via a DNS TXT record (example [1]). That would make these kinds of attacks a lot more difficult.<p>[1] <a href="https://issues.sonatype.org/browse/OSSRH-61509" rel="nofollow">https://issues.sonatype.org/browse/OSSRH-61509</a>
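For illustration, the verification in [1] boils down to publishing something like this (domain and ticket number made up):<p><pre><code>; hypothetical DNS TXT record proving control of example.com for a Sonatype OSSRH ticket
example.com.  300  IN  TXT  "OSSRH-12345"
</code></pre>The idea being that publishing rights for the matching reverse-FQDN group id (com.example) are only granted once that record is visible.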
I’m cackling at how great this is. This is what happens when you trust the internet forever and just scarf down any old thing at build time. Of course it’ll get exploited! That’s what evil people do.
This post seems like a good time to note that by default, there's no direct way to verify that what you are downloading from dockerhub is the exact same thing that exists on dockerhub [1].<p>Discovered after seeing a comment on HN about a bill of materials for software, i.e., a list of "approved hashes" to ensure one can audit exactly what software is being installed, which in turn led me to this issue.<p>[1] - <a href="https://github.com/docker/hub-feedback/issues/1925" rel="nofollow">https://github.com/docker/hub-feedback/issues/1925</a>
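A partial mitigation at pull time is to pin images by digest instead of tag, so the daemon verifies the content hash of whatever it downloads (the digest below is a placeholder, not a real image):<p><pre><code># docker refuses the image if its content doesn't hash to the requested digest
docker pull alpine@sha256:0000000000000000000000000000000000000000000000000000000000000000
</code></pre>That still leaves the question of where the trusted digest comes from in the first place, which is the gap described above.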
Imagine we navigated the web using a command line tool called “goto” which works exactly like a package manager. If I want to open my bank’s site, I type “goto mybank”.<p>I could easily find myself in trouble, because:<p>- There’s no autocomplete or bookmarks, so typos are easy.<p>- If “mybank” is a name provided by my company’s name server, I could find myself redirected to the public “mybank” entry because Mr. Not-A-Hacker says his name entry is more up to date (or because I forgot to tell ‘goto’ to check the company name server.)<p>- There’s no “green padlock” to check while I’m actively using the destination site. (Though at this point it’s too late, because a few moments after I hit enter the destination site has the same access to my machine & network that I do from my current terminal.)<p>- A trusted site may later become malicious, which is bad due to the level of unrestricted and unmonitored access to my PC the site can have.<p>- Using scripting tricks, regular sandboxed browser websites can manipulate my clipboard so I paste something into ‘goto’ that I didn’t realize would be in my clipboard, making me navigate to some malicious site and giving it full access to my machine (if ‘sudo’ was added to the front).<p>These are just a few cases off the top of my head. If ‘goto’ were a real thing, we’d laugh it into being replaced by something more trustworthy.<p>How have current package managers not had these vulnerabilities fixed yet? I don’t understand.
<a href="https://security.googleblog.com/2021/02/know-prevent-fix-framework-for-shifting.html" rel="nofollow">https://security.googleblog.com/2021/02/know-prevent-fix-fra...</a><p>At Google, we have those resources and go to extraordinary lengths to manage the open source packages we use—including keeping a private repo of all open source packages we use internally
Ex-Amazon SDE here.<p>> a unique design flaw of the open-source ecosystems<p>This is a big generalization.<p>Inside Amazon, as well as in various Linux distributions, builds cannot access the network, and you can only use dependencies from OS packages.<p>Each library has its own package, and the code and licensing are reviewed. The only open source distribution that I know to have similarly strict requirements is Debian.<p>[I'm referring to the internal build system, not Amazon Linux]<p>[Disclaimer: things might have changed after I left the company]
This was inevitable from the moment we let build systems and runtime systems fetch things automatically and unsupervised from public repos. This is the simplest and most blatant approach yet, but taking ownership of existing projects and adding malicious code is an ongoing problem. Even deleting a public project can have the effect of a DoS attack.<p>When I first used Maven, I was appalled by how hard it was to prevent it from accessing Maven Central. And horrified to see Karaf trying to resolve jars from Maven Central at run time. What a horrible set of defaults. This behaviour should be opt-in, disabled by default, not opt-out through hard-to-discover and harder-to-verify configuration settings.
I'm flabbergasted by how silly this is. Bump the version number and the package manager chooses the public package over the private one. Amazing. How silly, and how expensive is this going to be as this blatant security issue ripples on for months to come?
Pulling packages down at build time seems ludicrous to me. I can understand it in a development environment, but I don't understand how "pull packages from the public internet and put them into our production codebase" got past any kind of robustness scrutiny.<p>I guess it's a case of the ease of use proving too great, so convenient in fact that we just kind of swept the implications under the rug.
It is insane that any company allowed this to happen.<p>""That said, we consider the root cause of this issue to be a design flaw (rather than a bug) in package managers that can be addressed only through reconfiguration," a Microsoft spokesperson said in the email."<p>No, npm has scopes for a reason; why would that not fix this issue?
It won't be just companies. It'll be developers, sysops, etc. who npm install a bazillion packages, because the core language and libraries are not enough. Those people have keys, credentials and access to the internal networks.
The article mentions that RubyGems is vulnerable to this, and that Shopify in particular downloaded and ran a gem named "shopify-cloud", but I'm curious as to how this is possible given a "normal" bundler pure-lockfile setup, or more generally the source-block directives I've seen in most Gemfiles.<p>That is, given a Gemfile.lock like, e.g.<p><pre><code> GIT
  remote: https://github.com/thoughtbot/appraisal
  revision: 5675d17a95cfe904cc4b19dfd3f1f4c6d54d3502
  specs:
    appraisal (2.1.0)
      bundler
      rake
      thor (>= 0.14.0)
</code></pre>
How would Bundler ever try and download the `appraisal` gem from RubyGems?<p>The Gemfile section is more explicable. While newer Gemfiles look like this:<p><pre><code> source "http://our.own.gem.repo.com/the/path/to/it" do
  gem 'gemfromourrepo'
end
# or
gem 'gemfromourrepo', source: "http://our.own.gem.repo.com/the/path/to/it"
</code></pre>
Older Gemfiles apparently looked like the following:<p><pre><code> source 'https://rubygems.org'
source 'http://our.own.gem.repo.com/the/path/to/it'
gem 'gemfromrubygems1'
gem 'gemfromrubygems2'
gem 'gemfromourrepo'
</code></pre>
Which seems obviously vulnerable to the dependency confusion issue mentioned.<p>So is the understanding that Shopify's CI systems were running `bundle update` or another non-lockfile operation? (Possibly as a greenkeeper-like cron job?) Or is `--pure-lockfile` itself more subtly vulnerable?
This attack demonstrates one of the problems outlined in the Nix thesis[0]: nominal dependencies. Dependencies of dependencies, build flags and so on - and in particular the source of a package - are not taken into account.<p>Nix makes it possible to query the <i>entire</i> build-time and runtime dependency graph of a package, and because network access during build time is disabled, such a substitution attack would be harder to pull off.<p>How a source is downloaded is specified declaratively and can be pinned to a specific commit of a specific Git repository, for instance.<p>[0] <a href="https://edolstra.github.io/pubs/phd-thesis.pdf" rel="nofollow">https://edolstra.github.io/pubs/phd-thesis.pdf</a>
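For illustration, a pinned source in Nix looks roughly like this (repository, revision and hash are placeholders):<p><pre><code># Nix refuses to build if the fetched source does not match sha256
src = fetchFromGitHub {
  owner = "example-org";
  repo = "example-lib";
  rev = "0123456789abcdef0123456789abcdef01234567";
  sha256 = "0000000000000000000000000000000000000000000000000000";
};
</code></pre>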
Why would you want your CI to depend on an external source? Say a legit upgrade happens, but it has a breaking change. Now your build is broken.<p>Pinning versions for as many things as you can (OS images, apt packages, Docker images, etc.) keeps changes to your CI under your control.<p>Sure, you have to upgrade manually or by a script. But isn't plain build stability worth it? Not even talking about security.
> The packages had preinstall scripts that automatically launched a script to exfiltrate identifying information from the machine as soon as the build process pulled the packages in.<p>Pre- and post-install scripts in NPM packages are such a terrible idea. Even when it’s not malware, it's usually just a nagging donation request with a deliberate “sleep 5” to slow down your build and keep the text displayed.
The real solution is to design and build software components that can be <i>finished</i>, so they can be ruthlessly vetted - rather than the endless churn of updates.
I don't understand why there is this issue. We publish our internal npm packages in the @company namespace and we own this namespace on the public npm registry. Problem solved, isn't it?
npm in particular has been problematic for a long time:<p><a href="https://naildrivin5.com/blog/2019/07/10/the-frightening-state-security-around-npm-package-management.html" rel="nofollow">https://naildrivin5.com/blog/2019/07/10/the-frightening-stat...</a><p><a href="https://techbeacon.com/security/check-your-dependencies-githubs-npm-finds-nasty-trojan-packages" rel="nofollow">https://techbeacon.com/security/check-your-dependencies-gith...</a><p><a href="https://thenewstack.io/npm-password-resets-show-developers-need-better-security-practices/" rel="nofollow">https://thenewstack.io/npm-password-resets-show-developers-n...</a>
> I have been fascinated by the level of trust we put in a simple command like this one<p>sigh... am I the only one that <i>likes</i> environments where you can run simple commands to install stuff and you can generally trust your package managers? All the security folks love to act dumbfounded when people trust things, but post-trust environments have <i>terrible</i> UX in my experience. I hate 2FA, for example, because now I have to tote my phone around at all times in order to be able to access any of my accounts. If I lose my phone or my phone is stolen while travelling, I'm hosed until I can figure out how to get back in.<p>> So can this blind trust be exploited by malicious actors?<p>Yes, it can. Trust can <i>always</i> be exploited by malicious actors, and no amount of software can change that. And it creates a world that sucks over time. Show me a post-trust, highly secure environment that isn't a major PITA to use. And not just for computers. I'm sure you could use social engineering to abuse the trust of customer service reps (or just people in general) and do bad things, and the end result will be a world where people are afraid to do any favors for other people because of the risk of getting burned by a "malicious actor".
Does this work with AOT-compiled languages? Surely the fake packages that get uploaded don't know enough about the structure of the internal libraries, so for something like Cargo this would just cause your build to suddenly fail mysteriously, which is easy to spot. A build.rs could probably do some damage to your build systems temporarily, for the 1 or 2 days (if not hours) it takes for engineers to track down what's happening.
What I don’t get from the article is the reasoning behind the design that the central repository “wins” over the local/override repository.<p>How was that design chosen, not just once but in all 3 of those large package ecosystems? Did pypi/gems/node borrow their design from each other, given their similarity in other aspects?<p>Are there any situations where this behavior is desired?<p>Do any of the other ecosystems have flaws like this (NuGet, Cargo..)?
I never understood why these package repositories don't include some (opt-in?) integrity checking option using digital signatures. If I download code that executes on my machine there should be at least the option to establish some level of trust. We have been doing that with linux distro package managers for decades. Seems like common sense to me.
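For what it's worth, RubyGems has had an opt-in signing mechanism for years; it's just rarely used because so few gems are signed. A sketch of how it's meant to work (the certificate file and gem name are placeholders):<p><pre><code># trust a publisher's certificate, then require valid signatures on everything installed
gem cert --add publisher-cert.pem
gem install some-gem -P HighSecurity
</code></pre>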
I've often wondered about this, even in the accidental case of someone registering a package you use internally.<p>And I know it's not perfect, but in Python, using Poetry means you get a poetry.lock file with package hashes built in, so that's something.
I teach and one of my students, with little IT experience, asked me last week about the security of package management. I found myself using the many eyeballs argument. It only takes one set of bad eyeballs.<p>It seems to me that down through the years ease of deployment trumps security. npm, mongodb, redis, k8s.<p>Or maybe sysadmin has just become outdated? Maybe front of house still needs a grumpy caretaker rather than your friendly devops with a foot in both camps.<p>We can now even outsource our security to some impersonal third-party so they can 'not' monitor our logs.<p>EOG # end of grump
Sadly I've had to fix this at more than one company.<p>It's a bit of cognitive dissonance having to explain why downloading random shit from the internet during the build is a bad idea, yet here we are.
Here's the application called deptrust I submitted to the Mozilla Builders program (didn't get in :P) to address this problem space before I had to focus more on my current job. Please let me know if there are any collaborators who would like to work on this together someday!<p><a href="https://docs.google.com/document/d/1EW6uSZB0_D0qZuDSGuxujuVEBkCbpWznv7gRcbFvUZs/edit" rel="nofollow">https://docs.google.com/document/d/1EW6uSZB0_D0qZuDSGuxujuVE...</a>
I know that node has `package-lock.json` and `yarn.lock`, which include integrity checks. Are these checks decorative only? How could npm have been affected by this issue?
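For reference, a lockfile entry records both where a package was resolved from and its content hash (name, version and hash below are made up):<p><pre><code>"acme-internal-utils": {
  "version": "1.2.3",
  "resolved": "https://registry.npmjs.org/acme-internal-utils/-/acme-internal-utils-1.2.3.tgz",
  "integrity": "sha512-<placeholder>"
}
</code></pre>My understanding is that the checks are real for an already-locked install (npm ci fails on a mismatch), but they can't help when the lockfile is first generated or regenerated: at that point the resolver happily records the attacker's higher-versioned public package, hash and all.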
The npm ecosystem already has the solution: use scoped names like @yelp/infra-js, where @yelp is the npm user or organization.<p>It's not possible for an attacker to publish under that scope on the public npm registry.
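The routing half of that lives in .npmrc (the internal registry URL here is made up):<p><pre><code># everything under the @yelp scope resolves only against the internal registry
@yelp:registry=https://npm.internal.example.com/
# everything else still comes from the public registry
registry=https://registry.npmjs.org/
</code></pre>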
These install hooks... Why are they needed at all, and why can't package (de)installation be free of side effects?<p>I'm sure the hooks are needed for things NPM can't do by itself, but they shouldn't run by default. That puts pressure on developers to avoid them, and puts pressure on NPM to add whatever functionality is missing from package.json in a safe way.<p>(and have npmjs.com search rank packages without scripts above those that do)
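For what it's worth, the opt-out already exists today, at the cost of breaking packages that genuinely need a build step:<p><pre><code># .npmrc: never run package lifecycle scripts automatically
ignore-scripts=true
</code></pre>The same thing is available per-invocation as npm install --ignore-scripts.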
I have to build some CSS libraries that sadly use npm for building. The way I approach this is through rubber gloves: I create custom docker containers with npm and a specific set of dependencies, frozen in time. This way I can at least get reproducible and reliable builds.<p>This doesn't mean I'm not vulnerable to dependency attacks, but it at least limits the window, because I update these dependencies very, very rarely.
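The container is roughly this shape (base image tag and build command are placeholders):<p><pre><code># "Rubber gloves": dependencies are whatever the lockfile says, nothing newer.
FROM node:14-alpine
WORKDIR /build
# copy only the manifest and lockfile first so the dependency layer is cached and frozen
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
</code></pre>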
To mitigate this kind of supply chain attack for Python, we created the following tool [1], which checks the Python packages on the Artifactory instance you specify and creates packages with the same names on PyPI.<p>[1] <a href="https://github.com/pan-net-security/artifactory-pypi-scanner" rel="nofollow">https://github.com/pan-net-security/artifactory-pypi-scanner</a>
Like some other commenters, I too initially balked at the apparent misuse of "supply chain attack" but the linked paper provides a good definition,<p><i>A software supply chain attack is characterized by the injection of malicious code into a software package in order to compromise dependent systems further down the chain.</i><p>Backstabber’s Knife Collection: A Review of Open Source Software Supply Chain Attacks<p><a href="https://link.springer.com/chapter/10.1007%2F978-3-030-52683-2_2" rel="nofollow">https://link.springer.com/chapter/10.1007%2F978-3-030-52683-...</a><p>To be clear, just calling this a "supply chain attack" and omitting "software" is going to cause confusion with traditional supply chains.<p>The analogy is not quite apt: in a software build system you have complete visibility into the dependency tree, so this attack is less useful, whereas with hardware suppliers you are relying on the security of your vendor.
This seems to be tending towards the generic problem of permissions that we have seen previously elsewhere.<p>For example in the case of Facebook, it used to be that users would accept permissions without considering them, and in turn various apps would access their data in bad faith.<p>Likewise for mobile apps.<p>Eventually Facebook removed many of the overtly powerful permissions entirely, likewise with the mobile operating systems.<p>In the case of mobile, the concept of "runtime permissions" was also introduced, requiring explicit approval to be granted at the time of authorization.<p>On Android, location access now prompts the user in the notification area, informing the user of an app that accessed their location.<p>Can some of these ideas be borrowed by the package/dependency management world? "The package you are about to install requires access to your hard drive, including the following folders: x/y/z. Allow?"
This is both a security bug and a reproducibility bug. If anyone outside your network can break your build, your build is broken! It's mission critical to have a working build.<p>The way Nix handles this is that every external resource is cached and hashed, and every reference to an external resource must have a hash integrity check. If someone swaps out a package on a web server somewhere, rebuilds keep working because they don't need to re-fetch (because the hash wasn't changed by an operator), and fresh builds fail with an error indicating the hash is invalid, which should trigger an investigation (in practice, this is exceedingly rare, and IMO always deserves attention).<p>I dream of the day when build reproducibility is considered table stakes, like version control.
I think JFrog and Azure won the prize for product placement on this one. When the article listed “Azure Artifactory” I wondered if Azure was “sherlocking” JFrog, but no, they have a partnership. Given the SolarWinds vector I expect more investment in tooling security.
The upstream article was posted yesterday, here:<p>"Dependency Confusion: RCE via internal package name squatting" <a href="https://news.ycombinator.com/item?id=26081149" rel="nofollow">https://news.ycombinator.com/item?id=26081149</a><p>"Dependency Confusion: How I Hacked Into Apple, Microsoft and Dozens of Other Companies - The Story of a Novel Supply Chain Attack", Alex Birsan: <a href="https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610" rel="nofollow">https://medium.com/@alex.birsan/dependency-confusion-4a5d60f...</a>
PGP signing of packages should be table stakes for publishing to a public repository. If unsigned packages are accepted by a public repository to reduce friction for newbies, such packages should be hidden by default.<p>Then, build tools should be configurable such that they only pull in dependencies signed by PGP keys drawn from a whitelist.<p>Finally, companies need to maintain private repositories of vetted dependencies and avoid pulling from public repositories by default — and this requirement needs to be configurable from the project's build spec and captured in version control.
For npm enterprise, it looks like setting the scope (e.g. @acmecorp/internal-pkg) would mitigate the public/private confusion. For Verdaccio, an open source lightweight npm registry, it first checks whether a private package is available before searching the public npm registry (however, their best practices say to use a prefix for private packages <a href="https://verdaccio.org/docs/en/best" rel="nofollow">https://verdaccio.org/docs/en/best</a> )
I don't use npm much, but once I'm out of the initial development phase with any package manager and am "feature complete" we generally lock versions down so at least we're always pulling a specific version in.<p>And, of course, on production build machines, all packages are local.<p>This isn't just for "security" -- it's to ensure we can always build the same bits we shipped, and to avoid any surprises when something has a legitimate update that breaks something else.
My favorite supply chain attack is still the chip vendors. Even if you come up with a hardware security module in your chip to verify the code that's running on it, that can be (and has been) hacked too. Sleeping dragons could be lying in wait in billions of devices and nobody would know unless they went out of their way to do a low-level analysis.
I've been wishing npm/pypi/apt etc would improve for ages, but it seems like infrastructure improves one disaster at a time, software one hack at a time. I'm only annoyed I didn't do it myself.<p>The pypi maintainer is being ridiculous, it is much better to have this guy poke MSFT than have the Russians do it, he's doing them a favour.
The only really shocking part of this is that Artifactory is vulnerable to this. I expect developers to be lazy about build security because I've seen it over and over again at multiple companies, but Artifactory's whole purpose is to provide secure build dependency management.<p>I'll be rethinking using Artifactory in my infrastructure.
I used this version trick in NuGet, but the other way around: to update existing unmaintained public packages, mostly because they were on .NET Framework and a lot moved to .NET Core.<p>In Visual Studio you can set the priority of where packages are checked. My own package repo has a higher priority.<p>I never thought about using it as an attack vector, though.
Does NPM offer cryptographic hash pinning of packages the way that PyPI does?* Why is this not more widely used?<p>* <a href="https://flawed.net.nz/2021/02/02/PyPI-Security-State/" rel="nofollow">https://flawed.net.nz/2021/02/02/PyPI-Security-State/</a>
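For comparison, a hash-pinned requirements file looks like this (the digest is a placeholder); run with pip install --require-hashes, pip then refuses anything unpinned or unhashed:<p><pre><code># requirements.txt -- the digest below is a placeholder, not the real one
requests==2.25.1 \
    --hash=sha256:0000000000000000000000000000000000000000000000000000000000000000
</code></pre>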
Diffend allows you to manage the risks that come with using open-source third party dependencies by providing malware detecting security scanning and a risk management platform for your Ruby dependencies.<p><a href="https://diffend.io/" rel="nofollow">https://diffend.io/</a>
This brings a whole new level of awareness to package files, where a simple typo can mean your machine gets rooted. From now on I'll always be terrified whenever changing any of my <i>package.json</i>, <i>Gemfile</i> or <i>requirements.txt</i> files.
Why didn't npmjs/rubygems just check failed lookup requests for "shopify-cloud" etc., block those for a while to prevent damage, and notify the companies (doing their best)? Seems like a low-hanging solution.
It surprises me a bit the way they refer to in-house dependencies purely by version number. When we have internal dependencies in e.g. package.json, it's always referred to by an explicit url and git ref.
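i.e. something along these lines (host, path and tag are made up):<p><pre><code>{
  "name": "example-app",
  "dependencies": {
    "internal-utils": "git+ssh://git@git.internal.example.com/platform/internal-utils.git#v1.4.2"
  }
}
</code></pre>A name that only ever resolves to an explicit host and git ref has no public-registry counterpart for an attacker to squat.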
> After spending an hour on taking down these packages, Ingram stressed that uploading illicit packages on PyPI puts an undue burden on the volunteers who maintain PyPI.<p>I dunno, feels like fair game to me
And still people won’t vendor their dependencies, so changes to dependencies are never reviewed.<p>To paraphrase family guy: you’re making this harder than it needs to be.
This shouldn't be a problem with golang, right? Because a module is identified by its full import path - host included - when go mod is used. I'm rusty on Go since I haven't used it in over 2 years, but I believe this shouldn't affect it?
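For illustration, module identity in Go is the full, host-qualified import path (paths below are made up):<p><pre><code>// go.mod -- the module and its internal dependency are addressed by host-qualified paths
module git.internal.example.com/platform/service

require git.internal.example.com/platform/utils v1.2.3
</code></pre>The remaining wrinkle is the public module proxy and checksum database; setting GOPRIVATE=git.internal.example.com/* tells the go tool not to consult them for those paths.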
Luckily I reserved my company's namespace on Packagist a few months ago. Each package manager works differently, and it is hard to know the inner workings of all of them.
We need a blockchain for source. It is obvious and we just haven't come to terms with it yet. Then anyone can run anything provided they have the right key.