Using Alpine with Python in production is usually a mistake. Python is considerably slower with musl libc than with glibc, and on top of that no one provides prebuilt wheels for musl-based platforms, so you need to drag in the entire build toolchain to package most projects. Who wants to build numpy from scratch as part of their build pipeline? And having a C compiler present in the image is not great, as it makes it easier for an attacker to escalate an exploit.<p>Anyway, most of these security scanners are bunkum:<p><a href="https://pythonspeed.com/articles/docker-security-scanner/" rel="nofollow">https://pythonspeed.com/articles/docker-security-scanner/</a><p>There are not, in fact, a huge number of known and unpatched vulnerabilities in Debian stable. If your tool is finding numerous problems, it is time to look more closely at your tool.
Sure, a slimmer image fixes part of the problem because it reduces the attack surface, but that's not really the issue.<p>Somehow everyone just assumes that because something is in a container, it won't need patching. What really happened is that we moved the patch management responsibility from operations to development. The developers just didn't notice.<p>One issue could be that it breaks many people's mental model of containers. Container images are frequently used as a packaged-up "works on my machine": whatever ran locally is just bundled up and shipped.<p>It's even broken on Docker Hub. What's the point of a python:3.9 image when it's never actually updated? Developers base their own Dockerfiles on these base images, but often forget that they need to add an OS update step. I don't understand why images, like the python ones, aren't continuously updated centrally.
First, the author is being fooled by false positives due to bad scanner settings. Basically, there are a huge number of CVEs that are meaningless, and that some distros have closed in the CVE databases while others have not. The result is a spew of "OMG LOOK AT THIS BADNESS" which security scanner vendors like because it makes them look useful, but is actually just noise.<p>Long version: <a href="https://pythonspeed.com/articles/docker-security-scanner/" rel="nofollow">https://pythonspeed.com/articles/docker-security-scanner/</a><p>Second, the author isn't installing security updates! You need to run `apt-get upgrade`, even for official images.<p>Long version: <a href="https://pythonspeed.com/articles/security-updates-in-docker/" rel="nofollow">https://pythonspeed.com/articles/security-updates-in-docker/</a><p>Third, Alpine has some issues in some cases, although for Go, at least, neither of these issues is usually relevant, so Alpine is fine.<p>1. musl can be subtly incompatible with some applications, with annoying bugs. Personal experience: if you were using minikube in a WeWork office, Alpine-based (or really, musl-based) containers would fail to resolve DNS inside Kubernetes due to a concatenation of circumstances that was mostly the fault of WeWork's ops team but which glibc handled better than musl. The problem has since been fixed by WeWork, AFAIK.<p>2. For Python specifically, binary precompiled packages (wheels) won't work on Alpine, which means you have to recompile the whole universe, which means container builds are slow. There's a PEP which might get this fixed, but for now, not worth it. <a href="https://pythonspeed.com/articles/alpine-docker-python/" rel="nofollow">https://pythonspeed.com/articles/alpine-docker-python/</a>
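A minimal sketch of the second point (installing security updates), assuming a Debian-based official image. The `python:3.9-slim` tag and the `app.py` file name are only illustrative, not taken from the article:

```dockerfile
# Assumed base image; any Debian-based official image works the same way.
FROM python:3.9-slim

# Pull in security updates published since the base image was built,
# then drop the package lists so they do not bloat the layer.
RUN apt-get update \
    && apt-get -y upgrade \
    && rm -rf /var/lib/apt/lists/*

# Hypothetical application file.
COPY app.py /app/app.py
CMD ["python", "/app/app.py"]
```

Because Docker caches the `RUN` layer, the upgrade only picks up newer packages when the image is rebuilt without the cache, for example with `docker build --pull --no-cache`.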
I'm all for smaller images but his final distroless Python image to run Flask is going to fall apart as soon as his Flask app needs to connect to a database.<p>That's because the official Python PostgreSQL DB package requires C dependencies which get built and referenced when you install the pg package. You also need certain system libraries to exist on your system in order to build them, such as libpq-dev on a Debian based system.
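To illustrate the C dependency, here is a rough sketch of a multi-stage build on a Debian-based image rather than distroless. It assumes the package in question is `psycopg2` built from source; the image tag and the `/opt/venv` path are assumptions for the example:

```dockerfile
# Build stage: the compiler and libpq headers are only needed here.
FROM python:3.9-slim AS build
RUN apt-get update \
    && apt-get install -y --no-install-recommends gcc libc6-dev libpq-dev \
    && rm -rf /var/lib/apt/lists/*
RUN python -m venv /opt/venv
# Compiles the C extension against the libpq headers installed above.
RUN /opt/venv/bin/pip install psycopg2

# Runtime stage: only the shared library the extension links against.
FROM python:3.9-slim
RUN apt-get update \
    && apt-get install -y --no-install-recommends libpq5 \
    && rm -rf /var/lib/apt/lists/*
COPY --from=build /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
```

A distroless final stage would additionally need the libpq shared library and everything it links against copied in by hand, which is where it gets painful.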
Luckily most of those vulnerabilities will be dormant, impossible to reach and exploit. Hopefully.<p>I don't think there's going to be any change in how people package containers. So perhaps what is needed is a dependency-walking Link-Time-Optimization-like tool that can trim down, perhaps by masking rather than removing, the dead code?
One thing to watch out for, when using container scanning tools, is how they handle "unfixed" vulnerabilities in images based on Debian/Ubuntu.<p>Both of those distros maintain a list of CVEs that they know of but don't have a patch for. Traditional VA tools (e.g. Nessus) default to not flagging those, but a lot of container scanning tools will default to showing them, so you end up seeing wildly different results. (some more details <a href="https://raesene.github.io/blog/2020/11/22/When_Is_A_Vulnerability_Not_A_Vulnerability/" rel="nofollow">https://raesene.github.io/blog/2020/11/22/When_Is_A_Vulnerab...</a>)<p>Whether you consider this a problem is, of course, dependent on your threat model, but it's one to consider.<p>(full disclosure, I work for a company that makes Trivy, but not on that project :) )
I am someone who does not use containers, and every time I try to get into it (knowing how popular they are now), I just get... stuck. In feelings of wrongness, among other things.<p>One of which is that I really don't understand the "security story" with regard to patches for vulnerabilities etc. How one is meant to know when a patch is required and how one applies it, what the "conventional" or "best practice" workflow for this is.<p>This article is making me think maybe there's nothing I'm missing....?
I'm struggling with this now too. It gets even crazier when you have dozens of microservices, each in their own container.<p>Imagine being mandated by InfoSec to scan ~24 images or ~10 GB every release.<p>The images are a mixture of python services and some upstream images like redis and mysql. If anyone has an idea on how to make this less painful, I'm all ears.
<p><pre><code> FROM scratch
COPY hello /
CMD ["/hello"]
</code></pre>
This just raises the question of what the point of containerisation is for applications with no dependencies.
<p><pre><code> FROM scratch
COPY some_golang_or_rust_program_without_dynlibs /
CMD ["/some_golang_or_rust_program_without_dynlibs"]
</code></pre>
There you go. I've never actually tried it, but people here have said it should work.
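For what it's worth, a sketch of how such a binary could be built and shipped in one Dockerfile, assuming a Go module sits in the build context. The `golang:1.16` tag and the `/app` name are placeholders:

```dockerfile
# Build stage: compile a statically linked binary.
FROM golang:1.16 AS build
WORKDIR /src
COPY . .
# CGO_ENABLED=0 avoids linking against libc, so no dynamic libraries are needed.
RUN CGO_ENABLED=0 go build -o /app .

# Final stage: nothing but the binary.
FROM scratch
COPY --from=build /app /app
# Exec form (JSON array with double quotes) is required: scratch has no shell.
CMD ["/app"]
```

Note that `scratch` has no CA certificates or timezone data either, so a program that makes TLS connections would need those copied in as well.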
I don’t need slimmer containers, I need slimmer VMs.<p>Honest question: I’m using Vagrant and VirtualBox to reproduce my cloud infrastructure locally. I can’t do this just with containers (some of my servers do not run containers, so I install packages via Ansible and apt, I use systemd, etc.). How do you reproduce infrastructure locally with just containers?
I think most would be served well by using Alpine images. In my experience their quality is quite high. Unfortunately as mentioned, musl is a non-starter for a number of uses.