I've been writing containers for 10+ years, and these last few years I've increasingly been using supervisord as PID 1 to manage multiple processes inside one container, mostly for stacks that can't function as disparate microservices when one piece fails, gets updated, etc.

And man, I love it. It's totally against the twelve-factor, one-process-per-container rules and should NOT be done in most cases, but for troubleshooting it's great: I can exec into a container anywhere and restart services, because supervisord sits there watching for a service (say, mysql) to exit and immediately restarts it. And because supervisord is PID 1, as long as it never dies, your container doesn't die. You get the benefits of containers and servers without the pain of either, like having to re-image/snapshot a server once you've thoroughly broken it, versus just restarting a container. I can sit there for hours editing .conf files trying to get something to work without ever touching my Dockerfile/scripts or restarting the container. (A rough sketch of the setup is at the bottom of this comment.)

I don't have to make a change, update the entrypoint/Dockerfile, push a build, pull the new image, deploy it, exec in...

I can sit there and restart mysql, postgres, redis, zookeeper as much as I want until I figure out everything I need in one go, then update my scripts/Dockerfiles, and THEN prepare the actual production infra, where it is split into microservices for reliability, scaling, etc.

I've written a ton of these for our QA teams so they can hop into one container and test/break/QA/upgrade/downgrade everything super quick. It doesn't give you FULL e2e coverage, but it's not like we'd stop doing the tests we already run now.

I mention this because it's something I did once a long, long time ago and then completely forgot was even an option, until I recently went that route again and found it really does have some useful scenarios.

https://gdevillele.github.io/engine/admin/using_supervisord/

I'm also really tired of super tiny containers that are absolute nightmares to troubleshoot when you need to. I work on prod infra, so I need to get things back online immediately when a fire is happening, and having to attach debug containers or manually install packages just to troubleshoot is a show stopper. I know extra packages are "attack vectors," but I have a vetted list of aliases, bash profiles, and troubleshooting tools like jq, mtr, etc. that get installed in every non-scratch container. My containers are all standardized with the exact same tools, logs, paths, etc., so everyone hopping into one knows what they can do.

If you're migrating your architecture to ARM64, those containers spin up SO fast that the extra 150-200 MB of packages for a sane system to work in while a fire is burning under you is worth it. At some scale the cross-datacenter/cluster/region image replication would be problematic, but you SHOULD have a container image caching proxy in front of EVERY cluster anyway, or at least per datacenter/rack. It can even be a container ON your clusters with its storage volume backed by a single Ceph cluster, etc.
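For anyone who hasn't run this setup, here's roughly what the supervisord-as-PID-1 arrangement looks like. This is a minimal sketch, assuming an Ubuntu base image and stock apt package names; the program commands, paths, and log locations are illustrative, not my real ones:

    ; supervisord.conf (sketch)
    [supervisord]
    nodaemon=true                     ; stay in the foreground so supervisord remains PID 1
    logfile=/var/log/supervisord.log

    [unix_http_server]
    file=/var/run/supervisor.sock     ; lets supervisorctl talk to supervisord

    [rpcinterface:supervisor]
    supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

    [supervisorctl]
    serverurl=unix:///var/run/supervisor.sock

    [program:mysql]
    command=/usr/sbin/mysqld --user=mysql
    autostart=true
    autorestart=true                  ; restart immediately whenever mysqld exits
    stdout_logfile=/var/log/mysql.out.log
    stderr_logfile=/var/log/mysql.err.log

    [program:redis]
    command=/usr/bin/redis-server
    autostart=true
    autorestart=true

    # Dockerfile (also a sketch)
    FROM ubuntu:22.04
    RUN apt-get update && apt-get install -y supervisor mysql-server redis-server \
        && rm -rf /var/lib/apt/lists/*
    COPY supervisord.conf /etc/supervisord.conf
    # supervisord is PID 1: the container lives as long as supervisord does,
    # even while the services inside it are stopped, broken, and restarted.
    ENTRYPOINT ["/usr/bin/supervisord", "-c", "/etc/supervisord.conf"]

Restarting something mid-debug is then just docker exec -it <container> supervisorctl restart mysql (or supervisorctl status to see what's running), and the container itself never goes anywhere.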
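On the fat-but-debuggable image point, the baseline tooling layer is only a few extra Dockerfile lines. The package list below is an example for a Debian/Ubuntu base, not my actual vetted set, and the aliases file name is made up:

    # Shared troubleshooting layer baked into every non-scratch image
    # (tool list is illustrative; swap in whatever your team has vetted)
    RUN apt-get update && apt-get install -y --no-install-recommends \
            jq mtr-tiny curl dnsutils netcat-openbsd procps strace tcpdump less vim-tiny \
        && rm -rf /var/lib/apt/lists/*
    # Standardized aliases/prompt so every container feels the same when you exec in
    # (debug-aliases.sh is a placeholder name for your own profile script)
    COPY debug-aliases.sh /etc/profile.d/debug-aliases.sh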
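And for the caching proxy in front of each cluster, one way to sketch it (assuming the stock registry:2 image in pull-through-cache mode; plenty of other proxies work too) is a registry container whose storage sits on whatever volume you like, Ceph-backed or otherwise, with each node's daemon pointed at it as a mirror:

    # config.yml for a registry:2 container acting as a pull-through cache
    version: 0.1
    storage:
      filesystem:
        rootdirectory: /var/lib/registry   # e.g. a Ceph-backed volume
    http:
      addr: :5000
    proxy:
      remoteurl: https://registry-1.docker.io

    # /etc/docker/daemon.json on each node (hostname is hypothetical)
    {
      "registry-mirrors": ["http://registry-cache.internal:5000"]
    }

Note that the daemon-level registry-mirrors setting only kicks in for Docker Hub pulls; caching a private registry means pointing the proxy's remoteurl at it and addressing the cache explicitly.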