Original author from 5 years ago. Surprised to see this here 5 years later.<p>Docker really used to crash a lot back in the day, mostly due to buggy storage drivers. If you were on Debian or CentOS, it's very likely that you experienced crashes (though a lot of developers didn't care, or didn't understand why the system went unresponsive).<p>Notably, a new version of Debian (with a newer kernel) was published the year after my experience. It's a lot more stable now.<p>My experience is that by 2018-2019, Docker had mostly vanished as a buzzword; people were only talking about Kubernetes and looking for Kubernetes experience.<p>edit: at that time Docker didn't have a way to clear images/containers; it was added after the article and the follow-up articles. I will never know if it was a coincidence, but I like to think there is a link. I think writing the article was worth it if only for this reason.
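For anyone landing here now, the cleanup that was missing back then looks roughly like this - a minimal sketch, assuming a reasonably recent Docker CLI:

    # remove dangling images only
    docker image prune
    # remove stopped containers, unused networks, dangling images and build cache
    docker system prune
    # also remove any image not referenced by a container
    docker system prune -a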
Even in 2016, I had been running production services in Docker successfully. It's interesting to me that they see the problem ("Docker isn't designed to store data") without also seeing the solution ("the Docker copy-on-write filesystem isn't designed for production writes, but volume mounts are"). I hadn't seen Docker crash hosts (still haven't), but I'm guessing their crashes were caused by the storage drivers.<p>The complaints about their development practices are valid (and haven't really improved), but even then the technology worked well so long as you understood its limitations.
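A minimal sketch of that distinction, with hypothetical names and paths - the database files live on a volume or bind mount, not on the container's copy-on-write layer:

    # named volume managed by Docker, outside the image layers
    docker volume create pgdata
    docker run -d --name db -v pgdata:/var/lib/postgresql/data postgres:13
    # or, equivalently, a bind mount to a host directory
    docker run -d --name db2 -v /srv/pgdata:/var/lib/postgresql/data postgres:13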
Our big project has moved from physical servers to OpenShift. It's taken a lot of work, much more than expected. The best thing is that developers like it on their resumes, which is a bigger benefit than you'd think, since it has helped us keep some good people on the team. For users I see zero benefit. The CI pipeline is just more complicated and probably slower.<p>Cost-wise it was cheaper for a while, but now Red Hat is bumping up licensing costs, so I think it's about the same now.<p>Overall it seems like a waste of time, but it has been interesting.
It was the year 2016: Kubernetes did not have Jobs, CronJobs, or StatefulSets. Pods would get stuck in the Terminating or ContainerCreating state. Networking in Kubernetes was wonky. AWS did not have EKS yet. It used to be painful.<p>It is the year 2021: thousands of new startups around Kubernetes, more features, more resource types. Pods still get stuck in the Terminating or ContainerCreating state. It is still pretty painful.
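And the usual blunt workaround for a pod stuck in Terminating hasn't changed much either - a sketch, with a hypothetical pod name:

    # check events and finalizers first to see why it is stuck
    kubectl describe pod my-stuck-pod
    # last resort: delete without waiting for the kubelet to confirm
    kubectl delete pod my-stuck-pod --grace-period=0 --force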
I once gave a lunch talk at Docker, Inc., which included several slides of suggestions for features to add to Docker. Prominently featured was a request for a native command to clean up old images. An engineer in attendance remarked incredulously that he could not believe users would not know how to pipe docker ls into xargs docker rmi.
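For the record, the pipeline that engineer presumably had in mind looks roughly like this (and it noisily fails on any image still used by a container), versus what eventually shipped:

    # the "just pipe it" approach
    docker images -q | xargs docker rmi
    # the native command that was added later
    docker image prune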
Back in 2016, during the original discussion of this article, amount said it very well in [0]:<p>"If you hit this many problems with any given tech, I would suggest you should be looking for outside help from someone that has experience in the area."<p>- Yes, "clean old images" was not implemented back then. His hack is not that bad, and one can filter out in-use images pretty easily if they want to. Anyway, Docker does have "docker image prune" now.<p>- The storage driver history discussion is entirely incorrect. No, Docker did not invent overlayfs or overlay2. There was a whole big drama about aufs not being mainlined, but it was mostly in the context of live CDs, not Docker.<p>But the big missing thing is: you should not store important data in Docker images. Docker is designed to work with transient containers. If you have a database, or a high-performance data store, you use volumes, and those _bypass_ Docker storage drivers completely.<p>- The database story is completely crazy... judging by their comments, they decided to store the database data in the Docker container for some reason and got all the expected problems (unable to recover, hard to migrate, etc.). It is not clear why they didn't put the database data on a volume; there is a 2016 StackOverflow question discussing it [1].<p>Also, "Docker is locking away [...] files through its abstraction [...] It prevents from doing any sort of recovery if something goes wrong." Really? I have done recovery with Docker; the files are under /var/lib/docker in a directory named with a GUID, and a simple "find" command can locate them.<p>- By default, Docker uses Linux bridge networking, and yes, the configuration is complex, so it adds overhead. That's why there is the --net=host option (which has been there for a long time), which just bypasses all of that.<p>[0] <a href="https://news.ycombinator.com/item?id=12872636" rel="nofollow">https://news.ycombinator.com/item?id=12872636</a><p>[1] <a href="https://stackoverflow.com/questions/40167245/how-to-persist-data-using-postgres-docker-image" rel="nofollow">https://stackoverflow.com/questions/40167245/how-to-persist-...</a>
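Two of those points in concrete form - a sketch, with a hypothetical lost file name:

    # bypass Docker's bridge/NAT networking entirely
    docker run -d --net=host nginx
    # container and image data really is just files on the host
    sudo find /var/lib/docker -name 'postgresql.conf'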
I know this article is from 2016... but my feelings about it (the article) are unchanged. Some people do not like new things, and they will blog about it in some form or fashion. Maybe their reasoning is valid, maybe it's not - it doesn't matter. Meanwhile... businesses have paid, and continue to pay, top $$$ for people who will help them do these things. If you want to collect this $$$, get on board.<p>In a few years, the things businesses want to pay $$$ for will change. New blog articles about "this new stuff is bad!" will appear, and new job postings paying above-market $$$ will appear too. You can either rail on about the bad (or good) changes and how it's just everything-old-is-new-again... or you can get with the program and get paid. In another few years, rinse and repeat.
In 2016 I started at a company that had no build procedures, developed on Windows, and deployed to a variety of Linux versions. It was a nightmare to administer: no automation, no monitoring.
I implemented containers, and most of the process was getting the developers on board: holding technical sessions with them to understand what they needed, and easing them into the plan so they felt enfranchised.
Doing this vastly increased productivity: devs could take off-the-shelf Compose files that were written for common projects (it was a GIS shop), which meant they could concentrate on delivering code. It helped no end.<p>Sure, there are issues with Docker (albeit a lot fewer as time has progressed), but for what it gained in productivity and developers' sanity, it was very welcome.
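As a rough illustration of what one of those shared Compose files might look like (service names and image tag are hypothetical - the PostGIS image just fits the GIS context):

    cat > docker-compose.yml <<'EOF'
    version: "3.8"
    services:
      app:
        build: .
        ports:
          - "8080:8080"
        depends_on:
          - db
      db:
        image: postgis/postgis:13-3.1
        volumes:
          - dbdata:/var/lib/postgresql/data
    volumes:
      dbdata:
    EOF
    docker compose up -d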
The issue is not Docker; the issue is that you treat your servers like pets.<p>Folks need to start building systems that destroy everything and re-image fresh. Any other way, you are just setting yourself up for failure.
This is a blog post from 2016. However, if we switch to more recent times, my experience with AWS ECS and Fargate has been fairly boring. There was a learning curve to get it to work with CloudFormation, VPCs, IAM, and load balancers.
> Docker is meant to be stateless. Containers have no permanent disk storage, whatever happens is ephemeral and is gone when the container stops.<p>It's interesting that this misconception made it into a clearly knowledgeable article. Containers have state on the writable layer that persists between container stops and starts.
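A quick way to see that in practice (the state survives a stop; it is only gone when the container is removed or run with --rm) - a sketch:

    # write a file onto the container's writable layer, then let it exit
    docker run --name scratch alpine sh -c 'echo hello > /state.txt'
    # the stopped container still has that file in its writable layer
    docker diff scratch          # shows: A /state.txt
    docker cp scratch:/state.txt .
    # the state only disappears when the container itself is removed
    docker rm scratch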
It seems a lot has not changed. "Docker gradually exhausts disk space on BTRFS", an issue opened on 23 Oct 2016 and still open:<p><a href="https://github.com/moby/moby/issues/27653" rel="nofollow">https://github.com/moby/moby/issues/27653</a><p>There are comments from this week showing it still happens.
The article seems to mention problems with AUFS, overlay, and possibly overlay2 as well.<p>However, one of the things that I haven't quite understood is why people use Docker volumes that much in the first place, or even think that they need additional volume plugins in most deployments.<p>If it's a relatively simple deployment that has some persistent data, and it's clear which nodes the containers could be scheduled on (either by label or by hostname), what would prevent someone from just using bind mounts ( <a href="https://docs.docker.com/storage/bind-mounts/" rel="nofollow">https://docs.docker.com/storage/bind-mounts/</a> )?<p>And if you need to store the data on a separate machine, why not just use NFS on the host OS to mount the directory you will bind mount? Or, alternatively, why not just use GlusterFS or Ceph for that sort of stuff, instead of making Docker attempt to manage it?<p>For example, Docker Swarm fails to launch containers if the bind mount path doesn't exist, but that bit can be addressed by creating the necessary directory structure with something like Ansible. Then not only do you not have to worry about volumes and the risk of them ever becoming corrupt, you also have the ability to inspect the contents of the container's storage on the actual host - say, if there are some configuration files that need altering (seeing as not all containerized software out there follows 12 Factor principles for environment configuration either), or you just want to take granular backups of the data you've stored.
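A sketch of that setup, with hypothetical paths - the host (or Ansible) owns the directory, and Docker just mounts it:

    # plain bind mount; the directory must already exist on the host
    mkdir -p /srv/myapp/config
    docker run -d --name myapp \
      --mount type=bind,source=/srv/myapp/config,target=/etc/myapp \
      myapp:latest
    # same idea with the data living on an NFS server,
    # mounted by the host OS rather than managed by Docker
    mount -t nfs nfs-server:/exports/myapp /srv/myapp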
Posted many times before, but this is the only one with comments:<p><a href="https://news.ycombinator.com/item?id=12872304" rel="nofollow">https://news.ycombinator.com/item?id=12872304</a>
Anything new since this?<p>A history of re-submitted, previously discussed posts:<p><a href="https://news.ycombinator.com/item?id=12872304" rel="nofollow">https://news.ycombinator.com/item?id=12872304</a>
We had a fun issue with Docker yesterday: suddenly, services in our Swarm would not start, apparently because a config could not be mounted. They had been running fine for over two years, and nobody had touched anything in the Swarm config.<p>It turned out AWS had decided to upgrade Docker on the server, and that version (20.x) is not able to launch services in the Swarm. We have downgraded to 18 for now, which works, but it is not a long-term solution.
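Until there's a proper fix, one stopgap is to pin the engine packages so the host tooling can't replace them underneath you - a sketch, assuming a Debian/Ubuntu-style host running the docker-ce packages:

    # stop apt (and unattended upgrades) from replacing the engine
    sudo apt-mark hold docker-ce docker-ce-cli containerd.io
    # (on yum-based hosts, the yum versionlock plugin does the same job)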
Podman and Kubernetes are like a match made in heaven. Docker was a good first try for most people, but much better technology exists now.
I felt the same way when deploying to production with Docker manually back then.<p>But honestly, after we moved to k8s, everything just works. Although I realise that GKE is actually moving to containerd for newer clusters? Not sure what drove the decision, but the last time I had to restart a container manually (due to my own stupid mistake), the API didn't seem to be much different.
A guy who claims to run systems in the HFT space, responsible for millions of trades with high values, can't be bothered to actually pay for support, relies on the community, and blames everyone but himself for being left alone with his mess. Not sorry.