A thing this article is hinting at that I think might be more fundamental to good automation: idempotency.<p>Most of unix's standard set of tools (both the /bin programs and the standard C libraries) are written to <i>make changes to state</i> - but automation tools need to <i>ensure that you reach a certain state</i>. Take "rm" as a trivial example - when I say `rm foo.txt`, I want the file to be gone. What if the file is already gone? Then it throws an error! You have to either wrap it in a test, which means you introduce a race condition, or use "-f", which disables other, more important, safeguards. An idempotent version of rm - `i_rm foo.txt` or `no_file_called! foo.txt` - would include that race-condition-avoiding logic internally, so you don't have to reinvent it, and bail only if anything funny happened (permission errors, filesystem errors). It would <i>not</i> invoke a solver to try to get around edge cases (e.g., it won't decide to remount the filesystem writeable so that it can remove a file from an otherwise immutable fs...)<p>Puppet attempts to create idempotent actions to use as primitives, but unfortunately they're written in a weird dialect of Ruby and tend to rely on a bunch of Puppet internals in poor separation-of-concern ways (disclaimer: I used to be a Puppet developer), and I think that Chef has analogous problems.<p>Ansible seems to be on the right track. It's still using Python scripts to wrap the non-idempotent unix primitives - but at least it's clean, reusable code.<p>Are package managers idempotent the way they're currently written? Yes, basically. But they have a solver, which means that when you say "install this" it might say "of course, to do that, I have to <i>uninstall</i> a <i>bunch of stuff</i>", which is dangerous. So Kožar's proposal is a step in the right direction - since it seems like you wouldn't have to ever (?) uninstall things - but it's making some big changes to the unix filesystem to accomplish it, and then it's not clear to me how you know which versions of what libs to link to and stuff like that. There are probably smaller steps we could take today, when automating systems. Is there a "don't do anything I didn't explicitly tell you to!" flag for apt-get?
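Concretely, a rough sketch of what I mean in plain shell (the wrapper name is made up):<p><pre><code> # remove-if-present: succeed if the file ends up absent, fail otherwise
 no_file_called() {
   rm -- "$1" 2>/dev/null && return 0      # it was there, now it's gone
   [ -e "$1" ] || [ -L "$1" ] || return 0  # already gone: desired state reached
   echo "could not remove $1" >&2          # permissions, filesystem errors, etc.
   return 1
 }
 </code></pre>
Trying the removal first and only checking afterwards is what avoids the test-then-remove race.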
Personally, I use Fabric for automation, and it's got all the problems the author says; if you get the machine into an unknown state, you're better off just wiping it and starting fresh.<p>However, with the rise of virtual machines, that's a trivial operation in many cases. Even on bare metal hardware it's not a big deal, as long as you can tolerate a box disappearing for an hour (and if you can't, your architecture is a ticking time bomb).<p>In fact, starting from a clean slate each time basically makes Fabric scripts a declarative description of the system state at rest... if you squint hard enough.
So, basically, replace yum, apt, etc. with a 'stateless package management system'. That seems to be the gist of the argument. Puppet, Chef and Ansible (he left out Salt and cfengine!) have little to do with the actual post, and are only mentioned briefly in the intro.<p>They would all still be relevant with this new packaging system.<p>For some reason, this came to mind: <a href="https://xkcd.com/927/" rel="nofollow">https://xkcd.com/927/</a>
The more complex the process you use to automate tasks, the more difficult it is to troubleshoot and maintain, and the harder it becomes to eventually replace parts of it with a new system. <a href="https://xkcd.com/1319/" rel="nofollow">https://xkcd.com/1319/</a> is not just a comic, it's a truism.<p>I am basically a Perl developer by trade, and have been building and maintaining customized Linux distributions for large clusters of enterprise machines for years. I would still rather use shell scripts to maintain it all than Perl, or Python, or Ruby, or anything else, and would rather use a system of 'stupid' shell scripts than invest more time in another complicated configuration management scheme.<p>Why use shell? It forces you to think simpler, and it greatly encourages you to extend existing tools rather than create your own. Even when you do create your own tools with it, they can be incredibly simple and yet work together to manage any aspect of a system at all. And of course, anyone can maintain it [especially non-developers].<p>As an example of how incredibly dumb it can be to reinvent the wheel, I've worked for a company that wanted a tool that could automate any task, and that anyone could use. They ended up writing a large, clunky program with a custom configuration format and lots of specific functions for specific tasks. It got to the point where if I needed to get something done I would avoid it and just write expect scripts, because expect was simpler. Could the proprietary program have been made as simple as expect? Sure! But what the hell would be the point of creating and maintaining something that is already done better in an existing ages-old tool?<p>That said, there are certain tasks I'd rather leave to a robust configuration management system (of which there are very few in the open source world [if any] that contain all the functionality you need in a large org). But I'd do so quite begrudgingly. The number of times I've ripped out my hair trying to get the thing to do what I wanted it to do while in a time and resource crunch is not something I'd like to revisit.
He left out Cfengine. That's a big gap. It's been around since 1993. He also focused on package management and the provisioning process. I feel like there is more to automation than that. Continuous deployment, process management and distributed scheduling come to mind. As a plus, he does seem to get that just using system images (like Amazon AMIs) can be pretty limited.<p>I think the complexity of automation is more a symptom of the problem space than of the tools. It's just a hairy problem. Computer Science seems to largely focus on the single system. Managing "n" systems requires additional scaffolding for mutual authentication and doing file copies between different systems. It also requires the use of directory services (DNS, LDAP, etc…)<p>I like the analogy of comparing the guitar player to a symphony orchestra. When you play the guitar alone, it's easy to improvise, because you don't need to communicate your intent to the players around you. When a symphony does a performance, there is a lot of coordination that needs to be done. Improvisation is much more difficult. That is where Domen is right on target: we can do better. Our symphony needs a better conductor.
If you've ever needed version X.Y of Package Z on a system, and all of its underlying dependencies, or newer versions than what your operating system supports, you know exactly what Domen is talking about.<p>It's a good write-up. The idea of a stateless, functional package management system is really important in places like scientific computing, where we have many pieces of software, relatively little funding to improve the quality of the software, and still need to ensure that all components can be built and easily swapped for each other.<p>The HashDist developers (the project is still in early beta: <a href="https://github.com/hashdist/hashdist" rel="nofollow">https://github.com/hashdist/hashdist</a> ) inherited a few ideas from Nix, including the idea of prefix builds. The thing about HashDist is that you can actually install it in userspace over any UNIXy system (for now, Cygwin, OS X, and Linux) and get the exact software configuration that somebody else was using across a different architecture.
The linked article is about package management, not configuration management. Whoever set the title of this post didn't understand the point of the article. From the comments, people seem to confuse and conflate configuration management, job automation and package management. To run a successful infrastructure at any scale you need all three.
Part of the solution is to never update "live" machines, but to put everything in VMs, maintain state <i>outside</i> of the VM images (shared filesystems etc.), and build and deploy whole new VM images.<p>Doing updates of any kind to a running system is unnecessarily complex when we have all the tools to treat entire VMs/containers as build artefacts that can be tested as a unit.
I'm still failing to understand what solution out there handles web application deployments (especially JVM ones) in an idempotent way, including pushing the WAR file, upgrading the database across multiple nodes, etc.
Perhaps there are built-in solutions for Rails / Django / Node.js applications, but I couldn't find a best-practice way to do this for JVM deployments. E.g. there is no "package" resource for Puppet that is a "Java Web Application" that you could just ask to be at a certain version.<p>How do you guys do this for Rails apps? Django apps? Is this only an issue with Java web apps?
At least some parts of this post touch on immutable infrastructure: basically just replacing faulty systems and rebuilding them from scratch every time you need to change them. Relatively easy with AWS and Packer (or other cloud providers) and super powerful. I wrote about this a while ago on our blog: <a href="http://blog.codeship.io/2013/09/06/the-codeship-workflow-part-4-immutable-infrastructure.html" rel="nofollow">http://blog.codeship.io/2013/09/06/the-codeship-workflow-par...</a>
How does this system handle shared libraries and security updates to common components?<p>This is not a new idea - the "application directory" dates back to RISC OS as far as I'm aware. It's been carefully examined many times over the decades, and hasn't been widely adopted because it leads to massive duplication of dependencies, everything in the system has to be changed to be aware of it, and there are less painful ways to solve or avoid the same problems.
I think I find myself in the minority that thinks "sudo apt-get install nginx" is much simpler and doesn't care about edge cases. If there's an edge case, something is wrong with my machine and it should die.
This is an insightful article for devops "teams". That said, a single devops resource can get a hell of a long way with a homogeneous Ubuntu LTS environment, apt packaging, Ansible and GitHub.<p>I know, I know, 640K will be enough for anybody, but is anybody's startup really failing because of nginx point releases?
I miss DOS, when there was a one-to-one correspondence between applications and filesystem directories.<p>Now Windows programs want to put stuff in C:\Progra~1\APPNAME, C:\Progra~2\APPNAME, C:\Users\Applic~1\APPNAME, C:\Users\Local\Roaming\Profiles\AaghThisPathIsHuge, and of course dump garbage into the Registry and your Windows directory as well. And install themselves on your OS partition without any prompting or chance to change the target. And you HAVE to do the click-through installation wizard because everything's built into an EXE using some proprietary black magic, or downloaded from some server in the cloud using keys that only the official installer has (and good luck re-installing if the company goes out of business and the cloud server shuts down). Whereas in the old days you could TYPE the batch file and enter the commands yourself manually, or copy it and make changes. And God forbid you should move anything manually -- when I copied Steam to a partition that wasn't running out of space, it demanded to revalidate, which I couldn't do because the Yahoo throwaway email I'd registered with had expired. (Fortunately nobody had taken it in the meantime and I was able to re-register it.)<p>I've been using Linux instead for the past few years. While generally superior to Windows, its installation procedures have their own set of problems. dpkg -L firefox tells me that web browser shoves stuff in the following places:<p><pre><code> /etc/apport
/etc/firefox
/usr/bin
/usr/lib/firefox
/usr/lib/firefox-addons
/usr/share/applications
/usr/share/apport
/usr/share/apport/package-hooks
/usr/share/doc
/usr/share/pixmaps
/usr/share/man/man1
/usr/share/lintian/overrides
</code></pre>
I don't mean to pick on this specific application; rather, this is totally typical behavior for many Linux packages.<p>Some of these directories, e.g. /usr/bin, are a real mess because EVERY application dumps its stuff there:<p><pre><code> $ ls /usr/bin | wc -l
1840
</code></pre>
Much of the reason package managers have to exist in the first place is to try to get a handle on this complexity.<p>I welcome the NixOS approach, since it's probably as close as we can get to the one-directory-per-application ideal without requiring application changes.
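For the curious, the Nix store roughly looks like this (hash shortened, version and layout illustrative):<p><pre><code> $ ls /nix/store | grep firefox
 b6gvzjyb...-firefox-27.0
 $ ls /nix/store/b6gvzjyb...-firefox-27.0
 bin  lib  share
 </code></pre>
Everything the package installs stays under that one hashed directory; only symlinks (profiles) point into it.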
I've been playing with Rubber for a Rails app. It's nowhere near as capable as Chef, but for the needs of most Rails apps deploying multiple servers and services to AWS, it's more than capable enough. I'd put it somewhere between Chef and Heroku as far as being declarative and being magical.
Deterministic builds? pdebuild. mock. These have existed practically forever.<p>As for the "stateless" thing, this could have been explained in a far simpler manner IMO.<p>1) No library deps:<p>"all system packages are installed in /mystuff/version/ with all their libs, then symlinked to /usr/bin so that we have no dependencies" (that's not new either, but it never took off on Linux; roughly the sketch below)<p>2) Fewer config deps
"only 4 variables can change the state of a config mgmt system's module; those are used to know, for example, whether a daemon should restart"<p>So yeah, it's not actually stateless. And hey, stateless is not necessarily better. It's just less complicated (and potentially less flexible for the config mgmt part).<p>Might be why the author took so long to explain it without being too clear.
<a href="https://www.usenix.org/legacy/publications/library/proceedings/sec96/hollander/" rel="nofollow">https://www.usenix.org/legacy/publications/library/proceedin...</a><p>(speaking from second-hand knowledge) They don't go into much of the really interesting detail in the paper. The awesome part of all of that was that everything an application required to function was under its own tree; you never had any question of provenance, or whether the shared libraries would work right on the given release of the OS you might be using. And it worked from any node on the global network. This problem has been solved; most people just didn't get the memo.
Anyone have a good introductory article about these tools (and others like Vagrant etc)? I keep hearing about them, but so far, have been managing a single VPS fine with ssh+git, with supervisord thrown in. Am I missing out by not using these?
His example of replacing the database (stateful) with the network (stateless) for email checking is poor: it makes the implicit assumption that the network is as reliable as the database is.
What happens when an email is lost?
This looks really interesting, but I don't see it as a magic bullet for configuration management. There seem to be a lot of advantages on the package management side, but configuration management is a lot more than that.<p>Generally the whole point of a configuration file is to allow administrative users to change the behavior of the application. Treating the configuration file as an "input" is a relatively trivial difference and doesn't really address most of the problems admins face.
Should one really be setting LD_LIBRARY_PATH like that? I thought the preferred way to deal with library search at run time was to rpath it in at compile time.
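For reference, baking the search path in at link time looks roughly like this (paths made up):<p><pre><code> $ gcc -o myapp myapp.c -L/opt/mylibs/lib -lfoo -Wl,-rpath,/opt/mylibs/lib
 $ readelf -d myapp | grep -iE 'rpath|runpath'   # confirm the tag was embedded
 </code></pre>
LD_LIBRARY_PATH works too, but it leaks into every process launched from that environment, which is part of why rpath/runpath is usually preferred.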
This is why I'm building Squadron[1]. It has atomic releases, rollback, built-in tests, and all without needing to program in a DSL.<p>It's in private beta right now, but if you're curious, let me know and I'll hook you up.<p>[1]: <a href="http://www.gosquadron.com" rel="nofollow">http://www.gosquadron.com</a>
Inspired by this post (by a fellow Gentoo user, no less!) I finally published my extended response on the same theme, which has been written over some months: <a href="https://news.ycombinator.com/item?id=7384393" rel="nofollow">https://news.ycombinator.com/item?id=7384393</a>
Talking about automating apt-get, yum and the like, is there a way to cache frequently downloaded packages on a developer machine on the same local network?<p>For instance, I have a bunch of disposable VMs, and I don't want them to download the same gigabytes every time I run init-deploy-test.
On Windows I use Boxstarter or a simple PowerShell script that invokes Chocolatey (which must already be installed).<p>I had a look at Puppet/Chef... wow, those really look complicated for something that should really be simple.
Hey, I noticed this on HN, so just to share my thoughts: how about keeping it a bit simpler, just pushing commands with some checks like these guys do? It is a bit more low-level, but the automation is written only once and should be easier to change. Here is a video from their site: <a href="http://www.youtube.com/watch?v=FBQAhsDeM-s" rel="nofollow">http://www.youtube.com/watch?v=FBQAhsDeM-s</a>
What problem does it solve besides "I am so clever and just learnt the word 'nondeterministic'"?<p>I would suggest another blog post about monadic (you know, type-checked, guaranteed safe) packages (unique sets of pathnames), statically linked, each file in a unique cryptohashed read-only mounted directory, sorry, volume. Under a unique Docker instance, of course, with its own monolithic kernel, cryptohashed and read only.<p>Oh, Docker is user-space crap? No problem, we could run multiple Xens with unique IDs.
The author has a shallow or non-existent understanding of making pkgs for an operating system, doing systems administration, or the automation tools mentioned.