I use runit in production for <a href="http://typing.io" rel="nofollow">http://typing.io</a>. I appreciate runit's strong Unix philosophy (shell scripts instead of DSLs). However, I'm starting to experiment with systemd because of features like properly tracking and killing services [1]. That feature would be useful for a task like upgrading an nginx binary without dropping connections [2], which isn't possible with runit (or most process monitors) because nginx double-forks during the upgrade, breaking out of its supervision tree.<p>[1] <a href="http://0pointer.de/blog/projects/systemd-for-admins-4.html" rel="nofollow">http://0pointer.de/blog/projects/systemd-for-admins-4.html</a><p>[2] <a href="http://wiki.nginx.org/CommandLine#Upgrading_To_a_New_Binary_On_The_Fly" rel="nofollow">http://wiki.nginx.org/CommandLine#Upgrading_To_a_New_Binary_...</a>
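For reference, the on-the-fly upgrade in [2] is driven purely by signals to the nginx master process. A rough sketch, assuming the default pidfile location (in practice you'd verify the new master is healthy before killing the old one):

```shell
#!/bin/sh
# Sketch of nginx's documented on-the-fly binary upgrade, driven by signals.
# The pidfile path is an assumption; check the "pid" directive in nginx.conf.
PIDFILE=${PIDFILE:-/run/nginx.pid}

upgrade_nginx() {
    old_pid=$(cat "$PIDFILE") || return 1
    kill -USR2 "$old_pid"                   # new master forks with the new binary;
                                            # old pidfile is renamed to *.oldbin
    sleep 2                                 # give the new master time to start workers
    kill -WINCH "$(cat "$PIDFILE.oldbin")"  # old master: gracefully stop workers
    kill -QUIT  "$(cat "$PIDFILE.oldbin")"  # old master: exit once workers are done
}
```

Note the new master is a child of the old one rather than of the supervisor, which is exactly what breaks the supervision tree.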
You forgot one:<p>sysv init!<p>All of my systems' processes are managed by it and have been for at least two decades.<p>Occasionally I do these periodic tasks as well, which are handled by a thing called "cron".<p>Yes this is sarcasm. There is a lot of wheel-reinventing done these days which is entirely unnecessary if you consider the long-forgotten "Unix philosophy".
Monit is by far your best bet. It's easy to install, packaged on most distros, and does reactive monitoring (acting on failures itself), as opposed to traditional monitoring systems like Nagios that mostly just alert. Plus you can enable its web interface for easy browsing of monitored processes.<p>EDIT:
I liked it so much (and it was so easy) that I wrote a blog post expounding how much I liked it and how to use it. <a href="http://moduscreate.com/monit-easy-monitoring/" rel="nofollow">http://moduscreate.com/monit-easy-monitoring/</a>
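For a flavor of the reactive part, a minimal monit check looks something like this (service name, paths, and port are illustrative):

```
check process nginx with pidfile /var/run/nginx.pid
  start program = "/etc/init.d/nginx start"
  stop program  = "/etc/init.d/nginx stop"
  if failed host 127.0.0.1 port 80 protocol http then restart
  if 5 restarts within 5 cycles then timeout
```

The last line is the safety valve: monit gives up instead of restart-looping a service that's broken for real.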
Folk on Solaris / Illumos would probably like SMF to be added to the list.<p>I'd go on to mention various z/OS subsystems, but that's a bit esoteric even for HN :D<p>(Process management ties into my larger rant that nobody has properly combined it with configuration management. But nobody has time for this nonsense.)
Nothing. The answer is that you don't monitor production processes directly, ever; it's a waste of your time and effort. Certainly this sort of foolishness should not be used to page an employee off-hours, if that's where you're headed.<p>The only thing you need to monitor is whether a server answers the network requests it was designed to. Beyond that you might optionally want to know whether the disk is full, RAM is maxed out (pushing Linux into swap), or the CPU runs too high to cope with losing some servers at peak; but really that's all optional if you're in EC2 and can just spin up more servers at a moment's notice.<p>You can gather all this data yourself with New Relic, or send it to Graphite, or if you're old-fashioned use Icinga in place of Nagios, since it keeps history in a database. If the developers want to know about the process for the application they implemented, you can put New Relic on the server for them, along with the system-level agent; just don't pay attention to it, or pretend it's important, until something breaks.<p>The important catch here, the thing that is critical to this whole line of thinking: you have to have thought things through before you built them, focused on one service per OS and real redundancy throughout the environment. Critically, your kick (automated rebuild) should be fast enough that if a server has some kind of problem in production you don't fix it, you just re-kick it. That means your kick lays down the OS, then triggers Salt or Ansible or Chef to configure every single detail, and then triggers a deploy of internally developed applications. And that means you have to test the kick to death before you can rely on it to rebuild something live.
If the problem is recurring you can use immediate tools, jdump or whatever, to get some data, give it to the application's developers, and let them try to recreate it in staging while you go ahead and re-kick the prod server and go back to writing documentation for lesser ops to not read, drinking at your desk, reading hackernews, acting as a cia listening post for cat pictures or whatever else passes the time.
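The "monitor the answer, not the process" idea boils down to something like this; in practice the check lives in Icinga/Nagios/New Relic rather than a script, and the URL is a placeholder for whatever your service actually serves:

```shell
#!/bin/sh
# Outside-in health check: hit the endpoint the service is supposed to answer.
# The URL is a placeholder for your service's real endpoint.
check_url() {
    # -f: treat HTTP error codes as failure; -sS: quiet, but still show errors
    curl -fsS --max-time 5 "$1" >/dev/null 2>&1
}
# e.g.: check_url http://app.example.com/healthz || echo "take it out of rotation"
```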
Some points of note:<p>1. daemontools and runit are practically identical. I somewhat prefer runit, as svlogd has a few more features than multilog (syslog forwarding, more timestamp options), and sv has a few more options than svc (it can issue more signals to the supervised process).<p>2. Among the criteria I look for in a process manager are (1) the ability to issue any signal to a process (not just the limited set of TERM and HUP), and (2) the ability to run some kind of test as a predicate for restarting or reloading a service. The latter is especially useful to help avoid automating yourself into an outage. As far as I'm aware, none of the above process supervisors can do the latter, so I tend to eschew them in favor of init scripts and prefer server implementations that are reliable enough not to need supervision.
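Criterion (2) is easy enough to bolt on around a supervisor yourself; a sketch using runit's sv, where the check command and service name are placeholders:

```shell
#!/bin/sh
# Gate a reload behind a config-check predicate, so a bad config never takes
# the service down. "sv hup" sends SIGHUP to a runit-supervised service;
# the check command and service name here are placeholders.
safe_reload() {
    check_cmd=$1
    service=$2
    if $check_cmd; then
        sv hup "$service"
    else
        echo "config check failed; not reloading $service" >&2
        return 1
    fi
}
# e.g.: safe_reload "nginx -t" nginx
```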
I've been using supervisord for most everything (doesn't hurt that I'm primarily a python guy), but I'm slowly testing out Mozilla's circus (<a href="https://github.com/mozilla-services/circus" rel="nofollow">https://github.com/mozilla-services/circus</a>) and it's been going great so far.
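For anyone curious, a minimal supervisord program section looks something like this (program name and paths are illustrative):

```
[program:myapp]
command=/usr/local/bin/myapp --port 8080
directory=/srv/myapp
autostart=true
autorestart=true
stdout_logfile=/var/log/myapp.out.log
stderr_logfile=/var/log/myapp.err.log
```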
I don't, currently, because I don't have to monitor services, but if I did, I think I'd likely use daemontools, based on the fact that djb really, really understands how to write Unix software.
systemd handles service monitoring natively, as well as socket management and many aspects of container management. It's a superset of most of the tools listed.
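As a sketch of that superset: one declarative unit file covers supervision and cgroup resource limits (service name and paths are illustrative; MemoryLimit= is the resource-control directive of this era):

```
# /etc/systemd/system/myapp.service
[Unit]
Description=My application

[Service]
ExecStart=/usr/local/bin/myapp
Restart=on-failure
MemoryLimit=512M

[Install]
WantedBy=multi-user.target
```

Because systemd tracks the service's cgroup rather than a single PID, even a double-forking daemon can't escape it.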
SNMP for centralized systems monitoring, cron jobs for "daemon management". I dislike tools whose system footprint is sometimes larger than the things they're supposed to manage, just to do simple things. No, they're not always extensible. When a 4-5 line shell script using less than 100K of RAM can accomplish the same thing as a giant process taking 40-50MB of RAM, I know which one I prefer (generally).
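In that spirit, the whole "daemon manager" can be a cron-run script on this order (the daemon command and pidfile are placeholders for your actual service):

```shell
#!/bin/sh
# Minimal cron watchdog: if the pidfile's process is gone, start the daemon.
# DAEMON and PIDFILE are placeholders; point them at your real service.
DAEMON=${DAEMON:-/bin/true}            # placeholder start command
PIDFILE=${PIDFILE:-/tmp/mydaemon.pid}  # placeholder pidfile path
STATUS=running
if ! kill -0 "$(cat "$PIDFILE" 2>/dev/null)" 2>/dev/null; then
    "$DAEMON" && STATUS=restarted      # daemon was down; start it again
fi
echo "$STATUS"
```

Run it from cron, e.g. `* * * * * /usr/local/bin/watchdog.sh`.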
systemd for most servers; svc / runsvdir (BusyBox's build of the daemontools/runit tools) for embedded devices.<p>The code is designed to be shot, and will always recover after a restart. runsvdir/svc will neatly reap a dead process and start it again, and can be used separately.
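For anyone who hasn't seen one, an entire service definition in this style is just a run script (path and binary are illustrative):

```
#!/bin/sh
# /etc/service/myapp/run -- illustrative; runsv executes this and
# restarts the daemon whenever it exits
exec 2>&1                      # send stderr to the service's log
exec /usr/local/bin/myapp      # must stay in the foreground (no double-fork)
```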
Do any of these integrate with cgroups? I've found myself wanting to specify some rules about resource usage on occasion, and cgroups seem conceptually nice, but I'm not sure how to work them nicely into my other tools, short of writing custom shell scripts to manipulate /sys/fs/cgroup.
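For what it's worth, the hand-rolled version isn't much shell. A sketch against the cgroup filesystem (the group name, limit, and memory-controller mount point are assumptions, and it needs root):

```shell
#!/bin/sh
# Sketch: run a command inside its own memory-limited cgroup by hand.
# Assumes a cgroup-v1 memory controller mounted at /sys/fs/cgroup/memory
# and root privileges; the group name and limit are whatever you choose.
run_limited() {
    name=$1; limit_bytes=$2; shift 2
    cg=/sys/fs/cgroup/memory/$name
    mkdir -p "$cg"
    echo "$limit_bytes" > "$cg/memory.limit_in_bytes"
    echo $$ > "$cg/tasks"   # move this shell into the group
    exec "$@"               # the command inherits the cgroup membership
}
# e.g.: run_limited webjob 268435456 /usr/local/bin/worker
```

(systemd, per the comment above, does this for you declaratively.)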
Anyone with any experience using Angel (<a href="https://github.com/MichaelXavier/Angel" rel="nofollow">https://github.com/MichaelXavier/Angel</a>)?
Have been using monit to keep a couple of node.js services online / monitor their PIDs and HTTP interfaces. It's been a positive experience so far.
Please add God[1] to the list; we use it in production.<p>Also, a shameless plug for my side project, a service management tool for multi-project development: Hack On[2]<p>[1] <a href="http://godrb.com/" rel="nofollow">http://godrb.com/</a>
[2] <a href="https://github.com/snikch/hack" rel="nofollow">https://github.com/snikch/hack</a>
<a href="https://github.com/caldwell/daemon-manager" rel="nofollow">https://github.com/caldwell/daemon-manager</a><p>I've been dogfooding it in a production environment for a couple years and it's been pretty solid.
I wish that s6 (skarnet.org's small and secure supervision software suite) [1] were more-widely packaged and available on distros. It's very much in the same vein as daemontools, but with some improvements. While certainly biased, the author wrote a pretty good breakdown and comparison of why s6 was developed [2].<p>[1] <a href="http://www.skarnet.org/software/s6/" rel="nofollow">http://www.skarnet.org/software/s6/</a><p>[2] <a href="http://www.skarnet.org/software/s6/why.html" rel="nofollow">http://www.skarnet.org/software/s6/why.html</a>
I use supervisord now, before I would use mon/mongroup [1] which is just a tiny C program to monitor stuff.<p>I have also used god at some point, but I kept having trouble. I can't remember exactly what was wrong but it never quite worked correctly for me. Probably PEBCAK.<p>[1] <a href="https://github.com/jgallen23/mongroup" rel="nofollow">https://github.com/jgallen23/mongroup</a>
upstart, because I wouldn't use a job control system in prod that isn't included with the base distribution (do any base unix distros ship monit or supervisord?). It's just too much useless work to rewrite job control logic for daemons when the OS already gives it to you, and I've been quite surprised by the feature-completeness of upstart.
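For comparison, a complete upstart job for a simple daemon is short (job name and path are illustrative):

```
# /etc/init/myapp.conf
description "My application"
start on runlevel [2345]
stop on runlevel [016]
respawn
respawn limit 10 5
exec /usr/local/bin/myapp
```

The `respawn limit` stanza stops upstart from restart-looping a crashing service forever.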
Histogram so far:<p><a href="http://quickhist.onloop.net/monit=75,supervisord=107,daemonize=2,runit=30,perp=2,DJB%27s%20daemontools=33,systemd=54,god=18,upstart=76/Unix%20process%20management%20tools%20-%20Hacker%20News%20Poll%20Jul%20%2713" rel="nofollow">http://quickhist.onloop.net/monit=75,supervisord=107,daemoni...</a>
I use upstart, but am not happy with it for a number of reasons. Two important ones: "restart" does not reread the configuration file, and the DSL is poorly designed (the "respawn" stanza, among others).<p>I haven't looked recently at alternatives, but I'm open to them.
For daemons: none of the above (just init), monitored with Zabbix. I assume services don't crash, and hey, they don't (not that I know of, in any case).<p>Unless you really did mean process and not daemon, in which case it's supervisord.
Personally I've had great success with supervisord, no success with god, and good experiences with monit. But I'm curious: whatever happened to good ol' Linux watchdog?
Reactive monitoring via Riemann: <a href="http://riemann.io/" rel="nofollow">http://riemann.io/</a><p>We use this to monitor services at the application level.
Pacemaker if you need to keep it alive no matter what.<p>Systemd was pretty stable until user mode flat out broke in 205. I use it to manage my entire desktop session.
Older HN post for reference:<p><a href="https://news.ycombinator.com/item?id=1368855" rel="nofollow">https://news.ycombinator.com/item?id=1368855</a>
forever (<a href="https://github.com/nodejitsu/forever/" rel="nofollow">https://github.com/nodejitsu/forever/</a>) has worked great for me, but doesn't make any sense if you're not running node.js applications.
Kinda sad that "nothing" isn't on the list. I just use software that isn't broken, so it doesn't need to be constantly restarted.