Poll: What do you use for Unix process management/monitoring?

142 points by gosuri, almost 12 years ago

I'm setting up production infrastructure for a new project and would love to know what you use for process management.

51 comments

sehrope, almost 12 years ago

Clickable:

monit: http://mmonit.com/monit
supervisord: http://supervisord.org
daemonize: http://bmc.github.com/daemonize
runit: http://smarden.org/runit/
perp: http://b0llix.net/perp/
DJB's daemontools: http://cr.yp.to/daemontools.html
systemd: http://www.freedesktop.org/wiki/Software/systemd
god: http://godrb.com
upstart: http://upstart.ubuntu.com

moonboots, almost 12 years ago

I use runit in production for http://typing.io. I appreciate runit's strong Unix philosophy (shell scripts instead of DSLs). However, I'm starting to experiment with systemd because of features like properly tracking and killing services [1]. This feature would be useful for a task like upgrading an nginx binary without dropping connections [2]. This isn't possible with runit (and most process monitors) because nginx double forks, breaking its supervision tree.

[1] http://0pointer.de/blog/projects/systemd-for-admins-4.html
[2] http://wiki.nginx.org/CommandLine#Upgrading_To_a_New_Binary_On_The_Fly
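
To make the double-fork point concrete, here is a minimal Python sketch (not nginx's actual code; the function name is made up for illustration). It forks twice, lets the intermediate process exit, and reports which PID the surviving process sees as its parent. On a typical Linux system that is init or a subreaper rather than the process that started it, which is why a supervisor that only watches its direct child loses track of the daemon.

```python
import os
import struct
import time


def parent_seen_by_double_forked_daemon() -> int:
    """Double-fork and return the PID the final process sees as its parent."""
    read_fd, write_fd = os.pipe()
    middle = os.fork()
    if middle == 0:
        # Intermediate child: fork the real "daemon", then exit immediately.
        os.close(read_fd)
        middle_pid = os.getpid()
        if os.fork() > 0:
            os._exit(0)
        # Grandchild: wait (up to ~5 s) until the kernel has re-parented us.
        for _ in range(500):
            if os.getppid() != middle_pid:
                break
            time.sleep(0.01)
        os.write(write_fd, struct.pack("i", os.getppid()))
        os._exit(0)
    # Supervisor: the only child it can wait on is the short-lived middle one.
    os.close(write_fd)
    os.waitpid(middle, 0)
    data = os.read(read_fd, struct.calcsize("i"))
    os.close(read_fd)
    return struct.unpack("i", data)[0]


if __name__ == "__main__":
    print("supervisor pid:", os.getpid())
    print("daemon's parent:", parent_seen_by_double_forked_daemon())
```

Once `os.waitpid` returns, the supervisor has no child left to watch, even though the daemon is still alive.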

harrytuttle, almost 12 years ago

You forgot one: sysv init!

All of my systems' processes are managed by it and have been for at least two decades.

Occasionally I do these periodic tasks as well, which are handled by a thing called "cron".

Yes, this is sarcasm. There is a lot of wheel-reinventing done these days which is entirely unnecessary if you consider the long-forgotten "Unix philosophy".

knappe, almost 12 years ago

Monit is by far your best bet. Easy to install, packaged on most distros, and it performs reactive monitoring as opposed to most traditional monitoring systems like Nagios. Plus you can open up a web interface if you want to allow for easy browsing of monitored processes.

EDIT: I liked it so much (and it was so easy) that I wrote a blog post expounding how much I liked it and how to use it. http://moduscreate.com/monit-easy-monitoring/

jacques_chester, almost 12 years ago

Folk on Solaris / Illumos would probably like SMF to be added to the list.

I'd go on to mention various z/OS subsystems, but that's a bit esoteric even for HN :D

(Process management ties into my larger rant that nobody has properly combined it with configuration management. But nobody has time for this nonsense.)

jacobquick, almost 12 years ago

Nothing. The answer is you don't monitor production processes directly, ever; it's a waste of your time and effort. Certainly this sort of foolishness should not be used to page an employee off-hours, if that's where you're headed.

The only thing you need to monitor is whether a server answers the network request it was designed to. Outside of that you might optionally want to know whether the disk is full, RAM is maxed (thus putting Linux into swap), or if the CPU runs too high to cope with losing some servers at peak, but really that's all optional if you're in EC2 and can just spin up more servers on a moment's notice.

You can gather all this data for yourself with New Relic, or if you want you can send data to Graphite, or if you're old-fashioned you can use Icinga in place of Nagios because it keeps history in a database. If the developers want to know about the process for the application they implemented, you can put New Relic on the server for them, and put the system New Relic thing on there too; just don't pay attention to it or pretend it's important until something breaks.

The important catch here, the thing that is critical to this whole line of thinking: you have to have thought things through before you built them, focused on having one service per OS and real redundancy throughout the environment, and then, critically, your kick should be fast enough that if a server has some kind of problem in production you don't fix it, you just re-kick it. That means your kick throws the OS on there, then triggers Salt or Ansible or Chef to configure every single detail, and then triggers a deploy of internally developed applications. That means you have to test the kick to death before you can rely on it to rebuild you something live.

If the problem is recurring you can use immediate tools, jdump or whatever, to get some data, give it to the application's developers, and let them try to recreate it in staging while you go ahead and re-kick the prod server and go back to writing documentation for lesser ops to not read, drinking at your desk, reading Hacker News, acting as a CIA listening post for cat pictures, or whatever else passes the time.
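
A minimal sketch of the "does the server answer the request it was designed to" check, assuming the service speaks plain HTTP. The function name, default path, timeout, and the choice to count 2xx/3xx as healthy are illustrative assumptions, not anything the comment specifies.

```python
import http.client


def answers_http(host: str, port: int, path: str = "/", timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to host:port/path gets a 2xx or 3xx reply."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return 200 <= conn.getresponse().status < 400
    except (OSError, http.client.HTTPException):
        # Refused connection, timeout, or a malformed reply all count as "down".
        return False
    finally:
        conn.close()
```

A check like this says nothing about which process is running on the box, only whether the box is doing its job, which is the distinction the comment is drawing.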

otterley, almost 12 years ago

Some points of note:

1. daemontools and runit are practically identical. I do prefer runit somewhat, as svlogd has a few more features than multilog (syslog forwarding, more timestamp options), and sv has a few more options than svc (it can issue more signals to the supervised process).

2. Among the criteria I look for in a process manager are: (1) the ability to issue any signal to a process (not just the limited set of TERM and HUP), and (2) the ability to run some kind of test as a predicate for restarting or reloading a service. The latter is especially useful to help avoid automating yourself into an outage. As far as I'm aware, none of the above process supervisors can do that, so I tend to eschew them in favor of initscripts and prefer server implementations that are reliable enough not to need supervision.
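
The two criteria in point 2 can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper name is hypothetical, the predicate is any command whose exit status 0 means "safe to proceed", and the caller is assumed to already know the service's PID.

```python
import os
import signal
import subprocess
from typing import Sequence


def signal_if_check_passes(check_cmd: Sequence[str], pid: int, sig: int) -> bool:
    """Run check_cmd; deliver sig to pid only if the check exits with status 0."""
    check = subprocess.run(
        check_cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    if check.returncode != 0:
        return False  # predicate failed: leave the running service alone
    os.kill(pid, sig)  # any signal number works, not just TERM or HUP
    return True
```

For example, `signal_if_check_passes(["nginx", "-t"], pid, signal.SIGHUP)` would ask nginx to reload only when its configuration test passes, so a broken config file cannot take down a working server.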

shuzchen, almost 12 years ago

I've been using supervisord for most everything (doesn't hurt that I'm primarily a Python guy), but I'm slowly testing out Mozilla's circus (https://github.com/mozilla-services/circus) and it's been going great so far.

jfb, almost 12 years ago

I don't, currently, because I don't have to monitor services, but if I did, I think I'd likely use daemontools, based on the fact that djb really, really understands how to write Unix software.

JoshTriplett, almost 12 years ago

systemd handles service monitoring natively, as well as socket management and many aspects of container management. It's a superset of most of the tools listed.

bifrost, almost 12 years ago

SNMP for centralized systems monitoring, cronjobs for "daemon management". I dislike the tools that take up large system footprints, sometimes larger than the things they're supposed to manage themselves, to do simple things. No, they're not always extensible. When a 4-5 line shell script that takes less than 100 KB of RAM can accomplish the same thing as a giant process that takes 40-50 MB of RAM, I know which one I prefer (generally).
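
The cron-driven approach can be sketched as follows. The comment has a shell script in mind; this is the same idea in Python, with hypothetical function names and a pidfile convention that is an assumption for illustration. Each cron run checks whether the recorded PID is still alive and starts the command again if it is not.

```python
import os
import subprocess
from pathlib import Path
from typing import Sequence


def is_running(pidfile: Path) -> bool:
    """Return True if pidfile names a process that currently exists."""
    try:
        pid = int(pidfile.read_text().strip())
        if pid <= 0:
            return False
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks existence
    except (FileNotFoundError, ValueError, ProcessLookupError):
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def ensure_running(pidfile: Path, command: Sequence[str]) -> bool:
    """Start command unless pidfile points at a live process; True if started."""
    if is_running(pidfile):
        return False
    child = subprocess.Popen(command, start_new_session=True)
    pidfile.write_text(f"{child.pid}\n")
    return True
```

Run from a crontab entry every minute or so, this gives restart-on-death with no resident supervisor. The trade-offs are a delay of up to one cron interval and the usual pidfile caveat that a recycled PID can look like a live service.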

Spidler, almost 12 years ago

Systemd for most servers, svc / rundir (a version of daemontools) from BusyBox on embedded services.

The code is designed to be shot, and will always recover after a restart. rundir/svc will neatly reap a process and restart it, and can be used separately.

_delirium, almost 12 years ago

Do any of these integrate with cgroups? I've found myself wanting to specify some rules about resource usage on occasion, and cgroups seems conceptually nice, but I'm not sure how to work it in nicely to my other tools, short of writing custom shell scripts to manipulate /proc/cgroups.

tel, almost 12 years ago

Anyone with any experience using Angel (https://github.com/MichaelXavier/Angel)?

eknkc, almost 12 years ago

Have been using monit to keep a couple of node.js services online / monitor their PIDs and HTTP interfaces. It's been a positive experience so far.

snikch, almost 12 years ago

Please add God [1] to the list; we use this in production.

Also I'll put in a shameless plug for my side project, a service management tool for multi-project development: Hack On [2].

[1] http://godrb.com/
[2] https://github.com/snikch/hack

jingo, almost 12 years ago

How many of those are actually derivatives of daemontools?

__david__, almost 12 years ago

https://github.com/caldwell/daemon-manager

I've been dogfooding it in a production environment for a couple of years and it's been pretty solid.

elasticdog, almost 12 years ago

I wish that s6 (skarnet.org's small and secure supervision software suite) [1] were more widely packaged and available on distros. It's very much in the same vein as daemontools, but with some improvements. While certainly biased, the author wrote a pretty good breakdown and comparison of why s6 was developed [2].

[1] http://www.skarnet.org/software/s6/
[2] http://www.skarnet.org/software/s6/why.html

jensnockert, almost 12 years ago

launchd, but I can pretty safely assume that I am one of the few here running servers on OS X.

flexd, almost 12 years ago

I use supervisord now; before that I used mon/mongroup [1], which is just a tiny C program to monitor stuff.

I have also used god at some point, but I kept having trouble. I can't remember exactly what was wrong, but it never quite worked correctly for me. Probably PEBCAK.

[1] https://github.com/jgallen23/mongroup

eldavido, almost 12 years ago

upstart, because I wouldn't use a job control system in prod that isn't included with the base distribution (do any base Unix distros use monit or supervisord?). It's just too much useless work to rewrite job control logic for daemons when the OS already gives it to you, and I've been quite surprised by the feature completeness of upstart.

brokenparser, almost 12 years ago

^C, ^Z, ps, fg, bg

wusher, almost 12 years ago

Bluepill: https://github.com/arya/bluepill

pranavrc, almost 12 years ago

Histogram so far: http://quickhist.onloop.net/monit=75,supervisord=107,daemonize=2,runit=30,perp=2,DJB%27s%20daemontools=33,systemd=54,god=18,upstart=76/Unix%20process%20management%20tools%20-%20Hacker%20News%20Poll%20Jul%20%2713

calpaterson, almost 12 years ago

I use upstart, but am not happy with it for a number of reasons. Two important ones: "restart" does not reread the configuration file, and the DSL is poorly done (the "respawn" stanza and others).

I haven't looked recently at alternatives, but I'm open to it.

praguebakerr, almost 12 years ago

Nagios + a few licences of New Relic

loony, almost 12 years ago

I'm using htop. Very easy, but maybe not enough features for what you are looking for?

Qantourisc, almost 12 years ago

For daemons: none of the above (init); I then monitor them with Zabbix. I assume services don't crash, and hey, they don't (not that I know of, in any case).

Unless you really did mean process and not daemon, in which case it's supervisord.

epynonymous, almost 12 years ago

Personally I've had great success with supervisord, no success with god, and good experiences with monit. But I'm curious: whatever happened to good ol' Linux watchdog?

anderspetersson, almost 12 years ago

Currently using upstart, only because it's the default in Ubuntu.

kunai, almost 12 years ago

No launchd love? Okay...

chillitom, almost 12 years ago

Reactive monitoring via Riemann: http://riemann.io/

We use this to monitor services at the application level.

fiffig, almost 12 years ago

Pacemaker if you need to keep it alive no matter what.

Systemd was pretty stable until user mode flat out broke in 205. I use it to manage my entire desktop session.

mafro, almost 12 years ago

Older HN post for reference: https://news.ycombinator.com/item?id=1368855

kuahyeow, almost 12 years ago

We use https://github.com/willbryant/niet at work.

buster, almost 12 years ago

Supervisord, but I'd love to check out systemd and its capabilities for that in the future.

ricardobeat, almost 12 years ago

forever (https://github.com/nodejitsu/forever/) has worked great for me, but doesn't make any sense if you're not running node.js applications.

maplebed, almost 12 years ago

I've used monit since forever, but have really come to like runit more recently.

aredridel, almost 12 years ago

SMF

D9u, almost 12 years ago

top

kiallmacinnes, almost 12 years ago

upstart does the job nicely.

jusob, almost 12 years ago

monit for active monitoring + munin for trends

luke-stanley, almost 12 years ago

htop

Aqueous, almost 12 years ago

'ps' :)

aredridel, almost 12 years ago

sysv init

aredridel, almost 12 years ago

docker

aredridel, almost 12 years ago

forever

aredridel, almost 12 years ago

launchd

aaronsnoswell, almost 12 years ago

top

asdasf, almost 12 years ago

Kinda sad that "nothing" isn't on the list. I just use software that isn't broken, so it doesn't need to be constantly restarted.
