Does anyone use flock? I came across it recently and I believe it serves the same purpose; it's very useful for cron tasks.<p>Man page:
<a href="http://linux.die.net/man/1/flock" rel="nofollow">http://linux.die.net/man/1/flock</a><p>Example: <a href="https://ma.ttias.be/prevent-cronjobs-from-overlapping-in-linux/" rel="nofollow">https://ma.ttias.be/prevent-cronjobs-from-overlapping-in-lin...</a>
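For anyone who would rather take the lock inside a Python script than wrap the job in the flock(1) command, here is a minimal sketch of the same idea using the standard library's `fcntl.flock`. The `try_lock` helper and the lock path are hypothetical names for illustration, and this assumes a Unix-like system with a local filesystem.

```python
import fcntl


def try_lock(lock_path):
    """Try to take an exclusive, non-blocking lock on lock_path.

    Returns the open file object on success (keep it open for as long as
    the job runs), or None if another holder already has the lock.
    """
    handle = open(lock_path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


if __name__ == "__main__":
    lock = try_lock("/tmp/my-cron-job.lock")  # hypothetical path
    if lock is None:
        print("previous run still active, skipping")
    else:
        print("lock acquired, doing the work")
        lock.close()  # closing the file releases the lock
```

The kernel drops the lock when the process exits or crashes, so there is no stale PID file to clean up afterwards.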
As someone who has had to write this themselves multiple times, there are a few bits that I consider to be missing:<p>1) Command line verification - does the PID belong to the same kind of process as the one running now? PIDs are reused, so make sure it's the same program (the creation time check helps, but it says nothing about which process wrote the file).<p>2) Process hang detection - has the process actually consumed any CPU ticks in the last minute?<p>3) Infinite loop detection - is the other process stuck processing something uselessly?<p>4) Killing off stuck processes - if 2 or 3 is true, behead it and carry on. Optionally do some form of alerting - stderr is probably fine.<p>Add these, and I would personally find it much more useful.
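Point 1 could be sketched roughly like this. It is not what Highlander does; it assumes a Linux-style procfs where `/proc/<pid>/cmdline` holds the NUL-separated arguments, and `read_cmdline`, `pid_matches` and the `proc_root` parameter are hypothetical names for illustration.

```python
import os


def read_cmdline(pid, proc_root="/proc"):
    """Return the argv of `pid` as a list of strings, or None if it is gone.

    Assumes a Linux-style procfs: <proc_root>/<pid>/cmdline contains the
    arguments separated (and terminated) by NUL bytes.
    """
    try:
        with open(os.path.join(proc_root, str(pid), "cmdline"), "rb") as f:
            raw = f.read()
    except OSError:
        return None  # no such process, or not readable
    if raw.endswith(b"\0"):
        raw = raw[:-1]  # drop the trailing terminator
    if not raw:
        return []  # e.g. kernel threads have an empty cmdline
    return [part.decode(errors="replace") for part in raw.split(b"\0")]


def pid_matches(pid, expected_argv, proc_root="/proc"):
    """True only if `pid` exists and was started with `expected_argv`."""
    return read_cmdline(pid, proc_root) == list(expected_argv)
```

A PID file would then only be trusted when `pid_matches(stored_pid, stored_argv)` holds, which means the script has to record its own argv next to the PID when it writes the file.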
On systemd systems there's an easier alternative to this. You can use the per-user systemd instance (systemctl --user) to install a .timer that activates a .service file. If the .service is still running when the .timer next fires, it will not be started again. Systemd is pretty good at this kind of bookkeeping.
I've had a lot of success using a key in redis with a TTL value instead of a local PID file. Although adding redis to the picture introduces a large new point of failure, I can then have the cronjob set up on multiple instances and still ensure it only runs once across all of them.<p>I'm sure there is a simpler way of doing this. How have other people solved redundantly ensuring that a single cronjob runs?
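A rough sketch of the TTL-key approach, in case it helps. It assumes a redis-py style client whose `set()` accepts `nx=` (only set if the key is absent) and `ex=` (expiry in seconds); `try_acquire` and the key name are hypothetical.

```python
import uuid


def try_acquire(client, key, ttl_seconds):
    """Try to claim `key` for ttl_seconds; return a token on success, else None.

    `client` is assumed to behave like a redis-py client: set(..., nx=True)
    returns a truthy value only when the key did not already exist.
    """
    token = uuid.uuid4().hex  # identifies this particular run
    if client.set(key, token, nx=True, ex=ttl_seconds):
        return token
    return None
```

With redis-py installed this would be called as something like `try_acquire(redis.Redis(), "cron:nightly-report", 3600)`, running the job only when a token comes back. The TTL needs to be longer than the slowest expected run, otherwise a second instance can claim the key while the first is still working.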
I keep running into the same problem: I write Python scripts that run from cron, and a run sometimes takes longer than the interval before the next one starts (e.g., an hourly cron job whose run takes 2 hours to complete). In that scenario, you want the first run to finish before a second one starts. If Highlander sees that your job is already running, it returns immediately, skipping that cron run.
Windows has had by far the best solution to this since Windows Vista/Server 2008: full instance control provided by the OS, fully scriptable with PowerShell, desired state configuration (like Ansible), clustering, logging, and fully event-driven triggers (i.e., it can fire on network/OS events), with GUI, WMI, script and COM integration.<p>I genuinely wish someone would knock out something like this. systemd is part of the way there but not quite far enough.
Always bring this up anytime I see "Highlander" being used for a project name.<p><a href="http://blogs.msdn.com/b/oldnewthing/archive/2014/09/23/10559783.aspx" rel="nofollow">http://blogs.msdn.com/b/oldnewthing/archive/2014/09/23/10559...</a>