The real culprit here is "fsync is slow".
A couple of months ago I wrote a little test that wrote a 1MB file in increments of 16-byte write() calls, each followed by either fsync(), fdatasync(), or nothing. That is, basically:<p><pre><code> while(written < 1MB) {
written += write(fd,buf,chunksz);
//fsync(fd);
//fdatasync(fd);
}
</code></pre>
Here are some of the times it took to write a 1MB file (Fedora 14, old hard drive):<p><pre><code> chunksz = 16:
No sync'ing: 926ms
fdatasync: 727114ms
fsync: 3024498ms
(yes, THAT slow)
chunksz = 256:
No sync'ing: 65ms
fdatasync: 53786ms
fsync: 191553ms
chunksz = 1024:
No sync'ing: 20ms
fdatasync: 20232ms
fsync: 48039ms</code></pre>
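For anyone who wants to repeat the measurement, the loop above could be fleshed out roughly as follows. This is only a sketch under assumptions: the function name `timed_write`, the `sync_mode` enum, and the use of `clock_gettime` for timing are mine, not taken from the original test, and it assumes a Linux-like system where `fdatasync()` is available.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* What to do after each write(); mirrors the three cases in the test above. */
enum sync_mode { SYNC_NONE, SYNC_FDATASYNC, SYNC_FSYNC };

/*
 * Write `total` bytes to `path` in `chunk`-byte write() calls, optionally
 * syncing after every call. Returns elapsed milliseconds, or -1.0 on error.
 * (Hypothetical helper; names are illustrative.)
 */
double timed_write(const char *path, size_t total, size_t chunk,
                   enum sync_mode mode)
{
    char buf[4096];
    struct timespec t0, t1;
    size_t written = 0;

    assert(chunk > 0 && chunk <= sizeof buf);
    memset(buf, 'x', sizeof buf);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    while (written < total) {
        size_t want = total - written < chunk ? total - written : chunk;
        ssize_t n = write(fd, buf, want);
        if (n <= 0) {               /* never add an error code to the counter */
            close(fd);
            return -1.0;
        }
        written += (size_t)n;

        if (mode == SYNC_FSYNC && fsync(fd) != 0) {
            close(fd);
            return -1.0;
        }
        if (mode == SYNC_FDATASYNC && fdatasync(fd) != 0) {
            close(fd);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);

    return (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
}
```

Calling something like `timed_write("test.bin", 1 << 20, 16, SYNC_FSYNC)` would correspond to the slowest case above. Assuming 1MB means 1,048,576 bytes, the fsync rows work out to roughly 46 to 47 ms per fsync() call at every chunk size (3024498 ms / 65536 calls, 191553 ms / 4096 calls, 48039 ms / 1024 calls), so the total time is driven by the number of syncs rather than by the amount of data.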
This paragraph is in dire need of citation:<p><i>If your apps have some heavy logging to do, and you can't afford some more expensive setup, then you really should forget about syslog, because normally it doesn't scale well if used with high traffic web applications. It's best used for things like logging done by daemons and system components that do not suffer from the load that a web app with decent traffic can have.</i><p>I've seen syslog incite a holy war on HN recently, with one camp espousing that syslog (really rsyslog and syslog-ng) does not scale. As an interested third party, it'd be very helpful to have references to articles or blog posts detailing just how it didn't scale, or at least some first-hand commentary. The article describes an issue with the syslogd daemon in which it grinds to a halt at a mere 10 requests sent amongst 5 concurrent threads. That kind of issue is not going to burn you months down the road when your production deployment refuses to scale. Any stories of the newer open source implementations of syslog failing or severely degrading at production load, or conversely, syslog success stories?
Syslog* doesn't scale at some point, but that point is at O(10) machines, not at 10 hits per second. Configuration is often key.<p>I'm on rsyslogd or syslog-ng at this point, depending on the machine. Configurations vary, but they both seem to hold up well to consistently writing 100 messages/sec (and peaks of 20k/minute when I get portscanned) on a VM without causing stuff to break.<p>The no-fsync option is important, and really, fsync is only important for kernel logs in the event of a crash, and not a whole lot else. Not mail, not messages, not syslog, and certainly not debug. You get (roughly) a max of 100 fsyncs a second unless you're spending extra money on your disks. That's a really damn small budget given that you have megabytes and gigacycles for other things.<p>The default configuration (at least in Debian/Ubuntu) writes entries to lots of logs. There are catchalls, there's mail.info, mail.err, mail.log, daemon, and such. You need to prune that down and make sure that you're not writing your debug logs in 2 or 3 places with fsync.<p>I'm finding munin and my log greppers are a whole lot more demanding on the box than syslog.
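To illustrate the fsync budget point: if a logger syncs once per batch of messages instead of once per message, the number of fsyncs per second drops by the batch size. The following is a minimal sketch of that idea, not how rsyslogd or syslog-ng actually implement it; `struct batch_log` and the `batch_log_*` functions are hypothetical names invented for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical batching logger; all names here are illustrative only. */
struct batch_log {
    int fd;               /* open log file descriptor */
    unsigned sync_every;  /* fsync once per this many messages */
    unsigned pending;     /* messages written since the last fsync */
    unsigned long syncs;  /* total fsync() calls, kept for inspection */
};

int batch_log_open(struct batch_log *l, const char *path, unsigned sync_every)
{
    assert(sync_every > 0);
    l->fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    l->sync_every = sync_every;
    l->pending = 0;
    l->syncs = 0;
    return l->fd < 0 ? -1 : 0;
}

/* Force everything written so far to disk. */
int batch_log_flush(struct batch_log *l)
{
    if (fsync(l->fd) != 0)
        return -1;
    l->pending = 0;
    l->syncs++;
    return 0;
}

/* Append one message (caller supplies the trailing newline). */
int batch_log_write(struct batch_log *l, const char *msg)
{
    size_t len = strlen(msg);
    if (write(l->fd, msg, len) != (ssize_t)len)
        return -1;
    if (++l->pending >= l->sync_every)
        return batch_log_flush(l);
    return 0;
}

/* Sync any unsynced tail, then close the file. */
int batch_log_close(struct batch_log *l)
{
    int rc = l->pending > 0 ? batch_log_flush(l) : 0;
    if (close(l->fd) != 0)
        rc = -1;
    return rc;
}
```

With `sync_every` set to 10, a steady 100 messages/sec would cost about 10 fsyncs/sec instead of 100. The trade-off is that a crash can lose up to `sync_every - 1` of the most recent messages, which is why this kind of batching suits debug or access logs better than logs that must survive a crash.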
Which distro is still using syslog? Even Red Hat, an outfit that is super-conservative, moved to rsyslog.<p>One of the first things I used to do was compile my own rsyslog and disable syslog. Haven't had to do that for a while.
There are logs that are useless (but they're logged because someone's too lazy to turn them off), logs that are somewhat useful (but nobody will grieve their loss) and logs that are important and had better be reliably stored.<p>Syslog (if we talk about networking - not with UDP, but with a reliable TCP or SCTP-based protocol) and fsync() after each write is a <i>reliable</i> solution.<p>If you're logging mostly pointless runtime data (like webserver access logs for static files, which, most of the time, nobody ever cares about) with reliable syslog - you're doing it <i>wrong</i>. (That's why webservers don't generally use syslog, but write directly to files.) If you're logging important transactions with an unreliable logging system - you're doing it <i>wrong</i>, too.<p>There's nothing wrong with syslog. Just use the right tool for the right job.
We do a lot of Ruby apps, avoiding Rails as much as possible, but Redmine is one we use often. In all of our apps we use the Runit process supervisor to launch them, with svlogd logging each individual process. It has proven more than up to the task of keeping up with the busiest of our logging needs, and has many advanced features, simply configured, that other loggers lack. Worth checking out as an alternative, as you don't have to replace your syslogger to use it. See <a href="http://rubyists.github.com/2011/05/02/runit-for-ruby-and-everything-else.html" rel="nofollow">http://rubyists.github.com/2011/05/02/runit-for-ruby-and-eve...</a> for more info.
I don't get why the author explicitly mentions RFC 5424, then says:<p><i>...message packets are limited to 1024 bytes</i><p>linking to the older RFC 3164 and not to RFC 5424[1], which states:<p><i>There is no upper limit per se. Each transport mapping defines the minimum maximum required message length support, and the minimum maximum MUST be at least 480 octets in length.</i><p>Servers like syslog-ng <i>do</i> support messages larger than 1024 bytes.<p>[1] <a href="http://tools.ietf.org/html/rfc5424#section-6.1" rel="nofollow">http://tools.ietf.org/html/rfc5424#section-6.1</a>
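To make the size point concrete, here is a rough sketch of building an RFC 5424-style line in C. The layout (`<PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG`, with `-` as the nil value and PRI computed as facility * 8 + severity) follows the RFC; the function names, and the choice to leave MSGID and STRUCTURED-DATA empty, are assumptions made for this example. Nothing in the format itself caps the message at 1024 bytes; whether a given transport or receiver accepts a longer line depends on its configuration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* PRI value: facility * 8 + severity (facility 0..23, severity 0..7). */
int syslog_pri(int facility, int severity)
{
    assert(facility >= 0 && facility <= 23);
    assert(severity >= 0 && severity <= 7);
    return facility * 8 + severity;
}

/*
 * Format an RFC 5424-style line into `out` (hypothetical helper).
 * MSGID and STRUCTURED-DATA are written as "-", the nil value.
 * Returns the length the full line needs (snprintf semantics), so a
 * return value >= cap means the output was truncated.
 */
int format_rfc5424(char *out, size_t cap, int facility, int severity,
                   const char *timestamp, const char *host,
                   const char *app, long procid, const char *msg)
{
    return snprintf(out, cap, "<%d>1 %s %s %s %ld - - %s",
                    syslog_pri(facility, severity),
                    timestamp, host, app, procid, msg);
}
```

For example, facility 16 (local0) with severity 6 (informational) gives a PRI of 134, so the line starts with `<134>1`. The caller decides the buffer size, so a 2000-byte message body is formatted just as readily as a 100-byte one.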