How does fluentd resume tailing the apache log if it crashes? Does it maintain the current file position on disk? What if logs are rotated between a fluentd crash and recovery?<p>I've had to solve this problem for Yahoo!'s performance team, and ended up setting a very short log rotation interval and parsing only rotated logs. There's a 5-30 minute delay in getting data out of the logs (depending on how busy the server is), but since we're batch processing anyway, it doesn't matter.<p>An added advantage is that you only need to maintain a list of files you've already parsed, so if the parser/collector crashes, it just looks at the list and restarts where it left off. Smart key selection (i.e., something like IP or userid + millisecond time) is enough to ensure that if you do end up reprocessing the same file (e.g., if a crash occurs mid-file), duplicate records aren't inserted (use the equivalent of a bulk INSERT IGNORE for your db).<p>This scales to billions of log entries a day.
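<p>A rough sketch of the parsed-file list plus the INSERT IGNORE idea, not the actual code we ran. Everything specific here is an assumption made for illustration: SQLite standing in for the db (its spelling is INSERT OR IGNORE), a made-up line format of "ip epoch_ms request", the table names, the access.log.* glob, and rotated files that keep a stable, unique name (e.g. a timestamp suffix).

```python
import sqlite3
from pathlib import Path

# Hypothetical schema: (ip, ts_ms) is the natural key, so re-inserting the
# same record is a no-op. parsed_files is the "already parsed" list.
SCHEMA = """
CREATE TABLE IF NOT EXISTS hits (
    ip      TEXT    NOT NULL,
    ts_ms   INTEGER NOT NULL,
    request TEXT    NOT NULL,
    PRIMARY KEY (ip, ts_ms)
);
CREATE TABLE IF NOT EXISTS parsed_files (name TEXT PRIMARY KEY);
"""


def open_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def parse_line(line):
    # Assumed format: "<ip> <epoch_ms> <request...>"
    ip, ts_ms, request = line.rstrip("\n").split(" ", 2)
    return ip, int(ts_ms), request


def process_rotated_logs(db, log_dir, pattern="access.log.*"):
    """Load rotated files not yet in parsed_files; return the names loaded."""
    done = {row[0] for row in db.execute("SELECT name FROM parsed_files")}
    loaded = []
    for path in sorted(Path(log_dir).glob(pattern)):
        if path.name in done:
            continue
        with path.open() as fh:
            rows = [parse_line(line) for line in fh if line.strip()]
        # Bulk insert; rows whose key already exists are silently skipped.
        db.executemany("INSERT OR IGNORE INTO hits VALUES (?, ?, ?)", rows)
        db.commit()
        # Mark the file only after its rows are committed. A crash before
        # this point just means the file is reprocessed on the next run.
        db.execute("INSERT OR IGNORE INTO parsed_files VALUES (?)", (path.name,))
        db.commit()
        loaded.append(path.name)
    return loaded
```

<p>Run process_rotated_logs(open_db("hits.db"), "/var/log/apache2") from cron (both paths are placeholders). Because the file is marked only after its rows are committed, a crash in between costs one redundant pass over that file, and the key constraint drops the duplicates.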