RRDtool is pretty nice, but it has a fair number of scalability issues:<p>* Once you create an RRD file you can't modify it to add or remove metrics (data sources) or change their properties. This makes RRD files relatively inflexible.<p>* Updating RRD files is I/O heavy. Every time an update comes in, the OS must read, modify and write a page.<p>* rrdcached mitigates this somewhat by deferring flushes, but the returns diminish: eventually the incoming write rate pushes the cache flush and filesystem metadata update rate past the maximum IOPS available. You also risk data loss if the power goes out or the OOM killer kills the process.<p>Time-series data access patterns tend to be write-heavy. Storing incoming data first in an append-only log is a big win here; Cassandra and MySQL are both good choices, though you do have to think about the schemata first. And disk is so cheap now that expiration can be an afterthought.
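<p>To make the append-only idea concrete, here is a minimal sketch in Python. It is not how Cassandra or MySQL implement their logs; the record layout, the `AppendOnlyLog` class and the `read_log` helper are all made up for illustration. The point is only that each update becomes a sequential append instead of a read-modify-write of a page somewhere in the middle of a file.

```python
import os
import struct

# Hypothetical fixed-width record, little-endian:
# metric id (uint32), Unix timestamp (uint32), value (float64) = 16 bytes.
RECORD = struct.Struct("<IId")


class AppendOnlyLog:
    """Illustrative append-only sample log; names and layout are assumptions."""

    def __init__(self, path):
        # "ab" opens for appending, so every write lands at the end of the file.
        self._f = open(path, "ab")

    def append(self, metric_id, timestamp, value):
        # No seek and no read of existing data: just one buffered sequential write.
        self._f.write(RECORD.pack(metric_id, timestamp, value))

    def sync(self):
        # Push buffered records to the OS and ask it to flush them to disk.
        # How often to call this is the durability vs. throughput trade-off.
        self._f.flush()
        os.fsync(self._f.fileno())

    def close(self):
        self.sync()
        self._f.close()


def read_log(path):
    """Return all complete (metric_id, timestamp, value) records in the log."""
    with open(path, "rb") as f:
        data = f.read()
    # Ignore a partially written trailing record, e.g. after a crash mid-write.
    usable = len(data) - len(data) % RECORD.size
    return [RECORD.unpack_from(data, off) for off in range(0, usable, RECORD.size)]
```

<p>Usage would be along the lines of `log = AppendOnlyLog("metrics.log")`, then `log.append(1, 1700000000, 0.5)` for each sample, with `log.sync()` called on whatever schedule you can tolerate losing data for. Rollups and expiration can then run later as a separate batch pass over the log.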