TechEcho

10 comments

jbyers over 17 years ago

Consider collectd (http://collectd.org/). Unlike a lot of the usual suspects, collectd is a daemon that records the usual server health stats every 10 seconds into RRD files. After running it for two years on a few dozen systems, it has never failed or caused undue load on its own. We often see events that would have gone completely unnoticed in a 5-minute monitoring window.
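To illustrate the sampling-interval point, here is a minimal sketch. It is not collectd code, the readings are made up, and it assumes the coarser monitor records one averaged value per 5 minutes. A 30-second CPU spike stands out in 10-second samples but barely moves the 5-minute average.

```python
# Hypothetical data: one CPU-usage reading every 10 seconds for 5 minutes (30 samples).
# The host idles at 5% and spikes to 95% for 30 seconds (3 samples).
samples = [5.0] * 30
samples[12:15] = [95.0, 95.0, 95.0]


def five_minute_average(readings):
    """Value a poller would store if it kept one average per 5-minute window."""
    return sum(readings) / len(readings)


def spikes(readings, threshold=80.0):
    """Indices of the 10-second samples at or above the threshold."""
    return [i for i, value in enumerate(readings) if value >= threshold]


average = five_minute_average(samples)
spike_indices = spikes(samples)

print(f"5-minute average: {average:.1f}%")
print(f"10-second samples over threshold: {spike_indices}")
```

With these made-up numbers the average stays far below any sensible alert threshold, while the fine-grained samples show exactly when the spike happened.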

dazzawazza over 17 years ago

I've just started testing Nagios (http://www.nagios.org/).

It looks like complete overkill for a single server, and I don't know how useful it is, but it does look promising. It certainly looks like it would scale to tens if not hundreds of machines easily.

It includes an alert structure so that different events trigger different actions. For example, if the database stops responding, email the DBA; if it's a router, email the network admin, etc.

Again, I can't vouch for it over the long term as it's only been a week or so of testing, but I can't complain atm.

It's a PITA field to research and I'm trying to avoid the 'roll my own' urges, as I'd quite like to write it ;)

Anyone else got any ideas? There is a Python-based monitoring application out there somewhere that I stumbled upon about 6 months ago with a great plugin API and neato graphs, but I can't find it again :( I blame Google and not my incompetence :)
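The "different events trigger different actions" idea can be sketched roughly like this. This is not Nagios configuration or its API, just a plain illustration of the routing; the service types, addresses, and function names are all hypothetical.

```python
# Hypothetical routing table: which contact hears about which kind of failure.
ROUTES = {
    "database": "dba@example.com",
    "router": "netadmin@example.com",
}
# Hypothetical fallback for anything without an explicit route.
DEFAULT_CONTACT = "oncall@example.com"


def contact_for(service_type):
    """Pick the recipient for a failing service type, falling back to a default."""
    return ROUTES.get(service_type, DEFAULT_CONTACT)


def build_alert(service_type, host, detail):
    """Assemble the alert as plain data; actually sending it is left out of the sketch."""
    return {
        "to": contact_for(service_type),
        "subject": f"[ALERT] {service_type} problem on {host}",
        "body": detail,
    }


alert = build_alert("database", "db1", "connection timed out")
print(alert["to"], "-", alert["subject"])
```

Keeping the routing in a table means adding a new kind of event is a one-line change rather than another branch in the alerting code.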

8plot over 17 years ago

I like munin better than nagios: http://munin.projects.linpro.no/

staunch over 17 years ago

Cacti with SNMP is a very good single-app solution for keeping an eye on server health trends.

http://cacti.net/screenshots.php

Smokeping is quite useful for monitoring network and basic HTTP request response times.

http://oss.oetiker.ch/smokeping/

Nagios is good for actively monitoring services and alerting you when things go wrong. I generally use those and create tons of additional monitoring tools that generate reports/charts.

patrickg-zill over 17 years ago

Are you talking about site load or server load? If you want to measure the load on a server, look into the sar utility, which is included in Linux and all other Unixes. It will take snapshots of data at 10-minute intervals and store them. Other tools will take that data and turn it into a bar graph or other visualization.

reitzensteinm over 17 years ago

I'd like to know this too. I suspect Rock Solid Arcade is, at many points, outgrowing my cacheless, hit-the-database-on-every-page setup. I started out with a cron job monitoring top every minute, and many times the usage hits 50%. I'm using Django, so the GIL means that's at full capacity, but I have no idea how instantaneous that figure is.

I got some helpful advice on the Django chat room to monitor Apache, but really what I'd like is for some warning to be dumped to a log if a connection queue backs up for more than a second (i.e. the server is at full capacity). That'll be time to cache/upgrade. :)
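A rough sketch of the "dump a warning to a log" idea, meant to be run from cron. It uses the 1-minute load average as a stand-in, since watching the actual connection queue would need Apache-specific instrumentation that is not covered here. The threshold and the function name `load_warning` are assumptions for illustration; `os.getloadavg()` is Unix-only.

```python
import logging
import os


def load_warning(load_1min, cpu_count, threshold=1.0):
    """Return a warning string if load per core is at or above the threshold, else None."""
    per_core = load_1min / cpu_count
    if per_core >= threshold:
        return (
            f"load {load_1min:.2f} on {cpu_count} core(s) "
            f"is at or above {threshold:.2f} per core"
        )
    return None


def main():
    # Logs to stderr; redirect it to a file from the cron entry if you want a log file.
    logging.basicConfig(level=logging.WARNING, format="%(asctime)s %(message)s")
    load_1min, _, _ = os.getloadavg()  # 1-, 5- and 15-minute load averages
    message = load_warning(load_1min, os.cpu_count() or 1)
    if message:
        logging.warning(message)


if __name__ == "__main__":
    main()
```

The load average is smoothed over a minute, so it is less jumpy than a single reading from top, but it still will not tell you about sub-second queue backups.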

lemuel over 17 years ago

Nagios scales, and is easy. You can also write scripts so that if your daemon/application dies, it will be restarted.

davidw over 17 years ago

I used nagios at the last place I worked with a number of servers, and I was pretty happy with it, although I could very easily imagine something better.

darius over 17 years ago

Thanks everybody. I appreciate all the good links. I'll go experiment a bit with them.

brooksbp over 17 years ago

Google Analytics

I believe using anything relatively more intensive is not worth the cycles. Even 'top' is taxing at times.

Ask YC: What tools are you using to monitor a site's load?