31 December 2009

You can't manage what you can't measure.

Server health can fluctuate as a machine receives, processes, and sends data, but you won't know unless you monitor it. Without collecting system health information over a period of time, you cannot tell what is a normal state of the system and what is an emerging problem.
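To make the idea of a baseline concrete, here is a minimal Python sketch. It is not how collectd works; it only illustrates why history matters. The function name `is_unusual`, the three-standard-deviation threshold, and the load-average numbers are all made up for illustration.

```python
from statistics import mean, stdev


def is_unusual(history, latest, threshold=3.0):
    """Return True if `latest` is more than `threshold` standard
    deviations away from the mean of the earlier samples in `history`."""
    if len(history) < 2:
        # Without at least two samples there is no baseline to compare against.
        return False
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat history: any change at all is a deviation.
        return latest != baseline
    return abs(latest - baseline) / spread > threshold


# Hypothetical load-average samples collected over time (invented numbers).
load_history = [0.42, 0.51, 0.38, 0.47, 0.55, 0.44, 0.49, 0.40]

print(is_unusual(load_history, 0.52))  # within the normal range
print(is_unusual(load_history, 3.10))  # far outside the normal range
```

With no history the function can only shrug, which is the point: a single reading tells you nothing until you have enough earlier readings to know what normal looks like.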

One of my favorite tools for system monitoring is collectd (www.collectd.org). Simple and easy to set up, it covers many of the important aspects of system health and is highly customizable. The collected data is stored in rrdtool format and can be presented as graphs in the typical fashion.

I have found the collectd documentation easy to follow, so I won't cover configuration details here. I wanted to mention collectd because one of my next articles will rely on it as a measuring tool.