I’ve been using MRTG and routers2.cgi for years to graph the various aspects of a server that warrant monitoring. I’ve long known that they used something called rrdtool to do… well, something, but never had a need or desire to figure out exactly what that was.
But, having just moved my site to a new server, I was curious how the server would handle the load. Rather than setting up a behemoth like Nagios or Zabbix, which are full monitoring/alerting suites, I just wanted graphing. In the past I’ve used MRTG or routers2.cgi for this, but both were overkill in this case. Since both of them use rrdtool, I figured that was a good place to look.
The two metrics I want to record are server load and in/out bandwidth. The first step is to create the RRDs (round robin databases). This was done via these commands:
# rrdtool create /mrtg/load.rrd --start N \
    DS:load1:GAUGE:600:0:100 \
    DS:load5:GAUGE:600:0:100 \
    DS:load15:GAUGE:600:0:100 \
    RRA:AVERAGE:0.5:2:800
# rrdtool create /mrtg/eth1.rrd --start N \
    DS:in:COUNTER:600:0:10000000000 \
    DS:out:COUNTER:600:0:10000000000 \
    RRA:AVERAGE:0.5:2:800
A good explanation of what these various fields mean is here. In short, each “DS:” section defines a “column” (for fellow RDBMS users) in the database. The first database has three “columns,” named load1, load5 and load15, each of which holds GAUGE data (values stored as-is). The second has two COUNTER fields (ever-increasing counters that rrdtool converts to per-second rates), representing the bytes in/out for interface eth1.
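The RRA part is worth a quick back-of-the-envelope check too. As a rough sketch, assuming rrdtool's default step of 300 seconds (no --step was given in the create commands), the snippet below estimates how much history RRA:AVERAGE:0.5:2:800 keeps. The function name rra_retention_seconds is made up for illustration.

```shell
#!/bin/bash
# rra_retention_seconds STEP STEPS_PER_ROW ROWS
# Prints how many seconds of history one RRA holds.
rra_retention_seconds() {
    echo $(( $1 * $2 * $3 ))
}

step=300   # assumption: rrdtool's default step, since no --step was passed
secs=$(rra_retention_seconds "$step" 2 800)

echo "each row averages $(( step * 2 )) seconds of samples"
echo "retention: ${secs} seconds (about $(( secs / 3600 )) hours)"
```

Under that assumption each stored row is a 10-minute average and the database holds a bit over five days, which comfortably covers a 30-hour graph.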
To actually get the data I poll snmpd via this bash script:
#!/bin/bash
# Common snmpget invocation: SNMP v2c, community "public", print values only.
SNMPGET="/usr/bin/snmpget -v 2c -c public -Oqv localhost"

# 1-, 5- and 15-minute load averages. The update argument must be a single
# word with no spaces: N:<load1>:<load5>:<load15>
rrdupdate /mrtg/load.rrd "N:$($SNMPGET laLoad.1):$($SNMPGET laLoad.2):$($SNMPGET laLoad.3)"

# Byte counters for eth1 (interface index 3 here): N:<in>:<out>
rrdupdate /mrtg/eth1.rrd "N:$($SNMPGET ifInOctets.3):$($SNMPGET ifOutOctets.3)"
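One thing the script doesn't handle is a failed SNMP query, which leaves an empty field in the update string. rrdtool accepts U as an "unknown" value, so a possible hardening step, sketched here with made-up helper names (value_or_unknown and build_update are not part of rrdtool or net-snmp), is to substitute U for anything that isn't a plain number:

```shell
#!/bin/bash
# Echo the argument if it looks like a non-negative number, otherwise "U",
# which rrdtool records as an unknown sample.
value_or_unknown() {
    if [[ $1 =~ ^[0-9]+(\.[0-9]+)?$ ]]; then
        echo "$1"
    else
        echo "U"
    fi
}

# Join the cleaned values into an rrdupdate argument such as N:0.15:0.10:U.
build_update() {
    local arg="N"
    local v
    for v in "$@"; do
        arg+=":$(value_or_unknown "$v")"
    done
    echo "$arg"
}

# Example: the third query returned nothing, so it becomes U.
build_update 0.15 0.10 ""
```

The result could then be passed straight to rrdupdate in place of the hand-built N:... string.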
I have that run every 5 minutes via cron. Then to generate the actual graph, I run this script via cron:
#!/bin/bash
# Load average graph covering the last 30 hours
rrdtool graph /var/www/html/graphs/load.png -N -E \
    --start now-30hours \
    --title "Load Avg" \
    --width 300 --height 200 \
    --x-grid MINUTE:60:HOUR:2:HOUR:4:0:%H \
    -u 1.0 --lower-limit 0 \
    --vertical-label "Load Avg" \
    --full-size-mode -a PNG \
    'DEF:load1=/mrtg/load.rrd:load1:AVERAGE' \
    'VDEF:load1last=load1,LAST' \
    'DEF:load5=/mrtg/load.rrd:load5:AVERAGE' \
    'DEF:load15=/mrtg/load.rrd:load15:AVERAGE' \
    'AREA:load15#33CC33:15 Min Load Avg ' \
    'LINE1:load1#0000ff:1 Min Load Avg ' \
    'GPRINT:load1:AVERAGE:Load1 Avg\:%3.2lf' \
    'GPRINT:load1last:Drawn at %Y-%m-%d, %H\:%M:strftime'
# Disabled for now: 'LINE1:load5#ff00ff:5 Min Load Avg '

# eth1 traffic graph covering the last 30 hours (bytes/s converted to bits/s)
rrdtool graph /var/www/html/graphs/eth1.png -N -E \
    --start now-30hours \
    --title "eth1 traffic" \
    --width 300 --height 200 \
    --x-grid MINUTE:60:HOUR:2:HOUR:4:0:%H \
    -u 1000000 --lower-limit 0 \
    --vertical-label "bps" \
    --full-size-mode -a PNG \
    'DEF:eth1in=/mrtg/eth1.rrd:in:AVERAGE' \
    'CDEF:eth1inbits=eth1in,8,*' \
    'VDEF:eth1last=eth1in,LAST' \
    'DEF:eth1out=/mrtg/eth1.rrd:out:AVERAGE' \
    'CDEF:eth1outbits=eth1out,8,*' \
    'AREA:eth1inbits#33CC33:eth1 in ' \
    'LINE1:eth1outbits#0000ff:eth1 out' \
    'GPRINT:eth1last:Drawn at %Y-%m-%d, %H\:%M:strftime'
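The CDEF lines are written in RPN, so eth1in,8,* just means "eth1in times 8": the COUNTER data source already stores bytes per second, and the multiplication turns that into bits per second. As a rough illustration of the whole chain, here is the same arithmetic done by hand on two raw counter readings. The function name and sample numbers are invented, and the sketch ignores counter wrap-around.

```shell
#!/bin/bash
# bits_per_second PREV_OCTETS CURR_OCTETS INTERVAL_SECONDS
# Rate of change of a byte counter, converted to bits per second.
bits_per_second() {
    local prev=$1 curr=$2 interval=$3
    # Multiply before dividing to limit integer truncation.
    echo $(( (curr - prev) * 8 / interval ))
}

# Two hypothetical ifInOctets readings taken 300 seconds apart.
bits_per_second 1000000 1375000 300
```

With those sample readings the counter grew by 375000 bytes in 300 seconds, so the function prints 10000, i.e. 10 kbps.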
The final graphs look decent, if not very fancy, and I’ll keep playing around with them: