Can I create an EC2 MySQL slave to an RDS master?

No.

Here’s what happens if you try:

mysql> grant replication slave on *.* to 'ec2-slave'@'%';
ERROR 1045 (28000): Access denied for user 'rds_root'@'%' (using password: YES)
mysql> update mysql.user set Repl_slave_priv='Y' WHERE user='rds_root' AND host='%';
ERROR 1054 (42S22): Unknown column 'ERROR (RDS): REPLICA SLAVE PRIVILEGE CANNOT BE GRANTED OR MAINTAINED' in 'field list'
mysql>

Note: this is for MySQL 5.5, which is unfortunately what I’m currently stuck with.

Create CloudWatch alerts for all Elastic Load Balancers

I manage a bunch of ELBs but we were missing an alert on a pretty basic metric: how many errors the load balancer was returning.  Rather than wade through the UI to add these alerts I figured it would be easier to do it via the CLI.

Assuming aws-cli is installed and the ARN for your SNS topic (in my case, just an email alert) is $arn:

for i in `aws elb describe-load-balancers | grep LoadBalancerName | \
perl -ne 'chomp; my @a=split(/\s+/); $a[2] =~ s/[",]//g ; print "$a[2] ";' ` ; \
do aws cloudwatch put-metric-alarm --alarm-name "$i ELB 5XX Errors" --alarm-description \
"High $i ELB 5XX error count" --metric-name HTTPCode_ELB_5XX --namespace AWS/ELB \
--statistic Sum --period 300 --evaluation-periods 1 --threshold 50 \
--comparison-operator GreaterThanThreshold --dimensions Name=LoadBalancerName,Value=$i \
--alarm-actions $arn --ok-actions $arn ; done

That huge one-liner creates a CloudWatch alarm for each ELB that fires when the number of 5XX errors returned by the load balancer exceeds 50 over a 5-minute period, and sends an “ok” message via the same SNS topic when the alarm clears. The for loop creates (or updates) the alarm for every ELB.
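
If you’re on a newer aws-cli, you can probably skip the grep/perl mangling and use the built-in --query flag to pull out the load balancer names. Something like this should be equivalent (a sketch with the same put-metric-alarm arguments, just a cleaner loop; I haven’t tested it against every CLI version):

for i in $(aws elb describe-load-balancers --query 'LoadBalancerDescriptions[].LoadBalancerName' --output text) ; \
do aws cloudwatch put-metric-alarm --alarm-name "$i ELB 5XX Errors" --alarm-description \
"High $i ELB 5XX error count" --metric-name HTTPCode_ELB_5XX --namespace AWS/ELB \
--statistic Sum --period 300 --evaluation-periods 1 --threshold 50 \
--comparison-operator GreaterThanThreshold --dimensions Name=LoadBalancerName,Value=$i \
--alarm-actions $arn --ok-actions $arn ; done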

More info on put-metric-alarm available in the AWS docs.

Tips for recruiters

I’m a pretty lucky guy these days. As a DevOps engineer in NYC my skills are in high demand and recruiters contact me almost every day. As someone who was once unemployed for 6 months I’m grateful to be in this position. That said, there are some requests that go straight to the trash, and some I’ll at least respond to even if I’m not interested. Here are some of the factors that influence my decision:

Does your email look like a generic mail merge/copypasta?

As with all things in life, you need to make an effort. If you’re just spamming everybody with jobs that are listed on LinkedIn or Dice or whatever, there’s no need to talk to you. Like this one, which looks like an Excel mail merge.

Hi,

Our direct client located in New York, NY has a position open for a Release Engineer. A copy of the job description is below.

If you are interested, please send a copy of your resume (preferably in MS Word format) to xxx@yyy.com.

Please be sure to include your rate, location and contact information.

Thanks
Bob

Here’s another one I got via LinkedIn last week:

Subject: Fantastic opportunity for a very cutting edge company in New York City

Dear Evan,

How are you?

I have a client (startup) looking for someone of your background. The location is Manhattan and the funding for this company is off the charts. The pay is great, the benefits are unbeatable, and technology and collaborative environment is off the charts.

Let me know if you or a friend may be interested and I can give you some more details…

Thanks,
Charlie

This is sort of the perfect bad email. For one thing, there’s no information about the company at all: What industry? What technologies? How big is the team? How long have they been around? Are they profitable? For another, there’s no information about the position itself. This same email could be used for an engineer, sales, ops, finance, CEO or janitor.

There are also some words that add no value at all to the email. When describing a job or a company, you should omit the words “exciting,” “awesome,” “amazing,” “cutting edge.” Just tell me the name of the company, maybe with a link to more info about them.

Are you an in-house recruiter or with a headhunting firm?

I know there are good recruiting firms, but I seem not to have worked with any of them. In my experience, “executive search” firms are just concerned with volume – getting people to quit their jobs to go work somewhere else, and then contacting them a year later asking if they want to move again. I’ve had recruiters call me up asking if I was looking to hire anybody, and when I say no they ask if I want to go work somewhere else. If they can’t sell to me, I guess they’ll try and sell me.

For me, the straw that broke the camel’s back was when a recruiter insisted I interview at a place where the job description said “We’re looking for a Ruby expert. You should eat, sleep, and breathe Ruby.” I told the recruiter I didn’t really know Ruby that well, and he insisted that didn’t really matter. I looked into the company’s product and didn’t really like it, but somehow he talked me into going on the interview. It was kind of a disaster: the office was cramped and hot and looked pretty shabby, it was far from any subway station, the interview questions weren’t relevant to the position, and I didn’t like any of the technologies they used. I was uncomfortable and lost what little interest I had about an hour into it. Apparently the feeling was mutual. The recruiter apologized and asked me what I wanted to do next. I never wrote back.

After that ordeal I decided to deal only with in-house recruiters. Personally, I prefer in-house recruiters because they’ve got skin in the game beyond a commission – they’re employees who are committed to seeing the company succeed and are aware of how important it is to land the right person, and would much rather let a seat go empty than fill it with a bad hire. They understand the company culture because they’re part of it. They can sense whether someone will be a good fit on a team because they know everybody on it. They can answer questions about the company without skipping a beat. The job description is more than words on a page to them. The last time I spoke to a recruiter from a staffing firm he assured me he was different, and then all he had to offer me was a menu of 5 companies that he could “get me an interview with.” Well, thanks, but I could do that myself.

I realize a lot of startups don’t want the expense of a full-time recruiter, and I’m probably missing some good opportunities by ignoring these crappy emails, but my experience indicates most of these guys are just going for quantity, sending as many candidates as possible to as many interviews as possible, and don’t much care about quality. Again, I’m sure there are good ones, maybe even most of them are good, but that hasn’t been my experience.

For God’s Sake Stop Calling Me

Email is one thing. I can ignore an email pretty easily. But please don’t call my cell phone (or worse, office phone). If you’re calling during the day, I’m at work, and I don’t want to talk about a new job at work. If it’s after work, well, I’m on my way home on the train and can’t talk, or I’m at home eating dinner and can’t talk. I don’t know how you even got my number in the first place, but if you manage to trick me into answering a call while I’m at my job, you’re not going to get a warm reception. I don’t have a private office, so how am I supposed to have a conversation about switching jobs while I’m at work?

Some recruiters just can’t take a hint. A couple months ago I was on vacation, heading to a Disney Cruise in Florida. As I was approaching Port Canaveral, my phone rang. It was a 646 number (NYC) so I figured it was a recruiter and let it ring out. A couple minutes later they called back and didn’t leave a voicemail. A couple minutes later, another call. I didn’t recognize the number but I was worried it might be someone from work so I answered it. It turned out to be a recruiter and I told her I was about to get on a cruise ship and she could call me back next week just to get her off the phone. Next week came around and sure enough she started calling multiple times a day for over a week. I ended up having to block her number in Google Voice. A couple weeks later, another recruiter from the same firm started calling me from a different number and I ended up blocking him too. Desperation isn’t attractive.

Another problem I’ve encountered is recruiters who are just lousy at their jobs. A few times when I’ve answered the phone, the person on the other end sounds like a deer in the headlights, like now that they’ve got me on the phone they have no idea what to say. When this happens, I picture an intern handed a list of names and phone numbers and told “make 200 calls today or you’re fired.” Out of sympathy I usually let them finish their spiel and then say “thanks, but I’m not looking right now” to get out of it, but this doesn’t seem like an effective strategy and just makes your firm look amateurish.

TL;DR

Basically, if you’re looking to hire engineering talent, you should:

  • Be an expert on the company you’re recruiting for. Ideally this would be the company you work for, but even if you’re a third party, you’d do well to spend a day on site at your client’s office so you can answer questions about the culture, location, nearby food, etc.
  • Do some research on the candidate. Whatever resume you have in your database is probably out of date. Maybe your target has a website or a Github or a LinkedIn that gives some insight as to what they’re up to.
  • Make your email short and sweet. Whether you’re in-house or a placement firm, the email should give the basic facts: What’s the name of the company (duh)? Where are they located? Are they profitable? How big is the team? What’s the org chart look like – to whom would they report? What technologies do they use? What’s the ballpark compensation?
  • Not annoy anybody. If you send somebody an email and they don’t respond, they’re not interested. If you send them 10 emails about 10 different jobs and they don’t respond, they’re just not that into you. Give it a break. Definitely don’t “call to follow up” if they don’t respond to your email.

The general theme here is “don’t waste anybody’s time.” Don’t send me an email full of intrigue or try and sell me. Like when buying a house, the company/position should sell itself. Just give me the necessary info and don’t bother me.

Disclaimer: this post is just my opinion, and has nothing to do with my employer.

Goodbye, pg_dump

I’ve been a Postgres user and administrator for a while. Over the years, my views on backups have evolved.

Originally, like most people, I started out with good old pg_dump. With a reasonably small database (under 50 GB) dumping to a flat text file is a fine option. I’d generally do something like pg_dump -Upostgres dbname | gzip > dbname.sql.gz to compress it on the fly and save space. For years this seemed perfect: dumping the entire database in a single transaction into a single file that can be restored anywhere.
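
For reference, the whole round trip looked something like this (dbname is a placeholder, and the restore assumes an empty target database):

# dump, compressing on the fly
pg_dump -Upostgres dbname | gzip > dbname.sql.gz
# restore into a freshly-created, empty database
gunzip -c dbname.sql.gz | psql -Upostgres dbname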

But as my databases grew larger and larger, the time it took to do a pg_dump grew as well. At a previous job, the database grew to nearly 2TB and the pg_dump took nearly 18 hours. By that point we’d already changed the pg_dump schedule from daily to weekly, then to three times a month, and finally to semi-monthly. Not only was it slow, but since it operated in a single transaction it wreaked havoc with normal database operation for any queries that needed locks on tables the dump had locked.

Moving the database from a physical RAID to a volume on our SAN gave us the opportunity to use LUN snapshotting rather than pg_dump (I just remembered I already wrote about that here). This let us move to a monthly pg_dump plus more frequent snapshot-level backups that took up very little space. This was ideal on Compellent since the snapshots auto-expire after however long you specify.

When I started at Yodle we were doing nightly pg_dumps, and pretty soon we ran into the same problems I’d seen at Didit with the dump itself interfering with normal DB operation: when I started, the dump would kick off at midnight and run until 7 or 8 AM, and after a few months it would still be running at noon. We discussed moving to WAL archiving and making a base backup to NFS, but that would require a pretty massive amount of space, and as anybody who uses “enterprise storage” knows, that’s not something you want to do. We discussed building a whitebox file server for backups, but nobody was really in love with that option; we’re trying to reduce our reliance on physical machines as much as possible. We talked about pushing it all to S3, but that seemed rather difficult.

When I attended NYC PgDay earlier this year, there was lots of discussion about WAL-E. I had never heard of WAL-E, so I looked it up and was impressed. Basically, WAL-E archives WAL to S3, compressing and PGP-encrypting it first. It also handles pushing the base backup to S3, likewise compressed and encrypted. This was just what we were looking for. We set it up and, amazingly, it worked perfectly. After a few weeks (and after confirming we could restore from the WAL-E backups) we moved our pg_dump to weekly, on the weekend when it doesn’t interfere with any user processes. We do a WAL-E base backup every 3-4 days or so and retain 3 of them. We retain all the WAL, so we can restore the DB to any point within the last ~10 days if needed. The best part is that it’s faster than pg_dump, and since the base backup doesn’t operate in a transaction (it’s a filesystem-level backup rather than an application-level backup) it doesn’t mess with user queries. There’s of course elevated IO during this time, but our SAN has more than enough bandwidth.
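
The setup itself is pretty minimal. Roughly, it looks like this (a sketch only; the bucket, GPG key and data directory paths here are made up, and the AWS credentials live in an envdir directory as the WAL-E docs suggest):

# postgresql.conf -- ship every completed WAL segment to S3 via wal-e:
#   wal_level = archive
#   archive_mode = on
#   archive_command = 'envdir /etc/wal-e.d/env wal-e wal-push %p'
#
# /etc/wal-e.d/env holds one file per variable: AWS_ACCESS_KEY_ID,
# AWS_SECRET_ACCESS_KEY, WALE_S3_PREFIX (e.g. s3://my-backup-bucket/pg), WALE_GPG_KEY_ID

# from cron (as the postgres user), every few days: push a new base backup...
envdir /etc/wal-e.d/env wal-e backup-push /var/lib/pgsql/9.2/data

# ...and prune old ones, keeping the 3 most recent plus the WAL needed to restore them
envdir /etc/wal-e.d/env wal-e delete --confirm retain 3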

We set up some basic monitoring of S3 (check the age of the most recent WAL and log it in Zabbix) just to ensure the backups are actually happening, and we’re at the point where we’re discussing moving pg_dump to monthly, or simply not doing it at all. Overall, WAL-E has been a huge win for us, enabling better, faster backups that don’t interfere with the DB itself and, while not free, aren’t ridiculously expensive. And since it’s all in its own S3 bucket, you can tweak the bucket settings (e.g. enable RRS) to save money, and Amazon tells you exactly how much your backups are costing you.
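
The check itself is nothing fancy. Something like this (a sketch; the bucket name and the Zabbix plumbing are specific to our setup, and wal-e keeps WAL under a wal_005/ prefix at least in the version we run) finds the newest WAL segment pushed to S3 and reports its age in minutes:

#!/bin/bash
# age of the newest WAL segment in S3, in minutes
PREFIX="s3://my-backup-bucket/pg"   # same value as WALE_S3_PREFIX
newest=$(aws s3 ls --recursive "$PREFIX/wal_005/" | sort | tail -1 | awk '{print $1" "$2}')
echo $(( ( $(date +%s) - $(date -d "$newest" +%s) ) / 60 ))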

XFS write speeds: software RAID 0/5/6 across 45 spindles

We’re currently building a new storage server to store low-priority data (tertiary backups, etc). One of the requirements for the project is that it needs to be on cheap storage (as opposed to expensive enterprise SAN/NAS). After some research we decided to build a Backblaze pod. Backblaze used 3TB Hitachi drives in their system, but the ones they listed in their blog post are discontinued and the reviews for all the other 3TB+ drives were terrible, so we went with Samsung ST2000DL004 2TB 7200 RPM drives. Like Backblaze, we’re going with software RAID, but I figured a good first step would be to figure out which RAID level we want to use, and whether we want the mdadm/LVM mish-mosh Backblaze uses or something simpler. For my testing I created a RAID6 of all 45 drives with a single XFS volume on top (XFS’s size limit is ~8 exabytes vs ext4’s 16TB). Ext4 might offer some performance advantages, but the overhead of managing multiple sub-16TB volumes probably isn’t worth it in our case.

So, this is just a simple benchmark comparing RAID0 (stripe with no parity) as a baseline, RAID5 (stripe with 1 parity disk) and RAID6 (stripe with 2 parity disks) across 45 total spindles. For all tests I used Linux software RAID (mdadm).

To test, I have 3 scripts: makeraid0.sh, makeraid5.sh, and makeraid6.sh. Each does what its name implies: the RAID0 uses 43 disks, the RAID5 uses 44, and the RAID6 uses 45, so there are 43 “data” disks in each test. The system is a Protocase “Backblaze-inspired” pod with a Core i3 540 CPU, 8 GB of memory, CentOS 6.3 x64, and 45 of the Samsung 2TB drives mentioned above. We’re just using this box for backups, and it gives us about 79 TB usable, which is still plenty, so the 2TB drives aren’t a big problem.

makeraid?.sh for array creation (the RAID6 variant shown here):

#!/bin/bash

mdadm --create /dev/md0 --level=raid6 -c 256K --raid-devices=45 \
/dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde \
/dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj \
/dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo \
/dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt \
/dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy \
/dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad \
/dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai \
/dev/sdaj /dev/sdak /dev/sdal /dev/sdam /dev/sdan \
/dev/sdao /dev/sdap /dev/sdaq /dev/sdar /dev/sdas

Filesystem:

[root@Protocase ~]# mkfs.xfs -f /dev/md0
meta-data=/dev/md0               isize=256    agcount=79, agsize=268435392 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=21000267072, imaxpct=1
         =                       sunit=64     swidth=2752 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@Protocase ~]# mount /dev/md0 /raid0/
[root@Protocase ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdat2            289G  3.2G  271G   2% /
tmpfs                 3.9G  260K  3.9G   1% /dev/shm
/dev/sdat1            485M   62M  398M  14% /boot
/dev/md0               79T   35M   79T   1% /raid0

RAID0

[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 25.1944 s, 416 MB/s
[root@Protocase ~]# rm -f /raid0/zeros.dat 
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 25.1922 s, 416 MB/s
[root@Protocase ~]# rm -f /raid0/zeros.dat 
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 24.7665 s, 423 MB/s

RAID5

[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 25.2239 s, 416 MB/s
[root@Protocase ~]# rm -f /raid0/zeros.dat 
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 24.7427 s, 424 MB/s
[root@Protocase ~]# rm -f /raid0/zeros.dat 
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 24.2434 s, 433 MB/s

RAID6:

[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 26.9032 s, 390 MB/s
[root@Protocase ~]# rm -f /raid0/zeros.dat 
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 26.5255 s, 395 MB/s
[root@Protocase ~]# rm -f /raid0/zeros.dat 
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 26.4338 s, 397 MB/s

I found it pretty strange that RAID5 seemed to outperform RAID0, but I tested it several times and RAID5 consistently averaged 10-15 MB/s faster than RAID0. Maybe a bug in the kernel? I tried other block sizes for dd, ranging from 60KB to 4MB, but the results were pretty consistent. In the end it looks like I’m going to go with a RAID6 of 43 drives + 2 hot spares, which still yields ~400 MB/s throughput and 75 TB usable:

#!/bin/bash

mdadm --create /dev/md0 --level=raid6 -c 256K -n 43 -x 2 \
/dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde \
/dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj \
/dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo \
/dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt \
/dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy \
/dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad \
/dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai \
/dev/sdaj /dev/sdak /dev/sdal /dev/sdam /dev/sdan \
/dev/sdao /dev/sdap /dev/sdaq /dev/sdar /dev/sdas

Update: A coworker suggested looking into a write-intent bitmap to improve rebuild speeds. After adding an internal bitmap with a 256MB chunk, write performance didn’t degrade much, so this looks like a good addition to the configuration:

[root@Protocase ~]# mdadm -G --bitmap-chunk=256M --bitmap=internal /dev/md0
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 25.8157 s, 406 MB/s
[root@Protocase ~]# rm -fv /raid0/zeros.dat
removed `/raid0/zeros.dat'
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 26.4233 s, 397 MB/s
[root@Protocase ~]# rm -fv /raid0/zeros.dat
removed `/raid0/zeros.dat'
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 26.2593 s, 399 MB/s
[root@Protocase ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdat2            289G  3.2G  271G   2% /
tmpfs                 3.9G   88K  3.9G   1% /dev/shm
/dev/sdat1            485M   62M  398M  14% /boot
/dev/md0               75T  9.8G   75T   1% /raid0
[root@Protocase ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdas[44](S) sdar[43](S) sdaq[42] sdap[41] sdao[40] sdan[39] sdam[38] sdal[37] sdak[36] sdaj[35] sdai[34] sdah[33] sdag[32] sdaf[31] sdae[30] sdad[29] sdac[28] sdab[27] sdaa[26] sdz[25] sdy[24] sdx[23] sdw[22] sdv[21] sdu[20] sdt[19] sds[18] sdr[17] sdq[16] sdp[15] sdo[14] sdn[13] sdm[12] sdl[11] sdk[10] sdj[9] sdi[8] sdh[7] sdg[6] sdf[5] sde[4] sdd[3] sdc[2] sdb[1] sda[0]
      80094041856 blocks super 1.2 level 6, 256k chunk, algorithm 2 [43/43] [UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU]
      bitmap: 2/4 pages [8KB], 262144KB chunk

unused devices: 

Reorganizing photos in 1 line with exiftool

A few years ago I wrote a utility in Java to find all JPG files in a directory and move them into a date-based directory structure like /YYYY/MM/DD/ based on the date the photo was taken, extracted from the exif metadata in the file. Well, apparently that was a huge waste of time, as I just discovered that exiftool, an awesome perl utility I’ve used for years to edit/extract the metadata on the command line, can also do this natively. So my entire program can be replaced with this simple command:

$ exiftool -r '-FileName<CreateDate' -d /targetDir/%Y/%Y-%m/%Y-%m-%d/%Y-%m-%d.%%f.%%e /media/EOS_DIGITAL/

This will copy the files directly off the SD card mounted at /media/EOS_DIGITAL/ into the proper structure in /targetDir/.
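
If you’re paranoid (I am), exiftool can also do a dry run first: writing TestName instead of FileName prints the old and new names without actually touching anything, so you can sanity-check the directory layout before letting it loose on the card:

$ exiftool -r '-TestName<CreateDate' -d /targetDir/%Y/%Y-%m/%Y-%m-%d/%Y-%m-%d.%%f.%%e /media/EOS_DIGITAL/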

Slow HTTP downloads through Cisco ASA 5500

Recently we noticed weird behavior when downloading files from certain sites. The transfer would start out fast (around 10 MB/s), then after a couple of seconds plummet to around 9 KB/s. It didn’t happen for every file or every site: downloads from S3 buckets were still consistently fast. But some downloads I remembered being particularly fast, like the Sun JDK and ISOs from rit.edu that used to saturate our pipe, were now showing this weird fast/slow/fast/slow behavior.

After some poking around I decided to test HTTP versus FTP to see if it could be an application/protocol-level issue. The easiest way to do this was to find a file available via both FTP and HTTP and download it over each protocol; this is where mirrors.rit.edu came in handy. Downloading the same CentOS ISO with cURL, HTTP was noticeably slower than FTP:

[evan@boba 16:07:03 ~]$ curl -O ftp://mirrors.rit.edu/pub/centos/6/isos/x86_64/CentOS-6.2-x86_64-netinstall.iso
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  227M  100  227M    0     0   9.8M      0  0:00:22  0:00:22 --:--:-- 7816k
[evan@boba 16:07:33 ~]$ rm CentOS-6.2-x86_64-netinstall.iso 
[evan@boba 16:07:39 ~]$ curl -O http://mirrors.rit.edu/centos/6/isos/x86_64/CentOS-6.2-x86_64-netinstall.iso
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  227M  100  227M    0     0  5686k      0  0:00:40  0:00:40 --:--:-- 6269k

22 seconds via FTP at 9.8MB/s average, 40 seconds over HTTP at 5.6 MB/s average (which was one of the better HTTP runs).

This was affecting all machines on our network, and had nothing to do with the per-machine iptables rules (verified by flushing all rules). The only thing I could think of that might affect all machines, but only HTTP and not FTP, would be something like packet inspection. Well, it turns out that HTTP packet inspection is on by default on the ASA, so I disabled it as described here:

Zeus(config)# conf t
Zeus(config)# policy-map global_policy
Zeus(config-pmap)# class inspection_default
Zeus(config-pmap-c)# no inspect http
Zeus(config-pmap-c)# write mem
Building configuration...

Since then HTTP transfers have been consistently fast.

Using rrdtool to generate server load & bandwidth graphs

I’ve been using MRTG and routers2.cgi for years to graph the various aspects of a server that warrant monitoring. I’ve long known that they used something called rrdtool to do… well, something, but never had a need or desire to figure out exactly what that was.

But, having just moved my site to a new server, I was curious how the server would handle the load. Rather than set up some behemoth like Nagios or Zabbix, which are full monitoring/alerting suites, I just wanted graphing. As I said, in the past I’ve used MRTG or routers2.cgi for this, but both were overkill in this case. Since both of them use rrdtool under the hood, I figured that was a good place to look.

The two metrics I want to record are server load and in/out bandwidth. The first step is to create the RRDs (round robin databases). This was done via these commands:

# rrdtool create /mrtg/load.rrd --start N DS:load1:GAUGE:600:0:100 DS:load5:GAUGE:600:0:100 DS:load15:GAUGE:600:0:100 RRA:AVERAGE:0.5:2:800

# rrdtool create /mrtg/eth1.rrd --start N DS:in:COUNTER:600:0:10000000000 DS:out:COUNTER:600:0:10000000000 RRA:AVERAGE:0.5:2:800

A good explanation of what these various fields mean is here. In short, each “DS:” section defines a “column” (for fellow RDBMS users) in the database. The first one has 3 “columns,” named load1, load5, load15, each of which will contain GAUGE data. The second one contains two COUNTER fields, representing the bytes in/out for interface eth1.
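
If you want to double-check what you’ve created, rrdtool info will dump the DS and RRA definitions (and current values) back out:

# rrdtool info /mrtg/load.rrd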

To actually get the data I poll snmpd via this bash script:

#!/bin/bash

rrdupdate /mrtg/load.rrd N:\
`/usr/bin/snmpget -v 2c -c public -Oqv localhost laLoad.1`:\
`/usr/bin/snmpget -v 2c -c public -Oqv localhost laLoad.2`:\
`/usr/bin/snmpget -v 2c -c public -Oqv localhost laLoad.3`

rrdupdate /mrtg/eth1.rrd N:\
`/usr/bin/snmpget -v 2c -c public -Oqv localhost ifInOctets.3`:\
`/usr/bin/snmpget -v 2c -c public -Oqv localhost ifOutOctets.3`

I have that run every 5 minutes via cron. Then to generate the actual graph, I run this script via cron:

#!/bin/bash

rrdtool graph /var/www/html/graphs/load.png \
        -N \
        -E \
        --start now-30hours \
        --title "Load Averages" \
        --width 300 \
        --x-grid MINUTE:60:HOUR:2:HOUR:4:0:%H \
        --height 200 \
        -u 1.0 \
        --lower-limit 0 \
        --vertical-label "Load Avg" \
        --full-size-mode \
-a PNG --title="Load Avg" \
'DEF:load1=/mrtg/load.rrd:load1:AVERAGE' \
'VDEF:load1last=load1,LAST' \
'DEF:load5=/mrtg/load.rrd:load5:AVERAGE' \
'DEF:load15=/mrtg/load.rrd:load15:AVERAGE' \
'AREA:load15#33CC33:15 Min Load Avg ' \
'LINE1:load1#0000ff:1 Min Load Avg ' \
'GPRINT:load1:AVERAGE:"Load1 Avg:%3.2lf"' \
'GPRINT:load1last:Drawn at %Y-%m-%d, %H\:%M:strftime'
#'LINE1:load5#ff00ff:5 Min Load Avg '

rrdtool graph /var/www/html/graphs/eth1.png \
        -N \
        -E \
        --start now-30hours \
        --title "eth1 traffic" \
        --width 300 \
        --x-grid MINUTE:60:HOUR:2:HOUR:4:0:%H \
        --height 200 \
        -u 1000000 \
        --lower-limit 0 \
        --vertical-label "bps" \
        --full-size-mode \
-a PNG --title="eth1 traffic" \
'DEF:eth1in=/mrtg/eth1.rrd:in:AVERAGE' \
'CDEF:eth1inbits=eth1in,8,*' \
'VDEF:eth1last=eth1in,LAST' \
'DEF:eth1out=/mrtg/eth1.rrd:out:AVERAGE' \
'CDEF:eth1outbits=eth1out,8,*' \
'AREA:eth1inbits#33CC33:eth1 in ' \
'LINE1:eth1outbits#0000ff:eth1 out' \
'GPRINT:eth1last:Drawn at %Y-%m-%d, %H\:%M:strftime'

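Both scripts run from cron; the crontab entries look something like this (the paths are just wherever you saved the scripts, so adjust accordingly):

*/5 * * * * /usr/local/bin/rrd-update.sh
*/5 * * * * /usr/local/bin/rrd-graph.sh
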
The final graphs look decent, if not very fancy; I’ll play around with them a bit more:

[eth1 graph]
[load graph]

Load balancing in EC2 with Nginx and HAProxy

We wanted to set up a load-balanced web cluster in AWS for expansion. My first inclination was to use ELB for this, but I soon learned that ELB doesn’t let you allocate a static IP, requiring you to refer to it only by DNS name. This would be OK except for the fact that our current DNS provider, Dyn, requires IP addresses when using their GSLB (geo-based load balancer) service.

Rather than let this derail the whole project, I decided to look into the software options available for load balancing in EC2. I’ve been a fan of hardware load balancers for a while, sort of looking down on software-based solutions without any real rationale, but in this case I really had no choice, so I figured I’d give it a try.

My first stop was Nginx. I’ve used it before in a reverse-proxy scenario and like it. The problem I had with it is that it doesn’t support active polling of nodes: the ability to send requests to the webserver and mark the node as up or down based on the response. As far as I can tell, using multiple upstream servers in Nginx lets you specify max_fails and fail_timeout; however, a “fail” is only counted when a real request comes in. I don’t want to risk losing a real request – I like active polling.
Continue reading “Load balancing in EC2 with Nginx and HAProxy”