Using WAL archiving & Compellent snapshots for PostgreSQL backups

I seem to have what may be an irrational dislike for differential backups in general. No matter what system I’m backing up, I feel far more confident in doing complete backups than differential ones. The significant exception to this is rsync, but even rsync does “full” backups of the files that have changed. Even so, I usually add the -W flag to rsync so it moves the entire file if it’s been modified. I guess I just like knowing there’s a single file that contains the entire database rather than having to restore a huge file and then replay a bunch of differential files in sequence.

This has worked for a long time, even with our PostgreSQL DB, which I’ve been backing up with good ol’ pg_dump, but it’s gotten to the point that the DB backup takes over 12 hours now, during which performance is seriously degraded. With the recent migration to the new server, overall performance seems significantly better overall, but performance during the nightly pg_dump is much worse. Rather than trying to troubleshoot it, I think it’s time I bit the bullet and move to WAL archiving to enable differential backups, and stop doing a full backup every night.

The PostgreSQL docs are pretty great at explaining how to do this. Basically:

  1. Configure WAL archiving to copy the WAL files to another server as soon as they’re complete.
  2. Issue the pg_start_backup() command.
  3. Perform a filesystem-level backup of the DB (usually /var/lib/pgsql on RedHat/CentOS distros).
  4. Issue the pg_stop_backup() command.

That’s basically it. The backup from step 2 plus the archived WAL files will allow you to recover to any point in time after issuing the pg_stop_backup() command. So if you complete your backup at 2 AM on Sunday and your DB crashes on Wednesday at 6 PM, you can restore to any point between 2 AM Sunday and whatever your most recently archived WAL file is (presumably within a few minutes of the crash at 6 PM Wednesday). If you decided you wanted to restore to 12:01 AM on Wednesday, you can do that using the recovery_target_time setting. My aversion to differential backups aside, this is pretty awesome.

As I said, the Postgres docs do a good job of covering this procedure completely. The only thing I have to add is how I did this using a SAN-level snapshot as my “filesystem backup.” I’d never done any scripting with Compellent before and it turned out to be pretty easy. From the Knowledge Center, under Software download the latest version of Command Utility (I downloaded 5.4.1). The utility itself is just a JAR, you’ll need Java installed to run it.

I considered doing a real filesystem backup of the DB (love that rsync) but the problem was that the DB is currently 1.2 TB and gobbling that much space on the NAS wasn’t very appealing. Compellent replays (snapshots) are just deltas, and I can easily store a week worth of backups for much less than (7 * 1.2 = ) 8.4 TB.

I wrote a crappy bash script to do everything, included below:

I retain the WAL files for 8 days and the snapshots for 7 days, but I may adjust this since the WAL files themselves consume a lot of space – about 30-40 GB per day. Though this is still less than the gzipped pg_dump I had been doing, which was about 85 GB per day.

I’ve cronned the script to run at 2 AM and so far it appears to work. Compellent replays are created almost instantly, so the backup script completed in about 10 seconds, which includes 6 seconds of sleep, which probably aren’t necessary. Considering that the pg_dump method took 12+ hours to complete, 10 seconds is immeasurably better. Well, I guess it is measurable, you just need to divide 12 hours by 10 seconds.

I’m pretty happy with this so far. The improved performance far outweighs any irrational philosophical objection I may have had to differential backups. Buuuut I’m still going to do pg_dumps on Saturdays.

Compellent Doesn't Suck

I noticed a bunch of people landing on this site by searching for “compellent sucks.” I just want to avoid any confusion: Compellent doesn’t suck. Now that the pain of spending the money to expand our Compellent SAN is in the past, I am back to being in love with the product. The only gripe I’ve really ever had with Compellent is the price, and as Ben Franklin said:

The bitterness of poor quality remains long after the sweetness of low price is forgotten.

Compellent Doesn’t Suck

I noticed a bunch of people landing on this site by searching for “compellent sucks.” I just want to avoid any confusion: Compellent doesn’t suck. Now that the pain of spending the money to expand our Compellent SAN is in the past, I am back to being in love with the product. The only gripe I’ve really ever had with Compellent is the price, and as Ben Franklin said:

The bitterness of poor quality remains long after the sweetness of low price is forgotten.

The bright side of Compellent

Since I was bemoaning Compellent’s pricing recently I figured it would be unfair of me not to highlight the upside. Their tagline is (or was when we purchased it) “The only SAN so sophisticated it’s simple.” While I can’t say whether they’re the ONLY one, the idea is definitely true. This is the first SAN I’ve ever used, and aside from the learning curve for iSCSI itself (targets, spinup delay, etc.) it’s totally simple and intuitive. Create LUNs, map them to servers. Don’t worry about things like RAID levels or hot disks. We’re into our second year with Compellent and it’s definitely lived up to its promise of simplicity.

I don’t know how much management the average SAN requires but our sales rep recently asked me how much time per week we spend managing the SAN. I crinkled my brow, because I don’t really spend any time managing the SAN. I’ve logged in to the web interface a lot more over the past few weeks than I normally do because the SAN filled up quickly due to our experimentation with Hadoop, and I wanted to make sure we didn’t get to 100% before I was able to order more disks. But aside from that incident I think the only times I’ve logged in to the management console have been to add a LUN or map a datastore to a new ESX host.

I was reminded about this simplicity when we finally added the disks last week. We went to the datacenter Wednesday to move some servers around in the racks to ensure there would be enough power in the SAN rack for the new enclosure (16x 2TB disks). We also updated a firmware update for the SAN (required so it would recognize the new 2TB drives). We have redundant controllers, so we were told there shouldn’t be any downtime. I don’t tend to trust those types of statements – if someone says something will be down for an hour I budget for 4. If it’s 8 hours I budget for a day. If it’s zero I just think they’re lying and it’s going to explode and kill people.

So all things considered I was rather impressed. We have dual controllers, so the update was installed on one controller first, and that controller rebooted. When it rebooted, the iSCSI traffic did actually fail over properly to the secondary controller. This wasn’t completely flawless – the console on some of the machines showed some iSCSI errors, but the machines seemed to be working fine (I rebooted them just to be safe). A couple of the VMs (whose data/swap drives are all on the SAN) barfed and had to be rebooted – I think our Jabber server was the only casualty, but that was back up in under a minute. When the second controller updated itself, its traffic failed back over to the first one. When it was all done (took about 30 mins total) there was a warning about the ports being unbalanced, which was rectified by clicking the “rebalance ports” button. So all in all, I’d say there was “pretty much” no downtime. After the update, we racked the new enclosure and called it a day.

This week a tech from Compellent came onsite to do the actual install for the enclosure (hooking up the fibre loops and installing the new license). This was really zero downtime. I got some alerts that one of the loops was down, but it didn’t affect anything. Pop the disks in, wire it up, install license, and we’ve got another 32 TB usable space. It’s been over a day and the data is in the process of moving from our tier 1 (32x 15krpm FC disks) down to tier 3 (SATA). All in all it was a pretty painless procedure. Sure, it would have been easier had we not had to do the firmware update, but I guess when a new type of drive is introduced that’s to be expected.

So in conclusion, I guess this just reinforces my theory that the only bad thing about Compellent is the price. And if that’s the worst thing someone can say about your product, that’s probably a pretty good place to be.

The SAN Scam

It’s time to buy some more disks for the SAN we have at work. The SAN is made by Compellent and we’ve had it for a year and it’s been great. One of the selling points was the ability to add disks however we wanted – one at a time is possible, which apparently isn’t the case with other SAN products. The one we looked at from LeftHand expanded by purchasing entire nodes, so the incremental cost was pretty high. Compellent seemed to have a higher initial cost but cheaper incrementally.

Well, that wasn’t really the case, as I’ve come to discover. The way they license features on the SAN requires “expansion licenses” for each set of 8 disks you add on. As it happens, I would like to add 8 SATA disks to our SAN, bumping us into a license expansion. The net result of this is that purchasing these disks costs over $16,000.

If that sounds like a lot of money, well, it is. I expected some markup for enterprise-class hardware, but this is ridiculous. A quick search on Newegg shows that hard drives are readily available at about $0.09 – $0.10 per gigabyte, and even Seagate drives are only around $0.14 per gig. At the price I was quoted for the Compellent drives, the price per gig is over $2.00 per gig! The markup is over 1500%, and that’s not even factoring in the discount they likely get for buying disks in bulk – I doubt they pay retail. They claim this is due to the disks being “certified” but I don’t imagine they’re opening up each disk and checking its platters. They probably just make sure the firmware is correct and then ship it out. Their quote also includes 1 year of support on the disks, with 4-hour on-site replacement, but still, as someone who’s basically “cheap,” this just pisses me off.

Now, in Compellent’s defense, their product is amazing, and I would wholeheartedly recommend it to anyone with the need for it and the means to get it, but it is very pricey, moreso than I was led to believe. The fact that I rarely have to think about the SAN probably means it’s money well spent, but as I said, I’m a cheap bastard, so this bothers me.