Goodbye, pg_dump

I’ve been a Postgres user and administrator for a while. Over the years, my views on backups have evolved.

Originally, like most people, I started out with good old pg_dump. With a reasonably small database (under 50 GB), dumping to a flat text file is a fine option. I’d generally run something like pg_dump -Upostgres dbname | gzip > dbname.sql.gz to compress it on the fly and save space. For years this seemed perfect: it dumps the entire database in a single transaction into a single file that can be restored anywhere.
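One catch with the one-liner is that a shell pipeline reports gzip's exit status by default, so a pg_dump that dies halfway can still leave a plausible-looking .gz file behind. Here is a minimal Python sketch of the same dump that checks pg_dump's own exit code. The function names and the ".partial" suffix are made up for illustration, and it assumes pg_dump is on the PATH and can authenticate without a prompt.

```python
import gzip
import os
import shutil
import subprocess


def dump_command(dbname, user="postgres"):
    """Build the same pg_dump invocation as the shell one-liner."""
    return ["pg_dump", "-U", user, dbname]


def dump_to_gzip(dbname, out_path, user="postgres"):
    """Stream pg_dump output into a gzip file and fail loudly on errors."""
    # Write to a temporary name first (hypothetical convention) so a failed
    # dump never replaces the last good one.
    tmp_path = out_path + ".partial"
    proc = subprocess.Popen(dump_command(dbname, user), stdout=subprocess.PIPE)
    try:
        with gzip.open(tmp_path, "wb") as out:
            # Copy in chunks; the dump is never held in memory all at once.
            shutil.copyfileobj(proc.stdout, out)
    finally:
        proc.stdout.close()
        returncode = proc.wait()
    if returncode != 0:
        os.remove(tmp_path)
        raise RuntimeError("pg_dump exited with status %d" % returncode)
    # Only a dump that finished cleanly gets the final name.
    os.replace(tmp_path, out_path)
```

Calling dump_to_gzip("dbname", "dbname.sql.gz") would produce the same file as the pipeline, but it raises instead of quietly keeping a truncated dump.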

But as my databases grew larger and larger, so did the time it took to run pg_dump. At a previous job, the database grew to nearly 2 TB and the pg_dump took nearly 18 hours. By that point we’d already changed the pg_dump schedule from daily to weekly, then to three times a month, and finally to semi-monthly. Not only was it slow, but since it ran in a single long transaction and held locks on every table it dumped, it wreaked havoc with normal database operation for any query that needed a conflicting lock on those tables.

When we moved the database from a physical RAID to a volume on our SAN, that gave us the opportunity to use LUN snapshotting rather than pg_dump (I just remembered I already wrote about that here). This let us move to a monthly pg_dump and more frequent snapshot-level backups that took up very little space. This was ideal on Compellent since the snapshots would auto-expire after however long you specified.

When I started at Yodle we were doing nightly pg_dumps, and pretty soon we ran into the same problem I’d seen at Didit: the dump itself interfered with normal DB operation. When I joined, the dump would start at midnight and run until 7 or 8 AM; after a few months it would still be running at noon. We discussed moving to WAL archiving and making a base backup to NFS, but that would require a pretty massive amount of space, and as anybody who uses “enterprise storage” knows, that’s not something you want to do. We discussed building a whitebox file server for backups, but nobody was really in love with that option, since we’re trying to reduce our reliance on physical machines as much as possible. We talked about pushing it all to S3, but that seemed rather difficult.

When I attended NYC PgDay earlier this year, there was lots of discussion about WAL-E. I hadn’t ever heard of WAL-E, so I looked it up and was impressed. Basically, WAL-E handles archiving of WAL to S3, but first compresses and PGP-encrypts it. It also handles pushing the base backup to S3, also compressed and PGP-encrypted. This was just what we were looking for. We set it up and, amazingly, it worked perfectly. After a few weeks (and after confirming we could restore from the WAL-E backups) we moved our pg_dump to weekly, on the weekend, when it doesn’t interfere with any user processes. We do a WAL-E base backup every 3-4 days or so and retain 3 of them. We retain all the WAL, so we can restore the DB to any point within the last ~10 days if needed. The best part is it’s faster than pg_dump, and since the base backup doesn’t operate in a transaction (it’s a filesystem-level backup rather than an application-level backup) it doesn’t mess with user queries. There’s of course elevated IO during this time, but our SAN has more than enough bandwidth.
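To make the retention numbers concrete, here is a rough Python sketch of the base backup cycle and the restore window it implies. It is not our actual tooling. The envdir path follows the convention in WAL-E's README, the data directory is whatever your cluster uses, and the function names and dates are invented. The backup-push and "delete --confirm retain N" subcommands are WAL-E's own. WAL shipping itself is configured separately through archive_command in postgresql.conf.

```python
from datetime import datetime, timedelta


def backup_cycle_commands(data_dir, retain=3, env_dir="/etc/wal-e.d/env"):
    """Commands for one cycle: push a new base backup, then prune old ones."""
    # env_dir is an assumed location for the S3 credentials and prefix.
    prefix = ["envdir", env_dir, "wal-e"]
    return [
        prefix + ["backup-push", data_dir],
        prefix + ["delete", "--confirm", "retain", str(retain)],
    ]


def restore_window(backup_times, retain, now):
    """How far back point-in-time recovery can reach.

    Recovery replays WAL on top of a base backup, so the window starts at
    the oldest base backup that is still retained.
    """
    if retain < 1:
        raise ValueError("retain must be at least 1")
    kept = sorted(backup_times)[-retain:]
    if not kept:
        raise ValueError("no base backups available")
    return now - kept[0]


# Made-up schedule: base backups taken every 3-4 days, the newest 3 kept.
backups = [
    datetime(2014, 5, 28),
    datetime(2014, 6, 1),
    datetime(2014, 6, 4),
    datetime(2014, 6, 8),
]
print(restore_window(backups, retain=3, now=datetime(2014, 6, 11)))
```

With a base backup every 3-4 days and 3 retained, the oldest one is somewhere between about 6 and 12 days old depending on where you are in the cycle, which is where the "~10 days" figure comes from.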

We set up some basic monitoring of S3 (check the age of the most recent WAL and log it in Zabbix) just to ensure the backups are actually happening, and we’re at the point where we’re discussing moving pg_dump to monthly, or simply not doing it at all. Overall, WAL-E has been a huge win for us, enabling better, faster backups that don’t interfere with the DB itself and that, while not free, aren’t ridiculously expensive. And since the backups live in their own S3 bucket, you can tweak the bucket settings (e.g. enable Reduced Redundancy Storage) to save money, and Amazon tells you exactly how much your backups cost you.
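The monitoring check is simple enough to sketch. This is an illustration rather than our actual script: it assumes you have already listed the WAL objects in the bucket with whatever S3 client you use and have them as (key, last_modified) pairs. The function name and the example keys are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def newest_wal_age_seconds(objects, now):
    """Age in seconds of the most recently archived WAL segment.

    objects is an iterable of (key, last_modified) pairs taken from an S3
    listing of the WAL prefix. Returns None if the listing is empty.
    """
    newest = max((modified for _key, modified in objects), default=None)
    if newest is None:
        return None
    return (now - newest).total_seconds()


# Made-up listing: two segments, the newest archived five minutes ago.
now = datetime(2014, 6, 11, 12, 0, tzinfo=timezone.utc)
listing = [
    ("wal_005/0000000100000000000000A1.lzo", now - timedelta(hours=1)),
    ("wal_005/0000000100000000000000A2.lzo", now - timedelta(minutes=5)),
]
print(newest_wal_age_seconds(listing, now))
```

The number it returns is what gets logged, and the alert threshold can live in the monitoring system. One caveat: on a quiet database a WAL segment can take a long time to fill, so the age only stays meaningful if archive_timeout forces a segment switch at some regular interval.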

Goodbye, pg_dump
