Having migrated our DB to a new machine, I was left with a pretty good machine unused. I decided to rebuild it and try out Postgres 9.0.
I downloaded and installed CentOS 6.0 x86_64, built Postgres 9.0.4 from SRPMS (makes me feel better building it myself), installed it and did
initdb, dumped the primary DB to our NAS using
psql 9.0 (
pg_dump [db] | gzip > /path/to/file.gz) and began the restore.
Not much of a story but it took 33 hours to restore the DB, with the bulk of the time spent building indices. When the restore was complete I did an
analyze verbose which took another 2 hours, for a total of 35 hours, not counting the time it took to create the dump file – about 14 hours, but this was with the DB under heavy load, made heavier by the dump, so I don’t know how long it would really take. I’m glad I didn’t have to do this to migrate, but I’m not relishing the idea of doing a real upgrade to 9.0 if it’s going to take this long. For the restore I disabled autovacuum and set
fsync = off. It’s an ext4 filesystem on a 24-disk RAID 10 of 10krpm SAS drives.
DB size on disk after a clean restore was 1156 GB, as per
pg_database_size(), versus 1237 GB on disk for the one that was migrated via bit-copy, so there’s a sizable chunk of crud accumulated in there.
Also, in doing this I discovered the logger command. One of the things that I’ve always found annoying about doing a pg restore is that the output is just a bunch of statements like this:
SET SET SET CREATE SCHEMA ALTER SCHEMA CREATE SCHEMA ALTER SCHEMA SET CREATE TABLE ALTER TABLE CREATE FUNCTION ALTER FUNCTION
This is just what’s returned from the SQL commands, but it’s not very informative, and most importantly it doesn’t show you the timestamp so you have no idea how long anything’s taking. With
logger, you can just pipe the output to it and it’ll be logged with syslog. So my restore command looked like this:
zcat /nfs/db/pgdump-90.sql.gz | psql -Upguser db 2> ~/restore.20110723-1916.log | logger -t PGRESTORE
This made it trivial to get the messages out of
/var/log/messages with grep:
[root@link log]# grep PGRESTORE messages | head Jul 24 03:36:43 link PGRESTORE: ALTER TABLE Jul 24 03:36:43 link PGRESTORE: SET Jul 24 03:36:44 link PGRESTORE: ALTER TABLE Jul 24 03:36:44 link PGRESTORE: ALTER TABLE Jul 24 03:38:45 link PGRESTORE: ALTER TABLE Jul 24 03:39:13 link PGRESTORE: ALTER TABLE Jul 24 03:39:13 link PGRESTORE: ALTER TABLE Jul 24 03:39:13 link PGRESTORE: ALTER TABLE Jul 24 03:39:13 link PGRESTORE: SET Jul 24 03:39:13 link PGRESTORE: ALTER TABLE
In addition to having the timestamps it eliminates the accumulation of the random log files that accumulate like droppings from myriad one-off scripts. I modified about 20 cron jobs to pipe their output to logger rather than junk files in /tmp or my homedir. I can’t believe I never heard of logger before! Hooray for logger!
One Reply to “33 hours to restore a Postgres DB”
Comments are closed.