Migrating a PostgreSQL DB to a new machine without doing a dump & restore

Long ago, before I stepped into my role as de facto DBA in my current job, dump & restore of our Postgres database (dumping the entire DB to a text file, formatting the data disk, and restoring the data) was a pretty regular event. This was needed because we didn’t regularly vacuum the DB, which in turn was due to vacuums taking forever (which in the end ended up being due to crappy underlying hardware). We started doing monthly global vacuums (they took so long that monthly was about all we could handle) and we discovered we no longer needed to do dump & restores. When we finally upgraded to Postgres 8.2, which introduced autovacuum, things got even better, since we didn’t have to manage vacuuming via cron jobs.

In the years between then and now our Postgres DB has ballooned to such a size that a D&R isn’t something we could feasibly do in a scheduled maintenance window. The last time I did one was when we bade farewell to the decrepit database in September, 2007. At that point our database consumed 730 GB on disk. Looking through my email archives, we began the dump at 6:30 PM on a Friday night and it completed at 3:48 AM Saturday. The restore started around 9 AM and ran until around 1 PM (I assume it went so much faster than the dump due to the new DB being significantly better hardware-wise). Building indices took until 9:27 PM Saturday. We then ran a global “ANALYZE” to generate DB stats; the analyze ran from 10 PM until 1 AM Sunday. We then had most of Sunday to process the backlog of data that accumulated since Friday afternoon when we took the database offline.

So, with a 730 GB DB, the entire procedure took 9 hours (dump) + 4 hours (restore) + 8 hours (index rebuild) + 3 hours (analyze), so about 24 hours.

However, as I said, in the years since then our database has grown as we house more and more data, currently at about 1220 GB. It might have been possible to do the migration via dump & restore in the scheduled window, but I wasn’t looking forward to it. Instead, I decided to try a different option: copying the data files directly from the old server to the new one. If this worked it would eliminate almost all the overhead and the move would be complete in however long it took to copy the data from one host to the other.


We had a couple reasons for doing this upgrade. Performance wasn’t really one of them though; we were pretty confident that the performance of the DB was as good as we were likely to get with platter disks, and our SAN doesn’t currently have SSDs. The old DB has dual Xeon 5160 2-core CPUs @3.0ghz, 32 GB memory, and a RAID 5 OS volume. The database resided on an HP MSA70, with 24x 10krpm 146 GB SAS drives (+1 hot spare) in RAID 10 for a 1.6 TB logical volume. At the time I debated RAID 6 vs RAID 10 but in the end I opted for the performance of RAID 10 vs the capacity of RAID 6 and it worked out well.

But one of the reasons we decided to upgrade was that the drives in the DB were starting to die and the warranty had expired on them, and each disk cost about $300 to replace. That was a pretty big liability and I expected the disks to begin dying more frequently.

Another reason for upgrading was the benefits of having the data on the SAN, especially snapshotting. We’d been doing daily backups for a while but doing a dump of the DB while it’s in use makes it take forever, causing degraded performance while it’s running. Snapshot isn’t a perfect solution, but at least it’s an option.

Another reason for wanting to move the DB to the SAN was for DR purposes; if we setup SAN-SAN replication to another site, with the DB on the SAN, we get that backed up for free.

And probably the biggest reason, we were up to 1.25 TB used out of 1.6, over 80% full. We’d probably be good for another few months, but for me 80% is pretty full.


In order for this to work, the version of Postgres on both machines has to be of the same minor version. In my case, the source DB (server A) was running 8.2.5 (the newest at the time the box was built), so I built 8.2.18 (source RPM) on the target (server B). Server B is pretty beefy: HP DL360 G7 with dual Xeon X5660 6-core CPUs @2.8 GHz, 96 GB PC3-10600 ECC mem, QLogic QLE-4062 iSCSI HBA connected to a 4TB volume on our Compellent SAN (tier 1, RAID 10 across 32x 15krpm FC disks). Both machines of course need to be of the same architecture, in my case x86_64. It might work across Intel/AMD, but I’m not sure about that; fortunately I didn’t have to worry about that.

Moving the data

When we did dump & restores, we dumped directly to a commonly-mounted NAS, which worked well, since we wouldn’t start the restore until the dump was complete, and we didn’t want to consume disk on the target with a gigantic dump file (in addition to spinning the disks with reading the dumpfile while attempting to write to them; the contention causes everything to go much slower). There wasn’t really any need to use a NAS as an intermediary in this case though, it would just double the amount of time needed to get the data from A to B.

I created an NFS export on B:

/data/pgsql   ,no_root_squash)

And mounted it on A with these options in /etc/fstab:  /mnt/gannon/nfs nfs     rw,rsize=32768,wsize=32768,nfsvers=3,noatime,udp        0     0

I tried TCP vs UDP mounts and found UDP was faster.

I then copied the data over with my favorite Unix tool, rsync:

time rsync -atp --delete --progress /var/lib/pgsql/data /mnt/gannon/nfs/ > /home/evan/rsync.log

Dry run

In January I did a dry run of the procedure. I had tried copying data over without stopping postgres on A, so as not to cause a service interruption, but it didn’t work; data was changing too rapidly. By the time the rsync completed, the files it had copied over earliest had already been modified again. I ended up scheduling some downtime for a weekend and did the copy. With the above NFS settings I was able to transfer data at around 50 MB/s over our gigabit switches, the whole thing took 3-4 hours. When it came up, everything seemed to be fine. I was pretty happy, because 3-4 hours is a whole lot better than 24+.

Day of Reckoning

I finally did the real migration this past weekend. I started it at 8 PM and (after an rsync snafu) completed it around 4 AM. The snafu was caused by my use of the “--delay-updates” flag, which I later learned copies modified files to a /.~tmp~/ directory, and when all of them are copied, moves them into place in an attempt to make it a more “atomic” operation. I didn’t realize this was what was happening, and I got a little freaked out when I saw the disk usage for the target growing 100 GB larger than the source. I cancelled the rsync and ran it again, stupidly dropping the –delay-updates flag, which with the –delete flag caused it to delete all the stuff in .~tmp~ that it had already copied over. It deleted like 300 GB of stuff before I freaked out and cancelled it again. I cursed myself a few times, then manually moved the contents of .~tmp~ up to the parent to salvage what had already been transferred, and ran the rsync once again to move the remaining data. So it probably would have been done much sooner had I not cancelled it and then deleted a bunch of my progress.

You may notice that the rsync flags above don’t include -z. With huge binary files being transferred over a fast LAN I don’t think there’s much reason to use -z, in fact when I added -z the throughput plummeted.

After copying the data, I moved the virtual IP of the DB to the new machine, moved the cron jobs over, started postgres on B and everything worked. I finished all of my cleanup around 6 AM Saturday, though like I said, I would have been done much sooner had I not deleted a bunch of my progress. Even still, this is a lot better than the dump & restore scenario, and has the added benefit of being reversible. I’m planning to upgrade postgres on A to 8.2.18 and leave it as a standby server; if a problem arises with B, the data can be moved back relatively quickly.


Well, I don’t have any great insight to put here, but so far this has worked out. My next DB project is upgrading from 8.2 to either 8.4 or the 9.x series, but that’s going to require a lot more planning since client drivers will likely need to be updated, and I’m not sure if queries might need to be altered.

The end (I hope).

How I fixed my Wii’s noisy disc drive & read errors.

I got my Wii in late 2006 and around 2009 I noticed it was starting to sound like a circular saw when the disc drive was spinning. This was annoying but it didn’t affect the games so I never thought much about it.

About two weeks ago, however, New Super Mario Bros. Wii stopped working:

The Wii basically worked except for the disc drive. As soon as I put a disc in, I’d get the error above – even if I didn’t start start the game. I tried blowing in the disc slot (the only thing I could think of… worked for my old NES!) but the problem continued. I figured I had two choices: buy another Wii or attempt to fix mine. Since mine was already essentially useless I figured it couldn’t hurt anything to try fixing it.

After some searching I quickly learned that to do any work inside a Wii requires a Tri-Wing screwdriver just to open the case. I went to the hardware store and got a Tri-Wing bit but it was too big to be of any use on the Wii. I found this Silverwing Tri-Wing screwdriver on Amazon for under $5 that did the trick.

With trusty screwdriver in hand, I opened the Wii thusly:

With it now open I followed this guy’s advice:

Bending these small triangles down created a gap that stopped the vibration & the noise and made the Wii playable again.

Bend these small pieces down to stop the Wii disc drive's loud noise.
Bend these small pieces down to stop the Wii disc drive's loud noise.

It has some new weird sounds when the disc first spins up and when it spins it down (I assume the braking mechanism) but it’s practically silent during gameplay. Yay!

Forcing WordPress administration over SSL

I never like typing a password into a non-SSL site, no matter how trivial it is. In order to give my own site this ability I simply used mod_rewrite to force requests to WordPress’s admin pages to go over SSL.

The .htaccess file for the site looks like this:

# BEGIN WordPress

RewriteEngine On
RewriteBase /evan/
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /evan/index.php [L]

# END WordPress

To force the admin pages to SSL, just add these lines under RewriteEngine On:

RewriteCond %{HTTPS} !=on
RewriteRule ^wp-(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [R,L]

Edit – The above code screws up uploads (which go into the /wp-content directory). I replaced that with the following and it Worked As Intended.

RewriteCond %{HTTPS} !=on
RewriteRule ^wp-login(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [R,L]
RewriteCond %{HTTPS} !=on
RewriteRule ^wp-admin(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [R,L]

That’s pretty much it. If your request starts with “wp-” it’ll redirect you to the same URL, but starting with https://. Problem solved. You do need to make sure you have an SSL VirtualHost pointing to your WordPress DocumentRoot so that https://yoursite.com goes to the same place as http://yoursite.com.

HOWTO: Install Flash plugin on Firefox on Fedora / CentOS / RedHat Linux

Updated 8/31/2011 See my newer post on this subject.

I know I’ve been through this before because I’ve setup Fedora before and it’s working. But I just installed FC12 on my desktop at home (since my Seagate drive bricked itself a few days ago and I had to reinstall anyway) and while I finally got the Adobe Flash plugin working, it didn’t have any audio, which made YouTube kind of suck. Anyway, this of course was a solved problem, I just forgot where the fix was located… but duh, it’s on the Fedora website. A simple

yum install flash-plugin nspluginwrapper.x86_64 
    nspluginwrapper.i686 alsa-plugins-pulseaudio.i686 

and everything appears to be fine. Thank you Fedora!

Victory! Change Active Directory Password via LDAP through browser

I had to give up on PHP and go to Perl, but it turned out not to be so bad. Users can now change their Active Directory passwords via a self-service web page that doesn’t require admin credentials. The Perl code is below.  Authentication to the script is done via .htaccess LDAP authentication, so the REMOTE_USER env variable is assumed to contain the user’s username (sAMAccountName) by the time this script is called.  There is a simple check for $ENV{HTTPS} to ensure the script is called via SSL, and AD requires password changes to be done via ldaps, so the whole thing should be encrypted end to end.

(Edited 5/14/2010 to replace the inlined Perl script with a link to the script as a text file.)


How to install the 64-bit Sun Java plugin on 64-bit firefox on 64-bit Fedora Core 11 Linux (which happens to use 64 bits)

I’m giddy! I found this post on mozdev.org which was magical.

[evan@ehoffman ~]$ java -version
java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01, mixed mode)
[root@ehoffman plugins]# uname -a
Linux ehoffman #1 SMP Fri Sep 25 04:43:32 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
[root@ehoffman plugins]# pwd
[root@ehoffman plugins]# ln -s /usr/java/jdk1.6.0_16/jre/lib/amd64/libnpjp2.so

The main thing I was missing was that the plugin isn’t libpluginjava_oji.so, or whatever I thought it was, but libnpjp2.so. Once I created the symlink into /usr/lib64/mozilla/plugins it worked (as verified on http://www.java.com/en/download/help/testvm.xml and http://www.java.com/en/download/installed.jsp).

That’s all it takes to get the Sun Java plugin working in Firefox on Linux.

Migrating RT from mysql to postgresql.

So I wanted to move our old RT 3.6 system to a new VM running a clean CentOS 5.3 x86 install and RT 3.8. I got RT installed but the 3.6 system was running on MySQL, which sucks (still), and I wanted to move it to Postgres, which rules (more and more each day). So Googling for “migrate rt from mysql to postgresql” brought me here. I tried the procedure listed there but it wasn’t quite working for me. my2pg.pl and mysql2postgresql.perl both caused segfaults when I tried to run them. I realized I was on my own, and it wasn’t really that bad.

Continue reading “Migrating RT from mysql to postgresql.”

One-click converting .avi to .mp4 for iPod or iPhone

After a lot of searching for an easy way to do this, I stumbled onto WinFF, which is a gui wrapper around ffmpeg, a GPL’d MPEG encoder. It worked pretty well out of the box, but for all of the AVIs I’d recorded with my old camera (Canon Powershot A540) it would fail immediately with this error:

Audio resampler only works with 16 bits per sample. patch welcome.

After some searching it seems this is related to a known bug which was apparently fixed in February 2009. I set out to find a newer ffmpeg.exe (which is what WinFF calls to do the actual conversion). I found some Win32 builds of ffmpeg here but found that they all claimed to not know the “libfaac” codec. After some more digging I learned that libfaac (the audio codec for MP4 for iPod/iPhone) is no longer considered free software and was dropped from the repository in mid-April. Fortunately there were older builds available, and I grabbed the one dated 2009-04-01, extracted the ffmpeg.exe into my WinFF directory and voila, it worked.

Old ffmpeg.exe:

C:\Documents and Settings\Evan>"C:\Program Files\WinFF\ffmpeg.exe" -version
FFmpeg version SVN-r15986, Copyright (c) 2000-2008 Fabrice Bellard, et al.
  configuration: --extra-cflags=-fno-common --enable-memalign-hack --enable-pthr
eads --enable-libmp3lame --enable-libxvid --enable-libvorbis --enable-libtheora
--enable-libspeex --enable-libfaac --enable-libgsm --enable-libx264 --enable-lib
schroedinger --enable-avisynth --enable-swscale --enable-gpl
  libavutil     49.12. 0 / 49.12. 0
  libavcodec    52. 6. 0 / 52. 6. 0
  libavformat   52.23. 1 / 52.23. 1
  libavdevice   52. 1. 0 / 52. 1. 0
  libswscale     0. 6. 1 /  0. 6. 1
  built on Dec  3 2008 01:59:37, gcc: 4.2.4
FFmpeg SVN-r15986
libavutil     49.12. 0 / 49.12. 0
libavcodec    52. 6. 0 / 52. 6. 0
libavformat   52.23. 1 / 52.23. 1
libavdevice   52. 1. 0 / 52. 1. 0
libswscale     0. 6. 1 /  0. 6. 1

New ffmpeg.exe:

C:\Documents and Settings\Evan>"C:\Program Files\WinFF\ffmpeg.exe" -version
FFmpeg version SVN-r18306, Copyright (c) 2000-2009 Fabrice Bellard, et al.
  configuration: --enable-memalign-hack --prefix=/mingw --cross-prefix=i686-ming
w32- --cc=ccache-i686-mingw32-gcc --target-os=mingw32 --arch=i686 --cpu=i686 --e
xtra-cflags=-fno-common --enable-avisynth --enable-gpl --enable-zlib --enable-bz
lib --enable-libgsm --enable-libfaac --enable-pthreads --enable-libvorbis --enab
le-libmp3lame --enable-libopenjpeg --enable-libtheora --enable-libspeex --enable
-libxvid --enable-libfaad --enable-libschroedinger --enable-libx264
  libavutil     50. 2. 0 / 50. 2. 0
  libavcodec    52.22. 3 / 52.22. 3
  libavformat   52.32. 0 / 52.32. 0
  libavdevice   52. 1. 0 / 52. 1. 0
  libswscale     0. 7. 1 /  0. 7. 1
  built on Apr  2 2009 03:25:40, gcc: 4.2.4
FFmpeg SVN-r18306
libavutil     50. 2. 0 / 50. 2. 0
libavcodec    52.22. 3 / 52.22. 3
libavformat   52.32. 0 / 52.32. 0
libavdevice   52. 1. 0 / 52. 1. 0
libswscale     0. 7. 1 /  0. 7. 1

This also works great for encoding videos to XviD, which shrinks them to about 10-20% of their original size, making it much faster (in some cases, possible) to upload them to sharing sites.