Using OpenSWAN to connect two VPCs in different AWS regions

Amazon has a pretty decent writeup on how to do this (here), but in trying to establish Postgres replication across regions, I found some weird behavior where I could connect to the port directly (telnet to 5432) but psql (or pg_basebackup) didn’t work. tcpdump showed this:

16:11:28.419642 IP 10.121.11.47.35039 > 10.1.11.254.postgresql: Flags [P.], seq 9:234, ack 2, win 211, options [nop,nop,TS val 11065893 ecr 1811434], length 225
16:11:28.419701 IP 10.121.11.47.35039 > 10.1.11.254.postgresql: Flags [P.], seq 9:234, ack 2, win 211, options [nop,nop,TS val 11065893 ecr 1811434], length 225
16:11:28.421186 IP 10.1.11.254.postgresql > 10.121.11.47.35039: Flags [.], ack 234, win 219, options [nop,nop,TS val 1811520 ecr 11065893,nop,nop,sack 1 {9:234}], length 0
16:11:28.425273 IP 10.1.11.254.postgresql > 10.121.11.47.35039: Flags [P.], seq 2:1377, ack 234, win 219, options [nop,nop,TS val 1811522 ecr 11065893], length 1375
16:11:28.425291 IP 10.1.96.20 > 10.1.11.254: ICMP 10.121.11.47 unreachable - need to frag (mtu 1422), length 556
16:11:28.697397 IP 10.1.11.254.postgresql > 10.121.11.47.35039: Flags [P.], seq 2:1377, ack 234, win 219, options [nop,nop,TS val 1811590 ecr 11065893], length 1375
16:11:28.697438 IP 10.1.96.20 > 10.1.11.254: ICMP 10.121.11.47 unreachable - need to frag (mtu 1422), length 556
16:11:29.241311 IP 10.1.11.254.postgresql > 10.121.11.47.35039: Flags [P.], seq 2:1377, ack 234, win 219, options [nop,nop,TS val 1811726 ecr 11065893], length 1375
16:11:29.241356 IP 10.1.96.20 > 10.1.11.254: ICMP 10.121.11.47 unreachable - need to frag (mtu 1422), length 556
16:11:30.333438 IP 10.1.11.254.postgresql > 10.121.11.47.35039: Flags [P.], seq 2:1377, ack 234, win 219, options [nop,nop,TS val 1811999 ecr 11065893], length 1375
16:11:30.333488 IP 10.1.96.20 > 10.1.11.254: ICMP 10.121.11.47 unreachable - need to frag (mtu 1422), length 556
16:11:32.513418 IP 10.1.11.254.postgresql > 10.121.11.47.35039: Flags [P.], seq 2:1377, ack 234, win 219, options [nop,nop,TS val 1812544 ecr 11065893], length 1375
16:11:32.513467 IP 10.1.96.20 > 10.1.11.254: ICMP 10.121.11.47 unreachable - need to frag (mtu 1422), length 556
16:11:36.881409 IP 10.1.11.254.postgresql > 10.121.11.47.35039: Flags [P.], seq 2:1377, ack 234, win 219, options [nop,nop,TS val 1813636 ecr 11065893], length 1375
16:11:36.881460 IP 10.1.96.20 > 10.1.11.254: ICMP 10.121.11.47 unreachable - need to frag (mtu 1422), length 556

After quite a bit of Google and mucking in network ACLs and security groups, the fix ended up being this:

iptables -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
iptables -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1500

(The above two commands need to be run on both OpenSwan boxes.)

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s