Day 63 post-op: back to the office

This past Monday, Sept 17th, I finally returned to the office after almost 11 weeks out. I was on a long weekend (July 4th) when the injury occurred, and worked from home the following week. I then had the repair surgery on July 16th and spent about two weeks just sitting on the couch recovering – WFH wasn’t really an option for that, since I had to keep my leg elevated above my heart. Since then I’ve been working from my home office.

First of all, let me say how amazingly lucky I am to have a job and a manager that allow me to work from home.  I am extremely grateful for that.  However, while there are definitely advantages to working from home, it’s not ideal, so I got clearance from my doctor and went back in this past Monday.

I live on Long Island and my office is in Manhattan, which means commuting involves driving to the local train station, parking there, taking a train into Manhattan, and then taking the subway to my office. Competition at the LIRR parking lot is fierce, and if you’re not there by 7 AM you likely won’t get a spot. I left the house at my usual pre-injury time of 6:25 and managed to get one of the last spots in the “good” parking lot. There are no handicapped spots in this lot, so I ended up taking a regular spot about 1,000 feet from the stairs. The stairs weren’t too much of a problem, though I am definitely slower climbing them, and double-timing was not an option. I got a seat and the train ride was unremarkable.

I got off at Penn Station and took the escalator up to the concourse. I took another escalator up, and then decided to take the stairs up to street level. Again, no problem, just slow. I walked over to 6th Avenue and entered the N/R/Q station at 6th & 32nd. This was the first time all day I had to go down stairs, and it was definitely more challenging than going up; the main problem was that the boot was too big for the steps. I took the subway to 5th Ave & 59th Street, just to take a peek at Central Park before work. One thing I noticed walking on the sidewalk was that its slight grade is very noticeable when wearing the boot. Sidewalks are all slanted slightly downward from the building to the street, so rain runs off into the street. Normally I don’t notice this, but with the massive flat-bottomed boot bolted on and no use of my ankle, it was awkward and uncomfortable. I found walking on the right side of the street easier than the left, since the slant then put the boot lower than my good foot. Not sure if I’m explaining it well, but it was a noticeable issue.

When I got to the office I took the elevator to the 17th floor, where the coffee is, and then walked down the stairs to 16, where my desk is. I managed to get special handicapped elevator privileges with a doctor’s note, so in the future I can at least take the elevator to my floor without having to walk down.

For lunch I walked to a burrito place about 5 blocks away and brought it back to the office to eat.  That was also a relatively unremarkable experience.

At the end of the day I left a bit early, since there’s no way I can run if I need to catch my train. Descending the stairs into the subway station near the office, I started to feel sharp pains in my right ankle – my non-injured one. By the time I got off the LIRR and back to my car the pain was becoming more frequent. My immediate guess was that all the walking down stairs had caused Achilles tendinitis in my right leg. I had a physical therapy session right after work, and I told my therapist what happened; he massaged the right leg as well as the left with his roller thing. He said the right calf was extremely tight, and that I need to make sure to stretch it out before doing anything, to avoid further injury.

The pain in the ankle continued intermittently throughout the evening, and when I woke up Tuesday, lying in bed, I could still feel it. It felt like someone had slashed the very bottom of the back of my ankle with a razor. I decided to work from home rather than exacerbate whatever the issue was. By Tuesday afternoon it was fine, but I chose to work from home for the remainder of the week. There are just too many steps involved in getting to & from the office. I plan to go back in either tomorrow (Friday) or Monday, and see if I can take a different subway route that has escalators or elevators the entire way.

Can I create an EC2 MySQL slave to an RDS master?

No.

Here’s what happens if you try:

mysql> grant replication slave on *.* to 'ec2-slave'@'%';
ERROR 1045 (28000): Access denied for user 'rds_root'@'%' (using password: YES)
mysql> update mysql.user set Repl_slave_priv='Y' WHERE user='rds_root' AND host='%';
ERROR 1054 (42S22): Unknown column 'ERROR (RDS): REPLICA SLAVE PRIVILEGE CANNOT BE GRANTED OR MAINTAINED' in 'field list'
mysql>

Note: this is for MySQL 5.5, which is unfortunately what I’m currently stuck with.
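
If you’re curious exactly which privileges the RDS master user does hold, SHOW GRANTS spells it out. A quick sketch from the shell – the endpoint is a placeholder, and rds_root is whatever master username you chose at instance creation:

mysql -h myinstance.abcdefgh1234.us-east-1.rds.amazonaws.com -u rds_root -p \
    -e "SHOW GRANTS FOR CURRENT_USER();"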

The m3.medium is terrible

I’ve been doing some testing of various instance types in our staging environment, originally just to see whether Amazon’s t2.* line of instances is usable in a real-world scenario. In the end, I found that not only are t2.mediums viable for what I want them to do, they’re far better suited than the m3.medium, which I wouldn’t use for anything you ever expect to handle real load.

Here are the conditions for my test:

  • Rails application (unicorn) fronted by nginx.
  • The number of unicorn processes is controlled by chef, currently set to (CPU count * 2), so a 2 CPU instance has 4 unicorn workers (see the sketch after this list).
  • All instances are running Ubuntu 14.04 LTS (AMI ami-864d84ee for HVM, ami-018c9568 for paravirtual) with kernel 3.13.0-29-generic #53-Ubuntu SMP Wed Jun 4 21:00:20 UTC 2014 x86_64.
  • The test used loader.io to simulate 65 concurrent clients hitting the API (adding products to cart) as fast as possible for 600 seconds (10 minutes).
  • The instances were all behind an Elastic Load Balancer, which routes traffic based on its own algorithm (supposedly the instance with the lowest CPU always gets the next request).
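
For reference, the worker-count rule from the list above boils down to this trivial shell sketch (nproc here stands in for whatever Chef/Ohai detects as the CPU count):

# 2 unicorn workers per CPU, e.g. 4 workers on a 2-vCPU instance
WORKER_PROCESSES=$(( $(nproc) * 2 ))
echo "unicorn worker_processes: $WORKER_PROCESSES"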

The below charts summarize the findings.

[Chart: average nginx $request_time by instance type]

This chart shows each server’s performance as reported by nginx. The values are the average time to service each request and the standard deviation. While I expected the m3.large to outperform the m3.medium, I didn’t expect the difference to be so dramatic. The performance of the t2.medium is the real surprise, however.

#  _sourcehost   _avg (s)    _stddev (s)
1  m3.large       6.30324     3.84421
2  m3.medium     15.88136     9.29829
3  t2.medium      4.80078     2.71403

These charts show the CPU activity for each instance during the test (data as per CopperEgg).

[CPU utilization charts during the test: m3.large, t2.medium, m3.medium]

The m3.medium has a huge amount of CPU steal, which I’m guessing accounts for its horrible performance. Anecdotally, in my own experience the m3.medium is far more prone to CPU steal than other instance types. Moving from m3.medium to c3.large (essentially the same instance with 2 CPUs) eliminates the CPU steal issue. However, since the t2.medium performs as well as the c3.large or m3.large, and costs half as much as the c3.large (and nearly 1/3 of the m3.large), I’m going to try running most of my backend fleet on t2.medium.
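
If you want to watch steal directly on an instance rather than relying on the monitoring agent, mpstat (from the sysstat package) reports it in the %steal column – a quick sketch:

# Print CPU utilization every 5 seconds; %steal is time the hypervisor
# spent servicing other guests while our vCPU had work to do
mpstat 5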

I haven’t mentioned the credit system the t2.* instances use for burstable performance, and that’s because my tests didn’t make much of a dent in the credit balance on these instances. The load test was 100x what I expect to see in normal traffic patterns, so the t2.medium with burstable performance seems like an ideal candidate. I might add a couple of c3.larges to the mix as a backstop in case the credits ever run out, but I don’t think that’s a major risk – especially not in our staging environment.
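
The balance itself is exposed in CloudWatch as the CPUCreditBalance metric, so it’s easy to watch during a test like this. A sketch using the AWS CLI – the instance ID and time range are placeholders:

aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name CPUCreditBalance \
    --dimensions Name=InstanceId,Value=i-abcd1234 \
    --start-time 2014-09-17T00:00:00Z --end-time 2014-09-17T01:00:00Z \
    --period 300 --statistics Average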

Edit: I didn’t include the numbers, but performance seemed to be consistent whether on HVM or paravirtual instances.

Tips for recruiters

I’m a pretty lucky guy these days. As a DevOps engineer in NYC my skills are in high demand and recruiters contact me almost every day. As someone who was once unemployed for 6 months I’m grateful to be in this position. That said, there are some requests that go straight to the trash, and some I’ll at least respond to even if I’m not interested. Here are some of the factors that influence my decision:

Does your email look like a generic mail merge/copypasta?

As with all things in life, you need to make an effort. If you’re just spamming everybody with jobs that are listed on LinkedIn or Dice or whatever, there’s no need to talk to you. Like this one, which looks like an Excel mail merge.

Hi,

Our direct client located in New York, NY has a position open for a Release Engineer. A copy of the job description is below.

If you are interested, please send a copy of your resume (preferably in MS Word format) to xxx@yyy.com.

Please be sure to include your rate, location and contact information.

Thanks
Bob

Here’s another one I got via LinkedIn last week:

Subject: Fantastic opportunity for a very cutting edge company in New York City

Dear Evan,

How are you?

I have a client (startup) looking for someone of your background. The location is Manhattan and the funding for this company is off the charts. The pay is great, the benefits are unbeatable, and technology and collaborative environment is off the charts.

Let me know if you or a friend may be interested and I can give you some more details…

Thanks,
Charlie

This is sort of the perfect bad email. For one thing, there’s no information about the company at all: What industry? What technologies? How big is the team? How long have they been around? Are they profitable? For another, there’s no information about the position itself. This same email could be used for an engineer, sales, ops, finance, CEO or janitor.

There are also some words that add no value at all to the email. When describing a job or a company, you should omit the words “exciting,” “awesome,” “amazing,” “cutting edge.” Just tell me the name of the company, maybe with a link to more info about them.

Are you an in-house recruiter or with a headhunting firm?

I know there are good recruiting firms, but I don’t seem to have worked with any of them. In my experience, “executive search” firms are just concerned with volume – getting people to quit their jobs to go work somewhere else, and then contacting them a year later asking if they want to move again. I’ve had recruiters call me up asking if I was looking to hire anybody, and when I say no they ask if I want to go work somewhere else. If they can’t sell to me, I guess they’ll try and sell me.

For me, the straw that broke the camel’s back was when a recruiter insisted I interview at a place where the job description said “We’re looking for a Ruby expert. You should eat, sleep, and breathe Ruby.” I told the recruiter I didn’t really know Ruby that well, and he insisted that didn’t really matter. I looked into the company’s product and didn’t really like it, but somehow he talked me into going on the interview. It was kind of a disaster: the office was cramped and hot and looked pretty shabby, it was far from any subway station, the interview questions weren’t relevant to the position, and I didn’t like any of the technologies they used. I was uncomfortable and lost what little interest I had about an hour into it. Apparently the feeling was mutual. The recruiter apologized and asked me what I wanted to do next. I never wrote back.

After that ordeal I decided to deal only with in-house recruiters. Personally, I prefer in-house recruiters because they’ve got skin in the game beyond a commission – they’re employees who are committed to seeing the company succeed and are aware of how important it is to land the right person, and would much rather let a seat go empty than fill it with a bad hire. They understand the company culture because they’re part of it. They can sense whether someone will be a good fit on a team because they know everybody on it. They can answer questions about the company without skipping a beat. The job description is more than words on a page to them. The last time I spoke to a recruiter from a staffing firm he assured me he was different, and then all he had to offer me was a menu of 5 companies that he could “get me an interview with.” Well, thanks, but I could do that myself.

I realize a lot of startups don’t want the expense of a full-time recruiter, and I’m probably missing some good opportunities by ignoring these crappy emails, but my experience indicates most of these guys are just going for quantity, sending as many candidates as possible to as many interviews as possible, and don’t much care about quality. Again, I’m sure there are good ones, maybe even most of them are good, but that hasn’t been my experience.

For God’s Sake Stop Calling Me

Email is one thing. I can ignore an email pretty easily. But please don’t call my cell phone (or worse, office phone). If you’re calling during the day, I’m at work, and I don’t want to talk about a new job at work. If it’s after work, well, I’m on my way home on the train and can’t talk, or I’m at home eating dinner and can’t talk. I don’t know how you even got my number in the first place, but if you manage to trick me into answering a call while I’m at my job, you’re not going to get a warm reception. I don’t have a private office, so how am I supposed to have a conversation about switching jobs while I’m at work?

Some recruiters just can’t take a hint. A couple months ago I was on vacation, heading to a Disney Cruise in Florida. As I was approaching Port Canaveral, my phone rang. It was a 646 number (NYC) so I figured it was a recruiter and let it ring out. A couple minutes later they called back and didn’t leave a voicemail. A couple minutes later, another call. I didn’t recognize the number but I was worried it might be someone from work so I answered it. It turned out to be a recruiter and I told her I was about to get on a cruise ship and she could call me back next week just to get her off the phone. Next week came around and sure enough she started calling multiple times a day for over a week. I ended up having to block her number in Google Voice. A couple weeks later, another recruiter from the same firm started calling me from a different number and I ended up blocking him too. Desperation isn’t attractive.

Another problem I’ve encountered is recruiters who are just lousy at their jobs. A few times when I’ve answered the phone, the person on the other end sounds like a deer in the headlights, like now that they’ve got me on the phone they have no idea what to say. When this happens, I picture an intern handed a list of names and phone numbers and told “make 200 calls today or you’re fired.” Out of sympathy I usually let them finish their spiel and then say “thanks, but I’m not looking right now” and manage to get out of it, but this doesn’t seem like an effective strategy and just makes your firm look amateurish.

TL;DR

Basically, if you’re looking to hire engineering talent, you should:

  • Be an expert on the company you’re recruiting for. Ideally this would be the company you work for, but even if you’re a third party, you’d do well to spend a day on site at your client’s office so you can answer questions about the culture, location, nearby food, etc.
  • Do some research on the candidate. Whatever resume you have in your database is probably out of date. Maybe your target has a website or a Github or a LinkedIn that gives some insight as to what they’re up to.
  • Make your email short and sweet. Whether you’re in-house or a placement firm, the email should give the basic facts: What’s the name of the company (duh)? Where are they located? Are they profitable? How big is the team? What’s the org chart look like – to whom would they report? What technologies do they use? What’s the ballpark compensation?
  • Not annoy anybody. If you send somebody an email and they don’t respond, they’re not interested. If you send them 10 emails about 10 different jobs and they don’t respond, they’re just not that into you. Give it a break. Definitely don’t “call to follow up” if they don’t respond to your email.

The general theme here is “don’t waste anybody’s time.” Don’t send me an email full of intrigue or try and sell me. Like when buying a house, the company/position should sell itself. Just give me the necessary info and don’t bother me.

Disclaimer: this post is just my opinion, and has nothing to do with my employer.

The not-so-secret secret to Postgres performance

I manage a bunch of Postgres DBs, and one of the things I almost always forget to do when setting up a new one is raise the readahead from the default of 256 (that’s in 512-byte sectors, so 128 KB). I created this script and run it out of /etc/rc.local, and sometimes cron it too. The 3 commands at the top are only really relevant on systems with “huge” memory – probably over 64 GB. We ran into some memory problems with CentOS 6 on a box with 128 GB of RAM, which we ended up working around by reinstalling CentOS 5, but the /sys/kernel/mm/redhat_transparent_hugepage/ options below should fix them in 6.x (we haven’t actually tried it on that DB, but we haven’t seen any problems on other large DBs).

#!/bin/bash

echo no > /sys/kernel/mm/redhat_transparent_hugepage/khugepaged/defrag
echo never >/sys/kernel/mm/redhat_transparent_hugepage/defrag

echo 1 > /proc/sys/vm/dirty_background_ratio

# Find the block devices to tune, skipping individual partitions (sda1, sdb2, ...)
BLOCK_DEVICES=`perl -ne 'chomp; my @a=split(/[\s]+/); next unless $a[4]; next if ($a[4] =~ /sd[a-z]+[\d]+/); print "$a[4]\n";' /proc/partitions`

logger -t tune_blockdevs "Block devices matching: $BLOCK_DEVICES"

for DEV in $BLOCK_DEVICES;  do
        logger -t tune_blockdevs "Setting IO params for $DEV"
### Uncomment the below line if using SSD
#        echo "noop" > /sys/block/$DEV/queue/scheduler
        /sbin/blockdev --setra 65536 /dev/$DEV
done
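
After the script runs, blockdev can read the value back to confirm it took (sda is just an example device here):

/sbin/blockdev --getra /dev/sda    # prints 65536 once the script has run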

XFS write speeds: software RAID 0/5/6 across 45 spindles

We’re currently building a new storage server to store low-priority data (tertiary backups, etc). One of the requirements for the project is that it needs to be on cheap storage (as opposed to an expensive enterprise SAN/NAS). After some research we decided to build a Backblaze pod. Backblaze used 3TB Hitachi drives in their system, but the ones they listed in their blog post are discontinued and the reviews for all the other 3TB+ drives were terrible, so we went with Samsung ST2000DL004 2TB 7200 RPM drives. Like Backblaze, we’re going with software RAID, but I figured a good first step would be to figure out what RAID level we want to use, and whether we want the mdadm/LVM mish-mosh Backblaze uses or something simpler. For my testing I built a RAID6 of all 45 drives and created a single XFS volume on it (XFS’s size limit is ~8 exabytes vs ext4’s 16TB). Ext4 may have some performance advantages, but the management overhead of splitting this box into multiple ≤16TB volumes probably isn’t worth it in our case.

So, this is just a simple benchmark comparing RAID0 (stripe with no parity) as a baseline, RAID5 (stripe with 1 parity disk), and RAID6 (stripe with 2 parity disks) across 45 total spindles. For all tests I used Linux software RAID (mdadm).

To test, I have 3 scripts, makeraid0.sh, makeraid5.sh, and makeraid6.sh. Each one does what its name implies. The raid0 has 43 disks, raid5 has 44 disks, and raid6 has 45 disks, so there are 43 “data” disks in each test. The system is a Protocase “Backblaze-inspired” system with a Core i3 540 CPU, 8 GB memory, CentOS 6.3 x64, and 45 of the Samsung 2TB drives. We’re just using this box for backup, and it gives us about 79 TB usable, which is still plenty, so the smaller 2TB drives aren’t a big problem.

makeraid?.sh for array creation:

#!/bin/bash

mdadm --create /dev/md0 --level=raid6 -c 256K --raid-devices=45 \
/dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde \
/dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj \
/dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo \
/dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt \
/dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy \
/dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad \
/dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai \
/dev/sdaj /dev/sdak /dev/sdal /dev/sdam /dev/sdan \
/dev/sdao /dev/sdap /dev/sdaq /dev/sdar /dev/sdas

Filesystem:

[root@Protocase ~]# mkfs.xfs -f /dev/md0
meta-data=/dev/md0               isize=256    agcount=79, agsize=268435392 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=21000267072, imaxpct=1
         =                       sunit=64     swidth=2752 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@Protocase ~]# mount /dev/md0 /raid0/
[root@Protocase ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdat2            289G  3.2G  271G   2% /
tmpfs                 3.9G  260K  3.9G   1% /dev/shm
/dev/sdat1            485M   62M  398M  14% /boot
/dev/md0               79T   35M   79T   1% /raid0

RAID0

[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 25.1944 s, 416 MB/s
[root@Protocase ~]# rm -f /raid0/zeros.dat 
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 25.1922 s, 416 MB/s
[root@Protocase ~]# rm -f /raid0/zeros.dat 
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 24.7665 s, 423 MB/s

RAID5

[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 25.2239 s, 416 MB/s
[root@Protocase ~]# rm -f /raid0/zeros.dat 
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 24.7427 s, 424 MB/s
[root@Protocase ~]# rm -f /raid0/zeros.dat 
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 24.2434 s, 433 MB/s

RAID6

[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 26.9032 s, 390 MB/s
[root@Protocase ~]# rm -f /raid0/zeros.dat 
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 26.5255 s, 395 MB/s
[root@Protocase ~]# rm -f /raid0/zeros.dat 
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 26.4338 s, 397 MB/s

I found it pretty strange that RAID5 seemed to outperform RAID0, but I tested several times and RAID5 consistently averaged 10-15 MB/s faster than RAID0. Maybe a bug in the kernel? I tried other dd block sizes ranging from 60KB to 4MB, but the results were pretty consistent. In the end it looks like I’m going to go with a RAID6 of 43 drives + 2 hot spares, which still yields ~400 MB/s throughput and 75 TB usable:

#!/bin/bash

mdadm --create /dev/md0 --level=raid6 -c 256K -n 43 -x 2 \
/dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde \
/dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj \
/dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo \
/dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt \
/dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy \
/dev/sdz /dev/sdaa /dev/sdab /dev/sdac /dev/sdad \
/dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai \
/dev/sdaj /dev/sdak /dev/sdal /dev/sdam /dev/sdan \
/dev/sdao /dev/sdap /dev/sdaq /dev/sdar /dev/sdas
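
For reference, the capacity arithmetic on that layout:

43 active drives - 2 for RAID6 parity = 41 data drives
41 x 2 TB = 82 TB, or about 74.6 TiB – which df rounds to the 75T shown below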

Update: A coworker suggested looking into a write-intent bitmap to improve rebuild speeds. After adding an internal bitmap with a 256 MB chunk size, write performance didn’t degrade much, so this looks like a good addition to the configuration:

[root@Protocase ~]# mdadm -G --bitmap-chunk=256M --bitmap=internal /dev/md0
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 25.8157 s, 406 MB/s
[root@Protocase ~]# rm -fv /raid0/zeros.dat
removed `/raid0/zeros.dat'
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 26.4233 s, 397 MB/s
[root@Protocase ~]# rm -fv /raid0/zeros.dat
removed `/raid0/zeros.dat'
[root@Protocase ~]# dd if=/dev/zero of=/raid0/zeros.dat bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 26.2593 s, 399 MB/s
[root@Protocase ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdat2            289G  3.2G  271G   2% /
tmpfs                 3.9G   88K  3.9G   1% /dev/shm
/dev/sdat1            485M   62M  398M  14% /boot
/dev/md0               75T  9.8G   75T   1% /raid0
[root@Protocase ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdas[44](S) sdar[43](S) sdaq[42] sdap[41] sdao[40] sdan[39] sdam[38] sdal[37] sdak[36] sdaj[35] sdai[34] sdah[33] sdag[32] sdaf[31] sdae[30] sdad[29] sdac[28] sdab[27] sdaa[26] sdz[25] sdy[24] sdx[23] sdw[22] sdv[21] sdu[20] sdt[19] sds[18] sdr[17] sdq[16] sdp[15] sdo[14] sdn[13] sdm[12] sdl[11] sdk[10] sdj[9] sdi[8] sdh[7] sdg[6] sdf[5] sde[4] sdd[3] sdc[2] sdb[1] sda[0]
      80094041856 blocks super 1.2 level 6, 256k chunk, algorithm 2 [43/43] [UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU]
      bitmap: 2/4 pages [8KB], 262144KB chunk

unused devices: <none>

Load balancing in EC2 with Nginx and HAProxy

We wanted to set up a load-balanced web cluster in AWS for expansion. My first inclination was to use ELB for this, but I soon learned that ELB doesn’t let you allocate a static IP, requiring you to refer to it only by DNS name. This would be OK except that our current DNS provider, Dyn, requires IP addresses when using their GSLB (geo-based load balancing) service.

Rather than let this derail the whole project, I decided to look into the software options available for load balancing in EC2. I’ve been a fan of hardware load balancers for a while, sort of looking down on software-based solutions without any real rationale, but in this case I really had no choice, so I figured I’d give it a try.

My first stop was Nginx. I’ve used it before in a reverse-proxy scenario and like it. The problem I had with it was that it doesn’t support active polling of nodes – the ability to send requests to the webserver and mark the node as up or down based on the response. As far as I can tell, using multiple upstream servers in Nginx lets you specify max_fails and fail_timeout, but a “fail” is only detected when a real request comes in. I don’t want to risk losing a real request – I like active polling.
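
To illustrate what I mean by active polling, here’s a toy sketch of the concept in shell – not how any real load balancer implements it (HAProxy’s option httpchk is the real thing), and the backend IPs and /health URL are made up:

# Probe each backend out-of-band and mark it up/down before any
# client request ever reaches a dead node
for host in 10.0.0.11 10.0.0.12; do
    if curl -sf --max-time 2 -o /dev/null "http://$host/health"; then
        echo "$host: up"
    else
        echo "$host: down"
    fi
done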

Installing Sun (Oracle) JDK 1.5 on an EC2 instance

I’m currently working on moving a Tomcat-based application into EC2. The code was written for Java 5.0. While Java 6 would probably work, I’d like to keep everything as “same” as possible, since EC2 presents its own challenges. I spun up a couple of t1.micro instances and copied everything over, including the Java 5 JDK, jdk-1_5_0_22-linux-amd64.rpm. Installing from RPM was easy, but the EC2 instance defaults to using OpenJDK 1.6:

[root@ec2 ~]# java -version
java version "1.6.0_20"
OpenJDK Runtime Environment (IcedTea6 1.9.10) (amazon-52.1.9.10.40.amzn1-x86_64)
OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)

There were a couple of things I had to do to get the system to accept the Sun JDK as its “real” java.

Alternatives

Red Hat’s “alternatives” system is designed to allow a system to have multiple versions of a program installed and make it easy to choose which one you want to run. Unfortunately I’ve found the syntax a bit strange and always have to Google it, so I figured I’d document it here for posterity.

So here’s the default:

[root@ec2 ~]# alternatives --config java

There is 1 program that provides 'java'.

  Selection    Command
-----------------------------------------------
*+ 1           /usr/lib/jvm/jre-1.6.0-openjdk.x86_64/bin/java

Enter to keep the current selection[+], or type selection number: 

Here’s how to add Sun java, assuming the java binary is in /usr/java/jdk1.5.0_22/jre/bin/java (where the RPM puts it).

[root@ec2 ~]# alternatives --install /usr/bin/java java /usr/java/jdk1.5.0_22/jre/bin/java 1
[root@ec2 ~]# alternatives --config java
There are 2 programs which provide 'java'.

  Selection    Command
-----------------------------------------------
*+ 1           /usr/lib/jvm/jre-1.6.0-openjdk.x86_64/bin/java
   2           /usr/java/jdk1.5.0_22/jre/bin/java

Enter to keep the current selection[+], or type selection number: 2
[root@ec2 ~]# java -version
java version "1.5.0_22"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_22-b03)
Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_22-b03, mixed mode)

Yay! Unfortunately this doesn’t help with the other problem I had with Tomcat: EC2 instances set the JAVA_HOME var to OpenJDK as well (/usr/lib/jvm/jre). Fortunately that’s also an easy fix.

Setting JAVA_HOME

The JAVA_HOME var is set in /etc/profile.d/aws-apitools-common.sh. Comment out this line:

export JAVA_HOME=/usr/lib/jvm/jre

Create a new file, /etc/profile.d/sun-java.sh, and put this in it:

export JAVA_HOME=/usr/java/jdk1.5.0_22/jre

Also in that file I added the following to instruct the JVM to process all dates in America/New_York, since that’s the timezone all of our other servers use, and it makes reading log files easier when all dates are in the same tz:

export TZ=America/New_York

(I found I had to do this even after pointing /etc/localtime to the correct zoneinfo – Java was stuck on UTC even after the rest of the system was using America/New_York.)
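
A quick sanity check after making both changes – just a sketch, assuming the RPM layout used above:

# In a fresh login shell (or after sourcing the new profile script):
source /etc/profile.d/sun-java.sh
echo $JAVA_HOME     # should print /usr/java/jdk1.5.0_22/jre
java -version       # should report 1.5.0_22 after the alternatives change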

Rescan SATA bus (aka hot-adding a SATA disk on a Linux guest in VMware without rebooting)

Linux supports hot-adding disks but whenever I add a new vdisk in VMware the new disk doesn’t show up unless I reboot, which defeats the purpose of hot-add. This command forces a rescan of the bus:

echo "- - -" > /sys/class/scsi_host/host0/scan

dmesg shows the new disk has been found:

  Vendor: VMware    Model: Virtual disk      Rev: 1.0 
  Type:   Direct-Access                      ANSI SCSI revision: 02
 target0:0:2: Beginning Domain Validation
 target0:0:2: Domain Validation skipping write tests
 target0:0:2: Ending Domain Validation
 target0:0:2: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 127)
SCSI device sdd: 1048576000 512-byte hdwr sectors (536871 MB)
sdd: Write Protect is off
sdd: Mode Sense: 03 00 00 00
sdd: cache data unavailable
sdd: assuming drive cache: write through
SCSI device sdd: 1048576000 512-byte hdwr sectors (536871 MB)
sdd: Write Protect is off
sdd: Mode Sense: 03 00 00 00
sdd: cache data unavailable
sdd: assuming drive cache: write through
 sdd: unknown partition table
sd 0:0:2:0: Attached scsi disk sdd
sd 0:0:2:0: Attached scsi generic sg3 type 0

Now, why there’s no “rescan_sata” command is something I can’t fathom, but that’s Linux for you.
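
If the guest has more than one SCSI host (check /sys/class/scsi_host/), a short loop covers them all – same trick, applied to each host:

# Rescan every SCSI host, not just host0
for h in /sys/class/scsi_host/host*; do
    echo "- - -" > "$h/scan"
done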

Making sure SSLv2 is disabled in Apache (and Nginx)

Edit Jan 24, 2012: Deleted all the crap from this story and just left the recommended Apache and Nginx SSL cipher suites for maximum security without SSLv2 and without BEAST vulnerability (at least according to Qualys).

Apache httpd

SSLProtocol -ALL +SSLv3 +TLSv1
SSLCipherSuite ECDHE-RSA-AES256-SHA384:AES256-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH:!AESGCM
SSLHonorCipherOrder on

nginx

        ssl_protocols  SSLv3 TLSv1;
        ssl_ciphers     ECDHE-RSA-AES256-SHA384:AES256-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH:!AESGCM;
        ssl_prefer_server_ciphers   on;
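
To double-check the result, openssl’s s_client can try to force each protocol: the SSLv2 attempt should fail to handshake while SSLv3/TLSv1 succeed. This assumes your local openssl build still includes the -ssl2 flag (many newer builds drop SSLv2 entirely), and www.example.com is a placeholder:

openssl s_client -connect www.example.com:443 -ssl2 < /dev/null   # expect: handshake failure
openssl s_client -connect www.example.com:443 -ssl3 < /dev/null   # expect: certificate/session details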
