Exchange (OWA) CAS crashes with 503 error – again

This just started happening again, with these errors appearing in the event viewer:

Log Name: System
Source: Microsoft-Windows-WAS
Date: 9/18/2011 11:16:33 AM
Event ID: 5011
Task Category: None
Level: Warning
Keywords: Classic
User: N/A
Computer: exch2010fe1
Description:
A process serving application pool 'MSExchangeOWAAppPool' suffered a
fatal communication error with the Windows Process Activation Service.
The process id was '3760'. The data field contains the error number.

Log Name: System
Source: Microsoft-Windows-WAS
Date: 9/17/2011 6:47:07 AM
Event ID: 5009
Task Category: None
Level: Warning
Keywords: Classic
User: N/A
Computer: exch2010fe1
Description:
A process serving application pool 'MSExchangeOWAAppPool' terminated
unexpectedly. The process id was '3108'. The process exit code was
'0x800703e9'.

Log Name: Application
Source: Application Error
Date: 9/17/2011 6:46:30 AM
Event ID: 1000
Task Category: (100)
Level: Error
Keywords: Classic
User: N/A
Computer: exch2010fe1
Description:
Faulting application name: w3wp.exe, version: 7.5.7600.16385, time
stamp: 0x4a5bd0eb
Faulting module name: KERNELBASE.dll, version: 6.1.7600.16385, time
stamp: 0x4a5bdfe0
Exception code: 0xe053534f
Fault offset: 0x000000000000aa7d
Faulting process id: 0x%9
Faulting application start time: 0x%10
Faulting application path: %11
Faulting module path: %12
Report Id: %13

After reviewing the IIS logs and the event logs, I think it has to do with the WebReady document viewer – the thing in OWA that renders attachments like .doc files in the browser rather than forcing you to open Word or Excel. I think users were attempting to open corrupted files and that was causing it to crash. I’ve disabled WebReady in EMC (Server Config -> CAS) and I’ll see what happens.
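As a side check, the exit code in the 5009 event (0x800703e9) can be split into its HRESULT fields. This is only a sketch of the bit arithmetic; the function name decode_hresult is made up for illustration. Facility 7 is the Win32 facility, and Win32 error 1001 is ERROR_STACK_OVERFLOW ("Recursion too deep; the stack overflowed"), which is at least consistent with a document parser choking on a malformed file, though it doesn't prove it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a Windows HRESULT into severity, facility and code.
decode_hresult() {
    local hr=$(( $1 ))                        # accepts hex such as 0x800703e9
    local severity=$(( (hr >> 31) & 0x1 ))    # bit 31: 1 means failure
    local facility=$(( (hr >> 16) & 0x7ff ))  # bits 16-26: 7 is FACILITY_WIN32
    local code=$(( hr & 0xffff ))             # low 16 bits: the Win32 error number
    echo "severity=$severity facility=$facility code=$code"
}

# Exit code taken from the Event ID 5009 entry above.
decode_hresult 0x800703e9
```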


Integrating Amazon Simple Email Service with postfix for SMTP smarthost relaying.

So, we’ve outgrown the 500 outbound messages/day limit imposed by Google Apps’s Standard tier. A wise friend suggested SendGrid, but I figured it was worth looking into what options Amazon provides. I found SES and am in the process of setting it up. Hopefully I can set it up as a drop-in replacement, obviating the need for code changes to use it. SES is attractive for us because:

Free Tier
If you are an Amazon EC2 user, you can get started with Amazon SES for free. You can send 2,000 messages for free each day when you call Amazon SES from an Amazon EC2 instance directly or through AWS Elastic Beanstalk. Many applications are able to operate entirely within this free tier limit.

Note: Data transfer fees still apply. For new AWS customers eligible for the AWS free usage tier, you receive 15 GB of data transfer in and 15 GB of data transfer out aggregated across all AWS services, which should cover your Amazon SES data transfer costs. In addition, all AWS customers receive 1GB of free data transfer per month.

Free to try? Sounds good.

After signing up, the first thing I did was download the Perl scripts. Create a credentials file with your AWS access key ID and Secret Key (credentials can be found here when logged in). The credentials file (aws-credentials) should look like this:

AWSAccessKeyId=022QF06E7MXBSH9DHM02
AWSSecretKey=kWcrlUX5JEDGM/LtmEENI/aVmYvHNif5zB+d9+ct

Make sure to chmod 0600 aws-credentials. To ensure it’s working, run:

$ ./ses-get-stats.pl -k aws-credentials -s

If it doesn’t return anything it should be working correctly.

Next, you need to add at least one verified email address:

$ ./ses-verify-email-address.pl -k aws-credentials --verbose -v support@example.com

Amazon will send a verification message to support@example.com with a link you need to click to verify the address. Once you click, it’s verified. It’s important to note that initially your account will only be able to send email to verified addresses. According to this thread, you need to submit a production access request to send to unverified To: addresses. I did this and got my “approval” email about 30 minutes later.

To send a test email:

$ ./ses-send-email.pl --verbose -k aws-credentials -s "Test from SES" -f support@example.com evan@example.com
This is a test message from SES.

(Press ctrl-D to send.)

The next step is integrating the script with sendmail/postfix. The first thing I did was move my scripts to /opt/ (out of /root/) and attempt to run them with absolute pathnames (rather than ./ses-send-email.pl) and I got perl @INC errors:

[root@web2 ~]$ mv amazon-email/ /opt/
[root@web2 ~]$ /opt/ses-get-stats.pl -k aws-credentials -s
-bash: /opt/ses-get-stats.pl: No such file or directory
[root@web2 ~]$ /opt/amazon-email/ses-get-stats.pl -k aws-credentials -s
Can't locate SES.pm in @INC (@INC contains: /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.7/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.6/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl/5.8.7 /usr/lib/perl5/site_perl/5.8.6 /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.7/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.6/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl/5.8.7 /usr/lib/perl5/vendor_perl/5.8.6 /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8 .) at /opt/amazon-email/ses-get-stats.pl line 23.
BEGIN failed--compilation aborted at /opt/amazon-email/ses-get-stats.pl line 23.

The problem is that SES.pm isn’t in perl’s include path. To solve this, I tried adding the directory to the PERL5LIB environment var:

[root@web2 amazon-email]$ PERL5LIB=/opt/amazon-email/
[root@web2 amazon-email]$ echo $PERL5LIB
/opt/amazon-email/
[root@web2 amazon-email]$ cd
[root@web2 ~]$ export PERL5LIB
[root@web2 ~]$ /opt/amazon-email/ses-get-stats.pl -k aws-credentials -s
Cannot open credentials file . at /opt/amazon-email//SES.pm line 54.
[root@web2 ~]$ /opt/amazon-email/ses-get-stats.pl -k /opt/amazon-email/aws-credentials -s
Timestamp               DeliveryAttempts        Rejects Bounces Complaints
2011-04-27T20:27:00Z    1                       0       0       0
[root@web2 ~]$

This worked from my shell … but didn’t allow postfix to send the message, most likely because Postfix runs the pipe command with its own minimal environment, so the exported variable never reaches the script. After a couple more attempts at doing this “the right way,” I just ended up dropping a symlink to SES.pm in /usr/lib/perl5/site_perl and the @INC error went away.
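A likely explanation for the PERL5LIB failure, sketched below under the assumption that Postfix hands its pipe commands a scrubbed environment: a variable exported in a shell is only visible to processes that inherit that shell's environment. The function name env_visibility is hypothetical, and env -i stands in for what a daemon with a minimal environment would do.

```shell
#!/usr/bin/env bash
# Hypothetical demo: compare a child that inherits PERL5LIB with one
# started from an empty environment (env -i).
env_visibility() {
    local inherited scrubbed
    # Normal child process: sees the variable.
    inherited=$(PERL5LIB=/opt/amazon-email/ /bin/sh -c 'echo "${PERL5LIB:-unset}"')
    # env -i wipes the environment before starting the child.
    scrubbed=$(PERL5LIB=/opt/amazon-email/ env -i /bin/sh -c 'echo "${PERL5LIB:-unset}"')
    echo "inherited=$inherited scrubbed=$scrubbed"
}

env_visibility
```

If that is what happened here, it would also explain why the symlink worked: perl's built-in @INC directories don't depend on the environment at all.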

After following Amazon’s instructions for editing main.cf and master.cf, I still was unable to send mail through Postfix, even though I could send directly through the perl scripts. I kept getting this error:

Apr 28 11:26:32 web2 postfix/pipe[27226]: A2AD33C9A6: to=, relay=aws-email, delay=0.35, delays=0.01/0/0/0.34, dsn=5.3.0, status=bounced (Command died with status 1: "/opt/amazon-email/ses-send-email.pl". Command output: Missing final '@domain' )

Google led me to this blog post which led me to this other blog post which illuminated the problem: apparently the Postfix pipe macro ${sender} expands to the user@hostname of the mail sender. Since the hostname of an EC2 machine is usually something crazy like dom11-22-33-44.internal, that address is unlikely to be a verified sender address. The solution proposed by Ben Simon was to create a regex that maps user@internal to user@realdomain.com and have postfix apply it to everything. This didn’t work for me or the bashbang.com guys, who changed it to map from user@internal to validuser@realdomain.com. I found that you can eliminate the need for the mapping entirely by changing the master.cf entry to this:

  flags=R user=mailuser argv=/opt/amazon-email/ses-send-email.pl -r -k /opt/amazon-email/aws-credentials -e https://email.us-east-1.amazonaws.com -f support@example.com ${recipient}

The only difference between the above line and Amazon’s suggestion is that this replaces “-f ${sender}” with “-f support@example.com”, which is a validated email address.
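For comparison, the mapping idea from those blog posts amounts to something like the sketch below. It is purely illustrative: rewrite_sender and VERIFIED_SENDER are made-up names, and this is plain shell pattern matching rather than an actual Postfix map.

```shell
#!/usr/bin/env bash
# Hypothetical illustration of the "map internal senders to a verified address" idea.
VERIFIED_SENDER="support@example.com"

rewrite_sender() {
    case "$1" in
        # Anything sent from an EC2-style internal hostname gets the verified address.
        *@internal|*@*.internal) echo "$VERIFIED_SENDER" ;;
        # Everything else passes through untouched.
        *) echo "$1" ;;
    esac
}

rewrite_sender "root@dom11-22-33-44.internal"
rewrite_sender "evan@example.com"
```

Hardcoding the -f address in master.cf gets the same result for every message without needing any mapping.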

After this I was able to relay email successfully through SES. Whew!

Update 5/26/2011: We’ve been relaying through SES without issues for a few weeks now. I recently ran ses-get-stats.pl to see how many messages we’re actually sending and it’s a lot lower than expected. I’m still glad we moved to SES though, since it has no hard cap like Google Apps does:

$ /opt/amazon-email/ses-get-stats.pl -k /opt/amazon-email/aws-credentials -q
SentLast24Hours Max24HourSend   MaxSendRate
317             10000           5
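To put those numbers in perspective, here is a small sketch that turns the quota table into a percentage. It assumes the two-line layout shown above (header row, then values); quota_usage is a made-up helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the output of "ses-get-stats.pl -q" on stdin and
# report how much of the 24-hour quota has been used.
quota_usage() {
    # Line 2 holds the values: column 1 is sent, column 2 is the daily maximum.
    awk 'NR == 2 { printf "%.2f%% of daily quota used\n", 100 * $1 / $2 }'
}

# Sample input copied from the output above.
printf 'SentLast24Hours Max24HourSend   MaxSendRate\n317             10000           5\n' | quota_usage
```

In practice it could be fed directly: /opt/amazon-email/ses-get-stats.pl -k /opt/amazon-email/aws-credentials -q | quota_usage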

Going back to FiOS

I’m not sure why these guys operate this way – they’re more than happy to lose me as a customer and then throw huge discounts at me to get me back. If they’d just give me a good price I’d love not to have to go through this rigmarole. But after being with Cablevision for 2 months I checked Verizon’s pricing and it beat my current deal with Cablevision.

FiOS digital voice with number ported for free; 25/25 Mbps internet; HMDVR free “forever” plus a second HD STB, Showtime, Movie Channel and Flix. Since I already had the battery thing installed last time I had FiOS they gave me a fair discount. Basically the whole package for $87/month + tax, price locked for 2 years, no contract. Not as great of a deal as I’d had with FiOS originally, but it’s pretty good, and FiOS’s service is definitely better than Cablevision’s. I’ve heard Cablevision was rolling out their “DVR plus” service with all programs recorded “in the cloud” rather than on the actual box, but it’s been two months and I haven’t heard of it coming to Long Island. So basically 2 years later Cablevision’s service is exactly the same while Verizon has iPhone apps to control the DVR and use the phone as a remote, plus DVR that’s much faster and just generally better service.

On a side note, I noticed tonight I was having problems trying to stream Netflix to my Wii. I tried loading netflix.com on my laptop and that also didn’t work; it said “couldn’t find server movies.netflix.com.” I tested this via dig on my linux box and sure enough, movies.netflix.com isn’t resolving against the default Cablevision nameserver (167.206.3.206) – getting a SERVFAIL:

[evan@lunix ~]$ dig movies.netflix.com

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> movies.netflix.com
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 17569
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;movies.netflix.com.            IN      A

;; ANSWER SECTION:
movies.netflix.com.     232     IN      CNAME   merchweb-frontend-1502974957.us-east-1.elb.amazonaws.com.

;; Query time: 2129 msec
;; SERVER: 167.206.3.206#53(167.206.3.206)
;; WHEN: Sun Apr 24 01:23:58 2011
;; MSG SIZE  rcvd: 103

I tried the same query against Google’s nameserver (8.8.8.8) and it resolves correctly:

[evan@lunix ~]$ dig movies.netflix.com @8.8.8.8

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> movies.netflix.com @8.8.8.8
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43718
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;movies.netflix.com.            IN      A

;; ANSWER SECTION:
movies.netflix.com.     300     IN      CNAME   merchweb-frontend-1502974957.us-east-1.elb.amazonaws.com.
merchweb-frontend-1502974957.us-east-1.elb.amazonaws.com. 39 IN A 174.129.220.6

;; Query time: 34 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sun Apr 24 01:37:26 2011
;; MSG SIZE  rcvd: 119

I set my router to resolve against 8.8.8.8 rather than whatever Cablevision provides and now it works. I’m not sure if this is related to the big EC2 disaster of the past few days but it looks more like Cablevision’s fault than Amazon’s or Netflix’s.
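The comparison between the two nameservers comes down to the status field in dig's header line. A rough sketch of pulling that field out, assuming dig's usual header layout; dig_status is a made-up helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read dig output on stdin and print the response
# code from the ";; ->>HEADER<<-" line.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p'
}

# Header line copied from the failing Cablevision lookup.
echo ';; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 17569' | dig_status

# Against live resolvers (needs network access, so left commented out):
#   for ns in 167.206.3.206 8.8.8.8; do
#       echo "$ns: $(dig movies.netflix.com @"$ns" | dig_status)"
#   done
```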

How I fixed my Wii’s noisy disc drive & read errors.

I got my Wii in late 2006 and around 2009 I noticed it was starting to sound like a circular saw when the disc drive was spinning. This was annoying but it didn’t affect the games so I never thought much about it.

About two weeks ago, however, New Super Mario Bros. Wii stopped working:

The Wii basically worked except for the disc drive. As soon as I put a disc in, I’d get the error above – even if I didn’t start the game. I tried blowing in the disc slot (the only thing I could think of… worked for my old NES!) but the problem continued. I figured I had two choices: buy another Wii or attempt to fix mine. Since mine was already essentially useless I figured it couldn’t hurt anything to try fixing it.

After some searching I quickly learned that any work inside a Wii requires a Tri-Wing screwdriver just to open the case. I went to the hardware store and got a Tri-Wing bit but it was too big to be of any use on the Wii. I found this Silverwing Tri-Wing screwdriver on Amazon for under $5 that did the trick.

With trusty screwdriver in hand, I opened the Wii thusly:

With it now open I followed this guy’s advice:

Bending these small triangles down created a gap that stopped the vibration & the noise and made the Wii playable again.

Bend these small pieces down to stop the Wii disc drive's loud noise.

It has some new weird sounds when the disc first spins up and when it spins down (I assume the braking mechanism) but it’s practically silent during gameplay. Yay!

After switching back to Cablevision, FiOS users can’t call us.

So we switched back to Cablevision and it went pretty well, but apparently Verizon users can’t call our house number (ported from Verizon to Cablevision). Verizon users have to call from their mobiles in order to complete the call. I’m guessing that Verizon hasn’t updated their systems to indicate that they no longer “own” our number and is trying to route the call inside their network. Sucks because I can’t imagine Verizon jumping to help fix this since I’m not their customer anymore.

VMware Workstation 7 – virtual ethernet fails to start after changing vmnet8 subnet

After my recent wipe of my laptop, I reinstalled VMware Workstation and my Win XP VM was working fine. The one wrinkle I faced was that the subnet for the vmnet8 (NAT) vnic had changed from 192.168.250.0/24 to 172.16.132.0/24. The host machine had been 192.168.250.1, so rather than reconfiguring everything on the guest to point to a new IP for the host I figured it would be easier to change the subnet for vmnet8. I went into the Virtual Network Editor and just changed the subnet. Seemed to work correctly, but after doing a release/renew in Win XP I couldn’t get an IP.

I tried disconnecting the vnic and reconnecting it; the guest recognized that the “cable was unplugged,” but still couldn’t get an IP. I rebooted the guest – same thing. Restarted the vmware service and saw this:

[root@ehoffman ~]# /etc/init.d/vmware restart
Stopping VMware services:
   VMware USB Arbitrator                                   [  OK  ]
   VM communication interface socket family                [  OK  ]
   Virtual machine communication interface                 [  OK  ]
   Virtual machine monitor                                 [  OK  ]
   Blocking file system                                    [  OK  ]
Starting VMware services:
   VMware USB Arbitrator                                   [  OK  ]
   Virtual machine monitor                                 [  OK  ]
   Virtual machine communication interface                 [  OK  ]
   VM communication interface socket family                [  OK  ]
   Blocking file system                                    [  OK  ]
   Virtual ethernet                                        [FAILED]
[root@ehoffman ~]#

That’s weird. The vnic is up with the specified IP:

[root@ehoffman vmnet8]# ifconfig
vmnet1    Link encap:Ethernet  HWaddr 00:50:56:C0:00:01
          inet addr:172.16.3.1  Bcast:172.16.3.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fec0:1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:38 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

vmnet8    Link encap:Ethernet  HWaddr 00:50:56:C0:00:08
          inet addr:192.168.250.1  Bcast:192.168.250.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fec0:8/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:36 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

Checked the /var/log/vnetlib logfile and it appears to be some problem starting DHCP on vmnet8:

Internet Software Consortium DHCP Server 2.0
Copyright 1995, 1996, 1997, 1998, 1999 The Internet Software Consortium.
All rights reserved.

Please contribute if you find this software useful.
For info, please visit http://www.isc.org/dhcp-contrib.html

Configured subnet: 172.16.3.0
Setting vmnet-dhcp IP address: 172.16.3.254
Opened: /dev/vmnet1
Recving on     VNet/vmnet1/172.16.3.0
Sending on     VNet/vmnet1/172.16.3.0
Internet Software Consortium DHCP Server 2.0
Copyright 1995, 1996, 1997, 1998, 1999 The Internet Software Consortium.
All rights reserved.

Please contribute if you find this software useful.
For info, please visit http://www.isc.org/dhcp-contrib.html

Address range 192.168.250.128 to 192.168.250.254 not on net 192.168.250.1/255.255.255.0!
exiting.
Failed to start DHCP service on vmnet8
Failed to start some/all services
Feb 23 09:28:46 VNL_Load - LOG_ERR logged
Feb 23 09:28:46 VNL_Load - LOG_WRN logged
Feb 23 09:28:46 VNL_Load - LOG_OK logged
Feb 23 09:28:46 VNL_Load - Successfully initialized Vnetlib
Feb 23 09:28:46 VNL_StartService - Started "Bridge" service for vnet: vmnet0
Feb 23 09:28:47 VNL_CheckSubnetAvailability - Subnet: 172.16.3.0 on vnet: vmnet1 is available
Feb 23 09:28:47 VNL_CheckSubnetAvailability - Subnet: 192.168.250.0 on vnet: vmnet8 is available
Feb 23 09:28:47 VNL_StartService - Started "DHCP" service for vnet: vmnet1
Feb 23 09:28:47 VNL_EnableNetworkAdapter - Successfully enabled hostonly adapter on vnet: vmnet1
Feb 23 09:28:47 VNLServiceStart - Daemon process did not report status, returning failure
Feb 23 09:28:47 VNL_StartService - Failure in starting "DHCP" service for vnet: vmnet8
Feb 23 09:28:47 VNL_StartService - Started "NAT" service for vnet: vmnet8
Feb 23 09:28:47 VNL_EnableNetworkAdapter - Successfully enabled hostonly adapter on vnet: vmnet8
Feb 23 09:28:47 VNLServiceStatus - pid: 13512 for Netdetect service daemon on vnet: 0 is stale
Feb 23 09:28:47 VNL_StartService - Started "Netdetect" service for vnet: vmnet0
Feb 23 09:28:47 VNL_Unload - Vnetlib unloaded.
Started Bridge networking on vmnet0
Started DHCP service on vmnet1
Enabled hostonly virtual adapter on vmnet1
Started NAT service on vmnet8
Enabled hostonly virtual adapter on vmnet8
Started Network detection service

The important line there is “Address range 192.168.250.128 to 192.168.250.254 not on net 192.168.250.1/255.255.255.0!” Apparently DHCP is misconfigured. The config file for dhcpd for vmnet8 is /etc/vmware/vmnet8/dhcpd/dhcpd.conf. Here’s what it looked like:

subnet 192.168.250.1 netmask 255.255.255.0 {
        range 192.168.250.128 192.168.250.250;
        option broadcast-address 192.168.250.255;
        option domain-name-servers 192.168.250.1;
        option domain-name localdomain;
        default-lease-time 1800;                # default is 30 minutes
        max-lease-time 7200;                    # default is 2 hours
        option routers 192.168.250.2;
}

I changed the range to 192.168.250.2 to 192.168.250.127, thinking that was the problem, but it turned out to be the “subnet” line: dhcpd expects the network address there, so it should be “subnet 192.168.250.0 netmask 255.255.255.0” rather than “subnet 192.168.250.1 …”. After changing that, everything Worked As Intended.
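The value that belongs on the “subnet” line is just the host address ANDed with the netmask. A minimal sketch of that arithmetic in bash; the helper names (ip_to_int, int_to_ip, network_address) are made up for illustration and assume plain dotted-quad input without leading zeros:

```shell
#!/usr/bin/env bash
# Hypothetical helpers: compute the network address for an IP/netmask pair.

# Convert a dotted quad such as 192.168.250.1 to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Convert a 32-bit integer back to dotted-quad notation.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print the network address: the IP with all host bits cleared.
network_address() {
    local ip mask
    ip=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    int_to_ip $(( ip & mask ))
}

# The host's vmnet8 address and mask from the ifconfig output.
network_address 192.168.250.1 255.255.255.0
```

If the result differs from what is on the “subnet” line, as it did here, the address in the config has host bits set and is not a network address.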