Sometime Around Midnight

It's sometime around midnight and I'm performing real-world disaster recovery testing on our live servers. In the first test we switched over from the East Coast infrastructure to the West Coast with 19 minutes of downtime. Not ideal. My target was 15, and one simple mistake cost us those extra minutes.

Shortly we will be performing the switch-over using a new idea, and if everything goes to plan we can get the time down to 7 minutes.

Under the contract we are allowed 24 hours of downtime before anybody gets upset, as long as we let the customers know if we are going to be down for more than 8 hours. No advance warning needed.

Just because we are allowed doesn't mean we should, and I'll be very happy if we can get under ten minutes. In an ideal world we could switch infrastructure between locations on the fly with no downtime whatsoever. We are not there and, to be honest, there isn't a business case for investing the engineering resources in building such a thing, no matter how technically challenging and satisfying it would be.

Still waiting for RDS to do its thing.

