Hurricane Electric should have been prepared for this.
I’ve been reading, negotiating and occasionally writing service level agreements (SLAs) for many years. It’s an abstract exercise – so many nines for so much money. That’s until something happens and you have to figure out whether 1. you stupidly hand-waved the whole thing hoping that nothing would happen, 2. the calculated risk you took was worth it, or 3. you got screwed by your service provider.
This morning, none of this seems abstract. Yesterday evening, from about 6:30 p.m. to about 11:30 p.m., this website was offline because the Linode server that hosts it was down. Four companies share responsibility: PG&E for the lengthy power outage in Fremont, Hurricane Electric for the backup generator that didn’t work, Linode for picking a data center without adequate power redundancy and Tellus Venture Associates – aka me – for trusting in Linode to worry about the details.
It’s easy – and fun! – to blame PG&E for all the world’s problems, but power outages are a fact of life and this one doesn’t seem to be particularly egregious. Five hours is a long time but common enough, which is why critical facilities are supposed to have adequate backup capability. Hurricane clearly didn’t. It apparently relied on a single generator for the section of the data center where Linode lives, without proper maintenance and/or a Plan B if it failed.
Even with last night’s outage, Linode is still within its annual downtime quota – about nine hours – based on the SLA I accepted, although it failed on a monthly basis, far exceeding the allowable 43 minutes. Which means it owes me a prorated refund for the downtime: as a percentage of the monthly $25 fee I pay, five hours comes out to be 17 cents. Note to Linode: don’t bother.
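For anyone who wants to check the math, here is a minimal sketch in Python. The 99.9 percent uptime figure is an assumption: it isn't quoted from the SLA, it's simply the number that matches a nine-hour annual quota and a 43-minute monthly one. The 30-day month and the function names are likewise just illustrative choices.

```python
# Rough SLA arithmetic. Assumptions (not quoted from the SLA itself):
# a 99.9% uptime commitment and a 30-day billing month.
HOURS_PER_YEAR = 365 * 24
HOURS_PER_MONTH = 30 * 24


def allowed_downtime_hours(uptime_pct, period_hours):
    """Hours of downtime permitted in a period at a given uptime percentage."""
    return period_hours * (1 - uptime_pct / 100)


def prorated_refund(monthly_fee, outage_hours, period_hours=HOURS_PER_MONTH):
    """Share of the monthly fee that corresponds to the hours offline."""
    return monthly_fee * outage_hours / period_hours


annual_quota_hours = allowed_downtime_hours(99.9, HOURS_PER_YEAR)
monthly_quota_minutes = allowed_downtime_hours(99.9, HOURS_PER_MONTH) * 60
refund = prorated_refund(25.00, 5)

print(f"Annual quota: {annual_quota_hours:.2f} hours")
print(f"Monthly quota: {monthly_quota_minutes:.1f} minutes")
print(f"Refund for a five-hour outage: ${refund:.2f}")
```

Under those assumptions the annual quota works out to a little under nine hours, the monthly quota to about 43 minutes, and the refund on a $25 monthly fee to roughly 17 cents.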
PG&E performed poorly but as expected. Hurricane performed horribly. Even if the generator failure was a one-in-a-bazillion shot – which I seriously doubt – there’s no excuse for just kicking back and waiting for PG&E to fix things.
Linode performed well, from a cost/benefit perspective. It charges me $20 a month (plus another $5 for backup service) for more bandwidth, processing power and disk space than I need, with the clearly stated caveat that it might go down every so often.
So, it comes down to me. Did I make a good choice? I think so. Having this website go dark for a few hours on a Friday night is annoying and might have a lingering effect on traffic, but it’s not going to kill or even dent my business. Typically, though, I evaluate my IT infrastructure once a year, so when I do, I’ll do some comparison shopping to see if an SLA upgrade is worth the money and the time involved.