On Monday, I had a conversation with Derrick Harris of GigaOM, which he published here. I’ve followed up with a piece on my tech blog, adding a few thoughts and ripping the naivety of certain analysts…

2 Responses to “Live, learn, fix, repeat – my thoughts on the recent AWS outage.”
  1. Dave Walker says:

    I read the GigaOM piece; it was very thoughtfully put, and it’s nice to see the press calling on you :-) .

    I also read the detailed report Amazon released about the outage – good on them for doing so – which served to correct a misconception I’ve had for a while. Basically, I’d erroneously mapped AWS availability zones onto geographically located datacentres, in my mind; I thought AMER would be AWS Zone A, EMEA Zone B, APAC Zone C or some permutation thereof. As it turns out, if I read the report correctly, you can have all 3 zones within the same datacentre. I’m not sure that’s a good idea.

    Also, I wonder whether there’s a bigger story here – some of the customers’ services which were taken out were distributed across all 3 availability zones, and (owing to their size, I would hope) more datacentres than had an outage; for example, I’ve read a few articles by Adrian Cockcroft about how careful Netflix have been to maximise the resilience of their streaming service. Was the fact that a bunch of such distributed environments still went down just a matter of availability zones being co-located, or did global server load balancing somehow fail to kick in (or was GSLB mostly done between datacentres that all suffered interruptions)?

    While I agree with your sentiment on “live, learn, fix, repeat”, I wonder if an element of the “learn, fix” component is going to involve changing the mapping between datacentres and AWS availability zones – and while it can make cloud economics calculations harder, I wonder if more cloud-using organisations will start to look at dividing the services they have hosted with public cloud service providers across multiple such providers (I recommend a minimum of 3)…

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.