Twitter experienced a major outage last night, with tens of thousands of users around the world unable to use the platform for over an hour. The issue now appears to have been resolved, and Twitter have stressed that there is no evidence of a security breach, attributing the outage to an “inadvertent change” to their systems.
Whilst not a huge incident in its own right, it is a timely reminder of the vulnerability of cloud-based apps. There is a widespread perception that these services are hosted so resiliently that they never experience outages. However, an analysis of public announcements about outages in popular internet services between 2009 and 2015 shows that this is not the case. The services analysed included:
- Twitter; and
Gunawi et al. (2016) found a total of 597 unplanned outages across the 32 services that they studied, i.e. almost 3 outages per year for each service. Only 7 of the 32 services achieved 99.9% uptime, and 2 did not even achieve 99% uptime. It would appear, then, that the idea of “five-nines uptime” (roughly 5 minutes of downtime per year) is still some way off! It must also be remembered that these figures only include outages that were reported in the media, so the true numbers are probably higher.
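To put those uptime percentages into perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, a year is taken as 365.25 days, and the 2009 to 2015 study period is assumed to count as seven calendar years.

```python
# Illustrative arithmetic only; names and assumptions are ours, not the study's.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumes an average year of 365.25 days


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Minutes of downtime per year permitted by a given uptime percentage."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


def outages_per_service_per_year(total_outages: int, services: int, years: int) -> float:
    """Average number of outages per service per year."""
    return total_outages / services / years


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.999):
        minutes = allowed_downtime_minutes(pct)
        print(f"{pct}% uptime allows {minutes:,.1f} minutes ({minutes / 60:,.1f} hours) of downtime per year")

    # 597 outages across 32 services; seven years is an assumption (2009 to 2015 inclusive)
    rate = outages_per_service_per_year(597, 32, 7)
    print(f"Average outages per service per year: {rate:.2f}")
```

On these assumptions, 99% uptime still permits more than three and a half days of downtime a year and 99.9% permits nearly nine hours, whereas five nines leaves a little over five minutes.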
Awareness of the reliability of the cloud is of increasing importance as organisations move towards cloud-based hosting of their critical corporate systems. Whilst moving to the cloud obviates the need for traditional IT disaster recovery (DR) planning, it certainly does not guarantee 100% uptime.