2022 Outage Analysis report finds that digital infrastructure downtime costs and consequences are worsening
- Published: Thursday, 09 June 2022 08:33
The digital infrastructure sector is struggling to achieve a measurable reduction in outage rates and severity, and the financial consequences and overall disruption from outages are steadily increasing, according to Uptime Institute’s 2022 Outage Analysis report.
“Digital infrastructure operators are still struggling to meet the high standards that customers expect and service level agreements demand – despite improving technologies and the industry’s strong investment in resiliency and downtime prevention,” said Andy Lawrence, founding member and executive director, Uptime Institute Intelligence.
“The lack of improvement in overall outage rates is partly the result of the immensity of recent investment in digital infrastructure, and all the associated complexity that operators face as they transition to hybrid, distributed architectures,” said Lawrence. “In time, both the technology and operational practices will improve, but at present, outages remain a top concern for customers, investors, and regulators. Operators will be best able to meet the challenge with rigorous staff training and operational procedures to mitigate the human error behind many of these failures.”
Uptime’s annual outage analysis draws on multiple surveys, information supplied by Uptime Institute members and partners, and its database of publicly reported outages.
Key findings include:
- High outage rates haven’t changed significantly. One in five organizations report experiencing a ‘serious’ or ‘severe’ outage (involving significant financial losses, reputational damage, compliance breaches and, in some severe cases, loss of life) in the past three years, marking a slight upward trend in the prevalence of major outages.
- The proportion of outages costing over $100,000 has soared in recent years. Over 60 percent of failures result in at least $100,000 in total losses, up substantially from 39 percent in 2019. The share of outages that cost upwards of $1 million increased from 11 percent to 15 percent over that same period.
- Power-related problems continue to dog data center operators. Power-related outages account for 43 percent of outages that are classified as significant (causing downtime and financial loss). The single biggest cause of power incidents is uninterruptible power supply (UPS) failure.
- The overwhelming majority of human error-related outages involve ignored or inadequate procedures. Nearly 40 percent of organizations have suffered a major outage caused by human error over the past three years. Of these incidents, 85 percent stem from staff failing to follow procedures or from flaws in the processes and procedures themselves.
- External IT providers cause most major public outages. The more workloads are outsourced to external providers, the more these operators account for high-profile, public outages. Third-party, commercial IT operators (including cloud, hosting, colocation, telecommunication providers, etc.) account for 63 percent of all publicly reported outages that Uptime has tracked since 2016. In 2021, commercial operators caused 70 percent of all outages.
- Prolonged downtime is becoming more common in publicly reported outages. The gap between the beginning of a major public outage and full recovery has stretched significantly over the last five years. Nearly 30 percent of these outages in 2021 lasted more than 24 hours, a disturbing increase from just 8 percent in 2017.
- Public outage trends suggest there will be at least 20 serious, high-profile IT outages worldwide each year. Of the 108 publicly reported outages in 2021, 27 were serious or severe. This ratio has been fairly consistent since the Uptime Intelligence team began cataloging major outages in 2016, indicating that roughly one-fourth of publicly recorded outages each year are likely to be serious or severe.
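
The ratio in the final finding can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, using only the two 2021 figures quoted above (108 publicly reported outages, 27 of them serious or severe). The function name `serious_share` and the derived threshold of reported outages are illustrative assumptions, not part of Uptime Institute's methodology.

```python
import math


def serious_share(serious: int, total: int) -> float:
    """Return the fraction of reported outages that were serious or severe."""
    if total <= 0:
        raise ValueError("total must be a positive number of outages")
    return serious / total


# Figures quoted in the report summary for 2021.
reported_2021 = 108
serious_2021 = 27

share = serious_share(serious_2021, reported_2021)
print(f"Serious or severe share in 2021: {share:.0%}")

# Illustrative assumption: if the share stays near one-fourth, this is the
# number of publicly reported outages per year that would correspond to
# 20 serious or severe ones.
target_serious = 20
implied_reported = math.ceil(target_serious / share)
print(f"Reported outages implying {target_serious} serious ones: {implied_reported}")
```

Under these figures the share works out to exactly one-fourth, which matches the ratio the report describes as fairly consistent since 2016.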