Why disaster recovery should become a thing of the past…
- Published: Friday, 19 October 2018 09:32
Patrick Smith traces the history of IT disaster recovery and explains why he believes that it is time for the discipline to be pensioned off alongside RTOs and RPOs.
For businesses today, regardless of industry, the outage of a key IT system ranks among the most serious technology challenges they can face. In fact, the Business Continuity Institute’s 2018 Horizon Scan Report estimates that unplanned outages are the third biggest risk to businesses globally. Beyond the financial ramifications of downtime, the long-term reputational consequences are significant as customer confidence is dented. Rebuilding trust after a major IT failure can be a multi-year process.
In the 1970s, when data center managers first came into being, they began to understand how dependent on computers their organizations would soon become. With that in mind, they instigated the notion of disaster recovery – an insurance policy should one or more applications, storage components, databases or network elements go offline.
As IT developed into the 1990s and the dawn of the Internet era, our connectivity to and reliance upon computer systems became far more intense. As computers began to undertake real-time processing, not just batch processing, it was even more important that IT did not miss a beat. While there were global incidents caused by earthquakes, floods and other natural disasters, downtime was more likely to occur due to challenges with utilities, technology change or human error.
Two closely linked disciplines emerged: business continuity, or how the firm kept delivering its goods and services in case of an incident, and disaster recovery, otherwise known as how to get the IT environment back online after a problem.
A lack of affordable solutions meant that even into the early 1990s, always-on, high-bandwidth connectivity was a challenge. IT managers therefore tended to build redundant replicas of their IT environments, often in addition to infrastructure supporting local high availability. This wasn’t simply a business decision: regulators across many industries mandated that organizations offering critical services, such as financial services, embed proper contingency into their IT environments. Processes to get systems back online had to be written, distributed across organizations and, more importantly, practiced – especially when you consider the financial losses that were, and can still be, incurred from a loss of trading for even a few minutes.
All of this spawned an incredibly lucrative disaster recovery industry – and dented IT budgets for years. Disaster recovery planners were wedded to SLAs that included recovery point and recovery time objectives and were mapped to specific components of the IT infrastructure.
Despite this protection, when disaster struck, enterprises almost always attempted to ‘fix in place’ rather than invoke disaster recovery. Even when a recent test had been successful, the implications of failover for the IT environment meant that using the disaster recovery capability was considered a last resort, particularly given the complexity of reverting to normal running once the failed component or service had been restored. This ‘stay or go’ dilemma would add considerable delay to bringing business services back online.
And that’s the way it remained until cloud became prevalent and the always-on economy emerged, with its constant thirst for real-time access to and processing of data.
Traditional disaster recovery – particularly getting data recovered in the event of an emergency, and the costs and potential delays associated with it – is a concern for almost all organizations. However, I do not believe that in today’s real-time, never-off world, we should use outdated, legacy recovery principles to deliver business continuity. There’s no reason why organizations should pay for fully redundant data systems that are seldom used, especially when you could reduce your IT budget by transitioning from a disaster recovery approach to business continuity. Why continue to waste time, resources and money maintaining rigid processes to restore data at one disastrous moment, when there are now far more fluid ways to support a far more agile, dynamic way of working?
Businesses should look for affordable solutions that allow them to adopt a synchronous active/active approach to business continuity. Through synchronous replication, data is made available across two sites simultaneously. This allows businesses to create multiple concurrently running data sets at multiple sites. Even if one site were to fail, it would not result in downtime. This enables an uninterrupted user experience through shared, multi-location processing that reroutes traffic automatically in the event of failure. As a result, the concepts of recovery point objective (RPO) and recovery time objective (RTO) need no longer be a consideration for CIOs.
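To make the idea concrete, here is a minimal, purely illustrative sketch of the active/active principle described above. It is not any vendor’s product API – the `Site` and `ActiveActiveStore` names are invented for this example – but it shows why RPO and RTO fall away: every write commits to all live sites before it is acknowledged, so a site failure loses no data and reads simply reroute to a surviving site.

```python
from dataclasses import dataclass, field

@dataclass
class Site:
    """One replica site holding a full copy of the data (hypothetical model)."""
    name: str
    online: bool = True
    data: dict = field(default_factory=dict)

class ActiveActiveStore:
    """Toy model of synchronous active/active replication."""

    def __init__(self, sites):
        self.sites = sites

    def write(self, key, value):
        live = [s for s in self.sites if s.online]
        if not live:
            raise RuntimeError("no sites available")
        # Synchronous replication: the write commits to every live site
        # before it is acknowledged, so there is no data to 'recover' (RPO = 0).
        for site in live:
            site.data[key] = value

    def read(self, key):
        # Automatic rerouting: serve the request from any online site,
        # so a single-site failure causes no downtime (RTO = 0).
        for site in self.sites:
            if site.online:
                return site.data[key]
        raise RuntimeError("no sites available")

# A single site failing causes neither data loss nor downtime.
store = ActiveActiveStore([Site("london"), Site("frankfurt")])
store.write("balance", 100)
store.sites[0].online = False          # simulate one site going offline
print(store.read("balance"))           # served from the surviving site
```

Real implementations must also handle network partitions, write latency between sites and conflict resolution, which is precisely the complexity such solutions package up for the buyer.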
Once this approach has been implemented, IT professionals need only confirm that all instances are still online and working – that is all there is to disaster scenario system preparedness with this new technology.
If you would like to pension off disaster recovery and the rigidity for which it has become renowned, speak to your preferred vendors to see how they can help.
Patrick Smith, Field CTO for EMEA, Pure Storage
Patrick Smith is Pure Storage’s Field CTO for EMEA. As a senior technical advisor, he provides crucial input and leadership across engineering, product management, sales, marketing and presales.
With a 25-year career in financial services technology, Patrick has held roles at Credit Suisse, Goldman Sachs, Merrill Lynch and Nomura in London and New York. Most recently, he was responsible for core infrastructure engineering at Deutsche Bank.
Patrick’s career began as an electronic design engineer at STC Submarine Systems, where he worked on underwater telecommunications systems. He holds a Bachelor of Engineering in electrical and electronic engineering from Staffordshire University.