RTO, MTPD and putting the cart before the horse
By Rainer Hübert, MBCI
This is a story of putting the cart before the horse. In this case, however, it probably was not avoidable because of the historical timeline of events and developments. So, let’s start at the beginning, and that was IT disaster recovery with its recovery time objective (RTO).
Initially, as we all know, the RTO described the time that the IT department had to repair a failed resource or to provide a contingency resource. As time moved on, RTOs were also used by other organizational units that had responsibility for the availability of resources. To avoid misunderstandings: 'resource' is everything that is needed to run an administrative, production, service or other business process (1).
The problem was – and, more often than not, still is – that organizational units either did not know or did not care how long the users could live without a failed resource. They just set the RTO on the basis of the time they needed to solve the problem, regardless of whether this was quick enough for the resource-using processes to avoid severe losses. It was up to those processes to cope with the situation. The guideline was ‘as quick as possible’: of course within the constraints of budgets and personnel available. Whether this was quick enough to save the company was not really known but was hoped for. It was also not known whether a somewhat slower approach could save costs because ‘as quick as possible’ was not actually necessary.
So someone came up with the idea to actually ask the users of those resources how long they could live with a disturbed or unavailable process because of a failed resource: only to find out that the users didn’t know either!
At this point the concept of ‘maximum tolerable period of disruption’ (MTPD or MTPoD) was born and eventually implemented into the globally recognized British business continuity management standard, BS25999.
Unfortunately, with this, a new unanswered question arose: 'What is tolerable?'
BS25999 defines MTPD as the “duration after which an organization’s viability will be irrevocably threatened because of the adverse impacts that would arise as a result of not providing a product/service or performing an activity.” This is not a particularly helpful definition and it has led to many discussions about what it actually means. How close to bankruptcy will a company tolerate getting? What losses will or can it suffer?
To discuss this, I would like to introduce the concept of a ‘point of bankruptcy’ (PoB), the point where a company has to file for bankruptcy, and add it to the arsenal of BCM lingo.
There is no clear answer to the question of “what is tolerable?” The answer depends on the risk appetite of the company: its tendency to embrace or to avoid risk. That appetite determines the size of what I call the 'panic zone.' The bigger the panic zone, the bigger the time gap between the MTPD and the PoB, and the smaller the operationalized value of ‘tolerable’ loss or risk.
Generally speaking, the higher the acceptance of risk is, or the smaller the panic zone can be, the closer a company is able to approach its PoB before panicking, and the cheaper the recovery measures and preparations are in the end.
A big panic zone equals less risk appetite, early panicking and little tolerance to loss and disruption. A small panic zone means large risk appetite, late panicking and lots of tolerance to loss and disruption.
Please note the Vector of Calculation on the top of the picture above. It is crucial in this approach to get the direction of the calculation right: The RTO(2) is no longer at the beginning of the calculation, but is the end result. The calculation in this model begins with the PoB.
Thus the calculation of the RTO(3) would be:
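Read together with notes (2) and (3), the calculation can be sketched as follows. The sketch assumes the PoB has already been translated into a time value, and it treats reaction time and buffer time as separate allowances subtracted from the MTPD; note (2) describes the alternative in which both are folded into the RTO itself.

```latex
\begin{align}
\mathrm{MTPD} &= \mathrm{PoB} - \mathrm{Panic\ Zone} \\
\mathrm{RTO}  &= \mathrm{MTPD} - \mathrm{Reaction\ Time} - \mathrm{Buffer\ Time}
\end{align}
```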
The nice thing with this approach is that we can clearly identify the point of bankruptcy, since it is determined in most countries by legislation and regulation and can be reported by a company’s financial organization (4). Based on this, we can quantify how much the company is allowed to lose in turnover, cash or profits, or how far it may increase its debts, before it has to file for bankruptcy. Now we have a non-negotiable fixed point from which to start when calculating the MTPD and eventually the RTO. It is not a matter of preference or appraisal, but is based on hard facts and regulated algorithms.
Originally, the PoB is financial data, expressed in currency, not in time. So we need to identify what effect time has on reaching the PoB, thus translating financial data into time data. Consequently, we need to identify the effect of the duration of a failed process on loss of turnover, cash or profits, or increase of debts: which is typically done with a business impact analysis (BIA).
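A minimal sketch of that translation, under stated assumptions: the BIA is taken to yield an estimated loss for each day of the outage, and the finance function is taken to supply the total loss the company can absorb before it must file. The function names, the daily figures and the day-based units are all hypothetical and serve only to show the direction of the calculation, from PoB to MTPD to RTO.

```python
from itertools import accumulate


def days_until_pob(daily_losses, loss_capacity):
    """Return the first outage day (1-based) on which the cumulative loss
    reaches loss_capacity, or None if it is never reached in the data."""
    for day, total in enumerate(accumulate(daily_losses), start=1):
        if total >= loss_capacity:
            return day
    return None


def derive_rto(pob_days, panic_zone_days, reaction_days, buffer_days):
    """Work backwards from the PoB: MTPD = PoB - panic zone, then
    RTO = MTPD - reaction time - buffer time (assumed to be separate)."""
    mtpd = pob_days - panic_zone_days
    rto = mtpd - reaction_days - buffer_days
    if rto <= 0:
        raise ValueError("no time left for recovery; revisit the assumptions")
    return mtpd, rto


# Hypothetical BIA output: estimated loss per day of outage, in currency units.
daily_losses = [10_000, 20_000, 40_000, 80_000, 80_000, 80_000, 80_000, 80_000]
# Hypothetical figure from finance: loss the company can absorb before filing.
loss_capacity = 300_000

pob_days = days_until_pob(daily_losses, loss_capacity)
mtpd_days, rto_days = derive_rto(pob_days, panic_zone_days=2,
                                 reaction_days=1, buffer_days=1)
print(pob_days, mtpd_days, rto_days)
```

Widening the panic zone in this sketch shortens the MTPD and therefore the RTO, which is the cost effect described earlier: less risk appetite demands faster, and usually more expensive, recovery.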
Historically, all this began with the RTO. Much later the MTPD was added. Now we factor in the PoB and can start calculating the time constraints of reaction to a disruption from this new fixed and indisputable point. Rather than being the start, the RTO becomes the end result of determining the necessary reaction times after a disruption. Now the horse is put before the cart.
Author: Rainer Hübert, MBCI, REX Management Systems GmbH & Co.
(1) Compare the definition of ‘resource’ in the draft ISO 27301.
(2) In cases where the RTO is defined as Reaction Time + Repair Time + Buffer Time, the value of the RTO is identical to the value of the MTPD. Even in this case, however, the RTO is not the same thing as the MTPD: the RTO is defined as Reaction Time + Repair Time + Buffer Time and needs to meet the time requirement of the MTPD, while the MTPD is defined as PoB – Panic Zone.
(3) It is no longer: RTO + Buffer Time + Reaction Time = MTPD, since this would represent a false cause and reaction relation.
(4) Or it should be: more often than not, accountants cannot answer this question at short notice, most probably because it was never asked before.
I do not know of many companies that want to base any risk calculation or determination on the eventual possibility of bankruptcy. On the other hand, I do believe that senior management wants to develop a cost-efficient program to reduce risk and prevent adverse events, even to the point of quantifying them as a particular monetary value (e.g., the company or business unit is willing to tolerate up to $1M in losses).
So, the issue is to develop a valid RTO number. The RTO number is developed by the business function owner. They determine how long they can be without a particular service or application before they could possibly incur the maximum tolerable loss (MTL) value and over what time period that would occur. That time period becomes the MTPD.
Then the IT department provides the possible scenarios of IT recovery times versus costs from which the business owner needs to select the one most acceptable for them. This we will call RTO1.
Next, the business owner needs to determine any immediate workarounds or processes that can be used to keep their operations functioning (e.g., order processing, AR billing, etc.) until the IT component is restored or failed over and the manually processed data can be resynchronized in the recovered system(s). We will call that RTO2. So the actual MTPD value equals RTO1 + RTO2.
This has worked successfully for me with numerous clients.
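The selection step in this approach can be sketched as below, reading "MTPD equals RTO1 + RTO2" as a constraint that the chosen IT recovery time plus the workaround and resynchronization time must fit within the MTPD. That reading is an assumption, as are the option names, hours and costs, which are purely hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryOption:
    name: str
    rto1_hours: float   # IT recovery time offered for this option
    annual_cost: float  # what the option costs to keep ready


def cheapest_fitting_option(options, rto2_hours, mtpd_hours):
    """Return the cheapest option with RTO1 + RTO2 <= MTPD, or None."""
    fitting = [o for o in options if o.rto1_hours + rto2_hours <= mtpd_hours]
    return min(fitting, key=lambda o: o.annual_cost, default=None)


# Hypothetical scenarios supplied by the IT department.
options = [
    RecoveryOption("hot standby", 4, 250_000),
    RecoveryOption("warm standby", 24, 90_000),
    RecoveryOption("restore from backup", 72, 20_000),
]

choice = cheapest_fitting_option(options, rto2_hours=8, mtpd_hours=48)
print(choice.name if choice else "no option fits within the MTPD")
```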
Francisco Fernández Antonio
Although I found the article interesting, it didn’t provide a useful guide on how to calculate the panic zone. I did the very same thing for my enterprise. I calculated a PoB, but instead of leaving it to the managers to determine the MTPD, I offered them the time after which the income from before the crisis would no longer sustain the company; that was therefore the maximum time the company could be “out of the market” before the damage would be permanent.
The model was run under different scenarios, considering the week of the month and the month of the year, the financial forecast of sales and income, and the payments due in the short and long term.
I hope this works for someone else.
Please no, not another acronym! MTPD is fine. If you find it hard estimating MTPD, then having to work out Point of Bankruptcy isn’t going to help.
We don’t need PoB and it’s not all that useful a concept as many organisations don’t go bankrupt – they just get taken over by another organisation.
• Date: 25th November 2011 • Region: Europe/World • Type: Article • Topic: BC plan development