At what point does an incident move beyond the capabilities of business continuity strategies? And how does an organization know when this point has been reached? Ian Charters, FBCI, addresses these questions.
Section 8.1 of BS 25999-1 states that:
“The range of threats to be planned for should be determined by the organization’s risk appetite”. Risk appetite is defined in the glossary as “total amount of risk that an organization is prepared to accept, tolerate or be exposed to at any point in time”. The statement looks innocuous, but how do we interpret it to set the scope and content of the business continuity plan? Do we write a plan around predictable and likely threats? Or do we ‘weigh’ the risks, decide that no business continuity plans are required and still claim conformance with the standard?
Perhaps this statement should be in the previous section (determining business continuity strategy) since it is a recommendation about the range of threats which the continuity strategy (rather than just the plans) should address. What is concerning, for those attempting to implement the standard, is that no guidance is forthcoming as to how such a ‘total level of risk’ is to be measured, which is surely a prerequisite of a decision as to whether to accept exposure to it.
This article suggests a possible interpretation for this ‘range of threats’ and how this should determine the scope of strategies and plans.
We make certain assumptions when developing business continuity strategy and plans but these are often not documented and may therefore be misunderstood outside the BCM team. What does become quickly apparent is that we lack suitable terminology to succinctly express these ideas to those around us - exemplified by the ‘planning for a worst-case scenario’ cliché which can be interpreted as anything from a computer failure to nuclear annihilation.
When incidents occur the impact can be measured on a variety of scales including geographical, economic, physical and human impacts. Clearly the appropriate business continuity approach to an incident will depend on the scale of the actual event. But if we are hoping to plan a response in advance we do not have the luxury of prior knowledge of the impact. Instead, we should recognise that, as the intensity and scale of impacts increase, there are certain thresholds at which the business continuity response will need to change or may become ineffective.
A localised incident affecting a single site, with minimal injuries to staff, is well within the capability of a business continuity programme to provide an effective response, restoring the operation of the organization within acceptable timescales. Even if the locality is affected, an adequate separation between alternate sites and data storage locations should enable rapid recovery.
However, a more widespread incident at or near a site may have impacts on the surrounding community, causing economic problems, environmental damage and even long-term relocation of the population. A small business (such as an independent retail outlet) may have been totally dependent on this local market and infrastructure. A business continuity recovery strategy is designed to ensure that the business will be recovered to roughly the same position it was in before the incident. If the market, labour or infrastructure situation has changed significantly, this may not be appropriate and strategic decisions are required instead.
A serious, damaging event may also result in fatalities or long-term absence. Documentation and cross-training can only go so far; the loss of a significant proportion of the workforce may make it impossible to cover the required roles and run the business as before. Even a multinational organization may struggle to maintain its service in an area following significant local loss of staff. As well as problems caused by the lack of skills, the cause of the staff loss may result in reputational damage to the ‘brand’, making recovery in that location unviable.
The response of the local or national authorities to an incident should also be considered. Emergency powers may be imposed in response to a major physical incident, or civil strife, invasion or war could occur; any of these may fatally hamper the ability of the organization to invoke its recovery procedures. Equipment may have been requisitioned, transport may be unavailable and personnel may be unable to take the required actions. In this case the appropriate strategy may be one of withdrawal from the area rather than continuity of the business.
If the incident is of an economic nature (such as a banking collapse) then business continuity methods and strategies provide no direct response to these problems, though some of the impacts, such as failure of suppliers, may be mitigated by supply-chain management. In these circumstances a thorough review of the strategic business direction may be required, though one would hope the BC Manager would be asked to contribute his/her unique view of the organization's operation to this discussion.
The real ‘worst-case’ - a total annihilation of a district, region or the whole world - is a scenario for which business continuity has no answer because there is no business to recover (though civil authorities may still need to retain control). Businesses providing services solely to the Chernobyl region or in the parts of New Orleans left abandoned will never recover since there is no longer a requirement for those services. Likewise in a war-zone, an evacuation may become permanent withdrawal.
These are complicated (and somewhat dismal) ideas to get across to senior management when developing a business continuity strategy and we lack the language to express them succinctly. It is relatively easy to express geographical limits in the commercial sector in terms of viability and ‘market areas’ but more difficult for the provision of appropriate services in the public sector. However, few useful terms exist to express the scale and intensity of an incident in terms of disruption or loss of life - and ‘risk appetite’ isn’t one of them.
We cannot develop a rational business continuity strategy without making certain assumptions about the practical limits to our planning. This is not a risk decision: assigning probabilities to these sorts of events is about as scientific as roulette. Instead, each organization must reach its own conclusions about the limits to its ability to recover, which may include commercial, regulatory or reputational obligations. These limits can be explored, for example, in terms of geographical scale, intensity of impacts and staff losses, from which logical decisions can be made on such issues as separation distance, the extent of staff resilience required and the point at which the BCM ‘white flag of surrender’ is raised and top management has to take hard strategic decisions about a new direction for the organization in a changed environment.
One term which may be useful as a way of expressing the limits discussed above is ‘Maximum Survivable Incident’ (MSI). However, other terms and metrics are needed to refine our vocabulary in this area and make it understandable to top management so they can make appropriate decisions; otherwise we risk giving the organization a false sense of security as to the capabilities of our plans.
When an existing strategy is reviewed, an organization may well discover that its planning assumptions have led to its effective MSI being lower than it requires. For example, alternate locations may be too close or staff skills too concentrated. Improvements in its business continuity strategy should then be aimed at increasing its ability to cope with a bigger incident - and we should have metrics which enable us to measure this improvement in a way that can be demonstrated to management.
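As a purely illustrative sketch of what such a metric might look like, the short Python example below compares the limits an organization believes it needs with the limits its current arrangements actually support, and reports any shortfall. The two dimensions, the names PlanningLimits and msi_shortfalls, and all the figures are assumptions made up for this example; neither BS 25999-1 nor this article defines them.

```python
from dataclasses import dataclass


@dataclass
class PlanningLimits:
    """Hypothetical planning limits; fields and units are illustrative only."""
    separation_km: float        # distance between primary and alternate site
    staff_loss_fraction: float  # share of staff that can be lost with roles still covered


def msi_shortfalls(effective: PlanningLimits, required: PlanningLimits) -> dict[str, float]:
    """Return, per dimension, how far the effective limit falls short of the required one."""
    gaps = {}
    for name in ("separation_km", "staff_loss_fraction"):
        gap = getattr(required, name) - getattr(effective, name)
        if gap > 0:  # only report dimensions where current arrangements fall short
            gaps[name] = gap
    return gaps


# Invented figures: the alternate site is closer than the organization requires.
required = PlanningLimits(separation_km=50.0, staff_loss_fraction=0.30)
effective = PlanningLimits(separation_km=8.0, staff_loss_fraction=0.30)

shortfalls = msi_shortfalls(effective, required)
print(shortfalls)  # {'separation_km': 42.0}
```

Repeating the same comparison after each change to the strategy would give management a simple before-and-after view, though real limits would need to be agreed by each organization for itself.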
Section 4.2 (Context) in BS 25999-1 says that BCM Policy should be ‘appropriate to the nature, scale, complexity, geography...’. Part of that context should be to state in the Policy the limitations of a BCM response and to identify the points at which the scale of the impact means that matters will be in the hands of the civil authorities or will require a strategic withdrawal rather than the planned business recovery. However, we are hampered partly by a lack of accepted terminology and sometimes by an unwillingness to face up to the realistic limitations of our response.
Author: Ian Charters, FBCI, is a BCM consultant and trainer for Continuity Systems Ltd.
www.continuity.co.uk - ianc@continuity.co.uk
Make a comment
I found Ian's article most refreshing in its outlook. As somebody who has been involved in the business continuity industry for nearly a decade now, I have developed a view that we have three principal flavours of continuity management: operational continuity, business continuity and an overwhelming event.
Operational continuity is something that managers do on a daily basis - members of staff are on leave or call in sick, customer complaints have to be dealt with and there are often minor system disruptions. Managers don't refer to plans, they get on and manage the impacts as they arise, their continuity strategies normally revolving around reallocation or reprioritisation of work, approving overtime or bringing in temporary resource. In this context, [planned disruptions such as] the summer or Christmas holidays may be considered as continuity events.
Situations often escalate beyond these capabilities and this is where the organization's planned business continuity response can kick in. Established corporate solutions may be invoked, such as crisis communications, IT disaster recovery or alternate sites. Incident management teams are formed to respond to impacts occurring across more than one part of the operation.
Finally, we come to the incidents that Ian describes, which I refer to as ‘overwhelming events’. These events are characterised by multiple impacts, where the decisions relating to response implementation may be out of your control. Having worked in the City during the IRA bombing campaign of the 90s and during the tragic events of September 11th, I now realise that for these types of incident, no amount of planning can mitigate all of the impacts. Some elements of a plan may help, for example contact details or a staff incident line may be very useful when trying to communicate with staff, but it should be recognised that decisions affecting your organization's options will be taken by people outside your control.
I was happy to have this theory validated, and indeed simplified, when consulting to a group of operational managers from a local authority. One of the managers nodded at the end of the briefing. "I think I get that," he said, "Operational continuity is stuff I deal with, business continuity is stuff I need help with and an overwhelming event means we're stuffed".
I am frequently amazed at the arrangements that some organizations feel they have to implement for these overwhelming events. For example (and I fear this may be a heresy), there has been plenty of really good material published in recent years in relation to a flu pandemic. We're told that we're massively overdue a pandemic, that we have seen mutations of the H5N1 strain and that, given globalisation, the impacts will be seen worldwide. However, in the grand scheme of things, what can companies do that would keep their businesses operational? We cannot possibly predict all the effects of a pandemic: staff may be affected directly or indirectly through caring for family; people may actually demand more services - I doubt that supermarkets would be able to meet the demand for home delivery while suffering staff impacts themselves. One thing is for sure: the government would formulate a response, as the entire country would be affected. You may have worked out extensive staffing strategies, but if the government tells people to stay at home, your planning will be undermined. In a way, this response may help your organization take decisions - consider the financial investments industry - if the markets are suspended, there will be no need to process new product applications.
If you plan for a flu pandemic, where does it stop? I don't want to set the rabbits running, but we're told that we're overdue a solar storm, and I've not heard of too many BCM managers planning for an event like the great solar storm of 1859, which knocked out the infant telegraph systems in Europe and North America. I doubt that many companies have factored in the effects of the super volcano under Yellowstone National Park exploding, but we're told that it too is overdue. Worryingly for me, companies may be missing the point. During a flu pandemic, the impacts may be managed by the government, but during a flu epidemic it's far less likely that you'll get this support. Some of the planning for a flu epidemic may be useful should a pandemic occur, but organizations must recognise the limitations of business continuity arrangements.
We, as business continuity professionals, must always recognise that there is an option to stop activity and be prepared to offer it as a solution. The challenge for most managers is recognising when to raise the white flag. The challenge for senior managers is deciding what to do next.
Richard Jones, Business Continuity Manager.

• Date: 7th Nov 2008 • Region: UK/World • Type: Article • Topic: BC general
UPDATED 11TH NOVEMBER