By John Glenn MBCI, CBCP
Caveat: This business continuity planner prefers to leave the training to the trainers. Trainers and planners, working together, are able to create and execute exercises which meet all training requirements.
The other day I was asked how I go about testing business continuity response plans. I mumbled a reply which was not satisfactory to either the person on the other end of the line or myself. Exercising such a plan is a bit more complicated than a simple sentence can convey.
As a consultant, I usually trot out the methodology I ‘borrowed’ from past experiences and tailor it to the client's requirements and capabilities. But this call required something more and ‘on the fly.’ The caller was considering my credentials for a staff (in-house) opportunity.
So, how would I go about exercising the plan as a staffer rather than a consultant?
A WORD ABOUT ‘TESTING’
Before going any further, a clarification. I am adamantly against ‘testing,’ both the process and the word. It is absent from my vocabulary (except when it comes in handy for a headline). Testing, as we all know from school days, means pass or fail; our skills or memory are challenged. If we fail, we are expected to suck it up and do better next time. We business continuity planners may ‘test’ the plan to discover deficiencies (technically known as got'chas!). But responders ‘exercise’ the plan to become familiar with their assignments and, more importantly, to develop confidence in their capabilities.
I don't test; I hold ‘exercises’. No one passes or fails an exercise; we all - and I mean all - learn from the exercises: what can be done better and how it can be done better. Finger pointing and blame placing are banished from the program.
MOVIN' ON UP
Exercises align with response plans. When we create a response plan for a functional unit, the plan includes everything from a minor process interruption to a total loss. If, for example, a server fails - probably the most common event in most organizations - some basic steps must be taken:
- Isolate the problem: Why did the server fail?
- Determine how long it will take to eliminate the problem.
- Investigate: Will the outage affect other functional units; can the Service Level Agreements (SLAs) be met?
- Escalate: If the failure will impact other functional units, then the interruption must be escalated to the next higher level.
These are Standard Operating Procedures (SOPs) for any event.
Based on the business impact analysis (BIA), each functional unit knows its critical processes and how long those processes can be degraded or interrupted before there is a negative impact on the operation. Typically there are ten or fewer critical processes identified for each functional unit. Operative word: ‘Typically’. There may be more; 10 is not fixed in concrete as an upper limit.
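The investigate and escalate steps can be pictured as a simple check of the estimated outage against the tolerances recorded in the BIA. What follows is only a minimal sketch in Python; the record layout, the function name, and the sample processes and hours are all hypothetical illustrations, not part of any particular plan or tool.

```python
from dataclasses import dataclass


@dataclass
class CriticalProcess:
    """One critical process from the BIA (all fields are hypothetical)."""
    name: str
    owner_unit: str            # functional unit that depends on the process
    max_outage_hours: float    # tolerance taken from the BIA / SLA


def units_to_escalate(estimated_repair_hours, processes, failing_unit):
    """Return the other functional units whose tolerance the outage would exceed."""
    affected = {
        p.owner_unit
        for p in processes
        if p.owner_unit != failing_unit
        and estimated_repair_hours > p.max_outage_hours
    }
    return sorted(affected)


# Illustrative data only; real values come from each unit's BIA.
processes = [
    CriticalProcess("order entry", "Sales", 4.0),
    CriticalProcess("payroll run", "HR", 48.0),
    CriticalProcess("server monitoring", "IT", 1.0),
]

# A server owned by IT is expected to be down for six hours.
to_notify = units_to_escalate(6.0, processes, failing_unit="IT")
print(to_notify if to_notify else "Handle within the unit; no escalation")
```

An empty result means the interruption can be handled inside the functional unit; any unit named in the result is a reason to escalate to the next higher level.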
The (planner as) trainer is well advised to start off the training exercises with a single, simple process. The initial effort is the "let's make certain we have everything covered" desktop walk-through. Everything is put on the table; everything is open to discussion. Something is ‘off-the-wall?’ Talk about it anyway.
Should the planner be concerned if something was missed in the initial planning effort? A resounding "No." As a business continuity planner I have never - not once - seen a complete, bullet-proof, initial plan.
This, of course, raises the question: ‘Who creates the plan?’ Is it the planner's job, or is it the job of the functional unit’s subject matter experts (SMEs), with the planner acting as coach or mentor? In any, and every, event, expect to uncover some additional got'chas. That is one of the reasons for the exercises.
The exercise purpose should be clearly defined and expected results known to all. There should be a high ‘comfort level’ for initial exercises. Work through each of the critical processes in this manner.
MAKING IT REAL
As responders develop confidence in their abilities, the exercises must become more realistic. Pressure is added, but only that which would be generated by a real event. There are several reasons to heighten tension:
First, the responders need to get a taste of how things might be if a disaster event occurred.
Second, this level should identify personnel who can handle the pressure and, more importantly, those who cannot.
Third, by involving the people who aggravate a situation - a manager who insists that someone must do something NOW! - those people may begin to realize that their actions are counter-productive.
The second and third reasons can cause a planner sleepless nights. It is the planner's obligation to diplomatically replace a panicking person who may be a manager with a person who reports to the manager but who can keep a cool head. Swapping roles may not sit well with the manager, and the stand-in manager must understand that the role is a temporary one.
Another diplomatic headache is the elimination of intruders who disrupt the response. If all else fails, it may require the planner to go above the offender's head to place the responders' functional unit ‘off limits’ to specific personnel.
As the responders learn to deal with realistic small events, the exercises are expanded to include greater response efforts, e.g. moving from a simple workaround for a temporary loss of service to relocating to an alternate site.
THE BIG(GER) PICTURE
Getting one functional unit prepared is only the first step in the response program. In the case of a ‘global’ event (i.e., one that impacts the whole business), functional units need to work together to meet the organizational SLAs. Beyond that, ‘public’ issues come into play.
The goal of all business continuity response plans is to ‘maintain a minimum level of service while restoring the organization to business as usual.’ ‘Minimum level of service’ means, simply, meeting SLAs, regardless of whether they are internal (e.g. IT providing services to business units) or external (the organization providing products or services to its customers).
When a global event endangers organizational SLAs, the whole organization must work together to assure a continuation of the minimum level of service. Fires and floods are the most common global risks across the board.
Global events can cause both business units and their resources to relocate, sometimes to different alternate sites. Getting people to the alternate site(s) can be a logistics headache. The aspirin for that headache is . . . planning. While some organizations expect individuals to make travel arrangements, most with which I have worked prefer to utilize a central resource, often HR.
Whether travel and, if necessary, lodging and food are arranged by a functional unit or a central source, someone must be trained to do the job. Taking this a step further, the responder must have confidence that any external vendors can, and will, meet their SLAs with the organization.
Accounts Payable, while not normally a profit center and often exempted from any response role, needs to be brought in to track and pay the bills coming due to keep the minimum level of service and to restore the organization to business as usual.
Someone, or several someones, must protect the organization's image. The audiences which may have multiple reasons to know what happened, and when things will be back to normal, include general and trade media, the financial community, the customer base, critical vendors, and last, but by no means least, the employees.
The spokesperson(s) should have generic scripts for the most common events and strict instructions about what may, and may not, be said. The person or persons responsible to meet the press typically come from corporate resources. As with personnel from each functional unit, they need to participate in exercises to assure that they, too, can handle the pressure of the moment.
In my scheme, almost everyone has a role to play. Some may be assigned as primary responders and others as alternates; some may have immediate responses (assess damage, set up alternate facilities) and some may have delayed responses (e.g. coming in only after an alternate site is established).
An aside: policies and procedures are critical to a plan's success. They set the ground rules: who can spend what and when; what records must be maintained; who will be paid how much for overtime and if furloughed; whether personal vehicle use is allowed; R&R from a remote site to the responder's home; assistance for a responder's family in dealing with daily obstacles such as health insurance claims; and a hundred other questions which may be raised.
At the beginning of this article I confessed that I prefer to leave training to the trainers.
Because all functional units and management are involved in business continuity response plans, it is almost impossible for one person to develop and implement an across-the-organization training program. Keep in mind that as the training progresses, personnel changes are being made and some critical processes may change - organizations are, if they are successful, dynamic.
John Glenn, MBCI, has been helping organizations of all types avoid or mitigate risks to their operations since 1994. Comments may be sent to JohnGlennCRP@yahoo.com (c) 2006, John Glenn MBCI

•Date: 23rd Feb 2006 • Region: World • Type: Article •Topic: BC testing and exercising