
Practice makes almost perfect

The exercising of a business continuity plan is what separates successful recovery from disastrous failure. Dr. Jim Kennedy, MRP, MBCI explores the subject.

Developing good business continuity and emergency planning habits takes time and discipline, but it is essential for the effective business continuity or emergency management of an enterprise.

Exercising business continuity and/or emergency management (BC and/or EM) plans is one of the best business investments that a company or government organisation will ever make.

How individuals will react under stressful and dangerous situations is difficult to determine. The military realises this, so it trains its soldiers in difficult situations so that they will know what it feels like before they have to make life or death decisions under fire. Business continuity is no different. Knowing how a particular team or individual will react under adverse situations can only come from proper planning and exercising of those plans. The business continuity profession has known this for some time. However, numerous surveys taken before and after 9/11 indicate that although most organisations have business continuity and/or emergency management plans, in most cases those plans have never been exercised. A large majority of business and government organisations have indicated that they have NEVER exercised their BC or EM plans. Exercising of a BC or EM plan is what separates successful recovery from failed attempts.

The group responsible for reacting during an incident or emergency should make sure that well documented and easily understood procedures for response and recovery are developed and maintained. During a crisis, people react differently than when there is no stress, so the procedures should be clearly written. Using step-by-step checklists or flowcharts can make the procedures easier to follow. Having these procedures available at each site will save a great deal of time during an actual crisis event. It is also important to exercise periodically to help make sure everyone is comfortable and prepared for the time when the drill may be replaced by reality.

Unlike planning, which defines what you do, exercising relates to actually doing it. Regular exercising helps to build habits and establishes learned behaviours. An organisation, through exercising, can develop these learned behaviours if they do not have them already. The bad news is that developing them is never easy. It takes time and a desire for perfection. It often involves the painful process of thinking and re-thinking incident scenarios and developing new mitigation strategies until the plan is running like a well-oiled machine.

Example of failure to exercise
A US wireless carrier in 1996 had developed a business continuity plan for its service delivery operation. The company had done the due diligence of having both a risk assessment and a business impact assessment completed and signed off by management. From the findings of these assessments, mitigation strategies were incorporated to limit risk and a business continuity plan was designed and developed. The plan called for a large number of tapes to be shipped from the main customer billing centre to a hotsite location in case of a major failure of the bill processing centre. The tapes were to be shipped nightly to the hotsite and the hotsite vendor was to store the tapes until they were needed. In addition, all call detail reports (information about each call, who made it, who it was made to, and for how long) were buffered for six hours as well as being sent electronically to the billing centre. So the plan was based on recovering the billing process in less than six hours so that no billing records would be lost.

The business continuity plan was never actually tested at the hotsite and this is where the problem began. One evening after a very heavy snowstorm the roof of the billing processing centre began to leak. At first it was a slight leak and tarps were used to channel the water away from the critical IT systems. However, after a few hours it became apparent that the roof was compromised and that the entire floor was in jeopardy of being flooded if the roof collapsed. The business continuity plan was quickly placed into effect. The IT billing administrators packed up their belongings and headed for the hotsite. When they arrived they quickly inventoried the hardware and software components and brought the backup AS-400 into operation. They loaded the operating system, the utilities, and finally the billing database. This all took about four hours and they were now ready to load the backup tapes that were sent to this facility daily, but there was a problem. When they went to physically retrieve the tapes they could not find them. They searched the locker where the tapes were supposed to be for over an hour, but the tapes were not there.

Apparently a new night-shift person had been hired in the receiving department a few weeks previously. This person was unfamiliar with this particular customer’s tape archive storage area and placed the tapes in a spot where there was room in the vault area, not in the carrier’s particular vault area. After the individual was finally contacted, the company’s administrators were shown where the tapes were stored. The team then had to sift through several hundred tapes before the correct ones were found, because of poor identification on the tapes. All in all, a total of eight hours elapsed from the time the event was declared until the tapes were located and loaded on the system. This resulted in the loss of critical call detail data and a failure to meet the company’s recovery point objectives. The six-hour buffer time was exceeded, so many hours of call detail reports were overwritten and subsequently unavailable.

The reason for the failure of the plan was that the company never performed an actual recovery using live data. They had tested bringing up the AS-400 to the operating system and database level, but had never actually loaded in real data tapes. The failure of the processing site was the first test of the plan utilising real backups – AND IT FAILED.
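
The timing in this example can be checked with simple arithmetic. The following is a minimal sketch and not part of the carrier's actual plan: the RecoveryStep type, the check_recovery function, and the figure for "remaining activities" are illustrative assumptions. Only the four-hour system build, the search of over an hour, the eight-hour total, and the six-hour buffer come from the account above. The overrun it reports is a lower bound under a simplified model of the buffer.

```python
from dataclasses import dataclass


@dataclass
class RecoveryStep:
    """One step of a recovery timeline (hypothetical structure)."""
    name: str
    hours: float


def check_recovery(steps, buffer_hours):
    """Return (total elapsed hours, hours by which the buffer was exceeded)."""
    total = sum(step.hours for step in steps)
    overrun = max(0.0, total - buffer_hours)
    return total, overrun


BUFFER_HOURS = 6.0  # call detail reports were buffered for six hours

STEPS = [
    RecoveryStep("Bring up AS-400, utilities, billing database", 4.0),
    RecoveryStep("Search the tape locker", 1.0),  # "over an hour" in the account
    # Assumption: the remainder of the eight-hour total (contacting the
    # night-shift person, sifting tapes, loading) is lumped into one step.
    RecoveryStep("Remaining activities (assumed split)", 3.0),
]

total, overrun = check_recovery(STEPS, BUFFER_HOURS)
print(f"elapsed {total:.1f} h, buffer exceeded by {overrun:.1f} h")
# prints "elapsed 8.0 h, buffer exceeded by 2.0 h"
```

A timed exercise with live tapes would have replaced the assumed figures with measured ones, showing before a real event whether the whole sequence fits inside the buffer window.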

Exercising plans
There are five basic types of exercises that can be undertaken to properly assess the effectiveness of a business continuity or emergency management plan.

They are:
* Desk check
* Tabletop exercise
* Simulation
* Functional exercise
* Full-scale exercise

Desk check
An untimed exercise to review all of the elements of the plan in a stress-free environment. The participants are management and response team members who gather across the table to ensure that all are familiar with the plan; questions are asked and answered; changes are made to the plan if problems are discovered. This exercise is usually facilitated by the plan developer or business continuity plan coordinator.

Tabletop exercise or walk-through
A tabletop exercise simulates an incident in an informal, stress-free environment. The participants, who are usually the responsible managers and the response teams, gather around a table to discuss general problems and procedures in the context of an incident scenario. The focus is on training and familiarisation with roles, procedures, or responsibilities.

The tabletop is largely a structured walk-through guided by a facilitator. Its purpose is to solve problems as a group. A scenario is developed in advance but there are no attempts to arrange elaborate facilities or communications. One or two evaluators may be selected to observe proceedings and progress toward the objectives.

The success of a tabletop exercise is determined by feedback from participants and the impact this feedback has on the evaluation and revision of policies, plans, and procedures.

Simulation
This type of exercise involves a predefined scenario which is developed prior to the event. It is unannounced and once started it is timed from beginning to end. The exercise addresses the scenario using only the plan. It is used to determine the state of readiness and awareness of the plan’s response teams.

It incorporates associated plans and tests accuracy of call trees and supplier or recovery vendor lists.

Functional exercise
The functional exercise simulates an emergency in the most realistic manner possible, short of moving real people and equipment to an actual site. As the name suggests, its goal is to test or evaluate the capability of one or more functions in the context of an adverse or emergency event.

* It involves controller(s), players, simulators, and evaluators.

* The atmosphere is stressful and tense because of real-time action and the realism of the problems.

* Exercise is lengthy and complex; requires careful scripting, careful planning, and attention to detail.

* Geared for policy, coordination, and operations personnel (the players).

* Players practice their response to an incident by responding in a realistic way to carefully planned and sequenced messages given to them by simulators.

* Messages reflect a series of ongoing events and problems.

* All decisions and actions by players occur in real time and generate real responses and consequences from other players.

* Guiding principle: imitate reality.

Full-scale exercise
A full-scale exercise is as close to the real thing as possible. It is a lengthy exercise that takes place on location (at the hotsite for example), using, as much as possible, the equipment and personnel that would be called upon in a real event.

In a sense, a full-scale exercise combines the interactivity of the functional exercise with a field element. It differs from a drill in that a drill focuses on a single operation and exercises only one organisation.

Eventually, every incident response organisation must hold a full-scale exercise because it is necessary at some point to test capabilities in an environment as near to the real one as possible.

Benefits from exercising
The goal for every business or government entity should be to design exercises and to establish a comprehensive exercise program. That program should be based on a long-term, carefully constructed plan. In a comprehensive program, exercises build upon one another to meet specific operational goals. The aim is to provide competence in all incident and emergency response functions.

The two main benefits of an exercise program:

Individual training: exercising enables people to practice their roles and gain experience in those roles.

System improvement: exercising improves the organisation’s system for managing incidents and emergencies.

These benefits arise not just from exercising, but from evaluating the exercise, evaluating problems, and acting upon the recommendations.

Management should be clear that exercises are NOT tests. The intent is not to establish a pass or fail. An exercise should be viewed as the normal work required to refine and to tune business continuity and emergency plans. An exercise has value only when it leads to improvement.

Exercises should be conducted periodically. They should be held at least yearly, or twice a year if the business is changing rapidly.
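
The rule of thumb of exercising at least yearly, or twice a year when the business is changing rapidly, can be turned into a simple due-date check. This is a minimal sketch under stated assumptions: the function names add_months, next_exercise_due and is_overdue are hypothetical, and the "rapidly changing" flag is a judgment each organisation would have to make for itself.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Add whole months to a date, clamping the day to the month's length."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def next_exercise_due(last_exercise: date, rapidly_changing: bool) -> date:
    """Latest date for the next exercise: 12 months, or 6 if changing rapidly."""
    interval_months = 6 if rapidly_changing else 12
    return add_months(last_exercise, interval_months)


def is_overdue(last_exercise: date, today: date, rapidly_changing: bool) -> bool:
    """True when the next exercise should already have taken place."""
    return today > next_exercise_due(last_exercise, rapidly_changing)
```

For example, a plan last exercised on 10 March 2005 would fall due by 10 March 2006 under the yearly rule, or by 10 September 2005 if the business is changing rapidly.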

One last thought
Exercising of business continuity and/or emergency management plans and verification of their accuracy and efficiency are fundamental to achieving the objective of a responsive and recoverable operation.

About the author
Dr. Jim Kennedy is a Distinguished Member of Consulting Staff in the Security and Business Continuity Practice of Lucent Worldwide Services. In this position Dr. Kennedy is responsible for providing consulting services in information security and business continuity to a wide range of clients all over the world.

Dr. Kennedy has over twenty-five years of experience in the security and business continuity and disaster recovery fields and holds numerous certifications in network engineering, security, and business continuity. He has developed more than thirty business continuity and disaster recovery plans, planned or participated in over one hundred BC and DR plan tests, and has helped to coordinate three actual recovery operations during his 25-year career. He has worked in telecommunications, manufacturing, pharmaceutical, consulting, and chemical industry segments.

jtkennedy@lucent.com

Date: 10th March 2005 • Region: US/World • Type: Article • Topic: BC exercising




Copyright 2005 Portal Publishing Ltd