By Lawrence Robert, CBCP, CBRP.
Service levels and recovery times: what’s the difference? This question has come up in various discussions as the business continuity profession works to understand the boundaries we face between the worlds of ‘production’ and ‘recovery’.
Although we engage ourselves in all aspects of a business model, we do need guidelines to help us become more focused and proficient. When do we let the established production procedures dictate problem resolution and when does it become a business continuity issue?
There is an expected service level that the business relies on to assess the 'up time' required for specific business functions. For new business processes, this is determined early on in the business process development stages. The ‘service delivery area’ has this responsibility. The service delivery area may be an internal organization or an outsourced model. On a yearly basis or when contracts are being established or renegotiated, the service levels are determined for various purposes, such as metrics, billing, etc. Application 'up time' is based on normal production processing throughout the day/week/month and is gauged a success or failure by the percentages that are agreed upon. These are normal production expectations of service delivery.
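As a rough illustration of how an agreed 'up time' percentage might be checked against actual production results, here is a minimal sketch. The function names, the 30-day measurement period, and the 99.9% target are assumptions made for the example, not figures from any particular service agreement.

```python
def uptime_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Return the percentage of the measurement period the service was up."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime_minutes must be between 0 and total_minutes")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def meets_service_level(total_minutes: float,
                        downtime_minutes: float,
                        agreed_percent: float) -> bool:
    """True when measured up time is at or above the agreed percentage."""
    return uptime_percent(total_minutes, downtime_minutes) >= agreed_percent


# Hypothetical example: a 30-day month measured in minutes, 99.9% agreed.
minutes_in_month = 30 * 24 * 60
within_target = meets_service_level(minutes_in_month, 40, 99.9)
outside_target = meets_service_level(minutes_in_month, 50, 99.9)
```

Under that assumed 99.9% target, a 30-day month allows roughly 43 minutes of downtime, so 40 minutes of disruption stays within the service level while 50 minutes does not.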
During times when there are disruptions in the service AND the expected service levels are in jeopardy, another measurement takes over: recovery times. Because the industry understands that recovery infrastructure is costly to keep up and maintain, we as practitioners in the business continuity profession should always look for ways to keep costs low while maximizing the benefit of the recovery solution.
When do business continuity processes get introduced into a production problem? I use a ‘sliding scale’ method. Just as each business process is different, so are the metrics that are applied to the sliding scale. As production problems continue to impact a business unit, the recovery ratings are investigated early in the problem. Some of the early questions are:
- What is the business unit?
- What are the recovery ratings associated with all the processes in that business unit?
If there are 0-4 hour recovery times in the processes, then business continuity should be ‘laying the groundwork’ at least one to two hours before the maximum recovery time of four hours elapses. This does not mean that we engage, but rather that we start lining up the various solutions that may be needed and validate their readiness to initiate. We should also take this opportunity to remind the business users of what is in their plans that can be used before the problem escalates. The disclaimer here is that the ‘sliding scale’ can be very different depending on the business being supported.
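The ‘sliding scale’ idea can be sketched in a few lines. This is only one possible reading of it: the stage labels, the function names, and the default two-hour lead time are assumptions for illustration, and a real scale would be tuned to each business unit.

```python
def groundwork_start_hours(recovery_hours: list[float],
                           lead_hours: float = 2.0) -> float:
    """Hours into a disruption at which groundwork should begin.

    Uses the tightest recovery time among the business unit's processes
    and backs off by the lead time (one to two hours in the article).
    """
    if not recovery_hours:
        raise ValueError("at least one recovery time is required")
    tightest = min(recovery_hours)
    return max(0.0, tightest - lead_hours)


def continuity_stage(elapsed_hours: float,
                     recovery_hours: list[float],
                     lead_hours: float = 2.0) -> str:
    """Classify where a production problem sits on the sliding scale."""
    tightest = min(recovery_hours)
    start = groundwork_start_hours(recovery_hours, lead_hours)
    if elapsed_hours >= tightest:
        return "recovery time reached"   # hypothetical label
    if elapsed_hours >= start:
        return "lay groundwork"          # line up and validate solutions
    return "monitor"                     # production procedures still lead


# Hypothetical business unit whose processes recover in 4, 24 and 72 hours.
unit_recovery_hours = [4, 24, 72]
stage_after_one_hour = continuity_stage(1.0, unit_recovery_hours)
stage_after_three_hours = continuity_stage(3.0, unit_recovery_hours)
```

With a four-hour recovery time and a two-hour lead, the groundwork stage begins two hours into the problem, which matches the upper end of the one to two hour window described above.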
Once a business unit’s recovery times have been established, there is also complexity within the business as it relates to triage of its individual processes. For example, the functions associated with a typical help desk/call center have varying levels of recovery. Obviously, the answering of calls is very critical. The documenting of those calls is also important, but not as important as receiving the calls themselves. Having a tested work-around is key: during an actual event the focus would be on the calls rather than on the documenting of the calls, because a workable solution or a manual process may already be in place for the latter. It’s about triage in an area that is predominantly a critical business line but where not all functions are critical to that business.
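The help desk triage described above could be modelled along the following lines. This is a sketch under stated assumptions: the `Process` record, the ordering rule, the recovery hours, and the third ‘call volume reporting’ function are all hypothetical and serve only to show how a tested work-around moves a function down the priority list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    name: str
    recovery_hours: float          # hypothetical recovery time for the function
    has_tested_workaround: bool    # e.g. a manual process that has been tested


def triage(processes: list[Process]) -> list[Process]:
    """Order processes by where attention should go first during an event.

    Functions without a tested work-around come first; within each group,
    the shortest recovery time comes first.
    """
    return sorted(processes,
                  key=lambda p: (p.has_tested_workaround, p.recovery_hours))


# Hypothetical help desk/call center functions and values.
help_desk = [
    Process("document calls", 4, True),
    Process("call volume reporting", 24, True),
    Process("answer calls", 1, False),
]

focus_order = [p.name for p in triage(help_desk)]
```

Here answering calls sorts to the top because it has the shortest recovery time and no work-around, while documenting calls can lean on its manual process until the event is under control.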
So, service levels vs. recovery times are different.
The help desk/call center functions receive varying levels of service from the service provider; these are service levels, governed by available up-time. Recovery times are closely analyzed when a disruption to the service levels extends beyond the recovery time objective timeframe. Those recovery times should fluctuate based on the availability of alternate (less expensive) solutions such as 'work from home', manual processes, workload shifts, etc.
Business continuity is not something that should be ‘declared’ or ‘stated for the record’ during a crisis, but rather a gradual inclusion of work-arounds, manual processes, and workload shifts that in some cases are so subtle that it may not even seem like business continuity plans are being implemented.
A fully integrated and ‘all inclusive’ business continuity program that addresses small problems as well as large scale events is where we provide the most value to a company. As we evolve the profession of business continuity to the next generation model of resiliency, we will see business continuity/disaster recovery become more transparent and inclusive in the standard business model. This ensures that what we provide the business is important in terms of clarity and focus. Working in a framework of understood roles and responsibilities not only allows us to understand the difference between service levels and recovery times, but also solidifies our involvement and value with our customers, on the business side as well as with the technology providers.
Author: Lawrence Robert, CBCP, CBRP, Director of Business Continuity
Sun Life Financial.
• Date: 25th March 2009 • Region: US/World • Type: Article • Topic: IT continuity