
What is operational risk management (ORM)

Operational risk is often seen to be of relevance only to banks and the financial industry, but in fact it is a facet of every organisation and reflects the inevitable fact that assets, processes and people can fail, leading to effects that are unplanned and unwanted by the business.

Examples are not hard to find:
* A computer fails and a day's work is lost, so overtime is paid to catch up,
* A manager underestimates task complexity and a project over-runs,
* Subsidence causes a building to be declared unsafe and evacuated.

The definition used in the Second Basel Accord describes operational risk as:
"...the risk of loss resulting from inadequate or failed internal processes, people and systems or from external events"

What is the scope of operational risk management?
Operational risk is recognised as being distinct from market risk and credit or trade risk (although an operational failure may result in a loss of control and an increase in exposure in these areas also). However, as the definition suggests, ORM confines itself to managing those elements that fall within the business operational remit. They include:
* Process and procedural robustness and integrity
* People, skills and training
* Insurance and self-insurance
* The supply chain, outsourcing and inherited risk
* Infrastructure, systems and telecommunications
* Physical and information security.

What is the value of operational risk management?
Undoubtedly, you already manage your exposure to operational risk in a number of ways: you lock doors and windows at night, you encourage staff to access the firm's data securely, you run anti-virus software and so on. But most companies buy CCTV only after a break-in. They test their backups only after a system has failed. They buy disaster recovery provision for computers only after being persuaded by a doom-and-gloom salesman.

The result is usually a patchwork of overlapping and gap-ridden investments reflecting the legacy of past decisions. Rarely are the provisions matched to the organisation's actual needs and often it is left unwittingly exposed but feeling 'safe'.

There are many other good reasons for embarking on an ORM programme; some of the main drivers are as follows:
* The changing environment invites new risks and dilutes old ones
* Prospective customers expect risk management to be in place
* IPO investors and acquirers require ORM or devalue accordingly
* The cost of even a brief period of downtime is now unacceptable
* Corporate governance is already under the audit spotlight

ORM is a logical response to these requirements; it is:
* Systematic, ensuring all risks are identified and treated appropriately
* Repeatable, as part of a process that accommodates change
* Auditable, evidencing governance decisions
* Entirely at the discretion of the business; you choose to accept or mitigate a risk based entirely on the evidence placed before you.

How can we manage operational risk?
The key to ORM lies in the understanding and management of two important concepts - loss and probability.

Loss is defined in this context as:

"...any financial or otherwise unwelcome effect that impinges on a business stakeholder as a result of operational failure "

Loss can include a wide range of both tangible and intangible components such as:
* Lost sales and missed trade opportunities
* Loss of market share and long-term revenue
* Fines, penalties, lawsuits and interest payments
* Eroded reputation, share price, brand value and image
* Loss of employment or directorship
* Reduced pay, compensation or benefits
* Excessive overtime
* Physical trauma, hospitalisation or death.

Probability is defined in this context as:

"...the qualitative or quantitative likelihood of a particular operational failure occurring"

This reflects the fact that most operational failures are rare and a precise statistical value cannot be readily obtained. This is therefore an area of subjectivity where we manage improvement rather than absolute value. For example:
* Earthquake damage to buildings occurs once in 100 years in the UK
* Theft of PCs occurs in one-in-ten businesses each year in the UK
* Computer viruses affect 60 percent of computer users each year globally.

Of course, neither loss nor probability exists in isolation and each is associated instead with specific failure events, for example:
"Server A has a mean time between failures of 10,000 hours and a mean time to repair of 24 hours. If it fails we expect to suffer losses of £50,000 due to missed sales opportunities and increased staff costs." This measures our exposure to risk of Server A failing in this particular way. However, we must add the exposures due to other events such as:
* Theft
* Sabotage
* User error
* Power loss
and so on.
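
The Server A example can be turned into an annualised figure. The following is a minimal sketch, assuming the server runs around the clock (8,760 hours a year), that failures arrive at a constant rate, and that each failure costs the quoted £50,000. The function names are illustrative and not part of any ORM standard.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: the server runs around the clock


def expected_failures_per_year(mtbf_hours: float) -> float:
    """Approximate yearly failure count, assuming a constant failure rate."""
    return HOURS_PER_YEAR / mtbf_hours


def annual_exposure(mtbf_hours: float, loss_per_failure: float) -> float:
    """Expected yearly loss: failures per year times loss per failure."""
    return expected_failures_per_year(mtbf_hours) * loss_per_failure


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Long-run fraction of time the server is working."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Figures taken from the Server A example
server_a_failures = expected_failures_per_year(10_000)
server_a_exposure = annual_exposure(10_000, 50_000)
server_a_availability = availability(10_000, 24)

print(f"Expected failures per year: {server_a_failures:.3f}")
print(f"Annual exposure: £{server_a_exposure:,.0f}")
print(f"Availability: {server_a_availability:.4%}")
```

On these assumptions Server A fails a little less than once a year, giving an exposure of roughly £43,800 a year for this failure mode alone.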

These are called 'threats', defined in this context as:

"...Unpredictable events arising from beyond present operational controls giving rise to failure mode(s) in process(es), person(s) or system(s)"

Different threats can give rise to a single failure mode e.g. a fire, an earthquake or an air accident may equally cause a building to be destroyed. Conversely, a single threat can generate failure modes in many assets e.g. a power surge may cause diverse equipment to fail.

We can repeat this form of evaluation for any or all of the internal processes, people and systems that form the business and obtain an overall appraisal of exposure.
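
Repeating the evaluation in this way amounts to summing the exposure of every asset and threat pair. The sketch below uses invented figures purely for illustration; only the first row follows the Server A example, and the asset and threat names are hypothetical.

```python
from collections import defaultdict

# (asset, threat, expected events per year, loss per event in pounds)
# All rows except the first are hypothetical placeholders.
ASSESSMENTS = [
    ("Server A", "hardware failure", 0.876, 50_000),
    ("Server A", "power loss", 0.5, 20_000),
    ("Server A", "user error", 2.0, 5_000),
    ("Head office", "fire", 0.01, 2_000_000),
]


def exposure_by_asset(assessments):
    """Sum frequency x loss over every threat recorded against each asset."""
    totals = defaultdict(float)
    for asset, _threat, events_per_year, loss_per_event in assessments:
        totals[asset] += events_per_year * loss_per_event
    return dict(totals)


totals = exposure_by_asset(ASSESSMENTS)
overall = sum(totals.values())
for asset, value in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{asset}: £{value:,.0f} per year")
print(f"Overall exposure: £{overall:,.0f} per year")
```

A simple sum ignores any overlap between threats, so it is best treated as a first approximation rather than a precise appraisal.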

At this point, the potential enormity of the task should become apparent and you are, no doubt, asking some probing questions:
We have how many assets...?
There are how many threats to each and every one...?
They overlap by how much...?
How can we even begin to...?

DON'T PANIC....
ORM recognises the problem and provides rationalisation, described later in this section; nonetheless, it is important that you understand risk before you begin to manage it.

The operational risk model
The operational risk model illustrated below offers a representation of the concepts explained in this section (some of the terminology is repeated).


Risk
Risk has two main components: loss and probability.
Loss is a reflection of the pain, damage or discomfort that may be caused by an event.
Probability is an indication of how often we can expect a particular event to occur.

Taken together, they indicate how much we can expect to suffer as a result of unwanted, unplanned events. This is called our 'exposure' to risk.

Loss
Loss is partly a reflection of financial loss arising from an incident. Financial losses can be complex, including creditworthiness, lost opportunity, fines, penalties and restrictions. Loss also includes qualitative measures such as reputation, image, morale, loyalty, confidence and credibility.

Probability
Probability is a measure of likelihood. It may be quantitative or qualitative, and qualitative estimates are often applied due to a lack of statistical data. Quantitative measures include 'serious fires occur on average once every ten years in this industry'. Qualitative estimates are usually comparative and sometimes use a colour scale e.g. red, amber, green.
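
A colour scale of this kind can be sketched as a simple lookup. The three-level scales and the scoring thresholds below are assumptions chosen for illustration; an organisation would set its own.

```python
# Assumed three-point scale for both probability and loss
LEVELS = {"low": 1, "medium": 2, "high": 3}


def rag_rating(probability: str, loss: str) -> str:
    """Map qualitative probability and loss levels to red, amber or green."""
    score = LEVELS[probability] * LEVELS[loss]
    if score >= 6:  # assumed threshold: high paired with medium or high
        return "red"
    if score >= 3:  # assumed threshold: medium/medium, or low paired with high
        return "amber"
    return "green"


for probability, loss in [("low", "low"), ("low", "high"), ("high", "medium")]:
    print(probability, loss, "->", rag_rating(probability, loss))
```

The rating is comparative only: it ranks risks against each other rather than assigning an absolute value.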

Risk profile
Three elements define an organisation's risk profile:
1. Threat profile, which reflects the relevance of hazards due to environments, working practices, business sector and so on
2. Loss profile, which reflects how it feels pain following a disruptive event
3. Gap profile, which reflects the condition of its defences, identifying where holes and overlaps exist
Each is uniquely characteristic of the organisation and substantially defines the implementation and cost of its ORM plan.

Causes
The causes of disruption always arise from a point beyond our normal operational control. They are known as threats or hazards and there are many to consider. They include natural events such as lightning strike as well as human error, arson, sabotage and terrorism.

Dependency
Many threats simply 'bounce off' the organisation due to measures taken to improve resilience. But once a threat penetrates our defences its effects spread until they reach barriers. An uninterruptible power supply (UPS) is a good example of a barrier against the threat of power loss. Occasionally, threats propagate right through the business, affecting customers and markets and become externally visible. Extreme events may cause large areas of the business to be affected and are called disasters, catastrophes, crises and so on. The propagation of effect through the business is due to the interdependency of processes, people and systems and can be modelled to assist the ORM process.
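
One way to model this propagation is as a walk over a dependency graph that stops at barriers. The sketch below is illustrative only: the component names, the graph and the `affected_by` function are hypothetical, and a real model would also need timing and partial-failure detail.

```python
from collections import deque

# Hypothetical map: each component lists the components that depend on it.
DEPENDANTS = {
    "mains power": ["server room", "office lighting"],
    "server room": ["order system", "email"],
    "order system": ["sales desk", "customers"],
    "email": ["sales desk"],
}


def affected_by(threat_entry_point, dependants, barriers=frozenset()):
    """Return every component reached from the entry point, stopping at barriers."""
    affected = set()
    queue = deque([threat_entry_point])
    while queue:
        component = queue.popleft()
        if component in affected or component in barriers:
            continue  # already visited, or protected by a barrier such as a UPS
        affected.add(component)
        queue.extend(dependants.get(component, []))
    return affected


unprotected = affected_by("mains power", DEPENDANTS)
# Assumption: a UPS protects the server room, so it acts as a barrier there.
with_ups = affected_by("mains power", DEPENDANTS, barriers={"server room"})
print(sorted(unprotected))
print(sorted(with_ups))
```

Without the barrier the power loss reaches customers; with it, the effect is contained to the mains supply and office lighting.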

Scenarios
Scenarios are the accumulation of effects that have propagated right through the business as a result of one or more threats occurring. Because they are cumulative they may simultaneously take many different forms. For example, loss of a central computer system may cause five very different departments to stop working. Scenarios can be complex and difficult to predict, and are often summarised into worst-case situations that the company might have to deal with, for example 'total loss of building', 'denial of access', 'loss of computer room', 'loss of telecommunications' and so on. Scenarios are the outward manifestations that lead to Loss, completing the cycle.

Mitigation
There are three points at which the flow around the risk model can be interrupted and risk reduced.
* Threats can be prevented, reduced or averted
* Gaps, vulnerabilities and weaknesses can be rectified, containing the spread of effects through the organisation
* Recovery measures can be introduced so that should the worst occur, the duration of outage and consequent effects can be minimised.

The risk model offers a simplified view of how we can manage operational risk by understanding it. Staff find it easy to understand and can visualise their activity within its context.

The ORM life cycle and process
To manage operational risk we must devise ways of measuring, prioritising, monitoring and systematically reducing our exposure. The ORM life cycle (illustrated below) offers a representation of the concepts explained in this section.

Programme design
ORM is potentially a significant undertaking that may run indefinitely. It demands a level of control, backing, structure and overall programme design commensurate with other corporate initiatives such as TQM or BPR. This framework helps ensure that management and staff remain focussed, particularly after the early high-profile stages of the programme are completed.

Impact analysis
Business impact analysis (BIA) is the technique used to determine the organisation's tolerance and characteristic pattern of loss arising from disruption. The resulting priority and timeframe data are used in risk assessment to determine the loss arising from specific incidents, and in continuity planning to establish the timeframes for recovering functions, processes and systems.

Risk assessment
Risk assessment involves the collection of data relating to people, processes, systems and environmental circumstances, culminating in a threat profile and a gap profile. The former is a descriptive list of the threats that currently affect the organisation with estimates of probability. The latter identifies vulnerabilities in the business that allow threats to propagate with potentially disruptive effect. The assessment combines BIA and probability data to prioritise the plugging of gaps, proposing, cost-justifying and comparing strategies for mitigation.
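
The prioritisation step can be sketched as ranking candidate mitigations by the exposure they remove per pound spent. Every figure and name below is invented for illustration, and the ratio used is one plausible cost-justification measure rather than a prescribed ORM formula.

```python
from dataclasses import dataclass


@dataclass
class Mitigation:
    name: str
    annual_cost: float      # yearly cost of the measure, in pounds
    exposure_before: float  # expected yearly loss without the measure
    exposure_after: float   # expected yearly loss with the measure

    @property
    def benefit(self) -> float:
        """Reduction in expected yearly loss."""
        return self.exposure_before - self.exposure_after

    @property
    def benefit_per_pound(self) -> float:
        """Exposure removed for each pound spent."""
        return self.benefit / self.annual_cost


# Hypothetical candidates
CANDIDATES = [
    Mitigation("UPS for server room", 2_000, 10_000, 1_000),
    Mitigation("Standby server", 15_000, 43_800, 8_800),
    Mitigation("Tested backups", 1_000, 12_000, 2_000),
]

ranked = sorted(CANDIDATES, key=lambda m: m.benefit_per_pound, reverse=True)
for m in ranked:
    print(f"{m.name}: saves £{m.benefit:,.0f} for £{m.annual_cost:,.0f} "
          f"({m.benefit_per_pound:.1f}x)")
```

A ratio like this puts cheap, effective measures first, but the decision to accept or mitigate each risk remains with the business.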

Continuity planning
The business continuity plan (BCP) provides the ultimate backstop where risk mitigation measures have failed or were inappropriate (e.g. a nearby explosion) and the organisation faces potential catastrophe. BCP identifies what people, processes, systems and other structures must be provided to the firm in good time to ensure its survival. These timings and quantities are derived from the BIA, allowing recovery to be planned with confidence.

Assurance
It is well appreciated that tired and out-of-date continuity measures are a liability and that most untested continuity plans contain serious flaws. Assurance is a set of activities that help ensure that your continuity provisions work. Training encourages staff to develop a consistent understanding of risk and continuity issues, building familiarity with aspects that could affect them. Periodic review or audit ensures your continuity provisions still reflect the needs of the business. Rehearsal and testing provide controlled means of simulating real incidents, ironing out problems under safe conditions.

Article by John Robinson, JRCPL Ltd


Copyright 2008 Portal Publishing Ltd