
The case for CDP

Simon Kelson explains why he believes that continual data protection is the optimum IT disaster recovery solution for most businesses.

Disaster recovery is now a strategic concern for organizations of all sizes. DR recognises the business value of data, and manages the risk of loss against the cost of loss prevention.

Traditional approaches to backup

The cost per gigabyte of storage is one of the baseline metrics that is frequently used to evaluate the price of IT. For data backup applications there have been three main methods of provisioning for data availability, representing a neatly differentiated cost spectrum:

1) Infrastructure replication
Some of the ways that have been used to maintain the constant availability of data are:
* Server clustering - allows multiple servers to host the same data or service; this prevents a single server failure from damaging availability.
* Data replication – simultaneously creating a copy of the data at offsite storage facilities in real-time using high speed data links and storage arrays.

There are two different aspects of availability here. In the former, server failure only becomes a disaster recovery issue if it has corrupted the data or the file system; in the latter, the remote storage containing the replicated data is the ultimate DR solution.

The feature that is common to both of these models is the duplication of physical resources. In this regard they belong to a group of technologies that can be collectively defined as ‘infrastructure replication’.

Infrastructure replication solutions occupy the high end of the cost-per-unit spectrum. This restricts them to the financial reach of those at the enterprise level. The cost of such a solution can only be justified where continual data availability is a direct business imperative or a matter of compliance.

2) Electronic data vault
At the other end of the cost-per-unit spectrum is the offsite electronic data vault. A customer visits the website of the service provider and points to where the data is located on their network and specifies the frequency of the service through the browser.

At the appointed time, the data is copied out over the organization’s external web connection, and is compressed and encrypted as it is written to the vaulting company’s storage systems. It is simplicity itself to use; however, there are issues inherent in this ‘point and shoot’ approach to what is one of the most important business services that an organization can procure.

This highly commoditised sector within the market is really only suitable for low volumes; it is very much the preserve of SOHO and small businesses. The data is not accessible ‘live’ as it would be if it were located on duplicated infrastructure. Restoration is the reverse of the upload: the data must be decrypted and decompressed before it can be delivered back through the Internet connection.

Once a company’s data storage needs reach hundreds of gigabytes it will find such a backup strategy unfit for purpose. The restoration of information in the event of a disaster scenario would take too long.
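The scale of the problem can be estimated with simple arithmetic. The sketch below is illustrative only: the function name, the 450GB volume, the 10Mbit/s link and the efficiency factor are assumptions chosen for the example, not figures from any particular vaulting service, and decryption and decompression time is ignored.

```python
def restore_hours(data_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to download data_gb gigabytes.

    data_gb    -- volume to restore, in decimal gigabytes (assumed figure)
    link_mbps  -- nominal link speed in megabits per second (assumed figure)
    efficiency -- fraction of the nominal speed actually achieved (assumed)
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("volume must be >= 0, speed > 0, efficiency in (0, 1]")
    total_bits = data_gb * 8 * 1000**3            # gigabytes -> bits
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    return total_bits / usable_bits_per_second / 3600


if __name__ == "__main__":
    # Hypothetical case: 450GB restored over a 10Mbit/s connection.
    print(f"{restore_hours(450, 10):.0f} hours")
```

Even with the link running at its full nominal speed, a restore of this hypothetical size would take around 100 hours, which is several days without access to the data.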

As such a service is critical, there needs to be a high degree of trust on the part of the customer in the business relationship. It may be an exaggeration to suggest that surrendering the data backup to an Internet storage ‘bucket-shop’ is like baring the soul of the business to a stranger, but it certainly invites a degree of uncertainty on matters of confidentiality and security.

3) Tape media
It is unsurprising that the most popular of the traditional methods of backup is also the one that holds the mid-price point. Tape backup systems were well established long before the widespread commercial development of data transfer technology that paved the way for infrastructure replication and electronic data vaults.

However, for reasons which will become clear, tape is no longer adequate to the task. Not so long ago, files were small and fitted comfortably on low capacity media. Fifteen years ago, a 2GB drive was considered a massive storage device. Advances in computer technology have led to increasing file sizes. Now, burgeoning databases, image files and complex media types have combined to create a situation that some observers refer to as the ‘data flood’.

The capabilities of tape media have not kept in step with this. Often multiple tapes are required per backup and auto-loading tape devices and libraries were developed for this purpose. Unfortunately this introduces another mechanical sub-system into a device that is already highly mechanized around a thin plastic tape.

It is not a coincidence that tape auto-loaders are costly, fail frequently and have expensive maintenance and SLA requirements. Media are organised into sets based on weekly and monthly rotational cycles.

Regardless of capacity, tape devices are relatively slow at reading and writing data to and from the media. This limitation means that backups frequently run overnight and on into the next day, retarding server and storage performance and leaving only narrow windows for media set rotation, data restore operations and any maintenance that is required.

Full-scale restoration in a recovery suite after a disaster can be a complex business; hardware driver and volume image problems can represent barriers to the rapid recovery of operational status.

Backup in the 21st century

Disaster recovery and business continuity are two of the agenda items that sit beneath governance, risk and compliance. They are part of the strategic framework, and as such are the responsibility of an organization’s senior management. Reliable data backups are integral to the success of this board level mission.

There must be absolute confidence in the reliability of the technology and the human processes which create backups.

In strategic terms, one simple question arises: “What is the business value of IT?” The answer is that IT is only as valuable as the data it processes. It is data which has value, not the IT infrastructure itself; and data only has value if it is accessible. It *must* be available whenever it is needed.

These questions follow:
* How long can the business afford to not be able to access data?
* How much data can the business afford to lose?

In the formalised language of business continuity and disaster recovery, the first of these identifies the recovery time objective, or RTO. Quite simply this is the time that it takes the organization to return to an operational state.

The second pinpoints the recovery point objective, or RPO. This refers to the last point at which the data can be verified as ‘good’. If the last good data was six hours before the loss of service, then six hours of data will have to be deemed untrustworthy and will effectively have been lost.
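The two measures can be worked through for a single incident. The following is a minimal sketch; the function name and the timestamps are hypothetical, chosen to match the six-hour example above.

```python
from datetime import datetime, timedelta


def recovery_gaps(last_good: datetime, failure: datetime, restored: datetime):
    """Return (data loss window, downtime) for one incident.

    The data loss window is compared against the RPO,
    and the downtime is compared against the RTO.
    """
    if not last_good <= failure <= restored:
        raise ValueError("expected last_good <= failure <= restored")
    return failure - last_good, restored - failure


if __name__ == "__main__":
    # Hypothetical incident: last verified backup at 03:00, failure at 09:00,
    # service restored at 13:30 on the same day.
    loss, downtime = recovery_gaps(
        datetime(2008, 8, 15, 3, 0),
        datetime(2008, 8, 15, 9, 0),
        datetime(2008, 8, 15, 13, 30),
    )
    rpo, rto = timedelta(hours=1), timedelta(hours=4)  # assumed objectives
    print("data lost:", loss, "| within RPO:", loss <= rpo)
    print("downtime:", downtime, "| within RTO:", downtime <= rto)
```

In this hypothetical case both objectives are missed: six hours of data are lost against a one-hour RPO, and the business is down for four and a half hours against a four-hour RTO.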

Many organizations are engaged in critical business areas: financial, medical and transport, for instance. It would be impossible for them to tolerate even very short RTOs or RPOs. The requirement here is for continual availability and access.

This is known as high availability.

Most modern businesses can tolerate short RTOs and RPOs, and are said to have a near-high availability requirement. The optimum solution for such requirements is ‘continual data protection’ (CDP).

Under the CDP model, a CDP appliance continually mirrors the data held on an organization’s servers. Technically advanced, CDP solves problems that have historically prevented open database files from being backed up correctly. Best of breed CDP appliances integrate with a wide spectrum of database, messaging, and file systems. This integration places active applications in a stable state so that they can be backed up with complete transactional and point-in-time integrity, producing what is known in CDP speak as a ‘snapshot’. This application awareness enables all essential information to be recovered very quickly, without the need for lengthy verification processes.

Changes to the mirrored copy are continually logged, allowing a rapid recovery to any point in time. A ‘visual slider’ software interface allows the roll-back to the most relevant image, pinpointing the optimal recovery points that enable the shortest time to the continuation of business.
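The principle of a logged mirror can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of how any vendor's appliance works: the `ChangeJournal` class, its methods and the in-memory storage are all hypothetical, and real products operate at block or application level with far more safeguards.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChangeJournal:
    """Toy model: a baseline copy plus a time-ordered log of changes."""
    baseline: dict[str, bytes]
    # Each entry is (timestamp, path, new content); None content means deletion.
    entries: list[tuple[float, str, Optional[bytes]]] = field(default_factory=list)

    def log(self, timestamp: float, path: str, content: Optional[bytes]) -> None:
        """Record one change; entries must arrive in time order."""
        if self.entries and timestamp < self.entries[-1][0]:
            raise ValueError("journal entries must be logged in time order")
        self.entries.append((timestamp, path, content))

    def recover_to(self, point_in_time: float) -> dict[str, bytes]:
        """Rebuild the data as it stood at point_in_time by replaying the log."""
        state = dict(self.baseline)
        for timestamp, path, content in self.entries:
            if timestamp > point_in_time:
                break  # later changes are ignored
            if content is None:
                state.pop(path, None)
            else:
                state[path] = content
        return state


if __name__ == "__main__":
    journal = ChangeJournal(baseline={"orders.db": b"v1"})
    journal.log(10.0, "orders.db", b"v2")
    journal.log(20.0, "orders.db", b"corrupted")
    # Roll back to just before the bad write at t=20.
    print(journal.recover_to(19.9))
```

Moving the recovery point is then a matter of choosing a different timestamp, which is the role the slider plays in the interface described above.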

To meet the requirements of disaster recovery the CDP model can be implemented in conjunction with an offsite remote replication service. Such a system mirrors the data and the journal of changes that are logged on the CDP appliance. This provides an RTO and RPO that can be measured in minutes, a timescale that is defined as near-high availability.

Summary and conclusion

Businesses of all sizes are today adopting an altogether more serious approach to disaster recovery planning. This is a necessity, a board level task that resides beneath the strategic umbrella of governance, risk & compliance.

Data backup has assumed much more significance. This is because an organization’s ability to resume business operations after a major disruption is directly indexed to the speed at which it can gain access to ‘good’ data.

Locating, copying or mirroring data at remote sites is the best way to ensure that data cannot be destroyed by the loss of a place of business. Infrastructure replication and the electronic data vault (EDV) both use this principle but in different ways. Infrastructure replication is beyond the reach of all but the largest budgets whilst electronic data vaulting is unsuitable for large data volumes.

Tape is the most widely used backup medium; it is an area of some complexity and it cannot reliably provide predictable RTO and RPO. This makes it unfit for purpose in the context of modern disaster recovery planning, and as such it should now be regarded as obsolete, a legacy system.

Organizations should migrate to CDP based backup technology at the earliest opportunity. Continual data protection in combination with offsite replication represents the optimal balance between the cost of the solution, the management of risk and the speed of disaster recovery.

This meets the DR requirements of the vast majority of organizations. It allows those who are able to tolerate the denial of access to their data for a short time to enjoy near-high availability. This removes uncertainty from disaster recovery and allows for operational confidence.

Author: Simon Kelson, managing director, Atlanta Technology
www.atlantatechnology.co.uk

Date: 15th August 2008 • Region: UK/World • Type: Article • Topic: IT continuity




Copyright 2008 Portal Publishing Ltd