Reclaiming fault tolerance

Stratus attempts to clarify the definition of fault tolerance.

Misuse of the term ‘fault tolerant’ is laying companies open to business risk and financial losses from system downtime, according to Stratus Technologies.

Separate research conducted by TheInfoPro and by ITIC/Stratus Technologies found that the absence of clear definitions, and the interchangeable use of terms that specify levels of uptime, is pronounced throughout the IT community. Confusing vendor claims and the industry's incorrect use of terminology are major contributing factors, Stratus asserts.

With 30 years of experience in fault-tolerant computing, Stratus says that it is defining true fault tolerance in the way the term was originally understood and measured by a user community that demanded the highest possible uptime protection for their mission-critical applications:

Fault tolerant computing is the ability to provide highly demanding enterprise application workloads with 99.999 percent (five nines) system uptime or better, zero failover time and no data loss.

Anything less is high availability computing at best, which is suitable for meeting the uptime requirements of many less critical applications. However, applications whose failure could damage a company's external reputation, cause compliance violations, or result in unacceptable financial cost or life safety risk typically require better uptime than high availability solutions can deliver, says Stratus.

Fault tolerant computing is not:

- failover;
- hot standby;
- replication;
- mirroring; or
- recovery.

TheInfoPro's September QuickTip report, ‘Users Demand High Availability, But How Good is "Good Enough?"’, included results from its most recent server study, which found that almost 60 percent of IT users do not understand the difference between availability, high availability and software-based fault tolerance (none of which meets the definition of fault tolerance) and hardware-based fault tolerance, which does.

"When users discuss cloud computing or virtualization, one benefit consistently mentioned is 'improved availability.' This may be expressed either as fault tolerance, disaster recovery or high availability. Each of these is technically very different from the other, and they are achieved using very different solutions," said TheInfoPro managing director Bob Gill.

"In other words, there is significant confusion in the marketplace, as organizations want higher levels of availability, but are not always clear on the best method to meet the objective because a number of different terms are often used interchangeably with no commonly accepted definition."

A Stratus Technologies-ITIC survey of almost 250 IT professionals bears out that confusion. Fifty-three percent said they are using fault tolerant technology, yet an almost equal number define less than 99.999 percent system availability as fault tolerant. According to Stratus, with currently available technology only fully redundant hardware running in lockstep can provide true fault tolerance on x86-based systems: five minutes or less of unscheduled downtime per year, no failover and no data loss.
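The link between 'five nines' and roughly five minutes of downtime per year is simple arithmetic, and a short sketch may make it concrete. The Python below is illustrative only and is not taken from Stratus or either survey; the function name and the 365-day year are assumptions chosen for the example.

```python
# Illustrative sketch: convert an availability percentage into the
# unscheduled downtime it allows per year.
# Assumption: a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget implied by an availability figure."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    levels = [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]
    for label, pct in levels:
        minutes = downtime_minutes_per_year(pct)
        print(f"{label} ({pct}%): {minutes:.2f} minutes per year")
```

Under these assumptions, 99.999 percent availability leaves a budget of a little over five minutes a year, while each 'nine' removed multiplies that budget by ten, so 99.99 percent allows close to an hour.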

"When customers buy what they think is a fault tolerant solution, they should expect a system that, for all practical purposes, has no unscheduled downtime," said Denny Lane, Stratus director of product management and marketing. "Clusters, virtualization and software-based availability solutions require failover recovery, which means data loss, or they can't scale to support a demanding application and maintain transactional integrity. All this leaves many customers confused as to what they are getting or what they should expect to get when someone says 'fault tolerant.'"

www.stratus.com

Date: 13th Nov 2009 • Region: World • Type: Article • Topic: IT continuity

Copyright 2010 Portal Publishing Ltd