Researchers develop new software tool to help maintain cloud computing availability

Researchers from North Carolina State University have developed a new software tool to prevent performance disruptions in cloud computing systems by automatically identifying and responding to potential anomalies before they can develop into problems.

The tool monitors memory usage, network traffic, CPU usage and other system-level data in a cloud computing infrastructure to build up a definition of the wide range of activity that can be considered ‘normal.’ The program defines ‘normal behavior’ for every virtual machine in the cloud, and can then look for deviations and predict anomalies that could affect the system’s ability to provide service to users.

One advantage of this approach is that it does not require users to provide so-called ‘training data’ about what constitutes abnormal behavior, which is important because such data is often difficult to obtain in production cloud systems. Moreover, the approach can predict anomalies that have never been seen before.
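The article does not name the learning technique the researchers used, so the sketch below (in Python) is only a minimal illustration of the general idea: an unsupervised model is fitted to unlabeled, system-level measurements for a single virtual machine, and new measurements are flagged when they drift outside the learned envelope. The metric names, the sample data and the 3-sigma threshold are assumptions for illustration, not details taken from the paper.

    # Minimal unsupervised sketch of per-VM 'normal behavior' learning.
    # NOTE: this is NOT the UBL algorithm from the paper; it is a generic
    # illustration. Metric names and the 3-sigma threshold are assumptions.
    import numpy as np

    METRICS = ["cpu_pct", "mem_mb", "net_kbps"]  # hypothetical system-level metrics

    class VMBehaviorModel:
        """Learns a per-metric 'normal' envelope from unlabeled samples."""

        def fit(self, samples: np.ndarray) -> "VMBehaviorModel":
            # samples: shape (n_observations, n_metrics), all unlabeled;
            # no examples of abnormal behavior are needed.
            self.mean = samples.mean(axis=0)
            self.std = samples.std(axis=0) + 1e-9  # avoid division by zero
            return self

        def deviation(self, sample: np.ndarray) -> np.ndarray:
            # Per-metric z-score: how many standard deviations from normal.
            return np.abs(sample - self.mean) / self.std

        def is_anomalous(self, sample: np.ndarray, threshold: float = 3.0) -> bool:
            # Flag the VM if any metric drifts outside the learned envelope.
            return bool((self.deviation(sample) > threshold).any())

    # Usage: fit on a window of routine measurements, then check new ones.
    rng = np.random.default_rng(0)
    normal_window = rng.normal([40, 512, 300], [5, 30, 40], (500, 3))
    model = VMBehaviorModel().fit(normal_window)
    print(model.is_anomalous(np.array([41.0, 520.0, 310.0])))  # False: within envelope
    print(model.is_anomalous(np.array([95.0, 520.0, 310.0])))  # True: CPU spike

Because the model is fitted only to routine measurements, any sufficiently large deviation is flagged, including failure modes never observed during fitting, which mirrors the ‘unseen anomalies’ property described above.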

If the program spots a virtual machine that is deviating from its normal behavior, it runs a ‘black box’ diagnostic that can determine which metrics – such as CPU usage – may be affected, without exposing user data. This metric data can then be used to trigger the appropriate prevention system, which will address the deviation and prevent it from becoming a problem.
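As a rough sketch of that ‘black box’ step, the snippet below ranks metrics by how far each one deviates from its learned baseline, using only system-level data. It builds on the hypothetical VMBehaviorModel above; the ranking heuristic is an assumption for illustration, not the paper’s actual diagnostic.

    # Hedged sketch of a 'black box' diagnostic: rank metrics by deviation
    # from the learned baseline, without touching user data. Builds on the
    # hypothetical VMBehaviorModel above; the heuristic is an assumption.
    def implicated_metrics(model, sample, names=METRICS, threshold=3.0):
        scores = model.deviation(np.asarray(sample))
        # Keep metrics above threshold, most-deviant first, so the caller
        # can trigger the matching prevention action for each one.
        flagged = [(n, s) for n, s in zip(names, scores) if s > threshold]
        return sorted(flagged, key=lambda pair: pair[1], reverse=True)

    print(implicated_metrics(model, [95.0, 900.0, 310.0]))
    # e.g. [('mem_mb', ...), ('cpu_pct', ...)] -> trigger memory/CPU mitigation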

“If we can identify the initial deviation and launch an automatic response, we can not only prevent a major disturbance, but actually prevent the user from even experiencing any change in system performance,” says Dr. Helen Gu, an assistant professor of computer science at NC State and co-author of a paper describing the research. “Also, it’s important to note that this program does not access any user’s individual information. We’re looking only at system-level behavior.”

The program is also lightweight, meaning it does not consume much of the cloud’s computing power. It can collect the initial data and define normal behavior much faster than existing approaches, and once it is up and running it uses less than 1 percent of CPU capacity and 16 megabytes of memory.

In benchmark testing, the program identified up to 98 percent of anomalies, which is much higher than the rate found in existing approaches. “It also had a 1.7 percent rate of false positives, meaning it triggered very few false alarms,” Gu says. “And because the false alarms resulted in automatic responses, which are easily reversible, the cost of the false alarms is negligible.”

Gu says her team’s next step is to incorporate more detailed ‘white box’ diagnostic tools into the software, so they can identify the software bugs causing any anomalies and correct them.

The paper describing the software, ‘UBL: Unsupervised Behavior Learning for Predicting Performance Anomalies in Virtualized Cloud Systems,’ has been co-authored by NC State Ph.D. students Daniel Dean and Hiep Nguyen and will be presented on September 20th at the 9th Annual ACM International Conference on Autonomic Computing in San Jose, Calif.

The research was supported by the National Science Foundation, the U.S. Army Research Office, an IBM faculty award and a Google research award.

• Date: 11th September 2012 • Region: US/World • Type: Article • Topic: Cloud computing
