By Sujal Patel, Founder and CTO of Isilon Systems.
Today’s competitive companies are facing a tremendous increase in the amount of data used to conduct their everyday business, which has fundamentally shifted how IT organizations evaluate, select and deploy technology. Indeed, as more companies turn to clustered computing for their IT needs, clustered storage has become an extremely attractive solution to complement their computational infrastructures, enhancing data collection and analysis to deliver better business results.
Three major trends are driving this need for clustered storage systems across multiple vertical markets and industries: the avalanche of digital content and unstructured data being created; the strong desire to pair industry-standard hardware with smart software to achieve cost savings and performance advantages; and the fundamental technology paradigm shift to clustered computing architectures.
As IT pros know, applications using video, audio, images, research sets and other large digital files – which are bombarding today’s enterprises – have requirements for which traditional storage systems were not designed. Digital content and unstructured data consume large amounts of storage capacity, grow much faster than traditional data stores, and require high throughput and often high concurrency. The need to efficiently manage storage for these files is becoming a key factor in the demand for network storage systems that can easily grow to accommodate business needs.
That ease of growth, combined with the scalability of clustered network storage systems, addresses the needs created by the enormous and rapid growth of digital content in today’s businesses without burdening companies with added cost or complexity. For that reason, analysts such as Enterprise Strategy Group’s Tony Asaro have said that clustered systems are the next evolution in enterprise storage. Clustered storage systems deliver capabilities beyond those for which traditional storage systems were designed, and analysts believe that they will be a core infrastructure for grid computing going forward.
The evolution of storage
So, how did the evolution to clustered storage begin?
Well, this shift closely parallels what we have seen in the server world in recent years. ‘Big iron’ mainframes started to become obsolete as minicomputers (like the IBM AS/400 and multiprocessor Sun and SGI boxes) became more prevalent. As technologies became more effective and efficient, those machines also became more compact. Eventually there was a further shift towards clustered infrastructures that use industry-standard hardware, such as blade and 1RU servers, running Linux and Windows.
The world of data storage has gone through a similar evolution. Storage systems, once monolithic, refrigerator-sized units, were replaced by mid-sized (but still monolithic) systems. Today’s storage has evolved further, to compact, easily expandable, modular clustered storage systems.
The best of these clustered storage systems run on industry-standard hardware and have software that enables all components, or nodes, to work together seamlessly as one single system. Clustered storage systems have much more collective horsepower, cost 40-60 percent less than the traditional big monolithic systems and can scale much larger than traditional storage systems ever could. For example, whereas traditional storage systems top out at several terabytes per file system, today’s clustered storage systems support more than 150 terabytes.
The clustered storage approach is central to overcoming the four biggest limitations that traditional storage systems have with digital content and unstructured data.
Traditional storage:
* Creates many separate ‘islands of storage’ that are hard to manage and cannot be easily shared.
* Creates performance bottlenecks because the devices are all separate systems. Some devices may get maxed out, particularly with high-throughput digital content applications, while others are underutilized.
* Is complex and hard to grow. Adding new systems creates separate silos of storage. As each new system is added, someone has to figure out what files should be moved to the new devices and then how to tell all the users and applications where to find their data.
* Contains inherent single points of failure within the systems. If any server dies, all access to the data that sits on the disks behind that server is lost. In addition, the systems cannot withstand the simultaneous loss of more than one disk. If multiple failures happen on a traditional system, the data on those disks will always be lost.
Although storage solutions vary, a good clustered network attached storage (NAS) system successfully addresses all these problems. In sum, this evolution to clustered storage is providing businesses with systems that are more reliable and easier to manage.
Survival of the fittest
As we all know, businesses rely on data, and the loss of that data can have profound repercussions on a company’s performance and reputation. Disk failures are a likelihood that all businesses must accept and plan for – especially businesses that are experiencing exponential growth in today’s competitive market. However, with clustered storage, businesses don’t have to risk losing their valuable information. There are solutions available on the market today that seamlessly continue to operate with 100 percent data availability while simultaneously recovering failed disks or even entire nodes.
Indeed, intelligent clustered storage systems are built to easily scale all the way from four terabytes to many petabytes, and as such employ sophisticated, robust, policy-based file management tools that drastically minimize or even completely eliminate the impacts of failed hard drives – ensuring that your mission critical business data is always available.
Key storage system features often sought out by data-intensive companies include: disaster recovery, distributed workflow, information lifecycle management (ILM), backup and migration capabilities and distributed online delivery of content.
* Disaster recovery
IT-savvy technology purchasers seek out systems and software that deliver high reliability, availability and IT continuity even in the face of failure. A good system will do this through its ability to replicate and secure mission-critical data to secondary locations over a local area network (LAN) or wide area network (WAN).
* Distributed digital workflow
Digital workflow, data stores and the teams that access them are frequently spread across multiple locations. Competitive businesses look for sophisticated, file-based replication policies that enable faster and more efficient collaboration on large shared data sets at each stage of the digital content workflow.
* Information lifecycle management (ILM)
Solutions that automatically move or replicate files from more expensive, primary online storage to less expensive, disk-based near-line storage, using specific criteria such as file type, creation date or file size, provide sophisticated and economical tools for reaching a company's information lifecycle management goals.
* Local backup, restore and migration
To shrink the window of time required to perform large-scale backup, restore and data migration operations, companies are using fast, scalable and distributed disk-based targets that enable critical data to be made available while eliminating downtime due to replication and migration operations.
* Distributed online content delivery
Increased bandwidth utilization, achieved by distributing content across multiple networks, data centers and geographical locations, decreases costs while dramatically improving end-user access speeds.
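To make the information lifecycle management item above more concrete, the following is a minimal sketch of a policy that selects files for near-line storage by file type, creation date and file size. The names `FileInfo`, `TierPolicy` and `plan_tiering` are hypothetical and are not taken from any particular product; a real system would apply such rules inside the storage software rather than in a script.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class FileInfo:
    path: str
    size_bytes: int
    created: date


@dataclass
class TierPolicy:
    """Hypothetical rule: a file moves to near-line storage only if it
    satisfies every criterion that has been set."""
    suffixes: Tuple[str, ...] = ()          # e.g. (".mov", ".tif"); empty = any type
    created_before: Optional[date] = None   # None = any creation date
    min_size_bytes: int = 0                 # 0 = any size

    def matches(self, f: FileInfo) -> bool:
        if self.suffixes and not f.path.lower().endswith(self.suffixes):
            return False
        if self.created_before is not None and f.created >= self.created_before:
            return False
        return f.size_bytes >= self.min_size_bytes


def plan_tiering(files, policy):
    """Split files into (stay_on_primary, move_to_nearline) lists."""
    stay, move = [], []
    for f in files:
        (move if policy.matches(f) else stay).append(f)
    return stay, move
```

A policy such as `TierPolicy(suffixes=(".mov",), created_before=date(2006, 1, 1), min_size_bytes=1_000_000)` would then select only large, older video files for the cheaper tier and leave everything else on primary storage.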
Basic instincts
Selecting the right storage system can be confusing, but it is nevertheless an important decision. The right storage system allows IT executives and their businesses to redefine the economics of storage and improve the overall cost of doing business by allowing the company to ‘scale as you grow’ and handle massive amounts of multiplying digital data safely and efficiently.
The weight of this decision is often made more taxing by the significant amount of misinformation that exists in the storage market – at times making it difficult to discern the differences between storage area networks (SANs) and network attached storage (NAS) systems. Instinctively, though, most technology buyers know that they need storage that provides both scalability and reliability.
More often than not, a company also needs storage systems that offer decreased complexity, reduced risk and a simplified management interface that significantly lowers total cost of ownership. One key element that helps to enable this is a single, expandable global namespace, which can allow companies to directly manage 1,000 terabytes as easily as one terabyte and greatly reduces risk by eliminating unnecessary complexity.
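As a rough illustration of what a single, expandable global namespace means for clients, the toy model below keeps one path tree for the whole cluster and hides which node holds each file. The class `GlobalNamespace` and its placement rule (write to the node with the most free space) are assumptions made for this sketch, not a description of how any vendor's file system places data.

```python
class GlobalNamespace:
    """Toy model: one path tree, many nodes, placement hidden from clients."""

    def __init__(self):
        self.free_tb = {}    # node name -> free capacity in terabytes
        self.location = {}   # file path -> node name (internal detail)

    def add_node(self, name, capacity_tb):
        # Growing the cluster adds capacity; existing paths are unaffected.
        self.free_tb[name] = capacity_tb

    def total_free_tb(self):
        return sum(self.free_tb.values())

    def write(self, path, size_tb):
        if not self.free_tb:
            raise OSError("no nodes in cluster")
        # Assumed placement rule: pick the node with the most free space,
        # breaking ties by node name so the choice is deterministic.
        node = max(self.free_tb, key=lambda n: (self.free_tb[n], n))
        if self.free_tb[node] < size_tb:
            raise OSError("no node has room for this file")
        self.free_tb[node] -= size_tb
        self.location[path] = node

    def exists(self, path):
        # Clients ask by path only; they never name a node.
        return path in self.location
```

In this model, adding a node raises the capacity reported by `total_free_tb()` while every existing path continues to resolve unchanged, which is the property that spares administrators from telling users and applications where their data has moved.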
Companies facing a growing data warehouse also need the ability to scale capacity within a single file system – to as much as, or more than, 150 terabytes – while maintaining performance and data integrity. This will allow a company to grow its data stores significantly at minimal cost.
Other features that companies should look for in storage systems include:
* Modular, clustered architectures that allow flexibility in growing the system and provide performance advantages for concurrent access to content;
* Built-in software intelligence that provides ease of management and an easy-to-navigate web interface that makes changes to system capacity or protection levels effortless;
* Systems that are optimized to accommodate large file sizes, high aggregate throughput and many concurrent users that can also easily handle unpredictable or explosive data growth;
* Load balancing that enables client and application connections to be evenly distributed within a cluster through intelligent software policies; and
* Use of industry-standard hardware that takes advantage of standard Ethernet as the interconnect fabric and uses standard protocols to communicate with other applications and clients, reducing unnecessary cost and configuration complexity.
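The load-balancing item in the list above can be sketched with a simple least-connections policy, in which each new client connection goes to the node currently serving the fewest clients. This is only one possible policy, chosen here for illustration; the class `ClusterBalancer` and its method names are hypothetical.

```python
class ClusterBalancer:
    """Least-connections sketch: send each new client to the least busy node."""

    def __init__(self, nodes):
        self.active = {node: 0 for node in nodes}  # node -> open connections

    def add_node(self, node):
        # A newly added node starts empty and so attracts new connections first.
        self.active.setdefault(node, 0)

    def connect(self):
        # Choose the node with the fewest connections; ties go to the
        # alphabetically first node so the result is deterministic.
        node = min(self.active, key=lambda n: (self.active[n], n))
        self.active[node] += 1
        return node

    def disconnect(self, node):
        if self.active[node] > 0:
            self.active[node] -= 1
```

With three nodes, six calls to `connect()` leave two connections on each node; if a fourth node is then added, the next connection lands on it because it has the lightest load.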
Conclusion
In order to sustain unyielding growth in digital content and unstructured data with the same or even fewer resources, data-centric organizations must change their approach to storage and embrace the shift to a clustered architecture.
The fact of the matter is that decreased complexity reduces risk, simplifies management and lowers the total cost of ownership. In addition, intelligent technology and a single, expandable global namespace allow companies to directly manage large amounts of data while greatly reducing risk by eliminating unnecessary complexity.
The digital content and unstructured data flooding companies in data-intensive markets such as media and entertainment, digital imaging, life sciences, oil and gas, and government create unique storage requirements. Businesses in these industries need to embrace clustered architectures to ensure that they can easily manage their digital assets for the long term.
Isilon Systems UK Ltd is exhibiting at Storage Expo 2007. Now in its 7th year, the show features a comprehensive free education programme and over 100 exhibitors at Olympia, London, from 17-18th October 2007: www.storage-expo.com

• Date: 19th July 2007 • Region: World • Type: Article • Topic: IT continuity