Jim Lee explains why managing the data
life cycle is becoming one of the most critical IT continuity challenges.
Data is at the heart of every organisation
and as companies accumulate increasing volumes of it, managing and
storing enterprise data is fast becoming one of the most critical
IT continuity challenges. IT executives are looking for more cost-effective
ways to improve data and storage management and to maximise the
return on their storage investment. As a result, storage resource
management (SRM) is rapidly taking a lead position in helping companies
meet these goals. As the amount and value of corporate data increases,
as computing environments become more complex and as storage management
costs skyrocket, the importance of SRM is increasing rapidly.
The data explosion has resulted in increasing
enterprise storage requirements, creating the need for companies
to deploy a variety of storage technologies including storage area
networks (SANs), network attached storage (NAS), hierarchical
storage management (HSM) and direct attached storage. Considering
the breadth of enterprise storage, a comprehensive SRM solution
is needed to provide a global view of an organisation’s data
and storage resources: monitoring their status, managing
them cost-effectively, ensuring availability and supporting
future growth.
Today, active archiving is recognised as a
proven and cost-effective strategy for managing fast-growing, complex
relational databases by controlling excessive database growth over
the long term. Active archiving works within the framework of various
storage technologies and SRM to offer a ‘best practices’
approach for managing storage resources and reducing operational
costs. Combining active archiving with SRM provides organisations
with the ability to meet the challenges of managing increasing data
volumes effectively. The right combination and deployment of these
technologies ensures that organisations can meet their data management,
data retention and storage requirements at the lowest cost.
Managing enterprise data throughout its life cycle
Appropriate data and storage management requires the realisation
that data has a life cycle. Typically, the data life cycle begins
with a business need, initially acquiring data and subsequently
referencing that data on a regular basis during day-to-day business
operations. Over time, this data loses its vitality and is accessed
less often, gradually losing its business value, and finally ending
with its disposal. However, through most of the life cycle, this
data is retained online.
The simple, but critical principle that all
data moves through life cycle stages is the key to improving data
management. By understanding how the data is used and how long it
must be retained, companies can develop a strategy to map usage
patterns to the optimal storage media, thereby minimising the total
cost of storing the data over its life cycle.
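As a rough illustration of mapping usage patterns to storage media, the sketch below picks a storage tier from how long a data set has sat idle and stops storing it once its retention period has passed. The tier names, idle limits and cost figures are hypothetical, chosen only to make the idea concrete; a real policy would take them from the organisation's own usage analysis and retention rules.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical storage tiers, ordered from most to least expensive.
# Each entry: (name, longest idle period in days before data moves to the
# next tier down, cost per GB per month). All figures are made up.
TIERS = [
    ("online database", 90, 10.0),
    ("near-line file server", 365, 3.0),
    ("offline tape", None, 0.5),  # None: no upper limit on idle time
]


@dataclass
class DataSet:
    name: str
    size_gb: float
    age_days: int                # time since the data was acquired
    days_since_last_access: int  # how "cold" the data is
    retention_days: int          # how long the data must be kept


def choose_tier(ds: DataSet) -> Optional[str]:
    """Pick a tier from how long the data has been idle.

    Returns None once the retention period has expired.
    """
    if ds.age_days > ds.retention_days:
        return None  # end of the life cycle: the data can be disposed of
    for name, max_idle_days, _cost in TIERS:
        if max_idle_days is None or ds.days_since_last_access <= max_idle_days:
            return name
    return None


def monthly_cost(ds: DataSet) -> float:
    """Monthly storage cost of a data set on the tier chosen for it."""
    tier = choose_tier(ds)
    if tier is None:
        return 0.0
    cost_per_gb = {name: cost for name, _idle, cost in TIERS}[tier]
    return ds.size_gb * cost_per_gb
```

Under these assumed figures, a data set that has been idle for 200 days would be placed on the near-line file server rather than kept in the online database, which is the kind of saving the life cycle view is meant to expose.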
The same principles apply when the data is
stored in a relational database; however, the challenge of managing
and storing relational data is compounded because of the complexities
inherent in the data relationships. Relational databases are a major
consumer of storage and are also among the most difficult to manage
because they are accessed on a regular basis. Without the ability
to manage relational data effectively, relative to its use and storage
requirements, runaway database growth will result in increased operational
costs, poor performance and limited availability for the applications
that rely on these databases. The ideal solution is to manage data
stored in relational databases as part of an overall enterprise
SRM solution.
Impact of relational database growth
Accelerating database growth across industries and applications,
combined with the dramatic increase in graphic, audio and video
media, has created a growing demand for better ways to manage data
and a requirement for more efficient and less expensive storage
solutions throughout the data life cycle. Within the data storage
marketplace, the most difficult challenge is managing the growth
of relational databases that drive mission-critical applications,
the backbone of today’s corporate decision-making
and competitive advantage.
The impact of database growth extends well
beyond increasing storage costs and is also critical to business
continuity and disaster recovery plans. Larger databases take significantly
more time to rebuild and restore, whilst overloaded relational databases
degrade performance and limit the availability of critical applications.
Database tuning and expensive hardware, software and storage upgrades
offer diminishing returns. Lastly, to comply with data retention
policies, companies retain much of their historical data online
for audit and legal reasons, despite much of it being rarely accessed.
Even though managing the data life cycle is
critical to the enterprise, few standards exist today to assist
companies in formulating and implementing long-term data retention
strategies. Based on regulatory and legislative requirements, IT
organisations must develop a plan for managing enterprise data in
a complex relational database environment. So, how can companies
implement the best methodology for managing this critical data throughout
its life cycle?
Active archiving
Active archiving is essential for managing the data life cycle efficiently,
complying with data retention requirements and reducing costs. Active
archiving is the only way that rarely accessed data can be safely
archived and removed from an online relational database and transitioned
to other storage media, while retaining easy access to the archived
data in its business context.
However, before developing an active archiving
strategy, an organisation must first identify all the types of enterprise
data to ensure a comprehensive understanding of what the data is
and how it is used, and to identify the data retention and appropriate
storage requirements. Typical corporate data includes all transactional
data from enterprise business applications and the associated databases,
such as payroll, customer information systems and purchasing systems.
This analysis determines which data must remain online
and which data should be archived, ensuring a cost-effective balance
throughout the data life cycle. This process also ensures that enterprise
application databases are maintained at a manageable size that improves
the performance and availability of critical systems. The goal of
effective data life cycle management is to keep historical data
as long as required, but not any longer. Consider this approach
as ‘just-in-time’ data accessibility.
Going well beyond the traditional definition
of archiving, active archiving is a proven technology that safely
archives and removes precise subsets of rarely used data from complex
relational databases with 100 percent accuracy. Companies can store
archived data and keep it ‘active’ for easy access when
needed. The referential integrity and business context are preserved.
Users may even access and restore archived data selectively and
referentially intact, eliminating the need to restore all archived
data for the sake of just a few rows.
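To make the idea of archiving a referentially intact subset more concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not how any particular archiving product works: the customers and orders tables, the archive_ tables and the function names are all hypothetical, and the archive lives in the same database purely to keep the example short. It moves one customer together with that customer's orders in a single transaction, and can restore just that customer later.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create hypothetical production tables and matching archive tables."""
    # SQLite enforces foreign keys only when asked to.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            total REAL
        );
        CREATE TABLE archive_customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE archive_orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES archive_customers(id),
            total REAL
        );
        """
    )


def _move(conn: sqlite3.Connection, customer_id: int, src: str, dst: str) -> None:
    """Move one customer and all related orders between table sets.

    src and dst are fixed table-name prefixes ("" or "archive_"), never
    user input. Parent rows are copied before child rows and child rows
    are deleted before parent rows, so foreign keys hold at every step.
    """
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            f"INSERT INTO {dst}customers SELECT * FROM {src}customers WHERE id = ?",
            (customer_id,),
        )
        conn.execute(
            f"INSERT INTO {dst}orders SELECT * FROM {src}orders WHERE customer_id = ?",
            (customer_id,),
        )
        conn.execute(f"DELETE FROM {src}orders WHERE customer_id = ?", (customer_id,))
        conn.execute(f"DELETE FROM {src}customers WHERE id = ?", (customer_id,))


def archive_customer(conn: sqlite3.Connection, customer_id: int) -> None:
    """Archive a customer and their orders, removing them from production."""
    _move(conn, customer_id, src="", dst="archive_")


def restore_customer(conn: sqlite3.Connection, customer_id: int) -> None:
    """Selectively restore one archived customer with their orders."""
    _move(conn, customer_id, src="archive_", dst="")
```

Because the copy and the delete happen in one transaction, a failure part-way through leaves the production data untouched, and restoring one customer does not require restoring the rest of the archive.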
These capabilities dramatically reduce database
overload, allowing companies to reduce storage requirements, improve
application response times and reallocate current capacity to support
more users and transactions. Active archiving allows IT organisations
to maximise the benefits of existing SANs, NAS and HSM storage solutions
because it complements these technologies, especially HSM systems,
to enable a best-practice ‘staged’ approach to managing
historical relational data that can be an integral part of enterprise
SRM.
Active archiving and HSM
It’s true that active archiving and HSM both address the problem
of explosive data growth by moving data to more cost-effective storage
devices. However, active archiving is designed for relational data,
while HSM is best suited for other types of data such as document
files, bit maps, and video clips. Although HSM is ideal for managing
these types of data, it is poorly matched for managing relational
database tables, which can be very large.
Active archiving handles relational data at
the row or record level, while HSM handles relational data at the
table or dataset level (a relational database table is physically
stored as a file). HSM performs the migration function based on
the last time a particular database table or file was accessed.
Users are likely to access at least a small part of the database
during the period for which the storage system administrator
has designated that the data be kept at the highest level. For this
reason, the entire relational database file will most likely
continue to reside at the highest storage level.
For example, a customer database table will probably be accessed
on a regular basis, keeping it at Level One. Although only a subset
of this data remains ‘hot’ (that is, current customers),
the entire dataset must be kept on the server because
HSM cannot distinguish relational data at the row or record level.
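The difference between the two migration rules can be shown with a small sketch. This is a simplified model of the behaviour described above, not the logic of any specific HSM or archiving product; the sample rows, dates and the 365-day threshold are hypothetical.

```python
from datetime import date


def file_can_migrate(rows, today, max_idle_days):
    """File-level rule: the whole table is one file, so its last access
    is the most recent access to any row in it."""
    file_last_access = max(row["last_access"] for row in rows)
    return (today - file_last_access).days > max_idle_days


def rows_to_archive(rows, today, max_idle_days):
    """Row-level rule: every row is judged on its own last access date."""
    return [row for row in rows if (today - row["last_access"]).days > max_idle_days]


# Hypothetical customer table: one current customer, two long-inactive ones.
TODAY = date(2003, 6, 6)
CUSTOMERS = [
    {"id": 1, "last_access": date(2003, 6, 5)},
    {"id": 2, "last_access": date(2001, 3, 1)},
    {"id": 3, "last_access": date(2000, 11, 20)},
]
```

With these sample rows and a 365-day threshold, the file-level rule keeps the whole table where it is because customer 1 was accessed the day before, while the row-level rule selects customers 2 and 3 for archiving.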
Companies that have already deployed HSM will
understand the benefits of “staged” data management.
With active archiving, companies achieve similar benefits for relational
data. Although HSM can migrate relational database tables up and
down the HSM hierarchy, the size of the database does not change.
In contrast, active archiving streamlines relational databases by
archiving and removing referentially intact subsets of related data.
Combining the two provides the best of both worlds: active
archiving streamlines the relational databases, while HSM rules
manage the archived data.
My company, Princeton Softech, has customers
that have safely archived and removed 65 percent of their database
in just their first production archive and delete. This capability
frees tremendous processing power to improve performance and availability
and to implement new applications, without upgrading capacity. In addition,
a large amount of disk capacity is made available for other uses.
Regularly scheduled active archiving continues to free significant
disk space, saving millions in hardware and software upgrades. Because
active archiving is an effective long-term solution to the problem
of explosive database growth, it is critical to an enterprise data
storage strategy.
A comprehensive enterprise active archiving
methodology must provide the capability to archive data from a variety
of relational databases and platforms. The ideal active archiving
solution must also guarantee to retain the referential integrity
and business context of the archived data and provide for easy access.
In addition, there must be a capability for managing and storing
archived data on the most cost-effective storage medium (online
in an archive database, near-line on a file server, optical devices
or offline to tape). Integrating active archiving with SRM offers
a best-practices approach to ensuring that data and storage resources
are well managed throughout the data life cycle.
Summary
Effective storage resource management enables companies to reduce
storage costs, improve data management, and keep data accessible
throughout its life cycle. Along with the leading storage technologies,
active archiving must be an integral part of any SRM initiative.
Companies can remove rarely accessed historical data from overloaded
databases and store it on the most cost-effective medium. The best
and most comprehensive way to meet the challenges of the data explosion
and to manage data throughout its life cycle combines the overall
view provided by SRM with the refined and proven approach
provided by active archiving.
Jim Lee is vice president, product marketing, Princeton Softech.
Princeton Softech is exhibiting at Storage
Expo, the UK's largest dedicated data storage event, delivering the
latest data storage products on the market to over 3,000 end users,
at Olympia, London, on 15-16 October 2003. www.storage-expo.com

•Date: 6th June 2003 •Region: Worldwide •Type: Article •Topic: IT continuity