Building true business continuity solutions over IP

By Dante Malagrino, sr. technical marketing manager, Cisco Systems EMEA

Many business continuity plans are based on the creation of one or more secondary data centres, where storage and applications are replicated in such a way that if a disaster occurs, the equipment hosted in the secondary site will be able to take over and keep the business up and running.

Common sense suggests that the secondary data centre should be located far enough away from the primary one to reduce the likelihood of all IT operations being affected by the same disaster. But longer distances imply higher latency, which some applications do not tolerate well. This is particularly true for synchronous replication, where each write must be acknowledged by the remote site before it completes; it is much less of a concern for asynchronous replication. Therefore, companies that need synchronous replication to guarantee that primary and remote data centres constantly hold the same information will typically have to deploy their secondary site within about 100km of the primary site.
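To put rough numbers on the latency argument, the following Python sketch estimates propagation delay over fibre. It assumes that light travels through fibre at about 200,000 km/s (roughly 5 microseconds per kilometre) and, purely as an illustrative assumption, that each replicated write costs two round trips. The function and parameter names are hypothetical, and switching, queuing and protocol overheads are ignored.

```python
# Assumption: light travels through optical fibre at roughly 200,000 km/s,
# i.e. about 5 microseconds per kilometre.
FIBRE_SPEED_KM_PER_S = 200_000.0


def round_trip_delay_ms(distance_km: float) -> float:
    """Propagation delay for one request/acknowledgement round trip, in ms."""
    one_way_s = distance_km / FIBRE_SPEED_KM_PER_S
    return 2 * one_way_s * 1000.0


def sync_write_penalty_ms(distance_km: float, round_trips_per_write: int = 2) -> float:
    """Extra latency added to each synchronously replicated write, in ms.

    round_trips_per_write is an illustrative assumption, not a measured value.
    """
    return round_trips_per_write * round_trip_delay_ms(distance_km)


if __name__ == "__main__":
    for km in (10, 100, 1000):
        print(f"{km:>5} km: {round_trip_delay_ms(km):.2f} ms per round trip, "
              f"{sync_write_penalty_ms(km):.2f} ms added per write")
```

Under these assumptions a 100km link adds about 1 ms per round trip, whereas 1,000km adds about 10 ms, which is why distance matters so much more to synchronous replication than to asynchronous replication.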

At these distances it is possible to choose amongst many different transport technologies, including of course IP, but also CWDM (Coarse Wavelength Division Multiplexing) or DWDM (Dense WDM), which can be more expensive but provide very high performance and excellent reliability. Companies that do not necessarily require synchronous replication can locate their secondary data centre further away (more than 100km from the primary site) and will have to consider transport technologies that are capable of spanning those distances, such as IP. The technology that leverages IP for the interconnection of remote sites is called FCIP (Fibre Channel over IP); it allows Fibre Channel fabrics to be transparently interconnected over any IP infrastructure.

Fibre Channel Over IP
FCIP is a protocol specification developed by the Internet Engineering Task Force (IETF) that allows a device to transparently tunnel Fibre Channel frames over an IP network. The operation of the FCIP protocol is very similar to that of any other tunnelling mechanism. Two edge devices interface between the local Fibre Channel SAN in each of the data centres and the IP network. Each of these devices takes Fibre Channel frames from the SAN and encapsulates them within IP packets that can be transferred over an IP network in a reliable manner by using TCP (Transmission Control Protocol) as the transport layer protocol. At the remote site, another FCIP device receives incoming FCIP traffic, strips off the additional headers and places the original Fibre Channel frames back on to the SAN.
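The tunnelling idea can be illustrated with a short Python sketch. This is not the real FCIP encapsulation: the 4-byte length header below is a hypothetical stand-in for the header defined in the IETF specification, and the function names are invented for illustration. It only shows why some framing is needed, since TCP delivers a byte stream rather than discrete frames.

```python
import struct
from typing import List

# Hypothetical tunnel header: payload length as a 4-byte big-endian integer.
# The real FCIP header defined by the IETF is different and carries more fields.
_HEADER = struct.Struct("!I")


def encapsulate(fc_frame: bytes) -> bytes:
    """Sending edge device: prefix the untouched frame with the tunnel header."""
    return _HEADER.pack(len(fc_frame)) + fc_frame


def decapsulate_stream(stream: bytes) -> List[bytes]:
    """Receiving edge device: strip headers and recover the original frames."""
    frames = []
    offset = 0
    while offset + _HEADER.size <= len(stream):
        (length,) = _HEADER.unpack_from(stream, offset)
        start = offset + _HEADER.size
        end = start + length
        if end > len(stream):
            break  # incomplete frame; a real device would wait for more bytes
        frames.append(stream[start:end])
        offset = end
    return frames


if __name__ == "__main__":
    original = [b"frame-A", b"frame-B"]
    tcp_bytes = b"".join(encapsulate(f) for f in original)
    assert decapsulate_stream(tcp_bytes) == original
```

Because the frames come out byte-for-byte identical to those that went in, neither SAN needs to know that an IP network sits in the middle.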

This mode of operation represents both the strength and the weakness of the protocol. It makes FCIP completely transparent to the Fibre Channel SANs that are being interconnected, which implies that most, if not all, of the management procedures that a SAN administrator is used to performing on a Fibre Channel SAN within a single data centre are easily extended to the interconnected environments. But it also means that the two Fibre Channel SANs have effectively been ‘bridged,’ thus creating one large, geographically dispersed SAN.

There are two problems associated with bridging Fibre Channel SANs across distance. The first is that the stability of the extended SAN is now dependent on the stability of the link that connects the two environments. The second is related to having one unified SAN across the two (or more) data centres, which means that a fault at either site is propagated to the other and may cause disruption to the entire environment.

The main purpose of building a secondary data centre is to protect the data and the applications that are hosted at the primary site, but the probability of a service interruption increases as soon as any connection between the primary and secondary locations is established.

A solution to this paradox exists: it is based on the combined use of FCIP and two advanced Fibre Channel features called Virtual SANs (VSANs) and InterVSAN Routing (IVR).

The Role of VSANs
Very much like Ethernet, Fibre Channel is a layer-2 protocol without any hierarchical network domain concept that would serve to isolate and localise control protocols and messaging within a given region of the network. Instead, like Ethernet, Fibre Channel maintains a set of control protocols that are fabric-wide in scope, such as zoning or state change notifications. Local control protocol events can therefore result in disruptions that span the full extent of the fabric. This is as true for SANs that are fully confined within a data centre as it is for extended SANs, built using any kind of transport. In the case of an extended SAN, a disruption on either side of the link affects both sides equally.

In Ethernet the problem of segmenting large physical domains into multiple logical infrastructures is solved by VLANs (Virtual LANs), whose key attribute is that a disruptive fault on one VLAN does not affect any of the others; this is achieved by having a separate control plane for each VLAN. In Fibre Channel the equivalent of VLANs is VSANs. Now part of the ANSI T11 standard, VSANs behave in a very similar way to VLANs and provide exactly the same benefits in terms of security, scalability and fault isolation.

When multiple VSANs need to be carried over one ISL (Inter Switch Link), each frame is tagged with explicit VSAN membership information so that the receiving switch can take the VSAN tag into account when making its forwarding decision. Of course this VSAN tag is never exposed to end devices, such as HBAs (Host Bus Adapters) or storage array interfaces. A switch-to-switch link that supports VSAN tagging is called an EISL (Enhanced ISL). It goes without saying that ISLs and EISLs can also be extended over long distances, possibly using FCIP.
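The tagging behaviour described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not a description of any switch implementation: the `Frame` and `Switch` classes, the device names and the VSAN numbers are all hypothetical, and the real EISL header format is not represented.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Frame:
    dst: str
    payload: bytes
    vsan: Optional[int] = None  # the tag exists only on switch-to-switch links


class Switch:
    def __init__(self, port_vsan: Dict[str, int]) -> None:
        # Maps each locally attached end device to the VSAN of its port.
        self.port_vsan = port_vsan

    def ingress(self, src: str, frame: Frame) -> Frame:
        """Tag a frame from an end device with the VSAN of its ingress port."""
        return Frame(frame.dst, frame.payload, self.port_vsan[src])

    def egress(self, tagged: Frame) -> Optional[Frame]:
        """Deliver only within the same VSAN, stripping the tag first."""
        if tagged.vsan is None or self.port_vsan.get(tagged.dst) != tagged.vsan:
            return None  # destination is unknown or belongs to another VSAN
        return Frame(tagged.dst, tagged.payload)  # the end device never sees the tag


if __name__ == "__main__":
    site_a = Switch({"host1": 10, "host2": 20})
    site_b = Switch({"array1": 10, "array2": 20})
    on_the_link = site_a.ingress("host1", Frame("array1", b"data"))
    assert site_b.egress(on_the_link) == Frame("array1", b"data")
    # host2 sits in VSAN 20, so its frame for array1 (VSAN 10) is not delivered.
    assert site_b.egress(site_a.ingress("host2", Frame("array1", b"data"))) is None
```

The second assertion also shows the limitation discussed in the next section: with VSANs alone, devices in different VSANs cannot exchange data at all.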

InterVSAN Routing
VSANs alone do not yet solve the problem of interconnecting two data centres while keeping them isolated from a control plane perspective. VSANs make it possible to isolate two or more data centres by using different sets of VSANs, but this also prevents data traffic from flowing between the sites. In the Ethernet world, the problem of enabling nodes belonging to different VLANs to communicate with each other is solved by IP. By leveraging a hierarchical addressing scheme and a set of routing techniques, IP, a layer-3 protocol, lets data traffic cross the boundaries of VLANs without merging their control planes.

Unfortunately, there is no IP in SANs, nor any other kind of layer-3 protocol. The common upper layer protocol for storage networking is SCSI (Small Computer System Interface), which was designed on the basis of completely different assumptions from those behind IP. When SCSI was first conceived, nobody could have imagined that the distances spanned by the protocol might be as long as those of an FCIP link. SCSI was originally meant to be a bus protocol to connect peripherals, including storage devices, to a computer. As such, SCSI architects never thought about including a proper layer-3 protocol, but limited themselves to designing a local I/O (Input/Output) protocol.

Since nobody would even dream about changing SCSI today, the solution to SAN internetworking has been built within the fabric and goes under the name of InterVSAN Routing (IVR). Using IVR, a set of policies can be configured on fabric devices to selectively allow nodes belonging to different VSANs to talk to each other. In this way IVR achieves what IP does in the IP/Ethernet world.
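A minimal Python sketch of such a policy table follows. It is an illustration only: the `IvrPolicy` class, its methods, the policy name and the device names are hypothetical and do not correspond to any vendor's configuration syntax. The sketch assumes that a policy is simply a named set of (VSAN, device) members that are allowed to exchange data across VSAN boundaries.

```python
from typing import Dict, FrozenSet, Tuple

Member = Tuple[int, str]  # (VSAN id, device name), both illustrative


class IvrPolicy:
    """Hypothetical model of selective inter-VSAN permissions."""

    def __init__(self) -> None:
        self._policies: Dict[str, FrozenSet[Member]] = {}

    def allow(self, name: str, *members: Member) -> None:
        """Let the listed members talk to each other across VSANs."""
        self._policies[name] = frozenset(members)

    def permits(self, src: Member, dst: Member) -> bool:
        if src[0] == dst[0]:
            # Same VSAN: ordinary zoning would apply; not modelled in this sketch.
            return True
        return any(src in p and dst in p for p in self._policies.values())


if __name__ == "__main__":
    ivr = IvrPolicy()
    ivr.allow("replication", (10, "array-primary"), (20, "array-secondary"))
    assert ivr.permits((10, "array-primary"), (20, "array-secondary"))
    # No policy mentions host1, so it cannot reach the remote VSAN.
    assert not ivr.permits((10, "host1"), (20, "array-secondary"))
```

Only the replication pair is allowed across the boundary here; everything else stays inside its own VSAN by default.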

By combining VSANs with IVR, it now becomes possible to join primary and secondary data centres and let relevant traffic flow between the two sites, while still preserving the control plane isolation needed to guarantee that a fault at either site or along the connecting link does not adversely affect the operation of the entire IT infrastructure.
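The difference in fault scope can be expressed as a toy model in Python. It assumes, as a deliberate simplification, that a fabric-wide control event raised in a VSAN reaches every site where that VSAN is present and nowhere else; the function name, site names and VSAN numbers are hypothetical.

```python
from typing import Dict, Set


def sites_reached(site_vsans: Dict[str, Set[int]], event_vsan: int) -> Set[str]:
    """Sites that see a fabric-wide control event raised in event_vsan."""
    return {site for site, vsans in site_vsans.items() if event_vsan in vsans}


# Plain FCIP bridge: both sites share a single fabric (VSAN 1 here).
bridged = {"primary": {1}, "secondary": {1}}

# VSANs plus IVR: each site keeps its own VSAN; only selected data crosses.
isolated = {"primary": {10}, "secondary": {20}}

if __name__ == "__main__":
    print(sorted(sites_reached(bridged, 1)))    # ['primary', 'secondary']
    print(sorted(sites_reached(isolated, 10)))  # ['primary']
```

In the bridged case an event at the primary site reaches the secondary site as well; with separate VSANs it stays where it started.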

Conclusions
Building a reliable, secure and scalable business continuity solution over IP is possible by astutely combining FCIP, VSANs and IVR. This solution is architecturally very powerful, as there is no restriction on the transport technology that can be used for the long distance connection. IVR relies strictly on basic Fibre Channel services and can therefore leverage any transport option available to Fibre Channel, such as Fibre Channel itself, CWDM or DWDM, SDH/SONET and of course IP.

Dante Malagrino holds a Ph.D. in Computer Science from Politecnico di Torino in Italy. He joined Cisco Systems in San Jose, CA in 1999 as a Software Engineer, with interests mainly in the areas of high-speed networking and LAN switching. In 2001, he joined Andiamo Systems as Technical Marketing Manager responsible for the competitive analysis laboratory. In 2002, after the acquisition of Andiamo Systems by Cisco, he became Product Manager for the Cisco MDS 9000 family of intelligent SAN switches. In June 2004, he relocated to London, UK, as EMEA Marketing Manager for Cisco’s storage networking products. In November 2004 he was appointed Director of the board for the SNIA Europe Industry Association and Chairman of the Education Committee.

Cisco is exhibiting at Storage Expo, the UK's largest event dedicated to data storage. Now in its 5th year, the show features a comprehensive free education programme and over 90 exhibitors at the National Hall, Olympia, London, from 12 - 13 October 2005. www.storage-expo.com

Date: 10th June 2005 • Region: Various • Type: Article • Topic: IT continuity

Copyright 2008 Portal Publishing Ltd