Evolving IP storage switches

April 1, 2002
SPECIAL REPORTS: Premises Networks

IP storage switches provide a migration path from the Fibre Channel SANs of today to the IP SANs of tomorrow.

TOM CLARK, Nishan Systems

The emergence of SANs based on a Fibre Channel architecture has created new solutions for storage as well as new problems for customer implementation. Fibre Channel provides gigabit transport for the efficient transfer of block storage data but imposes its own infrastructure and management requirements. The Fibre Channel protocol is optimized for streaming frames with minimal overhead but mandates specific interfaces on both storage devices and hosts.

As a channel architecture replacing parallel SCSI, Fibre Channel succeeds within the circumference of the data center but has difficulty scaling to higher populations of storage devices and extending over long distances. From a customer standpoint, one of the major inhibitors to SAN adoption is the isolation between Fibre Channel solutions and mainstream IP networking. Since Fibre Channel imposes its own architecture, vocabulary, management tools, and support skills, all at a higher cost, SAN adoption is largely limited to top-tier enterprise networks.

The contradiction between the viability of networked storage and the complications associated with Fibre Channel is being resolved by second-generation SANs based on transmission control protocol (TCP)/IP. IP storage brings SANs into mainstream IP networking, enabling administrators to leverage their existing infrastructure and skills to deploy productive shared storage solutions. There are many challenges associated with Fibre Channel SANs and the new IP storage technologies that hope to supersede them.

Although Fibre Channel has a proven value for small SAN configurations, vendors are experiencing difficulty in providing scalable solutions for enterprise applications. Interoperability and management are most often cited as major problems, but at a more fundamental level, the Fibre Channel architecture itself poses significant issues that primarily relate to the Fibre Channel switches that form the fabric and not to the end devices such as storage arrays. While fabric-switch vendors struggle to overcome such issues as scalability, these same problems are either nonexistent or have long been resolved by IP switch vendors.

Scalability. Fibre Channel standards allow for up to 239 fabric switches in a single network. Theoretically, up to 15.5 million devices can be attached to such a fabric. In reality, however, it is difficult to achieve a stable configuration of 20-30 switches in a single network. Unlike IP networking, which uses an open shortest path first (OSPF) routing protocol to provide autonomous regions within the network, a Fibre Channel fabric creates a single network space. The Fibre Channel routing protocol, fabric shortest path first (FSPF), is a subset of OSPF but lacks the ability to provide autonomous regions.
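The theoretical ceiling follows from the 24-bit fabric address. As a rough check, the sketch below assumes the usual split of that address into three 8-bit fields (domain, area, and port) with one domain per switch. The function name and parameters are illustrative, and reserved addresses are ignored, so the result is an upper bound rather than an exact count.

```python
def fabric_address_ceiling(domains=239, area_bits=8, port_bits=8):
    """Upper bound on device addresses in one fabric (reserved addresses ignored)."""
    return domains * (2 ** area_bits) * (2 ** port_bits)


# 239 switch domains, each with 256 areas of 256 ports.
print(fabric_address_ceiling())
```

Under these assumptions the bound comes to 15,663,104 addresses, the same order of magnitude as the figure quoted above.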

With no means to isolate disruptions, transient errors in one part of the fabric may propagate throughout the network. That is analogous to broadcast storms characteristic of bridged Layer 2 LANs before the advent of IP routing. The consequence is that adding fabric switches to a single network introduces further vulnerability to disruption and, should a problem occur, extended recovery or convergence times.

Figure 1. Scaling SANs to high populations by leveraging director-class Gigabit Ethernet switches.

In addition, due to Fibre Channel's addressing mechanism for fabric switches, temporary disruptions can initiate a process that fragments a single multiswitch fabric into separate SAN islands. Fibre Channel fabrics are designed to require minimal manual configuration, which provides the benefit of reducing administration but can pose problems when connections between switches are momentarily broken.

When connected by expansion ports (E_Ports), Fibre Channel switches undergo a process of principal switch selection. In this scheme, one switch becomes the primary switch in the fabric and allocates address blocks to the remaining switches. In an extreme case, if all E_Port connections break between all switches, each would become its own principal switch and reestablish its own address space. Since this process also causes address reassignment to the attached storage devices, it can be a disruptive reconfiguration for all involved. As the population of a single fabric grows from a handful of devices to several hundred, the potential for disruptive reconfiguration naturally increases.
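The fragmentation described above can be illustrated with a small model. The sketch below is a simplification, not the actual fabric protocol: it assumes each switch carries a (priority, name) pair and that the lowest pair in each connected group becomes principal. The switch names, priorities, and the function `principal_switches` are hypothetical.

```python
from collections import defaultdict


def principal_switches(switches, links):
    """Return {principal: sorted members} for each connected group of switches.

    switches: dict of name -> (priority, wwn); the lowest tuple wins (assumed rule).
    links: iterable of (name_a, name_b) pairs for working E_Port connections.
    """
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen, result = set(), {}
    for start in switches:
        if start in seen:
            continue
        # Depth-first walk collects every switch reachable over E_Ports.
        group, stack = [], [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for peer in neighbors[node] - seen:
                seen.add(peer)
                stack.append(peer)
        principal = min(group, key=lambda name: switches[name])
        result[principal] = sorted(group)
    return result


switches = {
    "sw1": (2, "wwn-01"),
    "sw2": (2, "wwn-02"),
    "sw3": (1, "wwn-03"),
    "sw4": (2, "wwn-04"),
}
healthy = [("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw4")]
one_link_down = [("sw1", "sw2"), ("sw3", "sw4")]

print(principal_switches(switches, healthy))
print(principal_switches(switches, one_link_down))
print(principal_switches(switches, []))
```

With all links up, the model yields one fabric with a single principal switch. Dropping the middle link yields two islands, and dropping every link leaves each switch as its own principal, which is the extreme case described above.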

IP storage switches have addressed these Fibre Channel-specific issues in a number of ways. In terms of device population, IP storage switches leverage large director-class Gigabit Ethernet (GbE) switches to create an IP backbone for the SAN. That enables administrators to leverage existing GbE switches to accommodate hundreds or thousands of storage devices. IP storage switches that can accommodate both Fibre Channel and Internet SCSI (iSCSI) storage devices offer a migration path to grow the enterprise IP SAN over time (see Figure 1). GbE and IP switches have already demonstrated the ability to connect thousands of devices into a coherent network. Now this proven capability can be extended to SANs.

Figure 2. Preserving network stability via open shortest path first for extended IP SANs.

The use of the OSPF routing protocol allows IP SANs to scale to high populations without exposing the network to system-wide failures. Autonomous regions allow connectivity between network segments but also isolate errant nodes on the network (see Figure 2), allowing enterprise networks to scale to thousands of devices yet maintain stability. IP storage switches that bring Fibre Channel end devices into an IP network offer a significantly higher level of stable operation compared to native Fibre Channel fabrics.

Likewise, IP storage switches may provide address translation for Fibre Channel end devices to avoid the potentially disruptive reconfiguration characteristic of Fibre Channel fabrics. Instead of creating a single fabric address space as in Fibre Channel, address translation can preserve local addressing schemes regardless of how many additional switches are connected to the SAN. Address translation mode is part of the Internet Fibre Channel protocol (iFCP), which connects Fibre Channel hosts and targets to an IP SAN. Remote Fibre Channel devices appear as local resources and maintain their addressing, regardless of transient disruptions in the network.
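The idea behind address translation can be shown with a toy model. The sketch below is not the iFCP wire protocol or any vendor's implementation. It assumes only that a gateway hands each remote device a locally significant alias and keeps that alias fixed even if the device's remote address changes. The class, its methods, and all addresses are hypothetical.

```python
class TranslatingGateway:
    """Toy model: stable local aliases mapped to (remote gateway, remote address)."""

    def __init__(self, first_alias=0x010001):
        self._next_alias = first_alias
        self._alias_by_device = {}   # device name -> local alias (never reassigned)
        self._remote_by_alias = {}   # local alias -> (gateway_ip, remote_address)

    def learn(self, device, gateway_ip, remote_address):
        """Record or refresh a remote device; its local alias stays the same."""
        alias = self._alias_by_device.get(device)
        if alias is None:
            alias = self._next_alias
            self._next_alias += 1
            self._alias_by_device[device] = alias
        self._remote_by_alias[alias] = (gateway_ip, remote_address)
        return alias

    def resolve(self, alias):
        """Translate a local alias to where the device currently lives."""
        return self._remote_by_alias[alias]


gateway = TranslatingGateway()
alias = gateway.learn("tape-library", "192.0.2.10", 0x020100)

# The remote fabric reconfigures and the device gets a new remote address.
same_alias = gateway.learn("tape-library", "192.0.2.10", 0x030100)
print(hex(alias), hex(same_alias), gateway.resolve(alias))
```

Local hosts keep using the same alias throughout. Only the gateway's table changes, which is why a disruption at the remote site need not ripple into local addressing.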

Combining traditional IP routing and GbE technology for SANs facilitates scaling to enterprise-wide storage networks without the vulnerabilities of Fibre Channel fabrics. In addition, use of IP enables higher populations of storage devices in the SAN by greatly extending the geographical scope of devices that can be connected into a harmonious network.

Extensibility. Fibre Channel switches were designed for data-center applications, where data-center storage resources are located within a few hundred meters. Although Fibre Channel extension through DWDM can drive longer distances, practical implementations are typically limited to a radius of 50 mi or less. That has made it difficult to extend SAN solutions beyond geologically unstable or disaster-prone areas and fully utilize the inherent benefits of networking for multisite applications such as remote mirroring or remote tape vaulting.

Aside from allowable distance (10 km by Fibre Channel standard over singlemode fiber), Fibre Channel switches are further restricted by port buffering capability. Fibre Channel Class 3 service relies on a credit scheme to ensure proper delivery of frames. A receiving device, for example, may issue five credits to the sender, indicating that five frames can be sent before acknowledgement is required. The receiving device must therefore have sufficient buffering to accept five sequential frames before it responds. Current Fibre Channel switches may provide sufficient buffering to accept 64 credits (frames).

Figure 3. Native Fibre Channel extension cannot sustain bandwidth utilization beyond a metropolitan radius.

Extending over long distances, however, reveals the limitations of Fibre Channel fabric switches for supporting sustained bandwidth utilization. A Fibre Channel switch may issue its maximum of 64 frames, then sit idle as it waits for the long-haul transmission to complete and additional credits to be received from the far end. Bandwidth utilization drops dramatically when native Fibre Channel extension is driven beyond a metropolitan area (see Figure 3).
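The drop-off can be estimated with simple arithmetic. The sketch below is a first-order model under stated assumptions: a 1-Gbit/sec link, full-size frames of 2,148 bytes, signal propagation of about 5 microseconds per kilometer of fiber, and no equipment latency. The function name and default values are illustrative rather than taken from any switch specification.

```python
def credit_limited_utilization(distance_km, credits=64, frame_bytes=2148,
                               link_bps=1.0e9, us_per_km=5.0):
    """Fraction of link bandwidth a credit-limited sender can sustain."""
    frame_time = frame_bytes * 8 / link_bps           # seconds to send one frame
    round_trip = 2 * distance_km * us_per_km * 1e-6   # out and back, in seconds
    # The first credit returns one frame time plus one round trip after sending
    # starts; until then at most `credits` frames can be on the wire.
    return min(1.0, credits * frame_time / (frame_time + round_trip))


for km in (10, 80, 500, 4000):
    print(f"{km:>5} km: {credit_limited_utilization(km):.0%}")
```

Under these assumptions, 64 credits keep the link full out to roughly 100 km. At 500 km the sender spends most of each round trip idle, and sustained utilization falls to about a fifth of the link rate.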

IP storage switches are now capable of supporting full-link utilization for extended SANs spanning hundreds or thousands of miles. As demonstrated by the Promontory Project, facilitating coast-to-coast IP storage interoperability at 2.5 Gbits/sec on a 10-Gbit/sec link, block storage data originating on both Fibre Channel and iSCSI systems can sustain full gigabit saturation on a more than 5,000-mi roundtrip. That is accomplished by the use of robust TCP/IP for data integrity and very deep buffers within IP storage switches. Some products offer 250 Mbytes of memory for buffering thousands of frames in transit, a feature that is optimal for streaming applications such as tape backup or content distribution.
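Why deep buffers matter can be seen from the bandwidth-delay product, the amount of data that must be in flight to keep a link full for one round trip. The sketch below assumes a 1-Gbit/sec stream, about 5 microseconds of propagation per kilometer of fiber, and no equipment latency. The function name and defaults are illustrative.

```python
KM_PER_MILE = 1.609344


def bytes_in_flight(round_trip_miles, link_bps=1.0e9, us_per_km=5.0):
    """Bandwidth-delay product: bytes needed to fill the link for one round trip."""
    round_trip_s = round_trip_miles * KM_PER_MILE * us_per_km * 1e-6
    return link_bps * round_trip_s / 8


needed = bytes_in_flight(5000)
print(f"{needed / 1e6:.1f} Mbytes in flight on a 5,000-mi round trip")
```

Under these assumptions a 5,000-mi round trip needs about 5 Mbytes in flight, well within a 250-Mbyte buffer but far beyond 64 full-size frames, which amount to roughly 137 kbytes.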

Use of TCP for transport of block storage data over wide-area connections ensures data integrity and minimal recovery time if a packet is dropped. TCP allows IP storage to be implemented even in potentially congested or shared WANs or at lower link speeds, without data loss. Compared to Fibre Channel's error-recovery mechanism that requires retransmission of an entire sequence of frames if a single frame is lost, TCP's more granular packet-level recovery offers a distinct advantage for extending SANs with minimal overhead.
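The difference in recovery granularity can be put in rough numbers. The sketch below is a first-order estimate, not a model of either protocol's actual timers or windows: it assumes each frame is lost independently with the same probability and ignores losses among the retransmissions themselves. The function name and sample values are illustrative.

```python
def expected_resent_frames(frames_per_sequence, loss_probability):
    """Expected frames resent per sequence under two recovery granularities."""
    n, p = frames_per_sequence, loss_probability
    # Sequence-level recovery: any single loss forces all n frames to be resent.
    sequence_level = n * (1 - (1 - p) ** n)
    # Packet-level recovery: only the lost frames themselves are resent.
    packet_level = n * p
    return sequence_level, packet_level


seq, pkt = expected_resent_frames(frames_per_sequence=100, loss_probability=0.001)
print(f"sequence-level: {seq:.2f} frames, packet-level: {pkt:.2f} frames")
```

For a 100-frame sequence and a loss rate of one frame in a thousand, sequence-level recovery resends about 9.5 frames per sequence on average, against 0.1 for packet-level recovery. The gap widens as sequences get longer or links get lossier.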

In addition, the establishment of autonomous regions via OSPF minimizes potential disruptions as enterprise-wide SANs extend to multiple geographically separated locations. Extending SANs with IP storage switches avoids the issues associated with Fibre Channel disruptive reconfiguration, maximizing uptime.

Interoperability. Although Fibre Channel host bus adapters and storage arrays have achieved a fairly high degree of interoperability, the lack of interoperability between Fibre Channel switches inhibits SAN adoption. Recent agreements by fabric-switch vendors on implementation of the FSPF routing protocol and exchange of zoning information as defined in the NCITS/ANSI FC-SW2 standard have enabled departmental fabric switches from one vendor to be attached to director-class fabric switches from another vendor via an open systems mode of operation. Still, the preservation of proprietary modes alongside standards-compliant modes does not make connectivity seamless for the end user.

IP storage switches supporting both Fibre Channel end devices and E_Port attachment to fabrics provide flexibility in deploying SAN solutions that may over time support a mix of Fibre Channel and iSCSI devices. IP storage switches, however, must also provide interoperability with a variety of GbE and IP router products so that storage data can be brought into an IP backbone or WAN link. IP storage switches must therefore comply with IETF, IEEE, and NCITS/ANSI standards as well as IP storage-specific protocols that are still on standards track within IETF.

IP storage protocols include the iFCP, which integrates Fibre Channel devices into IP SANs; the iSCSI protocol for IP storage and hosts; and Fibre Channel-over-IP protocol for simple tunneling of Fibre Channel frames. The University of New Hampshire's iSCSI Consortium and the IP Storage Forum of the Storage Networking Industry Association (SNIA) are driving standards compliance and interoperability for these protocols.

Interoperability with both Fibre Channel and GbE/IP products enables IP storage switches to leverage director-class GbE switches to build large IP SANs. IP storage switches act as edge switches to the central IP core, bringing both Fibre Channel and iSCSI nodes into a homogeneous IP infrastructure.

Fibre Channel technology has reached a plateau of basic functionality that has enabled deployment of standalone shared-storage solutions. Advanced functionality for Fibre Channel, however, is either still in standards development or has not yet been formulated. One of the primary benefits of IP storage switching is the ability to apply advanced IP functions such as quality of service (QoS) and security to storage traffic. These features have a proven track record in IP networking, are generally available, and are commonly deployed and supported within mainstream enterprise networks.

While Fibre Channel fabrics still lack robust QoS support, a variety of QoS features are readily available in the IP world. The IEEE 802.1p/Q standard widely supported in GbE switches, for example, allows administrators to assign virtual LAN tagging to specific devices with prioritization of mission-critical data streams.

In an IP SAN environment, an administrator can assign higher priority to online transaction processing storage traffic and lower priority to potentially contending but less critical tape backup streams (see Figure 4). QoS prioritization can be applied to streams that originate on Fibre Channel end devices or fabrics attached to IP storage switches. Thus, while native Fibre Channel lacks prioritization support, IP storage switching can immediately provide it to Fibre Channel-sourced data. Higher levels of QoS are also available in IP via the resource reservation protocol and MPLS options provided by IP router manufacturers.

Figure 4. Using 802.1p/Q virtual LAN prioritization to provide quality of service for mission-critical storage traffic.
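The prioritization travels in a 4-byte tag inserted into each Ethernet frame: a 16-bit tag protocol identifier (0x8100) followed by a 3-bit priority, a 1-bit drop-eligibility flag, and a 12-bit VLAN ID. The sketch below builds that tag. The function name, VLAN numbers, and the choice of priority 5 for transaction traffic and priority 1 for backup are illustrative assumptions, not recommendations from any standard or vendor.

```python
import struct

TPID_8021Q = 0x8100  # tag protocol identifier for an 802.1Q-tagged frame


def vlan_tag(priority, vlan_id, drop_eligible=False):
    """Build the 4-byte 802.1Q tag carrying an 802.1p priority and a VLAN ID."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits (0-7)")
    if not 0 <= vlan_id <= 4095:
        raise ValueError("VLAN ID must fit in 12 bits (0-4095)")
    tag_control = (priority << 13) | (int(drop_eligible) << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tag_control)


oltp_tag = vlan_tag(priority=5, vlan_id=100)    # hypothetical OLTP storage VLAN
backup_tag = vlan_tag(priority=1, vlan_id=200)  # hypothetical tape backup VLAN
print(oltp_tag.hex(), backup_tag.hex())
```

A switch that honors the priority field can then queue frames from the transaction VLAN ahead of the backup VLAN when both contend for the same link.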

In addition, although Fibre Channel fabrics still lack support for data security and encryption, IP storage switching can apply a variety of security mechanisms, including access control lists and IP Security (IPSec). With native IP storage protocols such as iFCP and iSCSI, off-the-shelf IP firewall and encryption products can be used to safeguard data in transit for both data center and storage over WAN applications. Since IPSec is frequently used for sensitive data in mainstream IP networking, it is easily applied to IP storage using existing equipment and support staff.
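An access control list for storage traffic works the same way as for any other IP service. The sketch below is a minimal first-match filter, assuming iSCSI targets listen on TCP port 3260 and that initiators live in a hypothetical 10.1.0.0/16 subnet. The rule format and function name are illustrative and do not follow any particular router's syntax.

```python
import ipaddress

ISCSI_PORT = 3260  # registered TCP port for iSCSI targets

# Rules are checked in order; the first match decides. Subnets are hypothetical.
ACL = [
    ("permit", ipaddress.ip_network("10.1.0.0/16"), ISCSI_PORT),
    ("deny", ipaddress.ip_network("0.0.0.0/0"), ISCSI_PORT),
]


def acl_decision(source_ip, dest_port, rules=ACL):
    """Return 'permit' or 'deny' for a connection attempt (default deny)."""
    source = ipaddress.ip_address(source_ip)
    for action, network, port in rules:
        if dest_port == port and source in network:
            return action
    return "deny"


print(acl_decision("10.1.4.7", ISCSI_PORT))   # initiator inside the storage subnet
print(acl_decision("192.0.2.9", ISCSI_PORT))  # host outside the storage subnet
```

Because the filter sees ordinary IP addresses and TCP ports, the same rules apply whether the traffic originated on an iSCSI host or on a Fibre Channel device behind an IP storage switch.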

IP storage switches bring SANs into mainstream enterprise networking. Designed for Fibre Channel and GbE/IP attachment, IP storage switches optimize use of familiar IP infrastructures, management, and skill sets to build IP SAN backbones while supporting a variety of storage end devices such as Fibre Channel and iSCSI.

Immediate benefits of the IP storage switch architecture include scalability, extensibility, interoperability, and application of advanced QoS and security features to storage traffic. By bringing Fibre Channel storage into IP, SANs may be scaled to hundreds or thousands of devices using more economical GbE director switches. By converting storage traffic to uniform IP format, SANs can be extended to metropolitan and wide areas without proprietary equipment or specialized support issues. Use of OSPF routing ensures stability of extended SANs. And by leveraging the well-understood QoS and security features of IP networking, storage data for mission-critical applications can be transported over data-center, metro, and wide-area IP networks.

IP storage switches provide the migration path from the Fibre Channel SANs of today to the IP SANs of tomorrow. By abolishing distance, performance, and scalability limitations imposed by Fibre Channel, IP storage switches are enabling new storage applications that can span the entire enterprise.

Tom Clark is technical marketing director at Nishan Systems (San Jose, CA), a board member of the Storage Networking Industry Association (SNIA), and co-chair of the SNIA Interoperability Committee. He can be reached via Nishan's Website.