Ring around the metro

SPECIAL REPORTS / Access Networks

The resilient-packet-ring architecture promises more efficient data transport through the metro and access networks.

JOHN HAWKINS, Nortel Networks

Many have discussed the transformation of the metro-access network from a TDM network built to carry voice traffic to a data access point for a variety of next-generation broadband services. Experts say these services can revolutionize the enterprise and the way IT works by outsourcing applications, interconnecting LAN islands, and providing access to the Internet with familiar and inexpensive interfaces. However, interconnecting data across the MAN remains a significant barrier to the realization of this vision in a profit-driven carrier environment.

Resilient-packet-ring (RPR) technology has emerged as the front-runner to overcome these challenges and facilitate the delivery of data services using Ethernet interfaces in a carrier environment. RPR technology addresses the current state of the MAN with an eye toward migrating today's network to a forward-looking, carrier-class architecture. Most importantly, it tackles the need to break the bandwidth bottleneck found in today's MAN through efficient use of bandwidth and reliable transmission.

A resilient packet ring extends the values of packet-based, connectionless networking (low cost and simplicity) to the fiber ring. In addition, it brings with it the ideals championed by optical networks, namely high bandwidth, reliability, and scalability. The result is a resilient, packet-oriented, ring-based solution: a resilient packet ring.

Previous technologies have attempted, with limited success, to alleviate the bottlenecks that occur within the metro. Frame relay techniques have sufficed for low-bandwidth connections for many years and will continue to do so. ATM, a multiprotocol solution, provided end-to-end connectivity but suffered from complexity and cost issues. Carriers have deployed customized solutions for specific applications such as ESCON and FICON, and even delivered Ethernet services over a combination of FDDI and FOIRL circuits. Yet these approaches all suffer from scalability issues and a lack of standardized implementations.

Therefore, carriers continue to force-fit data traffic onto an infrastructure built primarily for voice. While voice remains a critical application in the metro, the growth of data due to the explosion of Internet-based applications suggests carriers adopt a new approach to capitalize on data's unique characteristics. For instance, due to the bursty nature of data, carriers need a new and improved aggregation strategy to make efficient use of expensive fiber in the metro. Carriers desire broadcast capabilities (one-to-many or many-to-many) for a variety of applications, such as digital video broadcasts, as well as less glamorous, but very critical applications, such as address flooding. Creating these capabilities as an overlay to the voice network limits the capability and reliability of the network while driving up its cost.

Figure 1. Time-to-market translates into real revenue as each quarter of delay means sacrificed income and late entry means sacrificed pricing power.

This force fit has created a bottleneck in the MAN in terms of its raw capacity. The predominant LAN technology, Ethernet, marches along at roughly a 10x increase in capacity every three to five years, with the most recent implementation capable of 10 Gbits/sec, but the MAN has not kept up. The network core, meanwhile, can switch many Tbits/sec thanks to advances in optical technologies such as SONET and DWDM. The MAN continues to measure its bandwidth capacity in megabits per second, limiting the end-to-end performance of a WAN to a fraction of its true potential.

Perhaps most importantly, the MAN prevents the rollout of new data services in a timely fashion. The overlay technologies that currently carry data in the MAN are difficult to provision and manage and limit the rollout of new services because of the gap between the order and turn-up of a new circuit. Today's T1 circuits (common to small and medium enterprises), for example, can take weeks or even months to provision.

This long lead time translates into lost revenue for a service provider creating new services that depend on these circuits. Figure 1 illustrates two mechanisms leading to the loss of revenue. Obviously, the delay in offering the service means revenues are not flowing. Less obvious, but equally important, carriers lose pricing power: by the time the delayed service becomes available, it will likely have to be offered at a lower price.
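
The two mechanisms can be made concrete with a back-of-the-envelope calculation. The Python sketch below is illustrative only: the quarterly revenue, the price-erosion rate, the eight-quarter market window, and the function name are all assumptions chosen for the example, not figures taken from Figure 1.

```python
def revenue_over_window(quarterly_revenue, delay_quarters, window_quarters,
                        price_erosion_per_quarter):
    """Total revenue collected inside a fixed market window.

    Assumptions (illustrative only): the market price falls by a fixed
    fraction every quarter, and a service launched late enters at the
    already-eroded price and earns nothing during the delay.
    """
    total = 0.0
    for quarter in range(delay_quarters, window_quarters):
        price_factor = (1.0 - price_erosion_per_quarter) ** quarter
        total += quarterly_revenue * price_factor
    return total


# Hypothetical service: $1M per quarter at launch price, 5% erosion per quarter.
on_time = revenue_over_window(1_000_000, 0, 8, 0.05)
late = revenue_over_window(1_000_000, 2, 8, 0.05)

print(f"launched on time:       {on_time:,.0f}")
print(f"launched 2 quarters late: {late:,.0f}")
print(f"revenue lost to delay:  {on_time - late:,.0f}")
```

Under these assumptions the late entrant forfeits exactly the two quarters in which the price was highest, which is why each quarter of delay costs more than an average quarter's income.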

So the bottleneck of the MAN remains a problem from the points of view of network capacity, time-to-market, and revenue.

Therefore, carriers must consider changes for a better solution. In existing carrier networks, that change should be evolutionary. And before embarking on significant and long-lasting change, carriers should consider the desired end result.

Obviously, service providers want to increase their revenue streams while continuing day-to-day operations with current service offerings. Consider as examples the following revenue-maximizing factors:

Customer loyalty. It's more expensive to gain a new customer than to keep an old one. Service providers want to minimize the number of customers lost because of price or because the current technology simply "runs out of gas."
Value-added services. Maximizing value-added services avoids commodity status, in which differentiation is based solely on price.
Customer service. The network should allow the service provider to be proactive in dealing with customer inquiries and issues as they arise.

Several network attributes affect these customer behaviors. Here are a few key questions that need to be asked before making a network decision:

What services will be offered? As mentioned, voice services aren't going away. To the contrary, they remain the "cash cow" for many networks. For one thing, customers have a clear expectation that these services aren't free (unlike many data services). Other legacy services may need to be considered as well, since migrating to a new Internet Protocol-centric infrastructure may not make sense for all of them. Most carriers will need to allow flexibility in supporting legacy services while migrating to a new network in support of data services.

How reliable does the network need to be? Revenue-rich services will be built on carrier-grade platforms that are expected to provide virtually bulletproof reliability: the so-called five-nines (99.999%) network availability, which translates to a little more than five minutes of downtime per year. "Carrier grade" implies the sophisticated hardware and software that have long made up voice networks and that will increasingly be required for high-value data services as well.
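
The downtime implied by an availability figure is simple arithmetic. As a minimal Python sketch (the function and constant names are illustrative, and an average year of 365.25 days is assumed):

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # average year, including leap years


def downtime_minutes_per_year(availability):
    """Permitted downtime per year for a fractional availability (e.g. 0.99999)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


for label, availability in [("three nines", 0.999),
                            ("four nines", 0.9999),
                            ("five nines", 0.99999)]:
    minutes = downtime_minutes_per_year(availability)
    print(f"{label}: {minutes:.2f} minutes of downtime per year")
```

Each additional nine cuts the downtime budget by a factor of 10; at five nines the budget is about 5.26 minutes per year.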

What quality of service will be offered? Service-level agreements (SLAs) specify the contractual obligations of the service provider to the end user and set forth penalties for noncompliance. Customers will pay a premium for services that guarantee a level of bandwidth, latency, availability, response time, etc. Service providers must offer these levels of quality and, perhaps equally important, provide end users with appropriate metrics proving the value they receive.

Is fiber available in my metropolitan area of service? While news of a fiber glut surfaces from time to time, the truth is that in most North American metropolitan areas, fiber is still a significant expense for the network operator. The successful MAN solution will need to take into account the availability of this resource and be able to extract the most efficiency from the installed plant.

How easily can the network grow? The scalability of the network will determine its longevity. Networks can scale in terms of traffic rates (the number of bits carried) as well as traffic sources (number of customers supported). A robust MAN solution will need to pack as many bits and as many customers as efficiently as possible into a fiber strand, yet maintain the quality of service and efficiency described above.

How easy is it to set up and manage the network? It bears mentioning again that the ability to quickly set up new services and upgrade old ones generates added revenue opportunities (see Figure 2). The cost of network operation, for instance, is estimated to be over three-quarters of the full network cost and must therefore be factored into the buying decision. The least expensive solutions (in terms of capital outlay) may turn out to be more expensive in the long run if, for instance, new technical and engineering staff need to be hired. Manageability of the network is as important as the features/services it will deliver.

Figure 2. Quick service setup avoids long payback periods and customer churn while generating operational savings and additional revenue opportunities.

Is the solution available from multiple vendors? This consideration becomes most important as the rollout of the network progresses and dependence on a single source of network equipment raises concerns of over-exposure to a single implementation. Standards-based, multivendor interoperability means solutions can be scaled into a worldwide offering, yet tailored to a given network challenge on a given part of the network. The result will be a simple and cost-effective solution.

These wide-ranging factors trade off against each other in many instances. The most reliable network is not going to be the most efficient, for instance. Similarly, the most efficient may not be the most manageable or scalable.

There are other factors not under consideration here, such as the nature of the traffic flows (meshes, stars, hubs) and the physical arrangement of the traffic sources and destinations. Since these tend to be network- and customer-specific, they require more in-depth analysis. There are multiple solutions to address the MAN network scenario, and all of the factors mentioned here contribute to determining which is optimal. Whatever the solution, it must be flexible and manageable, as the selection factors themselves are never static.

Several pre-standard RPR offerings are beginning to emerge based on the proven SONET physical infrastructure already available in virtually all metro fiber rings. Some offerings allow a subset of the ring's bandwidth (up to and including all of it) to be used as shared bandwidth among many sources of packet data. The service provider can quickly roll out Ethernet services in addition to traditional TDM (T1, T3, OC-N, etc.) services using the same infrastructure, platforms, and fiber media (see Figure 3). Thus, the service provider can continue to deliver these traditional (but still profitable) services over the same optical links and network elements.

Figure 3. Bandwidth on the fiber ring can be set aside as either shared data or traditional TDM services.

Provisioning complex and wasteful point-to-point circuits to assemble physical meshes becomes unnecessary as the system enables sharing of the bandwidth on an optical ring. The resulting logical mesh yields significant fiber efficiency that can be quickly upgraded and expanded as the number of users and their bandwidth requirements grow.

The RPR architecture offers a self-healing topology that protects against fiber cuts and node failures by providing duplicate, geographically diverse paths for packet data transport. If a cable cut or other fault impairs a preferred primary route, Layer 2 protection switching automatically forwards affected data packets to the alternate route on the opposite side of the ring. Switching to the protection route requires less than 50 msec (the benchmark set by SONET restoration mechanisms, which has come to be expected in most MAN applications today), with no noticeable service interruption. Thus, with the advantage of the RPR architecture, carriers can offer their customers "five-nines" availability for mission-critical data services.
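
The path-selection idea behind this protection switching can be sketched in a few lines of Python. This is a simplified illustration, not the protection protocol of any product or standard: the node and span numbering, the function names, and the rule of preferring the shorter healthy direction are all assumptions made for the example, and timing (the 50-msec budget) is not modeled.

```python
def spans_on_path(src, dst, num_nodes, clockwise):
    """Return the set of spans crossed going from src to dst in one direction.

    Span i is the fiber segment between node i and node (i + 1) % num_nodes.
    """
    spans = set()
    node = src
    while node != dst:
        if clockwise:
            spans.add(node)
            node = (node + 1) % num_nodes
        else:
            node = (node - 1) % num_nodes
            spans.add(node)
    return spans


def choose_direction(src, dst, num_nodes, failed_spans=frozenset()):
    """Pick 'cw' or 'ccw': the shorter path that avoids every failed span.

    Returns None when both directions are cut. Ties go to 'cw'.
    """
    candidates = []
    for name, clockwise in (("cw", True), ("ccw", False)):
        spans = spans_on_path(src, dst, num_nodes, clockwise)
        if not spans & set(failed_spans):
            candidates.append((len(spans), name))
    if not candidates:
        return None
    return min(candidates, key=lambda candidate: candidate[0])[1]


# Six-node ring, traffic from node 0 to node 2.
print(choose_direction(0, 2, 6))                      # healthy ring
print(choose_direction(0, 2, 6, failed_spans={1}))    # cut between nodes 1 and 2
```

On the healthy ring the two-hop clockwise route is chosen; after the cut, the same traffic is steered the long way round in the opposite direction.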

To support multiple SLAs, RPR must support a means of carrying level-of-service information end-to-end. In one implementation, the system offers flexibility to support customer-generated Ethernet frames either with or without IEEE 802.1p class-of-service priority. In the former case, the eight IEEE 802.1p priorities are mapped to internal edge system priorities while the 802.1p tag itself is carried end-to-end. A fixed, provisionable priority assignment is given to untagged frames, with best effort being the default.
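
The classification step described here can be sketched as a small lookup. In the Python below, the three internal class names and the particular assignment of 802.1p values to them are hypothetical; only the structure (eight tagged priorities mapped to internal priorities, and a provisionable default for untagged frames that falls back to best effort) follows the description above.

```python
from typing import Optional

# Assumed mapping of the eight 802.1p priority values (0-7) to hypothetical
# internal edge-system classes; a real system's classes and mapping will differ.
PCP_TO_INTERNAL = {
    0: "best_effort", 1: "best_effort", 2: "best_effort",
    3: "business", 4: "business",
    5: "real_time", 6: "real_time", 7: "real_time",
}


def classify(pcp: Optional[int], untagged_default: str = "best_effort") -> str:
    """Return the internal class for a customer frame.

    pcp is the frame's 802.1p priority (0-7), or None for an untagged frame.
    untagged_default is the class provisioned for untagged frames.
    """
    if pcp is None:
        return untagged_default
    if pcp not in PCP_TO_INTERNAL:
        raise ValueError(f"802.1p priority must be 0-7, got {pcp}")
    return PCP_TO_INTERNAL[pcp]


print(classify(6))                              # tagged frame
print(classify(None))                           # untagged, default provisioning
print(classify(None, untagged_default="business"))  # untagged, provisioned higher
```

The sketch only selects an internal class; the customer's original tag is left untouched, consistent with the tag being carried end-to-end.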

In addition, RPR is envisioned to offer support for guaranteed bandwidth, minimal latency (end-to-end delay), and best-effort services. Currently, how that is done depends on the implementation, and standardization efforts are under way to address vendor interoperability issues.

One of the hallmarks of the nascent RPR technology is its ability to extract significant efficiencies from the network. It does so primarily by means of three statistical gain mechanisms:

Statistical multiplexing. Since packet traffic tends to be "bursty" in nature, the shared bandwidth of RPR eliminates the wasted bandwidth seen in circuit-based connections, where fixed bandwidth is "consumed" even in periods of low packet traffic. This same aggregation effect is provided by a packet switch, the key difference being that RPR comprises a distributed switch with the ring standing in as a backplane.
Spatial reuse. RPR frames traverse the shortest path between communicating nodes on the fiber ring. Therefore, bandwidth is consumed only on those link segments that interconnect those nodes, not around the entire ring. This form of statistical gain increases with the number of nodes in a given ring and can (depending on traffic patterns and number of nodes on the ring) increase the effective bandwidth utilization of the ring severalfold.
Ring protection bandwidth. RPR provides traffic protection via Layer 2 mechanisms, making it unnecessary to reserve 50% of the ring's bandwidth for protection. Depending on the types of traffic offered (based in turn on the SLAs negotiated with end users), this reserved bandwidth could be reduced or even eliminated for a potential doubling of bandwidth efficiency.
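
The spatial-reuse gain in particular lends itself to a quick numerical illustration. The Python sketch below compares the load on the busiest span when each flow occupies only the spans on its shortest path with the load when each flow occupies every span of the ring. It is a simplified model under stated assumptions: each span is treated as a single shared resource (the two directions of the ring are not modeled separately), and the traffic matrix and function names are invented for the example.

```python
def shortest_path_spans(src, dst, num_nodes):
    """Spans crossed on the shorter way round the ring.

    Span i is the segment between node i and node (i + 1) % num_nodes.
    """
    cw_hops = (dst - src) % num_nodes
    ccw_hops = (src - dst) % num_nodes
    if cw_hops <= ccw_hops:
        return [(src + k) % num_nodes for k in range(cw_hops)]
    return [(src - k - 1) % num_nodes for k in range(ccw_hops)]


def busiest_span_load(flows, num_nodes, spatial_reuse=True):
    """Peak load on any span, for flows given as (src, dst, mbps) tuples."""
    load = [0.0] * num_nodes
    for src, dst, mbps in flows:
        if spatial_reuse:
            spans = shortest_path_spans(src, dst, num_nodes)
        else:
            spans = range(num_nodes)  # flow occupies the entire ring
        for span in spans:
            load[span] += mbps
    return max(load)


# Eight-node ring with four 100-Mbit/sec flows between neighboring nodes.
flows = [(0, 1, 100), (2, 3, 100), (4, 5, 100), (6, 7, 100)]
print(busiest_span_load(flows, 8, spatial_reuse=True))   # 100.0
print(busiest_span_load(flows, 8, spatial_reuse=False))  # 400.0
```

With this deliberately favorable traffic pattern the four flows never share a span, so the ring carries four times the traffic for the same peak span load. Less local traffic patterns yield a smaller gain.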

Together, these three mechanisms allow more users and more traffic per user on a given fiber ring, which contributes to scalability and flexibility on the network (no more nailed-up circuits required). Carriers can quickly allocate resources to an individual customer based on the negotiated SLA. The distributed-switch approach also allows service providers to quickly scale the RPR network when new customers are added or when traffic increases. The addition of a single new customer interface would traditionally require provisioning of N-1 point-to-point circuits, but is handled automatically by RPR's auto-discovery mechanism.
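
The provisioning burden of the traditional approach grows quickly with the number of sites, as a short calculation shows. The Python sketch below assumes a full mesh of point-to-point circuits, one per pair of customer interfaces; the function names are illustrative.

```python
def circuits_for_full_mesh(num_sites):
    """Point-to-point circuits needed to fully mesh num_sites interfaces."""
    return num_sites * (num_sites - 1) // 2


def new_circuits_for_added_site(num_sites_after):
    """Circuits to provision when one site joins, giving num_sites_after in total."""
    return num_sites_after - 1


for n in (4, 8, 16, 32):
    print(f"{n:2d} sites: {circuits_for_full_mesh(n):3d} circuits in total, "
          f"{new_circuits_for_added_site(n):2d} provisioned for the last site added")
```

The total grows with the square of the number of sites, whereas on a shared ring the new interface simply joins the ring.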

Before RPR can benefit from widespread use, standards need to be developed that facilitate interoperability among multiple vendors and across international boundaries. As is the case with all shared-media networking technologies, a control function is needed to define which network elements can have access to the media at which times. This media-access-control (MAC) protocol is required for RPR, as the behavior of a ring-based medium is significantly different from that of point-to-point implementations. In addition, physical-layer definitions (or PHYs) that characterize the medium itself are required (or existing PHY definitions need to be referenced) before interoperability is possible. The Institute of Electrical and Electronics Engineers (IEEE) 802.17 Working Group and the Resilient Packet Ring Alliance are working in these areas.

The IEEE 802.17 Working Group brings together the leading technical experts in the field of networking to generate a standard specification for the RPR MAC and PHY(s). But before forming a standards committee, proponents must justify its existence to the IEEE sponsor executive committee. They need to rationalize the proposed standard against five well-established criteria:

Broad market potential. A new standard should have broad applications, which can be addressed by multiple vendors to the benefit of numerous users.
Compatibility. The standard should be compatible with the family of 802 LAN standards (embodied in the 802.1 architecture, management, and internetworking requirements).
Distinct identity. The standard must be substantially different from other 802 standards (for example, Ethernet) and provide a unique solution to the problem addressed.
Technical feasibility. The standard must demonstrate technical feasibility, including its implementation and testing.
Economic feasibility. The standard must demonstrate economic feasibility, taking into account known cost factors, including installation costs.

Having completed this work, IEEE 802.17 is now well on its way to defining its objectives and expects to release a first draft of the standard in early 2002. The final draft is estimated to be available a year later. Interest in the group has been high, as some 120 participants from over 60 corporations are voting members of the working group.

The efforts of the IEEE will help accelerate the approval of the standard and hasten availability of multivendor solutions. Several key objectives have received early support, including destination stripping (spatial-reuse concept), dual counter-rotating ring architecture, 1-Gbit/sec and 10-Gbit/sec data rates, multicast traffic, SONET- and Ethernet-based physical layers, protection switching of 50 msec or better, service-layer and physical-layer independence, and multivendor interoperability on a given ring.

To complement the activities of the IEEE working group, the RPR Alliance was formed as a multivendor industry consortium to focus on marketplace acceptance for the new technology. The RPR Alliance represents all of the major RPR industry suppliers who collaborate to facilitate worldwide multivendor interoperability for their solutions. The charter of the RPR Alliance includes the following:

Educate users, the press, and the public through Web postings, seminars, presentations, trade press articles, technical publications, and press releases.
Support the RPR standards effort conducted in the IEEE 802.17 Working Group.
Promote industry awareness, acceptance, and advancement of the RPR standard.
Provide resources to facilitate convergence and consensus on technical specifications.
Accelerate the adoption and usage of RPR products and services.
Provide resources to establish and demonstrate multivendor interoperability and generally encourage and promote interoperability and interoperability events.
Facilitate communication between suppliers and users of RPR technology and products.

The metropolitan fiber ring continues to experience significant traffic and revenue growth despite periodic slowdowns in other areas of communications. This fact, coupled with the need to improve the MAN's ability to carry data, suggests continued market and customer demand for network and bandwidth solutions enabled by RPR technology.

John Hawkins is senior marketing manager, optical Ethernet, at Nortel Networks (Atlanta). He is also a member of the 802.17 working group and a board member of the RPR Alliance.