Untangling the web of fiber-optic Internet backbone approaches

Sept. 1, 1998


Gordon C. Saussy

Torrent Networking Technologies

The popular theme for 1998, and perhaps the rest of the decade, is convergence--convergence of traditional local area network-based data networking with traditional telecommunications networks into a supercharged Internet. Both incumbent and emerging carriers are aggressively laying out miles and miles of new fiber in anticipation of the bandwidth demand that will be fueled by Internet expansion.

Yet, even as this build-out takes place, standards bodies and ad hoc consortia in the data-communications and telecommunications industries are still battling over different approaches to the architecture of such networks. Two facts are clear: These new networks will carry Internet protocol (IP) traffic, and they'll run over fiber-optic cables. Several alternatives have been proposed for the infrastructure between the IP datagrams and the fiber-optic cable, and it's important to identify the trade-offs and implications that accompany each.

"Protocol sandwich" alternatives

IP is a network-layer protocol. The network layer--Layer 3 in the International Organization for Standardization (ISO) model--provides for end-to-end (host-to-host) delivery of datagrams (or packets) across a complex, hierarchical network (potentially built from many different technologies). Every IP datagram includes a Layer 3 destination address that defines a unique endpoint in the global Internet, a Layer 3 source address that defines the originating host, and Layer 4 port numbers that indicate which upper-level applications within each host are exchanging the datagram.
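The Layer 3 and Layer 4 fields just described sit at fixed offsets in the datagram. As a minimal sketch (assuming an IPv4 header with no options, carrying TCP or UDP; the sample header bytes are purely illustrative):

```python
import struct

def parse_ipv4_datagram(raw: bytes):
    """Extract the Layer 3 addresses and Layer 4 ports from a raw
    IPv4 datagram (assumes a TCP or UDP payload)."""
    ihl = (raw[0] & 0x0F) * 4                     # header length in bytes
    src, dst = struct.unpack("!4s4s", raw[12:20])  # Layer 3 addresses
    src_port, dst_port = struct.unpack("!HH", raw[ihl:ihl + 4])  # Layer 4 ports
    dotted = lambda b: ".".join(str(x) for x in b)
    return dotted(src), dotted(dst), src_port, dst_port

# A hypothetical 20-byte header plus TCP ports: 10.0.0.1 -> 192.168.1.2
hdr = bytes([0x45, 0, 0, 40, 0, 0, 0, 0, 64, 6, 0, 0,
             10, 0, 0, 1, 192, 168, 1, 2])
seg = struct.pack("!HH", 1024, 80) + b"\x00" * 16
print(parse_ipv4_datagram(hdr + seg))
# -> ('10.0.0.1', '192.168.1.2', 1024, 80)
```

Every router along the path reads the destination address; the port numbers matter only to the two end hosts.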

IP doesn't specify any characteristics underneath the datagram level--it relies on the underlying Layer 2 (Data Link Layer) and Layer 1 (Physical Layer) to establish these characteristics and allows many different options for these layers. Between the IP layer and the physical fiber-optic cable, there are many different Layer 2/Layer 1 "protocol sandwiches" that will solve the problem of transporting IP datagrams across a fiber-optic network. The three sandwich options currently in contention are IP over ATM over SONET over fiber, IP over SONET over fiber, and IP over fiber. Implicit in each sandwich is the option of using wavelength-division multiplexing (WDM) above the fiber-optic cable to create multiple virtual paths along the same optical strand.

IP over ATM over SONET over fiber: This approach maps IP datagrams onto Asynchronous Transfer Mode (ATM) cell streams within an ATM virtual circuit, segmenting each datagram into two or more cells. The ATM virtual circuits are then mapped to point-to-point channels over a Synchronous Optical Network (SONET).

Proponents of this approach point to the many value-added features of both the ATM and SONET layers. ATM standards provide for quality of service (with numerous options), signaling of virtual circuits, multiplexing of IP traffic with other traffic types, and numerous other benefits. SONET mechanisms guarantee resiliency of physical paths within the network and allow for further circuit-level multiplexing.

The primary argument against this approach is the overhead of ATM. A minimum-sized IP packet (64 bytes) maps to two 53-byte cells--that's more than 65% overhead! Even the largest IP packets encounter the fundamental overhead of ATM's cell structure; 5 out of every 53 bytes, or nearly 10% of the data stream, are used for nonpayload functions.
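The cell-tax arithmetic can be checked directly. The sketch below assumes AAL5 encapsulation (an 8-byte trailer, with the result padded to a 48-byte cell-payload boundary); the article doesn't name the adaptation layer, so that detail is an assumption:

```python
import math

CELL = 53          # total ATM cell size in bytes (5-byte header + payload)
PAYLOAD = 48       # payload bytes carried per cell
AAL5_TRAILER = 8   # AAL5 trailer appended before padding (assumed)

def atm_overhead(ip_bytes: int) -> float:
    """Overhead on the wire as a fraction of the IP packet size."""
    cells = math.ceil((ip_bytes + AAL5_TRAILER) / PAYLOAD)
    wire_bytes = cells * CELL
    return (wire_bytes - ip_bytes) / ip_bytes

print(f"{atm_overhead(64):.1%}")   # 64-byte packet -> two cells -> 65.6%
print(f"{5 / 53:.1%}")             # irreducible per-cell header tax -> 9.4%
```

A 64-byte packet occupies two cells (106 bytes on the wire), giving the "more than 65%" figure; even with zero padding, the 5-byte header of every 53-byte cell imposes the ~10% floor.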

ATM's detractors also contend that the quality-of-service benefits of ATM will be provided just as effectively (or even more so) through new quality-of-service mechanisms being defined for IP, since IP is the end-to-end protocol. If ATM isn't used for quality of service, and if there's no requirement for multiplexing non-IP traffic with IP traffic, the case for running IP over ATM is certainly weaker.

IP over SONET over fiber: This approach throws away the ATM layer and instead maps IP datagrams directly to SONET frames with minimal encapsulation. SONET is still used to provide resilient physical paths across the backbone network. Different encapsulation techniques may be used to provide for some multiplexing of different IP data streams.

Relative to IP over ATM, the benefit here is obvious: Nearly all the overhead of "cellification" is recovered, although a small amount of encapsulation overhead remains. The difference is most dramatic for short packets, but some bandwidth is recovered at every packet size. As noted, it's still possible to multiplex packet streams using some sort of identifier in the packet header.

Because variable-length packets rather than fixed-size cells are being transmitted and handled by intermediate nodes, there's statistically more jitter in IP-over-SONET networks. Other aspects of quality of service are still debated, as noted previously. New IP standards for quality of service are focused on end-to-end delivery across IP networks and will work in both IP-over-ATM and IP-over-SONET environments.

IP over fiber: This is the latest approach to be proposed. It actually involves mapping IP datagrams to a simplified SONET-like frame optimized purely for raw data traffic, with all overhead associated with voice delivery stripped out.

The argued benefit here is again more efficient use of bandwidth. If the network is carrying only IP datagrams, and if SONET serves merely as an interface into a WDM network, why spend any of the overhead on multiplexing voice circuits and implementing resilient rings? Instead, allocate the entire payload to IP traffic.

This is certainly the logical conclusion of the IP-over-SONET argument. If the requirement for carrying traditional voice circuits across the backbone is removed, it makes sense to remove all of the overhead associated with this requirement. But as yet there are no standards in place for doing this, so implementers will need to proceed with caution to avoid being locked into a proprietary system.

Building IP networks

There's more to the problem than simply the protocol sandwich that puts IP datagrams onto optical fibers. As stated previously, IP is a network-layer protocol, operating end-to-end in a large, complex hierarchy. Within an IP network, routers perform a "Layer 3 relay" function: They inspect the destination address on incoming packets and determine the appropriate "next hop" by looking into their routing tables. Routers communicate with other routers using an extensive suite of standard protocols (OSPF, BGP, DVMRP, PIM, etc.) to maintain up-to-date information in their routing tables. As networks and vendor product lines have evolved, a number of different architectural paradigms have been proposed for Internet backbone networks.

Routers interconnected by point-to-point links: This arrangement could be called the "classical" approach: Each router in the backbone network connects directly to a "peer" router via a point-to-point wide-area-network (leased-line) link, and all packet handling is done exclusively by the routers. The classical approach is widely deployed today on T1/T3 leased lines and scales readily to OC-3 (155-Mbit/sec) and OC-12 (622-Mbit/sec) SONET links (see Fig. 1).

The primary drawback to this simple approach is scalability. If each router in an N-router mesh connects to every other router point-to-point, N x (N-1)/2 links are required. For a 100-router network, that's 4950 wide-area links--and 9900 high-speed router ports! Besides being expensive to the point of impracticality, this scenario would require every router to dedicate 99 high-speed ports to interconnection, leaving few ports for the actual delivery of traffic. The obvious alternative--a reduced mesh with less than full interconnect--means bottlenecks and increased transit time (router "hops") for some traffic.
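The mesh arithmetic is worth making explicit: counting each bidirectional link once gives N x (N-1)/2 links, while counting router ports (or unidirectional circuits) gives N x (N-1). A quick sketch:

```python
def full_mesh_counts(n: int):
    """Link and port counts for a full point-to-point mesh of n routers."""
    links = n * (n - 1) // 2    # each bidirectional link counted once
    ports = n * (n - 1)         # two router ports consumed per link
    return links, ports

links, ports = full_mesh_counts(100)
print(links, ports)   # -> 4950 9900
```

The quadratic growth is the whole problem: doubling the router count roughly quadruples the number of wide-area links to provision and pay for.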

It's also possible to build a hierarchical mesh, with some routers acting as "hubs" for other routers. This approach can scale more gracefully, but it requires the "hub" routers to operate at very high speeds. Until recently, there were no products offering the necessary density and performance to scale to tens of millions of routed packets per second. While new super-routers may be able to scale point-to-point meshes to gigabit and even terabit levels, a number of other approaches have been developed that also propose to solve the scalability problem.

Routers interconnected by Layer 2 "clouds": This is a refinement of the previous approach. Instead of leased-line interconnections, routers interconnect via Layer 2 switch clouds (ATM switches, frame-relay switches). Within these clouds, Layer 2 addresses (or permanent-virtual-circuit [PVC] labels) are used to switch traffic, which simplifies processing requirements and potentially increases scalability. Instead of using a hierarchy of "hub routers," networks use ATM or frame switches as hubs for router-to-router interconnects (see Fig. 2).

The benefit of this approach has traditionally been speed and scalability. Looking up an IP address in a routing table is complex: Tens of millions of hosts are reachable in the Internet, summarized into tens of thousands of routing-table prefixes, and the "longest-prefix-match" search required over that compressed table has meant slow performance. Simpler Layer 2 switches (such as ATM and frame-relay switches) operate on a much smaller pool of active addresses and can perform simple "exact match" searches. This capability has allowed vendors to build fast, inexpensive switches--two qualities at which routers traditionally didn't excel. Proponents of these clouds also argue that Layer 2 switching is inherently simpler to configure and manage than routing.
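The distinction between the two lookup styles can be shown with a toy routing table (the prefixes and next-hop names below are purely illustrative). A Layer 2 switch does a single exact-match lookup on a label; a router must find the most specific of possibly several matching prefixes:

```python
import ipaddress

# A toy routing table: prefix -> next hop (illustrative entries only)
ROUTES = {
    ipaddress.ip_network("10.0.0.0/8"): "router-A",
    ipaddress.ip_network("10.1.0.0/16"): "router-B",
    ipaddress.ip_network("10.1.2.0/24"): "router-C",
}

def longest_prefix_match(dst: str):
    """Naive linear-scan LPM: test every prefix, keep the most specific
    hit. Real routers use compressed tries precisely to avoid this scan."""
    addr = ipaddress.ip_address(dst)
    best = None
    for net, hop in ROUTES.items():
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None

print(longest_prefix_match("10.1.2.99"))   # -> router-C (the /24 wins)
print(longest_prefix_match("10.9.9.9"))    # -> router-A (only the /8 matches)
```

By contrast, a switch's forwarding step is just `table[label]`: one hash or index lookup, with no notion of "more specific" entries.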

It's interesting to note that most actual implementations of these clouds use the Layer 2 switches to implement "virtual" point-to-point connections between the routers. ATM and frame-relay PVCs replace SONET circuits, but an N-router network still requires N x (N-1) one-way circuits to be defined. There's some bandwidth gained through statistical multiplexing of traffic on these PVCs, but the overall complexity of the network is just as high as in the full-mesh case. This complexity has driven the industry to seek other alternatives.

Traffic-driven cut-through routing: This is a totally different scheme, developed with the goal of minimizing the requirement for complex route lookups in the data path. In this approach, a centralized "route server" handles routing of the first packet (or first few packets) in every host-to-host stream, then enables a direct cut-through path across a switched network (using protocols to communicate with switches and edge routers). ATM Forum Multiprotocol over ATM (MPOA) and Ipsilon's IP Switching are both examples of traffic-driven cut-through architectures. A general model, applicable to either of these specific approaches, is presented in Figure 3.

Here's how it works in more detail: The first packet of a new host-to-host data stream arrives at an edge router in a cut-through backbone. The edge router consults a small local table--a "cache"--and concludes it doesn't know where the destination host resides in the network. It then forwards the packet to the central route server on a default virtual circuit.

The route server inspects the packet and performs a full longest-prefix-match lookup to determine the location of the destination host. It then forwards the packet along to its destination. At the same time, the route server sends a message back to the originating edge router telling it the location of the destination (e.g., the ATM address of the next-hop edge router).

Now the originating edge router can signal a virtual connection (a switched virtual circuit) across the switched backbone and forward traffic directly. All subsequent packets in this stream are forwarded along a cut-through switched virtual circuit, with no requirement for longest-prefix-match lookups.
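The three-step flow above can be condensed into a toy model. This is a sketch of the general mechanism only, not of MPOA or Ipsilon's protocol specifically; the callable standing in for the route server and all names are illustrative:

```python
class EdgeRouter:
    """Toy model of traffic-driven cut-through forwarding."""

    def __init__(self, route_server):
        self.cache = {}                 # dst host -> cut-through label (SVC)
        self.route_server = route_server

    def forward(self, dst, packet):
        if dst in self.cache:
            # Subsequent packets: switched on the cut-through circuit,
            # no longest-prefix-match lookup needed.
            return ("cut-through", self.cache[dst], packet)
        # First packet of a new stream: punt to the route server, which
        # does the full lookup and returns the egress location so an
        # SVC can be signaled for the rest of the stream.
        label = self.route_server(dst)
        self.cache[dst] = label
        return ("via-route-server", label, packet)

server = lambda dst: f"svc-to-{dst}"    # hypothetical resolver
r = EdgeRouter(server)
print(r.forward("hostB", "pkt1")[0])    # -> via-route-server
print(r.forward("hostB", "pkt2")[0])    # -> cut-through
```

The model also makes the coming objection visible: every new destination triggers a round trip to the route server plus circuit signaling, a cost that must be amortized over the stream's packets.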

This approach may seem elegant and powerful, but it's not at all suited to the traffic requirements of Internet backbones. Most Internet host-to-host sessions are short-lived, perhaps on the order of a couple of dozen packets, and within a large Internet backbone there may be tens of thousands of concurrent sessions. With cut-through routing, the signaling traffic created by each session adds an enormous load on the network's capacity and on the processing capacity of its elements. For this reason, there are few proponents of cut-through routing for Internet backbone environments.

Topology-driven label switching: This is another unique approach. Here, routers surround a label-switching cloud (much like the Layer 2 cloud in Figure 2). Routers and switches in the cloud use a dynamic protocol to associate a label (a Layer 2 address) with each "next-hop router" around the cloud. The routers at the edge of the cloud apply these labels to packets as they process them, and switches within the cloud forward using the label alone. Cisco's Tag Switching, Ascend's IP Navigator, and the emerging Multiprotocol Label Switching (MPLS) standard are all versions of topology-driven label switching (see Fig. 4).

This may seem similar to the Layer 2 cloud approach, but it adds a powerful twist. In the Layer 2 cloud model, PVCs had to be administered by the network operator, and each edge router required (N-1) PVCs to reach every other router. In the label-switching model, virtual circuits are dynamically created and associated with each edge router, simplifying the administrative requirements. This task is done using a new "tag distribution protocol" between edge routers and label switches. Tag distribution also can reduce the number of virtual circuits required in a large mesh: Rather than having every router-to-router virtual circuit span the backbone, these circuits can be aggregated at switches and traffic can be merged.
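The division of labor just described can be sketched in a few lines. The edge router resolves the next-hop router via longest-prefix match once, then tags the packet; core switches forward on the label with a simple exact-match lookup. All table contents and names below are illustrative, not taken from any vendor's protocol:

```python
# Label bindings that a tag-distribution protocol would establish:
# one label per next-hop (egress) router around the cloud.
LABEL_FOR_NEXT_HOP = {"egress-1": 17, "egress-2": 18}

# A core switch's forwarding table inside the cloud: label -> output port.
SWITCH_TABLE = {17: "port-3", 18: "port-5"}

def edge_apply_label(next_hop_router: str, packet: str):
    """Edge router: after its one LPM lookup, tag the packet."""
    return (LABEL_FOR_NEXT_HOP[next_hop_router], packet)

def core_switch(labeled):
    """Core switch: exact-match on the label only -- no prefix search."""
    label, packet = labeled
    return (SWITCH_TABLE[label], packet)

print(core_switch(edge_apply_label("egress-1", "pkt")))  # -> ('port-3', 'pkt')
```

Because labels are bound to topology (next-hop routers) rather than to individual traffic flows, no per-session signaling is needed--the key difference from the cut-through scheme above.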

Overall, label switching is a powerful architectural concept that addresses the scalability requirements of the Internet. But it's based on new protocols and standards that are neither complete nor proven in large environments. As new routers emerge that can handle longest-prefix-match processing at line speed, the case for any of these alternative solutions becomes weaker. There are environments where they'll be deployed, but none is likely to sweep away the others.

Putting it all together

With three different protocol sandwiches, four different architectural paradigms, and multiple vendor-specific proprietary implementations, there could be dozens of alternative solutions. Actually, there are really six fundamental ways to build networks based on the possible combinations, with a few variations within each:

Approach #1: IP-over-SONET (or IP-over-fiber) router mesh. This approach is the simplest solution--just connect a mesh of routers together using IP-over-SONET (or simplified IP-over-fiber) links. Implement a full mesh or a hierarchical mesh, use OC-3 and OC-12 interconnects (with future migration to OC-48 [2.5 Gbits/sec] and OC-192 [10 Gbits/sec]), and route everything. The one drawback is that fast routers capable of handling tens of millions of route lookups per second in a large table are needed. While existing routers can't do this, new products are emerging that are capable of these speeds.

Approach #2: IP-over-SONET (or IP-over-fiber) frame-switched cloud. This approach is also fairly simple. Edge routers connect to fast frame switches over SONET (or simplified) links. The fast frame switches make forwarding decisions based on some packet label (a PVC identifier)--this could be fast frame relay or some other encapsulation scheme. Routers at the edge still need to handle fast longest-prefix-match lookups. Switches in the cloud do simple label lookups.

Approach #3: IP-over-ATM PVC-switched cloud. This approach is quite similar to Approach #2 but uses ATM cells inside the cloud instead of IP frames. It adds the benefits of ATM multiplexing and quality of service, but also adds the overhead of ATM cells. Edge routers must not only perform longest-prefix-match lookups but also implement ATM segmentation and reassembly at OC-3 or OC-12 speeds.

Approach #4: IP-over-ATM traffic-driven cut-through. This is essentially the approach used in ATM Forum MPOA and Ipsilon's IP Switching. While cut-through is possible in any type of network, it has only been implemented in ATM networks. Here, longest-prefix-match lookups are performed only by a central "route processor"; all edge-router and switch processing uses simple labels. This is the only approach that truly can't scale to meet Internet requirements, due to high levels of signaling traffic for short-lived sessions.

Approach #5: IP-over-ATM label-switched cloud. Label switching was originally proposed as an alternative model for mapping IP traffic to ATM networks, with the introduction of Cisco's Tag Switching. Here, edge routers perform a full longest-prefix match on incoming traffic and apply a label to each packet as determined by tag-distribution-protocol interactions with switches in the cloud. These switches simply forward traffic based on the assigned labels. Since the label in an ATM network is typically a virtual-circuit identifier, circuit-switched ATM traffic and IP label-switched traffic can be mixed in the same ATM network.

Approach #6: IP-over-SONET (or IP-over-fiber) label-switched cloud. The scope of label switching was extended beyond ATM to apply to IP-over-SONET (and IP-over-fiber) environments as well. Via some encapsulation protocol, a label is associated with every packet. Label switching can then be performed by large frame switches--or by routers operating in a "dual mode." Eliminating the ATM layer gives up multiplexing with other traffic types but improves data-path efficiency.

Who wins?

The end-to-end protocol for the Internet will be IP. The backbones will be built from fiber-optic cable. At least the edge routers in the network, and maybe all the routers, will need to do full longest-prefix-match lookups on every packet. And that's about all that's clear.

There are six fundamental approaches to building these backbones, and there are multiple variations of each approach. Each has strong equipment-vendor proponents; most have a basis in standards and some customers as well. At least five of these fundamental approaches have the capability to solve the scaling and routing problems of the Internet.

So who wins? The market probably will settle on two or three primary approaches, with no single dominant architecture. Each major vendor will push its own approach, usually with some proprietary hooks built in. Architects and managers of Internet backbones will have their hands full evaluating these new technologies and making sure vendors don't capture them with proprietary solutions.

Gordon C. Saussy is marketing vice president for Torrent Networking Technologies (Silver Spring, MD).
