Ensuring cloud computing performance on data communications networks, Part 2

Jan. 9, 2012
Let’s look at one of the most common communication paths within the cloud, the client-to-server reference point, and the challenges faced in delivering a high quality of experience for cloud service users.

In Part 1 of this three-part series on cloud computing, we introduced you to the concept of reference points (RPs) as a means for software designers and network architects to better understand and address specific problems in the cloud communications infrastructure. Now let’s look at one of the most common communication paths within the cloud, the client-to-server reference point, and the challenges faced in delivering a high quality of experience for cloud service users.

The client-to-server RP corresponds to communication between a client device and the application server in the cloud. These reference points are labeled 1 and 6 in the diagram we shared with you last time (see figure below).

This type of communication happens through the access network, which consists of a combination of intermediate broadband access devices (e.g., cable, DSL, fiber, wireless), switches, routers, and possibly other communication devices. For the public Internet access case (RP1), the communication is likely to happen over an unmanaged network with limited control over quality of service (QoS). For the enterprise case (RP6), communication is likely to happen over a managed network with controlled QoS and other characteristics.

QoS enforcement on a per-user session basis, with support for bandwidth guarantees and fairness, is of paramount importance. It directly affects user experience, which is expected to be on a par with that of a locally deployed application. This type of connectivity is not widely available today in the public Internet access mode because most Internet access connectivity is on a “best effort” basis. QoS treatment, therefore, should accommodate the bursty nature of the traffic, provide proper fairness among users in cases of congestion, and offer guarantees for worst-case performance. It is also important to provide mechanisms for monitoring per-user session QoS, such as measurement of packet loss and delay, to diagnose and repair problems. Per-user session QoS also ensures that different applications receive the appropriate QoS treatment. For example, voice services would receive the highest-cost, highest-quality treatment, while data transfer might work well with low-cost best-effort QoS treatment.

Meeting QoS requirements

The widely used Carrier Ethernet specifications from the Metro Ethernet Forum (MEF) address these problems. The MEF model includes creation of an “Ethernet Virtual Connection” (EVC) between two User Network Interface (UNI) Ethernet ports that provide points of access for communication services. In a cloud access scenario, one UNI is at the customer site and another UNI is at the cloud service provider.

The MEF is in the process of defining new specifications for provisioning or dynamically setting up many EVCs from multiple locations to one location in a hub-and-spoke topology. Today the MEF defines encapsulation of EVC traffic across Carrier Ethernet networks using VLAN tags according to IEEE 802.1ad; in the future there may be new encapsulations.

MEF QoS specifications define traffic service-level agreements (SLAs) per EVC. An SLA includes a traffic profile descriptor and traffic delivery guarantees that apply to the traffic within the profile. The traffic profile contains parameters such as Committed Information Rate (CIR), Excess Information Rate (EIR), and Committed Burst Size. CIR, sometimes called the minimum or guaranteed rate, defines the average traffic rate that must be delivered across the network with performance guarantees. Committed Burst Size defines how bursty or uneven the instantaneous traffic rate may be while still being considered within profile. EIR (sometimes called the maximum rate) defines an average traffic rate on top of the CIR that may be delivered by the network if the network is not congested.

The MEF and ITU-T also provide mechanisms for service QoS performance monitoring using Ethernet service operations, administration, and maintenance (OAM) packets. These mechanisms help verify that communications providers meet required SLAs. In particular, use of service OAM enables the measurement and monitoring of the following characteristics:
  • continuity of connection
  • loopback response from intermediate nodes
  • trace of intermediate nodes on the path
  • frame delay measurement
  • frame loss measurement.
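
To make the traffic profile parameters described above more concrete, here is a minimal Python sketch of a two-token-bucket meter that sorts frames into "green" (within CIR and Committed Burst Size), "yellow" (within EIR), and "red" (out of profile). It is a simplified illustration, not the normative MEF algorithm. The class name, the color labels, and the excess burst size parameter (`ebs_bytes`) are assumptions introduced for this example and are not taken from the text.

```python
class BandwidthProfileMeter:
    """Simplified per-EVC meter (illustrative only, not a normative algorithm)."""

    def __init__(self, cir_bps, cbs_bytes, eir_bps, ebs_bytes):
        self.cir_bps = cir_bps        # Committed Information Rate, bits per second
        self.cbs_bytes = cbs_bytes    # Committed Burst Size, bytes
        self.eir_bps = eir_bps        # Excess Information Rate, bits per second
        self.ebs_bytes = ebs_bytes    # excess burst size, bytes (assumed parameter)
        self._committed = float(cbs_bytes)  # both buckets start full
        self._excess = float(ebs_bytes)
        self._last_s = 0.0

    def classify(self, now_s, frame_bytes):
        """Return 'green', 'yellow', or 'red' for a frame arriving at now_s seconds."""
        elapsed = now_s - self._last_s
        self._last_s = now_s
        # Refill each bucket at its rate (bits/s -> bytes/s), capped at its burst size.
        self._committed = min(self.cbs_bytes, self._committed + self.cir_bps / 8 * elapsed)
        self._excess = min(self.ebs_bytes, self._excess + self.eir_bps / 8 * elapsed)
        if frame_bytes <= self._committed:
            self._committed -= frame_bytes
            return "green"    # within CIR: delivered with performance guarantees
        if frame_bytes <= self._excess:
            self._excess -= frame_bytes
            return "yellow"   # within EIR: delivered only if the network is not congested
        return "red"          # out of profile


# Hypothetical profile: 8 kbps CIR and EIR, 1,500-byte burst sizes.
meter = BandwidthProfileMeter(cir_bps=8_000, cbs_bytes=1_500, eir_bps=8_000, ebs_bytes=1_500)
burst = [meter.classify(0.0, 1_000) for _ in range(3)]  # three back-to-back frames
later = meter.classify(1.0, 1_000)                      # one frame a second later
print(burst, later)  # ['green', 'yellow', 'red'] green
```

The burst shows the role of Committed Burst Size: the first frame fits the committed bucket, the second spills into the excess bucket, and the third is out of profile until the buckets refill.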

In addition, MEF services use Layer 2 Ethernet service definitions. This practice minimizes problems with migrating IP addresses across IP networks. As a result, a service can transparently switch to another computer with the same IP address in a different geographical location, as long as the two locations are connected by an MEF EVC.

Server-to-server communication

Server-to-server communication occurs among the application servers within the cloud. This communication happens in very specific and well-defined scenarios. The server-to-server RP is shown at Point 2 in the figure above.

An example of server-to-server communication is when an application uses multiple machines, possibly geographically distributed, that work together to accomplish compute-intensive tasks. Such applications are sometimes called clustered applications. Often these computers are collocated, and therefore the high bandwidth they require is relatively inexpensive to provision in a data center environment.

Another example of server-to-server communication requirements is when a virtual machine with applications running on it is migrated from one physical box to another. This may be done for load-balancing, maintenance, or other operational reasons. This presents a different communications challenge for the cloud network.

For a virtual machine to be moved smoothly and semi-transparently from the client's point of view, it is best moved with its IP address unchanged. This way the state of the TCP connection is preserved inside the networking stack, and communication may resume immediately after the virtual machine state image is transferred to the other box. The fact that the IP address does not change means that the servers must be connected at Layer 2, which fits well with the MEF Layer 2 communication model.

In terms of bandwidth, migrating a virtual machine usually requires transferring gigabytes of data between machines. If this is to be done within a reasonable amount of time, sustained bandwidth of up to 1 Gbps should be provided. No encryption of traffic is required in this case, since the information is not application-specific and is communicated in a controlled cloud environment, often within the same room, building, or campus.
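
As a rough back-of-the-envelope check on the bandwidth figure above, the following Python sketch estimates how long a virtual machine image takes to transfer at a given sustained rate. The 4-GB image size is a hypothetical value chosen for illustration. The calculation assumes decimal gigabytes and ignores protocol overhead and any memory that has to be resent during a live migration.

```python
def transfer_seconds(image_gigabytes: float, link_mbps: float) -> float:
    """Idealized transfer time: image size in bits divided by the sustained link rate."""
    if link_mbps <= 0:
        raise ValueError("link rate must be positive")
    image_megabits = image_gigabytes * 8 * 1000  # 1 GB = 8,000 megabits (decimal units)
    return image_megabits / link_mbps


# Hypothetical 4-GB virtual machine image at two sustained rates.
for rate_mbps in (100, 1000):
    seconds = transfer_seconds(4, rate_mbps)
    print(f"{rate_mbps} Mbps: {seconds:.0f} s")
```

Under these assumptions the same image takes about half a minute at 1 Gbps but more than five minutes at 100 Mbps, which illustrates why a sustained rate near 1 Gbps keeps migration time reasonable.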

Future work
Communication technologies are key enablers of cloud computing. Protocols, traffic management, and control of wide-area communications are rapidly evolving. The reference points defined above are offered to facilitate discussion about communications within the cloud. Applying industry-standard techniques to these reference points can assure cloud network designers and application developers of robust application performance on cloud networks.

More work is being done on cloud communications protocols. As cloud networks grow, new separations will occur between and among cloud elements. New cloud elements will arise. New and specialized APIs to connect cloud elements will be developed. These will also further integrate telco and Web applications into cloud networks.

However, there are already sufficient bandwidth and adequate traffic management techniques to ensure the continued growth of cloud networks. Looking out a few years, it is easy to imagine a time when cloud services become indispensable to business operations and to people’s daily activities.

Mannix O’Connor is director of technical marketing at MRV Communications. He was chair of the Access Working Group for the MEF and founding secretary of the IEEE 802.17 Working Group. Mannix is a coauthor of the recent book Delivering Carrier Ethernet, published by McGraw-Hill.

Vladimir Bronstein is an independent consultant and has more than 20 years’ experience in telecom and data networking as a systems architect and director of software engineering. His experience encompasses broadband access, optical networking, and wireline and wireless networking as well as cable technologies. He has several patents pending for his networking innovations and has participated in industry standardization activities.