The last few years have seen a substantial increase in interconnect speed, from 2.5 to 10 Gbits/sec per lane, and from 1 to 32 lanes per interface. When it comes to coding techniques for interconnects, 8B10B is the most widely deployed technique because of its design simplicity and superior performance. However, it has a major drawback: its 25% overhead. Designers have begun to investigate alternative coding methods to overcome this problem. This article explores the strengths and weaknesses of several of these low-overhead coding techniques.
High-speed interconnects, sometimes higher than 10 Gbits/sec per link, are being used in a plethora of standards (Serial ATA, SAS, PCI Express, etc.). These high-speed links are also available in all major ASIC and FPGA platforms. Each one of these interconnects can be split into three main parts:
- Electrical (SerDes).
- Physical (coding).
- Link and protocol (higher layers).
Multirate, multiprotocol SerDes devices are already a reality. For example, the Optical Internetworking Forum (OIF) has defined electrical specifications for two sets of rates, from 5 Gbits/sec to 6.375 Gbits/sec and above and from 10 Gbits/sec to 11 Gbits/sec and higher. OIF also defines the specification for both short reach (8 inches and one connector) and long reach (40 inches and two connectors). SerDes implementations can be made to meet additional specifications (other rates, reaches, or electrical specs).
The physical layer is where data is “coded” to ensure proper operation of the SerDes. It usually aims to ensure enough transitions (1 to 0 and 0 to 1), a good DC balance (same number of zeros and ones), as well as other criteria (maximizing bandwidth on the channel, robustness in the presence of errors, etc.).
For example, the Manchester coding used in 10-Mbit/sec Ethernet is a very simple coding scheme that maps a one to “01” and a zero to “10.” We can deduce many properties of this scheme:
- The maximum run length is two (there are never more than two zeros or two ones in a row).
- There is always DC balance (the number of ones is always equal to the number of zeros).
- We can detect 1-bit errors (00 or 11 are invalid codes).
- There is extremely high overhead (100%).
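The properties listed above can be checked with a few lines of Python. This is a minimal sketch that assumes the bit mapping given in the text (one to “01”, zero to “10”) and a non-empty input; the function names are illustrative only.

```python
# Mapping taken from the text: 1 -> "01", 0 -> "10".
MANCHESTER = {"1": "01", "0": "10"}
DECODE = {symbol: bit for bit, symbol in MANCHESTER.items()}

def manchester_encode(bits: str) -> str:
    """Encode a string of '0'/'1' characters; every data bit becomes two line bits."""
    return "".join(MANCHESTER[b] for b in bits)

def manchester_decode(line: str) -> str:
    """Decode line bits; '00' and '11' are invalid symbols and reveal an error."""
    out = []
    for i in range(0, len(line), 2):
        pair = line[i:i + 2]
        if pair not in DECODE:
            raise ValueError(f"invalid symbol {pair!r} at line bit {i}")
        out.append(DECODE[pair])
    return "".join(out)

def max_run_length(bits: str) -> int:
    """Length of the longest run of identical bits (input must be non-empty)."""
    longest = run = 1
    for prev, cur in zip(bits, bits[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

encoded = manchester_encode("1100101")
print(encoded)                                   # 01011010011001
print(max_run_length(encoded))                   # 2
print(encoded.count("1"), encoded.count("0"))    # 7 7
```

Seven data bits become fourteen line bits, which is the 100% overhead noted in the list.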
Scrambling is another technique used in serial links. This method uses a pseudorandom sequence that is “mixed” with the transmitted data to provide transitions, DC balance, etc. The pseudorandom sequence is a maximum-length sequence generated by a linear feedback shift register [1]. This sequence repeats after 2^n - 1 bits (where n is the polynomial degree). For example, SONET/SDH uses x^7 + x^6 + 1, for a sequence of length 127.
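As a rough illustration, the following sketch steps a 7-bit linear feedback shift register with taps at stages 7 and 6 and measures how long it takes to return to its starting state. The register layout (new bit shifted into the least significant position) and the helper names are assumptions made for this example; hardware implementations may order the stages differently without changing the period.

```python
def lfsr_step(state, taps, nbits):
    """Advance the register one bit; return (new_state, bit_shifted_in)."""
    feedback = 0
    for t in taps:
        feedback ^= (state >> (t - 1)) & 1   # stage t holds the bit from t steps ago
    new_state = ((state << 1) | feedback) & ((1 << nbits) - 1)
    return new_state, feedback

def lfsr_period(taps, nbits, seed=1):
    """Count steps until the register returns to its (non-zero) seed."""
    state, steps = seed, 0
    while True:
        state, _ = lfsr_step(state, taps, nbits)
        steps += 1
        if state == seed:
            return steps

def lfsr_bits(taps, nbits, seed, count):
    """Return the first `count` bits of the pseudorandom sequence."""
    state, out = seed, []
    for _ in range(count):
        state, bit = lfsr_step(state, taps, nbits)
        out.append(bit)
    return out

print(lfsr_period((7, 6), 7))            # 127, i.e. 2**7 - 1
print(sum(lfsr_bits((7, 6), 7, 1, 127))) # 64 ones and 63 zeros in one period
```

The near-equal count of ones and zeros over one period is what makes such a sequence useful for DC balance. The all-zero state must be avoided as a seed because the register would never leave it.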
There are two types of scramblers: side scramblers and self-synchronous scramblers. The output of a side scrambler is generally XORed with the transmitted data, and a mechanism is required to “reset/synchronize” the scrambler state between the transmitter and the receiver. Self-synchronous scramblers feed the data into the scrambler state, and as such do not require any reset or synchronization.
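The difference between the two types can be sketched as follows. The short polynomial x^7 + x^6 + 1 is used purely to keep the example small, and the function names and seed values are hypothetical.

```python
import random

TAPS, N = (7, 6), 7          # x^7 + x^6 + 1, chosen only for illustration
MASK = (1 << N) - 1

def _feedback(state):
    fb = 0
    for t in TAPS:
        fb ^= (state >> (t - 1)) & 1
    return fb

def side_scramble(bits, seed):
    """Side scrambler: XOR data with a free-running LFSR.
    Applying it again with the same seed descrambles."""
    state, out = seed, []
    for b in bits:
        key = _feedback(state)
        state = ((state << 1) | key) & MASK   # state never sees the data
        out.append(b ^ key)
    return out

def self_sync_scramble(bits, state=0):
    """Self-synchronous scrambler: the register is fed with transmitted bits."""
    out = []
    for b in bits:
        s = b ^ _feedback(state)
        state = ((state << 1) | s) & MASK
        out.append(s)
    return out

def self_sync_descramble(bits, state=0):
    """The register is rebuilt from received bits, so no seed exchange is needed."""
    out = []
    for s in bits:
        out.append(s ^ _feedback(state))
        state = ((state << 1) | s) & MASK
    return out

rng = random.Random(0)
data = [rng.getrandbits(1) for _ in range(200)]

# Side scrambler: works only if both ends use the same seed.
assert side_scramble(side_scramble(data, 0x5A), 0x5A) == data
assert side_scramble(side_scramble(data, 0x5A), 0x33) != data

# Self-synchronous: receiver starts in the wrong state and still locks after N bits.
rx = self_sync_descramble(self_sync_scramble(data, state=0x11), state=0x00)
assert rx[N:] == data[N:]
```

In the self-synchronous case only the first N received bits can be wrong, because after N bits the receiver's register holds exactly the same line bits as the transmitter's.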
8B10B is the most widely used type of coding. It is used for Serial Attached SCSI, Serial ATA, Fibre Channel, Gigabit Ethernet, XAUI (10-Gigabit Interface), PCI Express, InfiniBand, Serial Rapid IO, HyperTransport, and IEEE 1394b (FireWire).
8B10B [2] maps 256 data and 12 control codes into 10-bit codes (see Table 1). The coding method was carefully selected to provide different “good” properties. These properties include sufficient transitions to guarantee proper SerDes function, DC balance by ensuring that there are an equal number of ones and zeros in a stream, ease of alignment (finding where the byte starts in a bit stream), robustness (tolerance to errors), and low design complexity.
All codes used in 8B10B have from 3 to 10 transitions. No code word:
- Generates more than five ones or zeros in a row.
- Has a disparity other than 0 or ±2 (that is, every code word has five ones/five zeros, four ones/six zeros, or six ones/four zeros).
DC balance in 8B10B is guaranteed by following a simple scheme. Using the properties described previously, each character is assigned two mappings (the code and the inverse of the code); the transmit process selects the appropriate code (+/-) to keep the running disparity within ±1.
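The selection rule can be sketched in a few lines. This is a simplified model, not the real 8B10B table: the 10-bit words below are hypothetical, and the sketch assumes every unbalanced word is paired with its bitwise inverse, as described above.

```python
def disparity(word: str) -> int:
    """Number of ones minus number of zeros in a code word."""
    return word.count("1") - word.count("0")

def invert(word: str) -> str:
    return "".join("1" if b == "0" else "0" for b in word)

def encode_stream(words, running=-1):
    """Pick each word or its inverse so the running disparity stays at -1 or +1."""
    out = []
    for w in words:
        d = disparity(w)
        if d not in (-2, 0, 2):
            raise ValueError(f"{w} is not a legal code word in this model")
        # An unbalanced word pushing in the same direction as the current
        # running disparity is replaced by its inverse.
        if d != 0 and (d > 0) == (running > 0):
            w, d = invert(w), -d
        running += d
        out.append(w)
    return out, running

# Hypothetical code words: one with six ones, one balanced.
words = ["1110010110"] * 5 + ["0101010101"] + ["1110010110"] * 3
line, final = encode_stream(words)

running = -1
for w in line:
    running += disparity(w)
    assert running in (-1, 1)
```

Balanced words leave the running disparity unchanged, while unbalanced words alternate between the two mappings, so the line never drifts more than one bit away from balance at a word boundary.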
For alignment, the receiver searches for a special pattern (a comma) that can only be found in control characters (K28.5, K28.1, K28.7).
8B10B can also detect simple errors (only 268 codes and their inverses are valid, so a corrupted word that falls outside this set is flagged) as well as any error that causes the disparity rule to be violated. However, the disparity rule can sometimes cause error replication, a case where errored bit(s) cause the receiver to flag subsequent data as errored even though it was received with the correct value. 8B10B was also designed with attention to coder and decoder complexity: the coding process is really a 5B6B code combined with a 3B4B code.
One of the few improvements not part of the original 8B10B code is adding scrambling before the data is encoded. There is evidence that repeating patterns may introduce unacceptable errors due to either specific patterns [3] or differential group delay. The most straightforward solution to this issue is to use a scrambler to randomize the data before encoding.
All of these features have made 8B10B the most widely used coding method. There is only one major drawback to 8B10B, and it’s the large 25% overhead.
Recently, several low-overhead alternatives to 8B10B have been proposed. These improvements mainly address two issues. First, as the link speed and number of links increase, 25% overhead is too high. Second, because of the progress in integration, hardware complexity (in terms of gate count) is not as important as it once was.
Four low-overhead encoding methods that are very similar have received the most attention:
- 64B66B defined in 10-Gigabit Ethernet [4] (10GbE).
- OIF CEI-P [5].
- 10GBase-KR [6] defined for 10GbE over backplane.
- Interlaken [7].
Each of these methods significantly reduces overhead at the expense of higher gate counts.
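To put numbers on the reduction, the sketch below computes overhead as extra line bits divided by payload bits, which is the convention that gives 25% for 8B10B. The helper names are illustrative.

```python
from fractions import Fraction

def overhead(payload_bits: int, coded_bits: int) -> Fraction:
    """Extra bits on the line, relative to the payload bits."""
    return Fraction(coded_bits - payload_bits, payload_bits)

def efficiency(payload_bits: int, coded_bits: int) -> Fraction:
    """Fraction of the line rate that carries payload."""
    return Fraction(payload_bits, coded_bits)

for name, payload, coded in [("Manchester", 1, 2), ("8B10B", 8, 10), ("64B66B", 64, 66)]:
    print(f"{name:10s} overhead {float(overhead(payload, coded)):8.3%}  "
          f"efficiency {float(efficiency(payload, coded)):7.2%}")
```

With this convention 64B66B comes out at 3.125%, one eighth of the 8B10B figure. Measured against the line rate instead (extra bits divided by coded bits), the same codes give 20% and about 3.03%, so it is worth checking which convention a quoted overhead figure uses.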
64B66B coding. This code was defined for 10GbE (10GBase-R) and is based on a hybrid (coding and scrambling) technique. First, data is grouped into 8-byte words (64 bits). This data is then scrambled using a self-synchronous scrambler (x^58 + x^39 + 1).
If the 8 bytes are data, “01” bits are added. If one or more bytes are control bytes, then “10” bits are added.
The process of mapping 8 bytes of data or control (as defined in 8B10B) into a 64-bit word is usually referred to as transcoding. GFP-T [8] provides a method of mapping any combination of 8B10B codes into a 64-bit word. 10GbE [4] maps only a subset relevant to 10GbE and 10-Gbit/sec Fibre Channel. Sync bits (01 or 10) are used for alignment and are never scrambled. Alignment is possible because over the long run the sync bits are always “01” or “10”; other bit pairs take all values (00, 01, 10, 11) because of scrambling. The sync bits also guarantee that there is at least one transition every 66 bits.
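The alignment idea can be sketched as a search over the 66 possible bit offsets. This is an illustrative model only, not the lock procedure from the standard: it assumes the two sync bits precede the 64 payload bits, models the scrambled payload as random bits, and uses a hypothetical threshold of 64 consecutive blocks.

```python
import random

BLOCK_BITS = 66  # 2 sync bits + 64 scrambled payload bits

def find_block_offset(bits, blocks_to_check=64):
    """Return the first offset at which every checked block starts with 01 or 10."""
    if len(bits) < BLOCK_BITS * (blocks_to_check + 1):
        raise ValueError("not enough bits to test every candidate offset")
    for offset in range(BLOCK_BITS):
        if all(bits[offset + k * BLOCK_BITS] != bits[offset + k * BLOCK_BITS + 1]
               for k in range(blocks_to_check)):
            return offset
    return None

def make_stream(n_blocks, skew, seed=1):
    """Test stream: `skew` junk bits, then blocks of sync header + random payload."""
    rng = random.Random(seed)
    bits = [rng.getrandbits(1) for _ in range(skew)]
    for _ in range(n_blocks):
        bits += rng.choice([[0, 1], [1, 0]])                 # data or control header
        bits += [rng.getrandbits(1) for _ in range(64)]      # stands in for scrambled data
    return bits

stream = make_stream(n_blocks=100, skew=17)
print(find_block_offset(stream))
```

At a wrong offset each pair of bits is effectively random, so the chance that 64 consecutive pairs all look like valid sync bits is about 2^-64; only the true block boundary survives the test.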
One drawback of self-synchronous scramblers is error replication; each bit that is errored produces three errors because of the descrambling process.
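The factor of three follows from the three terms of the polynomial, and can be reproduced with a short sketch. The code below models the x^58 + x^39 + 1 scrambler bit-serially; the function names and the choice of an all-zero starting state are assumptions made for the example.

```python
import random

N = 58
MASK = (1 << N) - 1

def _taps(state):
    # Bit 38 holds the line bit from 39 steps ago, bit 57 the one from 58 steps ago.
    return ((state >> 38) & 1) ^ ((state >> 57) & 1)

def scramble(bits, state=0):
    out = []
    for b in bits:
        s = b ^ _taps(state)
        state = ((state << 1) | s) & MASK
        out.append(s)
    return out

def descramble(bits, state=0):
    out = []
    for s in bits:
        out.append(s ^ _taps(state))
        state = ((state << 1) | s) & MASK   # an errored line bit enters the register
    return out

rng = random.Random(0)
data = [rng.getrandbits(1) for _ in range(300)]
line = scramble(data)

line[100] ^= 1                               # a single bit error on the line
received = descramble(line)
errors = [i for i, (a, b) in enumerate(zip(data, received)) if a != b]
print(errors)                                # [100, 139, 158]
```

The errored bit corrupts the output once directly and then twice more as it passes the two taps of the descrambler register, 39 and 58 bits later.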
CEI-P coding. CEI-P was defined by the OIF; it has the same overhead as 64B66B (~3%). However, CEI-P has many differences:
- CEI-P uses a side scrambler (x^17 + x^14 + 1). This has the advantage of no error replication (the scrambler state is not affected by errors on the line), but it also has the drawback of requiring a method of syncing the scrambler state between the transmitter and the receiver.
- While side scramblers are immune to errors on the line, they can produce very long sequences of ones and zeros if the transmitted data is the same as (or the inverse of) the scrambler output. Very large scramblers (longer sequences) are more robust in this respect than smaller ones.
- CEI-P uses framing instead of alignment. While 64B66B uses the sync bit to find the word, CEI-P puts 24 words into a “frame.” Only 1 bit is used to determine if a word is data or control. The additional 24 bits are used for error correction and signaling.
- Error correction uses 20 bits and is based on a Fire code (correcting error bursts of up to 7 bits).
10GBase-KR (Ethernet in the backplane). This coding is similar to CEI-P, as it uses the same overhead (3%). The main difference is the frame size (32 words instead of 24). The error correction code is 32 bits (instead of 20 bits for CEI-P), which allows for correction of larger error bursts. The scrambling is similar to 10GbE. However, it uses an additional side scrambler with the same polynomial as 10GBase-R, with the initial pattern of “010101…,” and this is reset every frame.
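The frame arithmetic behind the “same overhead” statements can be checked directly. This sketch assumes 64-bit words and one data/control bit per word for both formats; the text states this for CEI-P, while for 10GBase-KR it is an assumption made here by analogy.

```python
from fractions import Fraction

def frame_budget(words, extra_bits, word_bits=64, type_bits_per_word=1):
    """Return (payload_bits, total_bits) for one frame."""
    payload = words * word_bits
    total = payload + words * type_bits_per_word + extra_bits
    return payload, total

for name, words, extra in [("CEI-P", 24, 24), ("10GBase-KR", 32, 32)]:
    payload, total = frame_budget(words, extra)
    print(f"{name:11s} payload {payload}  frame {total}  "
          f"bits per word {Fraction(total, words)}  "
          f"overhead {float(Fraction(total - payload, payload)):.3%}")

# CEI-P: 20 of the 24 additional bits carry the error correction code,
# which leaves 4 bits for signaling.
cei_p_signaling_bits = 24 - 20
```

Both frames average 66 line bits per 64-bit word, so they occupy exactly the same bandwidth as 64B66B while spending half of the extra bits on error correction rather than on a 2-bit sync header.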
Interlaken PHY. Interlaken is another method for coding data. It has an overhead of 4.5% (64/67). It is based on a word of 64 bits. However, Interlaken is different from the other low-overhead coding methods in the following ways:
- The sync field is 3 bits: 2 bits distinguish data from control, and 1 bit indicates whether the data has been inverted. The data invert bit uses a technique similar to that of 8B10B to guarantee DC balance.
- Interlaken uses a side scrambler (with the same polynomial as 10GbE); this avoids the issue of error replication associated with self-synchronous scramblers. A sync word is used to carry the scrambler state. The very large scrambler length also makes it immune to long streams of zeros or ones.
- The link is protected using a CRC32 checksum (error protection).
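The data invert bit described in the list above can be modeled with a running-disparity counter. This is a simplified sketch, not the procedure from the Interlaken specification: it ignores the contribution of the sync field itself, and the names and the inversion rule (invert when the word would push the disparity further from zero) are assumptions.

```python
WORD_BITS = 64
MASK = (1 << WORD_BITS) - 1

def word_disparity(word: int) -> int:
    """Ones minus zeros in a 64-bit word."""
    ones = bin(word).count("1")
    return 2 * ones - WORD_BITS

def encode(words):
    """Return [(invert_bit, word)] plus the running disparity after each word."""
    running, coded, history = 0, [], []
    for w in words:
        d = word_disparity(w)
        invert = (d > 0 and running > 0) or (d < 0 and running < 0)
        if invert:
            w ^= MASK
            d = -d
        running += d
        coded.append((int(invert), w))
        history.append(running)
    return coded, history

def decode(coded):
    """Undo the inversion using the invert bit carried with each word."""
    return [w ^ MASK if inv else w for inv, w in coded]

words = [MASK] * 6 + [0] * 3 + [0x0123456789ABCDEF]
coded, history = encode(words)
print(history[:6])   # [64, 0, 64, 0, 64, 0]
```

Even for a worst-case stream of all-ones words, the running disparity in this model never leaves the range -64 to +64, at the cost of one extra bit per word.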
Table 2 highlights the differences between the coding techniques we have discussed. The criteria used for comparisons are:
- Transition density and DC balance.
- Error (protection, detection, replications).
- Complexity (gate count).
While 8B10B is a very effective and widely used code, its large overhead is becoming a major challenge as we move to interconnects and systems carrying several hundred gigabits per second. There are several low-overhead alternatives, each of which has various advantages and drawbacks. So far, no single low-overhead technique is as widely used as 8B10B. However, the approaches highlighted in this article address a range of requirements, and all offer low overhead at the expense of higher gate complexity.
Med Belhadj is a systems architect at Cortina Systems (www.cortina-systems.com).
1. PN Sequence Generator, http://www.mathworks.com/.
2. A. X. Widmer, P. A. Franaszek, “A DC-Balanced, Partitioned-Block, 8B/10B Transmission Code,” IBM J. Res. Develop., Vol. 27, No. 5, Sept. 1983.
3. Andrew W. Moore, et al., “Explaining Structured Errors in Gigabit Ethernet,” Intel Research Report, IRC-TR-05-032, 2003.
4. IEEE 802.3 Standard, 2006.
5. OIF, “Common Electrical I/O - Protocol (CEI-P) Implementation Agreement,” IA# CEI-P-01.0, March 2005.
6. 10GBase-KR draft 3.0, http://www.ieee802.org/3/ap/.
7. Interlaken Protocol Definition, version 1.03, July 2006, http://www.cortina-systems.com.
8. ITU-T G.7041/Y.1303, pp. 30-32, December 2001.