Troubleshooting IP broadcast video quality of service
By John Williams, JDSU -- IP-based video transmission places unique demands on today's data-optimized networks. To ensure proper service delivery, measurement of the accumulated effects of the IPTV network on critical, application-specific quality-of-service parameters must be made during service installation.
Optimized for data transmission, today's data networks meet specific performance levels that are generally based on packet loss parameters. Typical data applications compensate for lost packet events by retransmitting lost data, and this retransmission enables error-free performance at the application level. However, voice over Internet Protocol (VoIP) and IP video services place more stringent requirements on data networks.
New class-of-service (CoS) mechanisms enable data networks to achieve acceptable levels of performance to support both VoIP and IP video. For example, packet loss rates of up to 4% of the traffic flow and packet jitter approaching 40 ms can be tolerated without negatively impacting the delivery of toll-quality voice. But broadcast video services cannot tolerate a packet loss rate of more than 0.1% or more than 5-10 ms of packet jitter. The difference arises because voice codecs can conceal lost data far more effectively than video decoders can mask lost video. Thus, the demands placed on network performance today are much greater than in the past.
To ensure proper service delivery, measurement of the accumulated effects of the IPTV network on critical, application-specific QoS parameters must be made at the customer premises during service installation. Gathering and recording QoS data is the key to rapid, efficient trouble resolution. The quality of the source material or video content and the quality of the video decoder in the set top box (STB) determine the "potential" quality of the video. The network is a variable, and it can only detract from the design quality. Thus, it is important to measure every variable that affects delivered quality, covering both content quality and network performance.
The quality of the content is the starting point. Decisions made in the video head end, where the content is acquired, determine variations in quality. The initial quality of the video stream is established by the content sources used, the compression algorithms implemented, the encoders employed, and the source quality monitoring system that is present. The data output of the encoders starts the video packet flow. Two critical content quality parameters can be measured in MPEG-2 transport stream video flows: The video transport packet error indicator count and program clock reference (PCR) jitter.
The error indicator is a bit that is set by the encoders in any transmitted video packet where the encoders detect corrupted source content. The presence of packets with this indication is strictly related to content quality; it is not related to the performance of the distribution network. Monitoring video encoder output streams in the head end can detect this condition and provide an early opportunity for problem resolution. Error indicator counts seen at the customer premises reveal a source quality problem.
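As a rough illustration of the error indicator count, the sketch below walks a captured transport stream in 188-byte steps and counts packets whose error indicator bit is set. It assumes the capture is already aligned on packet boundaries and has been stripped of its IP/UDP headers; the function names and the `make_packet` test helper are hypothetical and exist only for this example.

```python
TS_PACKET_SIZE = 188   # bytes per MPEG-2 transport packet
SYNC_BYTE = 0x47       # first byte of every transport packet


def count_error_indicators(ts_bytes: bytes) -> int:
    """Count transport packets whose error indicator bit is set."""
    count = 0
    for off in range(0, len(ts_bytes) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = ts_bytes[off:off + TS_PACKET_SIZE]
        if pkt[0] != SYNC_BYTE:
            continue  # not aligned on a packet; a real analyzer would resync
        if pkt[1] & 0x80:  # error indicator is the top bit of header byte 1
            count += 1
    return count


def make_packet(pid: int, cc: int, error: bool = False) -> bytes:
    """Hypothetical helper: build a payload-only packet for testing."""
    b1 = (0x80 if error else 0x00) | ((pid >> 8) & 0x1F)
    b3 = 0x10 | (cc & 0x0F)  # payload present, 4-bit continuity counter
    return bytes([SYNC_BYTE, b1, pid & 0xFF, b3]) + bytes(184)


capture = make_packet(0x100, 0) + make_packet(0x100, 1, error=True) + make_packet(0x100, 2)
print(count_error_indicators(capture))
```

Any nonzero result on a live flow would point back toward the source rather than the distribution network, per the reasoning above.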
Timing in the transport stream is based on the 27-MHz system time clock (STC) of the encoder. To ensure proper synchronization during the decoding process, the decoder's clock must be locked to the encoder's STC. In order to achieve this lock, the encoder inserts a 27-MHz time stamp into the transport stream for each program. This time stamp is referred to as the program clock reference (PCR). Video decoders use the timing signal to synchronize to the encoded data stream so they can properly decode the audio and video program material. Excessive PCR jitter will adversely impact the decoder, resulting in visual impairments such as pixelization, frame freezes, and loss of color. The amount of PCR jitter that is considered excessive is not a constant; it is determined by various parameters, including the input buffer sizes of the decoder and the design of the STB software.
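To make the PCR mechanism concrete, here is a minimal sketch that reads the PCR from a transport packet's adaptation field and estimates jitter by comparing PCR spacing with packet arrival spacing. The field layout (33-bit base counted at 90 kHz, 9-bit extension counted at 27 MHz) follows the MPEG-2 systems syntax; the jitter calculation is a simplification that ignores clock drift and counter wraparound, and the function names and sample values are assumptions made for illustration.

```python
PCR_CLOCK_HZ = 27_000_000  # system time clock rate


def extract_pcr(pkt: bytes):
    """Return the PCR in 27-MHz ticks, or None if the packet carries none."""
    if len(pkt) != 188 or pkt[0] != 0x47:
        return None
    if not (pkt[3] & 0x20):   # no adaptation field present
        return None
    if pkt[4] < 7:            # adaptation field too short to hold a PCR
        return None
    if not (pkt[5] & 0x10):   # PCR flag not set
        return None
    raw = int.from_bytes(pkt[6:12], "big")  # 33-bit base, 6 reserved, 9-bit ext
    return (raw >> 15) * 300 + (raw & 0x1FF)


def make_pcr_packet(pcr_ticks: int, pid: int = 0x100) -> bytes:
    """Hypothetical helper: adaptation-field-only packet carrying a PCR."""
    base, ext = divmod(pcr_ticks, 300)
    raw = (base << 15) | (0x3F << 9) | ext
    header = bytes([0x47, (pid >> 8) & 0x1F, pid & 0xFF, 0x20])
    field = bytes([183, 0x10]) + raw.to_bytes(6, "big") + b"\xff" * 176
    return header + field


def pcr_jitter_ms(samples) -> float:
    """Worst deviation between arrival spacing and PCR spacing.

    samples: list of (arrival_time_seconds, pcr_ticks) in arrival order.
    """
    worst = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        expected_s = (p1 - p0) / PCR_CLOCK_HZ
        worst = max(worst, abs((t1 - t0) - expected_s) * 1000.0)
    return worst


# PCRs stamped 40 ms apart; the third packet arrives 5 ms late.
samples = [(0.000, 0), (0.040, 1_080_000), (0.085, 2_160_000)]
print(round(pcr_jitter_ms(samples), 3))
```

A production analyzer would track each program's PCR separately and filter out long-term drift before reporting jitter.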
Several factors can cause PCR jitter. The most likely include overall network packet jitter, transcoding problems in the encoder, and local ad insertion issues.
Network performance is another determinant of video quality. It can be divided into a few specific parameters, including IP packet loss, IP packet jitter, and Internet Group Management Protocol (IGMP) latency. Each of these parameters can be analyzed at the customer premises or in the last-mile access network.
Packet loss is measured by analyzing video packet flows and determining the presence of a continuity error event. Missing packets, out-of-sequence packets, and duplicate packets are all counted as errors. Because each video transport stream packet carries a sequence number (the 4-bit continuity counter in its header), continuity errors can be detected directly from the stream. An MPEG-2 transport packet is 188 bytes in length, and an IP frame typically carries seven MPEG-2 transport packets. Thus, losing one IP frame results in the loss of seven MPEG-2 transport stream packets. Each of these events can cause decoding errors. Depending on the temporal or spatial components contained in the missing packets, a single packet error event may or may not be seen on the TV screen. However, actual network performance is measured by the packet loss parameters, regardless of whether the decoder can hide the problem.
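The continuity check described above can be sketched as follows. This simplified version tracks the 4-bit continuity counter per packet identifier (PID) and flags any packet whose counter is not exactly one more than the previous one, modulo 16, so missing, out-of-sequence, and duplicate packets all count as errors. It ignores the discontinuity indicator and assumes an aligned capture; the function and helper names are hypothetical.

```python
TS_PACKET_SIZE = 188
TS_PACKETS_PER_IP_FRAME = 7   # typical figure cited in the text
NULL_PID = 0x1FFF             # null packets carry no meaningful counter


def count_continuity_errors(ts_bytes: bytes) -> int:
    """Count continuity counter breaks across all PIDs in a capture."""
    last_cc = {}
    errors = 0
    for off in range(0, len(ts_bytes) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = ts_bytes[off:off + TS_PACKET_SIZE]
        if pkt[0] != 0x47:
            continue  # not aligned; a real analyzer would resync
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        has_payload = bool(pkt[3] & 0x10)
        cc = pkt[3] & 0x0F
        if pid == NULL_PID or not has_payload:
            continue  # counter only advances on packets with payload
        if pid in last_cc and cc != (last_cc[pid] + 1) % 16:
            errors += 1
        last_cc[pid] = cc
    return errors


def make_packet(pid: int, cc: int) -> bytes:
    """Hypothetical helper: build a payload-only packet for testing."""
    header = bytes([0x47, (pid >> 8) & 0x1F, pid & 0xFF, 0x10 | (cc & 0x0F)])
    return header + bytes(184)


def make_stream(counters, pid=0x100) -> bytes:
    return b"".join(make_packet(pid, cc) for cc in counters)


# One lost IP frame removes seven consecutive packets (counters 3..9).
print(count_continuity_errors(make_stream([0, 1, 2, 10, 11])))
print(TS_PACKETS_PER_IP_FRAME * TS_PACKET_SIZE)  # video bytes per IP frame
```

Note that the lost frame registers as a single continuity event even though seven transport packets are missing, and that a 4-bit counter cannot see a loss of exactly 16 packets on one PID.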
If the overall packet flow experiences excess jitter due to congestion problems and resulting CoS mechanism performance issues, packet jitter can be the cause of PCR jitter. If it is excessive enough, packet jitter can cause decode buffers to deplete, which, in turn, causes gaps in the decoder output. Gaps may appear as freeze frame or pixelization events seen on the TV screen.
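The buffer depletion effect can be shown with a toy model. The sketch below assumes each arriving packet adds a fixed amount of playable media to the decode buffer while the decoder drains it in real time; an underrun is recorded whenever the gap between arrivals exceeds what is buffered. The model, its parameter names, and the sample numbers are illustrative assumptions, not a description of any particular STB.

```python
def count_underruns(arrivals_ms, packet_interval_ms, prefill_ms):
    """Count decode buffer underruns for a list of packet arrival times.

    arrivals_ms: arrival time of each packet, in milliseconds, ascending.
    packet_interval_ms: playable media carried by each packet.
    prefill_ms: media already buffered when the first packet arrives.
    """
    level = prefill_ms
    underruns = 0
    prev = arrivals_ms[0]
    for t in arrivals_ms[1:]:
        level -= t - prev          # decoder drains while we wait
        if level < 0:
            underruns += 1         # buffer ran dry: visible gap in output
            level = 0.0
        level += packet_interval_ms
        prev = t
    return underruns


steady = [0, 10, 20, 30, 40]
delayed = [0, 10, 20, 60, 70]      # one packet held up by 30 ms
print(count_underruns(steady, 10, 20), count_underruns(delayed, 10, 20))
```

With 20 ms of prefill the steady flow never underruns, while the single 30 ms delay empties the buffer once. A deeper buffer would absorb the same delay, which is why tolerable jitter depends on decoder design.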
IGMP is the signaling protocol used to access broadcast video services that use a multicast network design to efficiently manage network bandwidth. In this implementation, a join message is sent from the STB to the network. The join message asks the network to send the requested program or channel to the STB by joining a multicast group carrying the desired broadcast channel. IGMP latency, then, is the time between when the join message is sent and when the first video packet is received by the STB. This parameter measures network performance, but not the end user's experience with regard to channel changing time. The total user experience time is the IGMP latency plus the time it takes to fill the decode buffer and to decode and display the content. However, the buffer fill time and the decode time are functions of the network architecture and are not variables.
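A test point on a host can approximate the IGMP latency measurement with a standard UDP socket: joining the multicast group causes the operating system to send the IGMP join, and the time until the first datagram arrives is the latency as seen by that host. The sketch below is a minimal version under that assumption. The function names are hypothetical, and the group address, port, and timing figures in the usage lines are placeholders rather than values from any real service.

```python
import socket
import struct
import time


def build_mreq(group: str, iface: str = "0.0.0.0") -> bytes:
    """Pack the group and local interface for IP_ADD_MEMBERSHIP."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface))


def measure_join_latency_ms(group: str, port: int, timeout_s: float = 2.0):
    """Join a multicast group and time the arrival of the first packet.

    Returns latency in milliseconds, or None if nothing arrives in time.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.settimeout(timeout_s)
        start = time.monotonic()
        # The OS sends the IGMP join when membership is added.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        build_mreq(group))
        try:
            sock.recvfrom(2048)
        except socket.timeout:
            return None
        return (time.monotonic() - start) * 1000.0
    finally:
        sock.close()


def channel_change_ms(igmp_ms: float, buffer_fill_ms: float, decode_ms: float) -> float:
    """Total user-experienced change time: join latency plus fixed STB delays."""
    return igmp_ms + buffer_fill_ms + decode_ms


# Placeholder figures: 120 ms join, 500 ms buffer fill, 200 ms decode.
print(channel_change_ms(120, 500, 200))
# On a live network (placeholder group and port):
# print(measure_join_latency_ms("239.1.1.1", 5000))
```

The live measurement is left commented out because it needs an actual multicast source.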
The following parameters measure the critical variables in the delivery of broadcast video service:
|Parameter|Threshold|
|---|---|
|Packet loss|≤ 0.1%|
|Packet jitter|≤ 5 ms|
|PCR jitter|≤ 10 ms|
|Error indicator|Zero count|
|IGMP latency|≤ 250 ms|
Note: Specific STB/decoder designs may be able to tolerate higher levels of jitter based on larger buffer designs. In some cases, jitter levels of 15 ms or higher may be acceptable.
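These thresholds lend themselves to a simple pass/fail check at installation time. The sketch below encodes the limits from the table and reports which measured parameters exceed them. The dictionary keys and the function name are hypothetical, and the limits would need adjusting for STBs that tolerate more jitter, as the note above explains.

```python
# Limits taken from the table above; key names are illustrative only.
THRESHOLDS = {
    "packet_loss_pct": 0.1,
    "packet_jitter_ms": 5.0,
    "pcr_jitter_ms": 10.0,
    "error_indicator_count": 0,
    "igmp_latency_ms": 250.0,
}


def failing_parameters(measured: dict) -> list:
    """Return the sorted names of parameters that exceed their limit."""
    return sorted(
        name
        for name, limit in THRESHOLDS.items()
        if name in measured and measured[name] > limit
    )


# Example readings (made-up values for illustration).
reading = {
    "packet_loss_pct": 0.05,
    "packet_jitter_ms": 7.0,
    "pcr_jitter_ms": 10.0,
    "error_indicator_count": 0,
    "igmp_latency_ms": 300.0,
}
print(failing_parameters(reading))
```

In this example the packet jitter and IGMP latency readings fail, while PCR jitter sits exactly at its limit and passes.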
All Channels vs. One Channel
When the above measurements are made simultaneously on more than one channel, it is easy to separate content problems from distribution network problems. This is a critical determination for problem resolution.
Packet loss or continuity error problems are typically seen on all channels or programs coming to the customer premises because they are not source- or content-related problems. If packet loss is present, analysis of the physical layer at the xDSL interface or Ethernet interface will aid in the sectionalization of the problem. If no physical layer errors are present, then packet loss is most likely caused by the distribution network and not by the access network. In this case, congestion is most likely the issue.
Further analysis of the temporal component is important. Are packets lost during known peak traffic times during the day? Are packet losses coming in bursts with intervals of no loss? Or are they random, single, or small packet loss events? Bursts of loss are symptomatic of buffer overflows related to heavy traffic. Random, single, or small events are more likely caused by noise hits on the access network that are impacting packet flows.
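One way to approach the burst-versus-random question is to cluster loss events by their timestamps. The sketch below groups events that fall within a short gap of each other and treats any group of three or more as a burst; everything else is counted as isolated loss. Both cutoffs are arbitrary assumptions chosen for illustration and would need tuning against real traffic.

```python
def classify_loss_events(times_s, gap_s=1.0, burst_min=3):
    """Split loss event timestamps (seconds) into bursts and isolated events.

    Events no more than gap_s apart belong to the same cluster. A cluster
    with at least burst_min events counts as one burst; smaller clusters
    contribute their events to the isolated total.
    """
    result = {"bursts": 0, "isolated": 0}

    def close_cluster(size):
        if size >= burst_min:
            result["bursts"] += 1
        else:
            result["isolated"] += size

    cluster = 0
    prev = None
    for t in sorted(times_s):
        if prev is not None and t - prev <= gap_s:
            cluster += 1
        else:
            close_cluster(cluster)
            cluster = 1
        prev = t
    close_cluster(cluster)
    return result


# Four losses inside half a second, then scattered single events.
events = [10.0, 10.1, 10.2, 10.3, 50.0, 120.0, 120.5]
print(classify_loss_events(events))
```

A result dominated by bursts would suggest buffer overflow under load, while mostly isolated events would point toward noise on the access network.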
Packet loss may be due to DSL loop performance in an access network that is pushing the bandwidth limits of particular areas where signal-to-noise margins are low and the addition of a second or third channel flow reaches 100% capacity of the loop. In addition, the copper may be poorly balanced, thus allowing high impulse noise to impact data flows. Or in-home wiring may introduce noise, which damages data flows.
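The capacity side of this problem is simple arithmetic: add up the video flows and any data traffic, then compare the total with the rate the loop can sustain. The sketch below does that calculation. The stream rates and loop rate are made-up numbers for illustration, not figures from any particular deployment.

```python
def loop_utilization_pct(channel_rates_mbps, data_mbps, loop_rate_mbps):
    """Percentage of the DSL loop rate consumed by video plus data."""
    return 100.0 * (sum(channel_rates_mbps) + data_mbps) / loop_rate_mbps


# Hypothetical loop syncing at 9 Mbit/s with 1.5 Mbit/s reserved for data.
one_channel = loop_utilization_pct([3.75], 1.5, 9.0)
two_channels = loop_utilization_pct([3.75, 3.75], 1.5, 9.0)
print(round(one_channel, 1), round(two_channels, 1))
```

In this example the second channel takes the loop from comfortable headroom to full capacity, which is the condition under which loss begins to appear.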
PCR jitter problems may be due to content quality problems or overall network packet jitter. The source of the problem can be differentiated by evaluating IP packet jitter and PCR jitter on more than one channel or program at a time. If excessive PCR jitter is present on more than one channel, network jitter is most likely at fault. If excessive PCR jitter is present on only one channel, then a source problem is typically the cause.
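The multi-channel rule just described reduces to a short decision function. The sketch below takes PCR jitter readings keyed by channel and applies that rule; the 10 ms default limit comes from the threshold table, while the function name, return labels, and channel names are assumptions made for this example.

```python
def locate_pcr_jitter(jitter_ms_by_channel: dict, limit_ms: float = 10.0) -> str:
    """Suggest where excessive PCR jitter originates.

    Returns "ok", "network", "source", or "undetermined".
    """
    bad = [ch for ch, jitter in jitter_ms_by_channel.items() if jitter > limit_ms]
    if not bad:
        return "ok"
    if len(jitter_ms_by_channel) < 2:
        return "undetermined"  # one channel alone cannot separate the causes
    if len(bad) > 1:
        return "network"       # several channels affected: network jitter
    return "source"            # only one channel affected: content problem


print(locate_pcr_jitter({"ch2": 4.0, "ch5": 18.0, "ch7": 3.5}))
print(locate_pcr_jitter({"ch2": 14.0, "ch5": 18.0, "ch7": 3.5}))
```

The first call returns a source verdict and the second a network verdict. Neither is conclusive on its own; each is a starting point for further testing.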
Error indicator analysis further reveals content problems. Since the indicator can only be set by the encoder, it specifically reveals content-only problems. Typically, this affects only one program or channel. However, if a multiple program feed in the head end is experiencing problems, more than one program or channel may be affected. In this case, analyzing a channel from another source or feed is recommended.
Typically, IGMP latency is similar for multiple channels. However, if network topology and network management locate access to certain program materials deeper in the network, then differences in IGMP latency may be experienced. If such a hierarchical approach is used, differences may be detected based on network access points. Testing multiple channels or programs to exercise this network design is useful.
In summary, determining whether poor video QoS performance is seen on more than one channel will effectively initiate the resolution process. Simultaneous analysis of the key QoS parameters on multiple video flows will further refine this analysis and lead to efficient and effective trouble resolution.
John Williams is director of emerging markets and alliances at JDSU (formerly Acterna). He may be reached via the company's Web site at www.jdsu.com.