In this contribution, the effectiveness of Application Layer-Forward Error Correction (AL-FEC) schemes in video communications over unreliable channels is evaluated. In the literature, several AL-FEC techniques have been adopted for reducing the effect of noisy transmission on multimedia communication. Recently, their use has been proposed for inclusion in international standards for TV broadcasting over IP. The objective of the analysis performed in this paper is to verify the effectiveness of AL-FEC techniques in terms of perceived Quality of Service (QoS) and, more generally, of Quality of Experience (QoE), and to evaluate the trade-off between AL-FEC redundancy and video quality degradation for a given packet loss ratio. To this end, several channel error models are investigated (random i.i.d. losses, burst losses, and network congestion) on test sequences encoded at 2 and 4 Mbps. The perceived quality is evaluated by means of three quality metrics: the full-reference objective quality metric NTIA-VQM combined with the ITU-T Rec. G.1070, the full-reference DMOS-KPN metric, and the pixel-wise error comparison performed by using the PSNR distortion measure. A post-processing synchronization between the original and the reconstructed stream has also been designed to improve the fidelity of the quality measures. The experimental results show the effectiveness and the limits of the Application Layer protection schemes.
Keywords: Quality of Service; AL-FEC techniques; IP transmission; Video quality estimation
In recent years, communication architectures based on the TCP/IP suite have been widely adopted, mainly due to the spread of the Internet. Nowadays, the Internet has become essential for data exchange, and it is used as an enabling platform for a wide range of applications. Among the services offered by these architectures, video delivery is one of the most challenging. Video over IP can offer a wide range of added-value services: video telephony, videoconferencing, and TV broadcasting over IP are a few examples. However, these services pose severe constraints on both data integrity and temporal delays.
Since the basic TCP/IP framework is based on a best-effort paradigm, much effort has been spent in designing additional mechanisms able to handle the Quality of Service (QoS). They include protocols specifically designed for real-time multimedia streaming, like RTP/RTCP and TP++ [1,2], as well as mechanisms for error recovery at the application layer, like Automatic Repeat reQuest (ARQ) and Application Layer-Forward Error Correction (AL-FEC).
QoS is usually evaluated in terms of traffic parameters related to the integrity of the received stream, such as the packet loss ratio (PLR), the error packet ratio, the duplicated packet ratio, and the out-of-sequence packet ratio, and in terms of latencies and delays, such as the average end-to-end delay and the jitter statistics (e.g., pth percentile). Nowadays, it is widely accepted that customer satisfaction should be evaluated in terms of the so-called Quality of Experience (QoE), which encompasses the quality of the multimedia stimuli, as perceived by the human sensory system, as well as the effectiveness of the rendered multimedia document in accomplishing a given task (e.g., correctness of the diagnosis performed by a physician looking at an encoded/decoded magnetic resonance image). The aim of this paper is to evaluate the effectiveness of recently introduced mechanisms for improving the QoS for real-time video delivery with respect to the QoE in the presence of significant packet losses. In particular, the effectiveness of AL-FEC techniques (proposed for inclusion in TV over IP broadcasting international standards) is analyzed in real-time unicast and multicast multimedia streaming.
While several works have addressed the analysis of the QoS gain achieved by adoption of AL-FEC, very few contributions concern the gain on QoE. In particular, in , the authors analyze the reduction of the wireless packet loss by the integration of different techniques including FEC. In , an AL-FEC scheme able to deal with packet loss during real-time delivering of High Definition Video (HDV) over IP networks is presented. The performed experimental tests show the effectiveness of the proposed approach in terms of protection of real-time HDV data and Quality of Service requirements. The QoS gain that can be obtained by applying multi-burst forward error correction for DVB-H streaming services with AL-FEC with respect to the conventional approach with MPE-FEC, based on a burst by burst protection, is reported in . In , an algorithm for the optimization of 1D-FEC for maximizing the QoE for IP video telephony as modeled by the ITU-T R G.1070 standard is considered.
We observe that the QoE accounts for the limitations of the human sensory system in perceiving differences in stimuli. These limitations are usually expressed by means of the Just Noticeable Distortion (JND) concept that defines the error profile whose associated distortion is perceived by a small (predefined) fraction of potential users only. On the other hand, since the measure tries to capture the perceivable differences between the original multimedia stream and the rendered one, the QoE metric also accounts for distortions introduced/removed by sensors, source coding algorithms, error concealment post-processing, and rendering devices. The modeling of the perceived distortion can be performed by means of the NTIA/ITS Video Quality Metric also adopted by the ANSI (ANSI T1.801.03-2003)  and by ITU (ITU-T J.144  and ITU-R BT.1683 ). A review of different methods for achieving forward error control in video communication is in .
In this contribution, we discuss the effectiveness of the state-of-the-art 1-D and 2-D AL-FEC schemes in terms of QoE applied to video communications over best-effort networks. In the first part of our study, we analyze the noise sources that can affect the quality of the video, and we illustrate some effects of transmission errors on the visual quality of the received video. After a short introduction on the QoS concept and an overview of the techniques employed for its evaluation in IP-based networks, an analysis of the effectiveness of the considered FEC systems is presented. As described in Section 4, the QoS is evaluated by means of video quality metrics. These scores depend on several factors, including quality parameters that should be defined ad hoc for the specific terminals and networks employed in the process of video delivery. With the performed analysis, it is possible to evaluate the overall effectiveness of AL-FEC techniques by also considering the distortion effects caused by the percentage of lost packets not recovered by the correction schemes. We show that the adoption of an objective model (see Section 5) makes it possible to automatically determine the parameters of the ITU video quality metric, without requiring expensive and time-consuming subjective experiments. As shown in Section 5, the use of FEC can improve the quality of video transmissions by means of systematic redundancy addition. Finally, the conclusions are drawn in Section 6.
2 Video transmission over IP
The amount of data needed for delivering video information is very large, thus posing severe challenges to both storage and transmission. To cope with these issues, the most widely adopted solutions are those compliant with the MPEG-x or H.26x standards. After compression, the video size is significantly reduced, although the resulting stream is very sensitive to transmission errors that may severely affect the efficiency of the decoding process. Moreover, as is well known, the IP communication infrastructure was not designed to support services with the severe delay constraints and transfer reliability required by audio/video distribution services. It is therefore crucial to analyze the causes of the errors, which are mostly related to the unreliable delivery of IP packets.
It is possible to identify several sources of error:
• packet drops can be caused by the congestion of router interfaces, which leads to exceeding the maximum queueing buffer capacity;
• packet discards can be performed by receivers for out-of-sequence packets, duplicated packets, and packets received with a delay exceeding the maximum acceptable one;
• packet discards can be performed by routers or receivers in case of reception of corrupted packets whose errors can be detected but cannot be recovered by FEC algorithms (e.g., packets with a wrong Cyclic Redundancy Check);
• packets can be corrupted by residual bit errors left uncorrected by FEC algorithms (typically in the presence of radio links, such as in WiMAX, UMTS, HSPA, and LTE);
• packet losses can be due to collisions in the physical layer;
• physical link failures can require the setup of alternative network paths.
As can be noticed from the previous analysis, at the receiver side, packet corruption or delay may have the same effect as packet loss. A good transmission quality level can be obtained by counteracting the different causes that affect the perceived data integrity, implementing prevention and protection mechanisms both at transport and application layer.
At the transport layer, it is possible to exploit traffic management mechanisms that take into account the type of service associated with the information flow. These QoS mechanisms enable the network operator to manage the effects of congestion on application performance as well as to provide differentiated services to selected network traffic flows or to selected users. However, those mechanisms are infeasible in a best-effort network.
End-to-end recovery techniques provide an alternative solution, counteracting the QoS degradation by partially recovering transmission errors: either AL-FEC or retransmission techniques can be applied to obtain the target QoS. While retransmission-based techniques imply a smaller amount of traffic, at least for small and moderate Packet Loss Rates, the larger delay associated with retransmission may be incompatible with real-time constraints. In addition, transmission of dedicated corrections may be unmanageable in case of multicast distribution. Since in IP-TV services the real-time requirement is fundamental, a feedback channel cannot be used. Therefore, the AL-FEC techniques are usually adopted. End-to-end recovery techniques operate on packet data streams and are usually adopted in unmanaged networks. These strategies are mainly based on two approaches:
• retransmission of lost packets;
• error correction by means of FEC coding schemes implemented in the application layer.
In the first case, the RTP-based retransmission strategy (RET), proposed in , is applied to unicast- and multicast-based services with different procedures. In the case of unicast-based services, following a packet loss detection, the user terminal sends a NACK (Negative ACKnowledgement) to the retransmission server. The server returns the requested packet (REP packet), over RTP, to the user terminal. In case of multicast-based services, the RTP retransmission procedure is more complex: while the media stream is multicast, the repair stream could be either multicast or unicast. In the unicast recovering procedure, the user terminal detects a packet loss, and it transmits a NACK, whose response is a unicast RET packet.
Usually multiple retransmission servers are located in the network. When a retransmission server detects the packet loss, it prevents the terminals from transmitting NACKs by sending an RTCP Feed Forward message in multicast mode. After the lost packet is received, the retransmission server sends it to the group of terminals in multicast mode. This mechanism requires that terminals wait a minimum time between packet loss detection and NACK sending.
As far as AL-FEC techniques are concerned, a general consensus seems to arise among international standardization bodies on the use of the coding scheme and the corresponding transmission strategies proposed by DVB . The proposed approach envisages the combined use of two FEC schemes: SMPTE 2022-1  and a Raptor code . FEC techniques are applied to the original RTP packets of a Single Program MPEG-2 Transport Stream, for generating additional RTP streams of FEC packets that contain the redundant information used by the receiver to reconstruct lost packets. The use of this approach has two drawbacks that need to be considered:
• additional bandwidth is needed to transmit FEC packets;
• additional latency is introduced by FEC operations both at the transmitter and the receiver stage.
Since the FEC correction packets are sent in a stream separate from the media data stream, receivers that are not able to manage additional repair streams can simply discard the FEC packets.
In this work, we consider the SMPTE 2022-1 AL-FEC technique. RTP media packets generate corresponding FEC packets whose payload contains redundant information used for recovering lost packets. The mechanism is based on the scheme defined in RFC 2733 , in which traditional FEC codes are applied to a set of consecutive packets. This scheme is applicable to any standardized transport format of audio/video information based on RTP encapsulation. To recover burst losses, SMPTE standard extends the RFC 2733 scheme to allow FEC codes to be applied to nonconsecutive packets. In this case, each FEC packet is periodically associated with selected packets. Hence, consecutive lost RTP media packets can be recovered from consecutive FEC packets. The complete process to generate FEC packets from the RTP media ones can be decomposed into two steps:
1. RTP packets are stored in row-major order in an interleaver matrix consisting of D rows and L columns. The repair packets are then computed as the bitwise XOR of the RTP packets (header and payload) along each column. Hence, the payload of the FEC packet for column k is evaluated on the D packets numbered nL+k (with 0 ≤ n ≤ D−1). This scheme involves L × D media packets, and the media packets protected by the same FEC packet are spaced L packets apart. It allows one lost packet per column to be recovered. Therefore, if a block of L × D packets is considered, it is possible to recover a burst of at most L consecutive lost packets.
2. A second set of repair packets can be generated as bitwise XOR of L consecutive media packets stored in each row. The recovery capability is increased because the second stream of FEC packets allows the receiver to recover every single packet loss arising along a row, as well as a subset of losses of multiple packets.
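The two steps above can be illustrated with a minimal Python sketch. It assumes that all media packets have the same length and it ignores the FEC header fields defined by the standard, so it shows only the XOR principle, not a compliant SMPTE 2022-1 encoder. The function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def column_fec(packets, L, D):
    """Step 1: one repair packet per column, XOR of packets n*L + k, n = 0..D-1."""
    return [reduce(xor_bytes, (packets[n * L + k] for n in range(D)))
            for k in range(L)]


def row_fec(packets, L, D):
    """Step 2: one repair packet per row, XOR of L consecutive packets."""
    return [reduce(xor_bytes, packets[n * L:(n + 1) * L]) for n in range(D)]


def recover(survivors, fec_packet):
    """Rebuild the single missing packet of a row or column from the
    surviving packets of that row or column and its repair packet."""
    return reduce(xor_bytes, survivors, fec_packet)


# Illustrative block: L = 4 columns, D = 3 rows, 8-byte dummy payloads.
L, D = 4, 3
media = [bytes([i + 1] * 8) for i in range(L * D)]
cols = column_fec(media, L, D)

# A burst of L consecutive losses (the whole second row) hits each column once,
# so every lost packet can be rebuilt from its column repair packet.
rebuilt = [recover([media[n * L + k] for n in (0, 2)], cols[k]) for k in range(L)]
print(rebuilt == media[L:2 * L])
```

The burst example shows why interleaving matters: the L consecutive losses fall in L different columns, and each column tolerates exactly one missing packet.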
It is possible to identify two levels of FEC protection:
• Level A (only-columns or 1-D FEC): devices are able to manage zero or one FEC stream;
• Level B (row-columns or 2-D FEC): devices are able to manage zero, one, or two simultaneous FEC streams.
The two FEC streams shall use the same IP destination address (either unicast or multicast) as the associated media stream, but they must address different UDP destination ports in order to have independent packet sequence numbering for the two streams. This allows backward compatibility with Level A receivers, which are able to manage only one FEC stream. If the media stream is sent to port N, the first FEC stream has to be sent to port N + 2, while the second FEC stream, when used, shall be sent to port N + 4.
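The port rule can be written as a small helper. This is only a sketch of the N, N + 2, N + 4 convention stated above; the function name, the dictionary keys, and the example port 5004 are assumptions made for illustration.

```python
def fec_ports(media_port, level="B"):
    """Return the UDP destination ports for the media stream and its FEC streams.

    level "A": one FEC stream (port N + 2).
    level "B": two FEC streams (ports N + 2 and N + 4).
    """
    if level not in ("A", "B"):
        raise ValueError("level must be 'A' or 'B'")
    if not 1 <= media_port <= 65535 - 4:
        raise ValueError("media port leaves no room for the FEC ports")
    ports = {"media": media_port, "fec_first": media_port + 2}
    if level == "B":
        ports["fec_second"] = media_port + 4
    return ports


print(fec_ports(5004))        # hypothetical media port
print(fec_ports(5004, "A"))
```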
In conclusion, the SMPTE 2022-1 standard defines two FEC configurations. The only-columns-FEC configuration is characterized by low overhead (OH), so it is used in case of low Packet Loss Rates. On the other hand, the rows-columns-FEC configuration is preferred in case of high Packet Loss Rates.
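The overhead of the two configurations follows directly from the matrix dimensions. The sketch below assumes one repair packet per column and, for the 2-D case, one more per row. It counts packets only, ignoring the extra FEC header bytes. The matrix sizes in the example are illustrative and are not claimed to be the ones used in the experiments.

```python
def fec_overhead(L, D, two_d=False):
    """Ratio of FEC packets to media packets for an L-column, D-row matrix.

    1-D (columns only): L repair packets per L*D media packets, i.e. 1/D.
    2-D (rows and columns): L + D repair packets, i.e. 1/D + 1/L.
    """
    fec_packets = L + (D if two_d else 0)
    return fec_packets / (L * D)


for L, D, two_d in [(10, 10, False), (4, 6, False), (10, 10, True)]:
    kind = "2-D" if two_d else "1-D"
    print(f"{kind} L={L} D={D}: overhead = {100 * fec_overhead(L, D, two_d):.2f}%")
```

Under these assumptions, a 1-D scheme with D = 10 costs 10%, D = 6 costs about 16.67%, and a 10 × 10 2-D scheme costs 20%.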
3 Quality evaluation of transmitted video
The measurement of the video quality degradations occurring in the transmission chain is fundamental for maintaining, controlling, and optimizing the system performance based on the quality of the video data. Thus, several studies have been carried out in the last two decades concerning the artifacts introduced by the compression and transmission of digital video.
When dealing with video communications, one of the main factors that has been considered is the quality perceived by the end user. The subjective evaluation of the quality of the received video depends on many elements related both to objective and to subjective issues. Video characteristics and content (e.g., size, smoothness, amount of motion, high frequencies content, spatial and temporal resolution), the actual network conditions and its evolution in time (e.g., congestion, packet losses, bit error rate, time delay, time delay variation, and out-of-sequence packet delivery), the condition of the viewer (e.g., display size and contrast, processing power, available memory, and viewing distance), and the status of the viewer (feeling, expectations, experience, and involvement) are only some of the factors influencing the human perception. Many of these factors cannot be predicted or measured since they totally depend on unpredictable causes.
The overall quality evaluation can be performed by means of objective and subjective scores. The latter methodology is intrinsically more accurate when the final user of the communication scheme is a human being. Unfortunately, to be of wide applicability, it requires a set of people to be interviewed on the perceived quality of several videos in a controlled environment, thus resulting in a time-consuming and expensive procedure. Objective quality metrics can be classified according to the amount of side information required to compute a given quality measurement. Using this criterion, three generic classes of objective metrics can be described:
• Full-reference metrics (FR): the evaluation system has access to the original media. FR metrics can also be subclassified according to the computational complexity required to attain the metric: simple or complex objective metrics. Simple objective metrics are attractive because they are computed in a fast way, while often resulting in low fidelity matching of perceived quality of images and video. Probably, the most relevant example of a simple objective metric is the PSNR, which is widely used to perform a fast (and simple) quality evaluation. The MSE (mean square error) is also in this class. Some examples include the works by Daly , Lubin , Watson et al. , Wolf et al. , KPN research (DMOS-KPN) , and Winkler . The Structural SIMilarity (SSIM) index is a widely used method for measuring the similarity between two images . A more complete survey of the available FR video quality metrics is presented in . The commonly adopted metric for assessing video quality is the NTIA-VQM model that has been developed by the National Telecommunications and Information Administration (NTIA). It uses objective parameters to measure perceptual effects such as blurring, block distortion, error blocks, noise, and unnatural motion. Given the good correlation with the subjective quality scores, the NTIA General VQM was adopted both as a national standard and as an international recommendation. The model returns values in the range from zero to one, which, respectively, denote no perceptual impairment and maximum perceived impairment.
• Reduced reference metrics (RR): the evaluation system has access to a small amount of side information regarding the original media. In general, certain features or physical measures are extracted from the reference and transmitted to the receiver as side information to help the evaluation of the video quality. Metrics in this class may be less accurate than the FR metrics, but they are less complex and make real-time implementations more affordable. Some examples include the works of Webster et al.  and Brétillon et al. .
• No-Reference metrics (NR): the evaluation system has no access to any side information regarding the original media. This kind of metric is the most promising in the video broadcast scenario, since the original images or video are in practice not accessible to end users. Requiring the reference video, or even limited information about it, becomes a serious impediment in many real-time applications. Designing effective NR metrics is a big challenge. Although human observers can usually assess the quality of a video without using the reference, creating a metric implementing such a task is difficult and, most frequently, it results in a loss of performance compared to the FR approach. Most of the proposed NR metrics evaluate the annoyance by detecting and estimating the strength of commonly found artifact signals. For example, the metrics by Wu et al. and Wang et al. estimate quality based on blockiness measurements [26,27], while the metric by Caviedes et al. takes into account measurements of five types of artifacts .
The RR and NR quality metrics may result in a less accurate evaluation, but only few parameters will be transmitted, requiring less bandwidth and allowing real-time, or quasi real-time, quality assessment.
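As an illustration of the simple full-reference metrics mentioned above, the following sketch computes the MSE and the PSNR between two frames of equal size. It is written in pure Python on nested lists of 8-bit luminance values, which is an assumption made to keep the example self-contained. A practical tool would operate on decoded frames with an array library.

```python
import math


def mse(ref, test):
    """Mean square error between two frames given as lists of pixel rows."""
    total, count = 0, 0
    for row_ref, row_test in zip(ref, test):
        for a, b in zip(row_ref, row_test):
            total += (a - b) ** 2
            count += 1
    return total / count


def psnr(ref, test, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical frames."""
    err = mse(ref, test)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)


original = [[16, 128], [200, 235]]
decoded = [[18, 128], [198, 235]]
print(f"PSNR = {psnr(original, decoded):.2f} dB")
```

Because PSNR weighs every pixel error equally, it cannot express the masking effects that the perceptual metrics try to model, which is why it often matches perceived quality poorly.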
A different approach for assessing the video quality has been introduced in the ITU-T Recommendation G.1070, Opinion model for video telephony applications . In this recommendation, an attempt is made to map the parameters defining the QoS into the QoE perceived by a user. Speech and video communication and their peculiarities for real-time multimedia services are considered. The ITU model has been validated through expensive and time-consuming subjective experiments. Furthermore, it requires the setting of several parameters that are strictly dependent on the particular application. In the following, we focus our attention on the video quality evaluation part.
According to this model, the video quality score can be computed using the video quality parameters defined for the terminals and networks employed in the video delivery. Video delivery delay, codec specifications, codec type and implementation, spatial resolution, key-frame interval, video packet loss rate, video frame rate, and video bit rate are considered as key elements in the model. In more detail, the video quality Vq can be computed as follows:
$$V_q = 1 + I_{coding} \exp\left(-\frac{P_{plr}}{D_{plr}}\right) \qquad (1)$$
where Icoding represents the impact of coding distortion on video quality (see  for more details), Dplr is the packet loss robustness factor, which expresses the degree of robustness of the video quality to packet loss, and Pplr represents the Packet Loss Rate. This relationship has been used in  for optimization and design purposes of a Level A 1-D FEC. The authors exploited the outcomes of the ITU-T Rec. G.1070 model for optimizing the trade-off between overhead and protection in video telephony services over IP networks.
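A minimal numerical sketch of this relationship is given below. It uses the exponential form published in ITU-T Rec. G.1070, Vq = 1 + Icoding · exp(−Pplr / Dplr), with the symbols defined above. The numerical values of Icoding and Dplr are hypothetical placeholders: in the recommendation they are derived from codec, bit rate, and frame rate through coefficient tables that are not reproduced here.

```python
import math


def video_quality(i_coding, p_plr, d_plr):
    """G.1070-style video quality score.

    i_coding: quality contribution of the codec without packet loss
    p_plr:    packet loss rate (same unit as d_plr, e.g. percent)
    d_plr:    packet loss robustness factor
    """
    return 1.0 + i_coding * math.exp(-p_plr / d_plr)


# Hypothetical parameters, chosen only to show the shape of the curve.
I_CODING, D_PLR = 3.5, 2.0
for plr in (0.0, 0.3, 1.0, 3.0, 9.0):
    print(f"PLR = {plr:4.1f}%  ->  Vq = {video_quality(I_CODING, plr, D_PLR):.2f}")
```

The score equals 1 + Icoding with no loss and decays toward 1 as the loss rate grows. A larger Dplr flattens the decay, which is how the model expresses a more robust codec or concealment strategy.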
We note that when AL-FEC is employed, the value of Packet Loss Rate to be used in Equation 1 corresponds to the residual Packet Loss Rate after packet recovery . Since Level A 1-D FEC is able to reconstruct one packet only, the Packet Loss Rate is reduced by an amount equal to the probability that, when the current packet is lost, the remaining L packets are correctly received. Thus we have:
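The statement above can be turned into a closed-form sketch. The expression below is a literal reading of that sentence under the assumption of independent losses: the original loss rate p is reduced by p(1 − p)^n, the probability that the current packet is lost while the n other packets of its FEC group arrive. The name n_others is hypothetical and stands for the quantity the text calls L. This is a reconstruction, not necessarily the exact expression used by the authors.

```python
def residual_plr_1d(p, n_others):
    """Residual loss rate after 1-D FEC with i.i.d. losses of probability p.

    A lost packet is repaired only if the n_others remaining packets of its
    FEC group are all received, so p is reduced by p * (1 - p) ** n_others.
    """
    return p - p * (1.0 - p) ** n_others


for plr in (0.003, 0.01, 0.03):
    print(f"PLR = {100 * plr:.1f}%  ->  residual = {100 * residual_plr_1d(plr, 10):.4f}%")
```

The result is the value of Pplr to feed into Equation 1 when 1-D FEC is active.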
Here we extend the aforementioned method to Level B 2-D FEC, for which a more complicated situation arises. In fact, in this case, the FEC scheme is able to recover not only single errors along both rows and columns but also multiple errors along rows and columns, as long as the linear equations involved by the missing packets constitute an independent set and their number is greater than or equal to the number of lost packets. Although an analytic expression for the residual Packet Loss Rate could be evaluated with the aid of computer-based tools (e.g., MATLAB) by considering the whole set of packet loss patterns for which the reconstruction matrix is nonsingular, here we propose an approximation, tested by the performed Monte Carlo runs, accounting for the following events: (a) all packets are received; (b) only the current packet has been lost; (c1, c2, c3) in addition to the current packet, additional packets are lost either along rows or along columns. In more detail:
(a) let Q be the probability that the current packet is received
(b) let P1 be the probability that only the current packet is lost, either along rows or along columns:
(c1) let P2 be the probability that in addition to the current packet located in the jth row and in the ith column, one more packet in the jth row and in the hth column is lost together with the packet in the mth row and in the ith column, while the packet in the mth row and in the hth column is received:
(c2) let P3 be the probability that in addition to the current packet, located in the jth row and in the ith column, one more packet in the jth row and in the hth column is lost together with two or more packets in the ith column, while the packet in the mth row and in the hth column is received:
(c3) let P4 be the probability that in addition to the current packet, located in the jth row and in the ith column, one more packet in the ith column and in the mth row is lost together with two or more additional packets in the mth row, while the packet in the mth row and in the hth column is received:
Then, the residual Packet Loss Rate can be approximated by the following expression:
In the following, the tests performed to validate the theoretical analysis of the 2-D FEC are reported.
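A Monte Carlo estimate of the residual Packet Loss Rate for 2-D FEC can be sketched as follows. The sketch assumes i.i.d. losses that also affect the FEC packets, and a simple iterative decoder that repeatedly repairs any row or column with exactly one missing media packet and a received repair packet. Such a decoder may recover slightly fewer loss patterns than the full linear-independence condition discussed above, so it should be read as an illustration and not as the procedure used for the reported runs.

```python
import random


def residual_loss_2d(p, L, D, blocks, seed=0):
    """Estimate the residual media packet loss rate after 2-D FEC decoding."""
    rng = random.Random(seed)
    lost_total = 0
    for _ in range(blocks):
        # lost[r][c] is True when the media packet in row r, column c is missing.
        lost = [[rng.random() < p for _ in range(L)] for _ in range(D)]
        row_fec_ok = [rng.random() >= p for _ in range(D)]
        col_fec_ok = [rng.random() >= p for _ in range(L)]
        progress = True
        while progress:
            progress = False
            for r in range(D):
                missing = [c for c in range(L) if lost[r][c]]
                if row_fec_ok[r] and len(missing) == 1:
                    lost[r][missing[0]] = False
                    progress = True
            for c in range(L):
                missing = [r for r in range(D) if lost[r][c]]
                if col_fec_ok[c] and len(missing) == 1:
                    lost[missing[0]][c] = False
                    progress = True
        lost_total += sum(sum(row) for row in lost)
    return lost_total / (blocks * L * D)


# Illustrative 10 x 10 matrix; the matrix size is an assumption.
for plr in (0.03, 0.06, 0.09):
    print(f"PLR = {plr:.2f}  ->  residual = {residual_loss_2d(plr, 10, 10, 2000):.5f}")
```

The loop over rows and columns is repeated because a packet repaired along a row may leave a column with a single remaining loss, which can then be repaired in the next pass.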
4 Proposed approach and simulation environment
To assess the effectiveness of AL-FEC TV streaming protection with respect to the QoE, several tests have been performed. In more detail, a set of videos has been protected with AL-FEC techniques, and the quality of the resulting videos has been compared to the quality obtained when no recovery strategy is applied. Real media streams encoded at different bit rates have been considered; more specifically, we used videos at standard definition resolution (720 × 544 pixels), encoded at 2 and 4 Mbps. The selected videos present different content (news and cartoon) to allow the verification of the effectiveness of AL-FEC techniques depending on the video content.
In order to perform the mentioned analysis, an experimental setup has been developed, as shown in Figure 1. It consists of one streaming source, a network segment, and two receivers. The transmitter side consists of a video server, a USB to DVB-ASI (Asynchronous Serial Interface) converter, and a DVB-ASI to IP gateway. The video server transmits CBR (Constant Bit Rate) Single Program MPEG-2 TSs on the USB interface. After conversion from USB to ASI, an ASI to IP gateway encapsulates the MPEG-2 TS in the RTP/UDP/IP protocol stack, generates the corresponding SMPTE 2022-1 FEC packets, and transmits both media packets and FEC packets on a 100/1000 Base-T Ethernet interface. At the transmission side, several FEC overheads have been selected. Among these, we report the results obtained for 10, 16.67, and 20%. The network segment has been modeled by means of an open-source network emulator: NETEM (NETwork EMulator). The emulator has been used for introducing packet losses in the incoming streams (data and protection streams), in order to simulate the behavior of a real network based on the best-effort paradigm. Each considered media stream with a selected AL-FEC overhead has been processed in order to simulate a set of increasing Packet Loss Rates. In the first receiver, an IP to DVB-ASI video gateway receives the media packets and tries to recover the packet losses by means of the received FEC packets. The recovered TS is sent to an ASI interface as input to a PC that records the recovered data stream. The second receiver receives the RTP media packets from the network, and it discards the FEC packets. The presence of two receivers makes it possible to compare the decoded video streams obtained with and without the FEC techniques. After both streams are received, their quality is evaluated through the pixel-wise comparison PSNR, the DMOS-KPN, and the NTIA-VQM FR video quality metrics.
Figure 1. Block diagram of the experimental setup.
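The random and burst loss patterns introduced in the network segment can be illustrated with the following sketch. It is not the emulator's implementation: it assumes a Bernoulli process for i.i.d. losses and a simple two-state Gilbert chain for burst losses, in which every packet is dropped while the chain is in the bad state. Function and parameter names are hypothetical.

```python
import random


def iid_losses(n, p, seed=0):
    """Independent losses: each packet is dropped with probability p."""
    rng = random.Random(seed)
    return [rng.random() < p for _ in range(n)]


def gilbert_losses(n, p_gb, p_bg, seed=0):
    """Burst losses from a two-state chain (good: deliver, bad: drop).

    p_gb: probability of moving from the good to the bad state
    p_bg: probability of moving from the bad to the good state
    Long-run loss rate: p_gb / (p_gb + p_bg); mean burst length: 1 / p_bg.
    """
    rng = random.Random(seed)
    bad = False
    pattern = []
    for _ in range(n):
        if bad:
            bad = rng.random() >= p_bg
        else:
            bad = rng.random() < p_gb
        pattern.append(bad)
    return pattern


burst = gilbert_losses(100_000, p_gb=0.01, p_bg=0.25)
print(f"measured burst loss rate: {sum(burst) / len(burst):.4f}")
print(f"measured i.i.d. loss rate: {sum(iid_losses(100_000, 0.03)) / 100_000:.4f}")
```

Two patterns with the same average loss rate can stress a FEC scheme very differently: bursts longer than L packets defeat the column code even when the average rate is low.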
5 Experimental results
In the following, the obtained results are reported and discussed. In Figure 2, a visual comparison of the effectiveness of AL-FEC techniques is presented. Two frames of size 720 × 544 pixels, extracted from the videos 'News' and 'Cartoon' of length 60 s, encoded at 2 Mbps, are shown before (Figure 2a,d) and after (Figure 2b,c,e,f) the coding-transmission-decoding procedure. The considered overhead is equal to 16.67% and the Packet Loss Rate is equal to 3%. As can be noticed in Figure 2c,f, in both cases, the use of the FEC technique allows the complete recovery of the original data.
Figure 2. Comparison of visual effects on decoded frames for the videos 'News' and 'Cartoon' at 2 Mbps with and without FEC technique. The overhead is equal to 16.67%, and PLR is equal to 3%.
Other samples from the same videos are shown in Figure 3. The original frames are shown in Figure 3a,d, while the results of the reconstruction phases are shown in Figure 3b,e when no FEC protection is applied and in Figure 3c,f in the presence of FEC protection. The considered overhead is equal to 20% and PLR is equal to 3%. Also in this case, the FEC technique is able to recover all lost packets, and the quality of the recovered streams is high. In both the aforementioned cases (OH = 16.67% with PLR = 3% and OH = 20% with PLR = 6%), the perceived quality is confirmed by the PSNR values.
Figure 3. Comparison of visual effects on decoded frames for the videos 'News' and 'Cartoon' at 2 Mbps with and without FEC technique. The overhead is equal to 20%, and PLR is equal to 3%.
As is well known, there is a limit to the effectiveness of the AL-FEC procedure. When the packet loss is higher than a threshold, the recovery procedure fails. In this case, annoying artifacts can occur in several parts of the video, and their severity can range from a just noticeable artifact to an extremely annoying one, up to the point where the decoder cannot reconstruct some frames of the video. The content of the unrecovered lost packets determines the perceived annoyance of the resulting artifacts. In Figure 4, the artifacts created by packet loss in the presence of FEC protection are shown. The original frames are in Figure 4a,d, while the results of the reconstruction phases are in Figure 4b,e when no FEC protection is applied and in Figure 4c,f in the presence of FEC protection. The considered overhead is equal to 20%, and PLR is equal to 9%. As can be noticed, the quality of the frame after the coding-transmission-decoding procedure is lower than in the previously analyzed cases, and this behavior corresponds to a decrease in the PSNR values.
Figure 4. Comparison of visual effects on decoded frames for the videos 'News' and 'Cartoon' at 2 Mbps with and without FEC technique. The overhead is equal to 20%, and PLR is equal to 9%.
It is worth noticing that the frame content influences the perceived effectiveness of the FEC procedure. In fact there is a perceivable quality difference in Figure 4c,f. In the first case, the artifacts are noticeable and annoying while in the second one, being localized in the bottom-right corner of the figure, the artifacts are masked.
In Tables 1 and 2, the results for three significant AL-FEC overheads (10, 16.67, and 20%) are reported. The selected Packet Loss Rates have been chosen to verify the limits of the FEC technique performances. In particular for the 1D-FEC configuration, the considered PLRs are 0.3, 1, and 3%, and for the 2D-FEC configuration, the considered PLRs are 3, 6, and 9%. As can be noticed, small PLRs are completely recovered by FEC techniques. On the other hand, as the PLR increases, the performance of the FEC technique decreases. The reported 'Packet Not Recovered Rate' (PNRR) is the percentage of the information lost after the recovering procedure.
Table 1. Packet Loss Rate (PLR) and Packet Not Recovered Rate (PNRR) for different overheads for the video 'News' encoded at 2 and 4 Mbps after the FEC recovery procedure
Table 2. Packet Loss Rate (PLR) and Packet Not Recovered Rate (PNRR) for different overheads for the video 'Cartoon' encoded at 2 and 4 Mbps after the FEC recovery procedure
For each received and reconstructed video, three full-reference quality metrics have been computed. To be effective, full-reference metrics need perfect synchronization between the original and the reconstructed frames. To this aim, a synchronization loss estimation procedure has been designed and implemented. Tables 3 and 4 report the average VQM scores for the selected Packet Loss Rates and overheads for the 'News' and 'Cartoon' sequences, with and without the FEC recovery system. As can be noticed, the use of AL-FEC techniques improves the perceived quality of the rendered videos as long as the PLR is within the error correction capability allowed by the selected overhead. This positive trend is confirmed by the HVS-inspired DMOS-KPN metric (cf. Tables 5 and 7) and by the pixel-wise PSNR metric (Tables 6 and 8). The distributions of the VQM scores are analyzed in Figures 5, 6, 7, 8, 9 and 10. The VQM metric returns values in the range from zero to one, where '0' corresponds to 'Very High Quality' and '1' to 'Very Low Quality'. In the following, we report some of the results collected during the performed tests.
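The need for synchronization before computing a full-reference metric can be illustrated with a minimal sketch. It is not the synchronization procedure used in this work, whose details are not given here; it simply searches for the frame offset that minimizes the mean squared error between the two sequences and then computes the PSNR on aligned frames. Frames are represented as flat lists of 8-bit samples, and all function names are hypothetical.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, peak=255.0):
    """PSNR in dB; infinite when the two frames are identical."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(peak * peak / err)


def estimate_offset(reference, received, max_shift=5):
    """Return the shift s for which received[i] best matches reference[i + s].

    Exhaustive search over a small window (illustrative assumption).
    """
    best_shift, best_err = 0, math.inf
    for s in range(-max_shift, max_shift + 1):
        pairs = [(reference[i + s], received[i])
                 for i in range(len(received))
                 if 0 <= i + s < len(reference)]
        if not pairs:
            continue
        err = sum(mse(r, d) for r, d in pairs) / len(pairs)
        if err < best_err:
            best_shift, best_err = s, err
    return best_shift


if __name__ == "__main__":
    # Synthetic 16-sample frames; the receiver missed the first two frames.
    reference = [[(k * 37 + j * 11) % 256 for j in range(16)]
                 for k in range(10)]
    received = reference[2:]
    shift = estimate_offset(reference, received)
    print("estimated offset:", shift)
    print("PSNR of first aligned pair:", psnr(reference[shift], received[0]))
```

Without such an alignment step, a frame-by-frame comparison would penalize a temporal shift as if it were a spatial distortion, which is why the metrics are computed only after synchronization.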
Table 3. Average values of the VQM scores for different Packet Loss Rates and overheads for the 'News' sequence with and without the FEC recovery system
Table 4. Average values of the VQM scores for different Packet Loss Rates and overheads for the 'Cartoon' sequence with and without the FEC recovery system
Table 5. Average values of the DMOS-KPN scores for different Packet Loss Rates and overheads for the 'News' sequence with and without the FEC recovery system
Table 6. Average values of the PSNR (dB) scores for different Packet Loss Rates and overheads for the 'Cartoon' sequence with and without the FEC recovery system
Table 7. Average values of the DMOS-KPN scores for different Packet Loss Rates and overheads for the 'Cartoon' sequence with and without the FEC recovery system
Table 8. Average values of the PSNR (dB) scores for different Packet Loss Rates and overheads for the 'News' sequence with and without the FEC recovery system
Figure 5. NTIA-VQM: objective quality evaluation of received videos with and without the use of AL-FEC for the video 'News' (2 Mbps) with overhead 10% and PLR 1%.
Figure 6. NTIA-VQM: objective quality evaluation of received videos with and without the use of AL-FEC for the video 'Cartoon' (2 Mbps) with overhead 10% and PLR 1%.
Figure 7. NTIA-VQM: objective quality evaluation of received videos with and without the use of AL-FEC for the video 'News' (4 Mbps) with overhead 20% and PLR 6%.
Figure 8. NTIA-VQM: objective quality evaluation of received videos with and without the use of AL-FEC for the video 'Cartoon' (4 Mbps) with overhead 20% and PLR 6%.
Figure 9. NTIA-VQM: objective quality evaluation of received videos with and without the use of AL-FEC for the video 'News' (4 Mbps) with overhead 20% and PLR 9%.
Figure 10. NTIA-VQM: objective quality evaluation of received videos with and without the use of AL-FEC for the video 'Cartoon' (4 Mbps) with overhead 20% and PLR 9%.
The VQM scores for the videos 'News' and 'Cartoon' (2 Mbps) are analyzed in Figures 5 and 6 for the case OH = 10% and PLR = 1%. Figures 5a and 6a show the case in which the FEC protection is not used, while the histograms in Figures 5b and 6b represent the case in which the FEC protection is applied. Without FEC protection, the VQM scores are spread over the interval [0, 1], whereas with FEC the scores show a peak at the value '0', which implies that the objective quality is significantly increased. A similar trend is shown by the analysis performed for the streams 'News' and 'Cartoon' encoded at 4 Mbps for the case OH = 20% and PLR = 6% (Figures 7 and 8) and for the case OH = 20% and PLR = 9% (Figures 9 and 10).
In Figures 11 and 12, the experimental NTIA Video Quality Metric, VqNTIA, obtained by re-normalizing the VQM with respect to the ITU scores, is plotted versus the Packet Loss Rate for different overheads for the encoding at 2 Mbps (Figures 11a and 12a) and at 4 Mbps (Figures 11b and 12b). These results indicate that 2D-FEC is more effective in recovering the packet losses. Nevertheless, because of its inherently lower overhead, 1D-FEC is preferable at low PLR, where the gain of 2D-FEC is reduced.
Figure 11. Normalized NTIA-VQM versus ITU scores for the video 'News' encoded at 2 Mbps (a) and 4 Mbps (b); normalized NTIA-VQM versus net PLR percentage when the FEC protection is not applied (c); normalized NTIA-VQM versus net PLR percentage when the FEC protection is applied (d).
Figure 12. Normalized NTIA-VQM versus ITU scores for the video 'Cartoon' encoded at 2 Mbps (a) and 4 Mbps (b); normalized NTIA-VQM versus net PLR percentage when the FEC protection is not applied (c); normalized NTIA-VQM versus net PLR percentage when the FEC protection is applied (d).
To verify the applicability of ITU-T Rec. G.1070 when error concealment is implemented in the decoder, we have compared the overall (2 and 4 Mbps) values of [VqNTIA - 1] with the residual Packet Loss Rate at the output of the AL-FEC decoder. In particular, Figures 11c and 12c report the results obtained when no FEC protection is used, while Figures 11d and 12d show the impact of the FEC schemes. The fittings are obtained by means of an exponential regression between the objective quality metric score and the net PLR at the output of the channel decoder, as shown in the figures. The high percentage of data variance explained by the model confirms the validity of the ITU approach in accounting for packet losses. We remark that the proposed objective model makes it possible to estimate the ICoding and Dplr parameters that characterize the ITU model without carrying out subjective tests.
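The exponential regression described above can be sketched as follows. The sketch assumes the ITU-T G.1070 form Vq = 1 + ICoding · exp(−PLR / Dplr), which becomes linear in the PLR after taking the logarithm of Vq − 1, so that an ordinary least-squares line yields both parameters. The function names and the sample data are hypothetical and are not the measurements of this work.

```python
import math


def fit_exponential(plr, vq):
    """Estimate (i_coding, d_plr) in vq = 1 + i_coding * exp(-plr / d_plr).

    Uses a least-squares line on log(vq - 1); requires every vq > 1.
    """
    xs = list(plr)
    ys = [math.log(v - 1.0) for v in vq]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals -1 / d_plr
    intercept = mean_y - slope * mean_x  # equals log(i_coding)
    return math.exp(intercept), -1.0 / slope


def r_squared(plr, vq, i_coding, d_plr):
    """Fraction of the variance of vq explained by the fitted model."""
    pred = [1.0 + i_coding * math.exp(-x / d_plr) for x in plr]
    mean_v = sum(vq) / len(vq)
    ss_res = sum((v - p) ** 2 for v, p in zip(vq, pred))
    ss_tot = sum((v - mean_v) ** 2 for v in vq)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic, noise-free data (PLR in percent); illustration only.
    plr = [0.0, 1.0, 2.0, 4.0, 8.0]
    vq = [1.0 + 3.5 * math.exp(-x / 2.0) for x in plr]
    i_coding, d_plr = fit_exponential(plr, vq)
    print(i_coding, d_plr, r_squared(plr, vq, i_coding, d_plr))
```

Fitting in the logarithmic domain weights the low-quality points more heavily than a direct nonlinear fit would; a nonlinear least-squares routine could be substituted if that weighting is not desired.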
The fitting of the experimental data with the analytic expression of Equation 1 makes it possible to extend the applicability of the ITU model to AL-FEC performance over a wider range of PLRs. In fact, Equation 1, combined with Equation 2 and with Equation 8, allows the computation of MOS as a function of the redundancy, as reported in the following equations:
Figure 13 shows the benefits of using a 2D-FEC approach with respect to the 1D case. When a 20% overhead is employed, the 2D-FEC outperforms the 1D-FEC by one MOS level for PLRs in the range from 5 to 10%. For PLR values smaller than 1%, the performance gain is not significant.
Figure 13. Video Quality Score versus PLR for single column (1D) and row-column (2D) FEC. From left to right, the curves represent: 1D: 5, 10, 12.5, 16.6, 20%; 2D: 20, 30, 40%.
In this work, an evaluation of the effectiveness of AL-FEC schemes in video communications over unreliable channels has been presented. The experimental tests have been performed on four video sequences, each 60 s long, with different content and bit rate (2 and 4 Mbps). The analysis has been carried out through both theoretical and experimental tests. The three FR metrics considered, NTIA-VQM, PSNR, and DMOS-KPN, can be efficiently used for measuring the effects of PLR variation.
Based on the performed analysis, the quality assessment and the monitoring system can be optimized. The adoption of an objective model makes it possible to determine the parameters of the ITU metric automatically, without requiring costly subjective experiments. We have also verified the applicability of the NTIA-VQM model in the presence of an error concealment block at the receiver. The analytical approximation of the residual PLR, also for the 2D-FEC, makes it possible to evaluate the perceived quality as a function of the inserted redundancy and of the actual PLR in the network. Therefore, the ITU model parameters can be extended to the 2D case as well.
7 Competing interests
The authors declare that they have no competing interests.