The present invention generally relates to packet recovery for use with packet networks, and relates in particular to partial recovery of lost packets and their use in applications that can tolerate partial packet loss, such as audio and video media.
The unprecedented demand for data communications over unreliable systems, such as the Internet and wireless networks, has made linear block codes (LBC) increasingly popular within the past decade. In particular, modern Internet and wireless applications have employed these codes for unicast and multicast transmission of realtime multimedia and other non-realtime data types over erasure channels. These applications range from unicast telephony and receiver-driven multicast of multimedia data using traditional Reed-Solomon (RS) codes to reliable multicast of data files using low-density codes.
In principle, one can classify the type of applications that employ linear block codes into realtime and non-realtime. Each of these application types has its own requirements, constraints, and also flexibilities that can be exploited for a successful deployment of block codes over erasure channels. For example, a successful usage of the flexibilities and requirements of non-realtime applications that demand a reliable transmission of large data files to a large number of receivers has resulted in the recently developed digital fountain approach.
On the other hand, the majority of recent proposals for the recovery of lost packets encountered in realtime multicast and unicast applications are based on traditional RS codes. Some of these approaches are based on employing feedback information regarding the channel condition in realtime. Meanwhile, there are several key requirements and flexibilities imposed/provided by realtime applications that have not been fully considered/utilized when designing block codes for these applications.
One disadvantage of classical linear block codes, such as Reed-Solomon (RS)-based codes, is that they fail to recover any lost message symbols when the total number of losses exceeds the number of redundant symbols. Under adverse channel conditions, losses can often exceed the redundancy. As a result, RS-based codes can fail catastrophically when used with realtime multimedia applications under adverse conditions. Accordingly, multimedia stream generators typically take a conservative approach and transmit a high number of redundant packets. This increase in packet transmission contributes to packet network congestion, thus exacerbating the adverse conditions. The result is fierce competition between multimedia content providers for network bandwidth resources. Thus the need remains for a packet recovery technique that avoids catastrophic failure, thereby reducing the need for redundant packet transmission and conserving packet network resources. The present invention fulfills this need.
In accordance with the present invention, a coding system and method employs a Partial Reed Solomon (PRS) code profile of order s having an s-partition on a set of parity symbols and a (s+1)-partition on a set of message symbols. In other aspects, an adaptive forward error correction scheme keeps block length and transmission rate fixed, while changing an underlying code profile based on received feedback information about a probability of erasure p from a channel.
The Partial Recovery Codes of the present invention are advantageous over previous recovery codes because they exhibit improved performance over classic Reed Solomon codes when the coding rate is close to channel capacity, and avoid catastrophic failure in the case where the total losses exceed the redundant symbols. Partial packet recovery is accomplished for real-time multimedia even where the number of losses exceeds the number of redundant packets. These Partial Recovery Codes facilitate a partial recovery of lost symbols, and are specifically designed and optimized for real-time multimedia communication over packet-based erasure channels. Their efficient design is facilitated by lowering the density and increasing irregularity. Accordingly, based on the constraints and flexibilities of realtime applications, a performance measure is designed, message throughput (τm), which is suitable for these applications. This measure differentiates the notion of optimum codes for the target multimedia applications which can tolerate some packet loss, as compared to performance measures that are used for non-realtime applications that require guaranteed reliability. Based on the proposed throughput measure, the advantages of lowering the density of a code for near capacity performance are combined with the high decoding efficiency of Reed Solomon (RS) codes, in order to design optimum partial Reed-Solomon (PRS) codes. An example of a Binary Erasure Channel (BEC) demonstrates that at near-capacity coding rates, appropriate design of a PRS code can outperform an RS code. This analysis and optimization is extended for a general BEC over a wide range of channel conditions. Moreover, as compared with RS codes, the proposed PRS codes provide a significantly improved graceful degradation when the number of losses exceeds the number of parity symbols within the code block. This is a highly desirable feature for realtime multimedia communication. 
Video simulations carried out using H.264 compressed video further emphasize the utility of this graceful degradation. Finally a paradigm is set for a unique rate constrained adaptive FEC scheme based on PRS codes. This scheme is compared with other adaptive rate constrained schemes based on RS codes.
Further areas of applicability of the present invention will become apparent from the detailed description provided hereinafter. It should be understood that the detailed description and specific examples, while indicating the preferred embodiment of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
The present invention will become more fully understood from the detailed description and the accompanying drawings, wherein:
The following description of the preferred embodiment(s) is merely exemplary in nature and is in no way intended to limit the invention, its application, or uses.
The present invention takes into consideration key requirements and flexibilities of multimedia applications, in general, and realtime compressed video transmission in particular to introduce and design a family of linear block codes that can outperform traditional RS codes. Specifically, a new family of codes is introduced, referred to herein as Partial Reed Solomon (PRS) codes. The proposed PRS codes can be considered a lower-density version of the classical RS codes. It has been observed that low density codes can give near channel capacity performance. As the coding rate approaches channel capacity, lowering the density of a code becomes a necessity, in particular, for realtime applications. Meanwhile, the decoding efficiency (or recovery ratio) offered by an RS code is greater than any other linear code. Thus, the proposed PRS codes are based on a framework that combines the advantage of lowering density with the high decoding efficiency of traditional RS codes.
The rest of the application is structured as follows. In Section II, the constraints and the flexibilities associated with realtime multimedia transmission are identified. Based on this, a performance measure is identified that is more suitable for FEC schemes that are targeted for realtime multimedia applications. In Section III, the proposed family of PRS linear block codes is introduced; as explained further in the application, each code in this family is characterized by a certain order. In Section IV, optimal PRS codes are identified for a Binary Erasure Channel and it is shown that the optimal PRS codes are of order one. This optimum class of PRS codes is referred to as PRS-1 codes. In Section V, some further optimality analysis of the PRS-1 codes is provided, together with results exhibiting the performance of these codes under various channel conditions. In Section VI, the performance of PRS-1 codes is compared with that of RS codes. In Section VII, the performance of the two codes is compared in terms of graceful degradation. In Section VIII, results of actual video simulations and a subjective comparison of media quality supported by the two coding schemes are provided. In Section IX, a case is made for designing adaptive FEC schemes based on PRS-1 codes for time-varying channels.
II. Realtime Applications: Constraints and Flexibilities
A fundamental requirement of any realtime application is the transmission of message data at a minimum desired rate R. In general, this minimum rate should be maintained to achieve a certain quality. The minimum rate requirement translates to the transmission of a minimum number of K message symbols within an N-symbol code block: R=K/N. Consequently, one of the constraints in the design of linear block codes for realtime applications is the usage of a maximum number (N−K) of parity symbols within the N-symbol block.
In general, the performance of linear block codes improves with larger values of the code block size N. However, realtime applications can employ a maximum number N depending on the particular application. For example, non-interactive multimedia streaming applications can use larger values of N than interactive (e.g., telephony) applications. In either case, there is a maximum number for the code block size N that needs to be adhered to. Therefore, unlike non-realtime applications that may have the flexibility in selecting N and R=K/N, realtime applications, in general, have to employ (adhere to) a block code with a pair-constraint (N, K).
Performance criteria for LBC codes, which are used for non-realtime data, are not always suitable for realtime applications. For example, a non-realtime LBC code can be evaluated based on the number of symbols needed to perfectly recover all of the original message symbols. In general, for realtime applications, perfect recovery, and consequently perfect reconstruction, of the original message symbols is not a hard requirement (as explained further below). Meanwhile, it is crucial to deliver the realtime application layer with the maximum number of the message symbols that are transmitted by the system. Therefore, the probability of a message symbol loss (after channel decoding) is a key performance parameter. This probability is denoted by pm. Hence, the parameter τm=(1−pm), which represents the probability of receiving a message symbol by the realtime application (after channel decoding), is a measure of the end-to-end message symbol throughput. One of the key objectives of the family of PRS codes that are proposed in this application is to maximize this throughput measure τm. (For the remainder of this application, τm is referred to as the message throughput.)
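The message throughput defined above can be estimated empirically. The following is a minimal simulation sketch, not the applicant's method. It assumes a systematic (N, K) code with the maximum-distance-separable property of RS codes (all erasures are recovered whenever at most N−K symbols of the block are erased), a memoryless erasure channel with erasure probability p, and that message symbols which arrive intact are still delivered when decoding fails. The function name and parameters are hypothetical.

```python
import random

def simulate_message_throughput(n: int, k: int, p: float,
                                blocks: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of tau_m = 1 - p_m for a systematic (n, k) MDS code.

    Assumption: symbols 0..k-1 of each block are message symbols and the
    remaining n - k symbols are parity symbols.
    """
    rng = random.Random(seed)
    parity = n - k
    delivered = 0
    for _ in range(blocks):
        # Each symbol is erased independently with probability p.
        erased = [rng.random() < p for _ in range(n)]
        if sum(erased) <= parity:
            # Decoding succeeds: every message symbol reaches the application.
            delivered += k
        else:
            # Decoding fails: only message symbols that were not erased arrive.
            delivered += k - sum(erased[:k])
    return delivered / (blocks * k)

if __name__ == "__main__":
    for p in (0.05, 0.10, 0.15):
        print(p, simulate_message_throughput(100, 88, p))
```

With no parity symbols (N equal to K) the estimate should approach 1−p, which is a convenient sanity check on the simulation.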
Moreover, based on a variety of multimedia processing, compression, and scalable coding techniques, a wide range of practical application-layer error resilience and concealment methods can be used to compensate for lost data. These techniques, however, usually work well only when the number of losses is limited to a certain threshold. In other words, practical multimedia error concealment and resilience methods usually become useless when the number of losses is beyond an application-dependent threshold. Consequently, it is crucial for LBC codes to perform well when the number of lost message symbols is large by recovering the majority of these lost message symbols. Meanwhile, although it is desirable, it is less crucial for these codes to provide perfect recovery when the number of losses (before or after channel decoding) is very small (e.g., one or a few symbols), due to the maturity of powerful multimedia processing techniques. Therefore, codes that maintain very low end-to-end (effective) message losses are more desirable than codes that provide perfect recovery under good channel conditions (e.g., under very low loss probability) but provide a low recovery rate under adverse channel conditions. This desirable feature highlights one of the key problems with current LBC codes that are used widely for realtime video. It is well known, for example, that when an RS code block experiences a number of losses that is larger than the number of parity symbols, the code is incapable of recovering any of the lost message data. Experiencing a number of losses that is larger than the number of parity symbols is quite feasible over channels with time-varying characteristics (e.g., the Internet and wireless networks), even if, “on average”, the message rate R is lower than the channel capacity. This is particularly true when the message rate R is close to (but may still be lower than) the channel capacity.
Moreover, because (a) a large amount of data is inherently needed for representing multimedia (in particular video) signals, and/or (b) the compressed representation of these signals is normally encoded ahead of time at a certain minimum rate that cannot be further reduced in realtime by the sender, multimedia applications quite often operate very close to channel capacity. This phenomenon is quite common for a wide range of applications such as popular streaming applications on the web, IP multimedia telephony, and IP multicast.
Consequently, one of the main objectives of the proposed work is the design of LBC codes that are capable of achieving high message throughput τm when the rate is close to (but still lower than) channel capacity. Traditional RS codes exhibit a very sharp degradation in their ability to recover lost packets around the point L=N−K, where L is the number of losses in a code block. In contrast, as shown in this application, the proposed PRS codes provide a graceful transition in their lost-message-recovery capabilities while maintaining a very high message throughput τm over this transition point and beyond.
III. Partial Reed Solomon Codes
Before introduction of the family of PRS codes, the role density plays in the design of codes is first discussed. This will clearly outline the significance of the design of PRS codes.
It is well known that all linear block codes can be represented by bipartite graphs, where the vertex set can be partitioned into message nodes and parity nodes. Codes based on GF(2) can be represented by such graphs. Similarly, codes in GF(q), where q>2 represents the order of the underlying field, can also be represented by graphs, but in this case the edges are weighted by elements from GF(q). Thus, it is possible to represent an RS code by an edge-weighted bipartite graph as well. However, in this case, the graph representing an RS code would be a full-density (i.e., complete) graph. It should be noted that the code-graph can be represented such that the message and the redundancy symbols belong to the same partition and all the nodes in the parity-check partition are made equal to zero. In the above case, however, the assumed graph representation is such that the redundancy and the message symbols do not belong to the same partition, and thus the checks can have actual non-zero values. Only for such a representation can the RS-graph be termed full-density.
The decoding of a codeword transmitted over an erasure channel is equivalent to solving a system of equations, represented by the parity check equations. The erased symbols represent the unknowns in the system of equation. Thus for a given (N,K) and a given density, as the probability of channel erasure p increases, the average number of unknowns in each parity check equation also increases. Also, as the number of unknowns in a parity check equation increases, the probability of that equation being successfully solved decreases. Due to this, when the coding rate is near (or above) channel capacity, it becomes necessary to reduce the number of message symbols that are protected by each parity symbol. This is equivalent to reducing the density of the code.
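The effect of density described above can be illustrated numerically. The sketch below assumes a sub-code in which r parity symbols protect k1 message symbols and in which the resulting system of equations is solvable whenever at most r of the k1 + r symbols are erased (the property RS codes have). The function names are hypothetical, and the channel is assumed to erase symbols independently with probability p.

```python
from math import comb

def mean_unknowns(k1: int, r: int, p: float) -> float:
    """Average number of erased symbols (unknowns) among the k1 + r symbols."""
    return p * (k1 + r)

def prob_fully_solvable(k1: int, r: int, p: float) -> float:
    """Probability that at most r of the k1 + r symbols are erased."""
    n = k1 + r
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(r + 1))

if __name__ == "__main__":
    r, p = 12, 0.10
    # Fewer protected message symbols per parity symbol means fewer unknowns
    # and a higher chance that the equations can be solved.
    for k1 in (88, 60, 40):
        print(k1, mean_unknowns(k1, r, p), prob_fully_solvable(k1, r, p))
```

Holding r and p fixed, reducing k1 lowers the expected number of unknowns and raises the probability that the sub-code can be solved, which is the motivation given for lowering the density.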
The iterative algorithms used for erasure decoding of current LDPC codes limit each sub-step of an iteration to solving a single equation with a single unknown. This constraint has influenced the design of most of the current LDPC codes. If a code is based on GF(2), then the above constraint is not a severe one. But for codes based on GF(q), the above constraint can lead to an over-compromise of the decoding efficiency for reduced time-complexity. Moreover, it has been shown that codes designed on fields of order higher than 2 can exhibit a much-improved performance in terms of error recovery. In addition, since a packet loss causes a burst of bit erasures, a code based on larger symbols can facilitate a better data recovery. Since a key objective of the effort is to maximize the message throughput (i.e., lost-symbol recovery), code-design is not constrained to graphs without cycles, and the decoding algorithm is not constrained to solving only “single unknown” equations.
Meanwhile, decoding algorithms for a general code based on GF(q), can have a very high time complexity. Thus, it is advantageous to limit the code design to a family of codes, where the entire codeword can be broken down into sub-codes that resemble RS codes. This allows us to use algorithms developed for efficient decoding of RS codes, for decoding of these RS based sub-codes. Decoding of individual sub-codes can facilitate the decoding of the entire codeword. However, for ease of analysis the subcodes are designed such that the message symbols are mutually exclusive. It should be noted that a code design that allows sub-codes to overlap does not necessarily lead to a performance improvement. Moreover a code-design, which as described above does not allow a sub-code overlap, does not require multiple iterations for decoding and thus can lead to reduced time complexity. It is envisioned, however, that more generalized designs may be accomplished that allow sub-code overlap and/or use multiple iterations for decoding. After this brief discussion of the significance of the proposed PRS codes, the general structure of these codes is introduced.
For a given realtime-pair constraint (N, K), a general PRS code of order s is denoted by (N, K, Λs)q where Λs represents a 2×(s+1) matrix given by:
In all further discussions, q is dropped from the notation and the order of the field on which the code is based is assumed to have been pre-specified. Moreover, as long as q>N, the parameter q does not influence the performance of the FEC scheme. The entries of matrix Λs are constrained by the equations in (1):
Thus Λs gives an s-partition on the set of parity symbols and a (s+1)-partition on the set of message symbols.
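One possible way to hold such a code profile in software is sketched below. Since the matrix Λs and the constraints in (1) are not shown here, the sketch makes its own assumptions: each of the s protected sub-codes i has N_i symbols of which K_i are message symbols, the parity counts N_i − K_i sum to N − K, and whatever message symbols are left over form the (s+1)-th, unprotected message part. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class PRSProfile:
    n: int                                  # block length N
    k: int                                  # message symbols K
    subcodes: Tuple[Tuple[int, int], ...]   # s pairs (N_i, K_i), one per protected sub-code

    @property
    def order(self) -> int:
        """Order s: the number of protected sub-codes."""
        return len(self.subcodes)

    @property
    def unprotected(self) -> int:
        """Message symbols left in the (s+1)-th, unprotected part."""
        return self.k - sum(k_i for _, k_i in self.subcodes)

    def validate(self) -> None:
        """Raise ValueError if the assumed partition constraints are violated."""
        if not 0 < self.k <= self.n:
            raise ValueError("need 0 < K <= N")
        for n_i, k_i in self.subcodes:
            if not 0 < k_i < n_i:
                raise ValueError("each sub-code needs 0 < K_i < N_i")
        parity = sum(n_i - k_i for n_i, k_i in self.subcodes)
        if parity != self.n - self.k:
            raise ValueError("sub-code parity symbols must sum to N - K")
        if self.unprotected < 0:
            raise ValueError("sub-codes cover more than K message symbols")

if __name__ == "__main__":
    profile = PRSProfile(100, 88, ((52, 40),))
    profile.validate()
    print(profile.order, profile.unprotected)
```

Under these assumptions a classical RS code corresponds to a single sub-code covering all K message symbols, so that no message symbol is left unprotected.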
Thus the average total message throughput of an order s PRS code is given by
where, ρ(Ni,Ki) denotes the average number of message symbols received after channel decoding due to a single component of the code graph and can be evaluated using the following equation
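A hedged sketch of how such a throughput could be evaluated numerically follows. It is not taken from the application's equations. It assumes that every component behaves as a systematic code with the RS recovery property on a memoryless erasure channel with erasure probability p, so that a message symbol is lost only if it is erased and at least N_i − K_i of the other N_i − 1 symbols of its component are erased as well. An unprotected message part of size k is represented as the pair (k, k). All function names are hypothetical.

```python
from math import comb
from typing import Sequence, Tuple

def mds_message_loss(n: int, k: int, p: float) -> float:
    """P(message symbol missing after decoding) for one (n, k) component."""
    r = n - k
    # The symbol itself is erased AND at least r of the other n - 1 are erased.
    tail = sum(comb(n - 1, j) * p**j * (1 - p)**(n - 1 - j) for j in range(r, n))
    return p * tail

def rho(n: int, k: int, p: float) -> float:
    """Average number of message symbols delivered by one (n, k) component."""
    return k * (1.0 - mds_message_loss(n, k, p))

def prs_throughput(components: Sequence[Tuple[int, int]], p: float) -> float:
    """Average message throughput: delivered message symbols divided by K."""
    k_total = sum(k for _, k in components)
    return sum(rho(n, k, p) for n, k in components) / k_total

if __name__ == "__main__":
    p = 0.2
    print("RS   :", prs_throughput([(100, 88)], p))
    print("PRS-1:", prs_throughput([(52, 40), (48, 48)], p))
```

The two profiles in the usage example use the same block length and the same number of parity symbols; only the way the parity is assigned to message symbols differs.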
IV. Optimal PRS Codes
This section identifies the class of optimal PRS codes for a Binary Erasure Channel (BEC). It is shown that, for a BEC, the optimal PRS code is given by an order 1 PRS code (i.e., PRS-1). The parameter used to measure performance of a code here is message throughput. Thus a code that maximizes this parameter will be the optimal code. At this stage some lemmas are proven; these lemmas help to limit the ensemble of codes that have to be considered to find the optimal PRS code. The following notations and propositions are used by the lemmas.
Let,
The three propositions below are used to prove the lemmas regarding the optimality of PRS codes of order s=1.
Proposition 1 (P1): ∀ (N,K) the optimal PRS code in the set
Proposition 2 (P2): ∀ (N,K), ∀ (N,K3) the optimal PRS code in the set
Proposition 3 (P3): ∀ (N,K), ∃ an order s PRS code, that performs better than all order (s+1) PRS codes.
LEMMA 1: For a BEC, P1 ⇒ P2. In other words, if the optimal code within the set
Proof Consider the optimal code on the set
LEMMA 2: For a BEC, P2 ⇒ P3.
Proof: Let a PRS code of order (s+1) be given by (N,K,Λs+1) as shown in
Ψ(N+N+1),(K+K
is an order 1 PRS code. For a BEC the relative performance of two codewords does not change due to the addition of identical code sections. Thus
LEMMA 3: For a BEC, P1 ⇒ P3.
Lemmas 1 and 2 reduce the ensemble of codes over which there is a need to search for an optimal code to the set
CONJECTURE 1: For a BEC, P1 is true.
The validity of conjecture 1 is verified for different values of N, K and p. Here, some results for N=100 and K=88 are presented. Any PRS code of order 2 belonging to the set
In other words, for a given (N, K), only two parameters (e.g., N1 and K1) are needed to represent all codes in the set
Thus in
FIGS. 4(a) and (b) show the results for p=0.05 and p=0.1, where the channel capacity is 0.95 and 0.90, respectively. It should be noted that in both of these cases the coding rate (0.88) is below channel capacity. It can be seen that in both cases the optimal PRS code for a BEC in
As explained above in Section II, even though the coding rate is “on average” lower than channel capacity, the time-varying nature of a channel makes it possible for the number of losses to exceed N−K, or for the coding rate to be higher than channel capacity. In such a situation, although complete recovery of lost data is impossible, partial recovery can still be provided. Thus a possible way to mitigate the above problem is to use some feedback information to change the code profile. Section IX presents a unique “fixed rate” adaptive FEC scheme based on PRS-1 codes which adaptively facilitates such a partial recovery under severe channel conditions. For this purpose it is important to conduct optimality analysis of PRS codes for coding rates greater than channel capacity. FIGS. 4(c) and (d) exhibit the performance of PRS codes belonging to
Thus, using Conjecture 1 and Lemma 3, it can be concluded that for a BEC, ∀ (N, K, p), the optimal PRS code is an order 1 PRS code.
V. Optimal PRS-1
In this section, the performance of PRS codes of order 1 (PRS-1) is further evaluated and analyzed. As the design of a PRS-1 code is completely determined by the choice of K1, a shortened notation for order 1 PRS codes is used. Thus a PRS code denoted by (N,K,K1) is equivalent to a PRS code denoted by (N,K,Λ1) where
Thus the optimal PRS code will be obtained by choosing an optimal value of K1, denoted by K*.
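A brute-force search for such an optimal K1 can be sketched as follows. This is an illustration under stated assumptions rather than the application's own expression: the (N, K, K1) code is taken to consist of K1 message symbols protected by all N−K parity symbols in one sub-code with the RS recovery property, plus K−K1 unprotected message symbols, on a memoryless erasure channel with erasure probability p. Function names are hypothetical.

```python
from math import comb

def mds_message_loss(n: int, k: int, p: float) -> float:
    """P(message symbol missing after decoding) for an (n, k) RS-like sub-code."""
    r = n - k
    tail = sum(comb(n - 1, j) * p**j * (1 - p)**(n - 1 - j) for j in range(r, n))
    return p * tail

def prs1_message_loss(n: int, k: int, k1: int, p: float) -> float:
    """Assumed p_m of an (N, K, K1) PRS-1 code, averaged over all K message symbols."""
    r = n - k
    protected = k1 * mds_message_loss(k1 + r, k1, p)   # K1 symbols share all parity
    unprotected = (k - k1) * p                         # lost whenever erased
    return (protected + unprotected) / k

def best_k1(n: int, k: int, p: float) -> int:
    """Exhaustive search for the K1 in 1..K that minimises p_m."""
    return min(range(1, k + 1), key=lambda k1: prs1_message_loss(n, k, k1, p))

if __name__ == "__main__":
    n, k = 100, 88
    for p in (0.05, 0.10, 0.15, 0.20):
        k_star = best_k1(n, k, p)
        print(p, k_star, 1 - prs1_message_loss(n, k, k_star, p),
              1 - mds_message_loss(n, k, p))
```

Setting K1 equal to K reproduces the RS case, so the searched optimum can never be worse than RS under this model; the exhaustive search is affordable because K is small for realtime block lengths.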
The probability of a message symbol loss (after channel decoding) for a (N,K,K1) PRS-1 code over a BEC with probability of erasure p is given by
The optimal value of K1 can be obtained by minimizing the above expression. Since τm=(1−pm), this is equivalent to maximizing the message throughput. Thus the results in
In
In
As the dependence of optimal PRS codes on channel capacity and coding rate is symmetric, it can be concluded that for a given probability of erasure and block-length there exists a critical coding rate lower than channel capacity, such that, for all coding rates above this critical value, there exists an optimal PRS code that can outperform the traditional RS code. Moreover, it can be shown for the PRS-1 codes,
Thus, since
N→∞,C→(1−p) and R=K/N,
it can be concluded that as
N→∞,pm→1−(C/R).
By combining the (inverse of the) channel coding theorem with this result, it can be concluded that as
VI. Performance Comparison with RS Codes
The z-axis in
At this stage it is important to emphasize that, there exist coding rates below channel capacity for which an optimal PRS code can outperform an RS code of similar rate and blocklength.
VII. Graceful Degradation
VIII. Video Simulations
The performance benefit due to the graceful degradation of PRS codes, as the number of losses in a code block increases, is further highlighted when the performance is measured in terms of perceptual image quality instead of message throughput. This can be attributed to the limitations of error concealment algorithms, which are effective only when the number of losses (after channel decoding) is not substantial. The newly emerging JVT standard is used as the underlying video coding technique to compare the performance of RS and PRS channel coding schemes under identical channel conditions and identical loss patterns.
In the above experiments no knowledge about the source model was used for allocation of parity symbols, i.e., the symbols to be protected in a PRS code block were chosen without taking into consideration the importance of I frames or the temporal proximity of P frames to a particular I frame. Thus no attempt is made to provide a new Prioritized Encoding scheme. Unequal Error Protection is often associated with Prioritized Encoding schemes, but it should be noted that even irregular graph codes are unequal error protection schemes. Thus in this case the best PRS code for a BEC is an unequal distribution of parity. A more appropriate interpretation of such a code would be to recognize it as an irregular graph code. In addition, the error robustness features in the standard were kept at a minimum, i.e., features such as forced intra coded blocks, data partitioning, and use of B-frames were turned off. Taking all the above features into consideration can significantly improve the performance of PRS codes, but even without these features and even for worst cases the performance improvement of PRS codes is significant.
The standard test sequence Foreman is employed to present results. The sequence was coded at 1 Mbps at 30 Hz. A Group-Of-Pictures (GOP) size of 15 with a frame sequence IPPP was used. A packet size of 512 bytes and a slice size of 512 bytes were used for the purpose of the simulations.
In the simulations, the number of losses in each code block was forced to be equal to some L.
It should be noted that there were many instances when a particular frame in an RS coded sequence was significantly distorted but a PRS coded sequence had absolutely no artifacts.
It can be clearly seen in the above mentioned Figures that when L=12 the image quality for an RS coded sequence is better than that of a PRS coded sequence. Nevertheless the distortion in the PRS coded sequence is not very significant. On the contrary the performance of the RS coded sequence when L=13 is much worse than that of the PRS code. It can be seen that though the quality of the image for a PRS sequence also deteriorates, the increase in distortion is not significant.
However the increase in distortion for an RS coded sequence is high enough to almost make the frame unintelligible. For such low quality images, the traditional Peak-Signal-to-Noise-Ratio (PSNR) measure does not reflect the true quality of the image and hence only visual results have been presented.
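The contrast between L=12 and L=13 can be illustrated with a small calculation of the expected number of message symbols still missing after decoding when a block suffers exactly L losses. The sketch assumes that the L lost positions are spread uniformly at random over the block, that N=100 and K=88 as in the example of Section IV (so that N−K=12), and a hypothetical PRS-1 choice of K1=70; none of these values is stated for the video experiments themselves. The PRS-1 model is the same assumed one as before: K1 message symbols share all N−K parity symbols, and the remaining K−K1 are unprotected.

```python
from math import comb

def rs_residual_losses(n: int, k: int, losses: int) -> float:
    """Expected message symbols missing after RS decoding, given exactly `losses`."""
    if losses <= n - k:
        return 0.0                  # everything is recovered
    return losses * k / n           # nothing is recovered; K/N of the losses hit messages

def prs1_residual_losses(n: int, k: int, k1: int, losses: int) -> float:
    """Same quantity for an assumed (N, K, K1) PRS-1 code."""
    r = n - k
    n1 = k1 + r                     # size of the protected sub-code
    total = comb(n, losses)
    expected = 0.0
    # l1 = number of losses that fall inside the protected sub-code (hypergeometric).
    for l1 in range(max(0, losses - (n - n1)), min(n1, losses) + 1):
        prob = comb(n1, l1) * comb(n - n1, losses - l1) / total
        unprotected_lost = losses - l1
        protected_lost = l1 * k1 / n1 if l1 > r else 0.0
        expected += prob * (unprotected_lost + protected_lost)
    return expected

if __name__ == "__main__":
    n, k, k1 = 100, 88, 70
    for losses in (10, 12, 13, 16, 20):
        print(losses, rs_residual_losses(n, k, losses),
              prs1_residual_losses(n, k, k1, losses))
```

Under these assumptions RS leaves no residual loss up to L=12 and then jumps abruptly, whereas the PRS-1 profile carries a small residual loss throughout and grows slowly past L=12, which matches the qualitative behaviour described above.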
IX. Adaptive FEC
Over channels with time-varying characteristics (e.g., the Internet and wireless networks), multiple code blocks can experience a number of losses that is larger than the number of parity symbols. Thus, though “on average” the coding rate is lower than channel capacity, it is possible for the coding rate to be greater than the channel capacity for a period of time. If the change in channel conditions is slow enough and if a channel can provide some feedback information about the channel conditions, then the underlying error control code in an FEC scheme can be changed to adapt to the channel conditions. Most of the current FEC schemes adapt to the channel conditions by changing the coding rate R. If the loss probability increases, the number of parity symbols is also increased (thus the rate is adapted to always transmit below channel capacity). For a realtime application this is equivalent to increasing the transmission bit-rate. Increasing the transmission bit-rate is not always feasible, and thus changing the coding rate in an adaptive FEC scheme is not always suitable.
Using a PRS code based adaptive FEC scheme can mitigate the above problem. In such a scheme the coding rate is kept fixed, but the underlying PRS-1 code can be changed. The feedback information about the erasure probability from the channel can be used to optimize the design of the underlying PRS-1 code. It should be noted that the coding rate of the PRS code could be greater than channel capacity for a limited period of time. Optimality analysis of PRS codes for such scenarios was provided in Section IV.
It should be realized though, that it is possible to design an RS based fixed transmission rate adaptive FEC scheme. This can be achieved by changing the rate of a code without changing the block-length and transmission rate as shown in
(a) transmitting only a subset of K* message packets out of the K message packets and protecting these K* message packets by N−K* parity packets instead of N−K. Thus K−K* message packets are dropped at the source itself. The performance of this scheme is given by the following equation
(b) transmitting only a subset of N·(1−p) message packets out of the K message packets and protecting these N·(1−p) message packets by N·p parity packets instead of N−K. Thus K−N·(1−p) message packets are dropped at the source itself. The performance of such a scheme is given by
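The two RS-based alternatives (a) and (b) can be explored numerically with the sketch below. It is an illustration under assumptions, not the application's performance equations: throughput is measured as the fraction of the K original message packets that reach the application, packets dropped at the source count as lost, the RS code recovers everything whenever at most N−K* packets of the block are erased, and packets are erased independently with probability p. In scheme (a) the value K* is found here by exhaustive search, and in scheme (b) N·(1−p) is rounded down and capped at K; both choices are assumptions of this sketch. Function names are hypothetical.

```python
from math import comb, floor
from typing import Tuple

def mds_message_loss(n: int, k: int, p: float) -> float:
    """P(message packet missing after decoding) for an (n, k) RS code."""
    r = n - k
    tail = sum(comb(n - 1, j) * p**j * (1 - p)**(n - 1 - j) for j in range(r, n))
    return p * tail

def rs_drop_throughput(n: int, k: int, k_sent: int, p: float) -> float:
    """Fraction of the K original packets delivered when only k_sent are sent."""
    return (k_sent / k) * (1.0 - mds_message_loss(n, k_sent, p))

def scheme_a(n: int, k: int, p: float) -> Tuple[int, float]:
    """Scheme (a): search for the K* in 1..K that maximises the delivered fraction."""
    best = max(range(1, k + 1), key=lambda ks: rs_drop_throughput(n, k, ks, p))
    return best, rs_drop_throughput(n, k, best, p)

def scheme_b(n: int, k: int, p: float) -> Tuple[int, float]:
    """Scheme (b): send about N(1 - p) message packets, never more than K."""
    k_sent = max(1, min(k, floor(n * (1 - p) + 1e-9)))
    return k_sent, rs_drop_throughput(n, k, k_sent, p)

if __name__ == "__main__":
    n, k = 100, 88
    for p in (0.10, 0.15, 0.20):
        print(p, scheme_a(n, k, p), scheme_b(n, k, p))
```

Because scheme (a) searches over every admissible K*, its result in this model is never below that of scheme (b) or of the unmodified (N, K) RS code.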
The description of the invention is merely exemplary in nature and, thus, variations that do not depart from the gist of the invention are intended to be within the scope of the invention. Such variations are not to be regarded as a departure from the spirit and scope of the invention.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US05/23437 | 6/29/2005 | WO | | 12/1/2006 |

Number | Date | Country |
---|---|---|
60585798 | Jul 2004 | US |