The present invention relates to the field of data transmission generally and in particular to techniques for dynamically setting a forward error correction (FEC) type as equal or unequal.
Many kinds of data are transmitted over the Internet and other networks, including video and audio data. Data can be transmitted, for example, from one computer or other transmitting station to another remote computer or other receiving station. Data transmission over networks such as the Internet is frequently accomplished by packetizing the message to be transmitted—that is, by dividing the message into packets that are reassembled at the receiving end to reconstruct the original message. Packets may be lost or delayed during transmission, resulting in corruption of the message. This can be especially problematic when it occurs during real time transmission of data (such as during a voice over IP (VOIP) session or video conferencing).
An apparatus and method are disclosed that dynamically generate error correction codes to be used with a plurality of source packets to be transmitted over a network. In accordance with one aspect of the disclosed embodiments, the plurality of source packets has frames and first partition packets and the method comprises determining at least one source content parameter of a first plurality of source packets, determining at least one network state parameter of the network, selecting between an equal protection forward error correction code and an unequal protection forward error correction code for the first plurality of source packets based on the at least one source content parameter and the at least one network state parameter, and generating a data stream comprising the first plurality of source packets and the selected forward error correction code.
In accordance with another aspect of the disclosed embodiments, an apparatus is provided for dynamically selecting forward error correction codes for use with a plurality of source packets to be transmitted over a network. The apparatus comprises a memory and a processor configured to execute instructions stored in the memory to determine at least one source content parameter of a first plurality of source packets, determine at least one network state parameter of the network, select between an equal protection forward error correction code and an unequal protection forward error correction code for the first plurality of source packets based on the at least one source content parameter and the at least one network state parameter, and generate a data stream comprising the first plurality of source packets and the equal protection forward error correction code or the unequal protection forward error correction code as selected.
These and other embodiments will be described in additional detail hereafter.
The description herein makes reference to the accompanying drawings wherein like reference numerals refer to like parts throughout the several views, and wherein:
A common type of real-time data transmission includes digital video. Digital video is used for various purposes including, for example, remote business meetings via video conferencing, high definition video entertainment, video advertisements, and sharing of user-generated videos. As technology evolves, users have higher expectations for video quality and expect high-resolution video even when transmitted over communications channels having limited bandwidth.
To address the problem of packet loss and other errors, schemes have been proposed for providing additional information in transmissions of data. This additional information can be used by the receiving station to detect and/or correct errors. One such scheme is forward error correction (FEC) coding, also called channel coding. See, e.g., Li, A., “RTP Payload Format for Generic Forward Error Correction,” RFC 5109, December 2007. Under this approach, an FEC packet is applied as an XOR channel code. The XOR code is used to generate the FEC packets by means of a packet mask.
Errors that lead to packet loss can be distributed randomly during the transmission or be distributed in “bursty” fashion, wherein certain errors are grouped together for a period of time. Furthermore, in some video coders not all packets have the same information content. For example, in encoded video transmission, it is likely that the initial packets associated with a video frame contain information derived from the video transmission, such as motion vectors. The result is that uncorrected packet loss occurring in the initial packets associated with a video frame can cause greater subjective degradation in the received and decoded video transmission than uncorrected packet loss occurring in other packets of the frame. In the case where errors are distributed substantially randomly, FEC coding that treats each packet equally performs well. In the case where errors are distributed in a bursty fashion, FEC coding that treats each packet equally does not perform as well. FEC coding can be improved by adding more error correction information. However, adding more error correction information decreases the effective bandwidth of the transmission.
In contrast, dynamically selecting forward error correction (FEC) codes based on the network state and the type of packet to be transmitted is taught herein. Such a selection scheme is designed to improve the quality of packetized data transmission where the packets have differing information content, particularly in the presence of bursty errors, without reducing the effective bandwidth.
Treating each packet equally for error protection can be undesirable as described above. Thus, embodiments taught herein select error correction codes to be transmitted with packetized data over a communications network by distinguishing between equal protection error correction (FEC-EP) and unequal protection error correction (FEC-UEP).
In
In FEC-UEP shown in
In another embodiment, an unequal forward error correction code may be formed by using a single pre-calculated FEC-EP packet mask and increasing the number of 1s (i.e., replacing some of the 0 entries with 1) in the columns corresponding to the first partition source packets. The greater number of 1s in the column entries corresponding to the first partition packets means that more FEC protection is applied to the first partition.
More specifically, packet masks 98, 100 are each a matrix that specifies the linear combination of source packets to generate an FEC packet. The linear combination is applied over the packet data by a bitwise XOR as is known in the art. The packet mask (or the relevant row of the matrix/mask for that packet) is contained in each FEC packet, so decoder 34 has access to this information (at least for a received FEC packet), and hence can perform a recovery operation if needed. For example, referring to
This scheme is able to use pre-calculated FEC-EP packets to accomplish FEC-UEP by considering a subset of the source packets to be transmitted, in this case packets 1-3, and selecting the appropriate FEC-EP packets 9-11 to correct these packets. The scheme then considers the entire set of eight source packets 92 and selects two more FEC-EP packets 12, 13 to correct all eight packets 92. This provides a total of five packets 9-13 with information to correct important source packets 93 and two packets 12, 13 with information to correct less important source packets 95. This provides a higher level of error correction for source packets 93 and a lower level of error correction for source packets 95 than would be obtained by transmitting the same number of purely FEC-EP packets that each considered all eight source packets 92. This scheme provides the advantages of having improved error correction for important source packets 93 while permitting the system to calculate a set of FEC-EP packets 9-13 that can be accessed by look-up tables or other efficient methods to provide efficient run-time implementations of this FEC-UEP scheme. By using a variety of EP masks, various levels of protection can be provided for a number of source packets.
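The XOR packet-mask operation described above can be sketched as follows. This is an illustrative model only; the packet contents, mask rows, and helper names are assumptions and do not reproduce the specific masks of packets 9-13.

```python
from functools import reduce

def xor_packets(packets):
    """Bitwise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets))

def make_fec_packet(source_packets, mask_row):
    """Generate one FEC packet as the XOR of the source packets
    selected by the 0/1 entries of a packet-mask row."""
    selected = [p for p, bit in zip(source_packets, mask_row) if bit]
    return xor_packets(selected)

def recover_lost_packet(fec_packet, received_sources, mask_row):
    """Recover a single lost source packet covered by mask_row: the
    lost packet is the XOR of the FEC packet with the other received
    packets in the same row. received_sources maps index -> packet
    bytes; exactly one covered index is assumed missing."""
    parts = [fec_packet]
    for i, bit in enumerate(mask_row):
        if bit and i in received_sources:
            parts.append(received_sources[i])
    return xor_packets(parts)
```

For example, with three source packets all covered by a single mask row [1, 1, 1], the loss of any one packet can be repaired from the FEC packet and the two surviving packets, which is the recovery operation available to decoder 34 when it receives an FEC packet and its mask row.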
UEP/EP selector 130 receives information regarding the video content state 132, including motion level (ML), spatial level (SL), frame rate (f), and frame size, in step 142 from the encoder 112. UEP/EP selector 130 then receives network state parameters 124, including packet loss rate (p), burst length (b) and network rate (R), in step 144, along with the predetermined FEC protection factor (PF), which determines how large the FEC overhead will be in relation to the data stream, in step 146. FEC protection factor PF specifies the number of FEC packets for a given frame according to the formula m=PF*n, where n is the total number of packets in a given frame and m is the number of FEC packets to be applied to that frame. In step 148, UEP/EP selector 130 calculates the average source encoding rate Rs from the predetermined protection factor PF and the network rate R obtained from the network state parameters 124 according to equation (1) below. Based on the received and calculated parameters, UEP/EP selector 130 selects the FEC protection type, either unequal protection or equal protection, in step 150 and communicates the selection to encoder 112 as indicated by path 134 in
The processing of
ML is measured by detecting the motion level from motion vectors obtained from the encoding process or from a simple frame-to-frame difference operator. SL is obtained by measuring the edges or texture within the frames according to known techniques. ML and SL are averaged over the interval T. Preselected thresholds are then used to classify the motion and spatial levels as low, medium or high. PF determines how many FEC packets are applied during each T-second time interval and is used to set the source encoding rate Rs equal to the network rate R reduced by the percentage of FEC packets as follows:
Rs=R(1−PF). (1)
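The bandwidth split implied by the protection factor, namely m = PF·n FEC packets per frame and the source encoding rate of equation (1), can be sketched as below. Rounding m up to a whole packet is an assumption; the text does not specify how fractional values are handled.

```python
import math

def fec_packet_count(n, pf):
    """m = PF * n: the number of FEC packets for a frame of n source
    packets, rounded up to a whole packet (rounding is assumed)."""
    return math.ceil(pf * n)

def source_encoding_rate(network_rate, pf):
    """Equation (1): Rs = R * (1 - PF), the network rate reduced by
    the share of bandwidth spent on FEC packets."""
    return network_rate * (1.0 - pf)
```

For instance, with PF = 0.25 a frame of 8 source packets would carry 2 FEC packets, and a network rate of 1000 kbit/s would leave Rs = 750 kbit/s for the source encoder.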
The relative size c of the first partition to the total frame is calculated from previously encoded frames and the frame rate f is based, for example, on a camera capture rate. UEP/EP selector 130 calculates a function F that maps the variables listed above into a protection-type state y wherein:
y=F(ML,SL,PF,c,R,f,p,b). (2)
A threshold H may be used to map the output into a binary state such that y < H results in UEP/EP selector 130 selecting EP in step 150, and y ≥ H results in UEP/EP selector 130 selecting UEP in step 150. A typical default value might be H=0.5.
In an embodiment of the instant invention, the selection can be based on a reduced set of parameters. In this case, three parameters are used according to the formula:
y=G(ML)*J(PF/c); wherein (3)
J is a sigmoid-type function that asymptotically approaches 1 for PF/c>>C1, asymptotically approaches 0 for PF/c<<C2 and interpolates smoothly in between; and G is a function that maps the discrete motion level state to an analog scale. UEP/EP selector 130 again uses a threshold H to map the output y into a binary state to select EP or UEP as above.
As shown in
G(ML=0)=0: favor EP for stationary scenes; (4)
G(ML=1)=1.0: neutral; and (5)
G(ML=2)=1.5: favor UEP for motion scenes. (6)
In this scheme, an FEC-EP scheme is favored when the motion level is low or when the amount of FEC protection selected is close to or smaller than c, the size of the first partition. These cases are favorable to EP-type error correction since error concealment for most decoders is very good for stationary scenes and the FEC protection is small compared to the size of the first partition. The values of C1 and C2 can be set depending upon the error concealment property EC of the particular decoder used and the burstiness b of the network. For example, the stronger EC is on the first partition, the less UEP is relied upon, therefore C1 and C2 can be increased. On the other hand, the burstier the network packet loss is, the more it would favor UEP, since FEC-UEP is more robust with respect to bursty loss. In this case, C1 and C2 would be decreased.
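The reduced-parameter selection of equation (3) can be sketched as follows, using the G values of equations (4)-(6) and the default threshold H=0.5. The logistic shape chosen for J and the particular constants C1 and C2 are assumptions; the text requires only the asymptotic behavior described above.

```python
import math

# Motion-level mapping G from equations (4)-(6).
G = {0: 0.0, 1: 1.0, 2: 1.5}

def J(ratio, c1=1.0, c2=0.25):
    """Sigmoid-type function: near 1 when ratio >> C1, near 0 when
    ratio << C2, smooth in between. The logistic shape and the
    values of C1 and C2 are assumptions."""
    mid = (c1 + c2) / 2.0
    steepness = 8.0 / (c1 - c2)  # assumed steepness
    return 1.0 / (1.0 + math.exp(-steepness * (ratio - mid)))

def select_protection(ml, pf, c, h=0.5):
    """Equation (3): y = G(ML) * J(PF / c); select UEP when y >= H."""
    y = G[ml] * J(pf / c)
    return "UEP" if y >= h else "EP"
```

As in the text, a stationary scene (ML=0) always yields EP because G(0)=0, while a high-motion scene with PF large relative to the first-partition size c drives y above the threshold and yields UEP.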
Three exemplary curves 166, 172, 178 that may be generated to guide the selection of unequal or equal packet protection schemes are shown by example in
In yet another embodiment, a training approach may be used to generate the mapping function that selects the UEP/EP state. This approach involves off-line training over a set of test simulations. Each simulation is defined by previously defined parameters ML, SL, R, f, p, b and PF. The test data could be a representative video clip of a few seconds. The parameter c is implicitly defined by the scene content (ML, SL), the encoder bit rate R and the frame rate f. A large set of simulations over different representative video clips and parameter ranges would be used to generate a training sample for the function below:
y=F(ML,SL,R,f,p,b,PF). (7)
For each simulation, a quality score that compares the input data stream to the output is used. For example, a peak signal-to-noise ratio (PSNR) or a structural similarity (SSIM) score is calculated. Each simulation would be scored with both the EP and UEP settings. A clustering approach is used to cluster all the simulation results that favor UEP over EP into one group and the results that favor EP over UEP into another group. The mapping function may then be adjusted to yield these results on subsequent data by estimating the image and network parameters that would yield optimal error correction. An example of a learning/clustering model that may be used in this application is a support vector machine.
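The training step above can be sketched as follows. For simplicity, a nearest-centroid rule stands in for the support vector machine mentioned in the text; the feature layout and labels are assumptions.

```python
def train_selector(samples):
    """samples: list of (features, label), where features is a tuple
    of normalized simulation parameters (e.g., ML, SL, R, f, p, b, PF)
    and label is 'EP' or 'UEP', assigned by whichever setting scored
    the higher quality (PSNR or SSIM) for that simulation. Returns
    one centroid per protection type."""
    centroids = {}
    for label in ("EP", "UEP"):
        group = [f for f, lab in samples if lab == label]
        dim = len(group[0])
        centroids[label] = tuple(
            sum(f[i] for f in group) / len(group) for i in range(dim))
    return centroids

def classify(centroids, features):
    """Pick the protection type whose training centroid is nearest,
    i.e., the cluster the new parameter vector falls into."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lab: dist2(centroids[lab], features))
```

A trained support vector machine would replace `classify` with a learned decision boundary between the two clusters, but the run-time use is the same: map the current parameters to EP or UEP.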
In another embodiment, UEP/EP selector 130 may operate at the frame level, making a further decision or overriding the decision from the method described with respect to
In one implementation, the process to select the UEP protection is done using a designated number (z) of FEC packets solely for protection of the first partition, where z=H(n,m,k). If z=0, then the EP protection type is selected. In one example, we use z=H(n,m,k)=min(k,m/2) (where min(a,b) is the minimum of a and b).
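The packet-budget rule above can be sketched as follows. Integer halving for m/2 is an assumption, as is the interpretation of k as the number of first-partition packets.

```python
def first_partition_fec_budget(n, m, k):
    """z = H(n, m, k) = min(k, m / 2): the number of FEC packets
    dedicated solely to the first partition, capped at half of the
    frame's m FEC packets (integer halving is assumed) and at the
    number k of first-partition packets. n, the total number of
    source packets in the frame, is unused by this example rule but
    kept to match the signature H(n, m, k) in the text."""
    return min(k, m // 2)

def protection_type(n, m, k):
    """EP is selected when z = 0, i.e., when no FEC packets are
    dedicated to the first partition."""
    return "EP" if first_partition_fec_budget(n, m, k) == 0 else "UEP"
```

For example, a frame with m=5 FEC packets and k=3 first-partition packets dedicates z=2 packets to the first partition and selects UEP, while m=1 gives z=0 and falls back to EP.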
An exemplary transmitting station 192 has the functions shown in
Network 24 connects transmitting station 192 and a receiving station 200. Receiving station 200 can correspond to receiver 25 of
The receiving station 200, in one example, can be a computer having an internal configuration of hardware including a processor such as a central processing unit (CPU) 202 and a memory 204. CPU 202 can be a controller for controlling the operations of receiving station 200. CPU 202 can be connected to memory 204 by, for example, a memory bus (not shown). Memory 204 can be ROM, RAM and/or any other suitable memory device. Memory 204 stores data and program instructions that are used by CPU 202, particularly those described for FEC decoder 26, de-packetization process 30 and decoder 34 in this example. CPU 202 so configured can, in this example, decode the transmitted source packets along with error correction codes, correcting the source packets when necessary with the error correction codes according to known methods. Other suitable implementations of receiving station 200 are possible. For example, the processing of receiving station 200 can be distributed among multiple devices.
In
The particular method used to encode or decode the data stream herein is not critical or limited. For example, digital video streams can include formats such as VPx, promulgated by Google Inc. of Mountain View, Calif., and H.264, a standard promulgated by ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG), including present and future versions thereof. H.264 is also known as MPEG-4 Part 10 or MPEG-4 AVC (formally, ISO/IEC 14496-10). Such video compression can be used to reduce the volume of bits needed to transmit, store or otherwise represent digital video and can be used herein.
Other implementations of the data transmission system 190 are possible. For example, one implementation can omit network 24 and/or display 206. In another implementation, a video stream can be encoded and then stored for transmission at a later time by receiving station 200 or any other device having memory. In another implementation, additional components can be added to data transmission system 190. For example, a display or a video camera can be attached to transmitting station 192 to capture a video stream to be encoded. In an exemplary implementation, the real-time transport protocol (RTP) is used. In another implementation, a transport protocol other than RTP may be used.
The embodiments of transmitting station 192 and receiving station 200 (and the algorithms, methods, instructions, etc. stored thereon and/or executed thereby) can be realized in hardware, software, or any combination thereof. The hardware can include, for example, computers, intellectual property (IP) cores, application-specific integrated circuits (ASICs), programmable logic arrays, optical processors, programmable logic controllers, microcode, microcontrollers, servers, microprocessors, digital signal processors or any other suitable circuit. In the claims, the term “processor” should be understood as encompassing any of the foregoing hardware, either singly or in combination. The terms “signal” and “data” are used interchangeably. Further, portions of transmitting station 192 and receiving station 200 do not necessarily have to be implemented in the same manner.
Further, in one embodiment for example, transmitting station 192 or receiving station 200 can be implemented using a general purpose computer/processor with a computer program that, when executed, carries out any of the respective methods, algorithms and/or instructions described herein. In addition or alternatively, for example, a special purpose computer/processor can be utilized which can contain specialized hardware for carrying out any of the methods, algorithms, or instructions described herein.
All or a portion of embodiments of the present invention can take the form of a computer program product accessible from, for example, a computer-usable or computer-readable medium. A computer-usable or computer-readable medium can be any device that can, for example, tangibly contain, store, communicate, or transport the program for use by or in connection with any processor. The medium can be, for example, an electronic, magnetic, optical, electromagnetic, or a semiconductor device. Other suitable mediums are also available.
The above-described embodiments have been described in order to allow easy understanding of the present invention and do not limit the present invention. On the contrary, the invention is intended to cover various modifications and equivalent arrangements included within the scope of the claims, which scope is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structure as is permitted under the law.