SCALABLE VIDEO CODING AND DECODING

Information

  • Patent Application
  • Publication Number
    20080048894
  • Date Filed
    July 09, 2007
  • Date Published
    February 28, 2008
Abstract
A system and method for improved video encoding and decoding. The present invention addresses issues that arise in the H.264/AVC standard involving “high magnitude coefficients.” According to various embodiments of the present invention, an encoded end of block (EOB) symbol provides information comprising at least one of the maximum magnitude of values in a block, the number of values in the block with a magnitude greater than 1, and a variable length code (VLC) index indicating a VLC to be used in decoding precise magnitudes for non-zero values in the block. By including this information in the EOB symbol, improved coding efficiency is achieved.
Description

BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 shows a generic multimedia communications system for use with the present invention;



FIG. 2 is a perspective view of a mobile telephone that can be used in the implementation of the present invention; and



FIG. 3 is a schematic representation of the circuitry of the mobile telephone of FIG. 2.





DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS


FIG. 1 shows a generic multimedia communications system for use with the present invention. As shown in FIG. 1, a data source 100 provides a source signal in an analog, uncompressed digital, or compressed digital format, or any combination of these formats. An encoder 110 encodes the source signal into a coded media bitstream. The encoder 110 may be capable of encoding more than one media type, such as audio and video, or more than one encoder 110 may be required to code different media types of the source signal. The encoder 110 may also receive synthetically produced input, such as graphics and text, or it may be capable of producing coded bitstreams of synthetic media. In the following, only the processing of one coded media bitstream of one media type is considered, to simplify the description. It should be noted, however, that real-time broadcast services typically comprise several streams (typically at least one audio, video and text sub-titling stream). It should also be noted that the system may include many encoders, but in the following only one encoder 110 is considered, to simplify the description without loss of generality.


The coded media bitstream is transferred to a storage 120. The storage 120 may comprise any type of mass memory to store the coded media bitstream. The format of the coded media bitstream in the storage 120 may be an elementary self-contained bitstream format, or one or more coded media bitstreams may be encapsulated into a container file. Some systems operate “live”, i.e., they omit storage and transfer the coded media bitstream from the encoder 110 directly to the sender 130. The coded media bitstream is then transferred to the sender 130, also referred to as the server, as needed. The format used in the transmission may be an elementary self-contained bitstream format, a packet stream format, or one or more coded media bitstreams may be encapsulated into a container file. The encoder 110, the storage 120, and the sender 130 may reside in the same physical device or they may be included in separate devices. The encoder 110 and sender 130 may operate with live real-time content, in which case the coded media bitstream is typically not stored permanently, but rather buffered for short periods of time in the content encoder 110 and/or in the sender 130 to smooth out variations in processing delay, transfer delay, and coded media bitrate.


The sender 130 sends the coded media bitstream using a communication protocol stack. The stack may include but is not limited to Real-Time Transport Protocol (RTP), User Datagram Protocol (UDP), and Internet Protocol (IP). When the communication protocol stack is packet-oriented, the sender 130 encapsulates the coded media bitstream into packets. For example, when RTP is used, the sender 130 encapsulates the coded media bitstream into RTP packets according to an RTP payload format. Typically, each media type has a dedicated RTP payload format. It should again be noted that a system may contain more than one sender 130, but for the sake of simplicity, the following description only considers one sender 130.


The sender 130 may or may not be connected to a gateway 140 through a communication network. The gateway 140 may perform different types of functions, such as translation of a packet stream according to one communication protocol stack to another communication protocol stack, merging and forking of data streams, and manipulation of data streams according to the downlink and/or receiver capabilities, such as controlling the bit rate of the forwarded stream according to prevailing downlink network conditions. Examples of gateways 140 include multipoint conference control units (MCUs), gateways between circuit-switched and packet-switched video telephony, Push-to-talk over Cellular (PoC) servers, IP encapsulators in digital video broadcasting-handheld (DVB-H) systems, or set-top boxes that forward broadcast transmissions locally to home wireless networks. When RTP is used, the gateway 140 is called an RTP mixer and acts as an endpoint of an RTP connection.


Alternatively, the coded media bitstream may be transferred from the sender 130 to the receiver 150 by other means, such as storing the coded media bitstream to a portable mass memory disk or device when the disk or device is connected to the sender 130 and then connecting the disk or device to the receiver 150.


The system includes one or more receivers 150, typically capable of receiving, de-modulating, and de-capsulating the transmitted signal into a coded media bitstream. De-capsulating may include the removal of data that receivers are incapable of decoding or that is not desired to be decoded. The coded media bitstream is typically processed further by a decoder 160, whose output is one or more uncompressed media streams. Finally, a renderer 170 may reproduce the uncompressed media streams with a loudspeaker or a display, for example. The receiver 150, decoder 160, and renderer 170 may reside in the same physical device or they may be included in separate devices.


Scalability in terms of bitrate, decoding complexity, and picture size is a desirable property for heterogeneous and error prone environments. This property is desirable in order to counter limitations such as constraints on bit rate, display resolution, network throughput, and computational power in a receiving device.


Various embodiments of the present invention provide improved coding efficiency while coding the transform coefficient refinement levels by addressing two principal problems with the conventional approach to coding “high magnitude” coefficients in Annex F of H.264/AVC: the EOB symbol itself and the VLC used to refine high magnitude coefficients. The invention observes the high likelihood of the high-magnitude refinement coefficient levels under certain coding conditions, and advantageously determines tradeoff thresholds to adapt the coding of the EOB symbol that conveys information associated with the remaining number of coefficients in a block with magnitude greater than 1, as well as the maximum magnitude in the block. For example, the tradeoff thresholds may be determined according to an input signal which limits either or both the values of MaxMag and CountMag2.


In one embodiment of the invention, MaxMag is capped so that a value of MaxMag=4 indicates that the block contains at least one coefficient with value of 4 or higher. In this case, the precise value of MaxMag is unknown, and coding of the exp-Golomb codes cannot be terminated early, which is contrary to what occurs in conventional arrangements as discussed previously. Although this may result in a few extra bits being coded, the EOB symbol is potentially much smaller, thus requiring fewer bits and resulting in a net saving of bits.


In another embodiment of the invention, CountMag2 is capped so that a value of CountMag2=6 indicates that the block contains at least 6 coefficients with magnitude greater than 1. The precise value of CountMag2 is unknown so that coding of the exp-Golomb codes could not be terminated early. Although this may also result in a few extra bits being coded, the EOB symbol is potentially much smaller, thus requiring fewer bits and resulting in a net saving of bits.
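
As a rough illustration of the two capping embodiments above, the following Python sketch computes the values an encoder might signal for a block. The function names, the helper is_exact, and the choice to apply both caps in a single function are assumptions made for illustration only; the text describes the caps as separate embodiments and does not give this code.

```python
MAX_MAG_CAP = 4      # cap from the MaxMag embodiment described above
COUNT_MAG2_CAP = 6   # cap from the CountMag2 embodiment described above


def signaled_values(coefficients):
    """Return the (MaxMag, CountMag2) pair signaled for a block.

    Assumption: both caps are applied together here for brevity.
    """
    nonzero = [abs(c) for c in coefficients if c != 0]
    max_mag = max(nonzero, default=0)
    count_mag2 = sum(1 for m in nonzero if m > 1)
    return min(max_mag, MAX_MAG_CAP), min(count_mag2, COUNT_MAG2_CAP)


def is_exact(signaled, cap):
    """A value below the cap is exact; a value at the cap is only a lower bound."""
    return signaled < cap
```

When a signaled value equals its cap, the decoder only knows a lower bound, which is why the magnitude codes cannot be terminated early in that case.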


In still another embodiment of the present invention, the cap on CountMag2 is not a constant number, but instead is at least partially relative to the number of non-zero coefficients in the block, NumSigCoeff. For example, a value of CountMag2 in the range 1 to 3 may indicate the exact number of coefficients with a magnitude greater than 1; CountMag2=4 may indicate that at least the greater of 4 or half of NumSigCoeff coefficients have a magnitude greater than 1; and CountMag2=5 may indicate that at least the greater of 5 or three-quarters of NumSigCoeff coefficients have a magnitude greater than 1.
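
The relative cap in the example above could be interpreted on the decoder side roughly as follows. This is a minimal sketch: the function name is hypothetical, and rounding the fractions of NumSigCoeff up to the next integer is an assumption, since the text does not state a rounding rule.

```python
def count_mag2_lower_bound(count_mag2, num_sig_coeff):
    """Interpret a signaled CountMag2 under the relative-cap example.

    Returns (value, exact): the count of coefficients with magnitude > 1,
    and whether that count is exact or only a lower bound.
    """
    def ceil_div(a, b):
        # Integer ceiling division (assumed rounding rule).
        return -(-a // b)

    if 1 <= count_mag2 <= 3:
        return count_mag2, True
    if count_mag2 == 4:
        return max(4, ceil_div(num_sig_coeff, 2)), False
    if count_mag2 == 5:
        return max(5, ceil_div(3 * num_sig_coeff, 4)), False
    raise ValueError("CountMag2 outside the range covered by this example")
```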


The present invention also involves the utilization of a VLC code for refining coefficients with a magnitude greater than 1. In one embodiment, the invention uses a VLC known to both the encoder and the decoder for refining coefficient magnitudes. The VLC that is selected may depend upon one or more of the values of NumSigCoeff, MaxMag, and CountMag2. For example, one VLC may be used if MaxMag<4, and a different VLC may be used if MaxMag>=4. In cases where more than one of the values NumSigCoeff, MaxMag and CountMag2 is used, a lookup table may be utilized to determine which VLC is used.
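
The VLC selection described above might be sketched as below. The identifiers VLC_LOW and VLC_HIGH and every entry in the lookup table are placeholders invented for illustration; the text only states that a threshold on MaxMag or a lookup table may be used, not which tables or entries.

```python
# Hypothetical VLC identifiers; the text does not name specific tables.
VLC_LOW = 0    # used when MaxMag < 4
VLC_HIGH = 1   # used when MaxMag >= 4


def select_vlc_by_max_mag(max_mag):
    """Threshold rule taken from the example in the text."""
    return VLC_LOW if max_mag < 4 else VLC_HIGH


# Placeholder lookup table keyed by (MaxMag, CountMag2).
# The entries are invented for illustration only.
VLC_LOOKUP = {
    (2, 1): VLC_LOW,
    (3, 2): VLC_LOW,
    (4, 6): VLC_HIGH,
}


def select_vlc_by_table(max_mag, count_mag2, default=VLC_LOW):
    """Look up the VLC for a (MaxMag, CountMag2) pair, with a fallback."""
    return VLC_LOOKUP.get((max_mag, count_mag2), default)
```

Because the same rule or table is known to both encoder and decoder, no extra bits are needed to signal the choice in this variant.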


In a further embodiment of the present invention, the invention uses binary values to code coefficient magnitudes. In this embodiment, for example, given MaxMag=4, 00 means a magnitude of 1, 01 means a magnitude of 2, 10 means a magnitude of 3, and 11 means a magnitude of 4. In the event that MaxMag is not a power of two, early truncation of the coding process may be used, since not all binary values are permitted.
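
The fixed-length mapping in the MaxMag=4 example can be sketched as follows. The function names are hypothetical, and the sketch covers only the plain fixed-length case; the early truncation mentioned for values of MaxMag that are not a power of two is not implemented here, because the text does not specify how it is performed.

```python
def bits_needed(max_mag):
    """Number of bits needed to represent the values 0 .. max_mag - 1."""
    return max(1, (max_mag - 1).bit_length())


def encode_magnitude(mag, max_mag):
    """Code a magnitude in 1 .. max_mag as the binary form of (mag - 1)."""
    if not 1 <= mag <= max_mag:
        raise ValueError("magnitude out of range")
    return format(mag - 1, "0{}b".format(bits_needed(max_mag)))


def decode_magnitude(bits):
    """Inverse of encode_magnitude for a complete codeword."""
    return int(bits, 2) + 1
```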


In still another embodiment, a VLC table to be used is embedded in the EOB symbol. This may occur along with, or instead of, the values MaxMag and CountMag2, which are already embedded in the EOB symbol. In this embodiment, the EOB symbol is a function that takes MaxMag, CountMag2 and/or the VLC table index as parameters.


In a particular embodiment of the present invention, the pseudo-code discussed previously with regard to conventional arrangements is modified as follows:


if ( EOBsymbol < 8 )
{
    MaxMag = (EOBsymbol % 2) + 2;
    CountMag2 = (EOBsymbol / 2) + 1;
    VLC = 1;
} else {
    MaxMag = (EOBsymbol % 2) + 4;
    CountMag2 = (EOBsymbol / 16);
    VLC = 2;
}










In the above embodiment, MaxMag is capped at 5, i.e. MaxMag=5 means that the maximum coefficient magnitude is at least 5. Similarly, CountMag2 is capped at 4 for MaxMag<4. The value of EOBsymbol also provides a VLC table index to be used in refining magnitude information. In one embodiment, a VLC=1 indicates exp-Golomb, and a VLC=2 indicates a binary representation of the coefficient magnitude minus 1. It should be noted that the VLC indication may be explicitly coded. In other words, the VLC indication may be determined directly from the EOB symbol value, rather than inferred from the maximum magnitude or number of coefficients.
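
For readers who want to experiment with the mapping, the pseudo-code of this embodiment can be transcribed into Python as shown below. The function name and the tuple return value are assumptions; the arithmetic follows the pseudo-code as written, with integer division assumed for the "/" operator and a non-negative integer EOB symbol assumed as input.

```python
def decode_eob_symbol(eob_symbol):
    """Map an EOB symbol to (MaxMag, CountMag2, VLC) per the pseudo-code above.

    Assumes eob_symbol is a non-negative integer and "/" in the
    pseudo-code denotes integer division.
    """
    if eob_symbol < 8:
        max_mag = (eob_symbol % 2) + 2
        count_mag2 = (eob_symbol // 2) + 1
        vlc = 1   # exp-Golomb in the embodiment described above
    else:
        max_mag = (eob_symbol % 2) + 4
        count_mag2 = eob_symbol // 16
        vlc = 2   # binary representation of magnitude minus 1
    return max_mag, count_mag2, vlc
```

Evaluating the function for symbols 0 through 7 reproduces the stated behavior: MaxMag takes the values 2 or 3 and CountMag2 ranges from 1 to 4.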


In another embodiment of the invention, the pseudo-code is modified as follows:


if ( EOBsymbol < 4 )
{
    MaxMag = 5;
    CountMag2 = 6;
    VLC = EOBsymbol + 1;
} else {
    MaxMag = ((EOBsymbol / 4) % 4) + 1;
    CountMag2 = min(6, EOBsymbol / 16);
    VLC = 0;
}
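
A Python transcription of this second variant is sketched below. The function name is hypothetical, integer division is assumed for "/", and the MaxMag expression is read as ((EOBsymbol / 4) % 4) + 1.

```python
def decode_eob_symbol_v2(eob_symbol):
    """Map an EOB symbol to (MaxMag, CountMag2, VLC) for the second variant.

    Assumes eob_symbol is a non-negative integer and "/" in the
    pseudo-code denotes integer division.
    """
    if eob_symbol < 4:
        # Small symbols carry only a VLC index; both counts sit at fixed values.
        return 5, 6, eob_symbol + 1
    max_mag = ((eob_symbol // 4) % 4) + 1
    count_mag2 = min(6, eob_symbol // 16)
    return max_mag, count_mag2, 0
```

In this variant the four smallest symbols select among VLC indices 1 to 4, while all larger symbols use VLC index 0.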










In another embodiment, the VLC used for encoding high magnitude coefficients involves binarizing the magnitude values and forming a VLC codeword based on bitplane values. This can be particularly useful if the maximum magnitude (MaxMag) is small, but the number of coefficients with magnitude greater than 1 is large. For example, if the vector of coefficients is {1, 1, 2, 1, 1}, using a VLC to code the individual magnitudes of each coefficient is not very beneficial. But, if the VLC codeword is formed based on the bitplane 00100, the probability of the various magnitudes may be better approximated.
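
The bitplane in the example above can be derived as in the following sketch. The function name is hypothetical, and taking bitplanes of the magnitude minus 1 is an assumption inferred from the example, in which magnitude 1 maps to bit 0 and magnitude 2 maps to bit 1. How a VLC codeword is then assigned to each bitplane is not specified in the text and is not shown.

```python
def bitplanes(magnitudes):
    """Return the bitplanes of (magnitude - 1), least significant plane first.

    Each plane is a string with one bit per coefficient. The input must be
    a non-empty sequence of magnitudes, each at least 1.
    """
    offsets = [m - 1 for m in magnitudes]
    n_planes = max(1, max(offsets).bit_length())
    return [
        "".join(str((value >> plane) & 1) for value in offsets)
        for plane in range(n_planes)
    ]
```

For the vector {1, 1, 2, 1, 1} this yields the single plane 00100, matching the example in the text.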


It should be noted that, although the various embodiments of the present invention described herein are described in the context of significance pass coding for FGS slices in the H.264/AVC standard, some or all of these embodiments can be similarly applied to other types of coefficients (e.g., refinement coefficients), other types of values (e.g., pixel values instead of coefficients), other slice types, or other video coders.


The mobile telephone 12 of FIGS. 2 and 3 includes a housing 30, a display 32 in the form of a liquid crystal display, a keypad 34, a microphone 36, an ear-piece 38, a battery 40, an infrared port 42, an antenna 44, a smart card 46 in the form of a UICC according to one embodiment of the invention, a card reader 48, radio interface circuitry 52, codec circuitry 54, a controller 56 and a memory 58. Individual circuits and elements are all of a type well known in the art, for example in the Nokia range of mobile telephones.


Communication devices of the present invention may communicate using various transmission technologies including, but not limited to, Code Division Multiple Access (CDMA), Global System for Mobile Communications (GSM), Universal Mobile Telecommunications System (UMTS), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Transmission Control Protocol/Internet Protocol (TCP/IP), Short Messaging Service (SMS), Multimedia Messaging Service (MMS), e-mail, Instant Messaging Service (IMS), Bluetooth, IEEE 802.11, etc. A communication device may communicate using various media including, but not limited to, radio, infrared, laser, cable connection, and the like.


The present invention is described in the general context of method steps, which may be implemented in one embodiment by a program product including computer-executable instructions, such as program code, executed by computers in networked environments. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of program code for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps.


Software and web implementations of the present invention could be accomplished with standard programming techniques with rule based logic and other logic to accomplish the various database searching steps, correlation steps, comparison steps and decision steps. It should also be noted that the words “component” and “module,” as used herein and in the claims, are intended to encompass implementations using one or more lines of software code, and/or hardware implementations, and/or equipment for receiving manual inputs.


The foregoing description of embodiments of the present invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the present invention to the precise form disclosed, and modifications and variations are possible in light of the above teachings or may be acquired from practice of the present invention. The embodiments were chosen and described in order to explain the principles of the present invention and its practical application to enable one skilled in the art to utilize the present invention in various embodiments and with various modifications as are suited to the particular use contemplated.

Claims
  • 1. A method of decoding a block of values from a bit stream, comprising: decoding from the bit stream an indication of a run length terminated by a value with a magnitude greater than zero; determining whether the indication belongs to a class of symbols indicating that no non-zero values remain to be decoded in the block; if the indication does not belong to the class of symbols, repeating the decoding and determining until all non-zero values in the block have been read; and determining, based on the last decoded indication, whether any of the non-zero values in the block have a magnitude greater than 1, wherein the last decoded indication provides information comprising at least one of: a maximum magnitude of values in the block, a number of values in the block with a magnitude greater than 1, and a variable length code (VLC) index indicating a VLC to be used in decoding precise magnitudes for non-zero values in the block.
  • 2. The method of claim 1, wherein at least one of the maximum magnitude of values in the block and the number of values in the block with a magnitude greater than 1 are limited to particular maximum values.
  • 3. The method of claim 2, wherein any of the particular maximum values are determined from previously decoded information.
  • 4. The method of claim 3, wherein the previously decoded information includes the number of non-zero values in the current block.
  • 5. The method of claim 1, wherein a VLC is used to decode magnitude refinement information following decoding the last decoded indication.
  • 6. The method of claim 5, wherein the VLC depends upon the value of the last decoded indication.
  • 7. A computer program product, embodied in a computer readable medium, for decoding a block of values from a bit stream, comprising: computer code for decoding from the bit stream an indication of a run length terminated by a value with a magnitude greater than zero; computer code for determining whether the indication belongs to a class of symbols indicating that no non-zero values remain to be decoded in the block; computer code for, if the indication does not belong to the class of symbols, repeating the decoding and determining until all non-zero values in the block have been read; and computer code for determining, based on the last decoded indication, whether any of the non-zero values in the block have a magnitude greater than 1, wherein the last decoded indication provides information comprising at least one of: a maximum magnitude of values in the block, a number of values in the block with a magnitude greater than 1, and a variable length code (VLC) index indicating a VLC to be used in decoding precise magnitudes for non-zero values in the block.
  • 8. An apparatus, comprising: a processor; and a memory unit communicatively connected to the processor and including: computer code for decoding from a bit stream an indication of a run length terminated by a value with a magnitude greater than zero; computer code for determining whether the indication belongs to a class of symbols indicating that no non-zero values remain to be decoded in the block; computer code for, if the indication does not belong to the class of symbols, repeating the decoding and determining until all non-zero values in the block have been read; and computer code for determining, based on the last decoded indication, whether any of the non-zero values in the block have a magnitude greater than 1, wherein the last decoded indication provides information comprising at least one of: a maximum magnitude of values in the block, a number of values in the block with a magnitude greater than 1, and a variable length code (VLC) index indicating a VLC to be used in decoding precise magnitudes for non-zero values in the block.
  • 9. The apparatus of claim 8, wherein at least one of the maximum magnitude of values in the block and the number of values in the block with a magnitude greater than 1 are limited to particular maximum values.
  • 10. The apparatus of claim 9, wherein the particular maximum values are determined from previously decoded information.
  • 11. The apparatus of claim 10, wherein the previously decoded information includes the number of non-zero values in the current block.
  • 12. The apparatus of claim 8, wherein a VLC is used to decode magnitude refinement information following decoding the last decoded indication.
  • 13. The apparatus of claim 12, wherein the VLC depends upon the value of the last decoded indication.
  • 14. A method of encoding a block of values into a bit stream, comprising: determining whether any non-zero values in the block remain to be encoded; if any non-zero values remain to be encoded: encoding into the bit stream an indication of a run length terminated by a value with a magnitude greater than zero; returning to the determining of whether any non-zero values remain; if no non-zero values remain to be encoded, encoding into the bit stream an end of block (EOB) symbol, wherein the EOB symbol is selected based upon at least one of a maximum magnitude of values in the block, a number of values in the block with a magnitude greater than 1, and a variable length code (VLC) index indicating a VLC to be used in encoding precise magnitudes for non-zero values in the block.
  • 15. The method of claim 14, wherein at least one of the maximum magnitude of values in the block and the number of values in the block with a magnitude greater than 1 are limited to particular maximum values.
  • 16. The method of claim 15, wherein any of the particular maximum values are determined from previously encoded information.
  • 17. The method of claim 16, wherein the previously encoded information includes the number of non-zero values in the current block.
  • 18. The method of claim 14, wherein a VLC is used to encode magnitude refinement information following encoding of the EOB symbol.
  • 19. The method of claim 18, wherein the VLC depends upon the value of the EOB symbol.
  • 20. A computer program product, embodied in a computer-readable medium, for encoding a block of values into a bit stream, comprising: computer code for determining whether any non-zero values in the block remain to be encoded; computer code for, if any non-zero values remain to be encoded: encoding into the bit stream an indication of a run length terminated by a value with a magnitude greater than zero; returning to the determining of whether any non-zero values remain; computer code for, if no non-zero values remain to be encoded, encoding into the bit stream an end of block (EOB) symbol, wherein the EOB symbol is selected based upon at least one of a maximum magnitude of values in the block, a number of values in the block with a magnitude greater than 1, and a variable length code (VLC) index indicating a VLC to be used in encoding precise magnitudes for non-zero values in the block.
  • 21. An apparatus, comprising: a processor; and a memory unit communicatively connected to the processor and including: computer code for determining whether any non-zero values in the block remain to be encoded; computer code for, if any non-zero values remain to be encoded: encoding into the bit stream an indication of a run length terminated by a value with a magnitude greater than zero; returning to the determining of whether any non-zero values remain; computer code for, if no non-zero values remain to be encoded, encoding into the bit stream an end of block (EOB) symbol, wherein the EOB symbol is selected based upon at least one of a maximum magnitude of values in the block, a number of values in the block with a magnitude greater than 1, and a variable length code (VLC) index indicating a VLC to be used in encoding precise magnitudes for non-zero values in the block.
  • 22. The apparatus of claim 21, wherein at least one of the maximum magnitude of values in the block and the number of values in the block with a magnitude greater than 1 are limited to particular maximum values.
  • 23. The apparatus of claim 22, wherein any of the particular maximum values are determined from previously encoded information.
  • 24. The apparatus of claim 23, wherein the previously encoded information includes the number of non-zero values in the current block.
  • 25. The apparatus of claim 21, wherein a VLC is used to encode magnitude refinement information following encoding of the EOB symbol.
  • 26. The apparatus of claim 25, wherein the VLC depends upon the value of the EOB symbol.
Provisional Applications (1)
Number Date Country
60830389 Jul 2006 US