In some cases, an algorithm may be used to reduce the amount of data that is transmitted between devices. Consider, for example, a media player that outputs moving images to a display device. The media player might retrieve locally stored image information or receive a stream of image information from a media server (e.g., a content provider might transmit a stream that includes information about high-definition image frames to a television, a set-top box, or a digital video recorder through a cable or satellite network). Such image information may be encoded to reduce the amount of data used to represent the image. For example, an image might be divided into smaller image portions, such as macroblocks, so that information encoded with respect to one image portion does not need to be repeated with respect to another image portion (e.g., because neighboring image portions may frequently have similar color and brightness characteristics). Moreover, algorithms such as Variable Length Coding (VLC) or Context-Based Adaptive Binary Arithmetic Coding (CABAC) may be used to reduce the number of bits that are needed to represent the image. Thus, improving the efficiency of such algorithm implementations may improve the performance and/or reduce the cost of such devices.
Although some embodiments will be described with respect to media devices, note that embodiments may be associated with any systems or devices that may benefit from the techniques described herein. Consider, for example, a media player that receives image information, decodes the information, and outputs a signal to a display device. Such a media player might be a Digital Video Recorder (DVR) that retrieves locally stored image information, or a set-top box that receives a stream of image information from a remote device (e.g., a content provider might transmit a stream that includes information about high-definition image frames to the set-top box through a cable or satellite network).
According to some embodiments, the encoders 114 and/or video processing system 140 (e.g., along with a memory unit 150) of the media player 120 may use algorithms to reduce the amount of information that needs to be transmitted in order to represent an image. That is, the encoders 114 may reduce the amount of data that is required to represent image content 112 before the data is transmitted by a transmitter 116 as a stream of image information. As used herein, information may be encoded and/or decoded in accordance with any of a number of different protocols. For example, image information may be processed in connection with International Telecommunication Union-Telecommunications Standardization Sector (ITU-T) recommendation H.264 entitled “Advanced Video Coding for Generic Audiovisual Services” (2004) or the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) Moving Picture Experts Group (MPEG) standard entitled “Advanced Video Coding (Part 10)” (2004).
Unlike a typical VLC algorithm, which decodes a symbol directly from a bit stream, the CABAC approach decodes the symbol from a bin string (e.g., a string of bits) that is computed bin-by-bin from the bit stream. In order to speed up the process, a VLC decoder may read more bits than the symbol being decoded needs (and keep the unused bits in an internal buffer for later use). A CABAC decoder, however, should not read too many bins, because doing so wastes time on unnecessary computation and may degrade the context variables required by the bin computation.
In addition to being a relatively slow process, bin-by-bin decoding may result in poor resource sharing because different binarizations, which contain many bin strings and their corresponding Syntax Element symbols (SE) and are similar to VLC lookup tables, may need different decoding flows. In H.264, for example, 24 syntax elements may use CABAC and more than 15 binarizations may be needed. According to H.264 C reference code, moreover, many CABAC binarizations have individual decoding parsers. As a result, implementations may be complex and costly.
Note that seven of the 24 syntax elements coded with CABAC use Unary (U), Truncated Unary (TU), and Unary k-th Order Exp-Golomb (UEGk) algorithms to create binarizations. Some embodiments described herein may provide an efficient methodology to unify the decode process for these algorithms so that a single bin string decoder, instead of seven bin string decoders, may be used. Such an approach may, for example, substantially reduce the gate count of the implementation.
Note that the H.264 standard does not define how the newly computed bin string (b_0, ..., b_binIdx) should be compared with the bin strings in the binarization. If the system compares the computed string against one bin string of the binarization after another, the implementation may be inefficient. For example, the process might perform N comparisons if there are N symbols in the binarization and an SE is not found. Similarly, the process would perform up to N*6 comparisons if a symbol were found in the 6th iteration. Assuming a maximum bin length of M, up to N*M comparisons may be required. Consider, for example, a binarization with 26 SE symbols (N=26) and a maximum bin length of 7 (M=7); the worst case would then be 26*7=182 comparisons to decode a symbol. Such an approach may be time consuming.
Also note that a CABAC bin string may lend itself to being decoded in a bin-by-bin manner because the bin values are computed one-by-one. Therefore, a “serial search approach” may be appropriate for CABAC bin string decoding. The serial search approach may process an input bit-stream serially, one bit at a time, and utilize a constructed “serial search tree.”
In some cases, the serial search tree is converted into a serial search table stored in a buffer for a software or hardware implementation.
When the search starts, the node value is zero, representing the root of the serial search tree. After an input bit, or bin, is received, the combination of the node and the input forms the address into the search table 400 in the buffer, and the NodeSym and flag are read out.
Here is an example of decoding the string “101” (referring to the search table 400 of
Starting with node=0 and input=1, it can be seen that NodeSym=1 and flag=0. As a result, address 1 will be the next node.
Now looking at node=1 and input=0, it can be seen that NodeSym=6 and flag=0. As a result, address 6 will be the next node.
Now using node=6 and input=1, it can be seen that NodeSym=d and flag=1. Thus, “d” is the symbol and decoding is complete.
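The table walk in this example can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reproduction of the search table 400: the four-symbol prefix code `CODE`, the helper names `build_serial_table` and `decode_one`, and the addressing rule `address = node * 2 + bit` are all hypothetical choices made for the sketch, so the node numbers differ from those in the example above.

```python
# Hypothetical prefix code (NOT the search table 400): a=0, b=11, c=100, d=101.
CODE = {"a": "0", "b": "11", "c": "100", "d": "101"}


def build_serial_table(code):
    """Return {address: (node_sym, flag)} with address = node * 2 + bit (assumed)."""
    table = {}
    next_node = 1  # node 0 is the root of the serial search tree
    for symbol, bits in code.items():
        node = 0
        for i, ch in enumerate(bits):
            addr = node * 2 + int(ch)
            if i == len(bits) - 1:
                table[addr] = (symbol, 1)        # leaf: flag=1, NodeSym is the symbol
            else:
                if addr not in table:            # interior node seen for the first time
                    table[addr] = (next_node, 0)
                    next_node += 1
                node = table[addr][0]            # flag=0, NodeSym is the next node
    return table


def decode_one(table, bits):
    """Consume bits one at a time; return (symbol, number of bits consumed)."""
    node = 0
    for consumed, ch in enumerate(bits, start=1):
        node_sym, flag = table[node * 2 + int(ch)]
        if flag:
            return node_sym, consumed
        node = node_sym
    raise ValueError("bit stream ended before a symbol was completed")


table = build_serial_table(CODE)
print(decode_one(table, "101"))
```

With this hypothetical code, decoding "101" visits one table entry per input bit and stops at the first entry whose flag is set, mirroring the three steps described above.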
Comparing the search tree 320 of
Such a serial search approach may, however, have several disadvantages. For example, the size of the data table required may be substantial (e.g., to contain all necessary symbols and search nodes). Usually, the number of entries in the serial search table 400 is between double and triple the number of potential symbols, depending on the encode table. The more symbols there are, the larger the buffer required to store the search table 400.
As a result, it can be difficult to use a serial search scheme for UEGk binarization because the number of symbols of the coefficient level of H.264 is 256, and UEGk can generate a binarization for an unlimited number of symbols. A smaller buffer is therefore desirable for a hardware implementation.
The UEGk algorithm is used in H.264 CABAC coding for the syntax elements of motion vectors (UEG3 with uCoff=9) and coefficient absolute levels (UEG0 with uCoff=14). The UEGk algorithm is a superset of the TU and Exp-Golomb algorithms, which are often used in H.264 video coding, because UEGk combines the two: the former is the prefix part and the latter is the suffix part of UEGk.
Consider, for example,
In the table 500, the first 9 SE symbols (SE 0 through 8) use TU with uCoff=9 to encode or decode, while the rest of the SE symbols use the 3rd order Exp-Golomb. In other words, the first 9 SE symbols only have prefix bins, while the rest have nine 1's as the prefix bins and several suffix bins coded by the EG3 algorithm (the suffix bins are illustrated in bold in
The following pseudo code, adapted from ITU-T Recommendation H.264, illustrates how to generate the suffix of a UEGk binarization, where sufS=symbol value−uCoff. This pseudo code illustrates that the suffix bins start with a number of leading 1's, followed by a leading 0, followed by a number of extra bins. Assuming the number of leading 1's is N1 and the number of extra bins is NE, then NE=k+N1.
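As a minimal Python sketch of the suffix generation just described (the function name `uegk_suffix` and the use of a string of '0'/'1' characters to hold the bins are assumptions made for illustration, and only unsigned symbol values are handled):

```python
def uegk_suffix(symbol, k, u_coff):
    """Suffix bins of a UEGk bin string, returned as a str of '0'/'1' characters."""
    if symbol < u_coff:
        return ""                      # prefix-only symbols carry no suffix
    suf_s = symbol - u_coff            # sufS = symbol value - uCoff
    bins = []
    while suf_s >= (1 << k):           # each pass emits one leading 1
        bins.append("1")
        suf_s -= 1 << k
        k += 1                         # so k grows to k + N1
    bins.append("0")                   # the leading 0
    for shift in range(k - 1, -1, -1):
        bins.append(str((suf_s >> shift) & 1))   # NE = k + N1 extra bins, MSB first
    return "".join(bins)


for s in (9, 16, 17, 33):
    print(s, uegk_suffix(s, 3, 9))
```

Each pass through the loop emits one leading 1 and increments k, which is why the number of extra bins written after the leading 0 equals the original k plus N1.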
For instance, for symbols SE 9 through 16 in the table 500, N1=0 and NE=k+0=3; there is no leading 1, only a leading 0. The extra bins range from 000 to 111. For symbols SE 17 through 32 in the table 500, N1=1 and NE=k+1=4. The leading bits are 10 (one leading 1 and one leading 0), and the extra bins range from 0000 to 1111. For symbols SE 33 through 64, the leading bits are 110 (two leading 1's and one leading 0), and the extra bins range from 00000 to 11111.
Note that a binarization generated by the UEGk algorithm with a given uCoff may have the following characteristics:
1) For the symbols between 0 and (uCoff−1), the bin strings may have only a prefix part, which includes a number of leading 1's followed by a 0. The number of leading 1's may be equal to the symbol value.
2) For the symbols greater than or equal to uCoff, the bin strings may consist of a prefix and a suffix part:
In addition, the following features of the suffix of the bin string, or the kth order Exp-Golomb code, may be noted:
1) Extra bins with a given length may be lexicographically consecutive with increment 1.
2) Symbols for the bin strings with a given extra bin length may also be consecutive with increment 1.
3) The extra bins, which follow the leading 0, may take all values from all 0's to all 1's.
For example, the extra bin length for the SE symbols 9 through 16 is 3 and their extra bin values are from 000 to 111; the extra bin length for SE symbols 17 through 32 is 4 and their extra bin values are from 0000 to 1111.
Given these features of UEGk, some embodiments may implement a serial search scheme that requires a substantially smaller buffer for decoding. For example, there may be a very close relationship between the extra bin values and the symbol values. For a given extra bin length, if the symbol for the smallest extra bin value is known, then all other symbol values with the same extra bin length may be determined. Moreover, extra bins may cover all possible values for a given extra bin length. As a result, all possible strings with the same extra bin length might share a single leaf of a search tree.
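This relationship can be made concrete. Summing the symbols consumed by the shorter suffixes gives the smallest ("base") symbol for N1 leading 1's as uCoff + (2^N1 − 1)·2^k, and any other symbol with the same extra bin length is that base plus the value of its extra bins. The closed form is derived here from the suffix rules above rather than quoted from the source, and the names `base_symbol` and `symbol_from_extra` are hypothetical. A minimal Python sketch:

```python
def base_symbol(n1, k, u_coff):
    """Smallest symbol whose suffix has n1 leading 1's (derived closed form)."""
    return u_coff + ((1 << n1) - 1) * (1 << k)


def symbol_from_extra(n1, extra_bins, k, u_coff):
    """Symbol = base symbol + value of the extra bins (given as a '0'/'1' string)."""
    if len(extra_bins) != k + n1:
        raise ValueError("expected NE = k + N1 extra bins")
    offset = int(extra_bins, 2) if extra_bins else 0
    return base_symbol(n1, k, u_coff) + offset


# UEG3 with uCoff=9, as in the table 500 discussion: bases 9, 17, 33.
print([base_symbol(n1, 3, 9) for n1 in range(3)])
print(symbol_from_extra(0, "101", 3, 9))
```

Because only the base symbol and the extra bin length need to be stored per leaf, one leaf can stand in for every symbol that shares the same number of leading 1's.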
According to some embodiments, a UEGk search tree can be constructed without having leaves for every SE symbol in the binarization. That is, the tree might use a small number of leaves for the base symbols. For example,
Using the bin string “1111 1111 1010 1” as an example, the first nine 1's drive the traversal to arrive at node “A” in
According to some embodiments, an appropriate UEGk search table is built and saved in the buffer of a UEGk binarization decoder. The following pseudo code describes one method of building such a UEGk search table referred to as UEGkTree[ ]. Each element of the UEGkTree[ ] contains three members, NodeSym, Ext, and Flag:
Note that according to the pseudo code:
1) Each bin of each bin string in a UEGk binarization traverses the search tree (still being constructed) from the root to one leaf.
2) If the bin value is 1, the process has reached a node. If it is a used node, the process takes that node number; otherwise, a new node number is assigned to the node. The node number may be combined with the incoming bin value for the next move.
3) If the bin value is 0, the process has reached a leaf. If it is a prefix bin, and there are no more extra bins, then the symbol is equal to the bin number; otherwise, the extra bin length and the base symbol value are evaluated.
4) When all bins of all bin strings have completed the traversal, the appropriate node numbers, base symbols, extra bin lengths, and flags have been assigned to all nodes and leaves.
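The construction steps listed above can be sketched in Python as follows. This is an illustrative reading of those steps, not the original pseudo code: the helper names `uegk_bins` and `build_uegk_tree`, the dictionary representation of each element, the addressing rule `node * 2 + bin`, and the choice to number each node by its count of leading 1's are assumptions made for the sketch.

```python
def uegk_bins(symbol, k, u_coff):
    """Bin string (TU prefix followed by EGk suffix) for a non-negative symbol."""
    prefix = "1" * min(symbol, u_coff)
    if symbol < u_coff:
        return prefix + "0"
    suf_s, bins = symbol - u_coff, []
    while suf_s >= (1 << k):
        bins.append("1")
        suf_s -= 1 << k
        k += 1
    bins.append("0")
    bins.extend(str((suf_s >> s) & 1) for s in range(k - 1, -1, -1))
    return prefix + "".join(bins)


def build_uegk_tree(max_symbol, k, u_coff):
    """Return a list of {NodeSym, Ext, Flag} entries indexed by node * 2 + bin."""
    table = {}
    for symbol in range(max_symbol + 1):       # ascending, so the base symbol comes first
        bins = uegk_bins(symbol, k, u_coff)
        node = 0
        for pos, ch in enumerate(bins):
            addr = node * 2 + int(ch)
            if ch == "1":
                # Interior node: reuse it if already present, otherwise number it.
                table.setdefault(addr, {"NodeSym": node + 1, "Ext": 0, "Flag": 0})
                node = table[addr]["NodeSym"]
            else:
                # First 0: a leaf shared by all strings with this many leading 1's.
                ext = len(bins) - pos - 1
                table.setdefault(addr, {"NodeSym": symbol, "Ext": ext, "Flag": 1})
                break
    size = 2 * (max(table) // 2 + 1)           # every node owns two addresses
    return [table.get(addr) for addr in range(size)]


tree = build_uegk_tree(128, 3, 9)
print(len(tree), tree[18])
```

Under these assumptions, symbols 0 through 128 of the UEG3 binarization with uCoff=9 yield a 26-entry list in which the last slot is unused, and the entry at address 18 holds base symbol 9 with three extra bins.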
The UEGk search table 700 might be built off-line and stored to a local buffer of a CABAC decoder before starting the process. In such cases, building the table might not affect the performance of the decoder.
According to some embodiments, decoding the bin string may be substantially simpler than building the UEGk search table 700.
At 802, a decoder may input bin data. The bin data may then be used at 804 to compute a search table address. At 806, the search table address is used to read the contents (Flag, Ext, and NodeSym) from the search table buffer. If the value of Flag is not “1” at 808, the process continues to input bin data at 802. If the value of Flag is “1” at 808, the decoder may input a number of extra bins, according to the Ext value, and compute the symbol value at 810. For example, the following pseudo code may demonstrate the binarization decoder according to some embodiments:
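A minimal Python sketch of the flow at 802 through 810 is given below. It is an illustration under stated assumptions rather than the original pseudo code: `make_table` is a hypothetical shortcut that fills the table directly from k, uCoff, and a chosen number of suffix lengths, `decode_symbol` is a hypothetical name, and a plain Python iterable of 0/1 values stands in for the bins that a CABAC engine would deliver one at a time.

```python
def make_table(k, u_coff, levels):
    """Hypothetical direct construction of a UEGk search table (address = node*2+bin)."""
    table = []
    for node in range(u_coff + levels):
        if node < u_coff:
            leaf = {"Flag": 1, "Ext": 0, "NodeSym": node}           # prefix-only symbol
        else:
            n1 = node - u_coff                                        # leading 1's in suffix
            base = u_coff + ((1 << n1) - 1) * (1 << k)
            leaf = {"Flag": 1, "Ext": k + n1, "NodeSym": base}
        table.append(leaf)                                            # address node*2 + 0
        table.append({"Flag": 0, "Ext": 0, "NodeSym": node + 1})      # address node*2 + 1
    return table


def decode_symbol(table, bins):
    """Decode one symbol from an iterable of bins (each 0 or 1)."""
    bins = iter(bins)
    node = 0
    while True:
        addr = node * 2 + next(bins)      # 802/804: input a bin, compute the address
        entry = table[addr]               # 806: read Flag, Ext, and NodeSym
        if entry["Flag"] == 1:            # 808: leaf reached
            break
        node = entry["NodeSym"]
    n = 0
    for _ in range(entry["Ext"]):         # 810: read Ext extra bins, MSB first
        n = n * 2 + next(bins)
    return entry["NodeSym"] + n


table = make_table(k=3, u_coff=9, levels=4)
print(decode_symbol(table, [int(c) for c in "1111111110101"]))
```

In this sketch a bin string longer than the table covers raises an IndexError and a truncated input raises StopIteration; a hardware or production decoder would need its own handling for both cases.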
Using the bin string “1111 1111 1010 1” again as an example, the above pseudo code and the UEGk search table 700 of
1) The address after inputting the 1st ‘1’ is 1 (0*2+1=1) and it contains Flag=0 and Node=1.
2) The address after inputting the 2nd ‘1’ is 3 (1*2+1=3) and it contains Flag=0 and Node=2. The process continues similarly until the address after inputting the 9th ‘1’ is 17 (8*2+1=17) and it contains Flag=0 and Node=9.
3) Finally, the 1st ‘0’ is input and the address becomes 18 (9*2+0=18) and it gets Flag=1, NodeSym=9, and Ext=3.
4) The 1st extra bin is 1 and n=0*2+1=1.
5) The 2nd extra bin is 0 and n=1*2+0=2.
6) The last extra bin is 1 and n=2*2+1=5. Therefore, the Symbol=9+5=14.
Thus, some embodiments of the UEGk bin string decoder described herein may provide a substantially smaller table size. For example, the required buffer size may be one fifth of that of a conventional serial search scheme for the first 64 symbols of the UEG3 binarization with uCoff=9. When searching the first 128 symbols of the same binarization, some embodiments may use a table with only 26 entries, while a conventional serial search scheme might require a table with about 256 entries. Note that such improvements may increase as the binarization size increases.
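These figures can be roughly reproduced with a short calculation. The Python sketch below is an estimate under assumptions that are not stated in the source: both tables are taken to hold two entries per interior node (one per input value), the conventional tree is taken to expand every extra bin into its own nodes, and `table_sizes` and `levels` (the number of distinct suffix lengths covered) are hypothetical names.

```python
def table_sizes(k, u_coff, levels):
    """Return (symbols covered, conventional entries, shared-leaf entries)."""
    chain = u_coff + levels                        # nodes along the all-1's path
    # A full binary subtree over NE = k + i extra bins has 2**NE - 1 interior nodes.
    subtree_nodes = sum((1 << (k + i)) - 1 for i in range(levels))
    conventional = 2 * (chain + subtree_nodes)     # two entries per interior node
    shared_leaf = 2 * chain                        # extra bins are never expanded
    symbols = u_coff + sum(1 << (k + i) for i in range(levels))
    return symbols, conventional, shared_leaf


print(table_sizes(3, 9, 3))   # symbols 0 through 64
print(table_sizes(3, 9, 4))   # symbols 0 through 128
```

Under these assumptions the shared-leaf table grows by two entries per additional suffix length, while the conventional table roughly doubles, which is consistent with the observation that the improvement increases with the binarization size.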
Also note that some embodiments of the UEGk bin string decoder described herein may support other types of binarizations. For example, according to the H.264 specification, seven of the 24 syntax elements that are coded by CABAC use the U, TU, and UEGk algorithms to create their binarizations. All of those binarizations might share the same hardware architecture (because U and TU are subsets of UEGk).
The following illustrates various additional embodiments. These do not constitute a definition of all possible embodiments, and those skilled in the art will understand that many other embodiments are possible. Further, although the following embodiments are briefly described for clarity, those skilled in the art will understand how to make any changes, if necessary, to the above description to accommodate these and other embodiments and applications.
For example, although embodiments have been described herein with respect to a particular video encoding protocol, note that embodiments could be associated with other video encoding protocols and/or non-video encoding protocols. Moreover, although particular tables, values, and pseudo code have been used as examples, other approaches could be used instead to implement any of the embodiments described herein.
The several embodiments described herein are solely for the purpose of illustration. Persons skilled in the art will recognize from this description that other embodiments may be practiced with modifications and alterations limited only by the claims.