The present invention relates to video coding. In particular, the present invention relates to quantization level clipping for High Efficiency Video Coding (HEVC).
High-Efficiency Video Coding (HEVC) is a new international video coding standard that is being developed by the Joint Collaborative Team on Video Coding (JCT-VC). HEVC is based on the hybrid block-based motion-compensated DCT-like transform coding architecture. The basic unit for compression, termed Coding Unit (CU), is a 2N×2N square block, and each CU can be recursively split into four smaller CUs until a predefined minimum size is reached. Each CU contains one or several variable-block-sized Prediction Unit(s) (PUs) and Transform Unit(s) (TUs). For each PU, either intra-picture or inter-picture prediction is selected. Each TU is processed by a spatial block transform and the transform coefficients for the TU are then quantized. The smallest TU size allowed for HEVC is 4×4.
The quantization of transform coefficients plays an important role in bitrate and quality control in video coding. A set of quantization steps is used to quantize the transform coefficient into a quantization level. A larger quantization step size will result in lower bitrate and lower quality. On the other hand, a smaller quantization step size will result in higher bitrate and higher quality. A straightforward implementation of the quantization process would involve a division operation which is more complex in hardware-based implementation and consumes more computational resource in software-based implementation. Accordingly, various techniques have been developed in the field for division-free quantization process. In HEVC Test Model Revision 5 (HM-5.0), the quantization process is described as follows. A set of parameters are defined:
B=bit width or bit depth of the input source video,
DB=B−8,
N=transform size of the transform unit (TU),
M=log2(N),
Q[x]=f(x), where f(x)={26214,23302,20560,18396,16384,14564}, x=0, . . . , 5, and
IQ[x]=g(x), where g(x)={40,45,51,57,64,72}, x=0, . . . , 5.
Q[x] and IQ[x] are called quantization step and dequantization step respectively. The quantization process is performed according to:
qlevel=(coeff*Q[QP %6]+offset)>>(21+QP/6−M−DB), where
offset=1<<(20+QP/6−M−DB), (1)
where “%” is the modulo operator. The dequantization process is performed according to:
coeffQ=((qlevel*IQ[QP%6]<<(QP/6))+offset)>>(M−1+DB), where
offset=1<<(M−2+DB). (2)
The variable qlevel in equations (1) and (2) represents the quantization level of a transform coefficient. The variable coeffQ in equation (2) represents the dequantized transform coefficient. IQ[x] indicates the de-quantization step (also called the de-quantization step size) and QP represents the quantization parameter. “QP/6” in equations (1) and (2) represents the integer part of QP divided by 6. As shown in equations (1) and (2), the quantization and dequantization processes are implemented by integer multiplication followed by arithmetic shift(s). An offset value is added in both equations (1) and (2) to implement integer conversion using rounding.
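By way of illustration only, equations (1) and (2) may be sketched in Python as follows. The function names quantize and dequantize are hypothetical and are not taken from HM-5.0. The sketch applies equation (1) literally to the coefficient value, whereas a practical encoder may process the magnitude and the sign separately.

```python
# Scaling tables Q[x] and IQ[x] as listed in the text above.
Q = [26214, 23302, 20560, 18396, 16384, 14564]
IQ = [40, 45, 51, 57, 64, 72]


def quantize(coeff, qp, m, db):
    """Equation (1): integer multiply, add rounding offset, shift right."""
    shift = 21 + qp // 6 - m - db
    offset = 1 << (shift - 1)          # same as 1 << (20 + QP/6 - M - DB)
    return (coeff * Q[qp % 6] + offset) >> shift


def dequantize(qlevel, qp, m, db):
    """Equation (2): scale by IQ, shift left by QP/6, round, shift right."""
    shift = m - 1 + db
    offset = 1 << (shift - 1)          # same as 1 << (M - 2 + DB)
    return (((qlevel * IQ[qp % 6]) << (qp // 6)) + offset) >> shift
```

For example, with QP=24, M=2 and DB=0, quantize(1000, 24, 2, 0) evaluates to 3 and dequantize(3, 24, 2, 0) evaluates to 960, which shows the expected quantization loss.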
The bit depth of the quantization level is 16 bits (including 1 bit for sign) for HEVC. In other words, the quantization level is represented in 2 bytes or a 16-bit word. Since IQ[x]<=72 and QP<=51, the dynamic range of IQ[x] is 7 bits and the “<<(QP/6)” operation performs a left arithmetic shift of up to 8 bits. Accordingly, the dynamic range of the de-quantized transform coefficient coeffQ, i.e., “(qlevel*IQ[QP%6])<<(QP/6)”, is 31 (16+7+8) bits. Therefore, the de-quantization process as described by equation (2) will never cause overflow when the de-quantization process uses a 32-bit data representation.
However, when a quantization matrix is introduced, the de-quantization process is modified as shown in equations (3) through (5):
iShift=M−1+DB+4. (3)
if(iShift>QP/6),
coeffQ[i][j]=(qlevel[i][j]*W[i][j]*IQ[QP%6]+offset)>>(iShift−QP/6), where
offset=1<<(iShift−QP/6−1), with i=0 . . . nW−1,j=0 . . . nH−1 (4)
else
coeffQ[i][j]=(qlevel[i][j]*W[i][j]*IQ[QP%6])<<(QP/6−iShift) (5)
wherein “[i][j]” indicates the position (also called indices) of the transform coefficient within a transform unit, W denotes the quantization matrix, and nW and nH are the width and height of the transform. If n represents the dynamic range of a quantization level for a transform coefficient, the dynamic range n has to satisfy the following condition to avoid overflow:
n+w+iq+QP/6−M−DB−3≦32, (6)
where w is the dynamic range of quantization matrix W, iq is the dynamic range of IQ[x] and the bit depth of the de-quantized or reconstructed transform coefficient is 32 bits.
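Equations (3) through (5) may be sketched, for a single coefficient position, as shown below. This is a minimal sketch only; the function name dequantize_with_matrix is hypothetical, and the use of a flat weight of 16 in the example is an assumption suggested by the additional shift of 4 in equation (3).

```python
IQ = [40, 45, 51, 57, 64, 72]   # de-quantization steps from the text


def dequantize_with_matrix(qlevel, w, qp, m, db):
    """De-quantize one level with quantization-matrix weight w."""
    i_shift = m - 1 + db + 4                 # equation (3)
    per = qp // 6
    product = qlevel * w * IQ[qp % 6]
    if i_shift > per:
        # Equation (4): rounding offset followed by a right shift.
        shift = i_shift - per
        return (product + (1 << (shift - 1))) >> shift
    # Equation (5): left shift, which is where the dynamic range can grow.
    return product << (per - i_shift)
```

Under the flat-weight assumption, dequantize_with_matrix(3, 16, 24, 2, 0) evaluates to 960, the same value that equation (2) yields for qlevel=3, QP=24, M=2 and DB=0.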
If the dynamic range of the quantization matrix W is 8 bits, the dynamic range of the reconstructed transform coefficient as described by equations (3) through (5) becomes 34 (16+8+7+3) bits for QP=51, M=2 and DB=0. When the de-quantization process uses a 32-bit data representation, the reconstructed transform coefficient according to equations (3) through (5) may overflow and cause system failure. Therefore, it is desirable to develop a scheme for transform coefficient reconstruction that avoids possible overflow.
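The bit count above can be reproduced with a short calculation. The sketch below is illustrative; the helper names dequant_bits and fits_int32 are hypothetical, and the numeric example uses the largest 16-bit level, an 8-bit weight of 255 and the 7-bit step 72 as an assumed worst case.

```python
def dequant_bits(n, w, iq, qp, m, db):
    """Bit width implied by the left shift of equation (5)."""
    i_shift = m - 1 + db + 4              # equation (3)
    return n + w + iq + (qp // 6 - i_shift)


def fits_int32(value):
    """True if value is representable as a signed 32-bit integer."""
    return -(1 << 31) <= value <= (1 << 31) - 1


# QP=51, M=2, DB=0: 16 + 8 + 7 + 3 bits.
bits = dequant_bits(16, 8, 7, 51, 2, 0)

# Assumed worst-case operands shifted left by 3, as in the example above.
worst = (32767 * 255 * 72) << 3
```

Here bits evaluates to 34, and worst equals 4,812,816,960, which exceeds the largest signed 32-bit value of 2,147,483,647.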
A method and apparatus for clipping a quantization level are disclosed. Embodiments according to the present invention avoid overflow of the quantized transform coefficient by clipping the quantization level adaptively after quantization. In one implementation, a method is implemented in a video encoder for clipping a quantization level. The method operates by generating the quantization level for a transform coefficient of a transform unit by quantizing the transform coefficient according to a quantization matrix and quantization parameter, determining a clipping range in the video encoder for one or a combination of a fixed-range clipping condition and a dynamic-range clipping condition, and clipping the quantization level according to the clipping range to generate a clipping-processed quantization level. The quantization level is clipped within a fixed clipping range from −m to m−1 for the fixed-range clipping condition and m corresponds to 128, 32768, or 2147483648.
In another implementation, a video encoding apparatus clips a quantization level. The apparatus comprises at least one circuit configured to: generate the quantization level for a transform coefficient of a transform unit by quantizing the transform coefficient according to a quantization matrix and a quantization parameter, determine a clipping range in the video encoder for one or a combination of a fixed-range clipping condition and a dynamic-range clipping condition, and clip the quantization level according to the clipping range to generate a clipping-processed quantization level. The quantization level is clipped within a fixed clipping range from −m to m−1 for the fixed-range clipping condition and m corresponds to 128, 32768, or 2147483648.
As mentioned before, the coefficient de-quantization (or reconstruction) process as described above may suffer from overflow when a quantization matrix is incorporated. To avoid potential overflow during transform coefficient reconstruction, embodiments according to the present invention restrict the quantization level of the transform coefficient before performing the de-quantization process. The dynamic range of the quantization level of the transform coefficient is represented by an integer n. In the example as described in equations (3) to (5), the reconstructed value shall not exceed 32 bits if a 32-bit data representation is used for the de-quantized (or reconstructed) transform coefficients. With an 8-bit quantization matrix (w=8) and a 7-bit de-quantization step (iq=7), n has to satisfy the following constraint:
n+8+7+(QP/6−(M−1+DB+4))≦32, (7)
which leads to
n≦20+M+DB−QP/6. (8)
In this case, the quantization level, qlevel, of the transform coefficient shall be clipped according to equation (9):
qlevel=max(−2^(n−1),min(2^(n−1)−1,qlevel)) (9)
To avoid the overflow, the dynamic range of the quantization level of the transform coefficient has to be constrained according to equation (8). According to equation (8), n has to be less than or equal to (20+M+DB−QP/6) to avoid overflow. However, since the quantization level is represented by 16 bits in this example (i.e., the bit depth of the quantization level=16), n should not exceed 16 bits. Accordingly, if (20+M+DB−QP/6) is greater than 16, the quantization level of the transform coefficient has to be clipped to a range that does not exceed the 16-bit data representation. The following pseudo code (pseudo code A) illustrates an example of clipping the quantization level, qlevel, of the transform coefficient according to an embodiment of the present invention in order to avoid data overflow during transform coefficient reconstruction:
Pseudo Code A:
if(20+M+DB−QP/6>=16)
  qlevel=max(−2^15,min(2^15−1,qlevel));
else
  qlevel=max(−2^(20+M+DB−QP/6−1),min(2^(20+M+DB−QP/6−1)−1,qlevel)).
As shown in pseudo code A, two clipping ranges are used for two different clipping conditions. The first clipping condition corresponds to “20+M+B−8−QP/6≧16” and the second clipping condition corresponds to “20+M+B−8−QP/6<16”. The first clipping range corresponds to a fixed clipping range, i.e., (−2^15, 2^15−1), and the second clipping range corresponds to (−2^(20+M+DB−QP/6−1), 2^(20+M+DB−QP/6−1)−1). While the test condition “if (20+M+DB−QP/6≧16)” is used in the exemplary pseudo code A shown above, other test conditions may also be used. For example, the test condition may use the bit depth B of the video source instead of parameter DB. The test condition becomes “if (20+M+B−8−QP/6>=16)”, i.e., “if (12+M+B−QP/6>=16)”. The corresponding pseudo code (pseudo code B) becomes:
Pseudo Code B:
if(12+M+B−QP/6>=16)
  qlevel=max(−2^15,min(2^15−1,qlevel));
else
  qlevel=max(−2^(12+M+B−QP/6−1),min(2^(12+M+B−QP/6−1)−1,qlevel)).
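Pseudo codes A and B may be rendered in Python as in the following minimal sketch. The function names clip_a and clip_b are hypothetical; powers of two are written as shifts, and QP/6 is taken as integer division.

```python
def clip_a(qlevel, qp, m, db):
    """Pseudo code A: test written in terms of DB = B - 8."""
    n = 20 + m + db - qp // 6
    if n >= 16:
        # Fixed 16-bit range.
        return max(-(1 << 15), min((1 << 15) - 1, qlevel))
    # Dynamic range of n bits.
    return max(-(1 << (n - 1)), min((1 << (n - 1)) - 1, qlevel))


def clip_b(qlevel, qp, m, b):
    """Pseudo code B: same test written in terms of the bit depth B."""
    n = 12 + m + b - qp // 6
    if n >= 16:
        return max(-(1 << 15), min((1 << 15) - 1, qlevel))
    return max(-(1 << (n - 1)), min((1 << (n - 1)) - 1, qlevel))
```

For an 8-bit source with a 4×4 transform, clip_a(40000, 24, 2, 0) returns 32767 (fixed range), while clip_a(40000, 51, 2, 0) returns 8191 because n is 14 at QP=51. Since DB=B−8, clip_b returns the same values when called with B=8.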
If the bit-depth of source video is 8 bits (DB=0) and the transform size is 4×4, equation (8) can be simplified to:
n≦22−QP/6. (10)
Therefore, the test condition “if (12+M+B−QP/6≧16)” becomes “if (22−QP/6≧16)” in this case. The test condition can be further simplified as “if (QP<=36)”; for QP from 37 to 41, QP/6 still equals 6 and the second branch of pseudo code C below yields the same 16-bit range, so the result is unchanged. Consequently, the clipping process for the quantization level of the transform coefficient according to another embodiment of the present invention depends only on QP for a video source with fixed dynamic range. An exemplary pseudo code (pseudo code C) is shown below:
Pseudo Code C:
if(QP<=36)
  qlevel=max(−2^15,min(2^15−1,qlevel));
else
  qlevel=max(−2^(21−QP/6),min(2^(21−QP/6)−1,qlevel)).
When the bit-depth of source video is 10 bits or higher, i.e., DB≧2, the condition in (7) is always met with n=16, since M≧2 and QP/6≦8 give 20+M+DB−QP/6≧16. In this case, 16-bit clipping, namely qlevel=max(−2^15, min(2^15−1, qlevel)) or qlevel=max(−32,768, min(32,767, qlevel)), is always used unconditionally. While the clipping is performed unconditionally for a bit-depth equal to 10 bits or higher, the quantization level of the transform coefficient may also be clipped unconditionally to a desired bit-depth regardless of the bit-depth of the source video. The desired bit-depth can be 8, 16 or 32 bits and the corresponding clipping ranges can be [−128, 127], [−32768, 32767] and [−2147483648, 2147483647].
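The unconditional clipping ranges and the observation for sources of 10 bits or higher can be checked with the sketch below. The names clip_to_bit_depth, dynamic_bits and sixteen_bit_clip_suffices are hypothetical, and the set of transform sizes (M from 2 to 5, i.e., 4×4 to 32×32) and the QP range of 0 to 51 are assumptions made for the check.

```python
def clip_to_bit_depth(qlevel, bits):
    """Clip qlevel to the signed range of the desired bit depth."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, qlevel))


def dynamic_bits(qp, m, db):
    """Right-hand side of equation (8)."""
    return 20 + m + db - qp // 6


def sixteen_bit_clip_suffices(db, m_values=(2, 3, 4, 5), qp_max=51):
    """True if the bound of equation (8) never drops below 16 bits."""
    return all(dynamic_bits(qp, m, db) >= 16
               for m in m_values
               for qp in range(qp_max + 1))
```

Under these assumptions, sixteen_bit_clip_suffices(2) is true, whereas sixteen_bit_clip_suffices(0) is false because the bound drops to 14 bits at QP=51 for a 4×4 transform.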
Three exemplary pseudo codes incorporating an embodiment of the present invention are described above. These pseudo codes are intended to illustrate an exemplary process to avoid data overflow during transform coefficient reconstruction. A person skilled in the art may practice the present invention by using other test conditions. For example, instead of testing “if (QP<=36)”, the test condition “if (QP/6<=6)” may be used. In another example, the clipping operation may be implemented by using another function such as a clipping function, clip (x, y, z), where the variable z is clipped between x and y (x<y). The clipping operations for pseudo code C can be expressed as:
qlevel=clip(−2^15,2^15−1,qlevel), and
qlevel=clip(−2^(21−QP/6),2^(21−QP/6)−1,qlevel).
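The clip-based form of pseudo code C, together with the alternative test condition, may be sketched as follows. The function names clip_c and clip_c_alt are hypothetical; the sketch assumes an 8-bit source and a 4×4 transform, as in pseudo code C.

```python
def clip(x, y, z):
    """Clip z to the closed interval [x, y], with x < y."""
    return max(x, min(y, z))


def clip_c(qlevel, qp):
    """Pseudo code C with the test 'QP <= 36'."""
    if qp <= 36:
        return clip(-(1 << 15), (1 << 15) - 1, qlevel)
    e = 21 - qp // 6
    return clip(-(1 << e), (1 << e) - 1, qlevel)


def clip_c_alt(qlevel, qp):
    """Same clipping with the alternative test 'QP/6 <= 6'."""
    if qp // 6 <= 6:
        return clip(-(1 << 15), (1 << 15) - 1, qlevel)
    e = 21 - qp // 6
    return clip(-(1 << e), (1 << e) - 1, qlevel)
```

The two tests select different branches only for QP from 37 to 41, where the exponent 21−QP/6 equals 15, so both functions return identical results for every QP from 0 to 51.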
In the above examples, specific parameters are used to illustrate the dequantization process incorporating embodiments of the present invention to avoid data overflow. The specific parameters used shall not be construed as limitations to the present invention. A person skilled in the art may modify the testing for the clipping condition based on the parameters provided. For example, if the de-quantization step has a 6-bit dynamic range instead of a 7-bit dynamic range, the term 7 in equation (7) is replaced by 6 and the constraint of equation (8) becomes n≦21+M+DB−QP/6. The corresponding clipping condition testing in pseudo code A becomes “if (21+M+DB−QP/6>=16)”.
While the above quantization level clipping process is performed for the decoder side, the quantization level clipping process can also be performed in the encoder side after quantization. To avoid potential overflow, embodiments according to the present invention restrict the quantization level of the transform coefficient after performing the quantization process. The clipping condition may be based on the quantization matrix, the quantization parameter, de-quantization step, video source bit-depth, transform size of the transform unit, or any combination thereof. The clipping condition may also include a null clipping condition, where no clipping condition is set. In other words, the null condition corresponds to unconditional clipping that always clips the quantization level to a range. In an embodiment, the quantization level can be clipped to a first range for a first clipping condition and the quantization level can be clipped to a second range for a second clipping condition. The first range may correspond to a fixed range related to quantization-level bit-depth and the second range may be related to dynamic range of the quantization level. The clipping condition can be determined by comparing a first weighted value with a threshold, wherein the first weighted value corresponds to a first linear function of the quantization matrix, the quantization parameter, the video source bit-depth, the transform size of the transform unit, or any combination thereof. Furthermore, the threshold may correspond to a fixed value or a second weighted value, wherein the second weighted value corresponds to a second linear function of the quantization matrix, the quantization parameter, the video source bit-depth, the transform size of the transform unit, or any combination thereof.
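The effect of clipping the quantization level before reconstruction can be checked numerically, as in the sketch below. It is illustrative only: the names clip_level, reconstruct and worst_case_fits are hypothetical, an 8-bit quantization matrix with a maximum weight of 255 is assumed, and extreme input levels are used to force the clipping to its limits.

```python
IQ = [40, 45, 51, 57, 64, 72]
INT32_MIN, INT32_MAX = -(1 << 31), (1 << 31) - 1


def clip_level(qlevel, qp, m, db):
    """Clip to n bits, n = min(16, 20 + M + DB - QP/6), per equation (8)."""
    n = min(16, 20 + m + db - qp // 6)
    return max(-(1 << (n - 1)), min((1 << (n - 1)) - 1, qlevel))


def reconstruct(qlevel, w, qp, m, db):
    """De-quantization with a matrix weight w, equations (3) to (5)."""
    i_shift = m - 1 + db + 4
    per = qp // 6
    product = qlevel * w * IQ[qp % 6]
    if i_shift > per:
        shift = i_shift - per
        return (product + (1 << (shift - 1))) >> shift
    return product << (per - i_shift)


def worst_case_fits(qp, m, db, w_max=255):
    """True if the clipped extremes reconstruct within 32 bits."""
    for raw in (-(1 << 40), 1 << 40):
        value = reconstruct(clip_level(raw, qp, m, db), w_max, qp, m, db)
        if not INT32_MIN <= value <= INT32_MAX:
            return False
    return True
```

Under these assumptions, worst_case_fits holds for every QP from 0 to 51 with M from 2 to 5 and DB of 0 or 2, whereas the unclipped level 32767 at QP=51, M=2 and DB=0 reconstructs to a value above the signed 32-bit maximum.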
The quantization level of the transform coefficient may also be clipped unconditionally to a desired bit-depth regardless of the bit-depth of source video. The desired bit-depth can be 8, 16 or 32 bits and the corresponding clipping ranges can be [−128, 127], [−32768, 32767] and [−2147483648, 2147483647].
The flow chart in
The above description is presented to enable a person of ordinary skill in the art to practice the present invention as provided in the context of a particular application and its requirement. Various modifications to the described embodiments will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed. In the above detailed description, various specific details are illustrated in order to provide a thorough understanding of the present invention. Nevertheless, it will be understood by those skilled in the art how the present invention may be practiced.
Embodiments of the present invention as described above may be implemented in hardware, software code, or a combination of both. For example, an embodiment of the present invention can be a circuit integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or a field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software code and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
The present invention is a Continuation of pending U.S. patent application Ser. No. 15/079,341, filed on Mar. 24, 2016, which is a Divisional of U.S. application Ser. No. 13/985,779, filed on Aug. 15, 2013 (now U.S. Pat. No. 9,420,296, issued on Aug. 16, 2016), which is a National Stage of PCT Patent Application, Serial No. PCT/CN2012/086648, filed on Dec. 14, 2012, entitled “Method of Clipping Transformed Coefficients before De-Quantization,” which is a Continuation-In-Part of PCT Patent Application Ser. No. PCT/CN2011/084083, filed on Dec. 15, 2011. The priority applications are hereby incorporated by reference in their entirety.
Number | Date | Country | |
---|---|---|---|
20170094275 A1 | Mar 2017 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 13985779 | US | |
Child | 15079341 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 15079341 | Mar 2016 | US |
Child | 15375574 | US |
Number | Date | Country | |
---|---|---|---|
Parent | PCT/CN2011/084083 | Dec 2011 | US |
Child | 13985779 | US |