The present disclosure is related generally to video coding and, more particularly, to coding the coefficients of adaptive-loop filters that are used in video coding.
Video compression (i.e., coding) systems generally employ block processing for most compression operations. A block is a group of neighboring pixels and is considered a “coding unit” for purposes of compression. Theoretically, a larger coding unit size is preferred to take advantage of correlation among immediately neighboring pixels. Certain video coding standards, such as Moving Picture Experts Group (“MPEG”)-1, MPEG-2, and MPEG-4, use a coding unit size of 4 by 4, 8 by 8, or 16 by 16 pixels (the 16-by-16 unit is known as a macroblock).
High efficiency video coding (“HEVC”) is an alternative video coding standard that also employs block processing. As shown in
Each CU includes one or more prediction units (“PUs”).
Further, each CU-partition of PUs is associated with a set of transform units (“TUs”). Like other video coding standards, HEVC applies a block transform to residual data to decorrelate the pixels within a block and to compact the block energy into low-order transform coefficients. However, unlike other standards that apply a single 4 by 4 or 8 by 8 transform to a macroblock, HEVC can apply a set of block transforms of different sizes to a single CU. The set of block transforms to be applied to a CU is represented by its associated TUs. By way of example,
Once a block transform operation has been applied with respect to a particular TU, the resulting transform coefficients are quantized to reduce the size of the coefficient data. The quantized transform coefficients are then entropy coded, resulting in a final set of compression bits. HEVC currently offers an entropy coding scheme known as context-based adaptive binary arithmetic coding (“CABAC”). CABAC can provide efficient compression due to its ability to adaptively select context models (i.e., probability models) for arithmetically coding input symbols based on previously coded symbol statistics. However, the context model selection process in CABAC (referred to as context modeling) is complex and requires significantly more processing power for encoding and decoding than do other compression schemes.
While the appended claims set forth the features of the present techniques with particularity, these techniques, together with their objects and advantages, may be best understood from the following detailed description taken in conjunction with the accompanying drawings of which:
Turning to the drawings, wherein like reference numerals refer to like elements, techniques of the present disclosure are illustrated as being implemented in a suitable environment. The following description is based on embodiments of the claims and should not be taken as limiting the claims with regard to alternative embodiments that are not explicitly described herein.
The term “coding” as used herein includes both encoding and decoding. Thus, when the present disclosure (including the flowcharts 900, 1000, 1100, and 1200) sets forth steps for coding, persons of ordinary skill in the art recognize that the steps are to be executed in the appropriate order: an encoder executes the steps in a sequence appropriate for encoding, while a decoder executes them in a sequence appropriate for decoding.
In video coding, as with other types of coding, a major goal is to minimize the amount of memory that the information occupies. In many cases, this means compressing the actual video data. But overhead information also takes up memory and should also be coded efficiently.
In accordance with the foregoing, a method for coding filter coefficients is now described.
One embodiment of the method unifies HEVC Adaptive-Loop Filter (“ALF”) coefficient coding with coeff_abs_level_remaining coding by using the same binarization scheme for both.
Another embodiment removes the parameter mapping table for the Luma ALF coefficients when the Luma ALF coefficients are binarized and coded with a unary code and a variable length code.
Yet another embodiment uses the same parameter value for different Luma ALF coefficients at different positions when the Luma ALF coefficients are binarized and coded with a unary code and a variable length code.
Still another embodiment uses parameter values of 0, 1, 2, 3, 4, or 5 for different Luma ALF coefficients at different positions when the Luma ALF coefficients are binarized and coded with a unary code and a variable length code.
Still another embodiment uses parameter values of 0, 1, 2, 3, 4, or 5 for different Chroma ALF coefficients at different positions when the Chroma ALF coefficients are binarized and coded with a unary code and a variable length code.
Yet another embodiment removes the parameter mapping table for the Luma ALF coefficients and the Luma ALF coefficients are binarized and coded with k-variable Exp-Golomb codewords where k is larger than 0.
A further embodiment uses k-variable Exp-Golomb codewords for all the Luma ALF coefficient binarization and coding where k could be 1, 2, 3, 4, 5, or larger.
Another embodiment uses k-variable Exp-Golomb codewords for all the Chroma ALF coefficient binarization and coding where k could be 1, 2, 3, 4, 5, or larger.
Another embodiment uses fixed-length codewords for all the Luma ALF coefficient binarization and coding where the length could be 4, 5, 6, 7, 8, or larger.
Still another embodiment uses fixed-length codewords for all the Chroma ALF coefficient binarization and coding where the length could be 4, 5, 6, 7, 8, or larger.
While the embodiments described are suitable in many video coding contexts,
As shown, encoder 500 receives as input a current PU “x.” PU x corresponds to a CU (or a portion thereof), which is in turn a partition of an input picture (e.g., video frame) that is being encoded. Given PU x, a prediction PU “x′” is obtained through either spatial prediction or temporal prediction (via spatial-prediction block 502 or temporal-prediction block 504). PU x′ is then subtracted from PU x to generate a residual PU “e.”
Once generated, residual PU e is passed to a transform block 506, which is configured to perform one or more transform operations on PU e. Examples of such transform operations include the discrete sine transform, the discrete cosine transform (“DCT”), and variants thereof (e.g., DCT-I, DCT-II, DCT-III). Transform block 506 then outputs residual PU e in a transform domain (“E”), such that transformed PU E comprises a two-dimensional array of transform coefficients. In this block, a transform operation can be performed with respect to each TU that has been associated with the CU corresponding to PU e (as described with respect to
Transformed PU E is passed to a quantizer 508, which is configured to convert, or quantize, the relatively high precision transform coefficients of PU E into a finite number of possible values. After quantization, transformed PU E is entropy coded via entropy-coding block 510. This entropy coding process compresses the quantized transform coefficients into final compression bits that are subsequently transmitted to an appropriate receiver or decoder. Entropy-coding block 510 can use various types of entropy coding schemes, such as CABAC. A particular embodiment of entropy-coding block 510 that implements CABAC is described in further detail below.
In addition to the foregoing steps, encoder 500 includes a decoding process in which a dequantizer 512 dequantizes the quantized transform coefficients of PU E into a dequantized PU “E′.” PU E′ is passed to an inverse transform block 514, which is configured to inverse transform the dequantized transform coefficients of PU E′ and thereby generate a reconstructed residual PU “e′.” Reconstructed residual PU e′ is then added to the original prediction PU x′ to form a new, reconstructed PU “x″.” A loop filter 516 performs various operations on reconstructed PU x″ to smooth block boundaries and minimize coding distortion between the reconstructed pixels and original pixels. The loop filter 516 can be made up of multiple filters. In the embodiments described below, the loop filter 516 is an ALF. Reconstructed PU x″ is then used as a prediction PU for encoding future frames of the video content. For example, if reconstructed PU x″ is part of a reference frame, then reconstructed PU x″ can be stored in a reference buffer 518 for future temporal prediction.
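By way of illustration only, the following sketch traces the transform, quantize, dequantize, and inverse-transform path described above for a single residual PU, using SciPy's DCT routines. The 4-by-4 block size, the uniform scalar quantizer, and the orthonormal two-dimensional DCT are illustrative assumptions and are not the normative HEVC transform or quantizer design.

```python
# A minimal sketch of the encoder's transform/quantize/reconstruct loop.
# The 4x4 block size, uniform quantization step, and orthonormal 2-D DCT
# are illustrative assumptions only.
import numpy as np
from scipy.fft import dctn, idctn

def encode_decode_residual(x, x_pred, q_step=8.0):
    """Transform, quantize, dequantize, and inverse-transform a residual PU."""
    e = x.astype(float) - x_pred.astype(float)    # residual PU e = x - x'
    E = dctn(e, type=2, norm='ortho')             # transformed PU E
    levels = np.round(E / q_step)                 # quantized coefficients (to entropy coding)
    E_rec = levels * q_step                       # dequantized PU E'
    e_rec = idctn(E_rec, type=2, norm='ortho')    # reconstructed residual e'
    x_rec = x_pred.astype(float) + e_rec          # reconstructed PU x''
    return levels, x_rec

x = np.random.randint(0, 256, (4, 4))                            # current PU
x_pred = np.clip(x + np.random.randint(-3, 4, (4, 4)), 0, 255)   # prediction PU x'
levels, x_rec = encode_decode_residual(x, x_pred)
```

Because quantization discards precision, reconstructed PU x″ generally differs from the original PU x; the loop filter described above operates on this reconstruction.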
As shown, decoder 600 receives as input a bitstream of compressed data, such as the bitstream output by encoder 500. The input bitstream is passed to an entropy-decoding block 602, which is configured to perform entropy decoding on the bitstream to generate quantized transform coefficients of a residual PU. In one embodiment, entropy-decoding block 602 is configured to perform the inverse of the operations performed by entropy-coding block 510 of encoder 500. Entropy-decoding block 602 can use various types of entropy coding schemes, such as CABAC. A particular embodiment of entropy-decoding block 602 that implements CABAC is described in further detail below.
Once generated, the quantized transform coefficients are dequantized by dequantizer 604 to generate a residual PU “E′.” PU E′ is passed to an inverse transform block 606, which is configured to inverse transform the dequantized transform coefficients of PU E′ and thereby output a reconstructed residual PU “e′.” Reconstructed residual PU e′ is then added to a previously decoded prediction PU x′ to form a new, reconstructed PU “x″.” A loop filter 608 performs various operations on reconstructed PU x″ to smooth block boundaries and minimize coding distortion between the reconstructed pixels and original pixels. The loop filter 608 can be made up of multiple filters. In the embodiments described below, the loop filter 608 is an ALF. Reconstructed PU x″ is then used to output a reconstructed video frame. In certain embodiments, if reconstructed PU x″ is part of a reference frame, then reconstructed PU x″ can be stored in a reference buffer 610 for reconstruction of future PUs (via, e.g., spatial-prediction block 612 or temporal-prediction block 614).
As noted with respect to
Generally speaking, the process of encoding a syntax element using CABAC includes three elementary steps: (1) binarization, (2) context modeling, and (3) binary arithmetic coding. In the binarization step, the syntax element is converted into a binary sequence or bin string (if it is not already binary valued). In the context-modeling step, a context model is selected (from a list of available models per the CABAC standard) for one or more bins (i.e., bits) of the bin string. The context-model selection process can differ based on the particular syntax element being encoded as well as on the statistics of recently encoded elements. In the arithmetic coding step, each bin is encoded (via an arithmetic coder) based on the selected context model. The process of decoding a syntax element using CABAC corresponds to the inverse of these steps.
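As a simplified illustration of the binarization step alone (context modeling and arithmetic coding are omitted), the following sketch produces a unary bin string for a non-negative syntax element value; it is not the normative HEVC binarization of any particular syntax element.

```python
def unary_binarize(value):
    """Unary binarization: 'value' one-bins terminated by a zero-bin.

    A simplified illustration of CABAC's binarization step; context
    modeling and arithmetic coding would then process these bins.
    """
    return '1' * value + '0'

assert unary_binarize(0) == '0'
assert unary_binarize(3) == '1110'
```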
At block 702, entropy-coding block 510 or entropy-decoding block 602 codes a last significant coefficient position that corresponds to the (y, x) coordinates of the last significant (i.e., non-zero) transform coefficient in the current TU (for a given scanning pattern).
With respect to the encoding process, block 702 includes binarizing a last_significant_coeff_y syntax element (corresponding to the y coordinate) and binarizing a last_significant_coeff_x syntax element (corresponding to the x coordinate). Block 702 further includes selecting a context model for the last_significant_coeff_y and last_significant_coeff_x syntax elements, where the context model is selected based on a predefined context index (lastCtx) and a context index increment (lastIndInc).
Once a context model is selected, the last_significant_coeff_y and last_significant_coeff_x syntax elements are arithmetically coded using the selected model.
At block 704, entropy-coding block 510 or entropy-decoding block 602 codes a binary significance map associated with the current TU, where each element of the significance map (represented by the syntax element significant_coeff_flag) is a binary value that indicates whether or not the transform coefficient at the corresponding location in the TU is non-zero. Block 704 includes scanning the current TU and selecting, for each transform coefficient in scanning order, a context model for the transform coefficient. The selected context model is then used to arithmetically code the significant_coeff_flag syntax element associated with the transform coefficient. The selection of the context model is based on a base context index (“sigCtx”) and a context index increment (“sigIndInc”). Variables sigCtx and sigIndInc are determined dynamically for each transform coefficient using a neighbor-based scheme that takes into account the transform coefficient's position as well as the significance map values for one or more neighbor coefficients around the current transform coefficient.
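A toy illustration of blocks 702 and 704 follows: given a TU, it derives the last significant coefficient position and the binary significance map. The row-major scan is an illustrative assumption; HEVC's actual scanning patterns (e.g., up-right diagonal) and the context-model selection itself are omitted.

```python
import numpy as np

def significance_info(tu):
    """Derive the last significant position and significance map for a TU.

    Uses a simple row-major scan for illustration; HEVC's actual scans
    and the sigCtx/sigIndInc context selection are omitted.
    """
    sig_map = (tu != 0).astype(int)    # significant_coeff_flag per position
    last_pos = None
    for y in range(tu.shape[0]):
        for x in range(tu.shape[1]):
            if tu[y, x] != 0:
                last_pos = (y, x)      # (last_significant_coeff_y, _x)
    return last_pos, sig_map

tu = np.array([[5, 0, 1, 0],
               [0, 2, 0, 0],
               [0, 0, 0, 0],
               [0, 0, 0, 0]])
last_pos, sig_map = significance_info(tu)   # last_pos == (1, 1)
```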
At block 706 of
Referring back to
When processing the current pixel, the encoder or decoder performs a series of computations involving the pixel values (Luma or Chroma, ranging from 0 to 255). The computations can include multiplying each coefficient by the value of the pixel with which it is associated and summing the products. The purpose of the loop filter is to minimize coding distortion. The loop filter is applied to the reconstructed pixel for the purpose of adjusting its Luma or Chroma to be as close as possible to that of the original pixel.
In the current implementation of HEVC, the loop filter is a 10-tap symmetric two-dimensional Finite Impulse Response filter.
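The multiply-and-sum structure of such a symmetric filter can be sketched as follows. The 13-position diamond footprint with seven unique coefficients is an illustrative assumption chosen for brevity; the HEVC filter described above has 10 distinct coefficients, but the point-symmetric multiply-accumulate computation is the same.

```python
import numpy as np

# Unique coefficient positions: the center tap plus six symmetric pairs
# of offsets (a 13-position diamond). Offsets in each inner list share
# one coefficient, reflecting the filter's point symmetry.
OFFSETS = [[(0, 0)],
           [(0, 1), (0, -1)],
           [(1, 0), (-1, 0)],
           [(1, 1), (-1, -1)],
           [(1, -1), (-1, 1)],
           [(0, 2), (0, -2)],
           [(2, 0), (-2, 0)]]

def alf_filter_pixel(frame, y, x, coeffs):
    """Multiply each coefficient by its associated pixel(s) and sum the products."""
    acc = 0.0
    for coeff, taps in zip(coeffs, OFFSETS):
        for dy, dx in taps:
            acc += coeff * frame[y + dy, x + dx]
    return max(0, min(255, int(round(acc))))   # clip to the 0-255 pixel range

frame = np.random.randint(0, 256, (16, 16)).astype(float)
coeffs = [0.4] + [0.05] * 6     # sums to 1.0 over all 13 taps
filtered = alf_filter_pixel(frame, 8, 8, coeffs)
```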
The filter is also adaptive in that the coefficients change according to circumstances. This loop filter is referred to herein as an ALF. In
The encoder 500 and decoder 600 binarize and code the ALF coefficients in a manner that minimizes the amount of memory used for the coefficients. Currently, HEVC binarizes and codes the ALF coefficients using fixed k parameter Exp-Golomb coding.
Table 1 is an example of a k-parameter mapping table, which maps a hypothetical set of k values for the filter of
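For reference, the following sketch gives the standard construction of a k-th order Exp-Golomb codeword for a non-negative value. The mapping of signed ALF coefficients to non-negative code numbers, and the lookup of k from a table such as Table 1, are assumed and not shown.

```python
def exp_golomb_encode(n, k):
    """k-th order Exp-Golomb codeword for a non-negative integer n.

    The codeword is a zero-run prefix followed by the binary value of
    n + 2**k; larger k shortens the prefix for large n at the cost of
    longer codewords for small n.
    """
    value = n + (1 << k)
    prefix = '0' * (value.bit_length() - k - 1)
    return prefix + format(value, 'b')

assert exp_golomb_encode(0, 0) == '1'      # 0th order: 0 -> '1', 1 -> '010'
assert exp_golomb_encode(1, 0) == '010'
assert exp_golomb_encode(0, 1) == '10'     # 1st order: 0 -> '10'
```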
Currently, HEVC uses fixed k parameter Exp-Golomb coding only for ALF coefficient coding and uses other coding schemes for other types of data. For example, HEVC binarizes and codes the remainder of the absolute value of a quantized transform coefficient level, referred to in HEVC by the syntax element coeff_abs_level_remaining, using two-part coding: a unary code followed by a variable-length code. In effect, coeff_abs_level_remaining is binarized and coded as two codewords that are concatenated.
The length of the variable-length code depends on the unary code and on a parameter k that ranges from 0 to 4.
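The two-part structure can be sketched in the Golomb-Rice style: the unary prefix carries the quotient of the value by 2^k, and a k-bit suffix carries the remainder. This quotient/remainder split is a simplified illustration; HEVC's full coeff_abs_level_remaining binarization additionally escapes to Exp-Golomb codewords for long prefixes and adapts k as larger levels are observed, both of which are omitted here.

```python
def rice_binarize(value, k):
    """Two-part code: unary prefix plus k-bit variable-length suffix.

    A simplified Golomb-Rice illustration of the unary-plus-VLC scheme
    described above; the Exp-Golomb escape for long prefixes is omitted.
    """
    prefix = '1' * (value >> k) + '0'                 # unary part: quotient of value / 2**k
    suffix = format(value & ((1 << k) - 1), '0{}b'.format(k)) if k else ''
    return prefix + suffix

assert rice_binarize(0, 0) == '0'
assert rice_binarize(5, 2) == '1001'   # quotient 1 -> '10', remainder 1 -> '01'
```

A larger k shortens the unary prefix for large values at the cost of a longer suffix for small values, which is why the choice of k matters for coding efficiency.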
In one embodiment, the ALF coefficient binarization and coding scheme is the same as the coeff_abs_level_remaining binarization and coding scheme: the encoder 500 and the decoder 600 binarize and code both the coeff_abs_level_remaining values and the ALF coefficients using a single shared coding scheme. In one such embodiment, that scheme is the combination of unary coding and variable-length coding described above.
The flowchart 900 of
In various embodiments of the disclosure, the encoder 500 and the decoder 600 binarize and code the Luma ALF coefficients using the combination of unary coding and variable-length coding but with no parameter mapping table. In each such embodiment, the k-parameter value can be the same for Luma and Chroma, or the Luma value can differ from the Chroma value.
The flowchart 1000 of
In one embodiment, the encoder 500 and the decoder 600 binarize and code the Luma ALF coefficients using the combination of unary coding and variable-length coding and do so using the same k-parameter value for different Luma ALF coefficients at different positions, i.e., they use the same k for each pixel. In a more specific embodiment, the k-parameter value is 0, 1, 2, 3, 4, or 5.
In another embodiment, the encoder 500 and the decoder 600 binarize and code the Chroma ALF coefficients using the combination of unary coding and variable-length coding, and do so using the same k-parameter value for different Chroma ALF coefficients at different positions, i.e., they use the same k for each pixel. In a more specific embodiment, the k-parameter value can be 0, 1, 2, 3, 4, or 5.
In yet another embodiment, the encoder 500 and the decoder 600 binarize and code the ALF coefficients without a parameter mapping table for the Luma ALF coefficients. In this embodiment, the encoder 500 and the decoder 600 binarize and code the Luma ALF coefficients with k-variable Exp-Golomb codewords, where k is larger than 0. In other words, the encoder 500 and the decoder 600 use the same k parameter for all Luma ALF coefficients but use k values greater than 0.
The flowchart 1100 of
In a further embodiment, the encoder 500 and the decoder 600 binarize and code the Luma ALF coefficients with k-variable Exp-Golomb codewords, where k is 1, 2, 3, 4, 5, or larger. In other words, the encoder 500 and the decoder 600 use the same k parameter for all Luma ALF coefficients but use k values of 1, 2, 3, 4, 5, or larger.
In still another embodiment, the encoder 500 and the decoder 600 binarize and code the Chroma ALF coefficients with k-variable Exp-Golomb codewords, where k is 1, 2, 3, 4, 5, or larger. In other words, the encoder and decoder use the same k parameter for all Chroma ALF coefficients but use k values of 1, 2, 3, 4, 5, or larger.
In yet another embodiment, the encoder 500 and the decoder 600 binarize and code all of the Luma ALF coefficients with fixed-length codewords where the length (in bits) is 4, 5, 6, 7, 8, or larger.
In a further embodiment, the encoder 500 and the decoder 600 binarize and code all of the Chroma ALF coefficients with fixed-length codewords where the length (in bits) is 4, 5, 6, 7, 8, or larger.
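By way of illustration, a fixed-length binarization can be sketched as follows. The two's-complement mapping of signed coefficient values into n bits is an assumption made for the example; the embodiments above require only that every coefficient map to a codeword of the same fixed length.

```python
def fixed_length_binarize(coeff, n_bits):
    """Map a signed ALF coefficient to an n-bit two's-complement codeword.

    The two's-complement mapping of signed values is an illustrative
    assumption; any fixed n-bit mapping would serve the embodiment.
    """
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    if not lo <= coeff <= hi:
        raise ValueError('coefficient out of range for %d bits' % n_bits)
    return format(coeff & ((1 << n_bits) - 1), '0{}b'.format(n_bits))

assert fixed_length_binarize(5, 6) == '000101'
assert fixed_length_binarize(-3, 6) == '111101'
```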
The flowchart 1200 of
In view of the many possible embodiments to which the principles of the present discussion may be applied, it should be recognized that the embodiments described herein with respect to the drawing figures are meant to be illustrative only and should not be taken as limiting the scope of the claims. Therefore, the techniques as described herein contemplate all such embodiments as may come within the scope of the following claims and equivalents thereof.
Related U.S. Application Data: U.S. Provisional Application No. 61/669,136, filed July 2012.