The present invention relates to an image decoding device, an image decoding method, and a program.
A moving image coding system using intra-prediction or inter-prediction, and transform/quantization and entropy coding of a prediction residual signal has been proposed (for example, Non-Patent Literature 1).
Non-Patent Literature 1 discloses applying a Golomb-Rice code to coding of a residual coefficient. Here, two types, normal residual coding and transform skip residual coding, are defined as the coding of a residual coefficient.
In normal residual coding, it is prescribed that, with reference to a predefined table, a Rice parameter is derived on the basis of the magnitudes of surrounding coded coefficients.
On the other hand, in transform skip residual coding, it is prescribed that the Rice parameter is always 1.
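As background for the Rice parameter discussed above, a plain Golomb-Rice code can be sketched as follows. This is a simplified illustration and not the exact binarization of Non-Patent Literature 1; the function names `golomb_rice_encode` and `golomb_rice_length` are hypothetical.

```python
def golomb_rice_encode(value: int, k: int) -> str:
    """Return the plain Golomb-Rice codeword of a non-negative integer as a bit string.

    k is the Rice parameter: the quotient value >> k is sent in unary,
    and the k low-order bits are sent as a fixed-length remainder.
    """
    if value < 0 or k < 0:
        raise ValueError("value and k must be non-negative")
    quotient = value >> k
    remainder = value & ((1 << k) - 1)
    prefix = "1" * quotient + "0"          # unary part, terminated by a 0
    suffix = format(remainder, f"0{k}b") if k > 0 else ""
    return prefix + suffix


def golomb_rice_length(value: int, k: int) -> int:
    """Codeword length in bits: unary quotient, terminator, k remainder bits."""
    return (value >> k) + 1 + k
```

Under this sketch, a level of 100 costs 52 bits with a Rice parameter of 1 but 11 bits with a Rice parameter of 4, which illustrates why a parameter matched to the coefficient magnitudes matters.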
Here, in Non-Patent Literature 1, the method for deriving a Rice parameter is set on the assumption of the tendency of general coefficient distributions obtained in a case where various quantization parameters are applied to 8-bit video or 10-bit video.
Non-Patent Literature 1: ITU-T H.266 Versatile Video Coding
However, in a case where the bit depth is deep or the quantization parameter is small, the distribution of residual coefficients differs from the assumed tendency; therefore, the prescription in Non-Patent Literature 1 described above has a problem in that an appropriate Rice parameter is not selected and coding performance is reduced.
Thus, the present invention has been made in view of the above-described problem, and an object of the present invention is to provide an image decoding device, an image decoding method, and a program capable of deriving a Rice parameter suitable for distribution of residual coefficients to be generated and improving coding performance even in a case where the bit depth is deep or the quantization parameter is small.
The first aspect of the present invention is summarized as an image decoding device configured to decode coded data, the image decoding device including: a local addition unit that adds level values of decoded residual coefficients existing around a residual coefficient to be decoded and derives a sum total value of level; a transform unit that transforms the sum total value of level into an amount of shift and an index; a table reference unit that refers to a predefined table and outputs a pre-correction Rice parameter corresponding to the index; and a correction unit that derives a post-correction Rice parameter on the basis of the pre-correction Rice parameter and the amount of shift.
The second aspect of the present invention is summarized as an image decoding method for decoding coded data, the image decoding method including: a step of adding level values of decoded residual coefficients existing around a residual coefficient to be decoded and deriving a sum total value of level; a step of transforming the sum total value of level into an amount of shift and an index; a step of referring to a predefined table and outputting a pre-correction Rice parameter corresponding to the index; and a step of deriving a post-correction Rice parameter on the basis of the pre-correction Rice parameter and the amount of shift.
The third aspect of the present invention is summarized as a program for causing a computer to function as an image decoding device, wherein the image decoding device includes: a local addition unit that adds level values of decoded residual coefficients existing around a residual coefficient to be decoded and derives a sum total value of level; a transform unit that transforms the sum total value of level into an amount of shift and an index; a table reference unit that refers to a predefined table and outputs a pre-correction Rice parameter corresponding to the index; and a correction unit that derives a post-correction Rice parameter on the basis of the pre-correction Rice parameter and the amount of shift.
According to the present invention, it is possible to provide an image decoding device, an image decoding method, and a program capable of deriving a Rice parameter suitable for distribution of residual coefficients to be generated and improving coding performance even in a case where the bit depth is deep or the quantization parameter is small.
Hereinafter, embodiments of the present invention will be described with reference to the drawings. Note that components in the following embodiments can be replaced with existing components or the like as appropriate, and various variations including combinations with other existing components are possible. Therefore, the following description of the embodiments does not limit the contents of the invention described in the claims.
The block division unit 110 is configured to divide an entire screen of an input image into squares of equal size, and output an image (divided image) obtained by recursively dividing each square using a quad tree or the like.
The inter-prediction unit 101 is configured to perform inter-prediction by using the divided image input by the block division unit 110 and a locally decoded image after filtering input from the frame buffer 109 to generate and output an inter-prediction image.
The intra-prediction unit 102 is configured to perform intra-prediction by using the divided image input by the block division unit 110 and a locally decoded image before filtering (described later) to generate and output an intra-prediction image.
The transform/quantization unit 103 is configured to execute orthogonal transform processing on a residual signal input from the subtracting unit 106, execute quantization processing on a transform coefficient obtained by the orthogonal transform processing, and output a quantized level value obtained by the quantization processing.
The entropy coding unit 104 is configured to perform entropy coding on the quantized level value, transform unit size, and transform size input from the transform/quantization unit 103 and output coded data.
The inverse transform/inverse quantization unit 105 is configured to execute inverse quantization processing on the quantized level value input from the transform/quantization unit 103, execute inverse orthogonal transform processing on a transform coefficient obtained by the inverse quantization processing, and output an inversely orthogonally transformed residual signal obtained by the inverse orthogonal transform processing.
The subtracting unit 106 is configured to output the residual signal that is a difference between the divided image input by the block division unit 110 and the intra-prediction image or the inter-prediction image.
The adding unit 107 is configured to output a divided image obtained by adding the inversely orthogonally transformed residual signal input from the inverse transform/inverse quantization unit 105 and the intra-prediction image or the inter-prediction image.
The block integration unit 111 is configured to output the locally decoded image before filtering obtained by integrating the divided images input from the adding unit 107.
The in-loop filter unit 108 is configured to apply in-loop filtering processing such as deblocking filter processing to the locally decoded image before filtering input from the block integration unit 111 to generate and output the locally decoded image after filtering. Here, the locally decoded image before filtering is a signal obtained by adding the residual signal subjected to the inverse orthogonal transform and the intra-prediction image or the inter-prediction image.
The frame buffer 109 is configured to accumulate the locally decoded image after filtering and to supply it, as appropriate, to the inter-prediction unit 101.
Hereinafter, the entropy coding unit 104 of the image coding device 100 according to the present embodiment will be described with reference to
Specifically,
As illustrated in
The local addition unit 104a is configured to add the level values of coded residual coefficients existing around the coefficient to be coded and output the sum total value of level locSumAbs.
The transform unit 104b is configured to transform the sum total value of level locSumAbs output by the local addition unit 104a into the amount of shift shiftRice and an index locSumAbsIdx.
For example, the transform unit 104b may be configured to multiply the above-described sum total value of level locSumAbs by ⅛, transform the resulting value into a logarithm with a base of 2, and calculate the maximum integer value not exceeding the logarithm as the amount of shift shiftRice. In such a case, assuming that the sum total value of level locSumAbs is 64, the amount of shift shiftRice is 3.
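The derivation of the amount of shift in this example can be sketched as follows. The function name `derive_shift_rice` is hypothetical, and the handling of values of locSumAbs smaller than 8 (where the logarithm would be negative) is an assumption, since the text does not specify it.

```python
def derive_shift_rice(loc_sum_abs: int) -> int:
    """Return floor(log2(loc_sum_abs / 8)) using integer arithmetic only.

    Assumption: for loc_sum_abs < 8 the logarithm would be negative,
    so the amount of shift is clamped to 0 here.
    """
    if loc_sum_abs < 0:
        raise ValueError("loc_sum_abs must be non-negative")
    scaled = loc_sum_abs >> 3            # multiply by 1/8, discarding the fraction
    if scaled < 1:
        return 0                         # assumed clamp, not stated in the text
    return scaled.bit_length() - 1       # floor(log2(scaled)) for scaled >= 1
```

With locSumAbs equal to 64, the scaled value is 8 and the function returns 3, matching the example above.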
Further, the transform unit 104b may be configured to calculate the index locSumAbsIdx by (Formula 1) below.
The table reference unit 104c is configured to refer to a predetermined table and output a pre-correction Rice parameter cBaseRiceParam corresponding to the index locSumAbsIdx. For example, Table 128 defined in Non-Patent Literature 1 can be used as such a table (see
The correction unit 104d is configured to output a post-correction Rice parameter cRiceParam on the basis of the pre-correction Rice parameter cBaseRiceParam and the amount of shift shiftRice.
For example, the correction unit 104d may be configured to output a post-correction Rice parameter cRiceParam by (Formula 2) below.
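Formula 1 and Formula 2 are not reproduced in this text. Purely as a sketch, the following assumes that the index is the sum total value of level right-shifted by the amount of shift (clipped to the table range) and that the correction adds the amount of shift to the table output. The table values are illustrative placeholders and are not asserted to reproduce Table 128 of Non-Patent Literature 1; all names are hypothetical.

```python
# Illustrative 32-entry, non-decreasing table (placeholder values only).
BASE_RICE_TABLE = [0] * 7 + [1] * 7 + [2] * 14 + [3] * 4


def derive_rice_param(loc_sum_abs: int) -> tuple[int, int, int, int]:
    """Return (shiftRice, locSumAbsIdx, cBaseRiceParam, cRiceParam).

    Assumed stand-in for Formula 1: idx  = min(locSumAbs >> shiftRice, 31)
    Assumed stand-in for Formula 2: rice = cBaseRiceParam + shiftRice
    """
    # Transform unit: floor(log2(locSumAbs / 8)), clamped to 0 (assumption).
    shift_rice = max(0, (loc_sum_abs >> 3).bit_length() - 1)
    idx = min(loc_sum_abs >> shift_rice, len(BASE_RICE_TABLE) - 1)
    # Table reference unit: look up the pre-correction Rice parameter.
    base_rice = BASE_RICE_TABLE[idx]
    # Correction unit: derive the post-correction Rice parameter.
    rice = base_rice + shift_rice
    return shift_rice, idx, base_rice, rice
```

Under these assumptions, a large local sum is folded back into the table range by the shift, and the same shift then raises the Rice parameter so that larger levels receive longer fixed-length remainders.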
The normal residual coding disclosed in Non-Patent Literature 1 has two types, base-attached residual coding and baseless residual coding. Here, 4 is defined as the base value.
Thus, the local addition unit 104a may be configured to perform correction by subtracting the base value from the level value of a coded residual coefficient existing around the coefficient to be coded.
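The local addition with base value correction can be sketched as follows. The set of neighbouring positions, the per-coefficient clamp at zero, and the names `NEIGHBOUR_OFFSETS` and `local_sum_abs` are assumptions made for illustration; the text only states that coefficients existing around the coefficient to be coded are used.

```python
# Assumed neighbourhood (dx, dy): positions to the right of and below the
# current coefficient. This choice is hypothetical.
NEIGHBOUR_OFFSETS = [(1, 0), (2, 0), (0, 1), (1, 1), (0, 2)]


def local_sum_abs(levels: list[list[int]], x: int, y: int, base_value: int = 0) -> int:
    """Sum the absolute level values of neighbouring coefficients.

    levels is a 2-D block indexed as levels[y][x]. When base_value is
    non-zero, it is subtracted from each neighbouring level, and each
    corrected level is clamped at zero (assumption).
    """
    height = len(levels)
    width = len(levels[0]) if height else 0
    total = 0
    for dx, dy in NEIGHBOUR_OFFSETS:
        nx, ny = x + dx, y + dy
        if nx < width and ny < height:   # neighbours outside the block add 0
            total += max(0, abs(levels[ny][nx]) - base_value)
    return total
```

For example, with `base_value=4` as the base value mentioned above, only the part of each neighbouring level that exceeds 4 contributes to locSumAbs.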
The entropy decoding unit 201 is configured to perform entropy decoding on coded data and output a quantized level value, a motion compensation method generated by the image coding device 100, etc.
The inverse transform/inverse quantization unit 202 is configured to perform inverse quantization processing on a quantized level value input from the entropy decoding unit 201, perform inverse orthogonal transform processing on the result of the inverse quantization processing, and output the result as a residual signal.
The inter-prediction unit 203 is configured to perform inter-prediction by using a locally decoded image after filtering input from the frame buffer 207 to generate and output an inter-prediction image.
The intra-prediction unit 204 is configured to perform intra-prediction by using a locally decoded image before filtering input from the addition unit 205 to generate and output an intra-prediction image.
The addition unit 205 is configured to output a divided image obtained by adding a residual signal input from the inverse transform/inverse quantization unit 202 and a prediction image (an inter-prediction image input from the inter-prediction unit 203 or an intra-prediction image input from the intra-prediction unit 204).
Here, the prediction image is whichever of the inter-prediction image input from the inter-prediction unit 203 and the intra-prediction image input from the intra-prediction unit 204 was calculated by the prediction method obtained by entropy decoding.
The block integration unit 208 is configured to output a locally decoded image before filtering obtained by integrating divided images input from the addition unit 205.
The in-loop filter unit 206 is configured to apply in-loop filtering processing such as deblocking filter processing to a locally decoded image before filtering input from the block integration unit 208 to generate and output a locally decoded image after filtering.
The frame buffer 207 is configured to accumulate locally decoded images after filtering input from the in-loop filter unit 206 and to, as appropriate, supply the accumulated data to the inter-prediction unit 203 as locally decoded images after filtering and output the accumulated data as decoded images.
Hereinafter, an example of some functional blocks of the entropy decoding unit 201 of the image decoding device 200 according to the present embodiment will be described with reference to
As illustrated in
Here, the local addition unit 104a in the entropy decoding unit 201 is configured to add the level values of decoded residual coefficients existing around the residual coefficient to be decoded and output the sum total value of level locSumAbs.
Note that the transform unit 104b, the table reference unit 104c, and the correction unit 104d in the entropy decoding unit 201 have the same functions as those of the transform unit 104b, the table reference unit 104c, and the correction unit 104d in the entropy coding unit 104.
According to the present embodiment, in each of normal residual coding and transform skip residual coding, the amount of correction (the amount of shift and the index) of a Rice parameter is derived on the basis of the level values of coded (or decoded) residual coefficients existing around the residual coefficient to be coded (or decoded), and the value obtained from a predefined table (at the time of normal residual coding) or a fixed value (at the time of transform skip residual coding) is corrected according to the amount of correction. As a result, even in a case where the bit depth is deep or the quantization parameter is small, a Rice parameter suitable for distribution of residual coefficients to be generated can be derived, and coding performance can be improved.
Hereinafter, an image processing system 1 according to a second embodiment of the present invention will be described focusing on differences from the image processing system 1 according to the first embodiment described above.
In the present embodiment, the transform unit 104b may be configured such that, in a case where the above-described sum total value of level locSumAbs is larger than a threshold value, the transform unit 104b derives the above-described amount of shift shiftRice on the basis of the internal processing bit depth of the content.
Specifically, the transform unit 104b may be configured such that, in a case where the above-described sum total value of level locSumAbs is larger than a threshold value, the transform unit 104b outputs, as the above-described amount of shift shiftRice, a value defined by the internal processing bit depth of the content.
For example, in a case where the internal processing bit depth is 10 bits, the above-described value may be 0; in a case where the internal processing bit depth is 11 bits, the above-described value may be 2; in a case where the internal processing bit depth is 12 bits or 13 bits, the above-described value may be 3; and in a case where the internal processing bit depth is 14 bits, 15 bits, or 16 bits, the above-described value may be 4.
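The example mapping above can be written as a small lookup. The function name `shift_rice_for_bit_depth` is hypothetical, and raising an error for bit depths that the example does not list is an assumption.

```python
def shift_rice_for_bit_depth(bit_depth: int) -> int:
    """Return the example amount of shift for an internal processing bit depth."""
    if bit_depth == 10:
        return 0
    if bit_depth == 11:
        return 2
    if bit_depth in (12, 13):
        return 3
    if 14 <= bit_depth <= 16:
        return 4
    # Bit depths outside 10 to 16 are not covered by the example (assumption).
    raise ValueError(f"no example value given for bit depth {bit_depth}")
```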
Further, the transform unit 104b may be configured such that, in a case where the above-described sum total value of level locSumAbs is larger than a threshold value, the transform unit 104b derives the index locSumAbsIdx on the basis of the sum total value of level locSumAbs and the amount of shift shiftRice described above.
Specifically, the transform unit 104b may be configured such that, in a case where the above-described sum total value of level locSumAbs is equal to or larger than a threshold value, the transform unit 104b outputs, as the index locSumAbsIdx, the result of subjecting the above-described sum total value of level locSumAbs to a right-shift operation by a number equal to the amount of shift shiftRice. Alternatively, the transform unit 104b may be configured to output the result of a process in which the above-described sum total value of level locSumAbs is divided by the result of subjecting 1 to a left-shift operation by a number equal to the amount of shift shiftRice and the resulting value is rounded.
On the other hand, the transform unit 104b may be configured such that, in a case where the above-described sum total value of level locSumAbs is smaller than a threshold value, the transform unit 104b outputs 0 as the amount of shift shiftRice and outputs the above-described sum total value of level locSumAbs as the index locSumAbsIdx.
Here, 32 may be used as the threshold value. Further, the transform unit 104b may be configured to correct the threshold value by adding a base value according to the types of base-attached residual coding and baseless residual coding.
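The threshold-based behaviour of the transform unit 104b in this embodiment can be sketched as follows. The function name `transform_with_threshold`, its parameters, and the round-half-up rule used for the rounded division are assumptions. The sketch applies the shift when locSumAbs is equal to or larger than the threshold, which is the complement of the "smaller than" case.

```python
DEFAULT_THRESHOLD = 32  # example threshold value given in the text


def transform_with_threshold(loc_sum_abs: int, depth_shift: int,
                             base_value: int = 0,
                             rounded: bool = False) -> tuple[int, int]:
    """Return (shiftRice, locSumAbsIdx).

    depth_shift is the amount of shift defined by the internal processing
    bit depth. base_value is added to the threshold for base-attached
    residual coding (0 otherwise).
    """
    threshold = DEFAULT_THRESHOLD + base_value
    if loc_sum_abs < threshold:
        # Below the threshold: no shift, and the sum itself is the index.
        return 0, loc_sum_abs
    if rounded and depth_shift > 0:
        # Division by (1 << shift) with round-half-up (assumed rounding rule).
        idx = (loc_sum_abs + (1 << (depth_shift - 1))) >> depth_shift
    else:
        # Plain right shift, which truncates.
        idx = loc_sum_abs >> depth_shift
    return depth_shift, idx
```

For example, with an amount of shift of 3, a sum of 100 yields the index 12 by right shift and 13 by rounded division.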
According to the present embodiment, a Rice parameter according to the bit depth of the content (video container) can be derived, and coding performance can be improved.
Hereinafter, an image processing system 1 according to a third embodiment of the present invention will be described focusing on differences from the image processing systems 1 according to the first embodiment and the second embodiment described above.
In the present embodiment, the correction unit 104d may be used to derive a Rice parameter in transform skip residual coding. That is, in Non-Patent Literature 1, a transform skip residual coding unit may be provided at a preceding stage of a Rice parameter derivation unit.
As described above, Non-Patent Literature 1 prescribes that the Rice parameter is always 1 in transform skip residual coding.
On the other hand, in the present embodiment, the correction unit 104d is used to derive a Rice parameter in transform skip residual coding, and therefore there is also a case where the Rice parameter is other than 1 in transform skip residual coding.
In the present embodiment, the Rice parameter may always be set to a value larger than 1 regardless of the output result by the correction unit 104d.
Alternatively, the correction unit 104d may be configured to output a post-correction Rice parameter according to the internal processing bit depth of the content.
For example, the correction unit 104d may be configured to output 4 as the post-correction Rice parameter in a case where the internal processing bit depth of the content is 12 bits, output 6 as the post-correction Rice parameter in a case where the internal processing bit depth of the content is 14 bits, and output 8 as the post-correction Rice parameter in a case where the internal processing bit depth of the content is 16 bits.
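The example values above for transform skip residual coding can be sketched as a lookup. The names are hypothetical, and falling back to 1 (the value prescribed in Non-Patent Literature 1) for bit depths not listed in the example is an assumption.

```python
# Example values from the text: internal bit depth -> post-correction Rice parameter.
TS_RICE_BY_BIT_DEPTH = {12: 4, 14: 6, 16: 8}


def transform_skip_rice_param(bit_depth: int, default: int = 1) -> int:
    """Return the post-correction Rice parameter for transform skip residual coding.

    Bit depths not listed in the example fall back to `default` (assumption).
    """
    return TS_RICE_BY_BIT_DEPTH.get(bit_depth, default)
```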
Alternatively, the correction unit 104d may be configured to output a post-correction Rice parameter determined on the image coding device 100 side and written in the stream.
According to the present embodiment, a Rice parameter according to transform skip residual coding can be derived in a video called screen content, and coding performance can be improved.
The image coding device 100 and the image decoding device 200 described above may be implemented as programs that cause a computer to execute respective functions (respective steps).
Note that the above-described embodiments have been described by taking application of the present invention to the image coding device 100 and the image decoding device 200 as examples. However, the present invention is not limited thereto, and can be similarly applied to a coding/decoding system having the functions of the image coding device 100 and the image decoding device 200.
According to the present embodiment, it is possible to improve the overall quality of service in video communications, thereby contributing to Goal 9 of the UN-led Sustainable Development Goals (SDGs) which is to “build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation”.
Number | Date | Country | Kind |
---|---|---|---|
2020-217767 | Dec 2020 | JP | national |
The present application is a continuation of PCT Application No. PCT/JP2021/035556, filed on Sep. 28, 2021, which claims the benefit of Japanese patent application No. 2020-217767 filed on Dec. 25, 2020, the entire contents of which are incorporated herein by reference.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/JP2021/035556 | Sep 2021 | WO |
Child | 18212182 | | US |