The present principles relate to image and video compression systems and coders and decoders for that purpose.
In H.264, Intra4×4 and Intra8×8 intra prediction corresponds to a spatial estimation of the pixels of a current block to be coded.
Consider the intra4×4 prediction mode of H.264, in which the prediction depends on the reconstructed neighboring pixels as illustrated in
Concerning the intra 4×4 prediction, the different modes are shown in
Similarly, for Intra 8×8 prediction,
The chroma samples of a macroblock are predicted using a prediction technique similar to that used for the luma component in Intra 16×16 macroblocks. Four prediction modes are supported. Prediction mode 0 (vertical prediction), mode 1 (horizontal prediction), and mode 2 (DC prediction) are specified similarly to the corresponding modes in Intra 4×4.
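As an illustration, the vertical, horizontal, and DC modes described above can be sketched as follows. This is a minimal sketch with a hypothetical `intra4x4_predict` helper; only three of H.264's nine Intra 4×4 modes are shown, and boundary-availability handling is omitted.

```python
import numpy as np

def intra4x4_predict(top, left, mode):
    """Build a 4x4 prediction block from reconstructed neighbors.

    top  -- the 4 reconstructed pixels above the block
    left -- the 4 reconstructed pixels to the left of the block
    mode -- 0: vertical, 1: horizontal, 2: DC (subset of H.264's nine modes)
    """
    top = np.asarray(top, dtype=np.int32)
    left = np.asarray(left, dtype=np.int32)
    if mode == 0:                      # vertical: copy the row above downwards
        return np.tile(top, (4, 1))
    if mode == 1:                      # horizontal: copy the left column rightwards
        return np.tile(left.reshape(4, 1), (1, 4))
    if mode == 2:                      # DC: rounded mean of the 8 neighbor samples
        dc = (top.sum() + left.sum() + 4) >> 3
        return np.full((4, 4), dc, dtype=np.int32)
    raise ValueError("unsupported mode in this sketch")
```

Each mode fills the whole 4×4 block from the same reconstructed neighbors, which is why the residue carries only what the chosen direction fails to capture.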
The intra prediction is then performed using the different prediction directions. The residue, corresponding to the difference between the current block and the predicted block, is frequency transformed (DCT), quantized, and finally entropy encoded. Before the encoding process, the best of the nine available prediction modes is selected. The direction prediction can use, for example, the SAD measure (Sum of Absolute Differences) computed between the current block to encode and the prediction block. The selected prediction mode is encoded for each sub-partition.
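The SAD-based mode decision can be sketched as follows, assuming the candidate prediction blocks have already been built for each mode. The `best_mode_sad` helper and its interface are illustrative, not taken from any standard.

```python
import numpy as np

def best_mode_sad(current, candidates):
    """Pick the prediction mode minimizing the Sum of Absolute Differences.

    current    -- the block to encode (2-D array-like)
    candidates -- dict {mode_index: prediction block of the same shape}
    Returns (best mode index, its SAD).
    """
    current = np.asarray(current, dtype=np.int64)
    best_mode, best_sad = None, None
    for mode, pred in candidates.items():
        sad = int(np.abs(current - np.asarray(pred, dtype=np.int64)).sum())
        if best_sad is None or sad < best_sad:
            best_mode, best_sad = mode, sad
    return best_mode, best_sad
```

The winning mode index is then signaled in the bitstream for each sub-partition, as described above.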
HEVC Intra prediction operates according to the block size, and previously decoded boundary samples from spatially neighboring blocks are used to form the prediction signal. Directional prediction with 33 different directional orientations is defined for square block sizes from 4×4 up to 32×32. The possible prediction directions are shown in
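The principle of directional prediction with many orientations can be illustrated by projecting each predicted sample onto the top reference row with a per-row fractional displacement and linearly interpolating between the two nearest reference samples. This is only loosely analogous to HEVC angular prediction: the real angle table, reference-sample construction, and smoothing filters are more involved, and this sketch assumes positive "vertical" angles only.

```python
import numpy as np

def angular_predict(top_ref, size, angle):
    """Simplified angular (directional) prediction from the top reference row.

    top_ref -- reconstructed samples above the block; must cover the largest
               projected displacement (roughly 2*size + 1 samples)
    size    -- block side length
    angle   -- horizontal displacement in 1/32-sample units per row
               (illustrative, not HEVC's intraPredAngle table)
    """
    top_ref = np.asarray(top_ref, dtype=np.int32)
    pred = np.empty((size, size), dtype=np.int32)
    for y in range(size):
        disp = (y + 1) * angle             # total displacement for this row
        idx, frac = disp >> 5, disp & 31   # integer and fractional (1/32) parts
        for x in range(size):
            a = top_ref[x + idx]
            b = top_ref[x + idx + 1]
            # linear interpolation between the two nearest reference samples
            pred[y, x] = ((32 - frac) * a + frac * b + 16) >> 5
    return pred
```

With `angle = 0` this degenerates to pure vertical prediction; larger angles tilt the copied direction further from vertical.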
For chroma, the horizontal, vertical, planar, and DC prediction modes can be explicitly signaled, or the chroma prediction mode can be indicated to be the same as the luma prediction mode.
For intra prediction in the H.264 and HEVC video compression standards, the luminance and chrominance components are encoded with the same spatial prediction principle. For instance, in H.264, one of the nine intra coding modes is used to predict and encode the luminance block, and then one of the four chrominance intra coding modes is used to encode the chrominance blocks. In HEVC, the principle is nearly the same, with 35 intra modes for the luminance and the option to reuse the luminance mode for the chroma blocks.
It is desired to improve directional prediction in the case of curved contours, for which the classical directional prediction modes are not sufficiently efficient. For such contours, the difference between the curved contour and the unidirectional prediction used in some video coding standards induces high-frequency coefficients in the residual prediction error, and it is desired to improve the prediction so as to reduce those coefficients.
These and other drawbacks and disadvantages of the prior art are addressed by the present principles, which are directed to a method and apparatus for spatial guided prediction for use in video image compression systems.
According to an aspect of the present principles, there is provided a method for encoding of digital video images including an operation for intra frame prediction, wherein the intra frame prediction comprises predicting a first component of a video image block using a first directional prediction mode, quantizing a residual prediction error that results from using said predicted first component and encoding the quantized residual prediction error. The method further comprises generating a reconstructed first component of the video image block from the quantized residual prediction error and recursively building a prediction block for additional components of the video image block from respective spatial neighboring reconstructed pixels of the corresponding component using a plurality of prediction modes determined from reconstructed pixels of the first component of the video image block.
According to another aspect of the present principles, there is provided an apparatus for encoding of digital video images including an operation for intra frame prediction, wherein said apparatus comprises a memory and a processor coupled to the memory and configured to perform the video image compression, wherein the processor is configured to perform said intra frame prediction by predicting a first component of a video image block using a first directional prediction mode, quantizing a residual prediction error that results from using said predicted first component and encoding the quantized residual prediction error. The processor is further configured to generate a reconstructed first component of the video image block from the quantized residual prediction error and recursively build a prediction block for additional components of the video image block from respective spatial neighboring reconstructed pixels of the corresponding component using a plurality of prediction modes determined from reconstructed pixels of the first component of the video image block.
According to an aspect of the present principles, there is provided a method for decoding of digital video images including an operation for intra frame prediction, comprising decoding a residual prediction error of a first component of a digital video image block, reconstructing said first component of the digital video image block using the decoded residual prediction error and a prediction obtained with a spatial intraprediction mode, and recursively building a prediction block for additional components of the video image block from respective spatial neighboring reconstructed pixels of the corresponding component using a plurality of prediction modes determined from reconstructed pixels of the first component of the video image block. The method also comprises decoding residual prediction errors for remaining components of the digital video image block, and, reconstructing remaining components of the digital video image block from the decoded residual prediction errors and the recursively reconstructed prediction block of the additional components of the digital video image block.
According to another aspect of the present principles, there is provided an apparatus for decoding of digital video images including an operation for intra frame prediction, wherein the apparatus comprises a memory, and a processor coupled to the memory and configured to perform decoding of a residual prediction error of a first component of a digital video image block, reconstructing the first component of the digital video image block using the decoded residual prediction error and a prediction obtained with a spatial intraprediction mode and recursively building a prediction block for additional components of the video image block from respective spatial neighboring reconstructed pixels of the corresponding component using a plurality of prediction modes determined from reconstructed pixels of the first component of the video image block. The processor is also configured to decode residual prediction errors for remaining components of the digital video image block, and, reconstruct remaining components of the digital video image block from the decoded residual prediction errors and the recursively reconstructed prediction block of the additional components of the digital video image block.
According to another aspect of the present principles, there is provided an apparatus for transmission of digital video images including an operation for intra frame prediction, wherein said apparatus comprises a memory and a processor coupled to the memory and configured to perform said video image compression, wherein the processor is configured to perform the intra frame prediction by predicting a first component of a video image block using a first directional prediction mode, quantizing a residual prediction error that results from using said predicted first component, and encoding the quantized residual prediction error. The processor is further configured to generate a reconstructed first component of the video image block from the quantized residual prediction error, and recursively build a prediction block for additional components of the video image block from respective spatial neighboring reconstructed pixels of the corresponding component using a plurality of prediction modes determined from reconstructed pixels of the first component of the video image block.
These and other aspects, features and advantages of the present principles will become apparent from the following detailed description of exemplary embodiments, which is to be read in connection with the accompanying drawings.
An approach for spatial guided prediction for intra prediction in a digital video compression scheme is described herein.
It is desired to improve the video coding performance by keeping the same quality for a lower bit-rate. The goal is to propose a tool to be used at the encoder and at the decoder that would enable such a coding gain. The problem of directional spatial, or intra, prediction is described herein in the context of image and video coding.
In the H.264 standard, there are one DC and eight directional prediction modes for 4×4 blocks. In HEVC, there are DC, planar, and 33 directional intra prediction modes. For a current block to encode, once the best prediction mode is chosen at the encoder side, each block of the luminance and chrominance components is sequentially predicted and encoded according to the chosen mode.
The objective of the present proposal is to improve directional intra prediction by using, on a first component, for example, the classical directional spatial prediction, and on the second and subsequent components an “aided” directional prediction mode.
In the current video coding standards, the luminance and the chrominance blocks are sequentially encoded using more or less the same coding mode for the different components of the current block. The approach described herein tries to improve the prediction quality in curved contours for which the classical directional intra prediction modes are not sufficient.
The spatial prediction of the current block is encoded by first using, for example, a given classical directional prediction mode for the first component of the block. The resulting prediction based on that mode is then used with the first component of the block to find the residual prediction error, which is quantized and encoded. Then this first component of the block is reconstructed to form a reconstructed first component.
Then, prediction blocks of the other components of the block are recursively built, line by line or column by column, for example, from the respective spatial neighboring reconstructed pixels of those components, using the directional prediction modes of the host coder/decoder. These directional prediction modes are found with the help of the reconstructed pixels of the reconstructed first component of the block: for each line or column, the mode that results in the lowest prediction error on the reconstructed first component is selected and used to build the prediction of the other components, which serves as an “aided” prediction block in coding the components other than the first.
Finally, quantize and encode the residual prediction errors and rebuild the other remaining components of the block, the prediction error being the difference between the respective component of the block and the respective prediction obtained recursively above.
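The recursive, guided construction of a second component's prediction described in the steps above can be sketched as follows. The set of candidate modes is reduced here to three sample-copy directions applied to the row above, and the helper names, edge-replication padding, and tie-breaking are illustrative assumptions, not taken from the present principles or any standard.

```python
import numpy as np

def guided_row_prediction(guide_block, guide_top, comp_top):
    """Recursively build a prediction for a second component, row by row,
    choosing for each row the direction that best predicts the
    reconstructed first ("guide") component.
    """
    guide_block = np.asarray(guide_block, dtype=np.int32)
    guide_top = np.asarray(guide_top, dtype=np.int32)
    comp_top = np.asarray(comp_top, dtype=np.int32)
    n = guide_block.shape[0]
    modes = {-1: "diag-left", 0: "vertical", 1: "diag-right"}  # copy offsets

    def shift_row(row, off):
        padded = np.pad(row, (1, 1), mode="edge")  # replicate edge samples
        return padded[1 + off : 1 + off + n]

    pred = np.empty_like(guide_block)
    prev_guide, prev_comp = guide_top, comp_top    # rows just above the block
    for y in range(n):
        # pick the offset that best reproduces the guide component's row
        best = min(modes, key=lambda off:
                   np.abs(guide_block[y] - shift_row(prev_guide, off)).sum())
        # apply the same direction to the other component's neighbors
        pred[y] = shift_row(prev_comp, best)
        prev_guide = guide_block[y]   # recursion: each built row becomes
        prev_comp = pred[y]           # the neighbor row for the next one
    return pred
```

Because the per-row mode is derived from reconstructed data available on both sides, no extra mode signaling is needed for the aided components.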
On the decoder side, the residual prediction error of the first component of the block is decoded. This component is reconstructed by adding the decoded residual error to the prediction of this component, the prediction block being obtained with one of the usual spatial prediction modes in the same way as at the encoder side. Then, the prediction of each of the other components of the block is recursively built, line by line or column by column, from the respective spatial neighboring reconstructed pixels, using the directional prediction modes of the host coder/decoder. As at the encoder, these directional prediction modes are found with the help of the reconstructed pixels of the first decoded component of the block: for each line or column, the mode that results in the lowest prediction error on the reconstructed first component is selected and used to build the “aided” prediction blocks for decoding the components other than the first.
Finally, decode the residual prediction errors and rebuild each of the other remaining components of the block, by adding the respective residual prediction errors to the respective blocks of prediction obtained recursively.
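The final decoder-side step above, adding the decoded residual prediction error to the prediction block and clipping to the valid sample range, can be sketched as:

```python
import numpy as np

def reconstruct_component(pred_block, residual, bit_depth=8):
    """Decoder-side reconstruction: add the decoded residual prediction
    error to the prediction block and clip to the sample range."""
    recon = (np.asarray(pred_block, dtype=np.int32)
             + np.asarray(residual, dtype=np.int32))
    return np.clip(recon, 0, (1 << bit_depth) - 1)
```

The same operation is applied to the first component and to each aided component, only the origin of the prediction block differs.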
Another way of stating the operations is that once the block of the first component is encoded and decoded with the mode of index m:
In the case of “horizontal prediction”, start from the first column (0) and proceed to the last column (N−1) of the decoded block V′ (the guide); the algorithm is similar to the vertical prediction described above, the construction of the prediction being performed column by column.
At the encoder side, encode the residual prediction error block of each component (Y, U).
At the decoder side, decode the residual prediction error blocks of each component (Y, U), and reconstruct the blocks.
As an example for a current block of an intra image being encoded in 4:4:4 YUV format, as shown in
The aided local directional prediction mode can be realized at both the encoder and the decoder side because the first decoded component block is known. It is termed “aided” because the prediction of the other components (Y and U) is built with the help of the reconstructed V block (V′).
The principle is shown by
At the encoder and the decoder the steps are, for example, as follows.
J(Mode|Qp,λ)=D(y,y′,Mode|Qp)+λ×R(y,y′,Mode|Qp)
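The Lagrangian rate-distortion cost above can be evaluated as in the following sketch, where the distortion D and rate R are assumed to have been computed elsewhere for the given mode and Qp; the mode minimizing J is selected.

```python
def rd_cost(distortion, rate, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R used for the
    mode decision; the mode with the smallest J wins."""
    return distortion + lam * rate
```

A mode with a slightly higher distortion can still be selected if its rate saving outweighs the distortion increase at the operating lambda.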
One embodiment of a method 900 used in an encoder and using the present principles is shown in
One embodiment of a method 1000 used in a decoder and using the present principles is shown in
One embodiment of an apparatus 1100 using the present principles is shown in
The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (“DSP”) hardware, read-only memory (“ROM”) for storing software, random access memory (“RAM”), and non-volatile storage.
Other hardware, conventional and/or custom, may also be included. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
The present description illustrates the present principles. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the present principles and are included within its scope.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the present principles and the concepts contributed by the inventor(s) to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions.
Moreover, all statements herein reciting principles, aspects, and embodiments of the present principles, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative circuitry embodying the present principles. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
In the claims hereof, any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function. The present principles as defined by such claims reside in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
Reference in the specification to “one embodiment” or “an embodiment” of the present principles, as well as other variations thereof, means that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment of the present principles. Thus, the appearances of the phrase “in one embodiment” or “in an embodiment”, as well any other variations, appearing in various places throughout the specification are not necessarily all referring to the same embodiment.
In conclusion, a method for spatial guided prediction is provided in various embodiments, to be used in digital video compression systems.
Priority claim:

Number | Date | Country | Kind
---|---|---|---
16306279.7 | Sep 2016 | EP | regional

PCT filing:

Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/EP17/73910 | 9/21/2017 | WO | 00