The invention relates generally to video coding. In particular, the present invention relates to chroma Intra prediction using combined Intra prediction modes, extended neighbouring chroma samples and corresponding luma samples for deriving the linear model prediction parameters, or extended linear model prediction modes.
The High Efficiency Video Coding (HEVC) standard is developed under the joint video project of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) standardization organizations, working in a partnership known as the Joint Collaborative Team on Video Coding (JCT-VC).
In HEVC, one slice is partitioned into multiple coding tree units (CTUs). Each CTU is further partitioned into multiple coding units (CUs) to adapt to various local characteristics. HEVC supports multiple Intra prediction modes and, for an Intra coded CU, the selected Intra prediction mode is signalled. In addition to the concept of coding unit, the concept of prediction unit (PU) is also introduced in HEVC. Once the splitting of the CU hierarchical tree is done, each leaf CU is further split into one or more prediction units (PUs) according to prediction type and PU partition. After prediction, the residues associated with the CU are partitioned into transform blocks, named transform units (TUs), for the transform process.
HEVC uses more sophisticated Intra prediction than previous video coding standards such as AVC/H.264. According to HEVC, 35 Intra prediction modes are used for the luma components, where the 35 Intra prediction modes include DC, planar and various angular prediction modes. For the chroma component, linear model prediction mode (LM mode) is developed to improve the coding performance of chroma components (e.g. U/V components or Cb/Cr components) by exploring the correlation between the luma (Y) component and chroma components.
In the LM mode, a linear model is assumed between the values of a luma sample and a chroma sample as shown in eq. (1):
C=a*Y+b, (1)
where C represents the prediction value for a chroma sample; Y represents the value of the corresponding luma sample; and a and b are two parameters.
For some colour sampling formats such as 4:2:0 or 4:2:2, samples in the chroma component and the luma component are not in a 1-1 mapping.
In LM mode, an interpolated luma value is derived and the interpolated luma value is used to derive a prediction value for a corresponding chroma sample value.
Parameters a and b are derived based on previously decoded luma and chroma samples from top and left neighbouring area.
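The derivation of a and b in eq. (1) can be illustrated with a minimal sketch. The text above does not state which fitting rule is used, so the sketch assumes an ordinary least-squares fit over the neighbouring sample pairs. The function names `derive_lm_params` and `predict_chroma`, and the fallback for a flat luma neighbourhood, are hypothetical and introduced only for illustration.

```python
def derive_lm_params(neigh_luma, neigh_chroma):
    """Fit C = a*Y + b over neighbouring (luma, chroma) pairs.

    Assumption: ordinary least squares; the description only says the
    parameters are derived from previously decoded neighbouring samples.
    """
    n = len(neigh_luma)
    if n == 0 or n != len(neigh_chroma):
        raise ValueError("need equally sized, non-empty neighbour lists")
    sum_y = sum(neigh_luma)
    sum_c = sum(neigh_chroma)
    sum_yy = sum(y * y for y in neigh_luma)
    sum_yc = sum(y * c for y, c in zip(neigh_luma, neigh_chroma))
    denom = n * sum_yy - sum_y * sum_y
    if denom == 0:
        # Flat luma neighbourhood: fall back to a constant (mean) predictor.
        return 0.0, sum_c / n
    a = (n * sum_yc - sum_y * sum_c) / denom
    b = (sum_c - a * sum_y) / n
    return a, b


def predict_chroma(luma_block, a, b):
    """Apply eq. (1), C = a*Y + b, to every (interpolated) luma sample."""
    return [[a * y + b for y in row] for row in luma_block]
```

With this sketch, restricting `neigh_luma` and `neigh_chroma` to top-only or left-only samples corresponds to the two extensions described next; only the choice of input samples changes, not the fitting step.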
There are several extensions of the LM mode. In one extension, parameters a and b are derived from top neighbouring decoded luma and chroma samples only.
In another extension, parameters a and b are derived from left decoded neighbouring luma and chroma samples only.
In still another extension, a linear model is assumed between values of a sample of a first chroma component (e.g. Cr) and a sample of a second chroma component (e.g. Cb) as shown in eq. (2):
C1=a*C2+b, (2)
where C1 represents the prediction value for a sample of the first chroma component (e.g. Cr); C2 represents the value of the corresponding sample of the second chroma component (e.g. Cb); a and b are two parameters, which are derived from top and left neighbouring samples of the first chroma component and corresponding samples of the second chroma component. This extended LM mode is called LM_CbCr.
Although LM and its extended modes can improve coding efficiency significantly, it is desirable to further improve the coding efficiency of chroma Intra prediction.
A method and apparatus of Intra prediction for a chroma component performed by a video coding system are disclosed. According to this method, combined Intra prediction is generated for encoding or decoding of a current chroma block by combining first Intra prediction generated according to the first chroma Intra prediction mode and second Intra prediction generated according to the second chroma Intra prediction mode. The first chroma Intra prediction mode corresponds to a linear-model prediction mode (LM mode) or an extended LM mode. The second chroma Intra prediction mode belongs to an Intra prediction mode group, where the Intra prediction mode group excludes any linear model prediction mode (LM mode) that generates a chroma prediction value based on a reconstructed luma value using a linear model.
The combined Intra prediction can be generated using a weighted sum of the first Intra prediction and the second Intra prediction. The combined Intra prediction can be calculated using integer operations including multiplication, addition and arithmetic shift to avoid a need for a division operation. For example, the combined Intra prediction can be calculated using a sum of the first Intra prediction and the second Intra prediction followed by a right-shift by one operation. In one example, the weighting coefficient of the weighted sum is position dependent.
In one embodiment, the first chroma Intra prediction mode corresponds to an extended LM mode. For example, the extended LM mode belongs to a mode group including LM_TOP mode, LM_LEFT mode, LM_TOP_RIGHT mode, LM_RIGHT mode, LM_LEFT_BOTTOM mode, LM_BOTTOM mode, LM_LEFT_TOP mode and LM_CbCr mode. On the other hand, the second chroma Intra prediction mode belongs to a mode group including angular modes, DC mode, Planar mode, Planar_Ver mode, Planar_Hor mode, a mode used by a current luma block, a mode used by a sub-block of the current luma block, and a mode used by a previous processed chroma component of the current chroma block.
In another embodiment, a fusion mode can be included in an Intra prediction candidate list, where the fusion mode indicates that the first chroma Intra prediction mode and the second chroma Intra prediction mode are used and the combined Intra prediction is used for the encoding or decoding of the current chroma block. The fusion mode is inserted in a location of the Intra prediction candidate list after all LM modes, where a codeword of the fusion mode is not shorter than the codeword of any LM mode. Furthermore, chroma Intra prediction with a fusion mode can be combined with multi-phase LM modes. In the multi-phase LM modes, the mapping between chroma samples and corresponding luma samples is different between a first LM mode and a second LM mode. The first LM mode can be inserted into the Intra prediction candidate list to replace a regular LM mode, and the second LM mode can be inserted into the Intra prediction candidate list at a location after the regular LM mode and the fusion mode.
A method and apparatus of Intra prediction for a chroma component of non-444 colour video data performed by a video coding system are also disclosed. A mode group including at least two linear-model prediction modes (LM modes) is used for multi-phase Intra prediction, where the mapping between chroma samples and corresponding luma samples is different for two LM modes from the mode group. For 4:2:0 colour video data, each chroma sample has four collocated luma samples Y0, Y1, Y2 and Y3 located above, below, above-right, and below-right of each current chroma sample respectively. The corresponding luma sample associated with each chroma sample may correspond to Y0, Y1, Y2, Y3, (Y0+Y1)/2, (Y0+Y2)/2, (Y0+Y3)/2, (Y1+Y2)/2, (Y1+Y3)/2, (Y2+Y3)/2, or (Y0+Y1+Y2+Y3)/4. For example, the mode group may include a first LM mode and a second LM mode, and the corresponding luma sample associated with each chroma sample corresponds to Y0 and Y1 for the first LM mode and the second LM mode respectively.
Yet another method and apparatus of Intra prediction for a chroma component performed by a video coding system are disclosed. According to this method, parameters of a linear model are determined based on neighbouring decoded chroma samples and corresponding neighbouring decoded luma samples from one or more extended neighbouring areas of the current chroma block. The extended neighbouring areas of the current chroma block include one or more neighbouring samples outside an above neighbouring area of the current chroma block or outside a left neighbouring area of the current chroma block. For example, the extended neighbouring areas of the current chroma block may correspond to top and right, right, left and bottom, bottom, or left top neighbouring chroma samples and corresponding luma samples.
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
In the following description, Y component is identical to the luma component, U component is identical to Cb component and V component is identical to Cr component.
In the present invention, various advanced LM prediction modes are disclosed. In some embodiments, parameters a and b are derived from extended neighbouring area(s) of the current chroma block and/or extended neighbouring area(s) of the corresponding luma block. For example, the top and right neighbouring chroma samples and corresponding luma samples can be used to derive parameters a and b. This extended mode is called LM_TOP_RIGHT mode.
In another embodiment, parameters a and b are derived from right neighbouring chroma samples and corresponding luma samples. This extended mode is called LM_RIGHT mode.
In yet another embodiment, parameters a and b are derived from left and bottom neighbouring chroma samples and corresponding luma samples. This extended mode is called LM_LEFT_BOTTOM mode.
In yet another embodiment, parameters a and b are derived from bottom neighbouring chroma samples and corresponding luma samples. This extended mode is called LM_BOTTOM mode.
In yet another embodiment, parameters a and b are derived from left top neighbouring chroma samples and corresponding luma samples. This extended mode is called LM_LEFT_TOP mode.
The present invention also discloses a method of chroma Intra prediction by combining two different Intra prediction modes. According to this method, a chroma block is predicted by utilizing the LM mode or one of its extended modes together with one or more other modes. In this case, the chroma block is coded by the ‘Fusion mode’. The use of fusion mode allows the use of a new type of chroma Intra prediction that is generated by combining two different chroma Intra predictions. For certain colour video data, the combined chroma Intra prediction may perform better than either of the two individual chroma Intra predictions. Since an encoder often uses a certain optimization process (e.g., rate-distortion optimization, RDO) to select a best coding mode for a current block, the combined chroma Intra prediction will be selected over the two individual chroma Intra predictions if the combined chroma Intra prediction achieves a lower R-D cost.
In one embodiment of fusion mode, a chroma block is predicted by mode L. For a sample (i,j) in this block, its prediction value with mode L is PL(i,j). The chroma block is also predicted by another mode, named mode K, other than the LM mode. For a sample (i,j) in this block, its prediction value with mode K is PK(i,j). The final prediction for sample (i,j), denoted as P(i,j), in this block is calculated as shown in eq. (3):
P(i,j)=w1*PL(i,j)+w2*PK(i,j), (3)
where w1 and w2 are real-valued weighting coefficients and w1+w2=1.
In eq. (3), w1 and w2 are real values, so the final prediction P(i,j) may have to be calculated using floating point operations. In order to simplify the computation of P(i,j), integer operations are preferred. Accordingly, in another embodiment, the final prediction P(i,j) is calculated as shown in eq. (4):
P(i,j)=(w1*PL(i,j)+w2*PK(i,j)+D)>>S, (4)
where w1, w2, D and S are integers, S>=1, and w1+w2=1<<S. In one example, D is 0. In another example, D is 1<<(S−1). According to eq. (4), the final prediction P (i,j) may be calculated using integer multiplication, addition and arithmetic right shift.
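The integer weighted sum of eq. (4) can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function name `fuse_predictions` and the `rounding` flag are hypothetical, and the two predictions are assumed to be given as equally sized 2-D lists of integer sample values.

```python
def fuse_predictions(pl, pk, w1, w2, s, rounding=True):
    """Combine two predictions per eq. (4): (w1*PL + w2*PK + D) >> S.

    pl, pk   : 2-D lists of integer predictions from mode L and mode K
    w1, w2   : integer weights with w1 + w2 == 1 << s
    rounding : True uses D = 1 << (s - 1); False uses D = 0
    """
    if s < 1 or w1 + w2 != (1 << s):
        raise ValueError("require S >= 1 and w1 + w2 == 1 << S")
    d = (1 << (s - 1)) if rounding else 0
    return [
        [(w1 * l + w2 * k + d) >> s for l, k in zip(row_l, row_k)]
        for row_l, row_k in zip(pl, pk)
    ]


# Example: weights 3/4 and 1/4 expressed with S = 2.
pl = [[100, 101]]
pk = [[50, 52]]
fused = fuse_predictions(pl, pk, w1=3, w2=1, s=2)  # [[88, 89]]
```

Setting w1 = w2 = 1 and S = 1 reduces the sketch to a plain two-way average, with or without the rounding offset depending on D.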
In yet another embodiment, the final prediction P (i,j) is calculated as shown in eq. (5):
P(i,j)=(PL(i,j)+PK(i,j)+1)>>1. (5)
In yet another embodiment, the final prediction P(i,j) is calculated as the sum of PL(i,j) and PK(i,j) followed by a right shift by one, as shown in eq. (6):
P(i,j)=(PL(i,j)+PK(i,j))>>1. (6)
For example, mode L may correspond to LM mode, LM_TOP mode, LM_LEFT mode, LM_TOP_RIGHT mode, LM_RIGHT mode, LM_LEFT_BOTTOM mode, LM_BOTTOM mode, LM_LEFT_TOP mode, or LM_CbCr mode.
On the other hand, mode K can be any angular mode with a prediction direction, DC mode, Planar mode, Planar_Ver mode or Planar_Hor mode, the mode used by the luma component of the current block, the mode used by Cb component of the current block, or the mode used by Cr component of the current block.
In another example, mode K corresponds to the mode used by the luma component of any sub-block in the current block.
If a chroma block is predicted by the LM mode or an extended mode and the colour format is non-4:4:4, there can be more than one option to map a chroma sample value (C) to its corresponding luma value (Y) in the linear model C=a*Y+b.
In one embodiment, the LM mode or its extended modes with different mappings from C to its corresponding Y are regarded as different LM modes, denoted as LM_Phase_X for X from 1 to N, where N is the number of mapping methods from C to its corresponding Y.
Some exemplary mappings for the 4:2:0 colour format are as follows:
a. Y=Y0
b. Y=Y1
c. Y=Y2
d. Y=Y3
e. Y=(Y0+Y1)/2
f. Y=(Y0+Y2)/2
g. Y=(Y0+Y3)/2
h. Y=(Y1+Y2)/2
i. Y=(Y1+Y3)/2
j. Y=(Y2+Y3)/2
k. Y=(Y0+Y1+Y2+Y3)/4
For example, two mapping methods can be used. For the first mapping method, mode LM_Phase_1, the corresponding luma value (Y) is determined according to Y=Y0. For the second mapping method, mode LM_Phase_2, the corresponding luma value (Y) is determined according to Y=Y1. The use of multi_phase mode allows alternative mappings from a chroma sample to different luma samples for chroma Intra prediction. For certain color video data, the multi_phase chroma Intra prediction may perform better than a single fixed mapping. Since an encoder often uses a certain optimization process (e.g., rate-distortion optimization, RDO) to select a best coding mode for a current block, the multi_phase chroma Intra prediction can provide more mode selections over the conventional single fixed mapping to improve the coding performance.
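The mappings a to k listed above can be sketched as a small lookup. The sketch makes two assumptions that are not stated in the text: the luma plane is indexed as `luma[row][col]` with Y0, Y1, Y2 and Y3 of chroma sample (i,j) at rows 2i and 2i+1 and columns 2j and 2j+1, and the divisions by 2 and 4 are integer divisions without a rounding offset. The names `PHASE_TAPS` and `corresponding_luma` are hypothetical.

```python
# Which of Y0..Y3 each mapping (a to k) averages.
PHASE_TAPS = {
    "a": (0,), "b": (1,), "c": (2,), "d": (3,),
    "e": (0, 1), "f": (0, 2), "g": (0, 3),
    "h": (1, 2), "i": (1, 3), "j": (2, 3),
    "k": (0, 1, 2, 3),
}


def corresponding_luma(luma, i, j, mapping):
    """Return the luma value Y mapped to chroma sample (i, j) in 4:2:0.

    Assumed layout: Y0 above, Y1 below, Y2 above-right, Y3 below-right.
    """
    y = (
        luma[2 * i][2 * j],          # Y0
        luma[2 * i + 1][2 * j],      # Y1
        luma[2 * i][2 * j + 1],      # Y2
        luma[2 * i + 1][2 * j + 1],  # Y3
    )
    taps = PHASE_TAPS[mapping]
    # Integer average of the selected samples (assumption: no rounding).
    return sum(y[t] for t in taps) // len(taps)


luma = [[10, 20],
        [30, 40]]
y_phase_1 = corresponding_luma(luma, 0, 0, "a")  # Y = Y0 -> 10
y_phase_2 = corresponding_luma(luma, 0, 0, "b")  # Y = Y1 -> 30
```

In this sketch, LM_Phase_1 and LM_Phase_2 from the example above differ only in the `mapping` argument; the linear model C=a*Y+b is then applied to the returned Y in the same way for both.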
To code the chroma Intra prediction mode for a chroma block, the LM Fusion mode is inserted into the code table after the LM modes according to one embodiment of the present invention. Therefore, the codeword for an LM Fusion mode is always longer than or equal to the codewords for LM and its extension modes. An example code table order is demonstrated in the corresponding figure.
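The relation between table position and codeword length can be illustrated with a short sketch. The binarization is not specified in the text, so the sketch assumes truncated-unary codewords assigned by table position. The mode list, the placement of the non-LM modes after the fusion mode, and the helper names `build_code_table` and `truncated_unary_codewords` are hypothetical and serve only to show why a later table position cannot yield a shorter codeword.

```python
def build_code_table(lm_modes, fusion_modes, other_modes):
    """Order the table so that fusion modes follow all LM modes."""
    return list(lm_modes) + list(fusion_modes) + list(other_modes)


def truncated_unary_codewords(modes):
    """Assign truncated-unary codewords by position (assumed binarization).

    Position idx gets idx ones followed by a terminating zero, except the
    last entry, which omits the terminating zero.
    """
    last = len(modes) - 1
    table = {}
    for idx, mode in enumerate(modes):
        table[mode] = "1" * idx + ("" if idx == last else "0")
    return table


lm_modes = ["LM", "LM_TOP", "LM_LEFT"]
order = build_code_table(lm_modes, ["LM_FUSION"], ["DC", "Planar"])
codes = truncated_unary_codewords(order)

# The fusion codeword is never shorter than any LM codeword.
fusion_ok = all(len(codes["LM_FUSION"]) >= len(codes[m]) for m in lm_modes)
```

Any binarization whose codeword lengths are non-decreasing in table position gives the same property; truncated unary is used here only because it is easy to inspect.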
To code the chroma Intra prediction mode according to another embodiment of the present invention, LM_Phase_1 mode 1410 is inserted into the code table to replace the original LM mode, as shown in the corresponding figure.
The method of extended neighbouring areas for deriving parameters of the LM mode, the method of Intra prediction by combining two Intra prediction modes (i.e. fusion mode) and the multi-phase LM mode for non-444 colour format can be combined. For example, one or more multi-phase LM modes can be used for the fusion mode.
The flowcharts shown are intended to illustrate an example of video coding according to the present invention. A person skilled in the art may modify each step, re-arrange the steps, split a step, or combine steps to practice the present invention without departing from the spirit of the present invention. In the disclosure, specific syntax and semantics have been used to illustrate examples to implement embodiments of the present invention. A skilled person may practice the present invention by substituting the syntax and semantics with equivalent syntax and semantics without departing from the spirit of the present invention.
The above description is presented to enable a person of ordinary skill in the art to practice the present invention as provided in the context of a particular application and its requirement. Various modifications to the described embodiments will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed. In the above detailed description, various specific details are illustrated in order to provide a thorough understanding of the present invention. Nevertheless, it will be understood by those skilled in the art that the present invention may be practiced.
Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both. For example, an embodiment of the present invention can be one or more circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or a field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
Number | Date | Country | Kind |
---|---|---|---|
PCT/CN2016/073998 | Feb 2016 | CN | national |
The present invention claims priority to PCT Patent Application, Serial No. PCT/CN2016/073998, filed on Feb. 18, 2016. The PCT Patent Application is hereby incorporated by reference in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2017/072560 | 1/25/2017 | WO | 00 |