The present invention claims priority to PCT Patent Application, Serial No. PCT/CN2016/074200, filed on Feb. 22, 2016. The PCT Patent Application is hereby incorporated by reference in its entirety.
The invention relates generally to video coding. In particular, the present invention relates to chroma prediction using localized luma prediction mode inheritance.
The High Efficiency Video Coding (HEVC) standard was developed under the joint video project of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) standardization organizations, in a partnership known as the Joint Collaborative Team on Video Coding (JCT-VC).
In HEVC, one slice is partitioned into multiple coding tree units (CTUs). Each CTU is further partitioned into multiple coding units (CUs) to adapt to various local characteristics. A coding mode, such as Inter mode or Intra mode, is selected on a CU basis. HEVC supports multiple Intra prediction modes, and for an Intra coded CU, the selected Intra prediction mode is signalled. In addition to the concept of coding unit, the concept of prediction unit (PU) is also introduced in HEVC. Once the splitting of the CU hierarchical tree is done, each leaf CU is further split into one or more prediction units (PUs) according to prediction type and PU partition. After prediction, the residues associated with the CU are partitioned into transform blocks, named transform units (TUs), for the transform process.
HEVC uses more sophisticated Intra prediction than previous video coding standards such as AVC/H.264. According to HEVC, 35 Intra prediction modes are used for the luma component, where the 35 Intra prediction modes include DC, planar and various angular prediction modes. There is a chroma Intra prediction mode, referred to as ‘DM’ mode, in which a chroma block inherits the Intra prediction mode of the corresponding luma block as depicted
The design in HEVC is suitable only when the luma and chroma components share the same coding structure. When the luma and chroma coding structures are separated, the DM mode may not be suitable. In the existing HEVC standard, the prediction mode of a chroma block (i.e., a chroma PU) may inherit the prediction mode of a corresponding block (i.e., a luma PU). The prediction mode inheritance is performed on a chroma PU basis. In an advanced coding system, the coding structure may be different for the luma and chroma components. When a large chroma PU is used, the prediction mode inherited from a corresponding luma block (i.e., luma PU) may not be suitable for the large chroma PU. Accordingly, a technique named localized luma prediction mode inheritance (LLMI) is disclosed in the present invention in order to provide better prediction modes for the chroma PU and consequently to result in improved compression performance.
A method and apparatus of Inter/Intra prediction for a chroma component performed by a video encoder or video decoder are disclosed. According to this method, a current chroma prediction block (e.g. a prediction unit, PU) is divided into multiple chroma prediction sub-blocks (e.g. sub-PUs). A corresponding luma prediction block is identified for each chroma prediction sub-block. A chroma prediction mode for each chroma prediction sub-block is determined from a luma prediction mode associated with the corresponding luma prediction block. A local chroma predictor for the current chroma prediction block is generated by applying a prediction process to the multiple chroma prediction sub-blocks using respective chroma prediction modes. In other words, the prediction process is applied at the chroma prediction sub-block level. After the local chroma predictor is derived, a coding block associated with the current chroma prediction block is encoded or decoded using information comprising the local chroma predictor.
The chroma prediction mode and the luma prediction mode may correspond to an Intra prediction mode. The chroma prediction mode and the luma prediction mode may also correspond to an Inter prediction mode. Parameters associated with the Inter prediction mode belong to a motion information group comprising motion vector, reference picture index, prediction direction representing uni-prediction or bi-prediction, and merge index.
Transform block structure may use the same structure as the chroma prediction block so that each chroma prediction sub-block can be treated as a transform sub-block and a transform process is applied to each chroma prediction residual sub-block to generate each transform sub-block. When a first chroma prediction sub-block is processed before a second chroma prediction sub-block, a reconstructed first chroma prediction sub-block can be used for the prediction process of the second chroma prediction sub-block, where the reconstructed first chroma prediction sub-block is generated from a reconstructed first chroma prediction residual sub-block corresponding to a reconstructed transform sub-block. When a first chroma prediction sub-block is processed before a second chroma prediction sub-block, the chroma prediction mode or chroma predictor of the first chroma prediction sub-block can also be used for the prediction process of the second chroma prediction sub-block.
In another embodiment, a second chroma predictor is derived by applying a different prediction process having a second prediction mode to the current chroma prediction block. A combined chroma predictor is then generated based on the local chroma predictor and the second chroma predictor. The combined chroma predictor is then used for encoding or decoding the current chroma prediction block. The combined chroma predictor may correspond to a weighted sum of the local chroma predictor and the second chroma predictor. In yet another embodiment, the combined chroma predictor, P(i,j), is derived from the local chroma predictor, PL(i,j), and the second chroma predictor, PK(i,j), according to P(i,j)=(w1*PL(i,j)+w2*PK(i,j)+D)>>S, wherein w1, w2, D and S are integers, S is greater than zero, w1+w2=1<<S, “>>” corresponds to a right-shift operation and “<<” corresponds to a left-shift operation. For example, D can be equal to zero or 1<<(S−1). In another example, w1 and w2 are equal to one, D is equal to zero, and S is equal to one. In another example, w1, w2, D and S are all equal to one.
The second prediction mode may belong to a mode group comprising one or more of LM mode, LM_TOP mode, LM_LEFT mode, LM_TOP_RIGHT mode, LM_RIGHT mode, LM_LEFT_BOTTOM mode, LM_BOTTOM mode, LM_LEFT_TOP mode, and LM_CbCr mode. The second prediction mode may correspond to an angular prediction mode. The second prediction mode may also belong to a mode group comprising DC mode, Planar mode, Planar_Ver mode and Planar_Hor mode.
A localized luma prediction mode inheritance (LLMI) mode indicating a use of the local chroma predictor for encoding or decoding the current chroma prediction block can be coded by placing the LLMI mode in a code table among other prediction modes, and the LLMI mode is placed at a first position in the code table so that the LLMI mode has a shortest codeword. In another example, the LLMI mode is placed after LM and extended LM modes in the code table so that the LLMI mode has a longer codeword than codewords associated with the LM and extended LM modes. In yet another example, the LLMI mode is placed after LM Fusion mode and extended LM Fusion modes in the code table so that the LLMI mode has a longer codeword than codewords associated with the LM Fusion mode and extended LM Fusion modes.
According to another method of the present invention, each sub-block is predicted as a part of the prediction block of the current chroma block. For example, a current chroma prediction block is divided into multiple chroma prediction sub-blocks comprising at least a first chroma prediction sub-block and a second chroma prediction sub-block. A first chroma prediction mode is applied to the current chroma prediction block to generate a first predictor for the current chroma prediction block. A second chroma prediction mode is applied to the current chroma prediction block to generate a second predictor for the current chroma prediction block. A local chroma predictor is generated for the current chroma prediction block, where the local chroma predictor comprises a first sub-block predictor corresponding to the first chroma prediction sub-block of the first predictor and a second sub-block predictor corresponding to the second chroma prediction sub-block of the second predictor. A chroma coding block associated with the current chroma prediction block is encoded using information comprising the local chroma predictor at the encoder side or a chroma coding block associated with the current chroma prediction block is decoded using information comprising the local chroma predictor at the decoder side.
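As a rough illustration of this second method, the following Python sketch assembles a local chroma predictor from two predictors that were each generated for the whole chroma prediction block. The function name `compose_local_predictor`, the rectangle layout `(x, y, width, height)`, and the sample values are hypothetical and are chosen only for illustration; how the full-block predictors themselves are produced is outside the sketch.

```python
from typing import List, Tuple

Block = List[List[int]]
Rect = Tuple[int, int, int, int]  # (x, y, width, height) inside the block


def compose_local_predictor(width: int, height: int,
                            parts: List[Tuple[Rect, Block]]) -> Block:
    """Build a local predictor by copying, for each sub-block rectangle,
    the co-located samples of the full-block predictor assigned to it."""
    out = [[0] * width for _ in range(height)]
    for (x0, y0, w, h), predictor in parts:
        for y in range(y0, y0 + h):
            for x in range(x0, x0 + w):
                out[y][x] = predictor[y][x]
    return out


# Hypothetical 4x4 chroma prediction block with two full-block predictors.
first_predictor = [[100] * 4 for _ in range(4)]   # from a first chroma mode
second_predictor = [[50] * 4 for _ in range(4)]   # from a second chroma mode

# Left half is taken from the first predictor, right half from the second.
local = compose_local_predictor(4, 4, [
    ((0, 0, 2, 4), first_predictor),
    ((2, 0, 2, 4), second_predictor),
])
print(local[0])  # [100, 100, 50, 50]
```

In this arrangement every sub-block sees the same neighbouring reference samples as the whole block would, since each mode is applied at the block level and only the relevant region is kept.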
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
In the following description, Y component is identical to the luma component, U component is identical to Cb component and V component is identical to Cr component.
In the present invention, Localized Luma Prediction Mode Inheritance (LLMI) for chroma prediction and/or coding is disclosed. In one embodiment, a current chroma block with size S×R is divided into multiple sub-blocks with size P×Q. For example, an 8×8 chroma block can be divided into 16 sub-blocks with a block size of 2×2. For a sub-block C in the current chroma block, the corresponding luma block L can be found. The luma prediction mode M of the corresponding luma block L can be obtained, and mode M is assigned to the sub-block C in the current chroma block. In a conventional video coding system, such as High Efficiency Video Coding (HEVC), a single prediction mode is applied to each chroma block (e.g. a chroma prediction unit, PU). The present invention discloses a process to divide a chroma prediction block into multiple chroma prediction sub-blocks. Therefore, each chroma prediction sub-block may use its own prediction mode that may match the local characteristics of that individual chroma prediction sub-block. Consequently, this LLMI prediction process may result in improved coding performance.
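The sub-block mode assignment described above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the 4:2:0 scaling factors, the use of the top-left sample of each sub-block to locate the corresponding luma block, the per-sample `luma_modes` array, and the function name `llmi_mode_map` are all hypothetical and are not specified by the description.

```python
from typing import List


def llmi_mode_map(luma_modes: List[List[int]],
                  chroma_x: int, chroma_y: int,
                  width: int, height: int,
                  sub_w: int, sub_h: int,
                  scale_x: int = 2, scale_y: int = 2) -> List[List[int]]:
    """Return one inherited prediction mode per chroma sub-block.

    luma_modes[y][x] is assumed to hold the luma prediction mode covering
    luma sample (x, y); scale_x and scale_y map chroma to luma coordinates
    (2 and 2 for 4:2:0, which is an assumption of this sketch).
    """
    if width % sub_w or height % sub_h:
        raise ValueError("block size must be a multiple of the sub-block size")
    modes = []
    for sy in range(0, height, sub_h):
        row = []
        for sx in range(0, width, sub_w):
            # Map the top-left chroma sample of sub-block C to luma coordinates
            # and inherit mode M of the luma block L that covers it.
            lx = (chroma_x + sx) * scale_x
            ly = (chroma_y + sy) * scale_y
            row.append(luma_modes[ly][lx])
        modes.append(row)
    return modes


# Hypothetical 16x16 luma area: left half uses mode 10, right half mode 26.
luma_modes = [[10 if x < 8 else 26 for x in range(16)] for _ in range(16)]

# An 8x8 chroma block divided into sixteen 2x2 sub-blocks.
modes = llmi_mode_map(luma_modes, 0, 0, 8, 8, 2, 2)
print(modes[0])  # [10, 10, 26, 26]
```

With a single inherited mode, the whole 8×8 chroma block in this example would have to use either mode 10 or mode 26; the per-sub-block map lets each half follow its own luma region.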
The term ‘prediction mode’ can refer to Intra prediction modes, such as various angular modes and other modes used in HEVC. It can also refer to Inter prediction modes. In this case, when a chroma sub-block inherits the prediction mode of a corresponding luma block, it implies that the chroma sub-block inherits the motion information or parameters associated with the Inter prediction mode, such as, but not limited to, motion vector, reference picture, prediction direction to identify uni-prediction or bi-prediction and merge index. In the following descriptions, an example of LLMI for the Intra prediction modes is illustrated.
In the example of
The block structure for the LLMI process may also be applied to the transform structure. In one embodiment, each sub-block corresponds to a transform unit (TU) in HEVC. The sub-blocks are processed in scanning order across the sub-blocks. At the decoder, after one sub-block is predicted and reconstructed, the reconstructed samples of this sub-block are used to predict subsequent sub-blocks.
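The dependency between consecutive sub-blocks can be illustrated with the Python sketch below. It assumes a raster scanning order, a simple horizontal predictor that copies the reconstructed sample to the left (with a default value of 128 when no left neighbour exists), and no sample clipping; these choices, and the names `reconstruct_sub_blocks` and `horizontal_predict`, are hypothetical simplifications rather than the disclosed process.

```python
from typing import Callable, List

Block = List[List[int]]


def horizontal_predict(recon: Block, sx: int, sy: int, w: int, h: int) -> Block:
    """Hypothetical predictor: each row copies the reconstructed sample
    immediately to the left of the sub-block, or 128 if none is available."""
    return [[recon[sy + dy][sx - 1] if sx > 0 else 128 for _ in range(w)]
            for dy in range(h)]


def reconstruct_sub_blocks(recon: Block, residual: Block,
                           x0: int, y0: int, width: int, height: int,
                           sub_w: int, sub_h: int,
                           predict: Callable[[Block, int, int, int, int], Block]
                           ) -> Block:
    """Predict and reconstruct sub-blocks one by one in raster order, so that
    later sub-blocks can reference samples reconstructed from earlier ones."""
    for sy in range(y0, y0 + height, sub_h):
        for sx in range(x0, x0 + width, sub_w):
            pred = predict(recon, sx, sy, sub_w, sub_h)
            for dy in range(sub_h):
                for dx in range(sub_w):
                    res = residual[sy + dy - y0][sx + dx - x0]
                    recon[sy + dy][sx + dx] = pred[dy][dx] + res
    return recon


# A 4x2 chroma block split into two 2x2 sub-blocks, residual of 1 everywhere.
recon = [[0] * 4 for _ in range(2)]
residual = [[1] * 4 for _ in range(2)]
reconstruct_sub_blocks(recon, residual, 0, 0, 4, 2, 2, 2, horizontal_predict)
print(recon)  # [[129, 129, 130, 130], [129, 129, 130, 130]]
```

The second sub-block is predicted from the value 129 reconstructed in the first sub-block, not from the default value, which is the behaviour the paragraph above describes.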
In another embodiment, each sub-block is predicted as a part of the prediction block of the current chroma block.
In another embodiment, each sub-block is predicted in a cascade way.
In another embodiment, the LLMI mode can be fused with other modes. For example, a chroma block can be first predicted by the proposed LLMI mode. For a sample (i,j) in this block, its prediction value with the LLMI mode is represented as PL(i,j). The chroma block is then predicted by another prediction mode, such as mode K, that is different from the LLMI mode. For a sample (i,j) in this block, its prediction value with mode K is represented as PK(i,j). The combined prediction for sample (i,j), denoted as P(i,j), in this block is calculated as:
P(i,j)=w1*PL(i,j)+w2*PK(i,j), (1)
where w1 and w2 are weighting values (real numbers) and w1+w2=1. In equation (1), the combined prediction is derived as a weighted sum of the LLMI prediction value and another prediction value.
In another example, the combined prediction for sample (i,j), denoted as P(i,j), in this block is calculated as:
P(i,j)=(w1*PL(i,j)+w2*PK(i,j)+D)>>S, (2)
where w1, w2, D and S are integers, S>=1, and w1+w2=1<<S. For example, D can be 0. In another example, D is 1<<(S−1). The symbols “>>” and “<<” represent right-shift and left-shift operations respectively.
In yet another example, the combined prediction for sample (i,j), denoted as P(i,j), in this block is calculated as:
P(i,j)=(PL(i,j)+PK(i,j)+1)>>1. (3)
In yet another example, the combined prediction for sample (i,j), denoted as P(i,j), in this block is calculated as:
P(i,j)=(PL(i,j)+PK(i,j))>>1. (4)
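The integer combination in equation (2), with equations (3) and (4) as special cases, can be sketched in Python as follows. The function name `combine_predictors` and the sample values are hypothetical; the constraint check mirrors the conditions S>=1 and w1+w2=1<<S stated for equation (2).

```python
from typing import List

Block = List[List[int]]


def combine_predictors(pl: Block, pk: Block,
                       w1: int, w2: int, d: int, s: int) -> Block:
    """Per-sample P = (w1*PL + w2*PK + D) >> S, as in equation (2)."""
    if s < 1 or w1 + w2 != (1 << s):
        raise ValueError("require S >= 1 and w1 + w2 == 1 << S")
    return [[(w1 * a + w2 * b + d) >> s for a, b in zip(row_l, row_k)]
            for row_l, row_k in zip(pl, pk)]


# Hypothetical predictor samples for a 2x1 block.
pl = [[100, 101]]  # LLMI prediction PL(i,j)
pk = [[51, 50]]    # mode K prediction PK(i,j)

# Equation (3): w1 = w2 = 1, D = 1, S = 1 (average with rounding).
print(combine_predictors(pl, pk, 1, 1, 1, 1))  # [[76, 76]]

# Equation (4): w1 = w2 = 1, D = 0, S = 1 (average with truncation).
print(combine_predictors(pl, pk, 1, 1, 0, 1))  # [[75, 75]]

# Equation (2) with unequal weights: w1 = 3, w2 = 1, S = 2, D = 1 << (S - 1).
print(combine_predictors(pl, pk, 3, 1, 2, 2))  # [[88, 88]]
```

Choosing D = 1<<(S−1) rounds to nearest, whereas D = 0 truncates, which is why equations (3) and (4) differ by one for odd sums.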
Any mode can be used as mode K as long as it is not LLMI mode. For example, mode K may correspond to LM mode, LM_TOP mode, LM_LEFT mode, LM_TOP_RIGHT mode, LM_RIGHT mode, LM_LEFT_BOTTOM mode, LM_BOTTOM mode, LM_LEFT_TOP mode or LM_CbCr mode.
In one embodiment, mode K can be any angular prediction mode with a prediction direction. In another embodiment, mode K can be any of DC mode, Planar mode, Planar_Ver mode and Planar_Hor mode.
In a video coding system, the prediction mode selected for a block may have to be signalled in the bitstream. Usually, a codeword is signalled to indicate a selected prediction mode among a list of possible prediction modes. In one embodiment, in order to code the chroma mode, the LLMI mode is put into the code table at the first position, i.e., the LLMI mode requires a codeword no longer than that of any other chroma Intra prediction mode. In other words, the LLMI mode has the shortest codeword.
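The relationship between code table position and codeword length can be illustrated with the Python sketch below. The description does not state which binarization is used; truncated unary coding is assumed here purely for illustration, and the mode list and the names `truncated_unary` and `build_code_table` are hypothetical.

```python
from typing import Dict, List


def truncated_unary(index: int, table_size: int) -> str:
    """Truncated unary codeword: `index` ones followed by a terminating zero,
    except that the last entry of the table omits the terminating zero."""
    if not 0 <= index < table_size:
        raise ValueError("index out of range")
    if index == table_size - 1:
        return "1" * index
    return "1" * index + "0"


def build_code_table(modes: List[str]) -> Dict[str, str]:
    """Assign codewords by position: earlier entries never get longer codes."""
    return {mode: truncated_unary(i, len(modes)) for i, mode in enumerate(modes)}


# Hypothetical chroma mode order with the LLMI mode at the first position.
table = build_code_table(["LLMI", "LM", "Planar", "DC"])
for mode, codeword in table.items():
    print(mode, codeword)
# LLMI 0
# LM 10
# Planar 110
# DC 111
```

Moving the LLMI mode to a later position in the list, for example after the LM mode, gives it a codeword no shorter than those of the entries placed before it under this assumed binarization.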
To code the chroma mode, the LLMI mode can also be placed into the code table at a location after the LM mode and its extended modes. In this case, the LLMI mode requires a codeword no shorter than those of the LM mode and its extended modes. An example of code table order is demonstrated in
Alternatively, to code the chroma mode, the LLMI mode is put into the code table after the LM Fusion mode and its extended modes, i.e., the LLMI mode requires a codeword no shorter than those of the LM Fusion mode and its extended modes. An example code table order is demonstrated in
The flowchart shown is intended to illustrate an example of video coding according to the present invention. A person skilled in the art may modify each step, re-arrange the steps, split a step, or combine steps to practice the present invention without departing from the spirit of the present invention. In the disclosure, specific syntax and semantics have been used to illustrate examples to implement embodiments of the present invention. A skilled person may practice the present invention by substituting the syntax and semantics with equivalent syntax and semantics without departing from the spirit of the present invention.
The above description is presented to enable a person of ordinary skill in the art to practice the present invention as provided in the context of a particular application and its requirement. Various modifications to the described embodiments will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed. In the above detailed description, various specific details are illustrated in order to provide a thorough understanding of the present invention. Nevertheless, it will be understood by those skilled in the art that the present invention may be practiced.
Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both. For example, an embodiment of the present invention can be one or more circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or a field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
Number | Date | Country | Kind |
---|---|---|---|
PCT/CN2016/074200 | Feb 2016 | WO | international |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2017/074254 | 2/21/2017 | WO | 00 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2017/143963 | 8/31/2017 | WO | A |
Number | Date | Country | |
---|---|---|---|
20190068977 A1 | Feb 2019 | US |