The present invention relates to video coding using motion estimation and motion compensation. In particular, the present invention relates to motion vector buffer management for coding systems using motion estimation/compensation techniques including affine transform motion model.
Various video coding standards have been developed over the past two decades. In newer coding standards, more powerful coding tools are used to improve coding efficiency. High Efficiency Video Coding (HEVC) is a new coding standard developed in recent years. In the HEVC system, the fixed-size macroblock of H.264/AVC is replaced by a flexible block, named coding unit (CU). Pixels in the CU share the same coding parameters to improve coding efficiency. A CU may begin with a largest CU (LCU), which is also referred to as a coding tree unit (CTU) in HEVC. In addition to the concept of coding unit, the concept of prediction unit (PU) is also introduced in HEVC. Once the splitting of the CU hierarchical tree is done, each leaf CU is further split into one or more prediction units (PUs) according to prediction type and PU partition.
In most coding standards, adaptive Inter/Intra prediction is used on a block basis. In the Inter prediction mode, one or two motion vectors are determined for each block to select one reference block (i.e., uni-prediction) or two reference blocks (i.e., bi-prediction). The motion vector or motion vectors are determined and coded for each individual block. In HEVC, Inter motion compensation is supported in two different ways: explicit signalling or implicit signalling. In explicit signalling, the motion vector for a block (i.e., PU) is signalled using a predictive coding method. The motion vector predictors correspond to motion vectors associated with spatial and temporal neighbours of the current block. After an MV predictor is determined, the motion vector difference (MVD) is coded and transmitted. This mode is also referred to as AMVP (advanced motion vector prediction) mode. In implicit signalling, one predictor from a candidate predictor set is selected as the motion vector for the current block (i.e., PU). Since both the encoder and decoder derive the candidate set and select the final motion vector in the same way, there is no need to signal the MV or MVD in the implicit mode. This mode is also referred to as Merge mode. The forming of the predictor set in Merge mode is also referred to as Merge candidate list construction. An index, called the Merge index, is signalled to indicate the predictor selected as the MV for the current block.
Motion that occurs across pictures along the temporal axis can be described by a number of different models. Let A(x, y) be the original pixel at location (x, y) under consideration and A′(x′, y′) be the corresponding pixel at location (x′, y′) in a reference picture for the current pixel A(x, y). The affine motion models are then described as follows.
In contribution ITU-T13-SG16-C1016 submitted to ITU-VCEG (Lin, et al., “Affine transform prediction for next generation video coding”, ITU-T, Study Group 16, Question Q6/16, Contribution C1016, September 2015, Geneva, CH), a four-parameter affine prediction is disclosed, which includes the affine Merge mode. When an affine motion block is moving, the motion vector field of the block can be described by two control-point motion vectors or four parameters as follows, where (vx, vy) represents the motion vector:
An example of the four-parameter affine model is shown in
In the above equations, (v0x, v0y) is the control-point motion vector (i.e., v0) at the upper-left corner of the block, and (v1x, v1y) is another control-point motion vector (i.e., v1) at the upper-right corner of the block. When the MVs of the two control points are decoded, the MV of each 4×4 sub-block of the block can be determined according to the above equations. In other words, the affine motion model for the block can be specified by the two motion vectors at the two control points. Furthermore, while the upper-left corner and the upper-right corner of the block are used as the two control points, two other control points may also be used. The motion vectors of a current block can be determined for each 4×4 sub-block based on the MVs of the two control points, as shown in
The 6-parameter affine model can also be used. The motion vector field of each point in this moving block can be described by the following equation.
In the above equation, (v0x, v0y) is the control-point motion vector at the top-left corner of the block, (v1x, v1y) is another control-point motion vector at the top-right corner of the block, and (v2x, v2y) is another control-point motion vector at the bottom-left corner of the block.
In contribution ITU-T13-SG16-C1016, for an Inter mode coded CU, an affine flag is signalled to indicate whether the affine Inter mode is applied or not when the CU size is equal to or larger than 16×16. If the current block (e.g., current CU) is coded in affine Inter mode, a candidate MVP pair list is built using the neighbour valid reconstructed blocks. corresponds to motion vector of the block V0 at the upper-left corner of the current block 210, which is selected from the motion vectors of the neighbouring block a0 (referred as the above-left block), a1 (referred as the inner above-left block) and a2 (referred as the lower above-left block), and the
corresponds to motion vector of the block V1 at the upper-right corner of the current block 210, which is selected from the motion vectors of the neighbouring block b0 (referred to as the above block) and b1 (referred to as the above-right block). The index of the candidate MVP pair is signalled in the bitstream. The MV differences (MVDs) of the two control points are coded in the bitstream.
In ITU-T13-SG16-C-1016, an affine Merge mode is also proposed. If the current PU is a Merge PU, the five neighbouring blocks (c0, b0, b1, c1, and a0 blocks in
In HEVC, the decoded MVs of each PU are down-sampled with a 16:1 ratio and stored in the temporal MV buffer for the MVP derivation for the following frames. For a 16×16 block, only the top-left 4×4 MV is stored in the temporal MV buffer and the stored MV represents the MV of the whole 16×16 block.
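The 16:1 down-sampling described above can be sketched as follows. This is a minimal illustration only: the 2-D list layout of the MV field and the function names `downsample_mv_field` and `temporal_mv_at` are assumptions made for the example and are not taken from HEVC or from this disclosure.

```python
def downsample_mv_field(mv_field, unit=4, stored_block=16):
    """Keep the top-left unit x unit MV of every stored_block x stored_block area.

    mv_field[r][c] is assumed to hold the MV of the 4x4 block in row r, column c.
    """
    step = stored_block // unit  # 4 units per side, i.e. 16 units per stored MV
    return [row[::step] for row in mv_field[::step]]


def temporal_mv_at(temporal_buf, x, y, stored_block=16):
    """Return the stored MV that represents pixel position (x, y)."""
    return temporal_buf[y // stored_block][x // stored_block]


# Example: a 32x32 area holds 8x8 MVs before and 2x2 MVs after down-sampling.
field = [[(c, r) for c in range(8)] for r in range(8)]
buf = downsample_mv_field(field)
```

With this layout, every pixel of a 16×16 area maps to the single MV that was originally stored for its top-left 4×4 block.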
Methods and apparatus of Inter prediction for video coding performed by a video encoder or a video decoder that utilizes MVP (motion vector prediction) to code MV (motion vector) information associated with a block coded with coding modes including an affine mode are disclosed. According to one method, input data related to a current block at a video encoder side or a video bitstream corresponding to compressed data including the current block at a video decoder side are received. A target neighbouring block from a neighbouring set of the current block is determined, where the target neighbouring block is coded according to a 4-parameter affine model or a 6-parameter affine model. If the target neighbouring block is in a neighbouring region of the current block, an affine control-point MV candidate is derived based on two target MVs (motion vectors) of the target neighbouring block where the affine control-point MV candidate derivation is based on a 4-parameter affine model. An affine MVP candidate list is generated where the affine MVP candidate list comprises the affine control-point MV candidate. The current MV information associated with an affine model is encoded using the affine MVP candidate list at the video encoder side or the current MV information associated with the affine model is decoded at the video decoder side using the affine MVP candidate list.
A region boundary associated with the neighbouring region of the current block may correspond to a CTU boundary, CTU-row boundary, tile boundary, or slice boundary of the current block. The neighbouring region of the current block may correspond to an above CTU (coding tree unit) row of the current block or one left CTU column of the current block. In another example, the neighbouring region of the current block corresponds to an above CU (coding unit) row of the current block or one left CU column of the current block.
In one embodiment, the two target MVs of the target neighbouring block correspond to two sub-block MVs of the target neighbouring block. For example, the two sub-block MVs of the target neighbouring block correspond to a bottom-left sub-block MV and a bottom-right sub-block MV of the neighbouring block. The two sub-block MVs of the target neighbouring block can be stored in a line buffer. For example, one row of MVs above the current block and one column of MVs to a left side of the current block can be stored in the line buffer. In another example, one bottom row of MVs of an above CTU row of the current block are stored in the line buffer. The two target MVs of the target neighbouring block may also correspond to two control-point MVs of the target neighbouring block.
The method may further comprise deriving the affine control-point MV candidate and including the affine control-point MV candidate in the affine MVP candidate list if the target neighbouring block is in a same region as the current block, where the affine control-point MV derivation is based on a 6-parameter affine model or the 4-parameter affine model. The same region corresponds to a same CTU row.
In one embodiment, the y-term parameter of the MV x-component is equal to the x-term parameter of the MV y-component multiplied by (−1), and the x-term parameter of the MV x-component and the y-term parameter of the MV y-component are the same for the 4-parameter affine model. In another embodiment, the y-term parameter of the MV x-component and the x-term parameter of the MV y-component are different, and the x-term parameter of the MV x-component and the y-term parameter of the MV y-component are also different for the 6-parameter affine model.
According to another method, if the target neighbouring block is in a neighbouring region of the current block, an affine control-point MV candidate is derived based on two sub-block MVs (motion vectors) of the target neighbouring block. If the target neighbouring block is in a same region as the current block, the affine control-point MV candidate is derived based on control-point MVs of the target neighbouring block.
For the second method, if the target neighbouring block is a bi-predicted block, bottom-left sub-block MVs and bottom-right sub-block MVs associated with list 0 and list 1 reference pictures are used for deriving the affine control-point MV candidate. If the target neighbouring block is in the same region as the current block, the affine control-point MV candidate derivation corresponds to a 6-parameter affine model or a 4-parameter affine model depending on the affine mode of the target neighbouring block.
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
In existing video systems, the motion vectors of previously coded blocks are stored in a motion vector buffer for use by subsequent blocks. For example, the motion vectors in the buffer can be used to derive a candidate for a Merge list or an AMVP (advanced motion vector prediction) list for Merge mode or Inter mode respectively. When affine motion estimation and compensation is used, the motion vectors (MVs) associated with the control points are not stored in the MV buffer. Instead, the control-point motion vectors (CPMVs) are stored in another buffer separate from the MV buffer. When an affine candidate (e.g. an affine Merge candidate or an affine Inter candidate) is derived, the CPMVs of neighbouring blocks have to be retrieved from the other buffer. In order to reduce the required storage and/or CPMV accesses, various techniques are disclosed.
In ITU-T13-SG16-C-1016, affine MVPs are derived for affine Inter mode and affine Merge mode. In ITU-T13-SG16-C-1016, for affine Merge mode of a current block, if the neighbouring block is an affine coded block (including an affine Inter mode block and an affine Merge mode block), the MV of the top-left N×N block of the neighbouring block (N×N being, e.g., the smallest block size to store an MV, with N=4) and the MV of the top-right N×N block of the neighbouring block are used to derive the affine parameters or the MVs of the control points of the affine Merge candidate. When the third control point is used, the MV of the bottom-left N×N block is also used. For example, as shown in
In order to overcome this MV buffer issue, various methods of MV buffer management are disclosed to reduce the buffer requirements.
If the MVs are not in the neighbouring block row or block column of the current CU/CTU or in the current CTU/CTU-row (e.g. the referenced MV is not in the neighbouring N×N block row or N×N block column of the current CU/CTU or in the current CTU/CTU-row), the affine parameter derivation uses the MVs stored in the temporal MV buffer instead of the real MVs. Here N×N represents the smallest block size to store an MV. In one embodiment, N=4.
Instead of storing all MVs in the current frame, according to this method, the MVs of M neighbouring row blocks and the MVs of K neighbouring column blocks are stored for affine parameter derivation, where M and K are integers, M can be larger than 1 and K can be larger than 1. Each block refers to the smallest N×N block for which an associated MV can be stored (N=4 in one embodiment). An example with M=K=2 and N=4 is shown in
The first derived control-point affine MVP from block B can be modified as follows:
In the above equations, VB0′, VB1′, and VB2 can be replaced by the corresponding MVs of any other selected reference/neighbouring PU, (posCurPU_X, posCurPU_Y) is the pixel position of the top-left sample of the current PU relative to the top-left sample of the picture, (posRefPU_X, posRefPU_Y) is the pixel position of the top-left sample of the reference/neighbouring PU relative to the top-left sample of the picture, and (posB0′_X, posB0′_Y) is the pixel position of the top-left sample of the B0 block relative to the top-left sample of the picture. The other two control-point MVPs can be derived as follows.
V1_x = V0_x + (VB1′_x − VB0′_x) * PU_width / RefPU_width,
V1_y = V0_y + (VB1′_y − VB0′_y) * PU_width / RefPU_width,
V2_x = V0_x + (VB2_x − VB0′_x) * PU_height / (2*N), and
V2_y = V0_y + (VB2_y − VB0′_y) * PU_height / (2*N). (5)
The derived 2 control-point affine MVP from block B can be modified as follows:
Since the line buffer for storing the MVs from the top CTUs is much larger than the column buffer for storing the MVs from the left CTU, there is no need to constrain the value of M, where M can be set to CTU_width/N according to one embodiment.
In another embodiment, inside the current CTU row, M MV rows are used. However, outside the current CTU row, only one MV row is used. In other words, the CTU row MV line buffer only stores one MV row.
In another embodiment, different M MVs in vertical directions and/or different K MVs in horizontal direction are stored in the M MV row buffers and/or K MV column buffers. Different MVs can come from different CUs or different sub-blocks. The number of different MVs introduced from one CU with sub-block mode can be further limited in some embodiments. For example, one affine-coded CU with size 32×32 can be divided into 8 4×4 sub-blocks in the horizontal direction and 8 4×4 sub-blocks in the vertical direction. There are 8 different MVs in each direction. In one embodiment, all of these 8 different MVs are allowed to be considered as M or K different MVs. In another embodiment, only the first MV and the last MV among these 8 different MVs are considered as M or K different MVs.
Instead of storing all MVs in the current frame, it is proposed to store one more MV row and one more MV column. As shown in
In one embodiment, inside the current CTU row, two MV rows are used. However, outside the current CTU row, only one MV row is used. In other words, the CTU row MV line buffer is used only to store one MV row.
In equation (4), the MVs of the top-left and top-right control points are used to derive the MVs of all N×N sub-blocks (i.e., the smallest unit to store an MV, N=4 in one embodiment) in the CU/PU. The derived MVs are (v0x, v0y) plus a position-dependent offset MV. According to equation (4), when an MV is derived for an N×N sub-block, the horizontal direction offset MV is ((v1x−v0x)*N/w, (v1y−v0y)*N/w) and the vertical direction offset MV is (−(v1y−v0y)*N/w, (v1x−v0x)*N/w). For a 6-parameter affine model, if the top-left, top-right, and bottom-left MVs are v0, v1, and v2, the MV of each pixel can be derived as follows.
According to equation (7), for an N×N sub-block at position (x, y) (relative to the top-left corner), the horizontal direction offset MV is ((v1x−v0x)*N/w, (v1y−v0y)*N/w) and the vertical direction offset MV is ((v2x−v0x)*N/h, (v2y−v0y)*N/h). The derived MV is (vx, vy) as shown in equation (7). In equations (4) and (7), w and h are the width and height of the affine coded block.
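The offset-MV formulation above can be illustrated with a short sketch. It is only an example under stated assumptions: MVs are plain (x, y) tuples in floating point, each sub-block MV is evaluated at the top-left sample of the sub-block, and the function name `subblock_mvs` is hypothetical.

```python
def subblock_mvs(v0, v1, w, h, n=4, v2=None):
    """Return a grid of MVs, one per n x n sub-block of a w x h affine block.

    v0, v1 and v2 are the top-left, top-right and bottom-left control-point MVs.
    When v2 is None the 4-parameter model is used, otherwise the 6-parameter one.
    """
    # Horizontal direction offset MV: change in MV per sub-block step to the right.
    hor = ((v1[0] - v0[0]) * n / w, (v1[1] - v0[1]) * n / w)
    if v2 is None:
        # 4-parameter model: the vertical offset is the horizontal one rotated by 90 degrees.
        ver = (-hor[1], hor[0])
    else:
        # 6-parameter model: the vertical offset comes from the bottom-left control point.
        ver = ((v2[0] - v0[0]) * n / h, (v2[1] - v0[1]) * n / h)
    return [[(v0[0] + i * hor[0] + j * ver[0],
              v0[1] + i * hor[1] + j * ver[1])
             for i in range(w // n)]
            for j in range(h // n)]


# 16x16 block with 4x4 sub-blocks, 4-parameter and 6-parameter variants.
grid4 = subblock_mvs((0, 0), (16, 8), 16, 16)
grid6 = subblock_mvs((0, 0), (16, 8), 16, 16, v2=(0, 16))
```

Each sub-block MV is the control-point MV v0 plus an integer number of horizontal and vertical offset MVs, which is why storing the two offsets and one MV is enough to regenerate the whole field.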
If the MV of the control points is the MV of the centre pixel of an N×N block, in equations (4) to (7), the denominator can be decreased by N. For example, the equation (4) can be rewritten as follows.
In one embodiment, the horizontal and vertical direction offset MVs for an M×M block or for a CU are stored. For example, if the smallest affine Inter mode or affine Merge mode block size is 8×8, then M can be equal to 8. For each 8×8 block or CU, if the 4-parameter affine model that uses the upper-left and upper-right control points is used, the parameters (v1x−v0x)*N/w and (v1y−v0y)*N/w and one MV of an N×N block (e.g. v0x and v0y) are stored. If the 4-parameter affine model that uses the upper-left and bottom-left control points is used, the parameters (v2x−v0x)*N/h and (v2y−v0y)*N/h and one MV of an N×N block (e.g. v0x and v0y) are stored. If the 6-parameter affine model that uses the upper-left, upper-right, and bottom-left control points is used, the parameters (v1x−v0x)*N/w, (v1y−v0y)*N/w, (v2x−v0x)*N/h, (v2y−v0y)*N/h, and one MV of an N×N block (e.g. v0x and v0y) are stored. The MV of an N×N block can be the MV of any N×N block within the CU/PU. The affine parameters of the affine Merge candidate can be derived from the stored information.
In order to preserve the precision, the offset MV can be multiplied by a scale number. The scale number can be predefined or set equal to the CTU size. For example, the ((v1x−v0x)*S/w, (v1y−v0y)*S/w) and ((v2x−v0x)*S/h, (v2y−v0y)*S/h) are stored. The S can be equal to CTU_size or CTU_size/4.
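The effect of the scale number S on precision can be seen in a small sketch. The integer rounding of the stored offsets, the dictionary layout, and the names `pack_affine_offsets` and `mv_from_packed` are assumptions made for illustration; the text above does not specify how the stored values are quantised.

```python
def pack_affine_offsets(v0, v1, v2, w, h, scale):
    """Store scaled offset MVs as integers together with one anchor MV (here v0)."""
    hor = (round((v1[0] - v0[0]) * scale / w), round((v1[1] - v0[1]) * scale / w))
    ver = (round((v2[0] - v0[0]) * scale / h), round((v2[1] - v0[1]) * scale / h))
    return {"anchor": v0, "hor": hor, "ver": ver, "scale": scale}


def mv_from_packed(packed, x, y):
    """Rebuild the MV at offset (x, y) from the anchor position using the stored offsets."""
    s = packed["scale"]
    hx, hy = packed["hor"]
    vx, vy = packed["ver"]
    ax, ay = packed["anchor"]
    return (ax + (hx * x + vx * y) / s, ay + (hy * x + vy * y) / s)


# S = CTU_size = 128 keeps the 3/16 per-sample slope exactly; S = N = 4 does not.
fine = pack_affine_offsets((2, 1), (5, 1), (2, 4), 16, 16, scale=128)
coarse = pack_affine_offsets((2, 1), (5, 1), (2, 4), 16, 16, scale=4)
```

In this example the finely scaled parameters reproduce the top-right control-point MV (5, 1) exactly, while the offsets stored with S=4 round 0.75 up to 1 and return (6, 1) instead.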
In another embodiment, instead of storing affine parameters, the MVs of two or three control points of an M×M block or a CU, for example, are stored in a line buffer or local buffer. The control-point MV buffer and the sub-block MV buffer can be different buffers. The control-point MVs are stored separately. The control-point MVs are not identical to the sub-block MVs. The affine parameters of the affine Merge candidate can be derived using the stored control points.
Instead of storing all MVs in the current frame, the HEVC MV line buffer design is reused according to this method. The HEVC line buffer comprises one MV row and one MV column, as shown in
When deriving the affine candidates from the neighbouring block, two MVs of the neighbouring blocks (e.g. two MVs of two N×N neighbouring sub-blocks of the neighbouring block, or two control-point MVs of the neighbouring block) are used. For example, in
In one embodiment, the block E will not be used to derive the affine candidate. No additional buffer or additional line buffer is required for this method.
In another example, as shown in
In
In another embodiment, for
Note that the above-mentioned methods use the left CUs to derive the affine parameters or control-point MVs for the current CU. The proposed methods can also be used for deriving the affine parameters or control-point MVs for the current CU from the above CUs by using the same/similar methods.
The derived 2 control-point (i.e., 4-parameter) affine MVP from block B can be modified as follows:
Alternatively, we can use the equation below:
In the above equation, VB0′, VB1′, and VB2 can be replaced by the corresponding MVs of any other selected reference/neighbouring PU, (posCurPU_X, posCurPU_Y) is the pixel position of the top-left sample of the current PU relative to the top-left sample of the picture, (posCurPU_TR_X, posCurPU_TR_Y) is the pixel position of the top-right sample of the current PU relative to the top-left sample of the picture, (posRefPU_X, posRefPU_Y) is the pixel position of the top-left sample of the reference/neighbouring PU relative to the top-left sample of the picture, and (posB0′_X, posB0′_Y) is the pixel position of the top-left sample of the B0 block relative to the top-left sample of the picture.
In one embodiment, the proposed method, which uses two MVs for deriving the affine parameters or only using MVs stored in the MV line buffer for deriving the affine parameters, is applied to a neighbouring region. Inside the current region of the current block, the MVs are all stored (e.g. all the sub-block MVs or all the control-point MVs of the neighbouring blocks) and can be used for deriving the affine parameters. If the reference MVs are outside of the region (i.e., in the neighbouring region), the MVs in the line buffer (e.g. CTU row line buffer, CU row line buffer, CTU column line buffer, and/or CU column line buffer) can be used. The 6-parameter affine model is reduced to 4-parameter affine model in the case that not all control-point MVs are available. For example, two MVs of the neighbouring blocks are used to derive the affine control point MV candidate of the current block. The MVs of the target neighbouring block can be a bottom-left sub-block MV and a bottom-right sub-block MV of the neighbouring block or two control point MVs of the neighbouring block. When the reference MVs are inside the region (i.e., the current region), the 6-parameter affine model or 4-parameter affine model or other affine model can be used.
The region boundary associated with the neighbouring region can be CTU boundary, CTU-row boundary, tile boundary, or slice boundary. For example, for the MVs above the current CTU-row, the MVs stored in the one row MV buffer (e.g. the MVs of the above row of the current CTU row) can be used (e.g. the VB0 and VB1 in
In another example, for the MVs above the current CTU-row, the MVs of the above row of the current CTU and the right CTUs, and the MVs with the current CTU row can be used. The MV in the top-left CTUs cannot be used. In one embodiment, if the reference block is in the above CTU or the above-right CTUs, the 4-parameter affine model is used. If the reference block is in the top-left CTU, the affine model is not used. Otherwise, the 6-parameter affine model or 4-parameter affine model or other affine model can be used.
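The selection rule of this example can be written as a small decision function. The sketch below assumes that each block is identified by the (column, row) index of the CTU containing it and that only spatially neighbouring CTUs are queried; the function name `model_for_reference` and the returned strings are hypothetical.

```python
def model_for_reference(ref_ctu, cur_ctu):
    """Return which affine model may be derived from a reference block.

    ref_ctu and cur_ctu are (ctu_col, ctu_row) indices. The result is
    "4-parameter", "4- or 6-parameter", or None when the block cannot be used.
    """
    ref_col, ref_row = ref_ctu
    cur_col, cur_row = cur_ctu
    if ref_row == cur_row - 1:
        if ref_col >= cur_col:
            # Above CTU or above-right CTUs: only the one-row line buffer is available.
            return "4-parameter"
        # Top-left CTU: its MVs are not kept, so no affine candidate is derived.
        return None
    if ref_row == cur_row:
        # Inside the current CTU row all MVs are stored.
        return "4- or 6-parameter"
    raise ValueError("reference block is not in a neighbouring CTU")
```

For a current CTU at (3, 2), a reference block in CTU (3, 1) or (5, 1) yields the 4-parameter model, one in CTU (2, 1) is skipped, and one in CTU (2, 2) may use either model.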
In another example, the current region can be the current CTU and the left CTU. The MVs in current CTU, the MVs of the left CTU, and one MV row above current CTU, left CTU and right CTUs can be used. In one embodiment, if the reference block is in the above CTU row, the 4-parameter affine model is used. Otherwise, the 6-parameter affine model or 4-parameter affine model or other affine model can be used.
In another example, the current region can be the current CTU and the left CTU. The MVs in current CTU, the MVs of the left CTU, and one MV row above current CTU, left CTU and right CTUs can be used. The top-left neighbouring CU of the current CTU cannot be used to derive the affine parameters. In one embodiment, if the reference block is in the above CTU row or in the left CTU, the 4-parameter affine model is used. If the reference block is in the top-left CTU, the affine model is not used. Otherwise, the 6-parameter affine model or 4-parameter affine model or other affine model can be used.
In another example, the current region can be the current CTU. The MVs in the current CTU, the MVs of the left column of the current CTU, and the MVs of the above row of the current CTU can be used for deriving the affine parameters. The MVs of the above row of the current CTU may also include the MVs of the above row of the right CTUs. In one embodiment, the top-left neighbouring CU of the current CTU cannot be used for deriving the affine parameter. In one embodiment, if the reference block is in the above CTU row or in the left CTU, the 4-parameter affine model is used. If the reference block is in the top-left CTU, the affine model is not used. Otherwise, the 6-parameter affine model or 4-parameter affine model or other affine model can be used.
In another example, the current region can be the current CTU. The MVs in the current CTU, the MVs of the left column of the current CTU, the MVs of the above row of the current CTU and the top-left neighbouring MV of the current CTU can be used for deriving the affine parameters. The MVs of the above row of the current CTU may also include the MVs of the above row of the right CTUs. Note that, in one example, the MVs of the above row of the left CTU are not available. In another example, the MVs of the above row of the left CTU except for the top-left neighbouring MV of the current CTU are not available. In one embodiment, if the reference block is in the above CTU row or in the left CTU, the 4-parameter affine model is used. Otherwise, the 6-parameter affine model or 4-parameter affine model or other affine model can be used.
In another example, the current region can be the current CTU. The MVs in the current CTU, the MVs of the left column of the current CTU, the MVs of the above row of the current CTU (in one example, including the MVs of the above row of the right CTUs and the MVs of the above row of the left CTU), and the top-left neighbouring MV of the current CTU can be used for deriving the affine parameters. In one embodiment, the top-left neighbouring CU of the current CTU cannot be used to derive the affine parameters.
In another example, the current region can be the current CTU. The MVs in the current CTU, the MVs of the left column of the current CTU, and the MVs of the above row of the current CTU can be used for deriving the affine parameters. In another example, the MVs of the above row of the current CTU include the MVs of the above row of the right CTUs but exclude the MVs of the above row of the left CUs. In one embodiment, the top-left neighbouring CU of the current CTU cannot be used to derive the affine parameters.
For 4-parameter affine model, the MVx and MVy (vx and vy) are derived by four parameters (a, b, e, and f) such as the following equation.
According to the x and y position of a target point and the four parameters, the vx and vy can be derived. In four parameter model, the y-term parameter of vx is equal to x-term parameter of vy multiplied by −1. The x-term parameter of vx and y-term parameter of vy are the same. According to equation (4), the a can be (v1x−v0x)/w, b can be −(v1y−v0y)/w, e can be v0x, f can be v0y.
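As a concrete illustration, the four parameters can be computed from the two control-point MVs and evaluated at a target position. The sketch assumes the model has the form vx = a·x + b·y + e and vy = −b·x + a·y + f, which is the form implied by the parameter relations stated above; the function names are hypothetical.

```python
def four_param_from_cpmv(v0, v1, w):
    """Return (a, b, e, f) from the top-left and top-right control-point MVs."""
    a = (v1[0] - v0[0]) / w
    b = -(v1[1] - v0[1]) / w
    return a, b, v0[0], v0[1]


def mv_four_param(params, x, y):
    """Evaluate (vx, vy) at position (x, y) relative to the top-left corner."""
    a, b, e, f = params
    # The x-term of vy is -b and the y-term of vy is a, as required by the 4-parameter model.
    return (a * x + b * y + e, -b * x + a * y + f)


params = four_param_from_cpmv((1, 2), (9, 6), 8)
```

Evaluating the model at (0, 0) and (8, 0) returns the two control-point MVs (1, 2) and (9, 6), which is a quick consistency check on the parameter assignment.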
For 6-parameter affine model, the MVx and MVy (vx and vy) are derived by six parameters (a, b, c, d, e, and f) such as the following equation.
According to the x and y position of a target point and the six parameters, the vx and vy can be derived. In the six-parameter model, the y-term parameter of vx and the x-term parameter of vy are different. The x-term parameter of vx and the y-term parameter of vy are also different. According to equation (7), a can be (v1x−v0x)/w, b can be (v2x−v0x)/h, c can be (v1y−v0y)/w, d can be (v2y−v0y)/h, e can be v0x, and f can be v0y.
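The six-parameter case can be sketched in the same way, assuming the form vx = a·x + b·y + e and vy = c·x + d·y + f implied by the parameter list above. The helper `is_four_param_compatible` is a hypothetical addition that checks whether a six-parameter set happens to satisfy the constraints of the 4-parameter model.

```python
def six_param_from_cpmv(v0, v1, v2, w, h):
    """Return (a, b, c, d, e, f) from the three control-point MVs."""
    a = (v1[0] - v0[0]) / w
    b = (v2[0] - v0[0]) / h
    c = (v1[1] - v0[1]) / w
    d = (v2[1] - v0[1]) / h
    return a, b, c, d, v0[0], v0[1]


def mv_six_param(params, x, y):
    """Evaluate (vx, vy) at position (x, y) relative to the top-left corner."""
    a, b, c, d, e, f = params
    return (a * x + b * y + e, c * x + d * y + f)


def is_four_param_compatible(params):
    """True when the y-term of vx equals minus the x-term of vy and the other two terms match."""
    a, b, c, d, _, _ = params
    return b == -c and a == d


rigid = six_param_from_cpmv((1, 2), (9, 6), (-3, 10), 8, 8)
general = six_param_from_cpmv((1, 2), (9, 6), (1, 18), 8, 8)
```

The first parameter set describes the same motion as a 4-parameter model, while the second, with an independent bottom-left control point, does not.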
The proposed method that only uses partial MV information (e.g. only two MVs) to derive the affine parameters or control-point MVs/MVPs can be combined with the method that stores the affine control-point MVs separately. For example, a region is first defined. If the reference neighbouring block is in the same region (i.e., the current region), the stored control-point MVs of the reference neighbouring block can be used to derive the affine parameters or control-point MVs/MVPs of the current block. If the reference neighbouring block is not in the same region (i.e., in the neighbouring region), only the partial MV information (e.g. only two MVs of the neighbouring block) can be used to derive the affine parameters or control-point MVs/MVPs of the current block. The two MVs of the neighbouring block can be the two sub-block MVs of the neighbouring block. The region boundary can be CTU boundary, CTU-row boundary, tile boundary, or slice boundary. In one example, the region boundary can be CTU-row boundary. If the neighbouring reference block is not in the same region (e.g. the neighbouring reference block in the above CTU row), only the two MVs of the neighbouring block can be used to derive the affine parameters or control-point MVs/MVPs. The two MVs can be the bottom-left and the bottom-right sub-block MVs of the neighbouring block. In one example, if the neighbouring block is bi-predicted block, the List-0 and List-1 MVs of the bottom-left and the bottom-right sub-block MVs of the neighbouring block can be used to derive the affine parameters or control-point MVs/MVPs of the current block. Only the 4-parameter affine model is used. If the neighbouring reference block is in the same region (e.g. in the same CTU row with the current block), the stored control-point MVs of the neighbouring block can be used to derive the affine parameters or control-point MVs/MVPs of the current block. 
The 6-parameter affine model or 4-parameter affine model or other affine models can be used depending on the affine model used in the neighbouring block.
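The combined scheme can be sketched as follows. This is an illustration under several assumptions that are not specified in the text: the neighbouring block is described by a dictionary, the two line-buffer MVs are passed together with the sample positions they are attributed to, arithmetic is done in floating point, and all names (`model_from_neighbour`, `cpmv_candidate`, the dictionary keys) are hypothetical.

```python
def mv_at(model, x, y):
    """Evaluate an affine model given as (origin, mv0, (a, b, c, d)) at picture position (x, y)."""
    (ox, oy), (mx, my), (a, b, c, d) = model
    dx, dy = x - ox, y - oy
    return (mx + a * dx + b * dy, my + c * dx + d * dy)


def model_from_neighbour(nb, same_region):
    """Build the affine model of a neighbouring block from the MVs that are available."""
    if same_region:
        # Current region: the separately stored control-point MVs can be read.
        cp = nb["cpmvs"]
        a = (cp[1][0] - cp[0][0]) / nb["w"]
        c = (cp[1][1] - cp[0][1]) / nb["w"]
        if len(cp) == 3:  # neighbour coded with the 6-parameter model
            b = (cp[2][0] - cp[0][0]) / nb["h"]
            d = (cp[2][1] - cp[0][1]) / nb["h"]
        else:             # neighbour coded with the 4-parameter model
            b, d = -c, a
        return ((nb["x"], nb["y"]), cp[0], (a, b, c, d))
    # Neighbouring region: only two sub-block MVs from the line buffer, 4-parameter model only.
    (p0, mv0), (p1, mv1) = nb["bottom_left"], nb["bottom_right"]
    dist = p1[0] - p0[0]
    a = (mv1[0] - mv0[0]) / dist
    c = (mv1[1] - mv0[1]) / dist
    return (p0, mv0, (a, -c, c, a))


def cpmv_candidate(nb, cur, same_region):
    """Return the top-left, top-right and bottom-left control-point MV candidates of cur."""
    model = model_from_neighbour(nb, same_region)
    x, y, w, h = cur
    return [mv_at(model, x, y), mv_at(model, x + w, y), mv_at(model, x, y + h)]
```

The caller decides `same_region`, for example by comparing the CTU rows of the two blocks, so the same derivation code serves both cases and only the source of the MVs changes.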
This proposed method uses two neighbouring MVs to derive the 4-parameter affine candidate. In another embodiment, the two neighbouring MVs and one additional MV can be used to derive the 6-parameter affine candidate. The additional MV can be one of the neighbouring MVs or one of the temporal MVs. Therefore, even if the neighbouring block is in the above CTU row or not in the same region, the 6-parameter affine model can still be used to derive the affine parameters or control-point MVs/MVPs of the current block.
In one embodiment, a 4- or 6-parameter affine candidate is derived depending on the affine mode and/or the neighbouring CUs. For example, in affine AMVP mode, one flag or one syntax element is derived or signalled to indicate whether the 4- or 6-parameter model is used. The flag or syntax element can be signalled at the CU level, slice level, picture level or sequence level. If the 4-parameter affine mode is used, the above-mentioned method is used. If the 6-parameter affine mode is used and not all control-point MVs of the reference block are available (e.g. the reference block being in the above CTU row), the two neighbouring MVs and one additional MV are used to derive the 6-parameter affine candidate. If the 6-parameter affine mode is used and all control-point MVs of the reference block are available (e.g. the reference block being in the current CTU), the three control-point MVs of the reference block are used to derive the 6-parameter affine candidate.
In another example, the 6-parameter affine candidate is always used for affine Merge mode. In another example, the 6-parameter affine candidate is used when the referencing affine coded block is coded in the 6-parameter affine mode (e.g. 6-parameter affine AMVP mode or Merge mode). The 4-parameter affine candidate is used when the referencing affine coded block is coded in 4-parameter affine mode. For deriving the 6-parameter affine candidate, if not all control-point MVs of the reference block are available (e.g. the reference block being in the above CTU row), the two neighbouring MVs and one additional MV are used to derive the 6-parameter affine candidate. If all control-point MVs of the reference block are available (e.g. the reference block being in the current CTU), the three control-point MVs of the reference block are used to derive the 6-parameter affine candidate.
In one embodiment, the additional MV is from the neighbouring MVs. For example, if the MVs of the above CU are used, the MV of the bottom-left neighbouring MV (A0 or A1 in
In another embodiment, the additional MV is from the temporal collocated MVs. For example, the additional MV can be the Col-BR, Col-H, Col-BL, Col-A1, Col-A0, Col-B0, Col-B1, Col-TR in
In one embodiment, whether to use the spatial neighbouring MV or the temporal collocated MV depends on the spatial neighbouring and/or the temporal collocated block. In one example, if the spatial neighbouring MV is not available, the temporal collocated block is used. In another example, if the temporal collocated MV is not available, the spatial neighbouring block is used.
In affine motion modelling, control-point MVs are first derived. The current block is then divided into multiple sub-blocks. The representative MV of each sub-block is derived from the control-point MVs. In JEM, the Joint Exploration Test Model, the representative MV of each sub-block is used for motion compensation. The representative MV is derived by using the centre point of the sub-block. For example, for a 4×4 block, the (2, 2) sample of the 4×4 block is used to derive the representative MV. In the MV buffer storage, the representative MVs of the four corner sub-blocks of the current block are replaced by the control-point MVs. The stored MVs are used for MV referencing by the neighbouring blocks. This causes confusion since the stored MVs (e.g. the control-point MVs) and the compensation MVs (e.g. the representative MVs) for the four corners are different.
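The derivation of representative MVs from control-point MVs can be illustrated with a short sketch. It assumes the commonly used 4-parameter affine model with top-left and top-right control points, uses floating-point arithmetic, and ignores fractional MV precision and rounding; the function name `subblock_representative_mvs` is hypothetical.

```python
def subblock_representative_mvs(v0, v1, width, height, sub=4):
    """Return {(x0, y0): (mvx, mvy)} for every sub x sub sub-block.

    v0, v1: (x, y) control-point MVs at the top-left and top-right
        corners of the block (assumed 4-parameter affine model).
    Each representative MV is evaluated at the sub-block centre,
    e.g. sample (2, 2) of a 4x4 sub-block.
    """
    a = (v1[0] - v0[0]) / width  # horizontal gradient of mvx
    b = (v1[1] - v0[1]) / width  # horizontal gradient of mvy
    mvs = {}
    for y0 in range(0, height, sub):
        for x0 in range(0, width, sub):
            cx, cy = x0 + sub // 2, y0 + sub // 2
            mvs[(x0, y0)] = (v0[0] + a * cx - b * cy,
                             v0[1] + b * cx + a * cy)
    return mvs


# Example: a 16x16 block whose control-point MVs describe a pure zoom.
mvs = subblock_representative_mvs(v0=(0, 0), v1=(16, 0), width=16, height=16)
top_left = mvs[(0, 0)]    # evaluated at (2, 2), not at the corner (0, 0)
top_right = mvs[(12, 0)]  # evaluated at (14, 2), not at the corner (16, 0)
```

In this example the top-left sub-block has a representative MV that differs from the control-point MV `v0`, which is the mismatch between stored and compensation MVs described above.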
In this invention, it is proposed to store the representative MVs in the MV buffer instead of the control-point MVs for the four corners of the current block. In this way, there is no need to re-derive the compensation MVs for the four corner sub-blocks or to provide additional MV storage for the four corners. However, the affine MV derivation needs to be modified since the denominator of the scaling factor in the affine MV derivation is not a power-of-2 value. The modification can be addressed as follows. The reference sample positions in the equations are also modified according to embodiments of the present invention.
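A short calculation may help to show why the denominator stops being a power of 2 once representative MVs are stored. The sketch below assumes 4×4 sub-blocks whose representative MVs sit at the sub-block centres, so that the distance between the two stored corner MVs of a reference block is the block width minus 4; the function name is hypothetical.

```python
def corner_centre_distance(block_width, sub=4):
    """Horizontal distance between the centres of the top-left and
    top-right sub-blocks of a block (assumed sub x sub sub-blocks)."""
    left_centre = sub // 2
    right_centre = block_width - sub + sub // 2
    return right_centre - left_centre


# Power-of-2 block widths give distances of the form 2^n - 4.
distances = [corner_centre_distance(w) for w in (8, 16, 32, 64, 128)]
```

Under these assumptions the distances are 4, 12, 28, 60 and 124, whereas the distance between true corner control points would be the power-of-2 block width itself.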
In one embodiment, the control-point MVs of the corners of the current block (e.g. top-left/top-right/bottom-left/bottom-right samples of the current block) are derived as affine MVPs (e.g. AMVP MVP candidates and/or affine Merge candidates). From the control-point MVs, the representative MV of each sub-block is derived and stored. The representative MVs are used for MV/MVP derivation and MV coding of neighbouring blocks and collocated blocks.
In another embodiment, the representative MVs of some corner sub-blocks are derived as affine MVPs. From the representative MVs of the corner sub-blocks, the representative MV of each sub-block is derived and stored. The representative MVs are used for MV/MVP derivation and MV coding of neighbouring blocks and collocated blocks.
In this invention, to derive the affine control-point MVs, the MV difference (e.g. VB2_x−VB0′_x) is multiplied by a scaling factor (e.g. (posCurPU_Y−posB2_Y)/RefPUB_width and (posCurPU_Y−posB2_Y)/(posB3_X−posB2_X) in equation (9)). If the denominator of the scaling factor is a power-of-2 value, a simple multiplication and shift can be applied. However, if the denominator of the scaling factor is not a power-of-2 value, a divider is required. Usually, the implementation of a divider requires a lot of silicon area. To reduce the implementation cost, the divider can be replaced by a look-up table, a multiplier, and a shifter according to embodiments of the present invention. Since the denominator of the scaling factor is the control-point distance of the reference block, the value is smaller than the CTU size and is related to the possible CU sizes. Therefore, the possible values of the denominator of the scaling factor are limited. For example, the value can be a power of 2 minus 4, such as 4, 12, 28, 60 or 124. For these denominators (denoted as D), a list of beta values can be predefined. The “N/D” can be replaced by (N*K)>>L, where N is the numerator of the scaling factor and “>>” corresponds to the right shift operation. L can be a fixed value. K is related to D and can be derived from a look-up table. For example, for a fixed L, the K value depends on D and can be derived using Table 1 or Table 2 below. For example, L can be 10, in which case the K value is equal to {256, 85, 37, 17, 8} for D equal to {4, 12, 28, 60, 124}, respectively.
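The replacement of the divider by a look-up table, a multiplier and a shifter can be sketched as follows. The table holds the K values stated above for L equal to 10. The helper `build_k_table` rests on an assumption: rounding 2^L/D to the nearest integer happens to reproduce the listed values, but the text does not state how K is obtained. All function names are hypothetical.

```python
L_SHIFT = 10
# K values from the text for D in {4, 12, 28, 60, 124} with L = 10.
K_TABLE = {4: 256, 12: 85, 28: 37, 60: 17, 124: 8}


def build_k_table(denominators, shift):
    """Assumed construction: K = round(2^shift / D), integer only."""
    return {d: ((1 << shift) + d // 2) // d for d in denominators}


def approx_divide(n, d, shift=L_SHIFT, table=K_TABLE):
    """Approximate n / d as (n * K) >> shift using the look-up table.

    Python's >> on negative integers is an arithmetic shift, so
    negative numerators round toward minus infinity.
    """
    return (n * table[d]) >> shift


exact_case = approx_divide(8, 4)    # K = 256 is exactly 1024 / 4
approx_case = approx_divide(24, 12)  # K = 85 slightly underestimates 1024 / 12
```

Because K is a rounded reciprocal and the shift truncates, the result is only an approximation of N/D: `approx_divide(24, 12)` evaluates to 1 rather than 2. A larger L or a rounding offset before the shift would reduce this error, at the cost of wider multipliers; neither is specified in the text.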
In another embodiment, the scaling factor can be replaced by the factor derived using the MV scaling method as used in AMVP and/or Merge candidate derivation. The MV scaling module can be reused. For example, the motion vector, mv, is scaled as follows:
tx=(16384+(Abs(td)>>1))/td
distScaleFactor=Clip3(−4096, 4095, (tb*tx+32)>>6)
mv=Clip3(−32768, 32767, Sign(distScaleFactor*mvLX)*((Abs(distScaleFactor*mvLX)+127)>>8))
In the above equations, td is equal to the denominator and tb is equal to the numerator. For example, tb can be (posCurPU_Y−posB2_Y) and td can be (posB3_X−posB2_X) in equation (9).
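The three equations above can be written as a small routine. This is a sketch under stated assumptions: td is non-zero, “/” denotes integer division truncating toward zero (emulated by the hypothetical helper `div_trunc`, since Python's `//` rounds toward minus infinity), and `mv` stands for one MV component (mvLX in the equations).

```python
def clip3(lo, hi, v):
    """Clamp v to the inclusive range [lo, hi]."""
    return max(lo, min(hi, v))


def sign(v):
    return (v > 0) - (v < 0)


def div_trunc(a, b):
    """Integer division truncating toward zero (C-style)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def scale_mv(mv, tb, td):
    """Scale one MV component by approximately tb / td (td != 0)."""
    tx = div_trunc(16384 + (abs(td) >> 1), td)
    dist_scale_factor = clip3(-4096, 4095, (tb * tx + 32) >> 6)
    prod = dist_scale_factor * mv
    return clip3(-32768, 32767, sign(prod) * ((abs(prod) + 127) >> 8))


half = scale_mv(100, tb=2, td=4)      # scale by 2/4
doubled = scale_mv(10, tb=24, td=12)  # scale by 24/12, td not a power of 2
```

Here distScaleFactor represents tb/td in units of 1/256, so a non-power-of-2 denominator such as 12 is handled without a dedicated divider for the MV difference itself.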
Note that, in this invention, the derived control-point MVs or the affine parameters can be used for Inter mode coding as the MVP or for Merge mode coding as the affine Merge candidates.
Any of the foregoing proposed methods can be implemented in encoders and/or decoders. For example, any of the proposed methods can be implemented in an MV derivation module of an encoder, and/or an MV derivation module of a decoder. Alternatively, any of the proposed methods can be implemented as a circuit coupled to the MV derivation module of the encoder and/or the MV derivation module of the decoder, so as to provide the information needed by the MV derivation module.
The flowcharts shown are intended to illustrate an example of video coding according to the present invention. A person skilled in the art may modify each step, re-arrange the steps, split a step, or combine steps to practice the present invention without departing from the spirit of the present invention. In the disclosure, specific syntax and semantics have been used to illustrate examples to implement embodiments of the present invention. A skilled person may practice the present invention by substituting the syntax and semantics with equivalent syntax and semantics without departing from the spirit of the present invention.
The above description is presented to enable a person of ordinary skill in the art to practice the present invention as provided in the context of a particular application and its requirement. Various modifications to the described embodiments will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed. In the above detailed description, various specific details are illustrated in order to provide a thorough understanding of the present invention. Nevertheless, it will be understood by those skilled in the art that the present invention may be practiced.
Embodiments of the present invention as described above may be implemented in various hardware, software code, or a combination of both. For example, an embodiment of the present invention can be a circuit integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or a field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software code and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
The present invention claims priority to U.S. Provisional Patent Application, Ser. No. 62/687,291, filed on Jun. 20, 2018, U.S. Provisional Patent Application, Ser. No. 62/717,162, filed on Aug. 10, 2018 and U.S. Provisional Patent Application, Ser. No. 62/764,748, filed on Aug. 15, 2018. The U.S. Provisional Patent Applications are hereby incorporated by reference in their entireties.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2019/092079 | 6/20/2019 | WO | 00 |

Number | Date | Country |
---|---|---|
62764748 | Aug 2018 | US |
62717162 | Aug 2018 | US |
62687291 | Jun 2018 | US |