The present invention relates to video coding systems. In particular, the present invention relates to OBMC (Overlapped Block Motion Compensation) in a video coding system that uses various inter prediction coding tools with subblock processing.
Versatile video coding (VVC) is the latest international video coding standard developed by the Joint Video Experts Team (JVET) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). The standard has been published as an ISO standard: ISO/IEC 23090-3:2021, Information technology—Coded representation of immersive media—Part 3: Versatile video coding, published February 2021. VVC is developed based on its predecessor HEVC (High Efficiency Video Coding) by adding more coding tools to improve coding efficiency and also to handle various types of video sources including 3-dimensional (3D) video signals.
As shown in
The decoder, as shown in
According to VVC, an input picture is partitioned into non-overlapped square block regions referred to as CTUs (Coding Tree Units), similar to HEVC. Each CTU can be partitioned into one or multiple smaller size coding units (CUs). The resulting CU partitions can be in square or rectangular shapes. Also, VVC divides a CTU into prediction units (PUs) as a unit to apply a prediction process, such as Inter prediction, Intra prediction, etc.
The VVC standard incorporates various new coding tools to further improve the coding efficiency over the HEVC standard. Furthermore, various new coding tools have been proposed for consideration in the development of a new coding standard beyond the VVC. Among various new coding tools, the present invention provides some proposed methods to improve some of these coding tools.
A method and apparatus for video coding are disclosed. According to the method, input data associated with a current block is received, wherein the input data comprise pixel data for the current block to be encoded at an encoder side or coded data associated with the current block to be decoded at a decoder side. An inter prediction tool from a set of inter-prediction coding tools is determined for the current block. An OBMC (Overlapped Block Motion Compensation) subblock size for the current block is determined based on information related to the inter prediction tool selected for the current block or the inter prediction tool of a neighboring block. Subblock OBMC is applied to a subblock boundary between a neighboring subblock and a current subblock of the current block according to the OBMC subblock size.
In one embodiment, the OBMC subblock size is dependent on a smallest processing unit associated with the inter prediction tool selected for the current block.
In one embodiment, the inter prediction tool selected for the current block corresponds to a DMVR mode. For example, the OBMC subblock size is set to 8×8 if the inter prediction tool selected for the current block corresponds to the DMVR mode, and the OBMC subblock size is set to 4×4 if the inter prediction tool selected for the current block corresponds to an inter prediction tool other than the DMVR mode.
In one embodiment, the inter prediction tool selected for the current block corresponds to an affine mode. For example, the OBMC subblock size is set to 4×4 if the inter prediction tool selected for the current block corresponds to the affine mode, and the OBMC subblock size is set to 8×8 if the inter prediction tool selected for the current block corresponds to an inter prediction tool other than the affine mode.
In one embodiment, the inter prediction tool selected for the current block corresponds to an SbTMVP (Subblock-based Temporal Motion Vector Prediction) mode. For example, the OBMC subblock size is set to 4×4 if the inter prediction tool selected for the current block corresponds to the SbTMVP mode, and the OBMC subblock size is set to 8×8 if the inter prediction tool selected for the current block corresponds to an inter prediction tool other than the SbTMVP mode.
In one embodiment, the OBMC subblock size is set to 8×8 if the inter prediction tool selected for the current block corresponds to a DMVR mode, and the OBMC subblock size is set to 4×4 if the inter prediction tool selected for the current block corresponds to an affine mode or an SbTMVP mode.
In one embodiment, the inter prediction tool selected for the current block corresponds to a GPM (Geometric Partition Mode).
It will be readily understood that the components of the present invention, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of the embodiments of the systems and methods of the present invention, as represented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. References throughout this specification to “one embodiment,” “an embodiment,” or similar language mean that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present invention. Thus, appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or with other methods, components, etc. In other instances, well-known structures, or operations are not shown or described in detail to avoid obscuring aspects of the invention. The illustrated embodiments of the invention will be best understood by reference to the drawings, wherein like parts are designated by like numerals throughout. The following description is intended only by way of example, and simply illustrates certain selected embodiments of apparatus and methods that are consistent with the invention as claimed herein.
Overlapped Block Motion Compensation (OBMC)
Overlapped Block Motion Compensation (OBMC) finds a Linear Minimum Mean Squared Error (LMMSE) estimate of a pixel intensity value based on motion-compensated signals derived from its nearby block motion vectors (MVs). From an estimation-theoretic perspective, these MVs are regarded as different plausible hypotheses for its true motion, and to maximize coding efficiency, their weights should minimize the mean squared prediction error subject to the unit-gain constraint.
When High Efficiency Video Coding (HEVC) was developed, several proposals used OBMC to provide coding gain. Some of them are described as follows.
In JCTVC-C251 (Peisong Chen, et. al., “Overlapped block motion compensation in TMuC”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 3rd Meeting: Guangzhou, CN, 7-15 Oct. 2010, Document: JCTVC-C251), OBMC was applied to geometry partition. In geometry partition, it is very likely that a transform block contains pixels belonging to different partitions. In geometry partition, since two different motion vectors are used for motion compensation, the pixels at the partition boundary may have large discontinuities that can produce visual artifacts similar to blockiness. This in turn decreases the transform efficiency. Let the two regions created by a geometry partition be denoted by region 1 and region 2. A pixel from region 1 (2) is defined to be a boundary pixel if any of its four connected neighbors (left, top, right, and bottom) belongs to region 2 (1).
In JCTVC-F299 (Liwei Guo, et. al., “CE2: Overlapped Block Motion Compensation for 2N×N and N×2N Motion Partitions”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 6th Meeting: Torino, 14-22 Jul. 2011, Document: JCTVC-F299), OBMC was applied to symmetrical motion partitions. If a coding unit (CU) is partitioned into 2 2N×N or N×2N prediction units (PUs), OBMC is applied to the horizontal boundary of the two 2N×N prediction blocks, and the vertical boundary of the two N×2N prediction blocks. Since those partitions may have different motion vectors, the pixels at partition boundaries may have large discontinuities, which may generate visual artifacts and also reduce the transform/coding efficiency. In JCTVC-F299, OBMC is introduced to smooth the boundaries of motion partition.
Currently, the OBMC is performed after normal MC, and BIO is also applied in these two MC processes separately. That is, the MC results for the overlapped region between two CUs or PUs are generated by another process outside the normal MC process. BIO (Bi-Directional Optical Flow) is then applied to refine these two MC results. This can help to skip the redundant OBMC and BIO processes when two neighboring MVs are the same. However, the required bandwidth and MC operations for the overlapped region are increased compared to integrating the OBMC process into the normal MC process. For example, suppose the current PU size is 16×8, the overlapped region is 16×2, and the interpolation filter in MC is 8-tap. If the OBMC is performed after normal MC, then (16+7)×(8+7)+(16+7)×(2+7)=552 reference pixels per reference list are needed for the current PU and the related OBMC. If the OBMC operations are combined with normal MC into one stage, then only (16+7)×(8+2+7)=391 reference pixels per reference list are needed for the current PU and the related OBMC. Therefore, in the following, several methods are proposed to reduce the computation complexity or memory bandwidth of BIO when BIO and OBMC are enabled simultaneously.
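The reference pixel counts in this example can be reproduced with a short sketch (Python, for illustration; the function names are ours):

```python
# Reference pixels needed per reference list for an 8-tap interpolation
# filter (7 extra rows/columns), comparing two-stage OBMC (performed after
# normal MC) against OBMC combined with normal MC in one stage. The block
# sizes follow the 16x8 PU / 16x2 overlapped-region example in the text.

TAPS_EXTRA = 7  # an 8-tap filter needs 7 extra samples per dimension

def ref_pixels(width: int, height: int) -> int:
    """Reference block size needed to interpolate a width x height block."""
    return (width + TAPS_EXTRA) * (height + TAPS_EXTRA)

def two_stage(pu_w: int, pu_h: int, ovl_w: int, ovl_h: int) -> int:
    # OBMC performed after normal MC: the PU and the overlapped region
    # are fetched and interpolated separately.
    return ref_pixels(pu_w, pu_h) + ref_pixels(ovl_w, ovl_h)

def one_stage(pu_w: int, pu_h: int, ovl_w: int, ovl_h: int) -> int:
    # OBMC merged into normal MC: one taller fetch covers both regions
    # (here the overlapped region extends the PU vertically).
    return (pu_w + TAPS_EXTRA) * (pu_h + ovl_h + TAPS_EXTRA)

print(two_stage(16, 8, 16, 2))  # 552
print(one_stage(16, 8, 16, 2))  # 391
```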
In the JEM (Joint Exploration Model), the OBMC is also applied. In the JEM, unlike in H.263, OBMC can be switched on and off using syntax at the CU level. When OBMC is used in the JEM, the OBMC is performed for all motion compensation (MC) block boundaries except for the right and bottom boundaries of a CU. Moreover, it is applied for both the luma and chroma components. In the JEM, an MC block corresponds to a coding block. When a CU is coded with a sub-CU mode (including the sub-CU merge, affine and FRUC modes), each sub-block of the CU is an MC block. To process CU boundaries in a uniform fashion, OBMC is performed at the sub-block level for all MC block boundaries, where the sub-block size is set equal to 4×4, as illustrated in Fig. A-B.
When OBMC is applied to the current sub-block, besides the current motion vectors, motion vectors of four connected neighboring sub-blocks, if available and not identical to the current motion vector, are also used to derive the prediction block for the current sub-block. These multiple prediction blocks based on multiple motion vectors are combined to generate the final prediction signal of the current sub-block. A prediction block based on motion vectors of a neighboring sub-block is denoted as PN, with N indicating an index for the above, below, left and right neighboring sub-blocks, and the prediction block based on motion vectors of the current sub-block is denoted as PC.
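As an illustration of how PN and PC are combined, the sketch below blends the neighboring prediction into the rows nearest the top sub-block boundary. The per-row weights {1/4, 1/8, 1/16, 1/32} for PN are the luma weights used in the JEM; they are assumed here for illustration, and the function name is ours:

```python
# Weight applied to PN for each of the rows nearest the boundary; the
# remaining weight (1 - w) goes to PC. These values follow the JEM luma
# weighting and are assumptions for this sketch.
PN_WEIGHTS = [1 / 4, 1 / 8, 1 / 16, 1 / 32]

def blend_above(pc, pn):
    """pc, pn: 2-D lists of the same size. pn is the prediction derived
    with the above neighbour's MV; pc uses the current MV. Returns the
    blended prediction, weighting rows nearest the top boundary toward pn."""
    out = [row[:] for row in pc]
    for r in range(min(len(PN_WEIGHTS), len(pc))):
        w = PN_WEIGHTS[r]
        out[r] = [(1 - w) * c + w * n for c, n in zip(pc[r], pn[r])]
    return out
```

Blending at the left boundary follows the same pattern column-wise (e.g. by transposing the blocks before and after the call).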
In the JEM, for a CU with size less than or equal to 256 luma samples, a CU level flag is signaled to indicate whether OBMC is applied or not for the current CU. For the CUs with size larger than 256 luma samples or not coded with the AMVP mode, OBMC is applied by default. At the encoder, when OBMC is applied for a CU, its impact is taken into account during the motion estimation stage. The prediction signal formed by OBMC using motion information of the top neighboring block and the left neighboring block is used to compensate the top and left boundaries of the original signal of the current CU, and then the normal motion estimation process is applied.
In JEM (Joint Exploration Model for VVC development), the OBMC is applied. For example, as shown in
For an M×N block, if the MV is not integer and an 8-tap interpolation filter is applied, a reference block with size of (M+7)×(N+7) is used for motion compensation. However, if the BIO and OBMC are applied, additional reference pixels are required, which increases the worst case memory bandwidth.
There are two different schemes to implement OBMC.
In the first scheme, OBMC blocks are pre-generated when doing motion compensation for each block. These OBMC blocks will be stored in a local buffer for neighboring blocks. In the second scheme, the OBMC blocks are generated before the blending process of each block when doing OBMC.
In both schemes, several methods are proposed to reduce the computation complexity, especially for the interpolation filtering, and additional bandwidth requirement of OBMC.
Decoder Side Motion Vector Refinement (DMVR) in VVC
In order to increase the accuracy of the MVs of the merge mode, a bilateral-matching (BM) based decoder side motion vector refinement is applied in VVC. In bi-prediction operation, a refined MV is searched around the initial MVs (732 and 734) in the reference picture list L0 712 and reference picture list L1 714 for a current block 720 in the current picture 710. The collocated blocks 722 and 724 in L0 and L1 are determined according to the initial MVs (732 and 734) and the location of the current block 720 in the current picture as shown in
In VVC, the application of DMVR is restricted and is only applied for the CUs which are coded with the following modes and features:
The refined MV derived by the DMVR process is used to generate the inter prediction samples and is also used in temporal motion vector prediction for future picture coding, while the original MV is used in the deblocking process and in spatial motion vector prediction for future CU coding.
The additional features of DMVR are mentioned in the following sub-clauses.
DMVR Searching Scheme
In DMVR, the search points surround the initial MV and the MV offsets obey the MV difference mirroring rule. In other words, any point checked by DMVR, denoted by a candidate MV pair (MV0′, MV1′), obeys the following two equations:
MV0′=MV0+MV_offset, (1)
MV1′=MV1−MV_offset. (2)
where MV_offset represents the refinement offset between the initial MV and the refined MV in one of the reference pictures. The refinement search range is two integer luma samples from the initial MV. The searching includes the integer sample offset search stage and the fractional sample refinement stage.
A twenty-five (25) point full search is applied for integer sample offset searching. The SAD of the initial MV pair is first calculated. If the SAD of the initial MV pair is smaller than a threshold, the integer sample stage of DMVR is terminated. Otherwise, the SADs of the remaining 24 points are calculated and checked in the raster scanning order. The point with the smallest SAD is selected as the output of the integer sample offset searching stage. To reduce the penalty of the uncertainty of DMVR refinement, it is proposed to favour the original MV during the DMVR process: the SAD between the reference blocks referred to by the initial MV candidates is decreased by ¼ of the SAD value.
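The integer search stage described above can be sketched as follows (Python; the `sad` callback is supplied by the caller, and the threshold handling and the exact form of the ¼-SAD bias are simplifications of the actual VVC process):

```python
def dmvr_integer_search(sad, threshold):
    """sad(dx, dy) returns the bilateral-matching SAD for an integer MV
    offset (dx, dy). Returns the best integer offset within the +/-2
    luma-sample search range around the initial MV."""
    center_cost = sad(0, 0)
    if center_cost < threshold:
        return (0, 0)                    # early termination
    center_cost -= center_cost >> 2      # favour the original MV (1/4 SAD)
    best, best_cost = (0, 0), center_cost
    for dy in range(-2, 3):              # raster scanning order
        for dx in range(-2, 3):
            if (dx, dy) == (0, 0):
                continue                 # center already evaluated
            cost = sad(dx, dy)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best
```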
The integer sample search is followed by fractional sample refinement. To save the computational complexity, the fractional sample refinement is derived by using a parametric error surface equation, instead of additional search with SAD comparison. The fractional sample refinement is conditionally invoked based on the output of the integer sample search stage. When the integer sample search stage is terminated with center having the smallest SAD in either the first iteration or the second iteration search, the fractional sample refinement is further applied.
In parametric error surface based sub-pixel offsets estimation, the center position cost and the costs at four neighboring positions from the center are used to fit a 2-D parabolic error surface equation of the following form
E(x,y)=A(x−xmin)²+B(y−ymin)²+C, (3)
where (xmin, ymin) corresponds to the fractional position with the least cost and C corresponds to the minimum cost value. By solving the above equation using the cost values of the five search points, (xmin, ymin) is computed as:
xmin=(E(−1,0)−E(1,0))/(2(E(−1,0)+E(1,0)−2E(0,0))), (4)
ymin=(E(0,−1)−E(0,1))/(2(E(0,−1)+E(0,1)−2E(0,0))). (5)
The values of xmin and ymin are automatically constrained to be between −8 and 8 since all cost values are positive and the smallest value is E(0,0). This corresponds to a half-pel offset with 1/16th-pel MV accuracy in VVC. The computed fractional (xmin, ymin) are added to the integer distance refinement MV to get the sub-pixel accurate refinement delta MV.
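A sketch of the fractional refinement, assuming costs are supplied for the center and its four neighbors (the rounding to 1/16-pel units and the clamping follow the description above):

```python
def frac_refine(e_c, e_l, e_r, e_t, e_b):
    """Sub-pel offset from the 2-D parabolic error surface, returned in
    1/16-pel units. e_c = E(0,0); e_l, e_r = E(-1,0), E(1,0);
    e_t, e_b = E(0,-1), E(0,1)."""
    def axis(lo, hi):
        denom = 2 * (lo + hi - 2 * e_c)       # curvature term of eqs. (4)/(5)
        if denom == 0:
            return 0
        off = (lo - hi) / denom               # offset in luma samples
        return max(-8, min(8, round(off * 16)))  # 1/16-pel, clamped to [-8, 8]
    return axis(e_l, e_r), axis(e_t, e_b)
```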
Bilinear-Interpolation and Sample Padding
In VVC, the resolution of the MVs is 1/16 luma samples. The samples at the fractional positions are interpolated using an 8-tap interpolation filter. In DMVR, the search points surround the initial fractional-pel MV with an integer sample offset, therefore the samples at those fractional positions need to be interpolated for the DMVR search process. To reduce the computational complexity, the bi-linear interpolation filter is used to generate the fractional samples for the searching process in DMVR. Another important effect is that by using the bi-linear filter with a 2-sample search range, the DMVR does not access more reference samples compared to the normal motion compensation process. After the refined MV is obtained with the DMVR search process, the normal 8-tap interpolation filter is applied to generate the final prediction. In order not to access more reference samples than the normal MC process, the samples which are not needed for the interpolation process based on the original MV but are needed for the interpolation process based on the refined MV will be padded from the available samples.
When the width and/or height of a CU is larger than 16 luma samples, the CU is further split into subblocks with width and/or height equal to 16 luma samples. The maximum unit size for the DMVR searching process is limited to 16×16.
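The splitting rule can be sketched as follows (the function name is ours):

```python
def dmvr_subblocks(cu_w, cu_h, max_unit=16):
    """Split a CU into DMVR processing units no larger than 16x16.
    Returns (x, y, width, height) tuples in raster order, with (x, y)
    relative to the top-left corner of the CU."""
    sb_w = min(cu_w, max_unit)   # dimensions <= 16 are kept as-is
    sb_h = min(cu_h, max_unit)
    return [(x, y, sb_w, sb_h)
            for y in range(0, cu_h, sb_h)
            for x in range(0, cu_w, sb_w)]
```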
Affine Motion Compensated Prediction in VVC
In HEVC, only a translational motion model is applied for motion compensation prediction (MCP), while in the real world there are many kinds of motion, e.g., zoom in/out, rotation, perspective motion and other irregular motions. In VVC, a block-based affine transform motion compensation prediction is applied. As shown in Fig. A-B, the affine motion field of the block is described by motion information of two control points (4-parameter) in
For 4-parameter affine motion model, motion vector at sample location (x, y) in a block is derived as:
For 6-parameter affine motion model, motion vector at sample location (x, y) in a block is derived as:
Where (mv0x, mv0y) is the motion vector of the top-left corner control point, (mv1x, mv1y) is the motion vector of the top-right corner control point, and (mv2x, mv2y) is the motion vector of the bottom-left corner control point.
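Since the equations themselves are not reproduced above, the following sketch shows the standard 4-parameter and 6-parameter affine MV derivation implied by this description (floating-point for clarity; VVC uses fixed-point arithmetic):

```python
def affine_mv(x, y, w, h, mv0, mv1, mv2=None):
    """Motion vector at sample location (x, y) of a w x h block.
    mv0, mv1, mv2 are the top-left, top-right and bottom-left control
    point motion vectors as (mvx, mvy); mv2 is None for the 4-parameter
    model."""
    ax = (mv1[0] - mv0[0]) / w          # horizontal gradient of mvx
    ay = (mv1[1] - mv0[1]) / w          # horizontal gradient of mvy
    if mv2 is None:                     # 4-parameter: rotation + zoom only,
        bx, by = -ay, ax                # so vertical gradients mirror the
    else:                               # horizontal ones.
        bx = (mv2[0] - mv0[0]) / h      # 6-parameter: independent vertical
        by = (mv2[1] - mv0[1]) / h      # gradients from the bottom-left CPMV
    return (mv0[0] + ax * x + bx * y,
            mv0[1] + ay * x + by * y)
```

As a sanity check, the model reproduces the control point MVs at the corresponding corners: (0, 0) yields mv0, (w, 0) yields mv1, and for the 6-parameter model (0, h) yields mv2.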
In order to simplify the motion compensation prediction, block based affine transform prediction is applied. To derive the motion vector of each 4×4 luma subblock, the motion vector of the center sample of each subblock, as shown in
As done for translational motion inter prediction, there are also two affine motion inter prediction modes: affine merge mode and affine AMVP mode.
Affine Merge Prediction
AF_MERGE (i.e., Affine Merge) mode can be applied for CUs with both width and height larger than or equal to 8. In this mode, the CPMVs of the current CU are generated based on the motion information of the spatial neighboring CUs. There can be up to five CPMVP candidates, and an index is signaled to indicate the one to be used for the current CU. The following three types of CPMV candidates are used to form the affine merge candidate list:
In VVC, there are two inherited affine candidates at most, which are derived from the affine motion model of the neighboring blocks, one from left neighboring CUs and one from above neighboring CUs. The candidate blocks are shown in
A constructed affine candidate means that the candidate is constructed by combining the neighboring translational motion information of each control point. The motion information for the control points is derived from the specified spatial neighbors and temporal neighbor of a current block 1210 as shown in
After MVs of four control points are attained, affine merge candidates are constructed based on the motion information of these control points. The following combinations of control point MVs are used to construct in order:
The combination of three CPMVs constructs a 6-parameter affine merge candidate and the combination of two CPMVs constructs a 4-parameter affine merge candidate. To avoid motion scaling process, if the reference indices of control points are different, the related combination of control point MVs is discarded.
After the inherited affine merge candidates and constructed affine merge candidates are checked, if the list is still not full, zero MVs are inserted at the end of the list.
Affine AMVP Prediction
Affine AMVP mode can be applied to CUs with both width and height larger than or equal to 16. An affine flag at the CU level is signaled in the bitstream to indicate whether affine AMVP mode is used, and then another flag is signaled to indicate whether the 4-parameter or the 6-parameter affine model is used. In this mode, the difference of the CPMVs of the current CU and their predictors (CPMVPs) is signaled in the bitstream. The affine AMVP candidate list size is 2, and it is generated by using the following four types of CPMV candidates in order:
The checking order of inherited affine AMVP candidates is the same as the checking order of inherited affine merge candidates. The only difference is that, for the AMVP candidate, only the affine CU that has the same reference picture as the current block is considered. No pruning process is applied when inserting an inherited affine motion predictor into the candidate list.
Constructed AMVP candidate is derived from the specified spatial neighbors shown in
If the number of affine AMVP list candidates is still less than 2 after valid inherited affine AMVP candidates and constructed AMVP candidates are inserted, mv0, mv1 and mv2 will be added, in order, as the translational MVs to predict all control point MVs of the current CU, when available. Finally, zero MVs are used to fill the affine AMVP list if it is still not full.
Affine Motion Information Storage
In VVC, the CPMVs of affine CUs are stored in a separate buffer. The stored CPMVs are only used to generate the inherited CPMVPs in the affine merge mode and affine AMVP mode for the subsequently coded CUs. The subblock MVs derived from CPMVs are used for motion compensation, MV derivation of the merge/AMVP list of translational MVs, and de-blocking.
To avoid the picture line buffer for the additional CPMVs, affine motion data inheritance from the CUs of the above CTU is treated differently from the inheritance from the normal neighboring CUs. If the candidate CU for affine motion data inheritance is in the above CTU line, the bottom-left and bottom-right subblock MVs in the line buffer, instead of the CPMVs, are used for the affine MVP derivation. In this way, the CPMVs are only stored in a local buffer. If the candidate CU is 6-parameter affine coded, the affine model is degraded to the 4-parameter model. As shown in
Prediction Refinement with Optical Flow for Affine Mode
Subblock based affine motion compensation can save memory access bandwidth and reduce computation complexity compared to pixel based motion compensation, at the cost of a prediction accuracy penalty. To achieve a finer granularity of motion compensation, prediction refinement with optical flow (PROF) is used to refine the subblock based affine motion compensated prediction without increasing the memory access bandwidth for motion compensation. In VVC, after the subblock based affine motion compensation is performed, the luma prediction sample is refined by adding a difference derived by the optical flow equation. The PROF is described in the following four steps:
gx(i,j)=(I(i+1,j)>>shift1)−(I(i−1,j)>>shift1) (8)
gy(i,j)=(I(i,j+1)>>shift1)−(I(i,j−1)>>shift1) (9)
ΔI(i,j)=gx(i,j)*Δvx(i,j)+gy(i,j)*Δvy(i,j) (10)
Since the affine model parameters and the sample location relative to the subblock center are not changed from subblock to subblock, Δv(i,j) can be calculated for the first subblock, and reused for other subblocks in the same CU. Let dx(i, j) and dy(i, j) be the horizontal and vertical offsets from the sample location (i,j) to the center of the subblock (xSB, ySB), Δv(x, y) can be derived by the following equation,
In order to keep accuracy, the center of the subblock (xSB, ySB) is calculated as ((WSB−1)/2, (HSB−1)/2), where WSB and HSB are the subblock width and height, respectively.
For 4-parameter affine model,
For 6-parameter affine model,
where (v0x, v0y), (v1x, v1y), (v2x, v2y) are the top-left, top-right and bottom-left control point motion vectors, w and h are the width and height of the CU.
The fourth step of PROF is as following:
I′(i,j)=I(i,j)+ΔI(i,j) (15)
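The PROF steps above can be sketched as follows (a simplification: the value of shift1 is assumed, Δv is taken as given, and border samples are left unrefined instead of padded as in VVC):

```python
def prof_refine(pred, dvx, dvy, shift1=6):
    """Sketch of PROF refinement. pred is the subblock prediction I as a
    2-D list indexed [y][x]; dvx and dvy hold the per-sample MV difference
    delta-v(i,j) between the sample MV and the subblock MV. The value of
    shift1 (gradient precision shift) is an assumption. Border samples are
    left unrefined for brevity; VVC instead extends the prediction by one
    sample on each side."""
    h, w = len(pred), len(pred[0])
    out = [row[:] for row in pred]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (pred[y][x + 1] >> shift1) - (pred[y][x - 1] >> shift1)  # eq. (8)
            gy = (pred[y + 1][x] >> shift1) - (pred[y - 1][x] >> shift1)  # eq. (9)
            out[y][x] = pred[y][x] + gx * dvx[y][x] + gy * dvy[y][x]      # eqs. (10), (15)
    return out
```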
PROF is not applied in two cases for an affine coded CU: 1) all control point MVs are the same, which indicates the CU only has translational motion; 2) the affine motion parameters are greater than a specified limit because the subblock based affine MC is degraded to CU based MC to avoid large memory access bandwidth requirement.
A fast encoding method is applied to reduce the encoding complexity of affine motion estimation with PROF. PROF is not applied at the affine motion estimation stage in the following two situations: a) if this CU is not the root block and its parent block does not select the affine mode as its best mode, PROF is not applied since the possibility for the current CU to select the affine mode as the best mode is low; and b) if the magnitudes of the four affine parameters (C, D, E, F) are all smaller than a predefined threshold and the current picture is not a low delay picture, PROF is not applied because the improvement introduced by PROF is small for this case. In this way, the affine motion estimation with PROF can be accelerated.
Subblock-Based Temporal Motion Vector Prediction (SbTMVP) in VVC
VVC supports the subblock-based temporal motion vector prediction (SbTMVP) method. Similar to the temporal motion vector prediction (TMVP) in HEVC, SbTMVP uses the motion field in the collocated picture to improve motion vector prediction and merge mode for CUs in the current picture. The same collocated picture used by TMVP is used for SbTMVP. SbTMVP differs from TMVP in the following two main aspects:
The SbTMVP process is illustrated in
In the second step, the motion shift identified in Step 1 is applied (i.e. added to the current block's coordinates) to obtain sub-CU level motion information (motion vectors and reference indices) from the collocated picture as shown in
In VVC, a combined subblock based merge list, which contains both SbTMVP candidate and affine merge candidates, is used for the signaling of subblock based merge mode. The SbTMVP mode is enabled/disabled by a sequence parameter set (SPS) flag. If the SbTMVP mode is enabled, the SbTMVP predictor is added as the first entry of the list of subblock based merge candidates, and followed by the affine merge candidates. The size of subblock based merge list is signaled in SPS and the maximum allowed size of the subblock based merge list is 5 in VVC.
The sub-CU size used in SbTMVP is fixed to be 8×8, and as done for the affine merge mode, the SbTMVP mode is only applicable to CUs with both width and height larger than or equal to 8.
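A sketch of where sub-CU motion is fetched in the collocated picture, per the two-step process above (the center-sample convention and the function name are assumptions of this sketch):

```python
def sbtmvp_subcu_centers(cu_x, cu_y, cu_w, cu_h, motion_shift):
    """Collocated-picture positions at which sub-CU motion is fetched:
    the centre of each fixed 8x8 sub-CU, displaced by the motion shift
    (the spatial neighbour's MV, here in integer luma samples)."""
    if cu_w < 8 or cu_h < 8:
        return []                 # SbTMVP requires width and height >= 8
    mx, my = motion_shift
    return [(cu_x + x + 4 + mx, cu_y + y + 4 + my)  # +4: 8x8 sub-CU centre
            for y in range(0, cu_h, 8)
            for x in range(0, cu_w, 8)]
```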
The encoding processing flow of the additional SbTMVP merge candidate is the same as for the other merge candidates, that is, for each CU in P or B slice, an additional RD check is performed to decide whether to use the SbTMVP candidate.
The motion unit differs among coding tools. For example, the affine mode uses a 4×4 subblock and the multi-pass DMVR uses an 8×8 subblock. Subblock-boundary OBMC uses different motions to perform MC to refine each subblock predictor so as to reduce discontinuity/blocking artefacts at subblock boundaries. However, in the current Enhanced Compression Model (ECM) for international video coding standard development beyond the VVC, subblock-boundary OBMC treats all motion units as a 4×4 subblock size in the affine mode and in the multi-pass DMVR mode. Therefore, subblock-boundary OBMC may not treat the subblock boundary properly. This issue may also exist in other prediction coding tools supporting subblock processing.
A new adaptive OBMC subblock size method is proposed. In this method, when OBMC is applied to the current block, the OBMC subblock size may be changed according to information related to the inter prediction tool selected for the current block (for example, its current block prediction information, current block mode information, current block size, current block shape or any other information related to the inter prediction tool selected for the current block), information related to the inter prediction tool of a neighboring block (for example, neighboring block information, neighboring block size, neighboring block shape or any other information related to the inter prediction tool of a neighboring block), cost metrics, or any combination of them. The OBMC subblock size can be matched to the smallest (or finest) motion changing unit in different prediction modes, or it can always be the same OBMC subblock size regardless of the prediction mode. The motion changing unit is also referred to as the motion processing unit.
In one embodiment, when the current block is coded in the DMVR mode, the OBMC subblock size is set to be M1×N1 (M1 and N1 being non-negative integers) for luma, depending on the smallest motion changing unit in the DMVR mode. For example, the OBMC subblock size for the DMVR mode can be set to 8×8 while the OBMC subblock size for other coding modes is always set to M2×N2 (M2 and N2 being non-negative integers) for luma. For example, the OBMC subblock size can be 4×4 for other modes.
In another embodiment, when the current block is coded in the affine mode, the OBMC subblock size is set to be M1×N1 (M1 and N1 being non-negative integers) for luma, depending on the smallest motion changing unit in the affine mode. For example, the OBMC subblock size for the affine mode can be set to 4×4, while the OBMC subblock size for other modes is always set to M2×N2 (M2 and N2 being non-negative integers) for luma. For example, the OBMC subblock size can be 4×4 or 8×8 for other coding modes.
In another embodiment, when the current block is coded in the SbTMVP mode, the OBMC subblock size is set to be M1×N1 (M1 and N1 being non-negative integers) for luma, depending on the smallest motion changing unit in the SbTMVP mode. For example, the OBMC subblock size for the SbTMVP mode can be set to 4×4, while the OBMC subblock size for other modes is always set to M2×N2 (M2 and N2 being non-negative integers) for luma. For example, the OBMC subblock size can be 4×4 or 8×8 for other coding modes.
In another embodiment, when the current block is coded in prediction modes that refine motion at the subblock level, the OBMC subblock size is set to the motion changing subblock size for luma, depending on the smallest motion changing unit in each prediction mode. For example, the 8×8 OBMC subblock size can be used for the current block coded in the DMVR mode, and the 4×4 OBMC subblock size can be used for the current block coded in the affine mode or SbTMVP mode.
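The mode-dependent selection in the embodiments above can be sketched as a simple mapping (the mode labels are hypothetical identifiers, and the sizes follow the examples given in the text):

```python
# OBMC subblock size matched to the smallest motion changing unit of the
# prediction tool, per the embodiments described above.
SMALLEST_MOTION_UNIT = {
    "dmvr":   (8, 8),   # multi-pass DMVR refines motion per 8x8 subblock
    "affine": (4, 4),   # affine derives one MV per 4x4 luma subblock
    "sbtmvp": (4, 4),   # SbTMVP, per the example in the text
}

def obmc_subblock_size(mode, default=(4, 4)):
    """Luma OBMC subblock size (M x N) for a given prediction mode;
    modes without a subblock motion refinement fall back to the default."""
    return SMALLEST_MOTION_UNIT.get(mode, default)
```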
In another embodiment, when the current block is coded in a Geometric Prediction Mode (GPM) or partitioned into a geometric shape, the OBMC subblock size is set to the motion changing subblock size for luma, depending on the smallest motion changing unit in its prediction mode or its partition shape.
In another embodiment, when a neighboring block is coded in a prediction mode that refines motion at the subblock level and OBMC is applied to the current block, the OBMC subblock size for the current block is set to be the motion changing subblock size for luma, depending on the smallest motion changing unit in each prediction mode from the neighboring blocks or from the current block. For example, an 8×8 OBMC subblock size is used for blocks coded in the DMVR mode, and a 4×4 OBMC subblock size is used for blocks coded in the affine mode or the SbTMVP mode.
In another embodiment, when a neighboring block is coded in a Geometric Prediction Mode (GPM) or partitioned in a geometric shape such that its motion can vary within a geometric region, the OBMC subblock size for the current block is set to be the motion changing subblock size for luma, depending on the smallest motion changing unit in each prediction mode from a neighboring block or from the current block. For example, an 8×8 OBMC subblock size is used for blocks coded in the DMVR mode, and a 4×4 OBMC subblock size is used for blocks coded in the affine mode or the SbTMVP mode.
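When the OBMC subblock size depends on both sides of a boundary, one plausible reading of "the smallest motion changing unit in each prediction mode from a neighboring block or from the current block" is to take the component-wise minimum of the two units. The sketch below illustrates that reading under stated assumptions; the mode table, the `min` policy, and the function name are hypothetical, not taken from any standard.

```python
# Illustrative sketch: pick the OBMC subblock size at a boundary from the
# smallest motion changing unit among the current block's mode and the
# neighboring block's mode. All names and sizes are placeholders.

MOTION_UNIT = {
    "DMVR": (8, 8),     # DMVR refines motion on 8x8 units
    "AFFINE": (4, 4),   # affine motion varies per 4x4 subblock
    "SBTMVP": (4, 4),   # SbTMVP motion varies per 4x4 subblock
    "GPM": (4, 4),      # geometric partition: motion varies per 4x4 region
}

def boundary_obmc_subblock(current_mode: str, neighbor_mode: str,
                           default=(4, 4)) -> tuple:
    """Use the smallest motion changing unit from either side of the boundary."""
    cur = MOTION_UNIT.get(current_mode, default)
    nbr = MOTION_UNIT.get(neighbor_mode, default)
    return (min(cur[0], nbr[0]), min(cur[1], nbr[1]))
```

For example, a DMVR block next to an affine neighbor would use 4×4 OBMC subblocks under this policy, while two adjacent DMVR blocks would use 8×8.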
In another embodiment, when OBMC is applied to the current block, neighboring reconstructed samples may be used to calculate a cost for deciding the OBMC subblock size. For example, the template matching method or the bilateral matching method can be used to calculate the cost and determine the smallest motion changing unit accordingly.
In another embodiment, when OBMC is applied to the current block, template matching is performed for each subblock to calculate the cost between the reconstructed samples and the reference samples of the subblock above or to the left of the current subblock. If the cost is smaller than a threshold, the OBMC subblock size is enlarged since the motion similarity is high. Otherwise (i.e., the cost being larger than or equal to the threshold), the OBMC subblock size is kept unchanged since the neighboring motion and the current motion are not similar.
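The cost-based adaptation above can be sketched as follows. This is a hypothetical illustration: the SAD cost, the threshold value, and the doubling policy are assumptions for the example, not values specified by this disclosure.

```python
# Illustrative sketch: a template matching cost (SAD between reconstructed
# neighboring samples and the reference samples that predict them) decides
# whether to enlarge the OBMC subblock. Threshold and sizes are placeholders.

def sad(a, b):
    """Sum of absolute differences between two equal-length sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))

def adapt_obmc_subblock(recon_template, ref_template,
                        base_size=(4, 4), threshold=64):
    """Enlarge the OBMC subblock when neighboring and current motion agree.

    A small template cost means the neighboring motion predicts the current
    reconstruction well, so a coarser (larger) OBMC subblock suffices;
    otherwise the base size is kept unchanged.
    """
    cost = sad(recon_template, ref_template)
    if cost < threshold:
        return (base_size[0] * 2, base_size[1] * 2)  # e.g. 4x4 -> 8x8
    return base_size
```

With matching templates the cost is zero and the subblock grows from 4×4 to 8×8; with dissimilar templates the cost exceeds the threshold and the 4×4 size is retained.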
Any of the foregoing proposed methods can be implemented in encoders and/or decoders. For example, at the encoder side, the required OBMC and related processing can be implemented in a predictor derivation module, such as part of the Inter-Pred. unit 112 as shown in
The flowchart shown is intended to illustrate an example of video coding according to the present invention. A person skilled in the art may modify each step, re-arrange the steps, split a step, or combine steps to practice the present invention without departing from the spirit of the present invention. In the disclosure, specific syntax and semantics have been used to illustrate examples to implement embodiments of the present invention. A skilled person may practice the present invention by substituting the syntax and semantics with equivalent syntax and semantics without departing from the spirit of the present invention.
The above description is presented to enable a person of ordinary skill in the art to practice the present invention as provided in the context of a particular application and its requirement. Various modifications to the described embodiments will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed. In the above detailed description, various specific details are illustrated in order to provide a thorough understanding of the present invention. Nevertheless, it will be understood by those skilled in the art that the present invention may be practiced without some of these specific details.
Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both. For example, an embodiment of the present invention can be one or more circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or a field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
The present invention is a non-Provisional application of and claims priority to U.S. Provisional Patent Application No. 63/329,509, filed on Apr. 11, 2022. The U.S. Provisional Patent Application is hereby incorporated by reference in its entirety.
Number | Date | Country
---|---|---
63329509 | Apr 2022 | US