The present invention relates to a video coding device, a video decoding device, and a video system using block based affine transform motion compensated prediction.
As a video coding scheme, a scheme based on the HEVC (High Efficiency Video Coding) standard is described in Non Patent Literature (NPL) 1. NPL 2 discloses a block based affine transform motion compensated prediction technique to enhance the compression efficiency of HEVC.
With affine transform motion compensated prediction, motion that involves deformation such as zoom or rotation, which cannot be expressed with motion compensated prediction based on a translation model used in HEVC, can be expressed.
An affine transform motion compensated prediction technique is described in NPL 3.
The foregoing block based affine transform motion compensated prediction (hereafter referred to as “typical block based affine transform motion compensated prediction”) is simplified affine transform motion compensated prediction having the following features.
The typical block based affine transform motion compensated prediction will be described below, with reference to explanatory diagrams in
A control point motion vector setting unit 5051 and a subblock motion vector derivation unit 5052 depicted in
The control point motion vector setting unit 5051 sets input two motion vectors as motion vectors (vTL and vTR in (B) in
A motion vector at a position (x, y) {0≤x≤w−1, 0≤y≤h−1} in the block to be processed is expressed as follows.
v(x)=((vTR(x)−vTL(x))×x/w)−((vTR(y)−vTL(y))×y/w)+vTL(x) (1).
v(y)=((vTR(y)−vTL(y))×x/w)+((vTR(x)−vTL(x))×y/w)+vTL(y) (2).
In the formulas, vTL(x), vTL(y), vTR(x), and vTR(y) respectively denote a component of vTL in the x direction (horizontal direction), a component of vTL in the y direction (vertical direction), a component of vTR in the x direction (horizontal direction), and a component of vTR in the y direction (vertical direction).
Next, the subblock motion vector derivation unit 5052 calculates, for each subblock, a motion vector at the center position in the subblock as a subblock motion vector, based on motion vector expression of the position in the block to be processed.
Thus, the control point motion vector setting unit 5051 and the subblock motion vector derivation unit 5052 determine the subblock motion vectors.
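As an illustration only, formulas (1) and (2) and the per-subblock derivation can be sketched in Python as follows. The function names, the use of floating-point values, and the choice of (sx + sub/2, sy + sub/2) as the "center position" of a subblock are assumptions made for this sketch and are not taken from the description.

```python
from typing import List, Tuple

Vec = Tuple[float, float]  # (x component, y component)


def affine_mv(x: float, y: float, v_tl: Vec, v_tr: Vec, w: int) -> Vec:
    """Motion vector at position (x, y) in the block, per formulas (1) and (2)."""
    dx = v_tr[0] - v_tl[0]  # vTR(x) - vTL(x)
    dy = v_tr[1] - v_tl[1]  # vTR(y) - vTL(y)
    vx = dx * x / w - dy * y / w + v_tl[0]  # formula (1)
    vy = dy * x / w + dx * y / w + v_tl[1]  # formula (2)
    return (vx, vy)


def subblock_mvs(v_tl: Vec, v_tr: Vec, w: int, h: int, sub: int = 4) -> List[List[Vec]]:
    """One motion vector per sub x sub subblock, evaluated at the subblock center."""
    rows = []
    for sy in range(0, h, sub):
        row = []
        for sx in range(0, w, sub):
            # Assumed center position of the subblock.
            row.append(affine_mv(sx + sub / 2, sy + sub / 2, v_tl, v_tr, w))
        rows.append(row)
    return rows


# A 16x16 block with vTL = (0, 0) and vTR = (16, 0) yields a zoom-like field.
mvs = subblock_mvs((0.0, 0.0), (16.0, 0.0), 16, 16)
```

With 4×4 subblocks, a 16×16 block yields a 4×4 grid of 16 subblock motion vectors. When vTL equals vTR, every subblock receives the same vector, which corresponds to the translation model.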
NPL 1: R. Joshi et al., “HEVC Screen Content Coding Draft Text 5” document JCTVC-V1005, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC1/SC 29/WG 11, 22nd Meeting: Geneva, CH, 15-21 Oct. 2015.
NPL 2: J. Chen et al., “Algorithm Description of Joint Exploration Test Model 5 (JEM 5)” document JVET-E1001-v2, Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 5th Meeting: Geneva, CH, 12-20 Jan. 2017.
NPL 3: K. Zhang et al., “Video coding using affine motion compensated prediction”, ICASSP 1996.
With the typical block based affine transform motion compensated prediction described above, the motion vectors are scattered in the block to be processed. Consequently, in a video coding device using the typical block based affine transform motion compensated prediction, the amount of memory access relating to reference pictures in a motion compensated prediction process increases massively as compared with the case of using normal motion compensated prediction (motion compensated prediction based on a translation model with which motion vectors are not scattered in a block to be processed).
For example, when the typical block based affine transform motion compensated prediction is applied to a video signal of a large image size such as 8K, there is a possibility that the amount of memory access relating to reference pictures exceeds the peak band of memory included in the device.
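The effect of scattered motion vectors on memory access can be illustrated with a rough worst-case count of reference samples. The following Python sketch assumes that each subblock fetches its own reference window with no sharing or caching between subblocks, and that a fractional motion vector requires an interpolation filter with `taps` taps (8 is assumed here, as in HEVC luma interpolation). The function name and all parameters are hypothetical and are not part of the description.

```python
def reference_samples(block_w: int, block_h: int, sub_w: int, sub_h: int,
                      taps: int = 8, fractional: bool = True,
                      directions: int = 2) -> int:
    """Worst-case number of reference samples fetched for one block.

    Assumptions (not from the description): one independent fetch per
    subblock, and a (taps - 1)-sample margin in each dimension when the
    motion vector points to a fractional position.
    """
    margin = taps - 1 if fractional else 0
    n_sub = (block_w // sub_w) * (block_h // sub_h)
    return n_sub * (sub_w + margin) * (sub_h + margin) * directions


translation = reference_samples(16, 16, 16, 16)   # one vector for the whole block
affine_4x4 = reference_samples(16, 16, 4, 4)      # 16 scattered subblock vectors
```

Under these assumptions, a bidirectionally predicted 16×16 block needs 1058 reference samples with a single translational vector and 3872 with 4×4 subblocks. Larger subblocks, integer motion vectors, and a single prediction direction each lower the count in this model.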
Herein, the “large image size” means that at least one of the number of pixels picWidth in the horizontal direction of the picture depicted in
As described above, the typical block based affine transform motion compensated prediction has a problem in that the implementation cost of the video coding device and the video decoding device increases.
The present invention has an object of providing a video coding device, a video decoding device, a video coding method, a video decoding method, a program, and a video system that can reduce the amount of memory access and reduce the implementation cost in the case of using block based affine transform motion compensated prediction.
A video coding device according to the present invention is a video coding device that performs video coding using a block based affine transform motion compensated prediction technique that includes a process of calculating a motion vector of each subblock using motion vectors of control points in a block, the video coding device including block based affine transform motion compensated prediction control means for controlling at least one of a block size, a prediction direction, and a motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using a coding parameter supplied from outside.
A video decoding device according to the present invention is a video decoding device that performs video decoding using a block based affine transform motion compensated prediction technique that includes a process of calculating a motion vector of each subblock using motion vectors of control points in a block, the video decoding device including block based affine transform motion compensated prediction control means for controlling at least one of a block size, a prediction direction, and a motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using at least a coding parameter extracted from a bitstream.
A video coding method according to the present invention is a video coding method of performing video coding using a block based affine transform motion compensated prediction technique that includes a process of calculating a motion vector of each subblock using motion vectors of control points in a block, the video coding method including controlling at least one of a block size, a prediction direction, and a motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using a supplied coding parameter.
A video decoding method according to the present invention is a video decoding method of performing video decoding using a block based affine transform motion compensated prediction technique that includes a process of calculating a motion vector of each subblock using motion vectors of control points in a block, the video decoding method including controlling at least one of a block size, a prediction direction, and a motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using at least a coding parameter extracted from a bitstream.
A video coding program according to the present invention is a video coding program executed in a video coding device that performs video coding using a block based affine transform motion compensated prediction technique that includes a process of calculating a motion vector of each subblock using motion vectors of control points in a block, the video coding program causing a computer to control at least one of a block size, a prediction direction, and a motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using a supplied coding parameter.
A video decoding program according to the present invention is a video decoding program executed in a video decoding device that performs video decoding using a block based affine transform motion compensated prediction technique that includes a process of calculating a motion vector of each subblock using motion vectors of control points in a block, the video decoding program causing a computer to control at least one of a block size, a prediction direction, and a motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using at least a coding parameter extracted from a bitstream.
A video system according to the present invention is a video system that uses a block based affine transform motion compensated prediction technique that includes a process of calculating a motion vector of each subblock using motion vectors of control points in a block, the video system including: a video coding device for performing video coding using the block based affine transform motion compensated prediction; and a video decoding device for performing video decoding using the block based affine transform motion compensated prediction, wherein the video coding device includes coding-side block based affine transform motion compensated prediction control means for controlling at least one of a block size, a prediction direction, and a motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using a coding parameter supplied in the video system, and wherein the video decoding device includes decoding-side block based affine transform motion compensated prediction control means for controlling at least one of the block size, the prediction direction, and the motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using at least a coding parameter extracted from a bitstream from the video coding device.
According to the present invention, the amount of memory access can be reduced, and the implementation cost can be reduced.
Moreover, as a result of the video coding device and the video decoding device reducing the amount of memory access by a common method, a video system in which the interconnectivity between the video coding device and the video decoding device is ensured can be provided.
First, intra prediction, inter-frame prediction, and signaling of CU and CTU used in a video coding device according to this exemplary embodiment and the below-described video decoding device will be described below.
Each frame of digitized video is split into coding tree units (CTUs), and each CTU is coded in raster scan order.
Each CTU is split into coding units (CUs) and coded, in a quadtree structure.
Each CU is prediction-coded. Prediction coding includes intra prediction and inter-frame prediction.
A prediction error of each CU is transform-coded based on frequency transform.
A CU of the largest size is referred to as a “largest CU” (largest coding unit: LCU), and a CU of the smallest size is referred to as a “smallest CU” (smallest coding unit: SCU). The LCU size and the CTU size are the same.
Intra prediction is prediction for generating a prediction image from a reconstructed image having the same display time as a frame to be coded. NPL 1 defines 33 types of angular intra prediction depicted in
Inter-frame prediction is prediction for generating a prediction image from a reconstructed image (reference picture) different in display time from a frame to be coded. Inter-frame prediction is hereafter also referred to as “inter prediction”.
In this exemplary embodiment, the video coding device can use the normal motion compensated prediction depicted in
A frame coded including only intra CUs is called “I frame” (or “I picture”). A frame coded including not only intra CUs but also inter CUs is called “P frame” (or “P picture”). A frame coded including inter CUs that each use not only one reference picture but two reference pictures simultaneously for the inter prediction of the block is called “B frame” (or “B picture”).
Inter-frame prediction using one reference picture is referred to as “unidirectional prediction”, and inter-frame prediction using two reference pictures simultaneously is referred to as “bidirectional prediction”.
This completes the description of intra prediction, inter-frame prediction, and signaling of CTU and CU.
A structure and operation of the video coding device according to this exemplary embodiment that receives each CU of each frame of digitized video as an input image and outputs a bitstream will be described below, with reference to
A video coding device depicted in
The predictor 105 determines, for each CTU, a cu_split_flag syntax value for determining a CU partitioning shape that minimizes the coding cost.
The predictor 105 then determines, for each CU, a pred_mode_flag syntax value for determining intra prediction/inter prediction, an inter_affine_flag syntax value indicating whether the inter CU is based on block based affine transform motion compensated prediction, an intra prediction direction (or, for an inter CU, a prediction direction of motion compensated prediction for the block to be processed), and a motion vector that minimize the coding cost. The predictor 105 includes a block based affine transform motion compensated prediction controller 1050. The prediction direction of motion compensated prediction for the block to be processed is hereafter simply referred to as a “prediction direction”.
The predictor 105 generates a prediction signal corresponding to the input image signal of each CU, based on the determined cu_split_flag syntax value, pred_mode_flag syntax value, inter_affine_flag syntax value, intra prediction direction, motion vector, etc. The prediction signal is generated based on the foregoing intra prediction or inter-frame prediction.
Inter-frame prediction is normal motion compensated prediction when inter_affine_flag=0, and is block based affine transform motion compensated prediction otherwise (i.e. when inter_affine_flag=1).
The transformer/quantizer 101 frequency-transforms a prediction error image obtained by subtracting the prediction signal from the input image signal.
The transformer/quantizer 101 further quantizes the frequency-transformed prediction error image (frequency transform coefficient). The quantized frequency transform coefficient is hereafter referred to as a “transform quantization value”.
The entropy encoder 102 entropy-codes the cu_split_flag syntax value, the pred_mode_flag syntax value, the inter_affine_flag syntax value, the difference information of the intra prediction direction, and the difference information of motion vectors determined by the predictor 105, and the transform quantization value.
The inverse quantizer/inverse transformer 103 inverse-quantizes the transform quantization value. The inverse quantizer/inverse transformer 103 further inverse-frequency-transforms the frequency transform coefficient obtained by the inverse quantization. The prediction signal is added to the reconstructed prediction error image obtained by the inverse frequency transform, and the result is supplied to the buffer 104. The buffer 104 stores the reconstructed image.
The multiplexer 106 multiplexes and outputs the entropy-coded data supplied from the entropy encoder 102, as a bitstream.
The bitstream includes the image size, the prediction direction determined by the predictor 105, and the difference between motion vectors determined by the predictor 105 (in particular, the difference between motion vectors of control points in the block).
Operation of the block based affine transform motion compensated prediction controller 1050 will be described below.
The control point motion vector setting unit 1051 sets input two motion vectors as motion vectors (vTL and vTR in (B) in
A motion vector at a position (x, y) {0≤x≤w−1, 0≤y≤h−1} in the block to be processed is expressed by the foregoing formulas (1) and (2).
The operation of the block based affine transform motion compensated prediction controller 1050 will be described below, with reference to a flowchart in
The control point motion vector setting unit 1051 assigns externally input motion vectors to control points of a block to be processed, as in the control point motion vector setting unit 5051 in
In the case where the image size is greater than the predetermined size, the control function added subblock motion vector derivation unit 1052 sets 8×8 pixels which are larger than 4×4 pixel size depicted in
In the case where the image size is not greater than the predetermined size, the control function added subblock motion vector derivation unit 1052 sets the subblock size to be the same as 4×4 pixel size depicted in
The control function added subblock motion vector derivation unit 1052 calculates, for each subblock, a motion vector at the center position in the subblock based on motion vector representation of position in the block to be processed, and sets the calculated motion vector as a subblock motion vector, as in the subblock motion vector derivation unit 5052 in
The predictor 105 generates a prediction signal for an input image signal of each CU based on the determined motion vector and the like, as described above.
In the case where the image size is greater than the predetermined size, the number of motion vectors of block based affine transform motion compensated prediction for a block to be processed in the video coding device according to this exemplary embodiment is less than the number of motion vectors in a conventional video coding device, as can be understood from the difference between the number of motion vectors in L0 direction of subblocks in (C) in
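The size-dependent control described above can be sketched in Python as follows. The threshold parameters `max_width` and `max_height`, the values used in the example, and the function names are hypothetical placeholders, because the predetermined size is left open here. The "at least one of" comparison is an assumption based on the definition of the large image size.

```python
def select_subblock_size(pic_width: int, pic_height: int,
                         max_width: int, max_height: int) -> int:
    """Return 8 (8x8 subblocks) for a large image size, otherwise 4 (4x4)."""
    # Assumption: the image is "large" if either dimension exceeds its threshold.
    if pic_width > max_width or pic_height > max_height:
        return 8
    return 4


def count_subblock_mvs(block_w: int, block_h: int, sub: int) -> int:
    """Number of subblock motion vectors per prediction direction."""
    return (block_w // sub) * (block_h // sub)


# Example with placeholder thresholds of 4096x2160 and an 8K (7680x4320) picture.
sub = select_subblock_size(7680, 4320, 4096, 2160)
n_mvs = count_subblock_mvs(16, 16, sub)
```

For a 16×16 block, switching from 4×4 to 8×8 subblocks reduces the number of subblock motion vectors per prediction direction from 16 to 4.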
A structure and operation of a video decoding device that receives a bitstream as input from a video coding device or the like and outputs a decoded video frame will be described below, with reference to
The video decoding device according to this exemplary embodiment includes a de-multiplexer 201, an entropy decoder 202, an inverse quantizer/inverse transformer 203, a predictor 204, and a buffer 205.
The de-multiplexer 201 de-multiplexes an input bitstream to extract an entropy-coded video bitstream.
The entropy decoder 202 entropy-decodes the video bitstream. The entropy decoder 202 entropy-decodes the coding parameters and the transform quantization value, and supplies them to the inverse quantizer/inverse transformer 203 and the predictor 204.
The entropy decoder 202 also supplies cu_split_flag, pred_mode_flag, inter_affine_flag, intra prediction direction, and motion vector to the predictor 204.
The inverse quantizer/inverse transformer 203 inverse-quantizes the transform quantization value. The inverse quantizer/inverse transformer 203 further inverse-frequency-transforms the frequency transform coefficient obtained by the inverse quantization.
After the inverse frequency transform, the predictor 204 generates a prediction signal using a reconstructed image stored in the buffer 205, based on the entropy-decoded cu_split_flag, pred_mode_flag, inter_affine_flag, intra prediction direction, and motion vector. The prediction signal is generated based on the foregoing intra prediction or inter-frame prediction.
Inter-frame prediction is normal motion compensated prediction when inter_affine_flag=0, and is block based affine transform motion compensated prediction otherwise (i.e. when inter_affine_flag=1).
The predictor 204 includes a block based affine transform motion compensated prediction controller 2040. The block based affine transform motion compensated prediction controller 2040 sets a motion vector in each control point and then determines a subblock size depending on whether the image size is greater than the predetermined size, as in the block based affine transform motion compensated prediction controller 1050 in the video coding device according to Exemplary Embodiment 1. The block based affine transform motion compensated prediction controller 2040 then calculates, for each subblock, a motion vector at the center position in the subblock based on motion vector representation of position in the block to be processed, and sets the calculated motion vector as a subblock motion vector. In detail, the block based affine transform motion compensated prediction controller 2040 includes blocks that operate in the same way as the control point motion vector setting unit 1051 and the control function added subblock motion vector derivation unit 1052.
After the prediction signal is generated, the prediction signal supplied from the predictor 204 is added to the reconstructed prediction error image obtained by the inverse frequency transform by the inverse quantizer/inverse transformer 203, and the result is supplied to the buffer 205 as a reconstructed image.
The reconstructed image stored in the buffer 205 is then output as a decoded image (decoded video).
In the case where the image size is greater than the predetermined size, the number of motion vectors of block based affine transform motion compensated prediction for a block to be processed in the video decoding device according to this exemplary embodiment is less than the number of motion vectors in a conventional video decoding device, as can be understood from the difference between the number of motion vectors in L0 direction of subblocks in (C) in
In the video coding device according to Exemplary Embodiment 1 and the video decoding device according to Exemplary Embodiment 2, the block based affine transform motion compensated prediction controllers 1050 and 2040 increase the subblock size to reduce the amount of memory access, in the case of determining that the amount of memory access relating to reference pictures is large.
The amount of memory access can also be reduced by making the subblock motion vector into an integer vector (i.e. changing the pixel position designated by the motion vector to an integer position) as depicted in
The video coding device and the corresponding video decoding device according to Exemplary Embodiment 3 may have the same overall structures as those depicted in
The operation of the block based affine transform motion compensated prediction controller 1050 in the video coding device according to Exemplary Embodiment 3 will be described below, with reference to a flowchart in
The control point motion vector setting unit 1051 assigns externally input motion vectors to control points of a block to be processed, as in the control point motion vector setting unit 5051 in
The control function added subblock motion vector derivation unit 1052 then determines whether the image size is greater than a predetermined size (step S1003). In the case where the image size is not greater than the predetermined size, the process ends. In this case, the motion vector v remains a vector of fractional precision.
In the case where the image size is greater than the predetermined size, the control function added subblock motion vector derivation unit 1052 rounds the motion vector v of each subblock to a vector of integer precision (step S2001).
The motion vector vINT of integer precision is expressed by the following formulas.
vINT(x)=floor(v(x), prec)
vINT(y)=floor(v(y), prec) (3).
In the formulas, floor(a, b) is a function returning the multiple of b that is closest to a variable a among plural multiples of b. “prec” means pixel precision of a motion vector. For example, in the case where the motion vector pixel precision is 1/16, prec=16.
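A minimal Python sketch of the rounding in formula (3) is given below. It assumes that motion vector components are stored as integers in units of 1/prec pixel, so that a multiple of prec designates an integer pixel position. The description does not state how ties are handled; rounding ties toward positive infinity is an assumption of this sketch, and the function names are hypothetical.

```python
from typing import Tuple


def round_to_multiple(a: int, b: int) -> int:
    """Return the multiple of b closest to a (ties go toward +infinity)."""
    return ((a + b // 2) // b) * b


def to_integer_precision(v: Tuple[int, int], prec: int = 16) -> Tuple[int, int]:
    """Apply the rounding to each component: vINT(x) from v(x), vINT(y) from v(y)."""
    return (round_to_multiple(v[0], prec), round_to_multiple(v[1], prec))


# (37/16, -37/16) pixel becomes (32/16, -32/16) pixel, i.e. (2, -2) pixels.
v_int = to_integer_precision((37, -37))
```

Because the rounded vector points to an integer pixel position, no interpolation margin around the subblock needs to be read from the reference picture.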
The predictor 105 (in the video decoding device, the predictor 204) generates a prediction signal for an input image signal of each CU, based on the determined motion vector and the like.
In the video coding device according to Exemplary Embodiment 1 and the video decoding device according to Exemplary Embodiment 2, the block based affine transform motion compensated prediction controllers 1050 and 2040 increase the subblock size to reduce the amount of memory access, in the case of determining that the amount of memory access relating to reference pictures is large.
The amount of memory access can also be reduced by forcedly setting the motion vector of the block to be processed in bidirectional prediction to unidirectional, instead of increasing the subblock size.
The video coding device and the corresponding video decoding device according to Exemplary Embodiment 4 may have the same overall structures as those depicted in
The operation of the block based affine transform motion compensated prediction controller 1050 in the video coding device according to Exemplary Embodiment 4 will be described below, with reference to a flowchart in
The control point motion vector setting unit 1051 assigns externally input motion vectors to control points of a block to be processed, as in the control point motion vector setting unit 5051 in
The control function added subblock motion vector derivation unit 1052 then determines whether the image size is greater than a predetermined size (step S1003). In the case where the image size is not greater than the predetermined size, the process ends. In this case, the motion vector may be a bidirectional vector.
In the case where the image size is greater than the predetermined size, the control function added subblock motion vector derivation unit 1052 disables the subblock motion vector in L1 direction, to limit the motion vector v of each subblock to unidirectional (step S2002).
The predictor 105 (in the video decoding device, the predictor 204) generates a prediction signal for an input image signal of each CU, based on the determined motion vector and the like.
The control function added subblock motion vector derivation unit 1052 may disable the subblock motion vector in L0 direction, instead of disabling the subblock motion vector in L1 direction. Furthermore, the video coding device may multiplex syntax of information about the prediction direction to be disabled into the bitstream, and the video decoding device may extract the syntax of the information from the bitstream and disable the motion vector in the prediction direction.
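The restriction to unidirectional prediction can be sketched in Python as follows. The `SubblockMotion` container, the use of `None` to mark a disabled direction, and the `disable` argument are hypothetical constructs introduced for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Mv = Tuple[int, int]


@dataclass
class SubblockMotion:
    mv_l0: Optional[Mv]  # motion vector in L0 direction, None if unused
    mv_l1: Optional[Mv]  # motion vector in L1 direction, None if unused


def restrict_to_unidirectional(m: SubblockMotion, disable: str = "L1") -> SubblockMotion:
    """Disable one direction of a bidirectional subblock motion vector."""
    if m.mv_l0 is None or m.mv_l1 is None:
        return m  # already unidirectional; nothing to disable
    if disable == "L1":
        return SubblockMotion(m.mv_l0, None)
    if disable == "L0":
        return SubblockMotion(None, m.mv_l1)
    raise ValueError("disable must be 'L0' or 'L1'")


uni = restrict_to_unidirectional(SubblockMotion((1, 2), (3, 4)))
```

In this sketch, a subblock that was already unidirectional is returned unchanged, so the restriction only affects blocks using bidirectional prediction.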
The number of motion vectors of block based affine transform motion compensated prediction for a block to be processed in the video coding device and the video decoding device according to this exemplary embodiment is less than the number of motion vectors of block based affine transform motion compensated prediction in a conventional video coding device and video decoding device, as can be understood from the difference between the number of motion vectors of subblocks in (C) in
As is clear from the above description, for all blocks of P pictures not using bidirectional prediction and blocks not using bidirectional prediction (i.e. blocks of unidirectional prediction) in B pictures, the number of motion vectors of block based affine transform motion compensated prediction for a block to be processed in this exemplary embodiment is the same as that in the case of using the typical block based affine transform motion compensated prediction. Accordingly, the block based affine transform motion compensated prediction in this exemplary embodiment may be limited to only blocks using bidirectional prediction.
In the video coding device according to Exemplary Embodiment 1 and the video decoding device according to Exemplary Embodiment 2, the block based affine transform motion compensated prediction controllers 1050 and 2040 determine whether the amount of memory access relating to reference pictures is large based on the image size, and, in the case of determining that the amount of memory access relating to reference pictures is large, increase the subblock size to reduce the amount of memory access.
Instead of performing determination based on the image size, the block based affine transform motion compensated prediction controllers 1050 and 2040 may control the constantly used subblock size S based on syntax. That is, the multiplexer 106 in the video coding device may multiplex log2_affine_subblock_size_minus2 syntax indicating information about the subblock size S into the bitstream, and the de-multiplexer 201 in the video decoding device may extract the syntax of the information from the bitstream and decode the syntax to obtain the subblock size S, which is then used by the predictor 204.
The relationship between the log2_affine_subblock_size_minus2 syntax value and the subblock size S is expressed by the following formula.
S=1<<(log2_affine_subblock_size_minus2+2) (4)
In the formula, << denotes bit shift operation in the left direction.
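Formula (4) and its inverse can be sketched in Python as follows. The function names and the validity checks are assumptions of this sketch. Only the relationship S=1<<(log2_affine_subblock_size_minus2+2) is taken from the description.

```python
def subblock_size_from_syntax(log2_affine_subblock_size_minus2: int) -> int:
    """Decoder side: derive the subblock size S per formula (4)."""
    if log2_affine_subblock_size_minus2 < 0:
        raise ValueError("syntax value must be non-negative")
    return 1 << (log2_affine_subblock_size_minus2 + 2)


def syntax_from_subblock_size(s: int) -> int:
    """Encoder side: inverse mapping, valid for powers of two not less than 4."""
    if s < 4 or s & (s - 1):
        raise ValueError("S must be a power of two and at least 4")
    return s.bit_length() - 1 - 2  # log2(S) - 2


sizes = [subblock_size_from_syntax(v) for v in range(3)]
```

Syntax values 0, 1, and 2 correspond to subblock sizes of 4, 8, and 16, so the smallest signalable size is the 4×4 size of the typical technique.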
The operation of the block based affine transform motion compensated prediction controller 1050 in the video coding device according to Exemplary Embodiment 5 that performs the above-described control will be described below, with reference to a flowchart in
The control point motion vector setting unit 1051 assigns externally input motion vectors to control points of a block to be processed, as in the control point motion vector setting unit 5051 in
The control function added subblock motion vector derivation unit 1052 determines the subblock size S from the log2_affine_subblock_size_minus2 syntax value, based on the relational formula (4) (step S2003).
The control function added subblock motion vector derivation unit 1052 calculates, for each subblock, a motion vector at the center position in the subblock, and sets the calculated motion vector as a subblock motion vector, as in the subblock motion vector derivation unit 5052 in
The predictor 105 (in the video decoding device, the predictor 204) generates a prediction signal for an input image signal of each CU, based on the determined motion vector and the like.
The video coding device and the corresponding video decoding device according to Exemplary Embodiment 5 may have the same overall structures as those depicted in
In this exemplary embodiment, the image size determination process is unnecessary, so that the structure of the block based affine transform motion compensated prediction controllers 1050 and 2040 can be simplified.
In the video coding device and the video decoding device according to Exemplary Embodiment 3, the block based affine transform motion compensated prediction controllers 1050 and 2040 determine whether the amount of memory access relating to reference pictures is large based on the image size, and, in the case of determining that the amount of memory access relating to reference pictures is large, make the subblock motion vector into an integer vector to reduce the amount of memory access.
Alternatively, the block based affine transform motion compensated prediction controllers 1050 and 2040 may determine whether to make the subblock motion vector into an integer vector based on syntax indicating whether to make the motion vector into an integer vector.
That is, the multiplexer 106 in the video coding device may multiplex enable_affine_sublock_integer_mv_flag syntax indicating information about whether to apply integer precision (i.e. whether integer precision is enabled) into the bitstream, and the de-multiplexer 201 in the video decoding device may extract the syntax of the information from the bitstream and decode the syntax to obtain the information, which is then used by the predictor 204.
In the case where the enable_affine_sublock_integer_mv_flag syntax value is 1, integer precision is applied (integer precision is enabled). Otherwise (i.e. in the case where the enable_affine_sublock_integer_mv_flag syntax value is 0), integer precision is not applied (integer precision is disabled).
The operation of the block based affine transform motion compensated prediction controller 1050 in the video coding device according to Exemplary Embodiment 6 that performs the above-described control will be described below, with reference to a flowchart in
The control point motion vector setting unit 1051 assigns externally input motion vectors to control points of a block to be processed, as in the control point motion vector setting unit 5051 in
The control function added subblock motion vector derivation unit 1052 calculates, for each subblock, a motion vector at the center position in the subblock, and sets the calculated motion vector as a subblock motion vector, as in the subblock motion vector derivation unit 5052 in
The control function added subblock motion vector derivation unit 1052 determines whether to make the subblock motion vector into an integer vector (i.e. whether integer precision is enabled), from enable_affine_sublock_integer_mv_flag (step S3001). In the case where integer precision is not enabled, the process ends.
In the case where integer precision is enabled, the control function added subblock motion vector derivation unit 1052 rounds the motion vector v of each subblock to a vector of integer precision (step S2001). The motion vector v of integer precision is expressed by the foregoing formula (3).
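The control in steps S3001 and S2001 can be sketched as follows. This is a minimal illustration only, not the implementation of the control function added subblock motion vector derivation unit 1052. It assumes that motion vectors are stored in quarter-sample units (as in HEVC) and that rounding is to the nearest integer-sample position with ties rounded upward. The rounding rule of formula (3) is not reproduced here, and the names MV_FRAC_BITS, round_to_integer_precision, and derive_subblock_mvs are hypothetical.

```python
from typing import List, Tuple

MotionVector = Tuple[int, int]

# Assumption: motion vector components are in quarter-sample units (2 fractional bits).
MV_FRAC_BITS = 2


def round_to_integer_precision(mv: MotionVector) -> MotionVector:
    """Round each component to the nearest integer-sample position (ties round up).

    The result is still expressed in quarter-sample units, so its fractional bits are zero.
    """
    offset = 1 << (MV_FRAC_BITS - 1)

    def round_component(c: int) -> int:
        # Python's >> is an arithmetic (floor) shift, so negative values are handled.
        return ((c + offset) >> MV_FRAC_BITS) << MV_FRAC_BITS

    return (round_component(mv[0]), round_component(mv[1]))


def derive_subblock_mvs(
    subblock_mvs: List[MotionVector],
    enable_affine_sublock_integer_mv_flag: int,
) -> List[MotionVector]:
    """Return the subblock motion vectors, rounded only when the flag is 1."""
    if enable_affine_sublock_integer_mv_flag != 1:
        # Integer precision is disabled: the vectors are used as calculated.
        return list(subblock_mvs)
    return [round_to_integer_precision(mv) for mv in subblock_mvs]
```

With these assumptions, a subblock motion vector of (5, -3) in quarter-sample units, i.e. (1.25, -0.75) samples, becomes (4, -4), i.e. (1, -1) samples, when the flag is 1, and is left unchanged when the flag is 0.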
The predictor 105 (in the video decoding device, the predictor 204) generates a prediction signal for an input image signal of each CU, based on the determined motion vector and the like.
The video coding device and the corresponding video decoding device according to Exemplary Embodiment 6 may have the same overall structures as those depicted in
In the video coding device and the video decoding device according to Exemplary Embodiment 4, the block based affine transform motion compensated prediction controllers 1050 and 2040 determine whether the amount of memory access relating to reference pictures is large based on the image size, and, in the case of determining that the amount of memory access relating to reference pictures is large, forcedly set the motion vector of the block to be processed in bidirectional prediction to be a unidirectional motion vector to reduce the amount of memory access.
Alternatively, the block based affine transform motion compensated prediction controllers 1050 and 2040 may determine whether to forcedly make the motion vector of the block to be processed in bidirectional prediction into a unidirectional motion vector based on syntax indicating whether to make the motion vector into a unidirectional motion vector.
That is, the multiplexer 106 in the video coding device may multiplex disable_affine_sublock_bipred_mv_flag syntax indicating information about whether to forcedly set the motion vector to unidirectional (i.e. whether change to unidirectional is enabled) into the bitstream, and the de-multiplexer 201 in the video decoding device may extract the syntax of the information from the bitstream and decode the syntax to obtain the information, which is then used by the predictor 204.
In the case where the disable_affine_sublock_bipred_mv_flag syntax value is 1, forced change to unidirectional is performed (change to unidirectional is enabled). Otherwise (i.e. in the case where the disable_affine_sublock_bipred_mv_flag syntax value is 0), forced change to unidirectional is not performed (change to unidirectional is disabled).
The operation of the block based affine transform motion compensated prediction controller 1050 in the video coding device according to Exemplary Embodiment 7 that performs the above-described control will be described below, with reference to a flowchart in
The control point motion vector setting unit 1051 assigns externally input motion vectors to control points of a block to be processed, as in the control point motion vector setting unit 5051 in
The control function added subblock motion vector derivation unit 1052 calculates, for each subblock, a motion vector at the center position in the subblock, and sets the calculated motion vector as a subblock motion vector, as in the subblock motion vector derivation unit 5052 in
The control function added subblock motion vector derivation unit 1052 determines whether to set the subblock motion vector to unidirectional (i.e. whether change to unidirectional is enabled), from disable_affine_sublock_bipred_mv_flag (step S4001). In the case where change to unidirectional is not enabled, the process ends.
In the case where change to unidirectional is enabled, the control function added subblock motion vector derivation unit 1052 disables the subblock motion vector in L1 direction, to limit the motion vector v of each subblock to unidirectional (step S2001).
The predictor 105 (in the video decoding device, the predictor 204) generates a prediction signal for an input image signal of each CU, based on the determined motion vector and the like.
The video coding device and the corresponding video decoding device according to Exemplary Embodiment 7 may have the same overall structures as those depicted in
As in Exemplary Embodiment 4, the control function added subblock motion vector derivation unit 1052 may disable the subblock motion vector in L0 direction, instead of disabling the subblock motion vector in L1 direction. Furthermore, the video coding device may multiplex syntax of information about the prediction direction to be disabled into the bitstream, and the video decoding device may extract the syntax of the information from the bitstream and disable the motion vector in the prediction direction.
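The limitation to unidirectional prediction, including the choice of the prediction direction to be disabled, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the actual processing of the unit 1052. The type SubblockMotion, the function limit_to_unidirectional, and the parameter disabled_list are hypothetical names. The argument change_enabled is the already-decoded decision of whether change to unidirectional is enabled, so the sketch does not depend on how that decision is coded in the bitstream.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

MotionVector = Tuple[int, int]


@dataclass(frozen=True)
class SubblockMotion:
    """Motion of one subblock; None means the direction is not used."""
    mv_l0: Optional[MotionVector]
    mv_l1: Optional[MotionVector]


def limit_to_unidirectional(
    motion: SubblockMotion,
    change_enabled: bool,
    disabled_list: str = "L1",
) -> SubblockMotion:
    """Drop one prediction direction of a bidirectional subblock when enabled."""
    is_bidirectional = motion.mv_l0 is not None and motion.mv_l1 is not None
    if not change_enabled or not is_bidirectional:
        # Nothing to do: control is off, or the subblock is already unidirectional.
        return motion
    if disabled_list == "L1":
        return replace(motion, mv_l1=None)
    if disabled_list == "L0":
        return replace(motion, mv_l0=None)
    raise ValueError("disabled_list must be 'L0' or 'L1'")
```

Passing disabled_list="L0" corresponds to the variation in which the L0 direction is disabled instead of the L1 direction. When the direction to be disabled is signaled, both devices would pass the same value.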
As described above, in the block based affine transform motion compensated prediction in each of the foregoing exemplary embodiments, the control function added subblock motion vector derivation unit determines whether the amount of memory access relating to reference pictures is large, and, in the case of determining that the amount of memory access is large, derives the subblock motion vector so as to reduce the amount of memory access relating to reference pictures.
Whether the amount of memory access relating to reference pictures is large is determined using at least one of the image size, the prediction direction (the prediction direction of motion compensated prediction for the block to be processed), and the difference between motion vectors of control points in the block to be processed.
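One possible form of this determination is sketched below. The sketch is hypothetical in several respects: the threshold values LARGE_PICTURE_SAMPLES and LARGE_CP_MV_DIFF are placeholders, the rule that all selected criteria must hold is an assumption (the text only states that at least one of the three elements is used), and the names BlockInfo and memory_access_is_large are introduced for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

MotionVector = Tuple[int, int]

# Hypothetical thresholds, chosen only to make the sketch concrete.
LARGE_PICTURE_SAMPLES = 3840 * 2160  # picture area in luma samples
LARGE_CP_MV_DIFF = 16                # control point difference, in motion vector units


@dataclass(frozen=True)
class BlockInfo:
    pic_width: int
    pic_height: int
    bidirectional: bool  # prediction direction of the block to be processed
    v_tl: MotionVector   # top-left control point motion vector
    v_tr: MotionVector   # top-right control point motion vector


def memory_access_is_large(
    info: BlockInfo,
    use_image_size: bool = True,
    use_direction: bool = False,
    use_cp_diff: bool = False,
) -> bool:
    """Return True when every selected criterion indicates large memory access."""
    checks = []
    if use_image_size:
        checks.append(info.pic_width * info.pic_height >= LARGE_PICTURE_SAMPLES)
    if use_direction:
        checks.append(info.bidirectional)
    if use_cp_diff:
        diff = max(abs(tr - tl) for tl, tr in zip(info.v_tl, info.v_tr))
        checks.append(diff >= LARGE_CP_MV_DIFF)
    if not checks:
        raise ValueError("at least one criterion must be selected")
    return all(checks)
```

Selecting only use_image_size corresponds to a determination based on the image size alone. Selecting further flags narrows the condition to, for example, large pictures with bidirectional prediction.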
Moreover, the amount of memory access relating to reference pictures is reduced using at least one of limitation of the number of motion vectors and motion vector precision decrease, as follows.
Limitation of the number of motion vectors: increasing the subblock size, setting the prediction direction to unidirectional, or a combination thereof.
Motion vector precision decrease: rounding the motion vector of the subblock to a motion vector of integer precision.
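The effect of each method on the amount of memory access can be illustrated with a rough worst-case estimate. The sketch below is not taken from the exemplary embodiments. It assumes an 8-tap separable interpolation filter (as used for luma in HEVC), so that a subblock of size s with a fractional motion vector requires an (s + 7) × (s + 7) reference region. It also assumes that each subblock is fetched independently with no sharing of overlapping samples. The function name reference_samples_per_block is hypothetical.

```python
def reference_samples_per_block(
    block_w: int,
    block_h: int,
    subblock_size: int,
    bidirectional: bool,
    integer_mv: bool,
    filter_taps: int = 8,
) -> int:
    """Estimate reference samples fetched for one block (worst case, no reuse)."""
    if block_w % subblock_size or block_h % subblock_size:
        raise ValueError("block dimensions must be multiples of the subblock size")
    num_subblocks = (block_w // subblock_size) * (block_h // subblock_size)
    # An integer-precision vector needs no interpolation, hence no filter margin.
    margin = 0 if integer_mv else filter_taps - 1
    samples_per_subblock = (subblock_size + margin) ** 2
    directions = 2 if bidirectional else 1
    return num_subblocks * samples_per_subblock * directions


baseline = reference_samples_per_block(16, 16, 4, bidirectional=True, integer_mv=False)
larger_subblock = reference_samples_per_block(16, 16, 8, bidirectional=True, integer_mv=False)
unidirectional = reference_samples_per_block(16, 16, 4, bidirectional=False, integer_mv=False)
integer_vector = reference_samples_per_block(16, 16, 4, bidirectional=True, integer_mv=True)
print(baseline, larger_subblock, unidirectional, integer_vector)  # 3872 1800 1936 512
```

Under these assumptions, for a 16×16 block with 4×4 subblocks in bidirectional prediction, increasing the subblock size to 8×8 reduces the estimate from 3872 to 1800 samples, limiting prediction to unidirectional halves it to 1936, and rounding to integer precision reduces it to 512.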
The foregoing exemplary embodiments may be used singly, or two or more exemplary embodiments may be combined as appropriate.
Specifically, although the determination of whether the amount of memory access is large is performed using the image size, the prediction direction of the block to be processed, or the difference between the motion vectors of the control points in the block to be processed in the video coding device and the video decoding device according to each of the foregoing exemplary embodiments, any combination of these three elements may be used in the determination.
Although the reduction of the amount of memory access is performed by increasing the subblock size, making the subblock motion vector into an integer vector, or limiting the subblock motion vector to unidirectional in the video coding device and the video decoding device according to each of the foregoing exemplary embodiments, any combination of these three methods may be used.
In this exemplary embodiment, the video coding device 100 and the video decoding device 200 reduce the amount of memory access by a common method. This ensures high interconnectivity between the video coding device 100 and the video decoding device 200.
For example, in the case where the video coding device 100 and the video decoding device 200 are configured according to the foregoing Exemplary Embodiment 5, the value of log2_affine_subblock_size_minus2 syntax corresponding to each image size is prescribed as shown in Table 1. The video system 400 then sets the prescribed value corresponding to the image size in the video coding device 100, as a result of which the interconnectivity between the video coding device 100 and the video decoding device 200 is ensured and service and operation are made more efficient.
For example, in the case where the video coding device 100 and the video decoding device 200 are configured according to the foregoing Exemplary Embodiment 6, the value of enable_affine_sublock_integer_mv_flag syntax corresponding to each image size is prescribed as shown in Table 2. The video system 400 then sets the prescribed value corresponding to the image size in the video coding device 100, as a result of which the interconnectivity between the video coding device 100 and the video decoding device 200 is ensured and service and operation are made more efficient.
For example, in the case where the video coding device 100 and the video decoding device 200 are configured according to the foregoing Exemplary Embodiment 7, the value of disable_affine_sublock_bipred_mv_flag corresponding to each image size is prescribed as shown in Table 3. The video system 400 then sets the prescribed value corresponding to the image size in the video coding device 100, as a result of which the interconnectivity between the video coding device 100 and the video decoding device 200 is ensured and service and operation are made more efficient.
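The setting of prescribed values by the video system 400 can be pictured as a table lookup keyed by the image size. The sketch below is hypothetical: the values in PRESCRIBED_BY_IMAGE_SIZE are placeholders and are not the contents of Tables 1 and 2, only two of the three syntax elements are shown, and the relation subblock size = 1 << (log2_affine_subblock_size_minus2 + 2) is assumed from the naming convention of the syntax element.

```python
from typing import Dict, NamedTuple, Tuple


class PrescribedSyntax(NamedTuple):
    log2_affine_subblock_size_minus2: int
    enable_affine_sublock_integer_mv_flag: int


# Placeholder values for illustration only; NOT the values of Tables 1 and 2.
PRESCRIBED_BY_IMAGE_SIZE: Dict[Tuple[int, int], PrescribedSyntax] = {
    (1920, 1080): PrescribedSyntax(0, 0),
    (3840, 2160): PrescribedSyntax(1, 0),
    (7680, 4320): PrescribedSyntax(1, 1),
}


def prescribed_syntax(width: int, height: int) -> PrescribedSyntax:
    """Look up the values the system sets in the video coding device."""
    try:
        return PRESCRIBED_BY_IMAGE_SIZE[(width, height)]
    except KeyError:
        raise ValueError(f"no prescribed values for {width}x{height}") from None


def subblock_size(log2_affine_subblock_size_minus2: int) -> int:
    """Assumed derivation of the subblock size from the syntax value."""
    return 1 << (log2_affine_subblock_size_minus2 + 2)
```

Because the video coding device multiplexes the values it was given into the bitstream, the video decoding device obtains the same values by extraction, and both sides apply the same control.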
Each of the foregoing exemplary embodiments may be realized by hardware or a computer program.
An information processing system depicted in
In the information processing system depicted in
In the video system 400 depicted in
The term “outside” means outside the block based affine transform motion compensated prediction control unit 11. Examples of the coding parameter supplied from the outside include an image size set outside the block based affine transform motion compensated prediction control unit 11, a prediction direction determined by a prediction unit (e.g. the predictor 105 in
Examples of the coding parameter used for the block based affine transform motion compensated prediction include an image size, a prediction direction determined by a prediction unit (e.g. the predictor 105 in
All or part of the foregoing exemplary embodiments can be described as the following supplementary notes, although the present invention is not limited to the following structures.
(Supplementary note 1) A video coding device that performs video coding using a block based affine transform motion compensated prediction technique that includes a process of calculating a motion vector of each subblock using motion vectors of control points in a block, the video coding device including block based affine transform motion compensated prediction control means for controlling at least one of a block size, a prediction direction, and a motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using a coding parameter supplied from outside.
(Supplementary note 2) The video coding device according to supplementary note 1, wherein the block based affine transform motion compensated prediction control means: increases the block size of the subblock in the case of controlling the block size of the subblock; limits the prediction direction to unidirectional in the case of controlling the prediction direction; and rounds the motion vector of the subblock to a motion vector of integer precision in the case of controlling the motion vector precision.
(Supplementary note 3) A video decoding device that performs video decoding using a block based affine transform motion compensated prediction technique that includes a process of calculating a motion vector of each subblock using motion vectors of control points in a block, the video decoding device including block based affine transform motion compensated prediction control means for controlling at least one of a block size, a prediction direction, and a motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using at least a coding parameter extracted from a bitstream.
(Supplementary note 4) The video decoding device according to supplementary note 3, wherein the block based affine transform motion compensated prediction control means: increases the block size of the subblock in the case of controlling the block size of the subblock; limits the prediction direction to unidirectional in the case of controlling the prediction direction; and rounds the motion vector of the subblock to a motion vector of integer precision in the case of controlling the motion vector precision.
(Supplementary note 5) A video coding method of performing video coding using a block based affine transform motion compensated prediction technique that includes a process of calculating a motion vector of each subblock using motion vectors of control points in a block, the video coding method including controlling at least one of a block size, a prediction direction, and a motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using a supplied coding parameter.
(Supplementary note 6) The video coding method according to supplementary note 5, wherein: the block size of the subblock is increased in the case of controlling the block size of the subblock; the prediction direction is limited to unidirectional in the case of controlling the prediction direction; and the motion vector of the subblock is rounded to a motion vector of integer precision in the case of controlling the motion vector precision.
(Supplementary note 7) A video decoding method of performing video decoding using a block based affine transform motion compensated prediction technique that includes a process of calculating a motion vector of each subblock using motion vectors of control points in a block, the video decoding method including controlling at least one of a block size, a prediction direction, and a motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using at least a coding parameter extracted from a bitstream.
(Supplementary note 8) The video decoding method according to supplementary note 7, wherein: the block size of the subblock is increased in the case of controlling the block size of the subblock; the prediction direction is limited to unidirectional in the case of controlling the prediction direction; and the motion vector of the subblock is rounded to a motion vector of integer precision in the case of controlling the motion vector precision.
(Supplementary note 9) A video coding program executed in a video coding device that performs video coding using a block based affine transform motion compensated prediction technique that includes a process of calculating a motion vector of each subblock using motion vectors of control points in a block, the video coding program causing a computer to control at least one of a block size, a prediction direction, and a motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using a supplied coding parameter.
(Supplementary note 10) The video coding program according to supplementary note 9, wherein the computer is caused to perform a process for: increasing the block size of the subblock in the case of controlling the block size of the subblock; limiting the prediction direction to unidirectional in the case of controlling the prediction direction; and rounding the motion vector of the subblock to a motion vector of integer precision in the case of controlling the motion vector precision.
(Supplementary note 11) A video decoding program executed in a video decoding device that performs video decoding using a block based affine transform motion compensated prediction technique that includes a process of calculating a motion vector of each subblock using motion vectors of control points in a block, the video decoding program causing a computer to control at least one of a block size, a prediction direction, and a motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using at least a coding parameter extracted from a bitstream.
(Supplementary note 12) The video decoding program according to supplementary note 11, wherein the computer is caused to perform a process for: increasing the block size of the subblock in the case of controlling the block size of the subblock; limiting the prediction direction to unidirectional in the case of controlling the prediction direction; and rounding the motion vector of the subblock to a motion vector of integer precision in the case of controlling the motion vector precision.
(Supplementary note 13) A video system that uses a block based affine transform motion compensated prediction technique that includes a process of calculating a motion vector of each subblock using motion vectors of control points in a block, the video system including: a video coding device for performing video coding using the block based affine transform motion compensated prediction; and a video decoding device for performing video decoding using the block based affine transform motion compensated prediction, wherein the video coding device includes coding-side block based affine transform motion compensated prediction control means for controlling at least one of a block size, a prediction direction, and a motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using a coding parameter supplied in the video system, and wherein the video decoding device includes decoding-side block based affine transform motion compensated prediction control means for controlling at least one of the block size, the prediction direction, and the motion vector precision of the subblock in the block subjected to the block based affine transform motion compensated prediction, using at least a coding parameter extracted from a bitstream from the video coding device.
(Supplementary note 14) The video system according to supplementary note 13, wherein each of the coding-side block based affine transform motion compensated prediction control means and the decoding-side block based affine transform motion compensated prediction control means: increases the block size of the subblock in the case of controlling the block size of the subblock; limits the prediction direction to unidirectional in the case of controlling the prediction direction; and rounds the motion vector of the subblock to a motion vector of integer precision in the case of controlling the motion vector precision.
(Supplementary note 15) A video coding program for implementing the video coding method according to supplementary note 5 or 6.
(Supplementary note 16) A video decoding program for implementing the video decoding method according to supplementary note 7 or 8.
This application claims priority based on Japanese Patent Application No. 2017-193503 filed on Oct. 3, 2017, the disclosure of which is incorporated herein in its entirety.
Although the present invention has been described with reference to the foregoing exemplary embodiments, the present invention is not limited to the foregoing exemplary embodiments. Various changes understandable by those skilled in the art can be made to the structures and details of the present invention within the scope of the present invention.
10 video coding device
11 block based affine transform motion compensated prediction control unit
20 video decoding device
21 block based affine transform motion compensated prediction control unit
100 video coding device
101 transform/quantizer
102 entropy encoder
103 inverse quantizer/inverse transformer
104 buffer
105 predictor
106 multiplexer
200 video decoding device
201 de-multiplexer
202 entropy decoder
203 inverse quantizer/inverse transformer
204 predictor
205 buffer
300 transmission path
400 video system
1001 processor
1002 program memory
1003 storage medium
1004 storage medium
1050 block based affine transform motion compensated prediction controller
1051 control point motion vector setting unit
1052 control function added subblock motion vector derivation unit
2040 block based affine transform motion compensated prediction controller
| Number | Date | Country | Kind |
|---|---|---|---|
| 2017-193503 | Oct 2017 | JP | national |

| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/JP2018/032349 | 8/31/2018 | WO | 00 |