This application is a National Stage of International Application No. PCT/JP2012/001592 filed Mar. 8, 2012, claiming priority based on Japanese Patent Application Nos. 2011-051291, filed Mar. 9, 2011 and 2011-095395, filed Apr. 21, 2011, the contents of all of which are incorporated herein by reference in their entirety.
The present invention relates to a video encoding technique, and particularly to a video encoding technique which makes a prediction with reference to a reconstructed image and performs data compression by quantization.
A typical video encoding device executes an encoding process that conforms to a predetermined video coding scheme to generate coded data, i.e. a bitstream. In ISO/IEC 14496-10 Advanced Video Coding (AVC) described in Non Patent Literature (NPL) 1 as a representative example of the predetermined video coding scheme, each frame is divided into blocks of 16×16 pixel size called MBs (Macro Blocks), and each MB is further divided into blocks of 4×4 pixel size, setting MB as the minimum unit of encoding.
Each of the divided image blocks is input sequentially to the video encoding device and encoded.
The video encoding device shown in
An input image to the video encoding device is input to the frequency transformer 101 as a prediction error image, after a prediction image supplied from the intra-frame predictor 108 or the inter-frame predictor 109 through the prediction selector 110 is subtracted from the input image.
The frequency transformer 101 transforms the input prediction error image from a spatial domain to a frequency domain, and outputs the result as a coefficient image.
The quantizer 102 quantizes the coefficient image supplied from the frequency transformer 101 using a quantization step size, supplied from the quantization controller 104, controlling the granularity of quantization, and outputs the result as a quantized coefficient image.
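As a rough illustration of how the quantization step size controls the granularity of quantization, the following sketch applies a plain uniform scalar quantizer to a few coefficient values. The function names `quantize` and `dequantize` and the rounding rule are assumptions made for illustration only; they are not the quantizer defined by the AVC scheme.

```python
def quantize(coeffs, step):
    """Map each coefficient to an integer level; a larger step gives coarser levels."""
    # Round half away from zero so positive and negative values are treated symmetrically.
    return [int(abs(c) / step + 0.5) * (1 if c >= 0 else -1) for c in coeffs]


def dequantize(levels, step):
    """Reconstruct approximate coefficients from the integer levels."""
    return [level * step for level in levels]


coeffs = [37, -12, 5, 0]
fine = quantize(coeffs, 4)     # small step: many distinct levels
coarse = quantize(coeffs, 16)  # large step: levels cluster near zero
```

With the larger step size the levels fall closer to zero, which is what lowers the code rate at the cost of a larger reconstruction error.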
The variable-length encoder 103 entropy-encodes the quantized coefficient image supplied from the quantizer 102. The variable-length encoder 103 also encodes the above quantization step size supplied from the quantization controller 104 and an image prediction parameter supplied from the prediction selector 110. These pieces of coded data are multiplexed and output from the video encoding device as a bitstream.
Here, an encoding process for the quantization step size at the variable-length encoder 103 is described with reference to
The quantization step size buffer 10311 holds a quantization step size Q(i−1) assigned to the previous image block encoded immediately before an image block to be encoded.
As shown in the following equation (1), the previous quantization step size Q(i−1) supplied from the quantization step size buffer 10311 is subtracted from an input quantization step size Q(i), and the result is input to the entropy encoder 10312 as a difference quantization step size dQ(i).
dQ(i)=Q(i)−Q(i−1) (1)
The entropy encoder 10312 entropy-encodes the input difference quantization step size dQ(i), and outputs the result as code corresponding to the quantization step size.
The above has described the encoding process for the quantization step size.
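Equation (1) can be sketched as follows. The function name `difference_step_sizes` and the parameter `initial_previous`, which stands in for whatever the quantization step size buffer holds before the first image block, are assumptions for illustration; the entropy-encoding stage is omitted.

```python
def difference_step_sizes(step_sizes, initial_previous=0):
    """Return dQ(i) = Q(i) - Q(i-1) for step sizes listed in encoding order."""
    previous = initial_previous  # buffer content before the first block (assumed)
    differences = []
    for q in step_sizes:
        differences.append(q - previous)  # equation (1)
        previous = q                      # the buffer now holds Q(i) for the next block
    return differences
```

For example, the step sizes 26, 26, 30, 24 with an initial buffer value of 26 give the differences 0, 0, 4, -6, so each change of step size along the encoding sequence produces a non-zero value to be entropy-encoded.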
The quantization controller 104 determines a quantization step size for the current input image block. In general, the quantization controller 104 monitors the output code rate of the variable-length encoder 103 to increase the quantization step size so as to reduce the output code rate for the image block concerned, or, conversely, to decrease the quantization step size so as to increase the output code rate for the image block concerned. The increase or decrease in quantization step size enables the video encoding device to encode an input moving image at a target rate. The determined quantization step size is supplied to the quantizer 102 and the variable-length encoder 103.
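The feedback described above might be sketched as follows. This is a deliberately simplified rule; the function name `update_step_size`, the fixed adjustment `delta`, and the lower bound `min_step` are all assumptions and do not come from any particular rate-control scheme.

```python
def update_step_size(step, bits_produced, bits_target, delta=1, min_step=1):
    """Raise the step size when over budget, lower it when under budget."""
    if bits_produced > bits_target:
        return step + delta                 # coarser quantization, fewer bits
    if bits_produced < bits_target:
        return max(min_step, step - delta)  # finer quantization, more bits
    return step
```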
The quantized coefficient image output from the quantizer 102 is inverse-quantized by the inverse quantizer 105 to obtain a coefficient image to be used for prediction in encoding subsequent image blocks. The coefficient image output from the inverse quantizer 105 is set back to the spatial domain by the inverse frequency transformer 106 to obtain a prediction error image. The prediction image is added to the prediction error image, and the result is input to the frame memory 107 and the intra-frame predictor 108 as a reconstructed image.
The frame memory 107 stores reconstructed images of encoded image frames input in the past. The image frames stored in the frame memory 107 are called reference frames.
The intra-frame predictor 108 refers to reconstructed images of image blocks encoded in the past within the image frame being currently encoded to generate a prediction image.
The inter-frame predictor 109 refers to reference frames supplied from the frame memory 107 to generate a prediction image.
The prediction selector 110 compares the prediction image supplied from the intra-frame predictor 108 with the prediction image supplied from the inter-frame predictor 109, and selects and outputs whichever prediction image is closer to the input image. The prediction selector 110 also outputs information (called an image prediction parameter) on the prediction method used by the intra-frame predictor 108 or the inter-frame predictor 109, and supplies the information to the variable-length encoder 103.
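One way to picture the selection is the sketch below. The text does not state how closeness to the input image is measured; the sum of absolute differences (SAD) used here, and the names `sad` and `select_prediction`, are assumptions for illustration.

```python
def sad(block, prediction):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block, prediction)
               for a, b in zip(row_a, row_b))


def select_prediction(block, intra_pred, inter_pred):
    """Return (label, prediction) for the candidate closer to the input block."""
    if sad(block, intra_pred) <= sad(block, inter_pred):
        return "intra", intra_pred
    return "inter", inter_pred
```

The returned label plays the role of a very reduced image prediction parameter; an actual parameter would also carry the prediction direction or the motion vector.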
According to the processing mentioned above, the typical video encoding device compressively encodes the input moving image to generate a bitstream.
The output bitstream is transmitted to a video decoding device. The video decoding device executes a decoding process so that the bitstream will be decompressed as a moving image.
The video decoding device shown in
The variable-length decoder 201 variable-length-decodes the input bitstream to obtain a quantization step size that controls the granularity of inverse quantization, the quantized coefficient image, and the image prediction parameter. The quantization step size and the quantized coefficient image mentioned above are supplied to the inverse quantizer 202. The image prediction parameter is supplied to the prediction selector 207.
The inverse quantizer 202 inverse-quantizes the input quantized coefficient image based on the input quantization step size, and outputs the result as a coefficient image.
The inverse frequency transformer 203 transforms the coefficient image, supplied from the inverse quantizer 202, from the frequency domain to the spatial domain, and outputs the result as a prediction error image. A prediction image supplied from the prediction selector 207 is added to the prediction error image to obtain a decoded image. The decoded image is not only output from the video decoding device as an output image, but also input to the frame memory 204 and the intra-frame predictor 205.
The frame memory 204 stores image frames decoded in the past. The image frames stored in the frame memory 204 are called reference frames.
Based on the image prediction parameter supplied from the variable-length decoder 201, the intra-frame predictor 205 refers to reconstructed images of image blocks decoded in the past within the image frame being currently decoded to generate a prediction image.
Based on the image prediction parameter supplied from the variable-length decoder 201, the inter-frame predictor 206 refers to reference frames supplied from the frame memory 204 to generate a prediction image.
The prediction selector 207 selects either of the prediction images supplied from the intra-frame predictor 205 and the inter-frame predictor 206 based on the image prediction parameter supplied from the variable-length decoder 201.
Here, a decoding process for the quantization step size at the variable-length decoder 201 is described with reference to
The entropy decoder 20111 entropy-decodes input code, and outputs a difference quantization step size dQ(i).
The quantization step size buffer 20112 holds the previous quantization step size Q(i−1).
As shown in the following equation (2), Q(i−1) supplied from the quantization step size buffer 20112 is added to the difference quantization step size dQ(i) generated by the entropy decoder 20111. The added value is not only output as a quantization step size Q(i), but also input to the quantization step size buffer 20112.
Q(i)=Q(i−1)+dQ(i) (2)
The above has described the decoding process for the quantization step size.
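Equation (2) can be sketched in the same spirit. The function name `reconstruct_step_sizes` and the parameter `initial_previous` (the assumed buffer content before the first image block) are illustrative; entropy decoding is taken as already done.

```python
def reconstruct_step_sizes(differences, initial_previous=0):
    """Return Q(i) = Q(i-1) + dQ(i) for differences listed in decoding order."""
    previous = initial_previous  # buffer content before the first block (assumed)
    step_sizes = []
    for dq in differences:
        q = previous + dq        # equation (2)
        step_sizes.append(q)
        previous = q             # the result is fed back into the buffer
    return step_sizes
```

Given the differences 0, 0, 4, -6 and an initial buffer value of 26, the function returns 26, 26, 30, 24, provided the encoder and the decoder start from the same buffer value.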
According to the processing mentioned above, the typical video decoding device decodes the bitstream to generate a moving image.
In the meantime, in order to maintain the subjective quality of the moving image to be compressed by the encoding process, the quantization controller 104 in the typical video encoding device generally analyzes either or both of the input image and the prediction error image, in addition to the output code rate, to determine a quantization step size according to the human visual sensitivity. In other words, the quantization controller 104 performs visual-sensitivity-based adaptive quantization. Specifically, when the human visual sensitivity to the current image to be encoded is determined to be high, the quantization step size is set small, while when the visual sensitivity is determined to be low, the quantization step size is set large. Since such control can assign a larger code rate to a high visual sensitivity region, the subjective quality is improved.
As a visual-sensitivity-based adaptive quantization technique, for example, adaptive quantization based on the texture complexity of an input image used in MPEG-2 Test Model 5 (TM5) is known. The texture complexity is typically called activity. Patent Literature (PTL) 1 proposes an adaptive quantization system using the activity of a prediction image in conjunction with the activity of an input image. PTL 2 proposes an adaptive quantization system based on an activity that takes edge portions into account.
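An activity-based adjustment of this kind might look like the sketch below, which is modelled on the activity normalization commonly described for TM5. It is a simplification under stated assumptions: the activity is taken as one plus the pixel variance of the whole block, and the names `block_activity` and `adaptive_step_size` are hypothetical.

```python
def block_activity(block):
    """Activity of a block: 1 plus the variance of its pixel values (simplified)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return 1.0 + variance


def adaptive_step_size(base_step, activity, average_activity):
    """Scale the base step size by a factor between 0.5 and 2 according to activity."""
    factor = (2.0 * activity + average_activity) / (activity + 2.0 * average_activity)
    return base_step * factor
```

A flat block (low activity, high visual sensitivity) receives a step size below the base value, while a highly textured block receives a step size above it.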
When the visual-sensitivity-based adaptive quantization technique is used, a problem arises if the quantization step size is changed often within an image frame. In the typical video encoding device for generating a bitstream that conforms to the AVC scheme, the quantization step size is encoded by entropy-encoding its difference from the quantization step size of the image block encoded just before the image block to be encoded. Therefore, as the change in quantization step size in the encoding sequence direction becomes large, the rate required to encode the quantization step size increases. As a result, the code rate assigned to encoding of the coefficient image is relatively reduced, and hence the image quality is degraded.
Since the encoding sequence direction is independent of the continuity of the visual sensitivity on the screen, the visual-sensitivity-based adaptive quantization technique inevitably increases the code rate required to encode the quantization step size. Therefore, even using the visual-sensitivity-based adaptive quantization technique in the typical video encoding device, the image degradation associated with the increase in the code rate for the quantization step size may cancel out the subjective quality improved by the adaptive quantization technique, i.e., there arises a problem that a sufficient improvement in image quality cannot be achieved.
To address this problem, PTL 3 discloses a technique for adaptively setting the range of values quantized to zero, i.e. a dead zone, according to the visual sensitivity in the spatial domain and the frequency domain, instead of adaptively setting the quantization step size according to the visual sensitivity. In the system described in PTL 3, the dead zone for a transform coefficient determined to be low in terms of the visual sensitivity is made wider than the dead zone for a transform coefficient determined to be high in terms of the visual sensitivity. Such control enables visual-sensitivity-based adaptive quantization without changing the quantization step size.
However, when the technique described in PTL 3 is used, quantization adaptive to the visual sensitivity cannot be performed on transform coefficients that do not fall within a dead zone. In other words, even when the visual sensitivity is determined to be low, the rate of coefficient code for the transform coefficients that do not fall within the dead zone cannot be reduced. Further, when the quantization step size is enlarged, the transform coefficient values after being subjected to quantization are concentrated near zero, while when the dead zone is widened, the transform coefficients that do not fall within the dead zone are not concentrated near zero even after being subjected to quantization. In other words, when the dead zone is widened, the entropy-encoding efficiency is insufficient compared with the case where the quantization step size is enlarged. For these reasons, it can be said that there is a problem in typical encoding technology that the assignment of the code rate to a high visual sensitivity region cannot be increased sufficiently.
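The contrast between widening the dead zone and enlarging the quantization step size can be illustrated with the following sketch. The quantizer, the function name `quantize_with_dead_zone`, and the sample values are assumptions for illustration and are not taken from PTL 3.

```python
def quantize_with_dead_zone(coeffs, step, dead_zone):
    """Zero every coefficient whose magnitude is below dead_zone; quantize the rest uniformly."""
    levels = []
    for c in coeffs:
        magnitude = abs(c)
        level = 0 if magnitude < dead_zone else int(magnitude / step + 0.5)
        levels.append(level if c >= 0 else -level)
    return levels


coeffs = [3, -9, 40, -70]
baseline = quantize_with_dead_zone(coeffs, step=8, dead_zone=4)
wide_dead_zone = quantize_with_dead_zone(coeffs, step=8, dead_zone=12)
large_step = quantize_with_dead_zone(coeffs, step=16, dead_zone=8)
```

Widening the dead zone removes the small coefficient -9 but leaves the levels of 40 and -70 unchanged, whereas enlarging the step size moves those levels toward zero as well.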
The present invention has been made in view of the above problems, and it is a first object thereof to provide a video encoding device and a video encoding method capable of changing the quantization step size frequently while suppressing an increase in code rate to achieve high-quality moving image encoding. It is a second object of the present invention to provide a video decoding device and a video decoding method capable of regenerating a high-quality moving image.
A video encoding device according to the present invention for dividing input image data into blocks of a predetermined size, and applying quantization to each divided image block to execute a compressive encoding process, comprises quantization step size encoding means for encoding a quantization step size that controls a granularity of the quantization, wherein the quantization step size encoding means predicts the quantization step size that controls the granularity of the quantization by using a quantization step size assigned to a neighboring image block already encoded.
A video decoding device according to the present invention for decoding image blocks using inverse quantization of input compressed video data to execute a process of generating image data as a set of image blocks, comprises quantization step size decoding means for decoding a quantization step size that controls a granularity of the inverse quantization, wherein the quantization step size decoding means predicts the quantization step size that controls the granularity of the inverse quantization by using a quantization step size assigned to a neighboring image block already decoded.
A video encoding method according to the present invention for dividing input image data into blocks of a predetermined size, and applying quantization to each divided image block to execute a compressive encoding process, comprises predicting a quantization step size that controls a granularity of the quantization by using a quantization step size assigned to a neighboring image block already encoded.
A video decoding method according to the present invention for decoding image blocks using inverse quantization of input compressed video data to execute a process of generating image data as a set of image blocks, comprises predicting a quantization step size that controls a granularity of the inverse quantization by using a quantization step size assigned to a neighboring image block already decoded.
According to the present invention, even when the quantization step size is changed frequently within an image frame, the video encoding device can suppress an increase in code rate associated therewith. In other words, the quantization step size can be encoded by a smaller code rate. This resolves the problem that the subjective quality improved by the visual-sensitivity-based adaptive quantization is canceled out, that is, high-quality moving image encoding can be achieved. Further, according to the present invention, since the video decoding device can decode the quantization step size frequently changed by receiving only a small code rate, a high-quality moving image can be regenerated by the small code rate.
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings.
Like the video encoding device shown in
The quantization step size buffer 10311 stores and holds quantization step sizes assigned to image blocks encoded in the past.
The predicted quantization step size generator 10313 retrieves quantization step sizes assigned to neighboring image blocks encoded in the past from the quantization step size buffer to generate a predicted quantization step size.
The predicted quantization step size supplied from the predicted quantization step size generator 10313 is subtracted from the input quantization step size, and the result is input to the entropy encoder 10312 as a difference quantization step size.
The entropy encoder 10312 entropy-encodes the input difference quantization step size and outputs the result as code corresponding to the quantization step size.
Such a structure can reduce the code rate required to encode the quantization step size, and hence high-quality moving image encoding can be achieved. The reason is that the absolute value of the difference quantization step size input to the entropy encoder 10312 can be reduced because the predicted quantization step size generator 10313 generates the predicted quantization step size using the quantization step sizes of neighboring image blocks independent of the encoding sequence. The absolute value can be reduced in this way because there is generally correlation between neighboring pixels in a moving image, and hence, when visual-sensitivity-based adaptive quantization is used, the quantization step sizes assigned to neighboring image blocks having high correlation with each other are highly similar.
A specific operation of the quantization step size encoder in the video encoding device in the first exemplary embodiment is described below by using a specific example.
In this example, it is assumed that the image block size as the unit of encoding is a fixed size. It is also assumed that three image blocks respectively adjacent leftwardly, upwardly, and diagonally right upward within the same image frame are used as neighboring image blocks used for prediction of the quantization step size.
Suppose that the current image block to be encoded is denoted by X, and three neighboring image blocks A, B, and C are located respectively adjacent leftwardly, upwardly, and diagonally right upward to the image block X as shown in
pQ(X)=Median(Q(A),Q(B),Q(C)) (3)
Note that Median(x, y, z) is a function for determining an intermediate value from three values of x, y, z.
The entropy encoder 10312 encodes a difference quantization step size dQ(X) obtained by the following equation (4) using signed Exp-Golomb (Exponential-Golomb) code as one of entropy codes, and outputs the result as code corresponding to a quantization step size for the image block concerned.
dQ(X)=Q(X)−pQ(X) (4)
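Equations (3) and (4), followed by signed Exp-Golomb coding, can be sketched as below. The mapping from signed values to code words follows the signed Exp-Golomb convention used in AVC; the text does not state that the embodiment uses exactly this mapping, so it is an assumption, as are the function names.

```python
def median3(x, y, z):
    """Intermediate value of three numbers, as Median(x, y, z) in equation (3)."""
    return sorted((x, y, z))[1]


def signed_exp_golomb(value):
    """Signed Exp-Golomb code word as a string of '0' and '1' characters."""
    # Map 0, 1, -1, 2, -2, ... to the non-negative indices 0, 1, 2, 3, 4, ...
    code_num = 2 * value - 1 if value > 0 else -2 * value
    binary = bin(code_num + 1)[2:]
    return "0" * (len(binary) - 1) + binary  # zero prefix, then the binary value


def encode_step_size(q_x, q_a, q_b, q_c):
    """Return (dQ(X), code word) for block X given neighbors A, B, and C."""
    predicted = median3(q_a, q_b, q_c)  # equation (3)
    difference = q_x - predicted        # equation (4)
    return difference, signed_exp_golomb(difference)
```

Because the code word length grows with the magnitude of the difference, a prediction that lands close to Q(X) directly shortens the code.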
In this example, the three image blocks adjacent leftwardly, upwardly, and diagonally right upward within the same image frame are used as the neighboring image blocks used for prediction of the quantization step size. However, the neighboring image blocks are not limited thereto. For example, image blocks adjacent leftwardly, upwardly, and diagonally left upward may be used to determine the predicted quantization step size by the following equation (5).
pQ(X)=Median(Q(A),Q(B),Q(D)) (5)
The number of image blocks used for prediction may be any number other than three, and a mean value or the like, rather than the intermediate value, may be used as the calculation for prediction. The image blocks used for prediction need not be adjacent to the image block to be encoded; they may be separated from it by a predetermined distance. Further, the image blocks used for prediction are not limited to image blocks located in the spatial neighborhood, i.e. within the same image frame; they may be image blocks within any other image frame already encoded.
Further, in this example, it is assumed that the image block to be encoded and the neighboring image blocks are of the same fixed size. However, the present invention is not limited to the case of the fixed size, and the block size as the unit of encoding may be a variable size.
Further, in this example, encoding is performed based on the Exp-Golomb code to encode the difference between the quantization step size of the image block to be encoded and the predicted quantization step size. However, the present invention is not limited to use of the Exp-Golomb code, and encoding may be performed based on any other entropy code. For example, encoding based on Huffman code or arithmetic code may be performed.
The above has described the video encoding device in the first exemplary embodiment of the present invention.
Like the video decoding device shown in
The entropy decoder 20111 entropy-decodes input code to output a difference quantization step size.
The quantization step size buffer 20112 stores and holds quantization step sizes decoded in the past.
Among quantization step sizes decoded in the past, the predicted quantization step size generator 20113 retrieves quantization step sizes corresponding to neighboring image blocks of the current image block to be decoded from the quantization step size buffer to generate a predicted quantization step size. Specifically, for example, the predicted quantization step size generator 20113 operates in the same way as the predicted quantization step size generator 10313 in the specific example of the video encoding device in the first exemplary embodiment.
The predicted quantization step size supplied from the predicted quantization step size generator 20113 is added to a difference quantization step size generated by the entropy decoder 20111, and the result is not only output as the quantization step size, but also input to the quantization step size buffer 20112.
Such a structure of the quantization step size decoder enables the video decoding device to decode the quantization step size by receiving only a smaller code rate. As a result, a high-quality moving image can be decoded and regenerated. The reason is that the entropy decoder 20111 only has to decode a difference quantization step size near zero, because the predicted quantization step size comes close to the actually assigned quantization step size when the predicted quantization step size generator 20113 generates the predicted quantization step size using quantization step sizes of neighboring image blocks independent of the decoding sequence. Such a close prediction can be obtained because there is generally correlation between neighboring pixels in a moving image, and hence, when visual-sensitivity-based adaptive quantization is used, the quantization step sizes assigned to neighboring image blocks having high correlation with each other are highly similar.
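The decoding side can be sketched correspondingly, assuming the median prediction of equation (3) and a signed Exp-Golomb code word that follows the convention used in AVC. The function names are hypothetical, and the input is assumed to be a well-formed string of '0' and '1' characters.

```python
def median3(x, y, z):
    """Intermediate value of three numbers."""
    return sorted((x, y, z))[1]


def decode_signed_exp_golomb(bits):
    """Decode one code word from the start of bits; return (value, bits consumed)."""
    leading_zeros = 0
    while bits[leading_zeros] == "0":
        leading_zeros += 1
    length = 2 * leading_zeros + 1
    code_num = int(bits[leading_zeros:length], 2) - 1
    # Odd indices map to positive values, even indices to zero or negative values.
    value = (code_num + 1) // 2 if code_num % 2 == 1 else -(code_num // 2)
    return value, length


def decode_step_size(bits, q_a, q_b, q_c):
    """Add the decoded difference to the prediction formed from neighbors A, B, and C."""
    difference, _ = decode_signed_exp_golomb(bits)
    return median3(q_a, q_b, q_c) + difference
```

The decoder recovers the encoder's step size only if both sides form the prediction from the same neighboring blocks, which is why the two generators are described as operating in the same way.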
The above has described the video decoding device in the second exemplary embodiment of the present invention.
Like the video encoding device in the first exemplary embodiment of the present invention, a video encoding device in a third exemplary embodiment of the present invention includes the frequency transformer 101, the quantizer 102, the variable-length encoder 103, the quantization controller 104, the inverse quantizer 105, the inverse frequency transformer 106, the frame memory 107, the intra-frame predictor 108, the inter-frame predictor 109, and the prediction selector 110 as shown in
Since the operation of the quantization step size buffer 10311 and the entropy encoder 10312 is the same as that of the quantization step size encoder in the video encoding device in the first exemplary embodiment, redundant description is omitted here.
The predicted quantization step size generator 10313 uses the image prediction parameter to select an image block to be used for prediction of the quantization step size from among image blocks encoded in the past. The predicted quantization step size generator 10313 generates a predicted quantization step size from a quantization step size corresponding to the image block selected.
Such a structure enables the video encoding device to further reduce the code rate required to encode the quantization step size compared with the video encoding device in the first exemplary embodiment. As a result, high-quality moving image encoding can be achieved. The reason is that the quantization step size can be predicted from neighboring image blocks having high correlation with the image block concerned because the predicted quantization step size generator 10313 predicts the quantization step size using the image prediction parameter.
Like the video decoding device in the second exemplary embodiment of the present invention, a video decoding device in a fourth exemplary embodiment of the present invention includes the variable-length decoder 201, the inverse quantizer 202, the inverse frequency transformer 203, the frame memory 204, the intra-frame predictor 205, the inter-frame predictor 206, and the prediction selector 207 as shown in
Since the operation of the entropy decoder 20111 and the quantization step size buffer 20112 is the same as that of the quantization step size decoder in the video decoding device in the second exemplary embodiment, redundant description is omitted here.
The predicted quantization step size generator 20113 uses the image prediction parameter to select an image block to be used for prediction of the quantization step size from among the image blocks decoded in the past. The predicted quantization step size generator 20113 generates a predicted quantization step size from a quantization step size corresponding to the image block selected. A difference quantization step size output from the entropy decoder 20111 is added to the generated predicted quantization step size, and the result is not only output as the quantization step size, but also input to the quantization step size buffer 20112.
Since the derivation method for the predicted quantization step size at the predicted quantization step size generator 20113 is the same as the generation method for the predicted quantization step size at the predicted quantization step size generator 10313 in the video encoding device in the third exemplary embodiment mentioned above, redundant description is omitted here.
Such a structure enables the video decoding device to decode the quantization step size by receiving only a further smaller code rate compared with the video decoding device in the second exemplary embodiment. As a result, a high-quality moving image can be decoded and regenerated. The reason is that the quantization step size can be predicted from neighboring image blocks having higher correlation with the image block concerned because the predicted quantization step size generator 20113 predicts the quantization step size using the image prediction parameter.
Using an example, a specific operation of the quantization step size encoder in the video encoding device in the third exemplary embodiment mentioned above is described below.
In the example, the prediction direction of intra-frame prediction is used as the image prediction parameter to be used for prediction of the quantization step size. Further, as the intra-frame prediction, directional prediction of eight directions and average prediction (illustrated in
It is assumed that the image block size as the unit of encoding is a fixed size. It is also assumed that the block as the unit of determining the quantization step size (called quantization step size transmission block) and the block as the unit of intra-frame prediction (called a prediction block) are of the same size. If the current image block to be encoded is denoted by X, and four neighborhood blocks A, B, C, and D have a positional relationship shown in
pQ(X)=pQ(B); if m=0
pQ(X)=pQ(A); if m=1
pQ(X)=(pQ(A)+pQ(B)+1)/2; if m=2
pQ(X)=pQ(C); if m=3
pQ(X)=pQ(D); if m=4
pQ(X)=(pQ(C)+pQ(D)+1)/2; if m=5
pQ(X)=(pQ(A)+pQ(D)+1)/2; if m=6
pQ(X)=(pQ(B)+pQ(D)+1)/2; if m=7
pQ(X)=pQ(A); if m=8 (6)
Note that m is an intra-prediction direction index in a frame shown in
The entropy encoder 10312 applies the quantization step size Q(X) and the predicted quantization step size pQ(X) to equation (4) to obtain a difference quantization step size dQ(X). The entropy encoder 10312 encodes the obtained difference quantization step size dQ(X) using the signed Exp-Golomb code as one of entropy codes, and outputs the result as code corresponding to a quantization step size for the image block concerned.
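Equation (6) can be sketched as a lookup on the direction index m. The arguments `q_a` to `q_d` stand for the neighbor values written pQ(A) to pQ(D) in equation (6); treating them as integers and the division by two as integer division is an assumption, as is the function name.

```python
def predict_step_size_intra(m, q_a, q_b, q_c, q_d):
    """Predicted step size pQ(X) for intra-prediction direction index m, per equation (6)."""
    def rounded_mean(x, y):
        return (x + y + 1) // 2  # assumes integer values and integer division

    table = {
        0: q_b,
        1: q_a,
        2: rounded_mean(q_a, q_b),
        3: q_c,
        4: q_d,
        5: rounded_mean(q_c, q_d),
        6: rounded_mean(q_a, q_d),
        7: rounded_mean(q_b, q_d),
        8: q_a,
    }
    return table[m]  # raises KeyError for an index outside 0 to 8
```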
In the example, directional prediction of eight directions and average prediction are used as intra-frame prediction, but the present invention is not limited thereto. For example, directional prediction of 33 directions described in NPL 2 and average prediction may be used, or any other intra-frame prediction may be used.
Further, the number of image blocks used for prediction may be any number other than four. In the example, as shown in the equation (6) mentioned above, either a quantization step size in any one of image blocks or an average value of quantization step sizes in two image blocks is used as the predicted quantization step size. However, the present invention is not limited to equation (6) mentioned above, and any other calculation result may be used as the predicted quantization step size. For example, as shown in the following equation (7), either a quantization step size in any one of image blocks or an intermediate value of three quantization step sizes may be used, or the predicted quantization step size may be determined using any other calculation. Further, the image blocks used for prediction need not be adjacent to the current image block to be encoded; they may be separated from it by a predetermined distance.
pQ(X)=pQ(B); if m=0, 5 or 7
pQ(X)=pQ(A); if m=1, 6 or 8
pQ(X)=pQ(C); if m=3
pQ(X)=pQ(D); if m=4
pQ(X)=Median(pQ(A),pQ(B),pQ(C)); if m=2 (7)
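The variant of equation (7) can be sketched in the same manner. As before, `q_a` to `q_d` stand for the neighbor values written pQ(A) to pQ(D), and the function name is hypothetical.

```python
def predict_step_size_intra_alt(m, q_a, q_b, q_c, q_d):
    """Predicted step size pQ(X) for intra-prediction direction index m, per equation (7)."""
    if m in (0, 5, 7):
        return q_b
    if m in (1, 6, 8):
        return q_a
    if m == 3:
        return q_c
    if m == 4:
        return q_d
    if m == 2:
        return sorted((q_a, q_b, q_c))[1]  # intermediate value of the three
    raise ValueError("direction index must be in the range 0 to 8")
```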
In the example, it is assumed that the image block to be encoded and the neighboring image blocks are of the same fixed size. However, the present invention is not limited to the fixed size, and the block as the unit of encoding may be of a variable size.
Further, in the example, it is assumed that the quantization step size transmission block and the prediction block are of the same size. However, the present invention is not limited to the same size, and the quantization step size transmission block and the prediction block may be of different sizes. For example, if two or more prediction blocks are included in the quantization step size transmission block, any one of the two or more prediction blocks may be used for prediction of the quantization step size. Alternatively, the result of applying any calculation, such as an intermediate value calculation or an average value calculation, to the prediction directions of the two or more prediction blocks may be used for prediction of the quantization step size.
Further, in the example, the difference between the quantization step size of the image block to be encoded and the predicted quantization step size is encoded based on the Exp-Golomb code. However, the present invention is not limited to use of the Exp-Golomb code, and encoding based on any other entropy code may be performed. For example, encoding based on Huffman code or arithmetic code may be performed.
Using another example, a specific operation of the quantization step size encoder in the video encoding device in the third exemplary embodiment mentioned above is described below.
In this example, a motion vector of inter-frame prediction is used as the image prediction parameter used for prediction of the quantization step size. Prediction defined by the translation of block units as shown in FIG. 7 is assumed as inter-frame prediction. It is assumed that a prediction image is generated from an image block located at a position displaced, by an amount corresponding to the motion vector, from the position in the reference frame that spatially coincides with the block to be encoded. Also, as shown in
Here, the block to be encoded is denoted by X, the center position of block X is denoted by cent(X), the motion vector in inter-frame prediction of X is denoted by V(X), and the reference frame to be referred to in inter-frame prediction is denoted by RefPic(X). Then, as shown in
pQ(X)=Q(Block(RefPic(X),cent(X)+V(X))) (8)
The derivation of dQ(X) and the encoding process by the entropy encoder 10312 are the same as those in the first example.
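The lookup of equation (8) may be sketched as follows, assuming that the quantization step sizes of the reference frame are held in a two-dimensional list with one entry per fixed-size block and that positions and motion vectors are integer pixel offsets. The names block_qstep and predict_qstep_mv, the list layout, and the clamping of positions that fall outside the frame are assumptions made for this sketch only.

```python
def block_qstep(q_map, block_size, pos):
    """Step size of the block of a frame to which pixel position pos belongs."""
    x, y = pos
    rows, cols = len(q_map), len(q_map[0])
    # Clamp so that positions outside the frame map to the nearest block
    # (an assumption; the text does not specify the out-of-frame behaviour).
    col = min(max(x // block_size, 0), cols - 1)
    row = min(max(y // block_size, 0), rows - 1)
    return q_map[row][col]


def predict_qstep_mv(ref_q_map, block_size, center, mv):
    """pQ(X) = Q(Block(RefPic(X), cent(X) + V(X)))."""
    pos = (center[0] + mv[0], center[1] + mv[1])
    return block_qstep(ref_q_map, block_size, pos)


ref_q_map = [[20, 22],
             [24, 26]]  # step sizes of a 2x2-block reference frame
pq = predict_qstep_mv(ref_q_map, 16, center=(8, 8), mv=(10, 0))
print(pq)        # 22
print(25 - pq)   # dQ(X) when Q(X) = 25
```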
In the example, one-directional prediction is assumed, but the present invention is not limited to use of one-directional prediction. For example, in the case of bi-directional prediction, where a prediction image is generated by weighted averaging reference image blocks in two reference frames, if one reference frame is denoted by RefPic0(X), a motion vector to RefPic0(X) is denoted by V0(X), the other reference frame is denoted by RefPic1(X), a motion vector to RefPic1(X) is denoted by V1(X), a weight given to RefPic0(X) upon generation of the prediction image is denoted by w0, and a weight given to RefPic1(X) is denoted by w1, the predicted quantization step size generator 10313 may determine the predicted quantization step size pQ(X) by the following equation (9).
pQ(X)=w0·Q(Block(RefPic0(X),cent(X)+V0(X)))+w1·Q(Block(RefPic1(X),cent(X)+V1(X))) (9)
Further, in the example, the quantization step size of the block to which the center position of the reference image block belongs is used as the predicted quantization step size, but the predicted quantization step size is not limited thereto. For example, a quantization step size of a block to which an upper left position of the reference image block belongs may be used as the predicted quantization step size. Alternatively, the quantization step sizes of the blocks to which the pixels of the reference image block belong may each be referred to, and an average value of these quantization step sizes may be used as the predicted quantization step size.
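A minimal sketch of the weighted combination in equation (9) is given below. It assumes one two-dimensional list of step sizes per reference frame, integer pixel positions, and weights supplied as floating-point numbers; the names qstep_at and predict_qstep_bi are hypothetical, and whether the result is rounded to an integer step size is left open because the text does not state it.

```python
def qstep_at(q_map, block_size, pos):
    """Step size of the block containing pos, clamped to the frame (assumed)."""
    col = min(max(pos[0] // block_size, 0), len(q_map[0]) - 1)
    row = min(max(pos[1] // block_size, 0), len(q_map) - 1)
    return q_map[row][col]


def predict_qstep_bi(q_map0, mv0, w0, q_map1, mv1, w1, block_size, center):
    """Weighted sum of the step sizes found in the two reference frames."""
    q0 = qstep_at(q_map0, block_size, (center[0] + mv0[0], center[1] + mv0[1]))
    q1 = qstep_at(q_map1, block_size, (center[0] + mv1[0], center[1] + mv1[1]))
    return w0 * q0 + w1 * q1


q_map0 = [[20, 22], [24, 26]]
q_map1 = [[30, 32], [34, 36]]
print(predict_qstep_bi(q_map0, (10, 0), 0.5, q_map1, (0, 12), 0.5, 16, (8, 8)))  # 28.0
```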
Further, in the example, prediction represented by the translation between blocks is assumed as inter-frame prediction. However, the reference image block is not limited thereto, and it may be of any shape.
Further, in the example, it is assumed that the quantization step size transmission blocks and the prediction block are of the same size. However, like in the first example of the video encoding device in the third exemplary embodiment mentioned above, the quantization step size transmission blocks and the prediction block may be of sizes different from each other.
Using still another example, a specific operation of the quantization step size encoder in the video encoding device in the third exemplary embodiment mentioned above is described below.
In the example, a prediction of the motion vector of inter-frame prediction, i.e. a predicted motion vector, is used as the image prediction parameter used for prediction of the quantization step size. When the predicted motion vector is derived from neighboring image blocks of the block to be encoded, the quantization step sizes of the neighboring image blocks used for derivation of the predicted motion vector are used to predict the quantization step size of the block to be encoded.
In the example, it is assumed that the quantization step size transmission blocks and the prediction block are of the same size. Also, like in the second example of the video encoding device in the third exemplary embodiment mentioned above, one-directional prediction represented by a motion vector is assumed as inter-frame prediction. In the example, a predicted motion vector derived by a predetermined method is subtracted from the motion vector shown in
Here, the predicted motion vector derivation method used in the example is briefly described. The block to be encoded is denoted by X, and blocks adjacent leftwardly, upwardly, diagonally right upward, diagonally left upward, and diagonally left downward as shown in
Further, a motion vector determined by the following equation (10) is denoted by mvMed, and a motion vector of a block in the same spatial position as the block to be encoded on a reference frame assigned to the image frame to be encoded (illustrated as an in-phase block XCol with respect to the block X to be encoded in
mvMed=(mvMedx,mvMedy)
mvMedx=Median(mvAx,mvBx,mvCx)
mvMedy=Median(mvAy,mvBy,mvCy) (10)
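Equation (10) takes the median separately for the horizontal and vertical components, which may be sketched as follows. The function name median_mv and the representation of a motion vector as an (x, y) tuple are assumptions made for illustration.

```python
from statistics import median


def median_mv(mv_a, mv_b, mv_c):
    """Component-wise median of three motion vectors given as (x, y) tuples."""
    med_x = median([mv_a[0], mv_b[0], mv_c[0]])
    med_y = median([mv_a[1], mv_b[1], mv_c[1]])
    return (med_x, med_y)


print(median_mv((4, -2), (1, 3), (7, 0)))  # (4, 0)
```

Because each component is chosen independently, mvMed need not coincide with any one of mvA, mvB, and mvC; in the usage above, (4, 0) is none of the three inputs.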
As described above, five motion vectors, i.e. mvMed, mvA, mvB, mvC, and mvCol are candidates for the predicted motion vector in the block X to be encoded. Any one motion vector is selected according to a predetermined priority order from among the candidates, and set as the predicted motion vector pMV(X) of the block to be encoded. An example of the predetermined priority order is described in “8.4.2.1.4 Derivation process for luma motion vector prediction” and “8.4.2.1.8 Removal process for motion vector prediction” of NPL 2.
When the predicted motion vector pMV(X) is determined as mentioned above, the predicted quantization step size generator 10313 determines a predicted quantization step size pQ(X) of the block X to be encoded by the following equation (11).
pQ(X)=Q(A); if pMV(X)=mvA
pQ(X)=Q(B); otherwise if pMV(X)=mvB
pQ(X)=Q(C); otherwise if pMV(X)=mvC, and mvC is the motion vector of block C
pQ(X)=Q(D); otherwise if pMV(X)=mvC, and mvC is the motion vector of block D
pQ(X)=Q(E); otherwise if pMV(X)=mvC, and mvC is the motion vector of block E
pQ(X)=Q(XCol); otherwise if pMV(X)=mvCol
pQ(X)=Median(Q(A),Q(B),Q(C)); otherwise (11)
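The cascade of equation (11) may be sketched as follows. The function name predict_qstep_from_pmv, the dictionaries mv and q, and the argument mvc_block (which of blocks C, D, or E supplied mvC) are hypothetical names introduced for this sketch; motion vectors are compared by value in the order in which equation (11) lists the cases.

```python
from statistics import median


def predict_qstep_from_pmv(pmv, mv, mvc_block, q):
    """Pick the step size of the block whose motion vector was chosen as pMV(X).

    pmv       : selected predicted motion vector, as an (x, y) tuple
    mv        : candidate vectors keyed by 'A', 'B', 'C', 'Col'
    mvc_block : 'C', 'D' or 'E', the block that actually supplied mvC
    q         : step sizes keyed by 'A', 'B', 'C', 'D', 'E', 'XCol'
    """
    if pmv == mv["A"]:
        return q["A"]
    if pmv == mv["B"]:
        return q["B"]
    if pmv == mv["C"]:
        return q[mvc_block]
    if pmv == mv["Col"]:
        return q["XCol"]
    # pMV(X) matched no single block (e.g. it is mvMed): fall back to the median.
    return median([q["A"], q["B"], q["C"]])


mv = {"A": (4, -2), "B": (1, 3), "C": (7, 0), "Col": (2, 2)}
q = {"A": 26, "B": 30, "C": 28, "D": 24, "E": 32, "XCol": 22}
print(predict_qstep_from_pmv((7, 0), mv, "D", q))  # 24
print(predict_qstep_from_pmv((4, 0), mv, "C", q))  # 28
```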
In the example, one-directional prediction is assumed, but the present invention is not limited to use of one-directional prediction. Like in the second example of the video encoding device in the third exemplary embodiment mentioned above, this example can also be applied to bi-directional prediction.
Further, in the example, the predicted motion vector derivation method described in “8.4.2.1.4 Derivation process for luma motion vector prediction” of NPL 2 is used as the predicted motion vector derivation method, but the present invention is not limited thereto. For example, as described in “8.4.2.1.3 Derivation process for luma motion vectors for merge mode” of NPL 2, if the motion vector of the block X to be encoded is predicted by a motion vector of either block A or block B, the predicted quantization step size generator 10313 may determine the predicted quantization step size pQ(X) of the block X to be encoded by the following equation (12), or any other predicted motion vector derivation method may be used.
pQ(X)=Q(A); if pMV(X)=mvA
pQ(X)=Q(B); otherwise (12)
Further, in the example, the image blocks used for prediction of the quantization step size are referred to as shown in equation (11) in order of blocks A, B, C, D, E, and XCol. However, the present invention is not limited to this order, and any order may be used. As for the number and positions of image blocks used for prediction of the quantization step size, any number and positions of image blocks may be used. Further, in the example, an intermediate value calculation like in equation (3) is used when pMV(X) agrees with none of mvA, mvB, mvC, and mvCol, but the present invention is not limited to use of the intermediate value calculation. Any calculation such as the average value calculation like in the first exemplary embodiment may also be used.
Further, in the example, it is assumed that the quantization step size transmission blocks and the prediction block are of the same size. However, the quantization step size transmission blocks and the prediction block may be of sizes different from each other like in the first example and second example of the video encoding device in the third exemplary embodiment mentioned above.
In comparison with the video encoding device shown in
Further, as shown in
The quantization step size prediction controller 111 supplies control information for controlling the quantization step size prediction operation of the predicted quantization step size generator 10313 to the variable-length encoder 103 and the multiplexer 112. The control information for controlling the quantization step size prediction operation is called a quantization step size prediction parameter.
The multiplexer 112 multiplexes the quantization step size prediction parameter into a video bitstream supplied from the variable-length encoder 103, and outputs the result as a bitstream.
Using the image prediction parameter and the quantization step size prediction parameter, the predicted quantization step size generator 10313 selects an image block used for prediction of the quantization step size from among image blocks encoded in the past. The predicted quantization step size generator 10313 also generates a predicted quantization step size from a quantization step size corresponding to the image block selected.
Such a structure of the video encoding device in the exemplary embodiment can further reduce the code rate required to encode the quantization step size in comparison with the video encoding device in the third exemplary embodiment. As a result, high-quality moving image encoding can be achieved. The reason is that the quantization step size can be predicted for the image block with a higher accuracy, because the predicted quantization step size generator 10313 uses the quantization step size prediction parameter in addition to the image prediction parameter to switch or correct a prediction value of the quantization step size using the image prediction parameter. The reason why the quantization step size can be predicted with a higher accuracy by switching or correction using the quantization step size prediction parameter is because the quantization controller 104 shown in
A specific operation of the quantization step size encoder in the video encoding device in the fifth exemplary embodiment mentioned above is described using a specific example below.
In this example, like in the second example of the video encoding device in the third exemplary embodiment mentioned above, a motion vector of inter-frame prediction is used as the image prediction parameter used for prediction of the quantization step size. Prediction defined by the translation of block units as shown in
Here, the block to be encoded is denoted by X, the frame to be encoded is denoted by Pic(X), the center position of block X is denoted by cent(X), the motion vector in inter-frame prediction of X is denoted by V(X), and the reference frame to be referred to in inter-frame prediction is denoted by RefPic(X). Then, as shown in
pQ(X)=Q(Block(RefPic(X),cent(X)+V(X))); if temporal_qp_pred_flag=1
pQ(X)=Median(pQ(A),pQ(B),Q(C)); otherwise (13)
Here, temporal_qp_pred_flag is a flag that switches whether or not the motion vector between frames can be used for prediction of the quantization step size. The flag is supplied from the quantization step size prediction controller 111 to the predicted quantization step size generator 10313.
The predicted quantization step size generator 10313 may also use an offset value for compensating for a change in quantization step size between the frame Pic(X) to be encoded and the reference frame RefPic(X), i.e. an offset to the quantization step size Qofs(Pic(X), RefPic(X)) to determine the predicted quantization step size pQ(X) by the following equation (14).
pQ(X)=Q(Block(RefPic(X),cent(X)+V(X)))+Qofs(Pic(X),RefPic(X)) (14)
Further, the predicted quantization step size generator 10313 may use both temporal_qp_pred_flag mentioned above and the offset to the quantization step size to determine the predicted quantization step size pQ(X) by the following equation (15).
pQ(X)=Q(Block(RefPic(X),cent(X)+V(X)))+Qofs(Pic(X),RefPic(X)); if temporal_qp_pred_flag=1
pQ(X)=Median(pQ(A),pQ(B),Q(C)); otherwise (15)
For example, if the initial quantization step size of any frame Z is denoted by Qinit(Z), the offset to the quantization step size Qofs(Pic(X), RefPic(X)) in equations (14) and (15) mentioned above may be determined by the following equation (16).
Qofs(Pic(X),RefPic(X))=Qinit(Pic(X))−Qinit(RefPic(X)) (16)
The initial quantization step size is a value given as the initial value of the quantization step size for each frame, and SliceQPY described in “7.4.3 Slice header semantics” of NPL 1 may be used, for example.
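Equations (15) and (16) together may be sketched as follows. The function names and argument names are hypothetical; q_ref_block stands for the step size already looked up in the reference frame, and q_neighbors for the three spatially neighboring step sizes passed to Median.

```python
from statistics import median


def qstep_offset(q_init_cur, q_init_ref):
    """Equation (16): Qofs = Qinit(Pic(X)) - Qinit(RefPic(X))."""
    return q_init_cur - q_init_ref


def predict_qstep_switched(temporal_qp_pred_flag, q_ref_block,
                           q_init_cur, q_init_ref, q_neighbors):
    """Equation (15): temporal prediction with offset, or spatial median."""
    if temporal_qp_pred_flag == 1:
        return q_ref_block + qstep_offset(q_init_cur, q_init_ref)
    return median(q_neighbors)


# The reference frame was coded 4 steps finer than the current frame overall,
# so the step size found in the reference frame is raised by 4.
print(predict_qstep_switched(1, 24, 30, 26, (20, 30, 22)))  # 28
print(predict_qstep_switched(0, 24, 30, 26, (20, 30, 22)))  # 22
```

The offset compensates for a frame-level difference in quantization, so that only the block-level variation remains to be coded as the difference quantization step size.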
For example, as illustrated in a list shown in
In the list shown in
In the example, the motion vector of inter-frame prediction is assumed as the image prediction parameter. However, the present invention is not limited to use of the motion vector of inter-frame prediction. Like in the first example of the video encoding device in the third exemplary embodiment mentioned above, the prediction direction of intra-frame prediction may be used, in which case the flag mentioned above switches whether or not the prediction direction of intra-frame prediction is used for prediction of the quantization step size. Like in the third example of the video encoding device in the third exemplary embodiment mentioned above, the prediction direction of the predicted motion vector may be used, or any other image prediction parameter may be used.
Further, in the example, one-directional prediction is assumed as inter-frame prediction. However, the present invention is not limited to use of one-directional prediction. Like in the second example of the video encoding device in the third exemplary embodiment mentioned above, the present invention can also be applied to bi-directional prediction.
Further, in the example, the quantization step size of a block to which the center position of the reference image block belongs is used as the predicted quantization step size. However, the derivation of the predicted quantization step size in the present invention is not limited thereto. For example, the quantization step size of a block to which the upper left position of the reference image block belongs may be used as the predicted quantization step size. Alternatively, the quantization step sizes of the blocks to which the pixels of the reference image block belong may each be referred to, and an average value of these quantization step sizes may be used as the predicted quantization step size.
Further, in the example, prediction represented by the translation between blocks of the same shape is assumed as inter-frame prediction. However, the reference image block in the present invention is not limited thereto, and it may be of any shape.
Further, in the example, as shown in equation (13) and equation (15), when inter-frame prediction information is not used, the quantization step size is predicted from three spatially neighboring image blocks based on the intermediate value calculation, but the present invention is not limited thereto. Like in the specific example of the first exemplary embodiment, the number of image blocks used for prediction may be any number other than three, and an average value calculation or the like may be used instead of the intermediate value calculation. Further, the image blocks used for prediction need not be adjacent to the current image block to be encoded, and the image blocks may be separated by a predetermined distance from the current image block to be encoded.
Further, in the example, it is assumed that the quantization step size transmission blocks and the prediction block are of the same size, but like in the first example of the video encoding device in the third exemplary embodiment mentioned above, the quantization step size transmission blocks and the prediction block may be of sizes different from each other.
In comparison with the video decoding device shown in
Further, in comparison with the quantization step size decoder shown in
The de-multiplexer 208 de-multiplexes a bitstream to extract a video bitstream and control information for controlling the quantization step size prediction operation. The de-multiplexer 208 further supplies the extracted control information to the quantization step size prediction controller 209, and the extracted video bitstream to the variable-length decoder 201, respectively.
The quantization step size prediction controller 209 sets up the operation of the predicted quantization step size generator 20113 based on the control information supplied.
The predicted quantization step size generator 20113 uses the image prediction parameter and the quantization step size prediction parameter to select an image block used for prediction of the quantization step size from among the image blocks decoded in the past. The predicted quantization step size generator 20113 further generates a predicted quantization step size from a quantization step size corresponding to the selected image block. A difference quantization step size output from the entropy decoder 20111 is added to the generated predicted quantization step size, and the result is not only output as the quantization step size, but also input to the quantization step size buffer 20112.
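The relationship between the encoder side and the decoder side may be sketched as follows: the encoder transmits only the difference from the prediction, and the decoder adds the same prediction back and stores the result in its buffer. The predictor used here, which simply returns the most recently stored step size or a default of 26, is a placeholder chosen for brevity and is not any of the prediction methods described above; all function names are hypothetical.

```python
def last_or_default(buffer, default=26):
    """Placeholder predictor: the most recently stored step size."""
    return buffer[-1] if buffer else default


def encode_diffs(q_values, predictor):
    """Encoder side: difference between each step size and its prediction."""
    buffer, diffs = [], []
    for q in q_values:
        diffs.append(q - predictor(buffer))
        buffer.append(q)  # corresponds to the quantization step size buffer
    return diffs


def decode_qsteps(diffs, predictor):
    """Decoder side: add each decoded difference to the same prediction."""
    buffer = []
    for dq in diffs:
        q = predictor(buffer) + dq
        buffer.append(q)  # the result is output and also stored for later use
    return buffer


diffs = encode_diffs([26, 28, 28, 24], last_or_default)
print(diffs)                                   # [0, 2, 0, -4]
print(decode_qsteps(diffs, last_or_default))   # [26, 28, 28, 24]
```

The round trip succeeds only because both sides derive the prediction from identical, previously reconstructed values, which is why the decoder must use the same derivation method as the encoder.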
Since the derivation method for the predicted quantization step size at the predicted quantization step size generator 20113 is the same as the generation method for the predicted quantization step size at the predicted quantization step size generator 10313 in the video encoding device in the fifth exemplary embodiment mentioned above, redundant description is omitted here.
Such a structure enables the video decoding device to decode the quantization step size while receiving an even smaller code rate than the video decoding device in the fourth exemplary embodiment. As a result, a high-quality moving image can be decoded and regenerated. The reason is that the quantization step size can be predicted for the image block with a higher accuracy because the predicted quantization step size generator 20113 uses the quantization step size prediction parameter in addition to the image prediction parameter to switch or correct the value of the quantization step size predicted using the image prediction parameter.
Like the video encoding device in the third exemplary embodiment, a video encoding device in a seventh exemplary embodiment of the present invention includes the frequency transformer 101, the quantizer 102, the variable-length encoder 103, the quantization controller 104, the inverse quantizer 105, the inverse frequency transformer 106, the frame memory 107, the intra-frame predictor 108, the inter-frame predictor 109, and the prediction selector 110 as shown in
Since the operation of the quantization step size buffer 10311, the entropy encoder 10312, and the predicted quantization step size generator 10313 is the same as the operation of the quantization step size encoder in the video encoding device in the third exemplary embodiment, redundant description is omitted here.
The quantization step size selector 10314 selects either a quantization step size assigned to the previously encoded image block or a predicted quantization step size output from the predicted quantization step size generator 10313 according to the image prediction parameter, and outputs the result as a selectively predicted quantization step size. The quantization step size assigned to the previously encoded image block is saved in the quantization step size buffer 10311. The selectively predicted quantization step size output from the quantization step size selector 10314 is subtracted from the quantization step size input to the quantization step size encoder and to be currently encoded, and the result is input to the entropy encoder 10312.
Such a structure enables the video encoding device in the exemplary embodiment to further reduce the code rate required to encode the quantization step size compared with the video encoding device in the third exemplary embodiment. As a result, high-quality moving image encoding can be achieved. The reason is that the quantization step size selector 10314 allows the quantization step size to be encoded by selectively using the predicted quantization step size derived from the image prediction parameter and the previously encoded quantization step size. Selective use of these two values further reduces the code rate required to encode the quantization step size because the quantization controller 104 in the encoding device not only performs visual-sensitivity-based adaptive quantization but also monitors the output code rate to increase or decrease the quantization step size as described above.
A specific operation of the quantization step size encoder in the video encoding device in the seventh exemplary embodiment is described using a specific example below.
Here, the prediction direction of intra-frame prediction is used as the image prediction parameter used for prediction of the quantization step size. Further, as the intra-frame prediction, directional prediction of eight directions and average prediction (see
It is assumed that the image block size as the unit of encoding is a fixed size. It is also assumed that the block as the unit of determining the quantization step size (called quantization step size transmission block) and the block as the unit of intra-frame prediction (called a prediction block) are of the same size. If the current image block to be encoded is denoted by X, and four neighborhood blocks A, B, C, and D have a positional relationship shown in
The quantization step size selector 10314 selects either the predicted quantization step size pQ(X) obtained by equation (6) or the previously encoded quantization step size Q(Xprev) according to the following equation (17) to generate a selectively predicted quantization step size sQ(X), i.e. the predicted quantization step size determined by equation (6) is used as the selectively predicted quantization step size for directional prediction and the previous quantization step size is used as the selectively predicted quantization step size for average value prediction.
sQ(X)=Q(Xprev); if m=2
sQ(X)=pQ(X); if m=0, 1, 3, 4, 5, 6, 7 or 8 (17)
Note that m is an intra-frame prediction direction index in the frame shown in
The entropy encoder 10312 encodes a difference quantization step size dQ(X) obtained by the following equation (18) using the signed Exp-Golomb (Exponential-Golomb) code as one of entropy codes, and outputs the result as code corresponding to a quantization step size for the image block concerned.
dQ(X)=Q(X)−sQ(X) (18)
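Equations (17) and (18) may be sketched as follows. The function names select_predicted_qstep and diff_qstep are hypothetical; pq stands for the value obtained by equation (6) and q_prev for the previously encoded step size Q(Xprev) read from the quantization step size buffer.

```python
def select_predicted_qstep(m, pq, q_prev):
    """Equation (17): choose between pQ(X) and Q(Xprev) by direction index m."""
    if m == 2:
        # Average prediction carries no direction, so the previously encoded
        # step size is used instead of the direction-based prediction.
        return q_prev
    if m in (0, 1, 3, 4, 5, 6, 7, 8):
        return pq
    raise ValueError("equation (17) only defines m = 0..8")


def diff_qstep(q, sq):
    """Equation (18): dQ(X) = Q(X) - sQ(X)."""
    return q - sq


sq = select_predicted_qstep(2, pq=28, q_prev=30)
print(sq, diff_qstep(31, sq))  # 30 1
```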
In the exemplary embodiment, directional prediction of eight directions and average prediction are used as intra-frame prediction, but the present invention is not limited thereto. For example, directional prediction of 33 directions described in NPL 2 and average prediction may be used, or any other intra-frame prediction may be used.
Further, in the exemplary embodiment, the selection between the predicted quantization step size and the previously encoded quantization step size is made based on the parameters of intra-frame prediction, but the present invention is not limited to use of intra-frame prediction information. For example, selections may be made to use the predicted quantization step size in the intra-frame prediction block and the previously encoded quantization step size in the inter-frame prediction block, or vice versa. When the parameters of inter-frame prediction meet a certain specific condition, a selection may be made to use the previously encoded quantization step size.
The number of image blocks used for prediction may be any number other than four. Further, in the exemplary embodiment, either a quantization step size in any one of image blocks or an average value of quantization step sizes in two image blocks is used as the predicted quantization step size as shown in equation (6). However, the predicted quantization step size is not limited to those in equation (6). Any other calculation result may be used as the predicted quantization step size. For example, as shown in equation (7), either a quantization step size in any one of image blocks or an intermediate value of three quantization step sizes may be used, or the predicted quantization step size may be determined using any other calculation. Further, the image blocks used for prediction need not be adjacent to the current image block to be encoded. The image blocks used for prediction may be separated by a predetermined distance from the current image block to be encoded.
Further, in the exemplary embodiment, it is assumed that the image block to be encoded and the image blocks used for prediction are of the same fixed size. However, the present invention is not limited to the case where the image block as the unit of encoding is of a fixed size. The image block as the unit of encoding may be of a variable size, and the image block to be encoded and the image blocks used for prediction may be of sizes different from each other.
Further, in the exemplary embodiment, it is assumed that the quantization step size transmission blocks and the prediction block are of the same size. However, the present invention is not limited to the case of the same size, and the quantization step size transmission blocks and the prediction block may be of different sizes. For example, when two or more prediction blocks are included in the quantization step size transmission blocks, the prediction direction of any one prediction block of the two or more prediction blocks may be used for prediction of the quantization step size. Alternatively, the result of applying any calculation, such as the intermediate value calculation or the average value calculation, to the prediction directions of the two or more prediction blocks may be used for prediction of the quantization step size.
Further, in the exemplary embodiment, the difference between the quantization step size of the image block to be encoded and the predicted quantization step size is encoded based on the Exp-Golomb code. However, the present invention is not limited to use of the Exp-Golomb code, and encoding based on any other entropy code may be performed. For example, encoding based on Huffman code or arithmetic code may be performed.
Like the video decoding device in the fourth exemplary embodiment of the present invention, a video decoding device in an eighth exemplary embodiment of the present invention includes the variable-length decoder 201, the inverse quantizer 202, the inverse frequency transformer 203, the frame memory 204, the intra-frame predictor 205, the inter-frame predictor 206, and the prediction selector 207 as shown in
Since the operation of the entropy decoder 20111, the quantization step size buffer 20112, and the predicted quantization step size generator 20113 is the same as the operation of the quantization step size decoder in the video decoding device in the fourth exemplary embodiment, redundant description is omitted here.
The quantization step size selector 20114 selects either a quantization step size assigned to the previously decoded image block or a predicted quantization step size output from the predicted quantization step size generator 20113 according to the image prediction parameter, and outputs the result as a selectively predicted quantization step size. The quantization step size assigned to the previously decoded image block is saved in the quantization step size buffer 20112. A difference quantization step size generated by the entropy decoder 20111 is added to the selectively predicted quantization step size output, and the result is not only output as the quantization step size, but also stored in the quantization step size buffer 20112.
Such a structure enables the video decoding device to decode the quantization step size while receiving an even smaller code rate than the video decoding device in the fourth exemplary embodiment. As a result, a high-quality moving image can be decoded and regenerated. The reason is that the quantization step size selector 20114 selectively uses the predicted quantization step size derived from the image prediction parameter and the previously decoded quantization step size. The quantization step size can therefore be decoded with a smaller code rate for a bitstream generated by applying both the visual-sensitivity-based adaptive quantization and the increase or decrease in quantization step size resulting from monitoring the output code rate, and hence a moving image can be decoded and regenerated with the smaller code rate.
Each of the exemplary embodiments mentioned above may be realized by hardware or by a computer program.
An information processing system shown in
In the information processing system shown in
Part or all of the aforementioned exemplary embodiments can be described as Supplementary notes mentioned below, but the structure of the present invention is not limited to the following structures.
(Supplementary Note 1)
A video encoding device for dividing input image data into blocks of a predetermined size, and applying quantization to each divided image block to execute a compressive encoding process, comprising quantization step size encoding means for encoding a quantization step size that controls the granularity of quantization, and prediction image generation means for using an image encoded in the past and a predetermined parameter to generate a prediction image of an image block to be encoded, the quantization step size encoding means for predicting the quantization step size by using the parameter used by the prediction image generation means, wherein the prediction image generation means generates the prediction image by using at least inter-frame prediction, and the quantization step size encoding means uses a motion vector of the inter-frame prediction to predict the quantization step size.
(Supplementary Note 2)
A video encoding device for dividing input image data into blocks of a predetermined size, and applying quantization to each divided image block to execute a compressive encoding process, comprising quantization step size encoding means for encoding a quantization step size that controls the granularity of quantization, and prediction image generation means for generating a prediction image of an image block to be encoded by using an image encoded in the past and a predetermined parameter, the quantization step size encoding means for predicting the quantization step size by using the parameter used by the prediction image generation means, wherein the quantization step size encoding means predicts the quantization step size by using a quantization step size assigned to a neighboring image block already encoded, the prediction image generation means generates the prediction image by using at least inter-frame prediction, predicted motion vector generation means for predicting a motion vector used for inter-frame prediction by using a motion vector assigned to the neighboring image block already encoded is further comprised, and the quantization step size encoding means uses a prediction direction of the predicted motion vector to predict the quantization step size.
(Supplementary Note 3)
A video decoding device for decoding image blocks using inverse quantization of input compressed video data to execute a process of generating image data as a set of the image blocks, comprising quantization step size decoding means for decoding a quantization step size that controls a granularity of inverse quantization, and prediction image generation means for generating a prediction image of an image block to be decoded by using an image decoded in the past and a predetermined parameter, the quantization step size decoding means for predicting the quantization step size by using the parameter assigned to a neighboring image block already decoded, wherein the quantization step size decoding means predicts the quantization step size by using the parameter used to generate the prediction image, the prediction image generation means generates the prediction image by using at least inter-frame prediction, and the quantization step size decoding means uses a motion vector of the inter-frame prediction to predict the quantization step size.
(Supplementary Note 4)
A video decoding device for decoding image blocks using inverse quantization of input compressed video data to execute a process of generating image data as a set of the image blocks, comprising quantization step size decoding means for decoding a quantization step size that controls the granularity of inverse quantization, and prediction image generation means for generating a prediction image of an image block to be decoded by using an image decoded in the past and a predetermined parameter, the quantization step size decoding means for predicting the quantization step size by using a quantization step size assigned to a neighboring image block already decoded, wherein the quantization step size decoding means predicts the quantization step size using the parameter used to generate the prediction image, the prediction image generation means generates the prediction image using at least inter-frame prediction, predicted motion vector generation means for using a motion vector assigned to the neighboring image block already encoded to predict a motion vector used for inter-frame prediction is further comprised, and the quantization step size decoding means uses a prediction direction of the predicted motion vector to predict the quantization step size.
(Supplementary Note 5)
A video encoding method for dividing input image data into blocks of a predetermined size, and applying quantization to each divided image block to execute a compressive encoding process, comprising a step of predicting a quantization step size that controls the granularity of quantization using a quantization step size assigned to a neighboring image block already encoded, and a step of generating a prediction image of an image block to be encoded by using an image encoded in the past and a predetermined parameter, wherein the quantization step size is predicted by using the parameter used to generate the prediction image.
(Supplementary Note 6)
The video encoding method according to Supplementary note 5, wherein the prediction image is generated using at least intra-frame prediction in the step of generating the prediction image, and a prediction direction of the intra-frame prediction is used to predict the quantization step size.
(Supplementary Note 7)
The video encoding method according to Supplementary note 5, wherein the prediction image is generated using at least inter-frame prediction in the step of generating the prediction image, and a motion vector of the inter-frame prediction is used to predict the quantization step size.
(Supplementary Note 8)
The video encoding method according to Supplementary note 5, wherein the prediction image is generated using at least inter-frame prediction in the step of generating the prediction image, a step of using a motion vector assigned to a neighboring image block already encoded to predict a motion vector used for inter-frame prediction is comprised, and a prediction direction of the predicted motion vector is used to predict the quantization step size.
(Supplementary Note 9)
A video decoding method for decoding image blocks using inverse quantization of input compressed video data to execute a process of generating image data as a set of the image blocks, comprising a step of predicting a quantization step size that controls the granularity of inverse quantization by using a quantization step size assigned to a neighboring image block already decoded, and a step of generating a prediction image using at least inter-frame prediction, wherein a motion vector of the inter-frame prediction is used to predict the quantization step size.
(Supplementary Note 10)
A video decoding method for decoding image blocks using inverse quantization of input compressed video data to execute a process of generating image data as a set of the image blocks, comprising a step of predicting a quantization step size that controls the granularity of inverse quantization by using a quantization step size assigned to a neighboring image block already decoded, and a step of generating a prediction image using at least inter-frame prediction, wherein a motion vector assigned to a neighboring image block already encoded is used to predict a motion vector used for inter-frame prediction, and a prediction direction of the predicted motion vector is used to predict the quantization step size.
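The direction-based prediction described in Supplementary notes 2, 4, 8 and 10 can be illustrated with a minimal sketch. It assumes, for illustration only, that the prediction direction of the predicted motion vector names one neighboring block (the labels "left", "above" and "above_right" are hypothetical), that the step size of that block serves as the predicted value, and that only the difference between the actual and the predicted step size is transmitted. None of the function names below appear in the notes.

```python
from typing import Mapping, Optional


def predict_step_size_by_direction(
    neighbor_step_sizes: Mapping[str, Optional[int]],
    prediction_direction: str,
    fallback: int,
) -> int:
    """Pick the step size of the neighbor named by the prediction direction.

    `fallback` is used when that neighbor is unavailable (an assumption; the
    notes do not say how a missing neighbor is handled).
    """
    q = neighbor_step_sizes.get(prediction_direction)
    return fallback if q is None else q


def encode_difference(actual: int, predicted: int) -> int:
    """Encoder side: only the prediction residual would be entropy-coded."""
    return actual - predicted


def decode_step_size(difference: int, predicted: int) -> int:
    """Decoder side: the same prediction plus the decoded residual."""
    return predicted + difference


# Hypothetical neighborhood: the above-right block is not available.
neighbors = {"left": 26, "above": 30, "above_right": None}
predicted = predict_step_size_by_direction(neighbors, "above", fallback=28)
difference = encode_difference(31, predicted)
restored = decode_step_size(difference, predicted)
```

Because the encoder and the decoder derive the same predicted value from blocks that have already been processed, the decoder can restore the step size from the difference alone.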
(Supplementary Note 11)
A video encoding program used in a video encoding device for dividing input image data into blocks of a predetermined size, and applying quantization to each divided image block to execute a compressive encoding process, causing a computer to use a quantization step size assigned to a neighboring image block already encoded in order to predict a quantization step size that controls the granularity of quantization.
(Supplementary Note 12)
The video encoding program according to Supplementary note (11), causing the computer to use an image encoded in the past and a predetermined parameter to execute a process of generating a prediction image of an image block to be encoded in order to predict the quantization step size using the parameter used to generate the prediction image.
(Supplementary Note 13)
The video encoding program according to Supplementary note (12), causing the computer to execute the process of generating the prediction image using at least intra-frame prediction in order to predict the quantization step size using a prediction direction of the intra-frame prediction.
(Supplementary Note 14)
The video encoding program according to Supplementary note (12), causing the computer to execute the process of generating the prediction image using at least inter-frame prediction in order to predict the quantization step size using a motion vector of the inter-frame prediction.
(Supplementary Note 15)
The video encoding program according to Supplementary note (12), causing the computer to execute the process of generating the prediction image using at least inter-frame prediction and a process of using a motion vector assigned to a neighboring image block already encoded to predict a motion vector used in inter-frame prediction in order to predict the quantization step size using a prediction direction of the predicted motion vector.
(Supplementary Note 16)
A video decoding program used in a video decoding device for decoding image blocks using inverse quantization of input compressed video data to execute a process of generating image data as a set of the image blocks, causing a computer to use a quantization step size assigned to a neighboring image block already decoded in order to predict a quantization step size that controls the granularity of inverse quantization.
(Supplementary Note 17)
The video decoding program according to Supplementary note (16), causing the computer to execute a process of using an image decoded in the past and a predetermined parameter to generate a prediction image of an image block to be decoded in order to predict the quantization step size using the parameter used to generate the prediction image.
(Supplementary Note 18)
The video decoding program according to Supplementary note (17), causing the computer to execute the process of generating the prediction image using at least intra-frame prediction in order to predict the quantization step size using a prediction direction of the intra-frame prediction.
(Supplementary Note 19)
The video decoding program according to Supplementary note (17), causing the computer to execute the process of generating the prediction image using at least inter-frame prediction in order to predict the quantization step size using a motion vector of the inter-frame prediction.
(Supplementary Note 20)
The video decoding program according to Supplementary note (17), causing the computer to execute the process of generating the prediction image using at least inter-frame prediction and a process of using a motion vector assigned to a neighboring image block already encoded to predict a motion vector used in inter-frame prediction in order to predict the quantization step size using a prediction direction of the predicted motion vector.
(Supplementary Note 21)
A video encoding device for dividing input image data into blocks of a predetermined size, and applying quantization to each divided image block to execute a compressive encoding process, comprising quantization step size encoding means for encoding a quantization step size that controls the granularity of quantization; prediction image generation means for generating a prediction image of an image block to be encoded by using an image encoded in the past and a predetermined parameter, wherein the quantization step size encoding means predicts the quantization step size using the parameter used by the prediction image generation means; quantization step size prediction control means for controlling the operation of the quantization step size encoding means based on the predetermined parameter; and multiplexing means for multiplexing an operational parameter of the quantization step size encoding means into the result of the compressive encoding process.
(Supplementary Note 22)
The video encoding device according to Supplementary note 21, wherein the operational parameter of the quantization step size encoding means includes at least a flag representing whether to use the parameter used by the prediction image generation means or not, and the quantization step size prediction control means controls the operation of the quantization step size encoding means based on the flag.
(Supplementary Note 23)
The video encoding device according to Supplementary note 21, wherein the operational parameter of the quantization step size encoding means comprises at least a modulation parameter of the quantization step size, and the quantization step size encoding means uses the modulation parameter to modulate the quantization step size determined based on the parameter used by the prediction image generation means in order to predict the quantization step size.
(Supplementary Note 24)
The video encoding device according to Supplementary note 23, wherein the quantization step size encoding means adds a predetermined offset to the quantization step size determined based on the parameter used by the prediction image generation means in order to predict the quantization step size.
(Supplementary Note 25)
A video decoding device for decoding image blocks using inverse quantization of input compressed video data to execute a process of generating image data as a set of the image blocks, comprising: quantization step size decoding means for decoding a quantization step size that controls the granularity of inverse quantization; prediction image generation means for using an image decoded in the past and a predetermined parameter to generate a prediction image of an image block to be decoded, wherein the quantization step size decoding means uses a quantization step size assigned to a neighboring image block already decoded to predict the quantization step size; de-multiplexing means for de-multiplexing a bitstream including an operational parameter of the quantization step size decoding means; and quantization step size prediction control means for controlling the operation of the quantization step size decoding means based on the de-multiplexed operational parameter of the quantization step size decoding means.
(Supplementary Note 26)
The video decoding device according to Supplementary note 25, wherein the de-multiplexing means extracts, as the operational parameter of the quantization step size decoding means, at least a flag representing whether to use the parameter used by the prediction image generation means, and the quantization step size prediction control means controls the operation of the quantization step size decoding means based on the flag.
(Supplementary Note 27)
The video decoding device according to Supplementary note 25, wherein the de-multiplexing means extracts, as the operational parameter of the quantization step size decoding means, at least a modulation parameter of the quantization step size, and the quantization step size decoding means uses the modulation parameter to modulate the quantization step size determined based on the parameter used by the prediction image generation means in order to predict the quantization step size.
(Supplementary Note 28)
The video decoding device according to Supplementary note 27, wherein the quantization step size decoding means adds a predetermined offset to the quantization step size determined based on the parameter used by the prediction image generation means in order to predict the quantization step size.
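The control by an operational parameter in Supplementary notes 21 to 28 can be sketched as follows. The sketch assumes that the operational parameter consists of a flag and an additive offset, and that when the flag is off the prediction falls back to a value derived from neighboring blocks. The class name, field names and function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationalParameter:
    """Side information multiplexed into the bitstream (hypothetical layout)."""

    use_prediction_parameter: bool  # flag of Supplementary notes 22 and 26
    offset: int = 0                 # additive modulation of notes 24 and 28


def predict_step_size(
    control: OperationalParameter,
    step_from_prediction_parameter: int,
    step_from_neighbors: int,
) -> int:
    """Return the predicted step size under the signaled control.

    With the flag on, the step size determined from the prediction parameter
    is modulated by the offset; with the flag off, the neighbor-based value is
    used unchanged (an assumption about the fallback behavior).
    """
    if control.use_prediction_parameter:
        return step_from_prediction_parameter + control.offset
    return step_from_neighbors


# The encoder and the decoder call the same function with the same inputs.
flag_on = OperationalParameter(use_prediction_parameter=True, offset=2)
flag_off = OperationalParameter(use_prediction_parameter=False)
predicted_on = predict_step_size(flag_on, 30, 27)
predicted_off = predict_step_size(flag_off, 30, 27)
```

Since the operational parameter is carried in the bitstream, the decoder can reproduce the encoder's prediction without any other shared state.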
(Supplementary Note 29)
A video encoding method for dividing input image data into blocks of a predetermined size, and applying quantization to each divided image block to execute a compressive encoding process, comprising: encoding a quantization step size that controls the granularity of quantization; using an image encoded in the past and a predetermined parameter to generate a prediction image of an image block to be encoded; predicting the quantization step size using the parameter used in generating the prediction image; and multiplexing an operational parameter used in encoding the quantization step size into the result of the compressive encoding process.
(Supplementary Note 30)
The video encoding method according to Supplementary note 29, wherein the operational parameter used in encoding the quantization step size includes at least a flag representing whether to use the parameter upon generation of the prediction image in order to control an operation for encoding the quantization step size based on the flag.
(Supplementary Note 31)
The video encoding method according to Supplementary note 29, wherein the operational parameter used in encoding the quantization step size comprises at least a modulation parameter of the quantization step size, and upon encoding the quantization step size, the modulation parameter is used to modulate the quantization step size determined based on the parameter used in generating the prediction image in order to predict the quantization step size.
(Supplementary Note 32)
The video encoding method according to Supplementary note 31, wherein a predetermined offset is added to the quantization step size determined based on the parameter used in generating the prediction image to predict the quantization step size.
(Supplementary Note 33)
A video decoding method for decoding image blocks using inverse quantization of input compressed video data to execute a process of generating image data as a set of the image blocks, comprising: decoding a quantization step size that controls the granularity of inverse quantization; using an image decoded in the past and a predetermined parameter to generate a prediction image of an image block to be decoded; using a quantization step size assigned to a neighboring image block already decoded to predict the quantization step size upon decoding the quantization step size; de-multiplexing a bitstream including an operational parameter used in decoding the quantization step size, and controlling an operation for decoding the quantization step size based on the de-multiplexed operational parameter.
(Supplementary Note 34)
The video decoding method according to Supplementary note 33, wherein at least a flag representing whether to use the parameter used in generating the prediction image of the image block to be decoded is extracted as the operational parameter used in decoding the quantization step size, and the operation for decoding the quantization step size is controlled based on the flag.
(Supplementary Note 35)
The video decoding method according to Supplementary note 33, wherein at least a modulation parameter of the quantization step size is extracted as the operational parameter used in decoding the quantization step size, and the modulation parameter is used to modulate the quantization step size determined based on the parameter used in generating the prediction image of the image block to be decoded in order to predict the quantization step size.
(Supplementary Note 36)
The video decoding method according to Supplementary note 35, wherein upon decoding the quantization step size, a predetermined offset is added to the quantization step size determined based on the parameter used in generating the prediction image of the image block to be decoded in order to predict the quantization step size.
(Supplementary Note 37)
A video encoding program for dividing input image data into blocks of a predetermined size, and applying quantization to each divided image block to execute a compressive encoding process, causing a computer to execute: a process of encoding a quantization step size that controls the granularity of quantization; a process of using an image encoded in the past and a predetermined parameter to generate a prediction image of an image block to be encoded; a process of predicting the quantization step size using the parameter used in generating the prediction image; and multiplexing an operational parameter used in encoding the quantization step size into the result of the compressive encoding process.
(Supplementary Note 38)
The video encoding program according to Supplementary note 37, wherein the operational parameter used in encoding the quantization step size includes at least a flag representing whether to use the parameter upon generation of the prediction image, and the computer is caused to control an operation for encoding the quantization step size based on the flag.
(Supplementary Note 39)
The video encoding program according to Supplementary note 37, wherein the operational parameter used in encoding the quantization step size includes at least a modulation parameter of the quantization step size, and upon encoding the quantization step size, the computer is caused to use the modulation parameter to modulate the quantization step size determined based on the parameter used in generating the prediction image in order to predict the quantization step size.
(Supplementary Note 40)
The video encoding program according to Supplementary note 39, wherein the computer is caused to add a predetermined offset to the quantization step size determined based on the parameter used in generating the prediction image in order to predict the quantization step size.
(Supplementary Note 41)
A video decoding program for decoding image blocks using inverse quantization of input compressed video data to execute a process of generating image data as a set of the image blocks, causing a computer to execute: a process of decoding a quantization step size that controls the granularity of inverse quantization; a process of using an image decoded in the past and a predetermined parameter to generate a prediction image of an image block to be decoded; a process of using a quantization step size assigned to a neighboring image block already decoded to predict the quantization step size upon decoding the quantization step size; a process of de-multiplexing a bitstream including an operational parameter used in decoding the quantization step size, and a process of controlling an operation for decoding the quantization step size based on the de-multiplexed operational parameter.
(Supplementary Note 42)
The video decoding program according to Supplementary note 41, causing the computer to further execute: a process of extracting, as the operational parameter used in decoding the quantization step size, at least a flag representing whether to use the parameter used in generating the prediction image of the image block to be decoded; and a process of controlling an operation for decoding the quantization step size based on the flag.
(Supplementary Note 43)
The video decoding program according to Supplementary note 41, causing the computer to further execute: a process of extracting, as the operational parameter used in decoding the quantization step size, at least a modulation parameter of the quantization step size; and a process of using the modulation parameter to modulate the quantization step size determined based on the parameter used in generating the prediction image of the image block to be decoded in order to predict the quantization step size.
(Supplementary Note 44)
The video decoding program according to Supplementary note 43, wherein upon decoding the quantization step size, the computer is caused to add a predetermined offset to the quantization step size determined based on the parameter used in generating the prediction image of the image block to be decoded in order to predict the quantization step size.
(Supplementary Note 45)
A video encoding device for dividing input image data into blocks of a predetermined size, and applying quantization to each divided image block to execute a compressive encoding process, comprising quantization step size encoding means for encoding a quantization step size that controls the granularity of quantization, wherein the quantization step size encoding means predicts the quantization step size that controls the granularity of quantization by using an average value of quantization step sizes assigned to multiple neighboring image blocks already encoded.
(Supplementary Note 46)
A video decoding device for decoding image blocks using inverse quantization of input compressed video data to execute a process of generating image data as a set of the image blocks, comprising quantization step size decoding means for decoding a quantization step size that controls the granularity of inverse quantization, wherein the quantization step size decoding means predicts the quantization step size that controls the granularity of inverse quantization by using an average value of quantization step sizes assigned to multiple neighboring image blocks already decoded.
(Supplementary Note 47)
A video encoding method for dividing input image data into blocks of a predetermined size, and applying quantization to each divided image block to execute a compressive encoding process, comprising using an average value of quantization step sizes assigned to multiple neighboring image blocks already encoded to predict a quantization step size that controls the granularity of quantization.
(Supplementary Note 48)
A video decoding method for decoding image blocks using inverse quantization of input compressed video data to execute a process of generating image data as a set of the image blocks, comprising using an average value of quantization step sizes assigned to multiple neighboring image blocks already decoded to predict a quantization step size that controls the granularity of inverse quantization.
(Supplementary Note 49)
A video encoding program for dividing input image data into blocks of a predetermined size, and applying quantization to each divided image block to execute a compressive encoding process, causing a computer to execute: a process of encoding a quantization step size that controls the granularity of quantization; and a process of using an average value of quantization step sizes assigned to multiple neighboring image blocks already encoded to predict the quantization step size that controls the granularity of quantization.
(Supplementary Note 50)
A video decoding program for decoding image blocks using inverse quantization of input compressed video data to execute a process of generating image data as a set of the image blocks, causing a computer to execute: a process of decoding a quantization step size that controls the granularity of inverse quantization; and a process of using an average value of quantization step sizes assigned to multiple neighboring image blocks already decoded to predict the quantization step size that controls the granularity of inverse quantization.
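The average-based prediction of Supplementary notes 45 to 50 can be sketched as below. The rounding rule (round half up to an integer) and the handling of unavailable neighbors are assumptions made for illustration; the notes only state that an average of the step sizes of multiple neighboring blocks is used.

```python
from typing import Optional, Sequence


def average_step_size(
    neighbor_step_sizes: Sequence[Optional[int]],
    fallback: int,
) -> int:
    """Predict a step size as the rounded mean of the available neighbors.

    Entries equal to None mark neighbors that are unavailable, for example
    outside the picture. `fallback` is returned when no neighbor is available.
    """
    available = [q for q in neighbor_step_sizes if q is not None]
    if not available:
        return fallback
    n = len(available)
    # Integer arithmetic with round-half-up, so that the encoder and the
    # decoder obtain bit-identical predictions.
    return (sum(available) + n // 2) // n


# Hypothetical step sizes of the left, above and above-left neighbors.
predicted = average_step_size([26, 30, 29], fallback=28)
difference = 31 - predicted      # value the encoder would transmit
restored = predicted + difference  # value the decoder would recover
```

Integer-only arithmetic is used here because any mismatch between the encoder's and the decoder's prediction would corrupt every subsequently reconstructed step size.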
While the present invention has been described with reference to the exemplary embodiments and examples, the present invention is not limited to the aforementioned exemplary embodiments and examples. Various changes understandable to those skilled in the art within the scope of the present invention can be made to the structures and details of the present invention.
This application claims priority based on Japanese Patent Application No. 2011-051291, filed on Mar. 9, 2011, and Japanese Patent Application No. 2011-095395, filed on Apr. 21, 2011, the disclosures of which are incorporated herein in their entirety.
Number | Date | Country | Kind |
---|---|---|---|
2011-051291 | Mar 2011 | JP | national |
2011-095395 | Apr 2011 | JP | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2012/001592 | 3/8/2012 | WO | 00 | 8/12/2013 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2012/120888 | 9/13/2012 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
6363113 | Faryar et al. | Mar 2002 | B1 |
6590936 | Kadono | Jul 2003 | B1 |
20060018552 | Malayath et al. | Jan 2006 | A1 |
20090010557 | Zheng | Jan 2009 | A1 |
20090213930 | Ye | Aug 2009 | A1 |
20090296808 | Regunathan et al. | Dec 2009 | A1 |
20100020872 | Shimizu et al. | Jan 2010 | A1 |
Number | Date | Country |
---|---|---|
2732218 | Aug 2011 | CA |
1378384 | Nov 2002 | CN |
1926880 | Mar 2007 | CN |
101171842 | Apr 2008 | CN |
101946516 | Jan 2011 | CN |
103428497 | Dec 2013 | CN |
0542288 | Aug 1997 | EP |
1615444 | Jan 2006 | EP |
1727371 | Nov 2006 | EP |
2646921 | Aug 1997 | JP |
3012698 | Feb 2000 | JP |
4529919 | Aug 2010 | JP |
4613909 | Jan 2011 | JP |
2013-034037 | Feb 2013 | JP |
20100120691 | Nov 2010 | KR |
2345503 | Jan 2009 | RU |
2407221 | Dec 2010 | RU |
2006072894 | Jul 2006 | WO |
2006099229 | Sep 2006 | WO |
2009105732 | Aug 2009 | WO |
2009158113 | Dec 2009 | WO |
2011156458 | Dec 2011 | WO |
2012023806 | Feb 2012 | WO |
2013003284 | Jan 2013 | WO |
Entry |
---|
Communication dated Dec. 9, 2014, issued by the Russian Patent Office in corresponding Russian Application No. 2013145077/08. |
Extended European Search Report dated Mar. 31, 2015, issued by the European Patent Office in counterpart European application No. 15151219.1. |
Extended European Search Report dated Apr. 2, 2015, issued by the European Patent Office in counterpart European application No. 15151220.9. |
Karczewicz M et al., “R-D optimized quantization”, 27. JVT Meeting; Jun. 4, 2008 to Oct. 4, 2008; Geneva; (Joint Video Team of ISO/IEC JTC1/SC29/WG11 and ITU-T SG.16), No. JVT-AA026, Apr. 19, 2008, XP030007369, 8 total pages. |
“Information technology—Coding of audio-visual objects—Part 10: Advanced Video Coding”, ISO/IEC 14496-10, May 15, 2009, 16 pages, Fifth Edition. |
Hirofumi Aoki, “Prediction-based QP derivation”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 JVT E215 5th Meeting at Geneva, CH, Mar. 16, 2011, pp. 1-11. |
Thomas Wiegand, “WD1: Working Draft 1 of High-Efficiency Video Coding”, Document JCTVC_C403, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 3rd Meeting at Guangzhou, China, Oct. 2010, 4 pages. |
International Search Report for PCT/JP2012/001592, dated May 29, 2012. |
Communication dated Aug. 11, 2014, issued by the European Patent Office in corresponding Application No. 12755202.4. |
Budagavi et al., “Delta QP signaling at sub-LCU level”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 4th Meeting: Daegu, KR, Jan. 20-28, 2011, Texas Instruments, Inc., 5 total pages, Document: JCTVC-D038. |
Aoki et al., “CE4 Subtest 2: QP prediction based on intra prediction (test2.3.g)”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 6th Meeting: Torino, IT, Jul. 14-22, 2011, NEC Corporation & Canon Inc., 9 total pages, Document: JCTVC-F159. |
Kobayashi et al., “Sub-LCU level delta QP signaling”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 5th Meeting: Genève, CH, Mar. 16-23, 2011, Canon Inc., 9 total pages, Document: JCTVC-E198. |
“Working Draft No. 2, Revision 2 (WD-2)”, Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, Geneva, Switzerland, Jan. 29-Feb. 1, 2002, 106 total pages, Document JVT-B118r2. |
Communication dated Nov. 27, 2014, issued by the Korean Intellectual Property Office in counterpart Korean application No. 10-2013-7020857. |
Communication dated Feb. 22, 2016 from the State Intellectual Property Office of the P.R.C. in counterpart application No. 201280012218.8. |
Communication dated Oct. 4, 2016 from the Russian Patent and Trademark Office in counterpart application No. 2015121683/08. |
Communication dated Jan. 18, 2017, from the Canadian Patent Office in counterpart application No. 2,909,242. |
Communication dated Dec. 5, 2016, issued by the Russian Patent Office in corresponding Russian Application No. 2015121760/08. |
Chao Pang et al., “Improved dQP calculation method”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, (4 pages total), Mar. 2011. URL, http://phenix.int-evry.fr/jd/doc_end_user/current_document.php?id=2145. |
Communication dated Nov. 17, 2015 from the Japanese Patent Office in counterpart application No. 2013-503397. |
Communication dated Nov. 1, 2017 from the Russian Patent Office in counterpart Application No. 2017119468/08. |
Communication dated Feb. 27, 2018, issued by the Japanese Patent Office in counterpart Japanese application No. 2017-073515. |
Communication dated Jul. 19, 2018 from the State Intellectual Property Office of the P.R.C. in counterpart Chinese application No. 201710448815.X. |
Communication dated Jul. 23, 2018 from the Russian Patent Office in counterpart Russian application No. 2018115678/07. |
Communication dated May 24, 2018, issued by the Intellectual Property Office of India in corresponding Indian Application No. 7019/CHENP/2013. |
Communication dated Nov. 7, 2018 in Canadian Intellectual Property Office in counterpart application No. 2909242. |
Communication dated Jan. 30, 2019, from the State Intellectual Property Office of the P.R.C in counterpart application No. 201710448724.6. |
Number | Date | Country |
---|---|---|
20130322526 A1 | Dec 2013 | US |