The disclosure generally relates to a method of encoding a video picture, a method of decoding a video picture, an apparatus for encoding a video picture, an apparatus for decoding a video picture and a computer program product.
Recently, the popularity of video applications on mobile devices, video conferences, VOD, live video broadcasting, and so forth, has been drastically increasing. However, a wireless channel may be error-prone and a packet loss may be commonplace. A packet loss may cause a decoding error and/or a perception of severe quality degradation. Therefore, there is a demand to minimize such quality degradation due to a data loss.
According to one aspect of the disclosure, a method of encoding a video picture includes dividing an original picture into a plurality of picture parts; selecting different reference pictures for the respective picture parts; carrying out inter prediction of the picture parts using the respective reference pictures selected in the selecting; and dispersing data acquired based on the inter prediction in packets in such a manner that encoded slices of the data corresponding to one of the picture parts will be included in one of the packets and encoded slices of the data corresponding to another of the picture parts will be included in another of the packets.
Other objects, features and advantages of the disclosure will become more apparent from the following detailed description when read in conjunction with the accompanying drawings.
According to the embodiment of the present invention, the following three steps (1), (2) and (3) are carried out by an encoder.
(1) While encoding a picture with an inter-prediction technique, the original picture is divided into some parts based on some rules. The thus acquired parts will be referred to as “subbands”, hereinafter. Each subband is down sampled from the original picture. The specific rules will be described later using
(2) The slices encoded from the different subbands are transmitted in different packets to reduce the probability of all the subbands being lost in the same burst.
(3) During a motion estimation (ME) process, the encoder generates a random minimal distance (RMD) for each one of the subbands and ensures different RMDs for the different subbands. As a result, when the blocks in a subband are processed, only a picture whose distance from the current picture is greater than the RMD of this subband is used as a reference picture. The calculation of the RMD will be described later.
As a result of the step (3) being carried out in encoding, when a block whose reference picture is lost is decoded, the data of the block can be interpolated from neighboring blocks. Because the respective neighboring blocks belong to different subbands and the reference pictures of these blocks are different and are transmitted separately, the probability of all of these reference pictures being lost is low.
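The interpolation from neighboring blocks can be illustrated by the following minimal sketch. The disclosure does not specify the interpolation method; the sample-wise averaging used here, the function name `conceal_block` and the representation of a block as a list of rows are assumptions made only for illustration.

```python
from typing import List, Optional

Block = List[List[int]]  # hypothetical representation: rows of sample values


def conceal_block(neighbors: List[Optional[Block]]) -> Optional[Block]:
    """Estimate a lost block from its neighboring blocks.

    Neighbors that could not be decoded are passed as None and skipped.
    The estimate is the sample-wise integer average of the neighbors
    that are available (an assumed, deliberately simple interpolation).
    """
    available = [b for b in neighbors if b is not None]
    if not available:
        return None  # nothing to interpolate from
    rows, cols = len(available[0]), len(available[0][0])
    count = len(available)
    return [
        [sum(b[r][c] for b in available) // count for c in range(cols)]
        for r in range(rows)
    ]


# Two neighbors survived, one was lost together with its reference picture.
left = [[10, 20], [30, 40]]
right = [[20, 40], [50, 60]]
concealed = conceal_block([left, None, right])
```

Because the neighbors belong to other subbands and depend on other reference pictures, at least one of them is likely to be available when the current block is not.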
One example of dividing an original picture into some subbands in the step (1) is as follows.
1) The number (N, i.e., the “dividing number”) of subbands is determined based on an estimated channel condition. Commonly, a worse channel condition requires more subbands.
2) The block B(x,y) is allocated to the subband S(i) based on (x+y) mod N.
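The allocation rule above can be sketched as follows. The sketch assumes that the value of (x+y) mod N is used directly as the subband index i; the function names and the representation of a picture as a grid of block coordinates are hypothetical.

```python
from typing import List, Tuple


def subband_index(x: int, y: int, n: int) -> int:
    """Return the index of the subband to which block B(x, y) is allocated."""
    if n < 1:
        raise ValueError("the dividing number N must be at least 1")
    return (x + y) % n


def split_into_subbands(
    blocks_wide: int, blocks_high: int, n: int
) -> List[List[Tuple[int, int]]]:
    """Group the block coordinates of a picture into N subbands."""
    subbands: List[List[Tuple[int, int]]] = [[] for _ in range(n)]
    for y in range(blocks_high):
        for x in range(blocks_wide):
            subbands[subband_index(x, y, n)].append((x, y))
    return subbands


# With N = 2 the allocation forms a checkerboard of blocks.
subbands = split_into_subbands(4, 4, 2)
```

Under this rule, horizontally or vertically adjacent blocks always fall into different subbands when N is 2 or more, which is what allows a lost block to be concealed from its neighbors.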
Because packet losses in a wireless channel are mainly bursty and random, a key point of the present embodiment is that highly correlated data is not transmitted in the same packet or even in neighboring packets. In the present embodiment, the residual and motion vectors (MVs) of a current block, the reference of the current block and the data of neighboring blocks are transmitted in different packets with some separation distance. Thereby, if part of the data is lost during transmission, there is other data that can be used to conceal the error as much as possible.
An RMD mentioned above in the description of the step (3) is determined in such a manner that the RMD is sufficiently large to deal with a possible burst loss. For example, an RMD is determined to be greater than an average distance between two adjacent burst losses. For example, an RMD is “8” or more if the average distance between two adjacent burst losses is “7” in a specific channel. The RMD of each of the respective subbands can be determined in a random manner as long as it meets the condition of being greater than or equal to “8”. Note that the actual “distance” can be measured by, for example, the number of pictures (frames) inserted between a current picture and its reference picture.
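One possible way of generating the RMDs is sketched below. The disclosure requires only that each RMD exceed the average distance between adjacent burst losses and that the RMDs differ between subbands; the upper bound `spread`, the function name and the use of a seeded generator are assumptions introduced for illustration.

```python
import random
from typing import List, Optional


def generate_rmds(
    num_subbands: int,
    avg_burst_gap: int,
    spread: Optional[int] = None,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Draw one distinct RMD per subband, each greater than avg_burst_gap.

    `spread` is a hypothetical width of the range from which the RMDs
    are drawn; the disclosure does not define an upper bound.
    """
    rng = rng or random.Random()
    if spread is None:
        spread = 2 * num_subbands
    if spread < num_subbands:
        raise ValueError("spread is too small to give distinct RMDs")
    lowest = avg_burst_gap + 1
    # Sampling without replacement guarantees different RMDs per subband.
    return rng.sample(range(lowest, lowest + spread), num_subbands)


# Average distance between adjacent burst losses of 7, four subbands.
rmds = generate_rmds(4, 7, rng=random.Random(0))
```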
In the example of
As shown in
Similarly, when another of the blocks is to be predicted, another reference picture is determined to have a distance Δ (>RMD(1)) from the current picture. Assuming that the block to be predicted belongs to a subband S(1), the RMD(1) is generated therefor in the above-mentioned step (3). Then, the block is predicted by using a predetermined block of the thus determined reference picture not belonging to the subband S(1). Thus, when one of the blocks is to be predicted, a reference picture is determined to have a distance Δ (>RMD(i)) (where i=0, 1, 2, . . . , N) from the current picture. Assuming that the block to be predicted belongs to a subband S(i), the RMD(i) is generated therefor in the above-mentioned step (3). Then, the block is predicted by using a predetermined block of the thus determined reference picture not belonging to the subband S(i).
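The condition Δ > RMD(i) can be sketched as a filter over the previously encoded pictures. The sketch assumes that the distance is taken as the difference between frame indices and that the nearest eligible picture is preferred; neither choice is stated in the disclosure, and the function names are hypothetical.

```python
from typing import Iterable, List, Optional


def eligible_references(
    current_index: int, rmd: int, available: Iterable[int]
) -> List[int]:
    """Return the frame indices whose distance from the current picture
    is greater than the RMD of the subband being processed."""
    return [r for r in available if current_index - r > rmd]


def pick_reference(
    current_index: int, rmd: int, available: Iterable[int]
) -> Optional[int]:
    """Pick the nearest eligible reference picture, or None if there is none."""
    eligible = eligible_references(current_index, rmd, available)
    return max(eligible) if eligible else None


# Frame 20 is being encoded; subbands with RMD 8 and RMD 9 end up
# referring to different previously encoded pictures.
ref_a = pick_reference(20, 8, range(0, 20))
ref_b = pick_reference(20, 9, range(0, 20))
```

Because the RMDs differ between subbands, neighboring blocks tend to depend on different reference pictures, so that a single burst loss is unlikely to remove all of them.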
Next, using
As shown in
An n-th frame (i.e., a current picture) 51 in a given video sequence is divided into subbands by the band splitter 11 as described above as the step (1) to acquire blocks 52 of an i-th subband S(i) (where i=1, 2, 3, . . . , N).
The subtractor 17 subtracts a predicted block P from each block 52 of the subband S(i) to acquire a residual Dn.
The transformation block 18 and the quantizer 19 carry out a transformation process and a quantization process on the thus acquired residual Dn to acquire data X.
The inverse quantizer 22 and the inverse transformer 23 carry out an inverse quantization process and an inverse transformation process on the data X to acquire a reconstructed residual D′n. The adder 24 adds the predicted block P to the thus acquired residual D′n to acquire the reconstructed block uB′n.
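The forward path and the reconstruction path can be illustrated by the following simplified sketch. The transformation is omitted (treated as an identity) and a uniform quantizer with a hypothetical step size `qstep` is used; both are assumptions made to keep the example short and do not reflect any particular coding standard.

```python
from typing import List

Block = List[List[int]]  # hypothetical representation: rows of sample values


def encode_residual(block: Block, predicted: Block, qstep: int) -> Block:
    """Subtract the prediction and quantize the residual to obtain data X.

    The transformation step is omitted in this sketch.
    """
    residual = [
        [b - p for b, p in zip(brow, prow)]
        for brow, prow in zip(block, predicted)
    ]
    return [[round(d / qstep) for d in row] for row in residual]


def reconstruct_block(data_x: Block, predicted: Block, qstep: int) -> Block:
    """Inverse quantize data X and add the prediction back."""
    residual_rec = [[q * qstep for q in row] for row in data_x]
    return [
        [p + d for p, d in zip(prow, drow)]
        for prow, drow in zip(predicted, residual_rec)
    ]


block = [[53, 55], [61, 59]]
predicted = [[50, 50], [60, 60]]
data_x = encode_residual(block, predicted, qstep=4)
reconstructed = reconstruct_block(data_x, predicted, qstep=4)
```

The reconstruction starts from the quantized data rather than from the original residual, so that the encoder stores the same reference blocks that a decoder would obtain.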
The filter 25 carries out a filtering process on the reconstructed block uB′n and the thus acquired reconstructed block 54 is stored to be used as a reference block(s) B′n-1,iT 53 for an inter-prediction process of another frame carried out by the motion estimator 12 and the motion compensator 13.
The motion estimator 12 carries out an ME process including a process of finding the reference block B′n-1,iT 53 (of a reference picture) from the subbands of the previously encoded frame(s) (F′n-1) except the current subband S(i) in the manner described above for the step (3) (to meet the condition of RMD) to predict the current block. The motion compensator 13 carries out an MC process of acquiring a predicted block based on the result of the ME process.
The Choose Intra prediction module 14 selects intra prediction or inter prediction according to a standard rule, for example, a rule defined in a video coding standard such as MPEG2, MPEG4, AVC, or HEVC.
The Intra predictor 15 carries out an intra-prediction process of the current block 52 using the reconstructed block uB′n not belonging to the current subband S(i), when the Choose Intra prediction module 14 selects intra prediction.
The changeover switch 16 selects the intra-predicted block P that is output from the Intra predictor 15 when the Choose Intra prediction module 14 selects intra prediction. The changeover switch 16 selects the inter-predicted block P that is output from the motion compensator 13 when the Choose Intra prediction module 14 selects inter prediction.
After all the subbands of the current frame have been thus processed, the thus acquired data is processed by the reorder module 20.
Besides reordering the P pictures to the front of the B pictures, which is the standard function of the reorder module in a video coding standard such as MPEG2, MPEG4, AVC, or HEVC, the reorder module 20 disperses the data X in such a manner that the encoded slices (that are output from the entropy encoder 21) from the different subbands will be included in different packets.
The entropy encoder 21 carries out an entropy encoding process of the data of the current frame thus processed by the reorder module 20 and the entropy encoded data is transmitted in a form of NAL units (i.e., packets).
The other components and the other functions of the above-described components can belong to a standard video encoder.
Next, using
In Step S1, the current picture (frame) is divided into a plurality of subbands by the band splitter 11. In the example of
Each block in each of the subbands 1 and 2 is encoded according to a standard compression algorithm, except that a block(s) in the other subband(s) is(are) referenced during the intra-prediction process and an ME process in the inter-prediction process.
Thus, a compressed bitstream is constructed.
In more detail, in Steps S11 and S41, each block in each of the subbands 1 and 2 is processed in sequence.
In Steps S12 and S42, for each of the subbands 1 and 2, intra prediction or inter prediction is selected according to the predetermined rule by the Choose Intra prediction module 14. When intra prediction is selected, the process proceeds to Step S21 or S51. When inter prediction is selected, the process proceeds to Step S31 or S61.
In Step S21, an intra-prediction process is carried out by the Intra predictor 15 using a reconstructed block 91 belonging to the subband 2. In Step S51, an intra-prediction process is carried out by the Intra predictor 15 using a reconstructed block 92 belonging to the subband 1.
In each of Steps S22 and S52, a transformation process, a quantization process, an inverse quantization process and an inverse transformation process are carried out based on the data acquired from the corresponding one of Steps S21 and S51 by the transformer 18, the quantizer 19, the inverse quantizer 22 and the inverse transformer 23.
In Step S31, an ME process including a process of finding a reference block 91 (of a reference picture) from the subband 2 of the previously encoded frame(s) is carried out by the motion estimator 12 in the manner described above for the step (3) (to meet the condition of RMD) to predict the current block. In Step S61, an ME process including a process of finding a reference block 92 (of a reference picture) from the subband 1 of the previously encoded frame(s) is carried out by the motion estimator 12 in the manner described above for the step (3) (to meet the condition of RMD) to predict the current block.
In each of Steps S32 and S62, a transformation process, a quantization process, an inverse quantization process and an inverse transformation process are carried out on the data, acquired from the corresponding one of Steps S31 and S61, by the transformer 18, the quantizer 19, the inverse quantizer 22 and the inverse transformer 23.
In each of Steps S33 and S63, an MC process of acquiring a predicted block is carried out by the motion compensator 13 based on the result of the ME process of the corresponding one of Steps S31 and S61.
In each of Steps S34 and S64, the current block is reconstructed and stored according to a standard process, and will be used for intra prediction of a subsequent block or inter prediction of a subsequent frame (picture).
In each of Steps S35 and S65, the process returns to the corresponding one of Steps S11 and S41 until all the blocks in the corresponding one of the subbands 1 and 2 have been processed.
In Step S71, the data of all the blocks of the subbands 1 and 2 of the current picture (frame) acquired from the transformation process and the quantization process in Steps S22, S32, S52 and S62 are reordered by the reorder module 20 in the manner described above. Thus, besides reordering the P pictures to the front of the B pictures, which is the standard function of the reorder module in a video coding standard such as MPEG2, MPEG4, AVC, or HEVC, the data are dispersed in such a manner that the encoded slices from the different subbands 1 and 2 will be included in different packets. In other words, the encoded slices from the subband 1 will be included in a certain packet(s) while the encoded slices of the subband 2 will be included in the other packet(s), for example.
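The dispersal of the encoded slices can be sketched as a grouping by subband. The representation of a slice as a pair of a subband number and a payload, and the function name `disperse_slices`, are hypothetical; an actual packetizer would also be subject to packet size limits, which are not modeled here.

```python
from typing import Dict, List, Tuple


def disperse_slices(
    slices: List[Tuple[int, bytes]]
) -> List[Tuple[int, List[bytes]]]:
    """Group encoded slices so that each packet carries slices of one subband only.

    Returns a list of (subband, payloads) pairs, one pair per packet,
    ordered by subband number.
    """
    by_subband: Dict[int, List[bytes]] = {}
    for subband, payload in slices:
        by_subband.setdefault(subband, []).append(payload)
    return [(s, by_subband[s]) for s in sorted(by_subband)]


# Slices of the subbands 1 and 2 arrive interleaved and leave in separate packets.
packets = disperse_slices([(1, b"a"), (2, b"b"), (1, b"c")])
```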
In Step S72, the thus processed data of the current picture is entropy encoded by the entropy encoder 21.
Note that a specific method of selecting a reference block not belonging to a current subband to be used to predict a current block can be such that, if the reference block to be used to predict the current block belongs to the current subband according to a standard process, the nearest block belonging to another subband can be selected, for example.
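The selection of the nearest block belonging to another subband can be sketched as follows. The sketch assumes that blocks are allocated to subbands by (x+y) mod N with the result used directly as the subband index, that nearness is measured by the sum of the horizontal and vertical block offsets, and that ties are broken in scan order; these details and the function name are not given in the disclosure.

```python
from typing import Optional, Tuple


def nearest_block_in_other_subband(
    x: int, y: int, n: int, blocks_wide: int, blocks_high: int
) -> Optional[Tuple[int, int]]:
    """Return the coordinates of the block nearest to (x, y) that does not
    belong to the subband of (x, y), or None if no such block exists."""
    current = (x + y) % n
    best: Optional[Tuple[int, int, int]] = None  # (distance, bx, by)
    for by in range(blocks_high):
        for bx in range(blocks_wide):
            if (bx + by) % n == current:
                continue  # same subband as the current block, not allowed
            dist = abs(bx - x) + abs(by - y)
            if best is None or dist < best[0]:
                best = (dist, bx, by)
    return None if best is None else (best[1], best[2])


# The standard process pointed at block (1, 1), which is in the current subband.
replacement = nearest_block_in_other_subband(1, 1, 2, 4, 4)
```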
Next, as shown in
As shown in
The entropy decoder 101 receives given data. The data can be of a form of NAL units (i.e., packets) and output data of the encoder 10 described above and in
The reorder module 102 reorders the data processed by the entropy decoder 101 into the original order to acquire data X corresponding to an original picture.
The inverse quantizer 103 and the inverse transformer 104 carry out an inverse quantization process and an inverse transformation process on a block included in the thus acquired data of the original picture to acquire a residual D′n.
The adder 105 adds the residual D′n and a predicted block P to acquire a block of a current frame uF′n.
The filter 109 carries out a filtering process to acquire a reconstructed block B′n,i 152.
The motion compensator 107 carries out an MC process using the reference block(s) B′n-1,iT 151 from the subbands of the previously encoded frame(s) except the current subband S(i) as in the encoder 10 described above.
The intra predictor 108 carries out an intra-prediction process using a reference block not belonging to the current subband S(i) as in the encoder 10 described above.
Next, using
In Step S101, given data is received and parsed. The data can be of a form of NAL units (i.e., packets) and output data of encoding process described above and in
In Step S102, the parsed data is entropy decoded by the entropy decoder 101.
In Step S103, the entropy decoded data is reordered into the original order to acquire data corresponding to an original picture by the reorder module 102.
In Step S104, a block included in the reordered data is inverse quantized by the inverse quantizer 103.
In Step S105, the inverse quantized block is inverse transformed by the inverse transformer 104 to acquire a residual.
In Step S106, intra prediction or inter prediction is selected according to whether intra prediction or inter prediction was selected when the current block was encoded. When intra prediction is selected, the process proceeds to Step S107. When inter prediction is selected, the process proceeds to Step S109.
In Step S107, a reference block that is not included in the currently processed subband is read, as in the encoding process described above.
In Step S108, an intra-prediction process is carried out by the intra predictor 108 using the thus read reference block.
In Step S109, the reference block(s) is(are) read from the subbands of the previously encoded frame(s) except the current subband as in the encoding process described above.
In Step S110, an MC process is carried out by the motion compensator 107 using the thus read reference block(s).
In Step S111, a filtering process is carried out by the filter 109 on the thus acquired block.
In Step S112, the thus acquired block is saved as a block included in a reconstructed (decoded) picture.
Each of the encoder 10 described above using
As shown in
The CPU 210 controls the entirety of the computer 200 by executing a program loaded in the RAM 220. The CPU 210 also performs various functions by executing a program(s) (or an application(s)) loaded in the RAM 220.
The RAM 220 stores various sorts of data and/or a program(s).
The ROM 230 also stores various sorts of data and/or a program(s).
The storage device 240, such as a hard disk drive, an SD card, a USB memory and so forth, also stores various sorts of data and/or a program(s).
The input device 250 includes a keyboard, a mouse and/or the like for a user of the computer 200 to input data and/or instructions to the computer 200.
The output device 260 includes a display device or the like for showing information such as a processed result to the user of the computer 200.
The computer 200 performs the process described above using
Thus, the method of encoding a video picture, the apparatus for encoding a video picture, the method of decoding a video picture, the apparatus for decoding a video picture and the computer program product have been described by the specific embodiment. However, the present invention is not limited to the embodiment, and variations and replacements can be made within the scope of the claimed invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2014/094479 | 12/22/2014 | WO | 00 |