The present invention generally relates to the field of video compression and decompression and, more particularly, to a video coding method for the compression of a bitstream corresponding to an original video sequence that has been divided into successive groups of frames (GOFs) the size of which is N=2^n with n=1, or 2, or 3, . . . , said coding method comprising the following steps, applied to each successive GOF of the sequence:
The invention also relates to a corresponding coding device, to a transmittable video signal generated by means of such a coding method, to a method for decoding said signal, and to a decoding device for carrying out said decoding method.
From MPEG-1 to H.264, standard video compression schemes were based on so-called hybrid solutions (a hybrid video encoder uses a predictive scheme in which each frame of the input video sequence is temporally predicted from a given reference frame, and the prediction error thus obtained by difference between said frame and its prediction is spatially transformed, for instance by means of a two-dimensional DCT, in order to take advantage of spatial redundancies). A different approach, proposed later, consists in processing a group of frames (GOF) as a three-dimensional (3D, or 2D+t) structure and spatio-temporally filtering it in order to compact the energy in the low frequencies (as described for instance in “Three-dimensional subband coding of video”, C. I. Podilchuk et al., IEEE Transactions on Image Processing, vol. 4, no. 2, February 1995, pp. 125-139). Moreover, the introduction of a motion compensation step in such a 3D subband decomposition scheme improves the overall coding efficiency and leads to a spatio-temporal multiresolution (hierarchical) representation of the video signal thanks to a subband tree, as depicted in
The 3D wavelet decomposition with motion compensation, illustrated in said
However, all the 3D subband solutions suffer from the following drawback: since an entire GOF is processed at once, all the pictures in the current GOF have to be stored before being spatio-temporally analyzed and encoded. The problem is the same at the decoder side, where all the frames of a given GOF are decoded together. A solution to said problem is described in a European patent application filed by the applicant on Jun. 28, 2002, with the registration number 02291621.7 (PHFR020065). In said document, the proposed low-memory solution, in which a progressive branch-by-branch reconstruction of the frames of a GOF of the sequence is performed instead of a reconstruction of the whole GOF at once, is based on the following remarks. As illustrated in
It appears that only the subbands H0, LH0, LLH0 and LLL0 are needed to decode the first two frames F1, F2 (i.e. the couple C0) of the GOF. Furthermore, the first subband H0 contains some information only on these two first frames F1,F2. So, once these frames F1, F2 are decoded, the first subband H0 becomes useless and can be deleted and replaced: the next subband H1 is now loaded in order to decode the next couple C1 including the two frames F3, F4. Only the subbands H1, LH0, LLL0 and LLH0 are now needed to decode these frames F3, F4 and, as previously for H0, the subband H1 contains some information only on these two frames F3, F4. So, once these two frames F3, F4 are decoded, the second subband H1 can be deleted, and replaced by H2. And so on: these operations are repeated for F5,F6 and F7,F8 (in the general case, for all the successive couples of frames of the GOF). The bitstream (the illustrated organization of which is only an example that does not limit the scope of the invention at the decoding side) thus formed for each successive GOF may be encoded by means of an entropy coder followed by an arithmetic coder (for instance, referenced 21 and 22 respectively). In the illustrated specific example, the coded bitstream finally available (and transmitted or stored) successively comprises, for the current GOF, a header and the coding bits corresponding to the subbands LLL0, LLH0, LH0, LH1, H0, H1, H2 and H3.
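The dependency between a couple of frames and the subbands needed to decode it can be sketched as follows. This is a minimal illustration only, assuming a dyadic temporal tree and the subband naming used above; the function name `subbands_for_couple` and its parameters are hypothetical and do not appear in the cited application.

```python
def subbands_for_couple(k: int, n: int = 3) -> list[str]:
    """Return the names of the subbands needed to decode couple C_k.

    Assumption: the GOF holds 2**n frames (2**(n-1) couples) and the
    temporal decomposition is dyadic, so each coarser level halves the
    subband index.
    """
    couples = 2 ** (n - 1)
    if not 0 <= k < couples:
        raise ValueError(f"couple index must be in [0, {couples})")
    names = []
    for level in range(n):
        # level 0 -> H<k>, level 1 -> LH<k//2>, level 2 -> LLH<k//4>, ...
        names.append("L" * level + "H" + str(k >> level))
    # The single low-frequency subband at the root of the tree.
    names.append("L" * n + "0")
    return names


for k in range(4):
    print(f"C{k}:", subbands_for_couple(k))
```

For the GOF of eight frames discussed here (n=3), the sketch returns H0, LH0, LLH0, LLL0 for the couple C0 and H1, LH0, LLH0, LLL0 for the couple C1, in agreement with the text.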
The practical operations performed according to the low-memory solution proposed in the cited European patent application were then the following. The part of the coded bitstream corresponding to the current GOF is decoded a first time, but only the coded part that, in said bitstream, corresponds to the first couple of frames C0 (the first two frames F1 and F2), i.e. the subbands H0, LH0, LLL0, LLH0, is in fact stored and decoded. When the first two frames F1, F2 have been decoded, the first H subband, referenced H0, becomes useless and its memory space can be used for the next subband to be decoded. The coded bitstream is therefore read a second time, in order to decode the second H subband, referenced H1, and the next couple of frames C1 (F3, F4). When this second decoding step has been performed, said subband H1 becomes useless and the first LH subband too (referenced LH0). They are consequently deleted and replaced by the next H and LH subbands (respectively referenced H2 and LH1), that will be obtained thanks to a third decoding of the same input coded bitstream, and so on for each couple of frames of the current GOF.
This multipass decoding solution, comprising an iteration per couple of frames in a GOF, is detailed with reference to FIGS. 3 to 6. During the first iteration, the coded bitstream CODB received at the decoding side is decoded by an arithmetic decoder 31, but only the decoded parts corresponding to the first couple of frames C0 are stored, i.e. the subbands LLL0, LLH0, LH0 and H0 (see
When this first decoding step is completed, a second one can begin. The coded bitstream is read a second time, and only the decoded parts corresponding to the second couple of frames C1 are now stored: the subbands LLL0, LLH0, LH0 and H1 (see
When this second decoding step is completed, a third one can begin similarly. The coded bitstream is read a third time, and only the decoded parts corresponding to the third couple of frames C2 are now stored: the subbands LLL0, LLH0, LH1 and H2 (see
When this third decoding step is completed, a fourth one can begin similarly. The coded bitstream is read a fourth time (the last one for a GOF of four couples of frames), only the decoded parts corresponding to the fourth couple of frames C3 being stored: the subbands LLL0, LLH0, LH1 and H3 (see
This procedure is repeated for all the successive GOFs of the video sequence. When decoding the coded bitstream according to this procedure, at most two frames (for example: F1, F2) and four subbands (with the same example: H0, LH0, LLH0, LLL0) have to be stored at the same time, instead of a whole GOF. A drawback of that low-memory solution is however its complexity. The same input bitstream has to be decoded several times (as many times as the number of couples of frames in a GOF) in order to decode the whole GOF.
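The cost of this multipass procedure can be illustrated with a short sketch. It is a simplification under stated assumptions: each subband is treated as one indivisible unit of the bitstream, the bitstream order is the one given as an example above, and the names `BITSTREAM_ORDER`, `NEEDED` and `multipass_decode` are hypothetical.

```python
# Example order of the subbands in the coded bitstream of one GOF.
BITSTREAM_ORDER = ["LLL0", "LLH0", "LH0", "LH1", "H0", "H1", "H2", "H3"]

# Subbands stored during each pass, one pass per couple of frames.
NEEDED = {
    0: {"LLL0", "LLH0", "LH0", "H0"},  # couple C0 (F1, F2)
    1: {"LLL0", "LLH0", "LH0", "H1"},  # couple C1 (F3, F4)
    2: {"LLL0", "LLH0", "LH1", "H2"},  # couple C2 (F5, F6)
    3: {"LLL0", "LLH0", "LH1", "H3"},  # couple C3 (F7, F8)
}


def multipass_decode(order, needed):
    """Return (subbands parsed in total, peak number of subbands stored)."""
    parsed = 0
    peak = 0
    for couple in sorted(needed):
        stored = []
        for subband in order:  # the whole bitstream is read again at each pass
            parsed += 1
            if subband in needed[couple]:
                stored.append(subband)
        peak = max(peak, len(stored))
    return parsed, peak


print(multipass_decode(BITSTREAM_ORDER, NEEDED))
```

Under these assumptions the sketch reports 32 subband reads for 8 coded subbands, with at most four subbands stored at the same time: the memory is low, but the bitstream is traversed once per couple of frames.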
It is therefore a first object of the invention to propose a coding method that significantly reduces, at the decoding side, the memory space needed to decode the 3D subband encoded bitstream, while avoiding the iterative multipass solution described above.
To this end, the invention relates to a video coding method such as defined in the introductory part of the description and which is further characterized in that, in the encoding step, the 2^n frequency subbands available at the end of the analysis step for each GOF are coded in an order that corresponds to a progressive reconstruction of the couples of frames of said GOF in their original order, the bits necessary to later decode the first couple of frames being at the beginning of the coded bitstream, followed by the extra bits necessary to decode the second couple of frames, and so on, up to the last couple of frames of the current GOF. The invention also relates to a corresponding coding device for carrying out said coding method.
It is also an object of the invention to propose a transmittable video signal consisting of a coded bitstream generated by such a coding method, a method for decoding said signal using a reduced memory space with respect to the decoding method previously described, and a corresponding decoding device for carrying out said decoding method.
The present invention will now be described, by way of example, with reference to the accompanying drawings in which:
FIGS. 3 to 6 illustrate, in a decoding method already proposed by the applicant, the operations iteratively performed for decoding the input coded bitstream;
FIGS. 8 to 10 show respectively the three successive parts of a flowchart that illustrates an implementation of the video coding method according to the invention;
The principle of the invention is the following: the input bitstream is re-organized at the coding side in such a way that the bits necessary to decode the first two frames are at the beginning of the bitstream, followed by the extra bits necessary to decode the second couple of frames, followed by the extra bits necessary to decode the third couple of frames, etc. This solution according to the invention is illustrated in
As indicated, these elementary bitstreams BS0 to BS3 are then concatenated in order to constitute the global bitstream BS which will be transmitted. This organization of the bitstream BS does not mean that the part BS1 (for example) is sufficient to reconstruct the frames F3, F4 or even to decode the associated subband H1. It only means that with the part BS0 of the bitstream, the minimum amount of information needed to decode the first two frames F1, F2 (couple C0) is available, then that with said part BS0 and the part BS1, the following couple of frames C1 can be decoded, then that with said parts BS0 and BS1 and the part BS2, the following couple of frames C2 can be decoded, and then that with said parts BS0, BS1, BS2 and the part BS3, the last couple of frames C3 can be decoded (and so on, in the general case of 2^(n-1) couples of frames in a GOF of 2^n frames).
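The cumulative property of the parts BS0 to BS3 can be sketched at subband granularity. This is a simplified illustration only: in the method described here the assignment is made bit by bit, and one bit may carry information about several subbands, which is why a part such as BS1 is not decodable on its own. The names `NEEDED` and `split_into_parts` are hypothetical.

```python
# Subbands needed to decode each couple of frames of an 8-frame GOF.
NEEDED = [
    ["LLL0", "LLH0", "LH0", "H0"],  # couple C0
    ["LLL0", "LLH0", "LH0", "H1"],  # couple C1
    ["LLL0", "LLH0", "LH1", "H2"],  # couple C2
    ["LLL0", "LLH0", "LH1", "H3"],  # couple C3
]


def split_into_parts(needed):
    """Assign each subband to the first couple that requires it."""
    seen = set()
    parts = []
    for subbands in needed:
        extra = [s for s in subbands if s not in seen]
        seen.update(extra)
        parts.append(extra)  # the "extra" information for this couple
    return parts


parts = split_into_parts(NEEDED)
for n, part in enumerate(parts):
    print(f"BS{n}:", part)

# Prefix property: BS0..BSn together cover everything couple Cn needs.
for n, subbands in enumerate(NEEDED):
    available = {s for part in parts[: n + 1] for s in part}
    assert set(subbands) <= available
```

With this simplification, BS0 carries LLL0, LLH0, LH0 and H0, BS1 carries H1, BS2 carries LH1 and H2, and BS3 carries H3.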
With this re-organized bitstream, the multiple-pass decoding scheme as previously proposed is no longer necessary. The coded bitstream has been organized in such a way that, at the decoding side, every new decoded bit is relevant for the reconstruction of the current frames.
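A single-pass decoder over such a re-organized bitstream can be sketched as follows, again treating each subband as one unit. The memory-release policy shown (dropping a subband as soon as no later couple needs it) is an assumption carried over from the low-memory solution described earlier, and the names `PARTS`, `NEEDED` and `single_pass_decode` are hypothetical.

```python
# Assumed content of the parts BS0..BS3, at subband granularity.
PARTS = [["LLL0", "LLH0", "LH0", "H0"], ["H1"], ["LH1", "H2"], ["H3"]]

# Subbands needed to decode each couple of frames C0..C3.
NEEDED = [
    ["LLL0", "LLH0", "LH0", "H0"],
    ["LLL0", "LLH0", "LH0", "H1"],
    ["LLL0", "LLH0", "LH1", "H2"],
    ["LLL0", "LLH0", "LH1", "H3"],
]


def single_pass_decode(parts, needed):
    """Return (subbands parsed in total, peak number of subbands stored)."""
    memory = set()
    parsed = 0
    peak = 0
    for n, part in enumerate(parts):
        memory.update(part)  # read the next part of the bitstream once
        parsed += len(part)
        peak = max(peak, len(memory))
        if not set(needed[n]) <= memory:
            raise RuntimeError(f"couple C{n} cannot be decoded yet")
        # Couple Cn is decoded here; keep only what later couples still need.
        still_needed = set().union(*needed[n + 1:])
        memory &= still_needed
    return parsed, peak


print(single_pass_decode(PARTS, NEEDED))
```

Under these assumptions each of the 8 subbands is read exactly once and at most four are held in memory at any time.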
An implementation of the video coding method according to the invention is illustrated in the flowchart of FIGS. 8 to 10. As illustrated in
As illustrated in
An updating step 94 (UPDAT) is then provided for establishing a connection between each of the subbands thus obtained and the original couples of frames, i.e. for determining if a given subband will be involved or not at the decoding side in the reconstruction of a given couple of frames of the current GOF. At the end of the temporal decomposition, the following subbands:
After an entropy coding step 110 (ENC), a control of the bit budget level (step BUDLEV 111) is performed at the output of the encoder. If the bit budget has not been reached, the current output bit b is considered (step 112), n is initialized (step 113), and a subband S from the ensemble T is considered (step 114). If b contains some information about S (test BINFS 115) and if S is linked with the couple Cn (test SLINKCN 116), the bit b is appended (step BAPP 117) to the bitstream BSn (n=0, 1, 2, 3 in the example previously given with reference to FIGS. 1 to 7) and the following output bit b is considered (i.e. the steps 111 to 117 are repeated). If b does not contain any information about S, or if S is not linked with the couple Cn, the next subband S is considered (step NEXTS 118). If not all the subbands in T have been considered (test ALLS 119), the operations (steps 115 to 118) are performed again. If all said subbands have been parsed, the value of n is increased by one (step 120), and the operations (steps 114 to 120) are performed again for the next original couple of frames (and so on, up to the last value of n). If the bit budget has been reached at the output of the coding step 110, no further output bit b is considered.
Finally, when all output bits have been considered or if the bit budget has been reached (step 111), the whole coding step is considered as completed and the individual bitstreams BSn thus obtained are concatenated (step CCAT 130) into the final bitstream BS (from n=0 to its maximum value). At the decoding side, the decoding step is performed as now explained with reference to
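The dispatching of the output bits into the bitstreams BSn can be sketched as below. This is an illustration under several assumptions that are not in the description: each output bit is modelled as a pair (value, set of subbands it carries information about), the links between subbands and couples are given as a dictionary derived from the decoding dependencies of the 8-frame example, n is initialized to 0, and the bit budget is counted in output bits. All names (`LINKS`, `dispatch_bits`, `concatenate`) are hypothetical.

```python
# Assumed result of the updating step: subband -> couples it is linked with.
LINKS = {
    "LLL0": {0, 1, 2, 3}, "LLH0": {0, 1, 2, 3},
    "LH0": {0, 1}, "LH1": {2, 3},
    "H0": {0}, "H1": {1}, "H2": {2}, "H3": {3},
}


def dispatch_bits(output_bits, links, num_couples, bit_budget):
    """Append each output bit to the first BSn whose couple Cn it serves."""
    streams = [[] for _ in range(num_couples)]
    for count, (bit, informs) in enumerate(output_bits):
        if count >= bit_budget:  # budget check (step 111)
            break
        for n in range(num_couples):  # n initialized, then increased (113, 120)
            # Tests 115 and 116 over the subbands S of T: iterating over the
            # subbands the bit informs is equivalent to scanning all of T.
            if any(n in links[s] for s in informs):
                streams[n].append(bit)  # step 117
                break  # go on to the following output bit
    return streams


def concatenate(streams):
    """Concatenate BS0, BS1, ... into the final bitstream BS (step 130)."""
    return [bit for stream in streams for bit in stream]


bits = [(1, {"LLL0"}), (0, {"H1"}), (1, {"LH1", "H3"}), (0, {"H0"}), (1, {"H3"})]
streams = dispatch_bits(bits, LINKS, num_couples=4, bit_budget=5)
print(streams, concatenate(streams))
```

In this toy input, the bit informing both LH1 and H3 goes to BS2, because LH1 is already needed for the couple C2; the bit informing only H3 goes to BS3.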
The described functioning of the decoding of the first couple C0 (state “0”) is therefore fairly straightforward with the above explanations, and
Number | Date | Country | Kind |
---|---|---|---|
02291803.1 | Jul 2002 | EP | regional |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB03/03159 | 7/11/2003 | WO | | 1/12/2005 |