The invention relates to a method and a device for transcoding a sequence of images from the MPEG2 standard to an MPEG4 standard.
The field is that of video compression for storing or transmitting audio and video data. Interest is particularly focused on the block compression schemes applied to the interlaced images, in the context of transcoding operations from MPEG2 to MPEG4.
Interlaced video is the format most commonly used for television. A frame is made up of two fields, even and odd, also called top field and bottom field, which respectively represent the even and odd lines of the image. Since the top field and the bottom field are acquired at two different instants, certain images of a sequence present interlacing artifacts due to a motion between the two acquisitions.
To better support this format, the MPEG4 or H.264 standard can be used to code an image according to three different modes: “frame”, “field” and “MBAFF” (acronym for “MacroBlock Adaptive Field Frame”). In frame mode, the interlaced image is coded as it is; in field mode, the two fields are coded separately. The MBAFF mode can be used in addition to the frame mode to enhance it by making it possible to separate the fields of the image locally. This combined mode, also called frame+MBAFF, makes it possible to code groups of macroblocks of the frame in field mode.
Hereinafter, a block made up of lines of a single field will be called field block, a block made up alternately of an even field line and an odd field line will be called frame block. A block of coefficients calculated by discrete cosine transformation of a block of field, respectively frame, residues will be called field DCT block, respectively frame DCT block, obtained by field DCT coding, respectively frame DCT coding.
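By way of illustration of these definitions, the following Python sketch separates the lines of a frame macroblock into its two field blocks. The function name `split_fields` and the convention that lines are numbered from 0, with line 0 belonging to the top field, are assumptions made for the example and are not terms of the standards.

```python
def split_fields(frame_block):
    """Split a frame block (a list of pixel rows) into (top_field, bottom_field).

    Assumption: rows are numbered from 0 and row 0 belongs to the top field.
    """
    top_field = frame_block[0::2]      # even-numbered rows
    bottom_field = frame_block[1::2]   # odd-numbered rows
    return top_field, bottom_field


# A 16x16 frame macroblock whose pixel values simply record the row number.
macroblock = [[row] * 16 for row in range(16)]
top, bottom = split_fields(macroblock)
```

Each of the two blocks obtained in this way contains 8 lines of 16 pixels taken from a single field.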
Thus, as mentioned in section 6.1.3 of the MPEG2 standard, document ISO/IEC 13818-2: 1996, entitled Macroblock, in the frames, for which the DCT coding can be used to form images both with two fields and with one field, the internal organization of the macroblock is different according to the image type:
In the case of single-field images, each contains only the lines obtained from one of the two fields. In this case, each block of a macroblock is made up of lines extracted from the succession of the lines of the image.
In the figures, the shaded lines correspond to a first field or top field and the unshaded lines to a second field or bottom field.
The DCT transform is performed on blocks of 4×4 pixels making up these macroblocks or this supermacroblock.
The MPEG2 standard allows a prediction between fields, of the same parity or of opposite parity, or between frames, and, for the prediction between frames, a DCT coding of blocks of 8×8 frame residues or of blocks of 8×8 field residues, after rearrangement of the macroblock. The MPEG4 standard dictates, for the DCT coding of the blocks of residues of a macroblock, the same mode, field or frame, as that used for the prediction of this macroblock.
The MPEG2 stream to be transcoded to an MPEG4 stream is partially decoded. For example, the headers relating to the macroblocks are decoded, and the coding mode, the motion vectors, the DCT type used (field or frame), etc., are extracted. The transcoding in MPEG4 mode uses this information, the coding modes generally being retained for the compatible modes. The DCT coefficients are not decoded. This transcoding makes it possible to save, among other things, on costly motion estimation operations.
The problem arises when a macroblock of size 16×16 of a frame of the MPEG2 stream is coded according to the field DCT coding mode, the 16×16 macroblock being arranged in two 16×8 field blocks, themselves arranged in two 8×8 field blocks for the DCT transformation. In this case, according to one solution from the prior art, the MPEG4 transcoding is performed by assuming that the DCT blocks are frame blocks corresponding to a motion prediction or estimation on frame macroblocks. In practice, as indicated previously, the MPEG4 standard uses, for the DCT coding, only the blocks whose structure, field or frame, corresponds to that on which the motion estimation or prediction was performed. This solution, which therefore consists in treating the field DCT blocks of the coded frame of the MPEG2 stream as frame DCT blocks during the MPEG4 transcoding, generates decoding errors. In practice, this is tantamount to treating the MPEG2 DCT coefficients as frame block coefficients whereas the calculations have been performed on field blocks, the decoding of these blocks then creating artifacts.
Another solution involves performing a decoding of the field DCT blocks by inverse discrete cosine transformation to obtain blocks of field residues, rearranging in the macroblock these field blocks to obtain frame blocks and performing a DCT transformation of these frame pixel blocks to obtain frame block DCT coefficients. This procedure incurs a high processing cost, the coding cost possibly also being high, in particular for object boundary areas in relative motion, because of the high vertical frequencies, as represented for example in
These solutions are therefore not optimal, whether from the processing time or image quality point of view.
One aim of the invention is to overcome the abovementioned drawbacks. The subject of the invention is a method of transcoding data from the MPEG2 standard to an MPEG4-type standard comprising an MBAFF (MacroBlock Adaptive Field Frame) mode, characterized in that, if the data relating to a first macroblock indicates that it is coded in frame prediction and field DCT mode, it comprises the following steps:
According to one particular embodiment, the supermacroblock is of size 16×32 and is partitioned into two 16×16 field macroblocks if the MPEG2 motion vectors V0 and V1 are equal, or into four 8×16 field sub-macroblocks otherwise.
According to one particular embodiment, the motion vectors associated with the partitions of the supermacroblock are calculated by using a field reference base instead of a frame reference base.
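The partitioning rule of this embodiment can be sketched in Python as follows. This is a minimal illustration, not the claimed implementation: the function name and the tuple layout are hypothetical, and the assignment of V0 to the first 8 lines of each field macroblock and of V1 to the last 8 lines is inferred from the line geometry (the first 8 lines of each field come from the upper frame macroblock), which is an assumption of the example. The vectors returned are still expressed in the frame reference base.

```python
def partition_supermacroblock(v0, v1):
    """Return the field partitions of a 16x32 supermacroblock.

    v0, v1: (dx, dy) MPEG2 frame vectors of the upper and lower frame
    macroblocks of the pair.
    Each partition is (field, lines, pixels_per_line, source_vector).
    """
    if v0 == v1:
        # Equal vectors: one 16x16 field macroblock per field.
        return [("top", 16, 16, v0), ("bottom", 16, 16, v0)]
    partitions = []
    for field in ("top", "bottom"):
        # Assumption: first 8 field lines lie in the upper frame macroblock,
        # last 8 field lines lie in the lower frame macroblock.
        partitions.append((field, 8, 16, v0))
        partitions.append((field, 8, 16, v1))
    return partitions
```

For example, `partition_supermacroblock((2, 4), (2, 4))` yields two 16-line partitions, whereas two different vectors yield four 8-line partitions.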
According to one particular embodiment, the step for calculation of a motion vector of a partition of a supermacroblock consists in calculating the vertical component of the motion vectors according to the following equations:
According to one particular embodiment, the headers relating to the macroblocks are modified to insert the information for defining the new calculated coding modes corresponding to the partitions, field predictions and field DCT codings, and the motion vector fields are modified to insert the information for defining the calculated values of the motion vectors.
The invention also relates to a transcoding device according to the abovementioned method, characterized in that it comprises:
The de-interlacing of the macroblocks to be coded, obtained by the MBAFF mode, makes it possible, in an initial MPEG2 frame prediction and field DCT context, to best adapt to the supermacroblock structure subject to the field prediction and field DCT constraint, which involves an adjustment of the motion vectors. The MPEG2 data stream can supply blocks of field DCT coefficients for a macroblock predicted in frame mode. By exploiting the possibility of an MPEG4 coding of the macroblocks of a supermacroblock of a frame in field mode, it is possible to use the fact that the blocks of a frame macroblock have been determined in field DCT mode in the MPEG2 coding. The MBAFF mode makes it possible to reconfigure a supermacroblock of a frame as two macroblocks or field macroblock partitions. A frame macroblock structure is converted into a field structure, the calculation of the DCT for the MPEG4 standard then working implicitly in field structure mode, on the residual prediction error as for the MPEG2 standard, but in 4×4 DCT or 8×8 DCT mode depending on the profile used, namely 4×4 DCT for the main profile, and 4×4 DCT or 8×8 DCT for the high profile. The field mode thus retained is the mode that made it possible to obtain fewer high-frequency coefficients, or at least lower-amplitude coefficients, in the DCT transformation, precisely because of the de-interlacing of the residue blocks.
A frame prediction in the MPEG2 coding presupposes that this prediction gives the best correlation. Now, the transcoding, according to the inventive method, proposes dictating a field prediction when the DCT coding is performed on a field block. In practice, the saving on the DCT coding is, on average, much greater than the loss that would result from the change of prediction, that is, from a poorer prediction or correlation between fields than between frames. In other words, what can be lost in prediction is much less than what is gained in the DCT. Also, the choice of the frame prediction mode does not necessarily indicate a better correlation, the coding of the vectors in field prediction mode being more costly than in frame prediction mode.
The method therefore consists in selecting, for the transcoding relating to a current macroblock, the second macroblock forming, with this current macroblock, the supermacroblock, then in determining, from the MPEG2 motion vectors, the prediction vectors of each of the field macroblocks of the supermacroblock of the frame. The MPEG2 motion vectors used for the frame prediction are corrected so that they can be used, in the MPEG4 decoding, to calculate the predicted field block in the reference field, the coding of a motion vector using a different reference base that takes into account the numbering of the field lines and no longer that of the frame lines.
Other features and advantages of the invention will become clearly apparent from the description given below, by way of non-exclusive example, and given in light of the appended drawings which represent:
The field DCT coding mode, for a macroblock in frame prediction mode, occurs when the prediction is applied in frame mode to an object in motion and in particular to the boundaries of the object. This phenomenon is illustrated by
When the prediction, in the MPEG2 standard, has been performed, for the macroblock, in frame mode, and the DCT coding has been performed in field mode, the inventive method calculates a correction of the motion vectors obtained from the motion estimation in the MPEG2 coding, to adapt them to a field mode prediction, and defines an appropriate partition for the supermacroblock, this information being inserted into the data stream in place of or in addition to the data originating from the MPEG2 coding.
The conversion of the vectors originating from the MPEG2 coding of a frame into vectors associated with the macroblocks coded in field mode, and the determination of the sub-partitions, are described below.
The MPEG2 stream data is stored, at least at the level of an image, to associate the macroblocks of the image in pairs of macroblocks. Let MB0 and MB1 be the two macroblocks “obtained” from MPEG2, the counterparts of the top (MBtop) and bottom (MBbot) macroblocks of MPEG4 supermacroblock SMB, respectively provided with the vectors:
V0(dx0, dy0) and V1(dx1, dy1), dx and dy being the horizontal and vertical components of the vectors.
The vertical components of the motion vectors dy are modified, to become Dy, when changing from a frame prediction to a field prediction, that is, from a frame reference base to a field reference base, the horizontal components being retained.
If V0=V1 for the MPEG2 motion vectors relating to the frame macroblocks, the MPEG4 predictions can be assumed to be carried out, with these vectors, on each of the two 16×16 field sub-partitions of the supermacroblock, referenced 18 and 19 in
The vectors of these two macroblocks or sub-partitions that make up the supermacroblock are named:
The prediction is carried out for the sub-partitions of size 16×16.
If V0 is not equal to V1 for the MPEG2 motion vectors relating to the frame macroblocks, it is wise not to use one and the same vector, that is, one and the same MPEG4 prediction, for a field macroblock comprising lines from each of the frame macroblocks. Thus, the predictions will be assumed to be carried out for each of the 8×16 sub-partitions of the supermacroblock, 8 lines of 16 pixels, referenced 20 to 23 in
The vectors originating from the MPEG2 coding on the one hand are accurate to half a pixel and on the other hand are expressed in the frame reference base.
Depending on whether the sub-partition belongs to the even field or to the odd field, the value of the vertical component dy of the motion vector, expressed in the frame reference base, will dictate the choice of the reference field.
The proposed procedure is as follows:
If the modulo 2 of the absolute value of the vector dy (denoted |dy| %2) is equal to 1:
Otherwise (|dy| %2≠1):
The values Dy (Dy0 or Dy1) take account of the field reference base, that is, the numbering of the lines for a field, dy being relative to a frame reference base.
Alternatively, for a vector of an 8×16 sub-partition of the supermacroblock, a value dy (dy0 or dy1 depending on the partition concerned) that is a multiple of 2 corresponds to a choice of the reference field of the same parity, and a motion that is not a multiple of 2 corresponds to a choice of the field of opposite parity. The opposite parity field is chosen when, in frame prediction mode (MPEG2), the motion corresponds to a change of field. Once the vector is converted, in case of a non-integer motion between fields of the same parity, half or quarter pixel, interpolation is preferred, the integer motion not posing a problem. Similarly, once the vector is converted, in case of non-integer motion between fields of opposite parity, half or quarter pixel, interpolation is used.
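The parity rule just described can be illustrated by the following Python sketch. It does not reproduce the equations of the method; it is derived solely from the line numbering, under assumptions stated here and in the comments: dy is taken in whole frame lines, frame lines are numbered from 0, the top field holds the even frame lines and the bottom field the odd frame lines. The function name `frame_to_field_dy` is hypothetical, and sub-pixel components are ignored.

```python
def frame_to_field_dy(dy, current_field):
    """Convert a vertical frame displacement into a field displacement.

    dy: vertical displacement in whole frame lines (assumption).
    current_field: "top" (even frame lines) or "bottom" (odd frame lines).
    Returns (Dy, reference_field), Dy being counted in field lines.
    """
    if current_field not in ("top", "bottom"):
        raise ValueError("current_field must be 'top' or 'bottom'")
    opposite = "bottom" if current_field == "top" else "top"
    if abs(dy) % 2 == 1:
        # An odd frame displacement lands on a line of the other field.
        if current_field == "top":
            return (dy - 1) // 2, opposite
        return (dy + 1) // 2, opposite
    # An even frame displacement stays in the field of the same parity.
    return dy // 2, current_field
```

For instance, under these assumptions `frame_to_field_dy(3, "top")` returns `(1, "bottom")`: frame line 3 is line 1 of the bottom field.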
The invention also relates to a transcoding device implementing the method described previously. This device comprises a circuit for receiving an MPEG2-type data stream. From this stream are extracted, among other things, the coding modes and motion vectors of the macroblocks of the coded image, via an extraction and storage circuit linked to the preceding reception circuit. The extracted information is stored, for example at the level of the complete image. A processing circuit retrieves the extracted data relating to the macroblocks, detects the macroblocks coded in frame prediction+field DCT mode and associates them or pairs them with a top or bottom macroblock in the image, to form MPEG4-type supermacroblocks. This circuit then performs a partitioning and a correction of the motion vectors for the paired macroblocks or supermacroblocks. Thus, it calculates the partitioning of the supermacroblock and the motion vectors assigned to the partitions. This data is then structured to be inserted into or substituted for data from the MPEG2 stream to provide an MPEG4 data stream, via a data insertion or substitution circuit.
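The detection and pairing performed by the processing circuit can be sketched in Python as follows. This is an illustrative outline only: the dictionary layout, the mode labels "frame" and "field", and the function names are hypothetical, and the sketch assumes that a supermacroblock groups the macroblocks of rows 2k and 2k+1 of the same column, as in the MBAFF macroblock pairs of H.264.

```python
def needs_field_conversion(prediction_type, dct_type):
    """True for a macroblock coded in frame prediction + field DCT mode."""
    return prediction_type == "frame" and dct_type == "field"


def supermacroblocks_to_convert(macroblocks):
    """List the supermacroblocks containing at least one such macroblock.

    macroblocks: dict mapping (row, col) -> (prediction_type, dct_type),
    as extracted from the MPEG2 macroblock headers (hypothetical layout).
    Returns the sorted (top_row, col) positions of the affected pairs.
    """
    pairs = set()
    for (row, col), (prediction_type, dct_type) in macroblocks.items():
        if needs_field_conversion(prediction_type, dct_type):
            # Assumption: rows 2k and 2k+1 form one supermacroblock.
            top_row = row - (row % 2)
            pairs.add((top_row, col))
    return sorted(pairs)
```

The positions returned identify the pairs for which the partitioning and the correction of the motion vectors are then calculated.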
The invention applies to the MPEG2 and MPEG4 standards, in particular the MPEG4 part 10 or H.264 standard, that uses the MBAFF coding mode. The applications relate, among other things, to data transmission such as broadcasting and data storage.
Number | Date | Country | Kind |
---|---|---|---|
0654937 | Nov 2006 | FR | national |