The present invention relates to a method for generating an improved image signal when estimating the motion of image sequences, in particular a prediction signal for video images using motion-compensating prediction. Motion vectors are formed for picture blocks; for each picture block of a current image, they indicate the position of the picture block used for the prediction with respect to a chronologically preceding reference image.
European Patent No. 0 558 922 describes a method for improving motion estimation in image sequences, in half-pel accuracy, according to the full-search method. There, the search area is filtered in a first process step and the match block in a second process step, in each case with the aid of an additional digital filter which enables a raster shift of the pixel raster by ¼ pel. Using this measure, a distortion of the motion vector field can be ruled out.
In “MPEG-4 Video Verification Model Version 7.0”, Bristol, April 1997, MPEG 97/N1642 in ISO/IEC JTC1/SC 29/WG11, an encoder and decoder for object-based coding of video image sequences are specified. In this context, one no longer encodes and transmits rectangular pictures of a fixed size to the receiver, but instead, so-called “VIDEO OBJECTS” (VO) of any shape and size. The image formation of such a VO in the camera image plane at a specific instant is referred to as a VIDEO OBJECT PLANE (VOP). Consequently, the relation between VO and VOP is equivalent to the relation between image sequence and image in the case of the transmission of rectangular pictures of fixed size.
The motion-compensating prediction in the verification model is carried out with the assistance of so-called “blockwise motion vectors” which, for each block of the size 8×8 or 16×16 pixels of the current image, specify the position of the block used for the prediction in an already transmitted reference image. In this context, the resolution of the motion vectors is limited to half of a pixel, pixels between the scanning raster (half-pixel position) being generated by a bilinear interpolation filtering from the pixels on the scanning raster (integer pixel position):
a=A, b=(A+B)//2, c=(A+C)//2,
d=(A+B+C+D)//4,
where // indicates a rounded integer division.
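The bilinear rule above can be illustrated with a minimal sketch. The spatial layout assumed here (B to the right of A, C below A, D diagonally opposite) is inferred from the formulas, since the figure is not reproduced in this text, and the rounding convention (round half up, non-negative samples) is likewise an assumption. The function names are hypothetical.

```python
def rdiv(total, n):
    """Rounded integer division for non-negative totals.

    Assumption: halves are rounded up; the text only says 'rounded'.
    """
    return (total + n // 2) // n


def bilinear_half_pel(A, B, C, D):
    """Return (a, b, c, d) for one 2x2 neighbourhood of integer pels.

    Assumed layout: A top-left, B top-right, C bottom-left, D bottom-right.
    a is the integer position, b the horizontal half-pel position,
    c the vertical half-pel position, d the centre position.
    """
    a = A
    b = rdiv(A + B, 2)
    c = rdiv(A + C, 2)
    d = rdiv(A + B + C + D, 4)
    return a, b, c, d
```

Only the two or four nearest integer pels contribute to each interpolated sample, which is the limitation the invention addresses with a longer filter.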
By applying the principles of the present invention, one can improve the quality of the prediction signal and, thus, the coding efficiency. In so doing, a greater local neighborhood is considered than in the case of bilinear interpolation, to generate pixels between the pixel scanning raster. The aliasing-reducing interpolation filtering according to the present invention leads to an increased resolution of the motion vector and, consequently, to a prediction gain and an increased coding efficiency. In the present invention, the FIR filter coefficients can be adapted to the signals to be coded, and be transmitted separately for each video object, thereby further increasing coding efficiency and enhancing the flexibility of the method.
In contrast to the design approach according to European Patent No. 0 558 922, there is no need to design any additional polyphase filter structures for intermediate positions having ¼ pel pixel resolution in the horizontal and vertical directions.
By applying the principles of the present invention, the image sequence frequency of an MPEG-1 coder can be doubled from 25 Hz to 50 Hz, with the data rate remaining constant. In the case of an MPEG-2 coder, the data rate can be reduced by up to 30%, with the image quality remaining constant.
In the case of the method according to the present invention, motion vectors are formed for picture blocks, the motion vectors, for each picture block of a current image, indicating the position of the picture block used for the prediction with respect to a chronologically preceding reference image.
The motion vectors for the prediction are determined in three successive steps:
In a first search step, a motion vector is determined for each picture block with pel accuracy in accordance with a conventional method, for example, in accordance with the full-search block matching method. In this context, the minimum error criterion is determined for possible motion positions, and the vector which best describes the motion of the picture block is selected (European Patent No. 0 368 151).
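The first search step could be sketched as follows. The error criterion is not named in the text; the sum of absolute differences (SAD) used here is an assumption, as are the function names, the list-of-lists image representation, and the choice to skip candidates that fall outside the reference image.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Return (dx, dy, error) of the best pel-accurate motion vector."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose block would leave the reference image.
            inside_x = 0 <= bx + dx and bx + dx + n <= width
            inside_y = 0 <= by + dy and by + dy + n <= height
            if not (inside_x and inside_y):
                continue
            err = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or err < best[2]:
                best = (dx, dy, err)
    return best
```

Every candidate displacement in the search window is evaluated, so the result is the global minimum of the chosen criterion at pel accuracy; the later steps only refine it locally.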
In a second search step, which, again, is based on such a search for the minimum error criterion, an improved motion vector is ascertained with sub-pel accuracy, starting out from the motion vector ascertained in the first step, using an aliasing-reducing interpolation filtering, with the aid of a digital, symmetric FIR (finite impulse response) filter. In the process, a higher resolution is selected than in the first search step. Preferably, one selects a resolution of a half pixel relative to the pixel raster.
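$$b = \frac{\mathrm{CO1}\,(A_{-1}+A_{+1}) + \mathrm{CO2}\,(A_{-2}+A_{+2}) + \mathrm{CO3}\,(A_{-3}+A_{+3}) + \mathrm{CO4}\,(A_{-4}+A_{+4})}{256}$$

$$c_i = \frac{\mathrm{CO1}\,(A_i+E_i) + \mathrm{CO2}\,(B_i+F_i) + \mathrm{CO3}\,(C_i+G_i) + \mathrm{CO4}\,(D_i+H_i)}{256}$$

$$d = \frac{\mathrm{CO1}\,(c_{-1}+c_{+1}) + \mathrm{CO2}\,(c_{-2}+c_{+2}) + \mathrm{CO3}\,(c_{-3}+c_{+3}) + \mathrm{CO4}\,(c_{-4}+c_{+4})}{256}$$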
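The three equations above share one structure: a symmetric 8-tap filter whose four coefficients each weight a pair of samples at equal distance from the interpolated position. The sketch below illustrates that structure. The coefficient values are placeholders chosen only so that the filter has unity gain (2 × sum = 256); this text does not state the actual values. Rounding before the division and clamping of indices at the image border are also assumptions, and the result is not clipped to the sample range.

```python
# Hypothetical CO1..CO4 (placeholders, not taken from the text).
CO = (160, -48, 24, -8)


def clamp_index(i, n):
    """Repeat the border sample for positions outside 0..n-1 (assumption)."""
    return max(0, min(n - 1, i))


def fir_half_pel(samples, k, co=CO):
    """Half-pel value between samples[k] and samples[k + 1]."""
    n = len(samples)
    total = 0
    for j, c in enumerate(co, start=1):
        left = samples[clamp_index(k + 1 - j, n)]   # sample at offset -j
        right = samples[clamp_index(k + j, n)]      # sample at offset +j
        total += c * (left + right)
    return (total + 128) // 256  # rounded division by 256 (assumption)


def half_pel_centre(img, x, y, co=CO):
    """Centre sample d between pels (x, y), (x+1, y), (x, y+1), (x+1, y+1).

    First filters vertically to get eight c values, then horizontally.
    """
    width = len(img[0])
    c_vals = []
    for i in range(x - 3, x + 5):
        column = [row[clamp_index(i, width)] for row in img]
        c_vals.append(fir_half_pel(column, y, co))
    return fir_half_pel(c_vals, 3, co)
```

Compared with the two-tap bilinear average, each interpolated sample here depends on eight neighbours per direction, which is what allows the filter to attenuate aliasing components.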
The structure of the FIR interpolation filter used is apparent in
In the third search step, starting from the motion vector determined with an accuracy of ½ pel, a local search is performed using a further interpolation filtering, taking the eight neighboring pixels as a basis, with a resolution that is increased still further, preferably to ¼ pixel. As before, one selects the motion vector having the lowest prediction error.
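The local refinement used in the second and third steps can be sketched as a search over a start vector and its eight neighbours. In this sketch, motion vectors are stored as integers in quarter-pel units, and `error_fn` is a hypothetical callable standing in for the prediction error of the block at a given vector (computed on the interpolated reference); neither detail is prescribed by the text.

```python
def refine(start, step, error_fn):
    """Evaluate `start` and its eight neighbours at distance `step`
    and return the candidate with the smallest error."""
    sx, sy = start
    candidates = [
        (sx + i * step, sy + j * step)
        for j in (-1, 0, 1)
        for i in (-1, 0, 1)
    ]
    return min(candidates, key=error_fn)


def three_step_vector(pel_vector, error_fn):
    """Refine a pel-accurate vector to 1/2 pel and then to 1/4 pel.

    `pel_vector` is given in whole pels; the result is in quarter-pel units.
    """
    start = (pel_vector[0] * 4, pel_vector[1] * 4)
    half = refine(start, 2, error_fn)     # second step: 1/2 pel
    return refine(half, 1, error_fn)      # third step: 1/4 pel
```

Because each step only examines nine candidates around the previous result, the cost of the sub-pel refinement stays small compared with the full search of the first step.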
The interpolation is carried out relative to the pixel raster, with a half-pixel resolution from the second search step, using filter coefficients CO1′=½, CO2′=0, CO3′=0, CO4′=0.
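With only CO1′ non-zero, the symmetric filter reduces to the average of the two nearest samples on the half-pel raster. A minimal sketch, assuming integer samples, round-half-up rounding, and hypothetical function names:

```python
def quarter_pel(p_left, p_right):
    """Quarter-pel sample between two neighbours on the half-pel raster.

    Corresponds to CO1' = 1/2 and CO2' = CO3' = CO4' = 0.
    Rounding half up is an assumption.
    """
    return (p_left + p_right + 1) // 2


def upsample_row_to_quarter_pel(half_row):
    """Insert one interpolated sample between each pair of half-pel samples."""
    out = []
    for k in range(len(half_row) - 1):
        out.append(half_row[k])
        out.append(quarter_pel(half_row[k], half_row[k + 1]))
    out.append(half_row[-1])
    return out
```

The aliasing-reducing work is thus done once, at the half-pel stage; the quarter-pel stage adds only a cheap two-tap average on top of it.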
The same previously introduced interpolation technique is used for the motion-compensating prediction.
If the processing is carried out within a coder having a reduced image format (SIF format within an MPEG-1 coder or Q-CIF in an H.263 coder), but the original input format is used for the display, for example, CCIR 601[1] in the case of MPEG-1 or CIF in the case of H.263, a local interpolation filtering must be carried out as a post-processing step. The described aliasing-compensating interpolation filtering can be used for this purpose as well.
To activate the aliasing-compensating interpolation using ¼ pel resolution, activation bits can be inserted into an image-transmission bit stream.
To predict video objects, filter coefficients CO1 through CO4, and CO1′ through CO4′ can be separately conditioned for each of the video objects VO, and inserted into the image-transmission bit stream at the beginning of transmission of the video object in question.
For the encoding of a motion vector, the range of values of the motion vector differences to be coded can be adapted to the increased resolution.
Number | Date | Country | Kind |
---|---|---|---|
197 30 305 | Jul 1997 | DE | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/DE98/01938 | 7/11/1998 | WO | 00 | 5/8/2000 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO99/04574 | 1/28/1999 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
4890160 | Thomas | Dec 1989 | A |
5043808 | Knauer et al. | Aug 1991 | A |
5347599 | Yamashita et al. | Sep 1994 | A |
5502489 | Kim et al. | Mar 1996 | A |
5541660 | Kim et al. | Jul 1996 | A |
5600731 | Sezan et al. | Feb 1997 | A |
5684538 | Nakaya et al. | Nov 1997 | A |
5991447 | Eifrig et al. | Nov 1999 | A |
6057884 | Chen et al. | May 2000 | A |
6069670 | Borer | May 2000 | A |
6122017 | Taubman | Sep 2000 | A |
6714593 | Benzler et al. | Mar 2004 | B1 |
Number | Date | Country |
---|---|---|
0 348 207 | Dec 1989 | EP |
0 558 922 | Sep 1993 | EP |