The invention relates to a device and a method for preprocessing prior to coding of a sequence of video images.
Image coding devices are all the more effective when they code images possessing reduced temporal or spatial entropy.
They are therefore often associated with image preprocessing devices in which the images are processed in such a way as to allow better coding.
In a known manner, preprocessing devices suitable for reducing the entropy of a video sequence use linear or nonlinear filters which increase the temporal redundancy from image to image, in such a way as to decrease the cost of coding of the predicted or interpolated images.
These various procedures have, however, drawbacks including:
The use of motion compensated filters makes it possible to reduce these drawbacks but may give rise to artefacts when the motion estimator does not estimate the motion correctly.
The invention therefore proposes the use of morphological operators carrying out a smoothing of the weak temporal variations of amplitude of each pixel.
For this purpose, the invention proposes a device for preprocessing prior to coding of a sequence of images comprising means of estimation of motion, for each pixel of the current frame, between the current pixel and the corresponding pixel of the previous frame and of the previous frame of like parity.
According to the invention the device comprises:
According to an advantageous embodiment, the device comprises:
According to a preferred embodiment, the means of comparing with a predetermined motion threshold the motion of the current pixel with respect to the previous frame and with respect to the previous frame of like parity compare a vector modulus calculated over a neighbourhood of the current point with the said predetermined thresholds.
In a preferred embodiment, the means of defining the pixels of the structuring element for the current pixel are suitable for forming a structuring element comprising three pixels.
In a preferred embodiment, the means of defining the pixels of the structuring element for the current pixel are suitable for selecting
According to a preferred embodiment, the device comprises means of validating, as a function of the comparison with a predetermined threshold, the pixels selected so as to define the structuring element.
In an advantageous manner, the means of validating the pixels selected are suitable for validating
In a preferred embodiment, the means of performing a morphological processing are suitable for performing, on the structuring element, successively an erosion operation followed by a dilatation operation followed by an erosion operation followed by a dilatation operation.
The invention also relates to a method of preprocessing prior to coding of a sequence of images comprising a step of estimation of motion, for each pixel of the current frame, between the current pixel and the corresponding pixel of the previous frame and of the previous frame of like parity. According to the invention, the method furthermore comprises the steps of:
The invention will be better understood and illustrated by means of wholly nonlimiting, advantageous exemplary embodiments and implementations, with reference to the appended figures in which:
The modules represented are functional units, which may or may not correspond to physically distinguishable units. For example, these modules or some of them may be grouped together in a single component, or constitute functionalities of one and the same piece of software. Conversely, certain modules may be composed of separate physical entities.
The video signal Si input to the precoding device is a video signal of interlaced type.
In order to improve the performance of the precoding device, the video signal Si is deinterlaced by the deinterlacer 1. The deinterlacer 1 doubles the number of lines per frame of the video signal Si using a deinterlacing method known to the person skilled in the art, based on three consecutive frames of the video signal Si. Progressive frames are thus obtained, each containing the complete vertical definition of an image. This makes it possible to subsequently perform framewise comparisons, since the respective lines of two consecutive frames are spatially in the same place in the image.
A module 15 makes it possible to delay the video signal by a frame. The module 15 is advantageously composed of a static RAM type memory.
A module 17 for detecting fixed zones receives as input the deinterlaced video signal emanating from the deinterlacer 1 and the video signal delayed by a frame emanating from the module 15.
The module 17 detects the fixed zones of the current frame with respect to the previous frame.
A module 18 receives as input the video signal emanating from the deinterlacer 1 and the video signal emanating from the deinterlacer 1 delayed by two frames by a delay module 16 of the same type as the module 15.
The module 18 detects the fixed zones of the current frame with respect to the frame of like parity of the previous image.
The detection of fixed zones consists in detecting the zones which remain devoid of motion from frame to frame or from image to image. The detection is performed on the basis of the luminance information and on blocks of variable size. The mean error between the blocks with the same coordinates in each frame is calculated. This error is compared with a predetermined threshold in order to validate or reject the fixed zone. The smaller the blocks, the more accurate the analysis, but the more sensitive it is to noise.
The fixed zones are not calculated for each pixel of the image but for blocks of 2×2 pixels, so as to ensure a degree of stability.
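The block-wise test described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the "mean error" is taken to be the mean absolute luminance difference, a block is regarded as fixed when that mean is strictly below the threshold, and the function name `fixed_zone_map` and its parameters are hypothetical.

```python
def fixed_zone_map(current, previous, threshold, block=2):
    """Return one boolean per block: True when the block is judged fixed.

    `current` and `previous` are luminance frames given as lists of rows.
    Assumption: the mean error is the mean absolute difference, and a
    block is fixed when that mean is strictly below `threshold`.
    """
    rows, cols = len(current), len(current[0])
    flags = []
    for by in range(0, rows, block):
        row_flags = []
        for bx in range(0, cols, block):
            # Absolute differences between co-located pixels of the block.
            diffs = [
                abs(current[y][x] - previous[y][x])
                for y in range(by, min(by + block, rows))
                for x in range(bx, min(bx + block, cols))
            ]
            row_flags.append(sum(diffs) / len(diffs) < threshold)
        flags.append(row_flags)
    return flags
```

Called once with the frame delayed by one frame and once with the frame delayed by two frames, such a function would play the role of the detections performed by the modules 17 and 18 respectively.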
The module 17 outputs a signal ZFT which indicates that the current pixel forms part of a zone said to be fixed with respect to the previous frame.
The module 18 outputs a signal ZFI which indicates that the current pixel forms part of a zone said to be fixed with respect to the frame of like parity of the previous image.
A module 19 receives as input the signal ZFT. The module 19 also receives as input a frame motion vector for the current pixel.
A module 20 receives as input the signal ZFI. The module 20 also receives as input an image motion vector for the current pixel.
The frame and image motion vectors are calculated by a module (not represented) making it possible to calculate the frame motion vectors according to procedures known to the person skilled in the art.
The modules 19 and 20 set to zero the motion vectors of the pixels detected as belonging to fixed zones.
Hence, at the output of the modules 19 and 20 are obtained the motion vectors VT and VI for the pixels which are not detected as a fixed zone and zero vectors for the pixels which are detected as a fixed zone. The vectors VT and VI are illustrated in
A vector module 21 receives as input the motion vectors emanating from the module 19.
A vector module 22 receives as input the motion vectors emanating from the module 20.
The vector module 21 calculates a vector modulus MOD VT according to the following formula:
$$\mathrm{MODVT} = \sqrt{VT_x^2 + VT_y^2}$$
The vector module 22 calculates a vector modulus MOD VI according to the following formula:
$$\mathrm{MODVI} = \sqrt{VI_x^2 + VI_y^2}$$
VTx, VTy, VIx and VIy represent the respective coordinates of the vectors VT and VI along the horizontal axis and along the vertical axis.
The vector moduli MOD VI and MOD VT are calculated in blocks of two pixels. This advantageously makes it possible to minimize the instabilities of the image.
The modules 21 and 22 are respectively connected to the input of two comparators 23 and 24.
The comparators 23 and 24 also respectively receive as input thresholds ST and SI.
The thresholds ST and SI may be equal or different, depending on the application.
The comparators 23 and 24 respectively compare the values of MODVT and MODVI with predetermined thresholds ST and SI.
The thresholds ST and SI represent the value of the moduli MODVT and MODVI for which the motion in the block of pixels is regarded as significant.
If MODVT≧ST then MT=0 else MT=1
If MODVI>SI then MI=0 else MI=1
A psychovisual characteristic is exploited here: an object undergoing strong motion is difficult for the eye to capture, in contradistinction to an object having medium or weak motion. The processing is therefore applied if a strong motion is present and disabled if there is a medium or weak motion. A very weak motion may be likened to a fixed zone, depending on the detection thresholds applied.
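The chain formed by the modules 19 to 24 can be sketched for a single pixel as follows. This is an illustration only: it assumes that the vector modulus is the Euclidean norm, it works on one vector rather than on a block, and the function name `motion_masks` and its argument names are hypothetical. The two comparisons are reproduced as stated above (≧ for ST, > for SI).

```python
import math

def motion_masks(vt, vi, zft, zfi, st, si):
    """Return the masks (MT, MI) for one pixel.

    vt, vi   : (x, y) frame and image motion vectors
    zft, zfi : True when the pixel lies in a fixed zone (signals ZFT, ZFI)
    st, si   : predetermined thresholds ST and SI
    """
    # Modules 19 and 20: zero the vectors of pixels lying in fixed zones.
    if zft:
        vt = (0.0, 0.0)
    if zfi:
        vi = (0.0, 0.0)
    # Modules 21 and 22: vector moduli (assumed to be Euclidean norms).
    mod_vt = math.hypot(vt[0], vt[1])
    mod_vi = math.hypot(vi[0], vi[1])
    # Comparators 23 and 24, with the comparisons as given in the text.
    mt = 0 if mod_vt >= st else 1
    mi = 0 if mod_vi > si else 1
    return mt, mi
```

With the vector (3, 4) and both thresholds set to 5, the modulus equals the threshold, so the two comparisons give different masks: MT=0 and MI=1.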
The video signal is also received by a delay module 2. The delay module 2 outputs the previous video frame, motion compensated, that is to say in which the coordinates of the pixels may have been modified following the motion compensation.
The video signal deinterlaced by the deinterlacer 1 is also received by a delay module 3. The delay module 3 outputs the previous video frame of like parity, motion compensated, that is to say in which the coordinates of the pixels may have been modified following the motion compensation.
A morphological operator 4 performing an erosion operation receives as input the outputs from the modules 2 and 3. The erosion module 4 also receives as input the signals MI and MT emanating respectively from the comparators 24 and 23.
The erosion module 4 performs the erosion operation on a structuring element. The structuring element is composed of the current pixel and possibly of pixels of the previous frames.
The structuring element is calculated in the following manner, in a first step:
In a second step the structuring element as previously calculated is modified as a function of the masks MI and MT.
Thus, the structuring element of each morphological operator may consist of three pixels, two pixels or a single pixel, which in the last case is the current pixel.
The erosion module 4 therefore performs the erosion operation on the structuring element previously calculated.
The erosion function consists in keeping the pixel having the minimum value from among the pixels of the structuring element.
By way of illustrative example:
Subsequently, the output of the erosion module 4 is connected to the input of a morphological operator 7 which performs a dilatation operation.
The dilatation module 7 also receives as input the output from the module 4, motion compensated and delayed by a frame (which is the output of a delay module 5) as well as the output of the module 4, motion compensated and delayed by an image (in fact the previous frame of like parity) which is the output of a delay module 6.
The dilatation operation is performed on the pixels of the structuring element.
The dilatation operation consists in keeping the pixel having the maximum value from among the pixels of the structuring element.
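The erosion and dilatation operations on a structuring element of one to three pixels can be sketched as follows. The helper names and the two boolean flags are hypothetical: the flags merely stand for whatever decision validates each pixel of the previous frames, and the sketch does not claim to reproduce how the masks MI and MT drive that decision.

```python
def structuring_element(current, prev_frame, prev_parity,
                        use_prev_frame, use_prev_parity):
    """Build the list of pixel values forming the structuring element.

    The current pixel is always present. The pixels taken from the previous
    frame and from the previous frame of like parity are included only when
    their (hypothetical) validation flag is True.
    """
    pixels = [current]
    if use_prev_frame:
        pixels.append(prev_frame)
    if use_prev_parity:
        pixels.append(prev_parity)
    return pixels

def erode(pixels):
    """Erosion: keep the minimum value of the structuring element."""
    return min(pixels)

def dilate(pixels):
    """Dilatation: keep the maximum value of the structuring element."""
    return max(pixels)
```

When neither previous pixel is validated, the structuring element reduces to the current pixel and both operations leave its value unchanged.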
The output of the dilatation module 7 is connected to the input of a second dilatation module 10. The module 10 also receives as input the output of the module 7, motion compensated and delayed by a frame (which is the output of a delay module 8) as well as the output of the module 7, motion compensated and delayed by an image (in fact the previous frame of like parity) which is the output of a delay module 9.
The dilatation operation performed by the module 10 consists in taking the maximum from among the pixels of the structuring element as defined previously.
The output of the dilatation module 10 is connected to the input of a second erosion module 13. The module 13 also receives as input the output of the module 10, motion compensated and delayed by a frame (which is the output of a delay module 11) as well as the output of the module 10, motion compensated and delayed by an image (in fact the previous frame of like parity) which is the output of a delay module 12.
The erosion operation is performed on the pixels of the structuring element and consists in keeping the pixel having the minimum value from among the pixels of the structuring element.
The morphological operators therefore perform an opening operation (which consists of an erosion followed by a dilatation) followed by a closing operation (dilatation followed by an erosion).
In other embodiments, it is possible to perform the closing operation before the opening operation.
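The effect of the opening followed by the closing can be illustrated on the successive values of a single pixel. The sketch below is a simplification under stated assumptions: motion compensation and the masks are ignored, the structuring element always contains the current sample and the two previous ones (the previous frame and the previous frame of like parity in the deinterlaced signal), and the function names are hypothetical.

```python
def temporal_stage(samples, op):
    """Apply `op` (min or max) over the current sample and up to two
    previous ones, for every instant of the sequence."""
    out = []
    for t in range(len(samples)):
        window = samples[max(0, t - 2): t + 1]
        out.append(op(window))
    return out

def open_then_close(samples):
    """Erosion, dilatation, dilatation, erosion, in the order of the
    modules 4, 7, 10 and 13."""
    eroded = temporal_stage(samples, min)    # erosion (module 4)
    opened = temporal_stage(eroded, max)     # dilatation (module 7)
    dilated = temporal_stage(opened, max)    # dilatation (module 10)
    return temporal_stage(dilated, min)      # erosion (module 13)

spike = [10, 10, 10, 14, 10, 10, 10, 10, 10]
dip = [10, 10, 10, 6, 10, 10, 10, 10, 10]
smoothed_spike = open_then_close(spike)
smoothed_dip = open_then_close(dip)
```

In this sketch both the isolated spike and the isolated dip are flattened to the constant value 10, whereas a lasting change of level is kept. Because each stage only looks backwards in time, a change that survives the filtering appears later in the output of this sketch than in its input.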
The output of the erosion module 13 is subsequently transmitted to an interlacing module 14 which transforms the progressive video signal into an interlaced video signal.
The video signal is subsequently transmitted in an advantageous manner to a video coding device. The coding device can perform the video coding on the video signal whose entropy has been reduced.
The invention is of course not limited to the exemplary embodiment described hereinabove.
Number | Date | Country | Kind |
---|---|---|---|
0401870 | Feb 2004 | FR | national |