Pre-Processing Device and Method Before Encoding of a Video Image Sequence

Description

The invention relates to a pre-processing device and method before encoding of a video image sequence.

Image encoding devices become more effective as the temporal or spatial entropy of the images they encode decreases.

They are therefore often associated with image pre-processing devices in which the images are processed in order to allow a better encoding.

As is known, pre-processing devices for reducing the entropy of a video sequence use linear or non-linear filters which reduce, or even eliminate, the high-frequency components that are mainly responsible for the image encoding cost in intra mode. Numerous filters are available, including one- or two-dimensional low-pass filters, Nagao filters, averaging filters and median filters.

The main drawbacks with these methods are:

- a reduction in spatial definition that is too visible, in particular in the vertical axis, due to the fact that each frame of an interlaced video has only half the vertical resolution of an image,
- blurring effects on the objects,
- degraded contours.

The invention proposes to resolve at least one of the abovementioned drawbacks.

To this end, the invention proposes a pre-processing device before encoding of a video image sequence, characterized in that it comprises:

- means of applying a plurality of morphological processing steps to the video image sequence,
- mixers for applying a weighting, after each morphological processing step, to the video sequence having been subjected to one of said processing steps.

Applying a raw morphological processing would have a devastating effect on the quality of the image. The presence of a mixer between each morphological operator weights the effect of this processing by mixing the raw result of the operator and the input of the same operator.

According to a preferred embodiment, the device comprises means of measuring the complexity of said video image sequence before applying the plurality of morphological processing steps.

In practice, pre-processing to reduce the spatial entropy of the image is recommended mainly for images with high complexity. The pre-processing can thus be controlled according to, and adapted to, the complexity of the image.

Advantageously, the means of measuring the complexity of said video image sequence measure the intra-image correlation.

According to a preferred embodiment, the means of measuring the complexity compute a weighting coefficient for each mixer.

Advantageously, the weighting coefficient is identical for each mixer.

Advantageously, the weighting coefficients are inversely proportional to the intra-image correlation.

Preferably, the means of applying a plurality of morphological processing steps and the mixers apply the processing steps to the luminance component of the video signal, pixel by pixel, for each image.

Preferably, the device comprises:

- means of deinterlacing said video image sequence before measuring the intra-image correlation and
- means of interlacing said video sequence after the last weighting.

This makes it possible to obtain progressive frames which each contain the complete vertical definition of an image. It is then possible to consider without bias a processing in both axes of the image, horizontal and vertical.

The invention also relates to a method of pre-processing before encoding a video image sequence. According to the invention, the method comprises:

- a plurality of morphological processing steps on the incoming video image sequence,
- a plurality of weighting steps for applying, after each morphological processing step, a weighting to the result of the morphological processing.

The invention will be better understood and illustrated by means of examples of embodiments and advantageous implementations, by no means limiting, with reference to the appended figures in which:

FIG. 1 represents a preferred embodiment of a device according to the invention,

FIG. 2 represents the vicinity of the current point P taken into account to define the complexity of the current image.

The modules represented are functional units, which may or may not correspond to physically distinguishable units. For example, these modules, or some of them, may be grouped in a single component, or form functionalities of one and the same software. Conversely, certain modules may, if necessary, comprise separate physical entities.

The video signal Ei at the input of the pre-encoding device is an interlaced type video signal.

In order to improve the performance of the pre-encoding device, the video signal Ei is deinterlaced by the deinterlacer 1. The deinterlacer 1 doubles the number of lines per field of the video signal Ei using a deinterlacing method known to a person skilled in the art, based on three consecutive fields of the video signal Ei. A progressive format is then obtained, in which each field becomes a frame and contains the complete vertical definition of an image, so that processing can be performed along the vertical axis.

The signal E is obtained at the output of the deinterlacer 1.

A complexity analysis of the image is then carried out. In practice, spatial entropy reduction is applied mainly to images having a high spatial entropy.

The device therefore includes upstream means 2 of measuring the correlation of each image.

FIG. 2 illustrates an example of the vicinity taken into account for computing the complexity of the current image in the module 2.

For each pixel of the image, the pixel result “Rp” is computed using the luminance of the current point and that of four of its neighbours:

Rp=[abs(P−P(−1,0))+abs(P−P(0,−1))+abs(P−P(0,+1))+abs(P−P(+1,0))]/4

Then, all these pixel results are accumulated over one frame.

$C_{intra} = \frac{\sum_{0}^{nbpixels} Rp}{nblines \times nbcol}$

where nblines and nbcol are the numbers of lines and columns of the image.

This correlation measurement is used to ascertain the average deviation, over one image, between a pixel and its adjacent pixels. Interesting information on the definition in the image is thus obtained.
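The computation of Rp and C_intra described above can be sketched as follows. This is a minimal illustration, not the implementation of module 2: the function name `intra_complexity` is hypothetical, the image is assumed to be a list of rows of luminance values, and the handling of border pixels (neighbours outside the image are replaced by the nearest edge pixel) is an assumption, since the description does not specify it.

```python
def intra_complexity(luma):
    """Average absolute deviation between each pixel and its four neighbours."""
    rows, cols = len(luma), len(luma[0])
    total = 0.0
    for y in range(rows):
        for x in range(cols):
            p = luma[y][x]
            # Assumption: coordinates are clamped at the image borders.
            up = luma[max(y - 1, 0)][x]
            down = luma[min(y + 1, rows - 1)][x]
            left = luma[y][max(x - 1, 0)]
            right = luma[y][min(x + 1, cols - 1)]
            # Pixel result Rp: mean absolute difference with the four neighbours.
            rp = (abs(p - up) + abs(p - down) + abs(p - left) + abs(p - right)) / 4
            total += rp
    # Accumulated pixel results divided by the number of pixels.
    return total / (rows * cols)


# A uniform image has no deviation; a checkerboard has a very large one.
flat = [[100] * 4 for _ in range(4)]
checker = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]
print(intra_complexity(flat), intra_complexity(checker))
```

A low value thus corresponds to a strongly correlated image, and a high value to a weakly correlated one.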

In other embodiments, it is possible to modify these equations in order to obtain a more complete definition of the complexity of the image. It is also possible to enlarge the vicinity of the current pixel taken into account in computing the image complexity.

From this measurement, a coefficient K is computed within the range [0,1], as a function of the complexity of the incoming images.

The table below illustrates the values of the coefficient K as a function of C_intra, given as an illustration only.

The value of C_intra is encoded on 8 bits and is therefore between 0 and 255.

Correlation type	C_intra value	Coefficient K value
Very strong	[0 . . . 2]	1/8 = Kmin
Strong	[2 . . . 4]	2/8
Average	[5 . . . 8]	3/8
Weak	[9 . . . 16]	4/8
Very weak	[17 . . . 30]	5/8
Insignificant	[30 . . . 255]	6/8 = Kmax
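The mapping from C_intra to K given in the illustrative table can be sketched as a simple lookup. The names `K_TABLE` and `coefficient_k` are hypothetical. Two assumptions are made because the table leaves them open: a value that sits on a shared boundary (2 or 30) is assigned to the lower range, and a non-integer value falling between two ranges is assigned to the upper one.

```python
# (upper bound of C_intra, K) pairs taken from the illustrative table.
K_TABLE = [
    (2, 1 / 8),    # very strong correlation, Kmin
    (4, 2 / 8),    # strong
    (8, 3 / 8),    # average
    (16, 4 / 8),   # weak
    (30, 5 / 8),   # very weak
    (255, 6 / 8),  # insignificant correlation, Kmax
]


def coefficient_k(c_intra):
    """Return the weighting coefficient K for a complexity value in [0, 255]."""
    for upper, k in K_TABLE:
        # Assumption: the first range whose upper bound is not exceeded wins.
        if c_intra <= upper:
            return k
    # Values above 255 should not occur for an 8-bit measure; clamp to Kmax.
    return K_TABLE[-1][1]


for c in (0, 3, 10, 200):
    print(c, coefficient_k(c))
```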

When the correlation is very strong (very little definition), the pre-processing of the image is still performed, but in lesser proportions, illustrated by the value of the coefficient K.

Conversely, a weak correlation is an indicator of strong entropy. Random noise, which has a weak or even zero correlation, is a good example of high entropy.

The coefficient K can be the same for each mixer, as in the preferred embodiment, or can differ from one mixer to another.

The processing carried out in the device of FIG. 1 is applied only to the luminance component of the video.

In practice, a processing of the chrominance component may provoke disagreeable artefacts in colour, and, above all, it is the luminance component that has most of the complexity of the image.

The signal E_in (identical to the signal E) at the output of the module 2 is then subjected to erosion in the module 3. The erosion process consists in keeping the pixel that has the minimum luminance value among the pixels of a structuring element that it receives as input. The structuring element is a 3×3 window, three rows by three columns, around the current pixel, i.e. 9 pixels. However, another window size can be chosen, bearing in mind that the severity of the erosion increases with the size of the window.

The module 3 therefore computes for each pixel of the incoming video signal E_in, its new luminance value.

The video signal T0 at the output of the module 3 therefore represents the eroded signal E_in: for each pixel, the luminance value in T0 is the minimum luminance value of E_in over the structuring element centred on that pixel.
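The erosion performed by module 3 can be sketched as follows. This is an illustration under stated assumptions: the function name `erode3x3` is hypothetical, the image is a list of rows of luminance values, and at the image borders the 3×3 window is simply clipped to the pixels that exist, a choice the description does not specify.

```python
def erode3x3(luma):
    """Replace each pixel by the minimum luminance in its 3x3 neighbourhood."""
    rows, cols = len(luma), len(luma[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Assumption: the window is clipped at the image borders.
            window = [luma[j][i]
                      for j in range(max(y - 1, 0), min(y + 2, rows))
                      for i in range(max(x - 1, 0), min(x + 2, cols))]
            out[y][x] = min(window)
    return out


# A single dark pixel on a bright background grows into a 3x3 dark block.
image = [[200] * 5 for _ in range(5)]
image[2][2] = 50
for row in erode3x3(image):
    print(row)
```

The example shows why raw erosion is severe: one dark pixel darkens its eight neighbours, which is the effect the mixer that follows is meant to temper.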

Then, the signal T0 is transmitted to a mixer 4. The mixer 4 also receives as input the coefficient K transmitted by the module 2.

The mixer 4 produces as output the signal S0 according to the following formula:

S0=K×T0+(1−K)×E_in
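The weighting performed by the mixer is a per-pixel linear blend between the processed signal and the signal that entered the morphological operator. A minimal sketch, with the hypothetical function name `mix` and images assumed to be lists of rows:

```python
def mix(processed, original, k):
    """Per-pixel blend: k * processed + (1 - k) * original."""
    return [[k * t + (1 - k) * e for t, e in zip(row_t, row_e)]
            for row_t, row_e in zip(processed, original)]


# With K = 2/8, a pixel eroded from 80 down to 0 only drops to 60.
print(mix([[0]], [[80]], 2 / 8))
```

With K = 0 the operator has no effect, and with K = 1 the raw morphological result is passed through unchanged, so K sets how much of the operator's effect is kept.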

The signal S0 is input into the dilatation module 5.

The dilatation operation consists in retaining the pixel that has the maximum luminance value among the elements of a structuring element centred on the current pixel, with a size of 3×3=9 pixels.
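The dilatation can be sketched in the same way as the erosion, taking the maximum instead of the minimum. The function name `dilate3x3` is hypothetical, and clipping the window at the image borders is an assumption that the description does not specify.

```python
def dilate3x3(luma):
    """Replace each pixel by the maximum luminance in its 3x3 neighbourhood."""
    rows, cols = len(luma), len(luma[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Assumption: the window is clipped at the image borders.
            window = [luma[j][i]
                      for j in range(max(y - 1, 0), min(y + 2, rows))
                      for i in range(max(x - 1, 0), min(x + 2, cols))]
            out[y][x] = max(window)
    return out


# A single bright pixel on a dark background grows into a 3x3 bright block.
spot = [[10] * 5 for _ in range(5)]
spot[2][2] = 255
for row in dilate3x3(spot):
    print(row)
```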

The dilatation module produces as output a new video signal T1 transmitted to the input of a mixer 6. The mixer 6 also receives as input the weighting coefficient K.

The mixer 6 then produces a signal S1 according to the following formula by weighting, for each pixel, the luminance value that it receives as input:

S1=K×T1+(1−K)×S0

The signal S1 is then input into a second dilatation module 7.

The dilatation module 7 then performs a dilatation operation on the signal S1. The dilatation operation consists in retaining the pixel that has the maximum luminance value among the elements of a structuring element centred on the current pixel with a size of 3×3=9 pixels.

The module 7 then produces as output a signal T2 which is input into the mixer 8 which weights the luminance component of the signal received, for each pixel received. The mixer 8 also receives as input the weighting coefficient K.

The mixer 8 produces as output a signal S2, according to the formula

S2=K×T2+(1−K)×S1

The signal S2 is then input into a second erosion module 9.

The module 9 applies erosion to the signal S2, the erosion operation consisting as indicated previously in replacing the current pixel with the minimum pixel among the pixels of the structuring element defined as previously.

The erosion module 9 produces as output a signal T3 which is input into a fourth mixer 10. The mixer 10 also receives as input the weighting coefficient K.

The mixer 10 produces as output a signal S3 according to the following formula:

S3=K×T3+(1−K)×S2

The signal S3 is then transmitted to the input of an interlacer 11 which is used to interlace the signal S3 in order to obtain the video output Si of the pre-processing device which is then transmitted to an encoding device.

The video signal Si entering the encoding device thus has a reduced spatial entropy, which makes its subsequent encoding easier. Any type of encoding can then be used.
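The chain of modules 3 to 10 (erosion, mixer, dilatation, mixer, dilatation, mixer, erosion, mixer) can be sketched on one progressive luminance frame as follows. All function names are hypothetical. The same K is used for every mixer, as in the preferred embodiment; clipping the 3×3 window at the borders is an assumption; deinterlacing, interlacing and any rounding back to 8-bit values are left out.

```python
def morph3x3(luma, pick):
    """Apply pick (min for erosion, max for dilatation) over each 3x3 window."""
    rows, cols = len(luma), len(luma[0])
    # Assumption: the window is clipped at the image borders.
    return [[pick(luma[j][i]
                  for j in range(max(y - 1, 0), min(y + 2, rows))
                  for i in range(max(x - 1, 0), min(x + 2, cols)))
             for x in range(cols)] for y in range(rows)]


def mix(processed, original, k):
    """Per-pixel blend: k * processed + (1 - k) * original."""
    return [[k * t + (1 - k) * e for t, e in zip(row_t, row_e)]
            for row_t, row_e in zip(processed, original)]


def preprocess_luma(e_in, k):
    """Return S3 from E_in: four morphological steps, each followed by a mixer."""
    s = e_in
    # Order of the operators in FIG. 1: erosion, dilatation, dilatation, erosion.
    for pick in (min, max, max, min):
        t = morph3x3(s, pick)   # T0, T1, T2, T3
        s = mix(t, s, k)        # S0, S1, S2, S3
    return s


# An isolated bright pixel is attenuated rather than removed when K < 1.
frame = [[0] * 5 for _ in range(5)]
frame[2][2] = 200
for row in preprocess_luma(frame, 0.5):
    print(row)
```

Because each stage blends with its own input, lowering K smoothly reduces the effect of the whole chain, down to the identity at K = 0.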

Naturally, the invention is not limited to the embodiment described above. A person skilled in the art will easily understand that modifying the number of morphological operations of the pre-processing device can be considered, as can modifying the number of associated mixers.

An erosion followed by a dilatation is called an opening.

A dilatation followed by an erosion is called a closing.
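These two definitions can be illustrated with a short sketch. The function names are hypothetical, a 3×3 structuring element is assumed as in the embodiment, and clipping the window at the image borders is an assumption. The example shows that an opening removes an isolated bright pixel while a closing fills an isolated dark one.

```python
def morph3x3(luma, pick):
    """Apply pick (min for erosion, max for dilatation) over each 3x3 window."""
    rows, cols = len(luma), len(luma[0])
    # Assumption: the window is clipped at the image borders.
    return [[pick(luma[j][i]
                  for j in range(max(y - 1, 0), min(y + 2, rows))
                  for i in range(max(x - 1, 0), min(x + 2, cols)))
             for x in range(cols)] for y in range(rows)]


def opening(luma):
    """Erosion followed by dilatation."""
    return morph3x3(morph3x3(luma, min), max)


def closing(luma):
    """Dilatation followed by erosion."""
    return morph3x3(morph3x3(luma, max), min)


spike = [[0] * 5 for _ in range(5)]
spike[2][2] = 200          # isolated bright pixel
hole = [[200] * 5 for _ in range(5)]
hole[2][2] = 0             # isolated dark pixel

print(opening(spike)[2][2], closing(hole)[2][2])
```

With every coefficient K set to 1, the chain of the embodiment (erosion, dilatation, dilatation, erosion) amounts to an opening followed by a closing.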

Claims

1-10. (canceled)
11. Method of processing an image of a video image sequence, wherein it comprises the following successive steps: a step for computing a complexity value representative of the complexity of said image; a first step of morphological processing applied on said image, said first step generating a first processed image; a second step for mixing said image and said first processed image depending on said complexity value, said second step generating a mixed image; a third step of morphological processing applied on said mixed image, said third step generating a second processed image; and a fourth step for mixing said mixed image and said second processed image depending on said complexity value.
12. Method according to claim 11, wherein said complexity value is the intra-image correlation.
13. Method according to claim 11, wherein said second step consists in a linear combination of said image and said first processed image, said combination depending on a first weighting coefficient computed from said complexity value.
14. Method according to claim 13, wherein said fourth step consists in a linear combination of said mixed image and said second processed image, said combination depending on a second weighting coefficient computed from said complexity value.
15. Method according to claim 14, wherein said first and second weighting coefficients are equal.
16. Method according to claim 15, wherein said first and second weighting coefficients are proportional to said complexity value.
17. Method according to claim 11, wherein said first step is an erosion.
18. Method according to claim 11, wherein said third step is a dilatation.
19. Device for processing an image of a video image sequence, wherein it comprises: means for computing a complexity value representative of the complexity of said image; means for applying a morphological processing on said image, said means generating a first processed image; means for mixing said image and said first processed image depending on said complexity value, said means generating a mixed image; means for applying a morphological processing on said mixed image, said means generating a second processed image; and means for mixing said mixed image and said second processed image depending on said complexity value.

Priority Claims (1)

Number	Date	Country	Kind
04/51381	Jul 2004	FR	national

PCT Information

Filing Document	Filing Date	Country	Kind	371c Date
PCT/EP05/52908	6/22/2005	WO	00	9/7/2007
