The invention relates to a selector for selecting a background motion vector for a pixel in an occlusion region of an image, from a set of motion vectors being computed for the image.
The invention further relates to an up-conversion unit for computing a pixel value in an occlusion region of an output image, on basis of a sequence of input images, the up-conversion unit comprising:
The invention further relates to an image processing apparatus comprising:
The invention further relates to a method of selecting a background motion vector for a pixel in an occlusion region of an image, from a set of motion vectors being computed for the image.
The invention further relates to a computer program product to be loaded by a computer arrangement, comprising instructions to select a background motion vector for a pixel in an occlusion region of an image, from a set of motion vectors being computed for the image.
In images resulting from motion compensated image rate converters, artifacts are visible at the boundaries of moving objects, where either covering or uncovering of the background occurs. These artifacts are usually referred to as halos. There are two reasons for these halos. The first, rather trivial, cause is the resolution of the motion vector field. Usually, the density of the grid at which the motion vectors are available is much lower than that of the pixel grid. If, for example, motion vectors are available for blocks of 8×8 pixels, then the contours of moving objects can only roughly be approximated at the vector grid, resulting in a blocky halo effect. A second, less trivial, cause is that a motion estimation unit, estimating motion between two successive images of a video sequence, cannot perform well in regions where covering or uncovering occurs, since it is typical for these regions that the background information occurs in only one of the two images.
Moreover, up-conversion units usually combine information from both images, i.e. bi-directional interpolation, using the wrongly estimated motion vectors, to create the up-converted image. Since one of these images does not contain the correct information, due to the occlusion, the up-converted image is incorrect in occlusion regions.
In order to solve these problems, an up-conversion unit should be able to detect the occlusion regions, detect the type of occlusion present in these regions (i.e. covering or uncovering), determine the correct motion vectors for these regions, and perform the up-conversion accordingly. The book “Video processing for multimedia systems”, by G. de Haan, University Press Eindhoven, 2000, ISBN 90-9014015-8, chapter 4, describes methods for the detection of occlusion regions and for the covering/uncovering classification. What remains is the requirement of determining the correct motion vector in occlusion regions.
It is an object of the invention to provide a selector for easily determining an appropriate motion vector in an occlusion region.
This object of the invention is achieved in that the selector comprises:
Typically, the set of motion vectors being computed for the occlusion region comprises a motion vector which corresponds with the movement of the foreground, i.e. the foreground motion vector, and a motion vector which corresponds with the movement of the background, i.e. the background motion vector. However, it is not directly known which one of the motion vectors of the set corresponds to the background. This background motion vector might correspond to the null vector, i.e. no motion. However, it is to be noted that in many cases the camera is moving to track the main subject of the scene. In that case, the foreground motion vector corresponds to the null vector and the background motion vector is not equal to the null vector.
To select the background motion vector from the set of motion vectors, use is made of a global motion model of the background of the image. Based on the model a model-based motion vector is determined for the particular pixel. The motion vectors of the set are compared with the model-based motion vector. The one which fits best is selected as the background motion vector.
Preferably the global motion model is based on motion vectors of the borders of the motion vector field. In other words, the part of the motion vector field which is applied for determining the motion model corresponds with motion vectors being estimated for groups of pixels in the neighborhood of the borders of the image. The probability that these motion vectors correspond with the background is relatively high.
In an embodiment of the selector according to the invention, the comparing unit is arranged to compute differences between the model-based motion vector and the respective motion vectors of the set of motion vectors, and the selecting unit is arranged to select the particular motion vector if the corresponding difference is the minimum of the differences. The difference might be an L1-norm, i.e. the sum of absolute differences of the components of the motion vectors to be compared. Alternatively, the difference is an L2-norm, i.e. the sum of squared differences of the components of the motion vectors to be compared.
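As a minimal sketch of this comparing and selecting step (the function below is illustrative and not the claimed implementation), the candidate of the set with the smallest L1 or L2 distance to the model-based motion vector is returned as the background motion vector:

```python
def select_background_vector(candidates, model_vector, norm="l1"):
    """Return the motion vector of the set that lies closest to the
    model-based motion vector under the chosen norm ("l1" or "l2")."""
    def distance(v):
        dx, dy = v[0] - model_vector[0], v[1] - model_vector[1]
        return abs(dx) + abs(dy) if norm == "l1" else dx * dx + dy * dy
    return min(candidates, key=distance)
```

For example, with the set {(0, 0), (4, −1)} and a model-based vector (3, −1), the second candidate is selected as the background motion vector.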
In an embodiment of the selector according to the invention, the motion model comprises translation and zoom. The parameters of such a model are relatively easy to compute, while the model is robust. With such a pan-zoom model the most frequent geometrical operations within video images can be described. With this pan-zoom model, the model-based motion vector $\vec{D}_b$ for a particular pixel can be determined by:
where $t_x$ and $t_y$ define the translation, $z_x$ and $z_y$ define the zoom, and $x$ and $y$ the location in the image. In U.S. Pat. No. 6,278,736 and in the article “An efficient true-motion estimator using candidate vectors from a parametric motion model”, by G. de Haan et al., IEEE Transactions on Circuits and Systems for Video Technology, Vol. 8, no. 1, pages 85-91, March 1998, it is described how a motion model can be made on the basis of a part of a motion vector field.
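As an illustration, a pan-zoom model that is consistent with the parameters named above takes the form $\vec{D}_b(x, y) = (t_x + z_x \cdot x,\ t_y + z_y \cdot y)$. The sketch below (NumPy assumed; function names are illustrative and not taken from the cited documents) fits such parameters by least squares from motion vectors estimated near the image borders and evaluates the model-based vector for a particular pixel:

```python
import numpy as np

def fit_pan_zoom(xs, ys, dxs, dys):
    """Least-squares fit of the pan-zoom parameters (t_x, t_y, z_x, z_y) from
    border-block positions (xs, ys) and their estimated motion components,
    all given as 1-D arrays."""
    (t_x, z_x), *_ = np.linalg.lstsq(np.column_stack([np.ones_like(xs), xs]), dxs, rcond=None)
    (t_y, z_y), *_ = np.linalg.lstsq(np.column_stack([np.ones_like(ys), ys]), dys, rcond=None)
    return t_x, t_y, z_x, z_y

def model_based_vector(t_x, t_y, z_x, z_y, x, y):
    """Evaluate the pan-zoom model, i.e. the model-based motion vector D_b, at (x, y)."""
    return (t_x + z_x * x, t_y + z_y * y)
```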
It is a further object of the invention to provide an up-conversion unit of the kind described in the opening paragraph comprising a selector for easily determining an appropriate motion vector in an occlusion region.
This object of the invention is achieved in that the selector for selecting the background motion vector for the pixel is as claimed in claim 1.
It is a further object of the invention to provide an image processing apparatus of the kind described in the opening paragraph comprising a selector for easily determining an appropriate motion vector in an occlusion region.
This object of the invention is achieved in that the selector for selecting the background motion vector for the pixel is as claimed in claim 1.
The image processing apparatus may comprise additional components, e.g. a display device for displaying the output images. The image processing apparatus might support one or more of the following types of image processing:
It is a further object of the invention to provide a method for easily determining an appropriate motion vector in an occlusion region.
This object of the invention is achieved in that the method comprises:
It is a further object of the invention to provide a computer program product of the kind described in the opening paragraph for easily determining an appropriate motion vector in an occlusion region.
This object of the invention is achieved in that the computer program product, after being loaded, provides processing means with the capability to carry out:
These and other aspects of the selector, of the method, the up-conversion unit, the image processing apparatus and of the computer program product according to the invention will become apparent from and will be elucidated with respect to the implementations and embodiments described hereinafter and with reference to the accompanying drawings, wherein:
Same reference numerals are used to denote similar parts throughout the figures.
Consider the situation in
In general, a motion estimation unit determines a motion vector for a group of pixels by selecting the best matching motion vector from a set of candidate motion vectors. The match error is usually a Sum of Absolute Differences (SAD) obtained by fetching pixels from the input image at n−1 and comparing those pixels with pixels fetched from the input image at n, using the candidate motion vector, i.e.:
Here $\vec{D}$ is the motion vector, $B(\vec{X})$ is the block located at block position $\vec{X}$, $\vec{x}$ is a pixel position, $F(\vec{x}, n)$ is a luminance frame, n is the image number and α is a relative (temporal) position. An example is given in
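A sketch of such a match-error computation (assumptions of the sketch: grayscale NumPy frames, integer rounding of the displacements, no bounds handling, and the candidate vector split over the two frames according to the relative position α):

```python
import numpy as np

def sad_match_error(frame_prev, frame_next, block_pos, block_size, candidate, alpha=0.5):
    """Sum of absolute differences for one candidate motion vector, comparing a
    block fetched from the image at n-1 with a block fetched from the image at n.
    candidate = (dx, dy), assumed to denote the displacement from n-1 to n."""
    x0, y0 = block_pos
    dx, dy = candidate
    px = x0 - int(round(alpha * dx));         py = y0 - int(round(alpha * dy))
    nx = x0 + int(round((1.0 - alpha) * dx)); ny = y0 + int(round((1.0 - alpha) * dy))
    prev_block = frame_prev[py:py + block_size, px:px + block_size]
    next_block = frame_next[ny:ny + block_size, nx:nx + block_size]
    return int(np.abs(prev_block.astype(np.int32) - next_block.astype(np.int32)).sum())
```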
A problem occurs in occlusion areas. In these areas no motion vector can result in a correct match, since the information is not present in one of the two frames. In the case of uncovering, new information appears and is therefore not present in image 100 at time n−1. In the case of covering, information disappears and is therefore not present in image 104 at time n. As a result, the motion vector field is erroneous in occlusion areas.
In known up-conversion units, usually pixel value information from both images, F(n) and F(n−1), is used for interpolation. For example, motion compensated averaging uses a motion compensated pixel from the image 100 at time n−1 and a motion compensated pixel from the image 104 at time n:
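One common form of such motion-compensated averaging (a sketch under the assumption that the vector denotes the displacement from image n−1 to image n and that α is the relative temporal position of the interpolated image) is the average of the two motion-compensated fetches shown below:

```python
def motion_compensated_average(frame_prev, frame_next, x, y, vector, alpha=0.5):
    """Bidirectional interpolation of one pixel: average of a motion-compensated
    fetch from the image at n-1 and one from the image at n (integer rounding)."""
    dx, dy = vector
    prev = int(frame_prev[int(round(y - alpha * dy)), int(round(x - alpha * dx))])
    nxt = int(frame_next[int(round(y + (1.0 - alpha) * dy)), int(round(x + (1.0 - alpha) * dx))])
    return (prev + nxt) // 2
```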
Even if the correct motion vector is used, the result in occlusion areas is erroneous since either the pixel from the image 100 at time n−1 or from the image 104 at time n is wrong.
A solution to the halo problem comprises at least two actions. Firstly, adjust the probably wrong motion vector in occlusion regions such that the correct motion vector is used in the up-conversion. Secondly, using the correct motion vector, fetch the pixel value information from the correct image, i.e. use unidirectional fetches instead of bi-directional fetches.
There are, however, some difficulties. In order to perform the first action it must be known where the occlusion areas are. Hence, occlusion detection and foreground/background motion detection are required.
In order to perform the second action it must be known what type of occlusion there is. If it is covering, then the pixel value information from the image at time n−1 must be fetched. If it is uncovering, then the pixel value information from the image at time n must be fetched. Hence covering/uncovering detection is required. The book “Video processing for multimedia systems”, by G. de Haan, University Press Eindhoven, 2000, ISBN 90-9014015-8, chapter 4, describes methods for the detection of occlusion regions and for the covering/uncovering classification.
In the following the foreground/background motion detection according to the invention is described.
In order to determine the background motion vector of a location $\vec{x}$ in an occlusion region, a set of motion vectors being determined by the motion estimation unit is required. Typically this set of motion vectors comprises two motion vectors. The first one is the motion vector which has been estimated for the location $\vec{x}$ by the motion estimation unit 504, $\vec{D}_c = \vec{D}(\vec{x})$, and the second one is an alternative motion vector, i.e. a motion vector being determined for a location $\vec{x} + \delta$ in the neighborhood, $\vec{D}_a = \vec{D}(\vec{x} + \delta)$. In order to determine the alternative motion vector $\vec{D}_a$, motion vectors from locations a number of pixels (typically δ = 16) to the left, $\vec{D}_l$, and right, $\vec{D}_r$, of the current position are evaluated. The motion vector being most different from the current vector is selected as the alternative motion vector $\vec{D}_a$,
where $\vec{D}(\vec{x})$ is the vector field. (See also U.S. Pat. No. 5,777,682.)
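A sketch of this selection (the left/right offset δ and the L1 distance measure below are assumptions of the sketch; `vector_field` is assumed to be a lookup returning the estimated motion vector at a position):

```python
def alternative_vector(vector_field, x, y, delta=16):
    """Select, from the vectors delta pixels to the left and to the right of the
    current position, the one most different (L1 distance) from the current vector."""
    d_c = vector_field(x, y)          # current vector D_c
    d_l = vector_field(x - delta, y)  # left neighbor D_l
    d_r = vector_field(x + delta, y)  # right neighbor D_r
    def l1(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])
    return d_l if l1(d_l, d_c) >= l1(d_r, d_c) else d_r
```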
In order to classify the motion vectors $\vec{D}_c$ and $\vec{D}_a$ into foreground and background, these motion vectors are compared with the motion vector which is computed on basis of the motion model for the background of the image, $\vec{D}_b$. The actual background vector is the motion vector which has the minimal distance to $\vec{D}_b$, i.e.:
If $|\vec{D}_c - \vec{D}_b| < |\vec{D}_a - \vec{D}_b|$, then $\vec{bg} = \vec{D}_c$ and $\vec{fg} = \vec{D}_a$ (6)
If $|\vec{D}_c - \vec{D}_b| \geq |\vec{D}_a - \vec{D}_b|$, then $\vec{bg} = \vec{D}_a$ and $\vec{fg} = \vec{D}_c$ (7)
The working of the up-conversion unit 500 is as follows. On the input connector 514 a signal representing a series of input images 100 and 104 is provided. The up-conversion unit 500 is arranged to provide a series of output images at the output connector 516, comprising the input images 100 and 104 and intermediate images, e.g. 102. The motion estimation unit 504 is arranged to compute a motion vector field 400 for the intermediate image on basis of the input images 100 and 104. On basis of the pixel values 524 of the input images 100 and 104 and on basis of the motion vectors 522 the interpolating unit 506 is arranged to compute the pixel values of the intermediate image 102. In principle this is done by means of a bi-directional fetch of pixel values. However, as explained above, this results in artifacts in occlusion regions. Because of that, the up-conversion unit 500 according to the invention is arranged to perform an alternative interpolation for these occlusion regions.
The up-conversion unit 500 comprises a detection unit 508 for detecting the occlusion regions in the image and for control of the interpolating unit 506. The detection unit 508 is arranged to classify the type of occlusion as described in patent application EP1048170. The classification is based on comparing neighboring motion vectors. The classification is as follows:
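As a hedged sketch of such a comparison (the sign convention and the threshold below are assumptions, not taken from EP1048170), covering can be associated with converging x-components of the neighboring vectors and uncovering with diverging ones:

```python
def classify_occlusion(d_left_x, d_right_x, threshold=1.0):
    """Classify a region from the x-components of the left and right neighboring
    motion vectors (assumed convention: converging vectors indicate covering)."""
    if d_left_x - d_right_x > threshold:
        return "covering"      # neighboring vectors converge
    if d_right_x - d_left_x > threshold:
        return "uncovering"    # neighboring vectors diverge
    return "no occlusion"
```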
Here $D_{l,x}$ denotes the x-component of the left motion vector and $D_{r,x}$ the x-component of the right motion vector to be compared. The detection unit 508 provides the selector 502 with a set of motion vectors 518. Typically this set of motion vectors comprises two motion vectors. The selector 502 is arranged to determine which of these motion vectors corresponds to the background motion and which corresponds to the foreground motion. On basis of the background motion vector 526 the interpolating unit 506 is arranged to fetch the corresponding pixel value from the appropriate image:
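A sketch of this unidirectional fetch (illustrative; it follows the assumed α convention of the earlier sketches and the rule that covering is interpolated from the image at n−1 and uncovering from the image at n):

```python
def unidirectional_fetch(frame_prev, frame_next, x, y, bg_vector, occlusion, alpha=0.5):
    """Fetch the pixel value from a single image using the background motion vector:
    covering -> previous image (n-1), uncovering -> next image (n)."""
    dx, dy = bg_vector
    if occlusion == "covering":
        # The background is still visible in the image at n-1.
        return frame_prev[int(round(y - alpha * dy)), int(round(x - alpha * dx))]
    # "uncovering": the background has already appeared in the image at n.
    return frame_next[int(round(y + (1.0 - alpha) * dy)), int(round(x + (1.0 - alpha) * dx))]
```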
In summary, the halo reduction is as follows. The halo reduction starts by determining the occlusion regions. Only in the occlusion regions does the up-conversion deviate from the “normal” up-conversion, motion compensated averaging, as specified in Equation 3. In occlusion regions the motion vector field is inaccurate. Therefore, it is tested whether or not an alternative motion vector $\vec{D}_a$ is better than the motion vector $\vec{D}_c$ which has been estimated by the motion estimation unit 504 for the current pixel. These two motion vectors, the current motion vector $\vec{D}_c$ and the alternative motion vector $\vec{D}_a$, are provided to the selector 502, which is arranged to determine the background motion vector. With the appropriate motion vector, the appropriate pixel value is fetched from the preceding or succeeding image.
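Tying the steps together, a compact and purely illustrative flow for one pixel could look as follows (it reuses the hypothetical helpers sketched above; none of the names are taken from the claims):

```python
def upconvert_pixel(frame_prev, frame_next, x, y, vector_field,
                    pan_zoom_params, alpha=0.5, delta=16):
    """Halo-reduced up-conversion of one pixel, following the summary above."""
    d_c = vector_field(x, y)
    d_l_x = vector_field(x - delta, y)[0]
    d_r_x = vector_field(x + delta, y)[0]
    occlusion = classify_occlusion(d_l_x, d_r_x)
    if occlusion == "no occlusion":
        # Normal case: bidirectional, motion-compensated averaging.
        return motion_compensated_average(frame_prev, frame_next, x, y, d_c, alpha)
    # Occlusion case: determine the background vector and fetch unidirectionally.
    d_a = alternative_vector(vector_field, x, y, delta)
    d_b = model_based_vector(*pan_zoom_params, x, y)
    bg = select_background_vector([d_c, d_a], d_b)
    return unidirectional_fetch(frame_prev, frame_next, x, y, bg, occlusion, alpha)
```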
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word ‘comprising’ does not exclude the presence of elements or steps not listed in a claim. The word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means can be embodied by one and the same item of hardware.
Number | Date | Country | Kind
--- | --- | --- | ---
03100134 | Jan 2003 | EP | regional

Filing Document | Filing Date | Country | Kind | 371c Date
--- | --- | --- | --- | ---
PCT/IB03/06182 | 12/16/2003 | WO | 00 | 7/20/2005

Publishing Document | Publishing Date | Country | Kind
--- | --- | --- | ---
WO2004/066624 | 8/5/2004 | WO | A

Number | Name | Date | Kind
--- | --- | --- | ---
6005639 | Thomas et al. | Dec 1999 | A
6219436 | De Haan et al. | Apr 2001 | B1
6278736 | De Haan et al. | Aug 2001 | B1
6954498 | Lipton | Oct 2005 | B1
20020154695 | Cornog et al. | Oct 2002 | A1
20040252763 | Mertens | Dec 2004 | A1

Number | Date | Country
--- | --- | ---
20060072790 A1 | Apr 2006 | US