The invention relates to an image conversion unit for converting a first image sequence, comprising a first image with a first resolution and a second image with the first resolution into a second image sequence comprising a third image with a second resolution, the image conversion unit comprising:
a coefficient-calculating means for calculating a first filter coefficient on basis of pixel values of the first image;
an adaptive filtering means for calculating a third pixel value of the third image on basis of a first one of the pixel values of the first image and the first filter coefficient.
The invention further relates to a method of converting a first image sequence, comprising a first image with a first resolution and a second image with the first resolution into a second image sequence comprising a third image with a second resolution, the method comprising:
calculating a first filter coefficient on basis of pixel values of the first image; and
calculating a third pixel value of the third image on basis of a first one of the pixel values of the first image and the first filter coefficient.
The invention further relates to an image processing apparatus comprising:
receiving means for receiving a signal corresponding to a first image sequence; and
the above mentioned image conversion unit for converting the first image sequence into a second image sequence.
The advent of HDTV emphasizes the need for spatial up-conversion techniques that enable standard definition (SD) video material to be viewed on high definition (HD) television (TV) displays. Conventional techniques are linear interpolation methods such as bi-linear interpolation and methods using poly-phase low-pass interpolation filters. The former is not popular in television applications because of its low quality, but the latter is available in commercially available ICs. With the linear methods, the number of pixels in the frame is increased, but the high frequency part of the spectrum is not extended, i.e. the perceived sharpness of the image is not increased. In other words, the capability of the display is not fully exploited.
In addition to the conventional linear techniques, a number of non-linear algorithms have been proposed to achieve this up-conversion. These techniques are sometimes referred to as content-based or edge-dependent spatial up-conversion. Some of the techniques are already available on the consumer electronics market.
An embodiment of the image conversion unit of the kind described in the opening paragraph is known from the article “New Edge-Directed Interpolation”, by Xin Li et al., in IEEE Transactions on Image Processing, Vol. 10, No 10, October 2001, pp. 1521-1527. In this image conversion unit, the filter coefficients of an interpolation up-conversion filter are adapted to the local image content. The interpolation up-conversion filter aperture uses a fourth order interpolation algorithm as specified in Equation 1:
with FHD(i, j) the luminance values of the HD output pixels, FSD(i, j) the luminance values of the input pixels and wi the filter coefficients. The filter coefficients are obtained from a larger aperture using a Least Mean Squares (LMS) optimization procedure. The cited article explains how the filter coefficients are calculated. The method according to the prior art is also explained in connection with
Although the “New Edge-Directed Interpolation” method according to the cited prior art works relatively well in many image parts, there is a problem with selecting the appropriate window for the LMS method. For windows of size n by m, there are (n−2)(m−2) equations. Experimentally, the inventor found that a window of 4 by 4, which results in 4 equations, did not lead to a robust upscaling. Better results have been obtained using windows of 8 by 8, i.e. with 36 equations. Although the up-conversion was more robust, there was also more blurring. It is assumed that this is due to the fact that the image statistics are not constant over this larger area, which causes the filter to converge towards a plain averaging filter. To conclude: there is a conflict that complicates the choice of the window size. On the one hand, for robustness the window size has to be large. On the other hand, for constant image statistics the window size has to be as small as possible. Finally, the LMS optimization requires at least the same number of equations as there are unknown coefficients, which gives a lower bound to the window size.
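The equation counts quoted above can be checked with a short Python sketch. It assumes, as a reading of the (n−2)(m−2) formula, that only the interior pixels of the window contribute an equation because only they have all four diagonal neighbors inside the window; the function name is hypothetical.

```python
def lms_equation_count(n: int, m: int) -> int:
    """Number of LMS training equations for an n-by-m window.

    Assumption: one equation per interior pixel, i.e. (n - 2) * (m - 2).
    """
    if n < 3 or m < 3:
        return 0  # no pixel has all four diagonal neighbors inside the window
    return (n - 2) * (m - 2)


NUM_COEFFICIENTS = 4  # four filter coefficients have to be estimated

for size in (4, 8):
    count = lms_equation_count(size, size)
    # A window is only usable when count >= NUM_COEFFICIENTS.
    print(size, count, count >= NUM_COEFFICIENTS)
```

A 4 by 4 window sits exactly on the lower bound of four equations for four unknowns, which is consistent with the lack of robustness reported for it.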
It is an object of the invention to provide an image conversion unit of the kind described in the opening paragraph which is relatively robust while the amount of image blur is relatively low.
This object of the invention is achieved in that the coefficient-calculating means is arranged to calculate the first filter coefficient on basis of further pixel values of the second image. In other words, the aperture of the coefficient-calculating means is enlarged in the temporal domain rather than in the spatial domain. The assumption then is that in corresponding (smaller) image parts of different images, the statistics are more similar than in different locations of a (larger) part in the same image. This is particularly to be expected in the case that the corresponding image parts are taken along the motion trajectory. So, in addition to the assumption that edge orientation is independent of scale, it is now assumed that edge orientation is constant over time when corrected for motion. Pixel values are luminance values or color values.
Notice that the further pixel values are not applied in the direct path of processing the input pixels of the first image into output pixels, i.e. the pixels of the third image, but in the control path to determine the filter coefficients. Combining input pixel values of multiple input fields into a single output pixel value of a single output image, i.e. frame, is for instance known as de-interlacing. Interlacing is the common video broadcast procedure for transmitting the odd and even numbered image lines alternately. De-interlacing attempts to restore the full vertical resolution, i.e. make odd and even lines available simultaneously for each image. The purpose of de-interlacing is the reduction of alias in successive fields. However, a purpose of the image conversion unit according to the present invention is to increase the resolution of input images on basis of respective input images. This is done by means of a spatial filter which is adapted to edges in order to limit the amount of blur which would arise without the adaptation to the edges. The spatial filter is controlled by means of filter coefficients which are determined on basis of multiple input images.
An embodiment of the image conversion unit according to the invention is arranged to acquire the pixel values of the first image from a first part of the first image and the further pixel values of the second image from a second part of the second image, with the first part and the second part spatially corresponding. An advantage of this embodiment is that it is relatively simple. Acquisition of the appropriate pixels from the second image is straightforward without additional calculations. Temporary storage of a number of pixel values of the second image is required.
An embodiment of the image conversion unit according to the invention is arranged to acquire the pixel values of the first image from a first part of the first image and the further pixel values of the second image from a second part of the second image, with the first part and the second part at a motion trajectory. Motion vectors have to be provided by means of a motion estimator. These motion vectors describe the relation between the first part and the second part. An advantage of this embodiment is that the images of the second sequence, i.e. the output images, are relatively sharp.
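The difference between the two acquisition variants can be illustrated with a small Python sketch. It assumes images stored as lists of rows, integer motion vectors given as (dy, dx) displacements from the first image to the second, and clamping at the image border; these conventions and the function names are illustrative assumptions and are not taken from the description.

```python
from typing import List, Tuple

Image = List[List[float]]


def acquire_window(image: Image, top: int, left: int, size: int) -> Image:
    """Copy a size-by-size block; coordinates are clamped at the border."""
    height, width = len(image), len(image[0])
    return [
        [image[min(max(top + r, 0), height - 1)][min(max(left + c, 0), width - 1)]
         for c in range(size)]
        for r in range(size)
    ]


def acquire_window_pair(first: Image, second: Image, top: int, left: int,
                        size: int, motion: Tuple[int, int] = (0, 0)):
    """Window in the first image plus the matching window in the second.

    motion == (0, 0) gives spatially corresponding parts; a non-zero
    vector shifts the second window along the (assumed) motion trajectory.
    """
    dy, dx = motion
    return (acquire_window(first, top, left, size),
            acquire_window(second, top + dy, left + dx, size))


# Toy data: the second image is the first one shifted one pixel to the right.
first = [[10 * r + c for c in range(6)] for r in range(6)]
second = [[10 * r + (c - 1) for c in range(6)] for r in range(6)]

same_place = acquire_window_pair(first, second, 1, 1, 3)
along_motion = acquire_window_pair(first, second, 1, 1, 3, motion=(0, 1))
print(same_place[0] == same_place[1])      # False: content has moved
print(along_motion[0] == along_motion[1])  # True: same content found
```

In the toy data only the motion-compensated window contains the same image content as the first window, which is the situation in which the statistics of the two parts are expected to match best.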
In an embodiment of the image conversion unit according to the invention the coefficient-calculating means is arranged to calculate the first filter coefficient by means of an optimization algorithm. Preferably the optimization algorithm is a Least Mean Square algorithm. An LMS algorithm is relatively simple and robust.
It is a further object of the invention to provide a method of the kind described in the opening paragraph which is relatively robust while the amount of image blur is relatively low.
This object of the invention is achieved in that the first filter coefficient is calculated on basis of further pixel values of the second image.
It is a further object of the invention to provide an image processing apparatus of the kind described in the opening paragraph, of which the image conversion unit is relatively robust while the amount of image blur is relatively low.
This object of the invention is achieved in that the coefficient-calculating means of the image processing apparatus is arranged to calculate the first filter coefficient on basis of further pixel values of the second image. The image processing apparatus optionally comprises a display device for displaying the second image. The image processing apparatus might e.g. be a TV, a set top box, a VCR (Video Cassette Recorder) player or a DVD (Digital Versatile Disk) player. Modifications of the image conversion unit and variations thereof may correspond to modifications and variations of the method and of the image processing apparatus described.
These and other aspects of the image conversion unit, of the method and of the image processing apparatus according to the invention will become apparent from and will be elucidated with respect to the implementations and embodiments described hereinafter and with reference to the accompanying drawings, wherein:
Same reference numerals are used to denote similar parts throughout the figures.
A pixel acquisition unit 102 which is arranged to acquire a first set of pixel values of pixels 1-4 (See
A filter coefficient-calculating unit 106, which is arranged to calculate filter coefficients on basis of the first set of pixel values and the second set of pixel values. In other words, the filter coefficients are approximated from the SD input image within a local window. This is done by using a Least Mean Squares (LMS) method which is explained in connection with
An adaptive filtering unit 104 for calculating the pixel value of the HD output pixel on basis of the first set of pixel values and the filter coefficients as specified in Equation 1. Hence the filter coefficient-calculating unit 106 is arranged to control the adaptive filtering unit 104.
$$F_{HD} = w_1 F_{SD}(1) + w_2 F_{SD}(2) + w_3 F_{SD}(3) + w_4 F_{SD}(4) \qquad (2)$$
where FSD(1) to FSD(4) are the pixel values of the 4 SD input pixels 1-4 and w1 to w4 are the filter coefficients to be calculated by means of the LMS method. The authors of the cited article in which the prior art method is described, make the sensible assumption that edge orientation does not change with scaling. The consequence of this assumption is that the optimal filter coefficients are the same as those to interpolate, on the standard resolution grid:
Pixel 1 from 5, 7, 11, and 4 (that means that pixel 1 can be derived from its 4 neighbors)
Pixel 2 from 6, 8, 3, and 12
Pixel 3 from 9, 2, 13, and 15
Pixel 4 from 1, 10, 14, and 16
This gives a set of 4 linear equations from which, with the LMS optimization, the optimal 4 filter coefficients to interpolate the HD output pixel are found.
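The construction above can be sketched in Python. The sketch assumes the window is stored as a list of rows and orders the four diagonal neighbors as upper-left, upper-right, lower-left, lower-right; the function names and this ordering are illustrative assumptions, since the actual pixel numbering is defined in the figure.

```python
from typing import List, Sequence, Tuple


def interpolate_hd_pixel(neighbors: Sequence[float],
                         weights: Sequence[float]) -> float:
    """Equation 2: weighted sum of the four SD pixels around the HD pixel."""
    return sum(w * p for w, p in zip(weights, neighbors))


def training_equations(window: List[List[float]]) -> List[Tuple[List[float], float]]:
    """Pair every interior SD pixel with its four diagonal SD neighbors.

    Each pair (neighbors, target) is one linear equation in the four
    unknown filter coefficients.
    """
    equations = []
    for r in range(1, len(window) - 1):
        for c in range(1, len(window[0]) - 1):
            diagonals = [window[r - 1][c - 1], window[r - 1][c + 1],
                         window[r + 1][c - 1], window[r + 1][c + 1]]
            equations.append((diagonals, window[r][c]))
    return equations


# Toy 4-by-4 window containing a linear ramp.
window = [[4 * r + c for c in range(4)] for r in range(4)]
equations = training_equations(window)
print(len(equations))  # 4 equations for a 4-by-4 window

# On a ramp, equal weights already reproduce every interior pixel.
for diagonals, target in equations:
    print(target, interpolate_hd_pixel(diagonals, [0.25] * 4))
```

The same weights that best reproduce the interior SD pixels from their diagonal neighbors are then reused in Equation 2 to interpolate the HD output pixel, in line with the assumption that edge orientation does not change with scaling.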
Denoting M as the pixel set, on the SD grid, used to calculate the 4 weights, the Mean Square Error (MSE) over set M in the optimization can be written as the sum of squared differences between original SD-pixels FSD and interpolated SD-pixels FSI:
Which in matrix formulation becomes:
Here $\vec{y}$ contains the SD-pixels in M (pixels FSD(1,1) to FSD(1,4), FSD(2,1) to FSD(2,4), FSD(3,1) to FSD(3,4), FSD(4,1) to FSD(4,4)) and C is a 4×M² matrix whose kth row is composed of the weighted sum of the four diagonal SD-neighbors of each SD-pixel in $\vec{y}$.
The weighted sum of each row describes a pixel FSI, as used in Equation 3. To find the minimum MSE, i.e. LMS, the derivative of the MSE with respect to $\vec{w}$ is calculated:
By solving Equation 7 the filter coefficients are found and by using Equation 2 the pixel values of the HD output pixels can be calculated.
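The least-squares steps described above follow a standard pattern. As a sketch only, assuming column vectors and a matrix C arranged with one row per SD pixel in M and one column per filter coefficient (a convention chosen here for illustration, which may be the transpose of the layout used in the numbered equations), the derivation can be written as:

```latex
\begin{align*}
\mathrm{MSE} &= \sum_{F_{SD}(i,j) \in M} \bigl( F_{SD}(i,j) - F_{SI}(i,j) \bigr)^{2}
              = \lVert \vec{y} - C\vec{w} \rVert^{2} \\
\frac{\partial\,\mathrm{MSE}}{\partial \vec{w}} &= -2\,C^{T} \bigl( \vec{y} - C\vec{w} \bigr) = 0 \\
\vec{w} &= \bigl( C^{T} C \bigr)^{-1} C^{T} \vec{y}
\end{align*}
```

The last line can only be evaluated when $C^{T}C$ is invertible, which requires at least as many independent equations as there are coefficients. This matches the lower bound on the window size mentioned in the discussion of the prior art.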
In this example a window of 4 by 4 pixels is used for the calculation of the filter coefficients. An LMS optimization on a larger window, e.g. 8 by 8 instead of 4 by 4, gives better results.
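A minimal Python sketch of such an LMS solve is given below. It assumes each training equation is supplied as a list of four neighbor values plus one target value, forms the normal equations and solves them by Gaussian elimination. The function names, the pivot threshold and the fallback to equal weights when the system is singular are illustrative assumptions, not part of the described method.

```python
from typing import List, Optional


def solve_linear_system(a: List[List[float]], b: List[float],
                        eps: float = 1e-9) -> Optional[List[float]]:
    """Gaussian elimination with partial pivoting; None if (near-)singular."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < eps:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def solve_lms(rows: List[List[float]], targets: List[float]) -> List[float]:
    """Least-squares weights from the normal equations (C^T C) w = C^T y."""
    size = 4
    ctc = [[sum(row[i] * row[j] for row in rows) for j in range(size)]
           for i in range(size)]
    cty = [sum(row[i] * t for row, t in zip(rows, targets))
           for i in range(size)]
    weights = solve_linear_system(ctc, cty)
    # Assumed fallback: plain averaging when no unique solution exists.
    return weights if weights is not None else [0.25] * size


# Six consistent equations generated from known weights.
rows = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
        [1, 1, 1, 1], [1, 2, 3, 4]]
known = [0.4, 0.1, 0.1, 0.4]
targets = [sum(p * w for p, w in zip(row, known)) for row in rows]
print(solve_lms(rows, targets))  # close to the known weights
```

With more equations than unknowns the solve averages out noise in the individual equations, which is one way to read the improved robustness observed for the larger window.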
A memory device for storage of a number of pixels of a number of SD input images.
A pixel acquisition unit 102 which is arranged to acquire:
a first set of pixel values of pixels from a first one of the SD input images in a first neighborhood of a particular location within the first SD input image, which corresponds with the location of the HD output pixel;
a second set of pixel values of pixels from the first SD input image in a second neighborhood of the particular location;
a third set of pixel values of pixels from a second one of the SD input images in a third neighborhood of the particular location;
an optional fourth set of pixel values of pixels from a third one of the SD input images in a fourth neighborhood of the particular location.
A filter coefficient-calculating unit 106 which is arranged to calculate filter coefficients on basis of the first, second, third and optionally fourth set of pixel values. In other words, the filter coefficients are approximated from the SD input images within a local window located in the first SD input image and the window extending to the second SD input image and optionally to the third SD input image. Preferably the second SD input image and the third SD input image are respectively preceding and succeeding the first SD input image in the sequence of SD input images. The approximation of the filter coefficients is done by using a Least Mean Squares (LMS) method which is explained in connection with
An adaptive filtering unit 104 for calculating a pixel value of an HD output image on basis of the first set of pixel values. The HD output pixel is calculated as the weighted sum of the pixel values of the first set of pixel values.
The image conversion unit 200 optionally comprises an input connector 114 for providing motion vectors to be applied by the pixel acquisition unit 102 for the acquisition of pixel values in the succeeding SD input images of the SD input image sequence, which are on respective motion trajectories, as explained in connection with
The number of pixels acquired in the neighborhood, i.e. the window size, might be even or odd, e.g. 4*4 or 5*5 respectively. Besides, the shape of the window does not have to be rectangular. Also, the number of pixels acquired from the first image and the number of pixels acquired from the second image do not have to be mutually equal.
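How windows from several input images can feed one set of training equations may be sketched in Python as follows. The sketch assumes that each window has already been acquired (spatially corresponding or along a motion trajectory), that windows may differ in size, and that a non-rectangular shape is expressed by marking excluded positions with None; the function name and these conventions are illustrative assumptions.

```python
from typing import List, Optional, Tuple

Window = List[List[Optional[float]]]


def collect_equations(windows: List[Window]) -> Tuple[List[List[float]], List[float]]:
    """Gather LMS training equations from windows of one or more images.

    Every pixel that has all four diagonal neighbors available adds one
    equation: its neighbors form a row, its own value is the target.
    """
    rows: List[List[float]] = []
    targets: List[float] = []
    for window in windows:
        for r in range(1, len(window) - 1):
            for c in range(1, len(window[0]) - 1):
                values = [window[r][c],
                          window[r - 1][c - 1], window[r - 1][c + 1],
                          window[r + 1][c - 1], window[r + 1][c + 1]]
                if any(v is None for v in values):
                    continue  # position lies outside a non-rectangular window
                targets.append(values[0])
                rows.append(values[1:])
    return rows, targets


# A 4-by-4 window from the current image and a 5-by-5 window from another one.
current = [[float(4 * r + c) for c in range(4)] for r in range(4)]
previous = [[float(5 * r + c) for c in range(5)] for r in range(5)]

rows, targets = collect_equations([current])
print(len(rows))  # 4 equations from the spatial window alone

rows, targets = collect_equations([current, previous])
print(len(rows))  # 13 equations once the second image contributes
```

The resulting rows and targets can be handed to any least-squares solver. The point of the sketch is that the number of equations grows by extending the window in time while the spatial extent in each image stays small.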
The pixel acquisition unit 102, the filter coefficient-calculating unit 106 and the adaptive filtering unit 104 may be implemented using one processor. Normally, these functions are performed under control of a software program product. During execution, normally the software program product is loaded into a memory, like a RAM, and executed from there. The program may be loaded from a background memory, like a ROM, hard disk, or magnetic and/or optical storage, or may be loaded via a network like the Internet. Optionally an application specific integrated circuit provides the disclosed functionality.
To convert an SD input image into an HD output image a number of processing steps are needed. By means of
Receiving means 402 for receiving a signal representing SD images. The signal may be a broadcast signal received via an antenna or cable but may also be a signal from a storage device like a VCR (Video Cassette Recorder) or Digital Versatile Disk (DVD). The signal is provided at the input connector 408;
The image conversion unit 404 as described in connection with
A display device 406 for displaying the HD output images of the image conversion unit 200. This display device 406 is optional.
The image processing apparatus 400 might e.g. be a TV. Alternatively the image processing apparatus 400 does not comprise the optional display device but provides HD images to an apparatus that does comprise a display device 406. Then the image processing apparatus 400 might be e.g. a set top box, a satellite-tuner, a VCR player or a DVD player. But it might also be a system being applied by a film-studio or broadcaster.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word ‘comprising’ does not exclude the presence of elements or steps not listed in a claim. The word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means can be embodied by one and the same item of hardware.
Number | Date | Country | Kind |
---|---|---|---|
02078991.3 | Sep 2002 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB03/03563 | 8/8/2003 | WO | | 3/18/2005 |