This invention relates to the field of image processing, and more particularly to a method of up-sampling images, in particular video images. The method is suitable for converting standard-definition video to high-definition video, especially in real-time applications such as video conferencing.
In image processing, up-sampling is used to magnify an entire image or to zoom into a part of an image. Up-sampling involves using some technique to fill in empty pixels when an image of a given resolution is displayed at a higher resolution. When the image is a video frame, the up-sampling must be performed in real-time. The term “real-time” typically means being capable of up-sampling an image with a resolution of 960×540 to 1920×1080 (full high definition) at a frame rate of 30 frames per second.
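For reference, the output pixel rate implied by this definition can be worked out directly (a trivial sketch):

```python
# Output pixel rate implied by the "real-time" target above.
pixels_per_frame = 1920 * 1080             # 2,073,600 pixels per full-HD frame
pixels_per_second = pixels_per_frame * 30  # 62,208,000 up-sampled pixels per second
print(pixels_per_second)
```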
Currently, the most commonly used up-sampling methods are nearest neighbor, bilinear interpolation, and bicubic interpolation (applied directly to the original image). Among these three methods, the nearest neighbor gives the coarsest visual quality with grid effects along edges (especially along diagonal edges). Bilinear and bicubic interpolations give images with more natural edges, but with blurred visual quality.
Another more advanced up-sampling method utilizes the Lanczos filter. See, for example, Claude E. Duchon (August 1979), “Lanczos Filtering in One and Two Dimensions”, Journal of Applied Meteorology 18 (8): pp. 1016-1022. Usually, at the level of up-sample-by-2, the visual quality of the up-sampled image using the Lanczos filter is better than that of bilinear or bicubic interpolation, but it is still noticeably blurred compared to the original image.
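For comparison, the conventional up-samplers mentioned above are available in common imaging libraries. A minimal sketch using Pillow (recent versions expose the Image.Resampling filters; the input file name is hypothetical):

```python
from PIL import Image

img = Image.open("frame_960x540.png")                    # hypothetical input frame
size = (img.width * 2, img.height * 2)                   # up-sample by 2 in each dimension

nearest  = img.resize(size, Image.Resampling.NEAREST)    # coarse, grid effects along edges
bilinear = img.resize(size, Image.Resampling.BILINEAR)   # more natural edges, but blurred
bicubic  = img.resize(size, Image.Resampling.BICUBIC)    # more natural edges, but blurred
lanczos  = img.resize(size, Image.Resampling.LANCZOS)    # sharper, yet still noticeably soft
```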
There are even more advanced and complex algorithms for image up-sampling, e.g., the level set-based algorithm, which iteratively controls the contour evolution in the up-sampling process. However, such methods usually require a very long time to up-sample an image (for given computing power), and therefore are not suitable for real-time applications.
There is an up-sample-by-2 method based on a one-dimensional (1D) wavelet filter bank that is suitable for real-time applications under current hardware/software conditions. The method is described in the paper: Image Up-Sampling Using Discrete Wavelet Transform, Ping-Sing Tsai and Tinku Acharya, Proceedings of the Joint Conference on Information Sciences, 2006. It is also described in U.S. Pat. No. 6,377,280.
This method is briefly described with reference to
Up to this point, these operations are the same as in a standard 1D wavelet filter bank decomposition (known at the time of the invention). Tsai and Acharya modified the structure of the four sub-bands for the purpose of up-sampling by 2. For the two sub-bands LH and HL, they increase the resolution by 2, i.e. from (W/2)×(H/2) to W×H, and put each pixel of the original sub-bands at the upper-left pixel location of the corresponding 2×2 pixel group in the up-sampled sub-bands. For sub-band LL, Tsai and Acharya discard the LL output of the decomposition filter and replace it with the original image after applying a scaling factor to it; the resolution of LL is therefore also increased from (W/2)×(H/2) to W×H.
For sub-band HH, Tsai and Acharya simply replace it with an all-zero image of resolution W×H.
Now, the four modified sub-bands (LL2, LH2, HL2 and HH2), each with resolution W×H, are put through a standard reconstruction process using the wavelet filter bank, yielding an up-sampled image of resolution 2W×2H, as shown in
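The prior-art procedure can be sketched with PyWavelets. The Haar wavelet and the value of the LL scaling factor below are assumptions made only to keep the example runnable; they are not necessarily the choices of Tsai and Acharya.

```python
import numpy as np
import pywt

def tsai_acharya_upsample2(img, wavelet="haar", ll_scale=2.0):
    """Illustrative sketch of the prior-art DWT up-sample-by-2 described above.
    `img` is a 2D array with even height and width (the W x H image in the text).
    ll_scale compensates for the gain of the low-pass branch (2.0 is the gain of
    the normalised Haar low-pass branch; the actual factor used in the prior art
    is not reproduced here)."""
    # Standard single-level 2D decomposition: four sub-bands of size (H/2, W/2).
    LL, (LH, HL, HH) = pywt.dwt2(img, wavelet)

    def place_upper_left(band):
        # Double the sub-band resolution, copying each coefficient to the
        # upper-left pixel of its 2x2 group; the other three pixels stay zero.
        up = np.zeros((band.shape[0] * 2, band.shape[1] * 2))
        up[::2, ::2] = band
        return up

    LH2 = place_upper_left(LH)
    HL2 = place_upper_left(HL)
    LL2 = ll_scale * np.asarray(img, dtype=float)  # LL replaced by the scaled original
    HH2 = np.zeros_like(LL2)                       # HH replaced by an all-zero image
    # Standard reconstruction from the four W x H sub-bands gives a 2W x 2H image.
    return pywt.idwt2((LL2, (LH2, HL2, HH2)), wavelet)
```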
While this method is capable of working in real-time, there are noticeable artifacts along edges.
Another wavelet-based up-sampling method is described in the paper Edge-preservation resolution enhancement with oriented wavelets, V. Velisavljevic, Proceedings of IEEE Int. Conf. on Image Proc. (ICIP), 2008. This method applies a 1D wavelet filter along five directions in the image and estimates a directional map by comparing, for each pixel, the filtering results in the five directions. The up-sampling process is based on this estimated directional map. The algorithm works well in preserving the original shape of the edges, but at the expense of higher complexity. It is therefore not suitable for real-time HD video applications under current hardware/software conditions.
An object of the invention is to up-sample an image in real-time with very good visual quality so that the invention can be used for up-sampling a standard definition video to high definition video. The term “very good visual quality” means the up-sampled image should have a degree of crispness similar to the original image without adding obvious artifacts.
According to the present invention there is provided a method of up-sampling an original image to a final up-sampled image, comprising: constructing at least two sub-banded filtered images of the original image with 2D wavelet-based decomposition filters, each filtered image being of the same resolution as the original image; mapping each of the sub-banded filtered images into a larger filtered image of the same size as the final image, wherein pixels in each larger filtered image that were not mapped in from the sub-banded filtered images are interpolated or left blank; filtering the larger filtered images with 2D reconstruction filters; and combining the outputs of the 2D reconstruction filters to form the final up-sampled image.
In one embodiment the invention provides a method of up-sampling, i.e. zooming in, a digital image by first building at least two filtered images of the original image using decomposition filters. Each of the filtered images, which are the same size as the original, is then mapped into a larger filtered image. The larger filtered images are each equal in size to the final image size. Pixels in each larger filtered image which were not mapped in are interpolated from mapped pixels or left blank. Finally the larger filtered images are filtered again using reconstruction filters and combined to form the up-sampled image. All filters are different and are based on the theory of perfect reconstruction wavelet filter banks. In one embodiment quincunx filters are used.
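As an illustration of this structure only, the following sketch applies the decomposition and reconstruction masks as ordinary 2D convolutions; the mask arguments and the `enlarge` callable are placeholders, since the patented quincunx coefficients and the exact up-sampling step are shown only in the drawings and described later.

```python
import numpy as np
from scipy.signal import convolve2d

def upsample_by_2(img, F0, G0, F1_ext, G1_ext, enlarge):
    """Structural sketch of the claimed pipeline (not the patented coefficients).
    F0/G0 are non-separable 2D decomposition masks, F1_ext/G1_ext the extended
    2D reconstruction masks, and `enlarge` maps a sub-band into an image of the
    final size, interpolating or zeroing the unmapped pixels."""
    img = np.asarray(img, dtype=float)

    # 1. Two sub-band filtered images, each the same resolution as the input.
    IF0 = convolve2d(img, F0, mode="same", boundary="symm")
    IG0 = convolve2d(img, G0, mode="same", boundary="symm")

    # 2. Map each sub-band into a larger image of the final (2x) size.
    IF0_up, IG0_up = enlarge(IF0), enlarge(IG0)

    # 3. Reconstruction filtering, then combine by per-pixel addition.
    return (convolve2d(IF0_up, F1_ext, mode="same", boundary="symm") +
            convolve2d(IG0_up, G1_ext, mode="same", boundary="symm"))
```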
The invention is wavelet-based. Unlike prior art wavelet-based up-sampling methods in which 1D wavelet filters are used with at least four sub-bands involved, the invention uses non-separable 2D wavelet filters (quincunx filter bank) and in one embodiment only has two sub-bands involved.
Unlike the method of Tsai and Acharya, embodiments of the invention do not discard sub-band pixel values in the wavelet decomposition stage. Instead, one diagonal pixel per 2×2 group is interpolated in each of the two sub-bands before reconstruction. For the up-sampling of the low-pass filtered sub-band, the invention does not require the use of any scaled pixel values of the original image.
Unlike the two wavelet-based up-sampling methods above, which use the same reconstruction filters as a standard wavelet filter bank, embodiments of the invention adjust the reconstruction wavelet filters in order to produce crisper up-sampled images.
In another aspect the invention provides an apparatus for up-sampling an original image to a final up-sampled image, comprising: a plurality of 2D wavelet-based decomposition filters for constructing at least two sub-banded filtered images of the original image, each filtered image being of the same resolution as the original image; up-sampling units for mapping each of the sub-banded filtered images into a larger filtered image of the same size as the final image, wherein pixels in each larger filtered image that were not mapped in from the sub-banded filtered images are interpolated or left blank; 2D reconstruction filters for processing the larger filtered images; and a combiner to form the final up-sampled image from the filtered images output by the 2D reconstruction filters.
The invention will now be described in more detail, by way of example only, with reference to the accompanying drawings, in which:—
FIGS. 4a to 4d show the filter coefficients for various filters;
FIGS. 5a and 5b show the filtered image and the up-sampled filtered image respectively; and
FIGS. 6a and 6b show the reconstruction filter coefficients.
One embodiment of the invention will now be described. A typical video display system is illustrated in
The basic structure of the invention is shown in
Both filters 204 and 220 are non-separable 2D wavelet quincunx filters. The theory behind the quincunx filter is described in “Perfect reconstruction filter banks for HDTV representation and coding” by M. Vetterli, J. Kovačević and D. J. LeGall, published in Image Comm. Journal, special issue on HDTV, vol. 2, no. 3, October 1990, pp. 349-364, the contents of which are herein incorporated by reference. The coefficients used in filter 204 are shown in
Returning to
The up-sampling and interpolation process is described with reference to
By way of example,
calculate dif1 = |p1−p4| and dif2 = |p2−p3|
if (dif1 > dif2)
then
else if (dif1 < dif2)
then
else
The above up-sampling and interpolation procedure is iteratively applied to each pixel on the diagonal. The remaining pixels in the rows and columns containing the directly mapped pixels from the filtered image remain zero in value in the preferred embodiment of the invention. We now have two adjusted sub-bands IF0
This process is referred to as up-sampling and interpolate diagonal element, and is illustrated in
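Because the branch bodies of the pseudo-code above are not reproduced in this text, the following sketch fills them with the natural edge-directed choice, namely averaging along the diagonal that shows the smaller difference; that completion, and the treatment of the last row and column, are assumptions rather than a quotation of the patented procedure.

```python
import numpy as np

def upsample_and_interpolate_diagonal(band):
    """Sketch of the "up-sample and interpolate diagonal element" step.
    Each filtered pixel is copied to the upper-left corner of its 2x2 group;
    only the diagonal pixel of each group is interpolated, and the remaining
    positions stay zero, as described above."""
    h, w = band.shape
    up = np.zeros((2 * h, 2 * w), dtype=float)
    up[::2, ::2] = band                        # direct mapping to upper-left corners

    # Groups in the last row/column have no lower/right neighbours; their
    # diagonal element is left at zero in this sketch.
    for i in range(h - 1):
        for j in range(w - 1):
            p1, p2 = band[i, j], band[i, j + 1]
            p3, p4 = band[i + 1, j], band[i + 1, j + 1]
            dif1, dif2 = abs(p1 - p4), abs(p2 - p3)
            if dif1 > dif2:
                d = 0.5 * (p2 + p3)            # assumed: average the more similar pair
            elif dif1 < dif2:
                d = 0.5 * (p1 + p4)            # assumed: average the more similar pair
            else:
                d = 0.25 * (p1 + p2 + p3 + p4) # assumed: average all four neighbours
            up[2 * i + 1, 2 * j + 1] = d       # diagonal element of the 2x2 group
    return up
```

Applying this procedure to each of the two sub-bands yields the adjusted sub-bands that are passed to the reconstruction filters.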
In the final stage each adjusted sub-band 210 and 226 is processed through reconstruction filters 212 and 228 respectively, and then additively mixed in adder 214 (i.e. the value of each pixel p212(x,y) resulting from 212 is added to the value of the corresponding pixel p228(x,y) resulting from 228 to yield the corresponding pixel p(x,y) in the up-sampled image 216).
The reconstruction filter masks used in the embodiment of the invention are extended versions of the 2D reconstruction filter masks illustrated in
The need for the extended filter masks, and the way they are created, is explained below.
Filter masks F1_ext and G1_ext are up-sampled-by-2 and interpolated versions of F1 and G1 respectively (note that the up-sampling and interpolation method used to create F1_ext and G1_ext is different from that described above for the up-sampling of the image itself).
The theory of perfect reconstruction teaches how to precisely reproduce an image of the same size (i.e. the same number of pixels) as the original after first down-sampling into sub-bands and then subsequently up-sampling and merging the sub-bands. Since it is an object of the invention to produce an image of larger size (i.e. more pixels) than the original, this underlying theory must be adapted. This embodiment neither down-samples nor discards any pixels from the two sub-bands IF0 and IG0. Therefore, in the reconstruction stage, in order to keep the same correspondences between sub-band pixels and filter coefficients as in the original reconstruction filter bank (which was designed for same-size reconstruction, not for up-sampling), the 2D reconstruction filters F1 and G1 are extended, and the empty filter coefficient locations are filled using bicubic spline interpolation.
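As an illustration only (the actual F1_ext and G1_ext coefficients are given in the drawings), a small 2D reconstruction mask can be enlarged and its empty coefficient locations filled by cubic-spline interpolation roughly as follows; the exact grid alignment and any renormalisation used in the embodiment are assumptions not reproduced here.

```python
import numpy as np
from scipy import ndimage

def extend_mask(mask, factor=2):
    """Illustrative construction of an extended reconstruction mask in the
    spirit of F1_ext / G1_ext: the original 2D mask is enlarged by `factor`
    and the new coefficient locations are filled by cubic-spline interpolation."""
    return ndimage.zoom(np.asarray(mask, dtype=float), factor, order=3)
```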
Based on lab tests and subjective evaluation, bicubic spline interpolation increases the degree of crispness of the resulting up-sampled-by-2 images. The values of a, b, c, d, e, g, h etc. are not restricted to the values shown in
The invention might be further developed for higher up-sampling ratios by changing the structure of the filter bank, e.g., by adding more filters based on an M-channel wavelet filter bank.
It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the invention. For example, a processor may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term “processor” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, network processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), read only memory (ROM) for storing software, random access memory (RAM), and non volatile storage. Other hardware, conventional and/or custom, may also be included.