The present invention relates to image upscaling based upon directional interpolation.
Digital video is typically represented as a series of images or frames, each of which contains an array of pixels. Each pixel includes information, such as intensity and/or color information. In many cases, each pixel is represented as a set of three colors, each of which is defined by eight bit color values.
Pixel information within a frame may be missing such as boundary conditions that may result when executing a video conversion process, such as the conversion between interlaced television field signals and progressive frame signals. In other cases, a frame may be at a first resolution, such as standard definition, which is desirable to convert to a second resolution, such as high definition.
Standard techniques for pixel interpolation are based on application of classical linear filters. However, such techniques introduce various visual artifacts, such as blurring of sharp edge patterns and detailed texture patterns in the image, ringing along edge contours, as well as jaggedness along edge contours. Such artifacts generally cannot be avoided when standard linear filtering techniques are used for interpolation. Therefore, there is a need for improved techniques for pixel interpolation, such as techniques that are adaptive to local image patterns.
An improved technique for pixel interpolation is generally referred to as an edge-directed interpolation which seeks to generate a value for a missing pixel by extending the patterns in selected directions, expressed as edges which exist in the surrounding pixels of an image frame.
Unfortunately, it is difficult to determine the existence of an edge in an image and the direction of such an edge. Erroneous estimates of the direction can lead to new visual artifacts in the image after interpolation. Therefore, there is a need for a robust technique for estimating the direction of a local edge pattern in an image. Furthermore, there is a need for a robust image upscaling technique that considers the possibility of erroneous estimates of the direction, to avoid any visual artifacts. Furthermore, there is a need for an image upscaling technique with low computational complexity that can outperform standard linear filtering techniques in terms of visual quality.
The foregoing and other objectives, features, and advantages of the invention will be more readily understood upon consideration of the following detailed description of the invention, taken in conjunction with the accompanying drawings.
Referring to
The 5 by 5 window is used to determine the orientation (direction) of an edge at or near the central pixel. The orientation estimation 110 of the edges may use any suitable technique, such as using a gradient technique. A reliability measure 120 may be determined which estimates the likelihood that the orientation estimation 110 has determined an actual edge. The output of the orientation estimation 110 may be refined by an orientation refinement 130. The orientation refinement 130 may use a different technique from the orientation estimation 110 to select from among the best candidate(s) from the orientation estimation 110. The orientation refinement 130 reduces the likelihood of the orientation estimation 110 being in error as a result of local minimums in the estimation process.
The orientation refinement 130 is used to generate the coefficients of the direction 140. The coefficients of the direction 140 are used with a directional interpolation 150 to generate a set of one or more interpolated pixels in the 5 by 5 window. In some cases, the directional interpolation 150 does not provide accurate results, such as in the case that there is no defined edge within the 5 by 5 window. The upscaling technique may provide a fallback interpolation 160 that is not based to such a degree upon the determination of edges, described later.
Using the reliability measure 120 as a basis for determining an appropriate output, the fallback interpolation 160 may be multiplied by 1 minus the reliability measure 200, and the product is added 190 to the directional interpolation 150 multiplied by the reliability measure. In this manner, the output is a combination of the directional interpolation 150 and the fallback interpolation 160.
Referring to
The plurality of different directions may include, for example, 10 different directions defined as −45 degrees, −26.6 degrees, −18.4 degrees, +18.4 degrees, +26.6 degrees, +45 degrees, +63.4 degrees, +71.6 degrees, +108.4 degrees, and +116.6 degrees. Each direction may be indexed with an index from 0 to 9, with index 0 corresponding to −45 degrees, index 1 to −26.6 degrees, index 2 to −18.4 degrees, index 3 to +18.4 degrees, index 4 to +26.6 degrees, index 5 to +45 degrees, index 6 to +63.4 degrees, index 7 to +71.6 degrees, index 8 to +108.4 degrees, and index 9 to +116.6 degrees. Along a direction, two parallelograms are formed so that the three new pixels (ul 210, ur 220, and dl 230) are either inside the parallelograms or on their boundaries. The new pixels (ul 210, ur 220, and dl 230) are interpolated using these parallelograms by averaging the pixels at either the 4 corner points of a parallelogram or at 2 points in the 5 by 5 window. Other pixel selection techniques and calculation techniques may likewise be used.
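The index-to-angle mapping described above can be held in a small lookup table. The sketch below is illustrative only: the names `DIRECTION_ANGLES_DEG`, `PIXEL_STEPS`, and `angle_from_step` are hypothetical, and the observation that each listed angle matches the arctangent of a small integer pixel step (to the one decimal place given) is an arithmetic check, not a statement of how the directions were chosen.

```python
import math

# Angles in degrees, indexed 0..9 as listed in the text.
DIRECTION_ANGLES_DEG = (-45.0, -26.6, -18.4, 18.4, 26.6,
                        45.0, 63.4, 71.6, 108.4, 116.6)

# Hypothetical integer (dx, dy) steps whose arctangent reproduces each
# listed angle; an assumption made for illustration only.
PIXEL_STEPS = ((1, -1), (2, -1), (3, -1), (3, 1), (2, 1),
               (1, 1), (1, 2), (1, 3), (-1, 3), (-1, 2))


def angle_from_step(dx, dy):
    """Return the angle, in degrees, of the vector (dx, dy)."""
    return math.degrees(math.atan2(dy, dx))


def max_table_error():
    """Largest gap between a listed angle and its integer-step angle."""
    return max(abs(angle_from_step(dx, dy) - angle)
               for (dx, dy), angle in zip(PIXEL_STEPS, DIRECTION_ANGLES_DEG))
```

Under these assumptions the two tables agree to within the rounding of the listed angles.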
pul=(p06+p12)/2
pdl=(p06+p11+p12+p17)/4
pur=(p06+p07+p12+p13)/4
pul=(p05+p06+p12+p13)/4
pdl=(p11+p12)/2
pur=(p06+p13)/2
pul=(p05+p13)/2
pdl=(p11+p12)/2
pur=(p05+p06+p13+p14)/4
pul=(p8+p10)/2
pdl=(p11+p12)/2
pur=(p8+p9+p10+p11)/4
pul=(p7+p8+p10+p11)/4
pdl=(p11+p12)/2
pur=(p8+p11)/2
pul=(p7+p11)/2
pdl=(p7+p11+p12+p16)/4
pur=(p7+p8+p11+p12)/4
pul=(p02+p7+p11+p6)/4
pdl=(p7+p16)/2
pur=(p7+p12)/2
pul=(p2+p16)/2
pdl=(p2+p7+p16+p21)/4
pur=(p7+p12)/2
pul=(p1+p12)/2
pdl=(p1+p6+p17+p22)/4
pur=(p7+p12)/2
pul=(p1+p6+p12+p17)/4
pdl=(p6+p17)/2
pur=(p12+p7)/2
pul=(p06+p07+p11+p12)/4
pdl=(p11+p12)/2
pur=(p07+p12)/2
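As a minimal sketch of how one such set of equations might be evaluated, the code below implements the first set listed above (pul=(p06+p12)/2, pdl=(p06+p11+p12+p17)/4, pur=(p06+p07+p12+p13)/4). It assumes the 5 by 5 window is stored as a flat list of 25 values in row-major order, so that p12 is the central pixel; this layout is consistent with the inner pixels being numbered 6, 7, 8, 11, 12, 13, 16, 17, 18, but it is an assumption. The names `average` and `interpolate_first_set` are hypothetical.

```python
def average(window, indices):
    """Average the window pixels at the given flat indices."""
    return sum(window[i] for i in indices) / len(indices)


def interpolate_first_set(window):
    """Evaluate the first listed set of equations for ul, dl and ur.

    `window` is assumed to be a flat, row-major list of 25 pixel values.
    """
    if len(window) != 25:
        raise ValueError("expected a 5 by 5 window (25 values)")
    pul = average(window, (6, 12))          # 2-point average
    pdl = average(window, (6, 11, 12, 17))  # 4-point average
    pur = average(window, (6, 7, 12, 13))   # 4-point average
    return pul, pdl, pur
```

The other sets follow the same pattern with different index tuples, so a table of index tuples per direction index would cover all of them.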
The orientation estimation may calculate the image gradient vectors at pixels 6, 7, 8, 11, 12, 13, 16, 17, 18 in the 5×5 window for the central pixel 240. These pixels are the inner 9 pixels of the 5×5 window. Each gradient vector consists of a spatial image derivative in the x direction (horizontal) and a spatial image derivative in the y direction (vertical). In other words, grad_i = (gradX_i, gradY_i)^T. The gradient calculation is performed using a pair of spatial derivative operators. The first operator computes the derivative in the x direction, and the second operator computes the derivative in the y direction. For example, the technique may utilize the well-known Sobel operators:
Other operators may likewise be used and other image derivative operators may likewise be used.
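A minimal sketch of the gradient step, assuming the common 3×3 form of the Sobel operators and a flat, row-major 5×5 window. The sign and orientation conventions of the kernels vary between references, so the ones used here are an assumption, as are the names `SOBEL_X`, `SOBEL_Y`, `INNER_PIXELS`, and `inner_gradients`.

```python
# Common 3x3 Sobel kernels (sign convention is an assumption).
SOBEL_X = ((-1, 0, 1),
           (-2, 0, 2),
           (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1),
           (0, 0, 0),
           (1, 2, 1))

# Flat indices of the inner 9 pixels of the 5x5 window.
INNER_PIXELS = (6, 7, 8, 11, 12, 13, 16, 17, 18)


def inner_gradients(window):
    """Return [(gradX_i, gradY_i), ...] for the inner 9 pixels.

    `window` is assumed to be a flat, row-major list of 25 pixel values.
    """
    if len(window) != 25:
        raise ValueError("expected a 5 by 5 window (25 values)")
    gradients = []
    for index in INNER_PIXELS:
        row, col = divmod(index, 5)
        grad_x = 0
        grad_y = 0
        for d_row in (-1, 0, 1):
            for d_col in (-1, 0, 1):
                value = window[(row + d_row) * 5 + (col + d_col)]
                grad_x += SOBEL_X[d_row + 1][d_col + 1] * value
                grad_y += SOBEL_Y[d_row + 1][d_col + 1] * value
        gradients.append((grad_x, grad_y))
    return gradients
```

Because every inner pixel has all eight neighbors inside the 5×5 window, no padding or border handling is needed.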
An estimate of the initial orientation direction θint may be calculated as follows:
The above estimate of θ tends to assume that the 9 pixels show the same edge orientation (e.g., are correlated). Because θ is quantized into 10 values, the above equation can be readily solved by exhaustive search. In other words, the sum in the above equation only needs to be evaluated 10 times for the 10 defined orientations. The orientation with the minimum value for the sum corresponds to the initial estimate of the orientation.
Note that the above equation is based on taking the sum of absolute values of differences. Other criteria may be used as well, such as the sum of squared differences. However, taking the sum of absolute values as in the above equation is tolerant to outliers and is therefore robust.
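The exhaustive search can be sketched as follows. The defining equation is not reproduced in this text, so the cost used here, the sum over the 9 gradients of |gradX·cos θ − gradY·sin θ|, is an assumption chosen to match the description (a sum of absolute values of differences that is small when the gradients are perpendicular to the candidate direction). The sign convention depends on the image coordinate system and is likewise assumed. The names `direction_cost` and `initial_direction_index` are hypothetical.

```python
import math

# Candidate angles in degrees, indexed 0..9 as listed in the text.
DIRECTION_ANGLES_DEG = (-45.0, -26.6, -18.4, 18.4, 26.6,
                        45.0, 63.4, 71.6, 108.4, 116.6)


def direction_cost(gradients, theta_deg):
    """Assumed cost: sum of |gradX*cos(theta) - gradY*sin(theta)|."""
    theta = math.radians(theta_deg)
    cos_t = math.cos(theta)
    sin_t = math.sin(theta)
    return sum(abs(gx * cos_t - gy * sin_t) for gx, gy in gradients)


def initial_direction_index(gradients):
    """Evaluate the cost for all 10 candidates and return the best index."""
    costs = [direction_cost(gradients, angle)
             for angle in DIRECTION_ANGLES_DEG]
    return min(range(len(costs)), key=costs.__getitem__)
```

Replacing `abs(...)` with a squared term would give the sum-of-squared-differences variant mentioned above.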
The direction estimated by gradient calculation is not always accurate, and therefore the estimate is refined by the orientation refinement 130. The sum of absolute differences for each direction may be defined as follows.
The weights w[n] in the above expressions serve the following purposes:
From the orientation estimation 110, the technique has computed the index of the initial direction estimate. The orientation refinement 130 further compares the sum of absolute differences of index−1, index, and index+1, and picks the lowest one as the refined estimated direction. In other words, the technique computes the values of Diff[index−1], Diff[index], and Diff[index+1], where index corresponds to the index of the initial direction estimate. If Diff[index−1] is the lowest of the three values, then index−1 determines the final direction estimate; otherwise, if Diff[index] is the lowest of the three values, then index determines the final direction estimate; otherwise index+1 determines the final direction estimate.
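The three-way comparison can be sketched as below. The Diff values are taken as an input list, since the expressions that define them are not reproduced here. The text does not say what happens when index−1 or index+1 falls outside 0 to 9; wrapping around (so that index 9 and index 0 are neighbors) is an assumption made for this illustration, and the name `refine_direction_index` is hypothetical.

```python
def refine_direction_index(diff, index):
    """Pick the lowest of Diff[index-1], Diff[index], Diff[index+1].

    `diff` holds one sum of absolute differences per direction index.
    Out-of-range neighbors wrap around (an assumption, not from the text).
    """
    count = len(diff)
    prev_index = (index - 1) % count
    next_index = (index + 1) % count
    d_prev = diff[prev_index]
    d_curr = diff[index]
    d_next = diff[next_index]
    # Same order of tests as in the text: index-1, then index, then index+1.
    if d_prev <= d_curr and d_prev <= d_next:
        return prev_index
    if d_curr <= d_next:
        return index
    return next_index
```

A clamped variant, which simply skips a missing neighbor at index 0 or index 9, would be an equally plausible reading.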
The following table summarizes the orientation angles that are defined, their index and their preferred weight value.
If θint is the initial interpolation direction, then the technique may compute the reliability indicator 120 of the direction estimation as follows:
The value of the reliability indicator α is between 0.0 and 1.0. Note that the numerator of the above expression contains the same sum that was used for the initial direction estimation, while the denominator contains a similar sum based on an angle perpendicular to the estimated θint. If the image structure is locally highly directional, for example due to the presence of an edge, the numerator tends to be much smaller than the denominator, so the second term in the above equation tends to be small, close to 0.0, and α tends to 1.0. If the image structure does not have any dominant orientation, the numerator and denominator tend to have the same magnitude, so α tends to 0.0. This may be the case in a texture area of the image, for example. The computation of the reliability indicator may be suitably modified to handle the case where the denominator is very close to 0.0. Furthermore, the above reliability indicator may alternatively be computed based on sums of squared differences instead of sums of absolute differences.
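A minimal sketch of the reliability computation, assuming the form α = 1 − (sum at θint) / (sum at θint + 90 degrees) suggested by the description of a numerator, a denominator, and a "second term". The per-direction sum reuses the assumed cost |gradX·cos θ − gradY·sin θ|; the exact expression is not reproduced in this text. Returning 0.0 when the denominator is near zero and clamping the result to [0.0, 1.0] are also assumptions, as are the names `direction_cost`, `reliability`, and the parameter `epsilon`.

```python
import math


def direction_cost(gradients, theta_deg):
    """Assumed cost: sum of |gradX*cos(theta) - gradY*sin(theta)|."""
    theta = math.radians(theta_deg)
    cos_t = math.cos(theta)
    sin_t = math.sin(theta)
    return sum(abs(gx * cos_t - gy * sin_t) for gx, gy in gradients)


def reliability(gradients, theta_int_deg, epsilon=1e-6):
    """Return alpha in [0.0, 1.0] for the estimated direction."""
    numerator = direction_cost(gradients, theta_int_deg)
    # Similar sum, evaluated at the perpendicular angle.
    denominator = direction_cost(gradients, theta_int_deg + 90.0)
    if denominator < epsilon:
        # Flat region with no usable gradient: favor the fallback (assumed).
        return 0.0
    alpha = 1.0 - numerator / denominator
    return min(1.0, max(0.0, alpha))
```

With gradients that are all perpendicular to the estimated direction, α comes out at about 1.0; with gradients that project equally onto both directions, α comes out at about 0.0.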
As previously explained, the value of α is used in the interpolation technique to compute the final value of interpolated pixels.
The fallback interpolation 160 may use any suitable technique, such as single bi-linear, or single bi-cubic, and preferably a technique that doesn't involve an edge determination. Referring to
The technique also defines a set of coefficients for a fallback non-directional interpolation. When the edge orientation is estimated, a scalar α between 0 and 1 is also computed to indicate the reliability of the estimation. The results of the directional interpolation and of the non-directional fallback interpolation are then blended together based on the scalar α 120. Let pd denote the result of the directional interpolation, and let pn denote the result of the non-directional fallback interpolation for pixel p. This applies to any of the new pixels that are interpolated, i.e., pixels ul 210, ur 220, and dl 230. Then, the final interpolated pixel value pf is defined by
pf=α·pd+(1−α)·pn.
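The blend can be written directly from the formula above. The sketch below is illustrative only; the function name `blend` and the range check on α are assumptions.

```python
def blend(p_directional, p_fallback, alpha):
    """Return pf = alpha * pd + (1 - alpha) * pn for one new pixel."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0.0 and 1.0")
    return alpha * p_directional + (1.0 - alpha) * p_fallback
```

For example, with pd = 100, pn = 50, and α = 0.25, the result is 0.25·100 + 0.75·50 = 62.5. At α = 1.0 only the directional result is used, and at α = 0.0 only the fallback result is used.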
The technique restricts all the computation within the 5×5 window, and does not use iteration.
A non-iteration based technique reduces buffering and reduces the necessary calculation speed for a hardware based implementation. The technique generates the upscaled output image in a single pass over the input image. After the interpolated pixels are calculated from the input pixels, no further operations are performed in the output image.
The terms and expressions which have been employed in the foregoing specification are used therein as terms of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding equivalents of the features shown and described or portions thereof, it being recognized that the scope of the invention is defined and limited only by the claims which follow.
This application claims the benefit of U.S. Provisional App. No. 60/961,955, filed Jul. 24, 2007.
Number | Date | Country | |
---|---|---|---|
20090028464 A1 | Jan 2009 | US |
Number | Date | Country | |
---|---|---|---|
60961955 | Jul 2007 | US |