The present invention relates generally to a system and method for resizing images, and specifically to a system and method for resizing high contrast images.
Scaling an image well is generally challenging. When zooming, fine structures in the image such as lines and edges need to be preserved, and an apparent resolution of the enlarged image must not be degraded when examined closely by a human observer. Satisfying these basic requirements is difficult, especially for a computer graphics image.
A main difficulty in scaling computer graphics images arises because the human observer is sensitive to structures with high contrast, as well as to the amount of weight or energy in those structures. Text, for example, contains abrupt edges and line segments, which are combined to form a letter. Similarly, a computer graphics image is likely to contain abrupt edges when a structure is placed on a background of a different colour. Transitions between these regions are often defined by immediate changes in colour and intensity. When such images are enlarged, it is very important to preserve the high contrast appearance of the original image.
Linear filtering usually does an adequate job of maintaining the amount of weight or energy in a structure of high contrast, but does a poor job of maintaining the structures themselves. Traditionally, resizing an image has often been done using linear techniques. A Finite Impulse Response (FIR) filter of polyphase design is used to compute interpolated pixels in the scaled image. The scaled image can be either larger or smaller than the source image. The process by which the target image is generated from the FIR filter is known as convolution. Unfortunately, the result is often not visually satisfactory, because linear filters cannot reproduce, or preserve, with sufficient accuracy, high contrast structures such as edges that reside in many images that are generated by a computer. For instance, using a linear filter that is too “sharp”, that is, one with significant high frequency gain, will result in a target image that is prone to exhibit “ringing”. This type of phenomenon is referred to as the Gibbs effect. The Gibbs effect manifests itself in a target image as a visual echo or ripple that surrounds the structure in question. Alternately, if the filter is too “soft”, that is, it has more high frequency attenuation, the resulting target image will be perceived as blurry; for example, edges are rounded off. Neither a soft nor a sharp linear filter is adequate.
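By way of illustration, a minimal sketch of the 1-D polyphase FIR interpolation described above is given below, in C. The tap values, phase count, and names are illustrative placeholders (a soft, cubic-like kernel), not coefficients taken from this description; each row of taps sums to 256 in 8.8 fixed point so that flat regions pass through unchanged.

```c
#include <stdint.h>

#define NUM_TAPS   4
#define NUM_PHASES 4   /* fractional positions between source pixels */

/* One 4-tap filter per phase of the target position. */
static const int16_t coeffs[NUM_PHASES][NUM_TAPS] = {
    {   0, 256,   0,   0 },   /* phase 0: target lands on a source pixel */
    { -18, 220,  62,  -8 },   /* phase 1/4 */
    { -16, 144, 144, -16 },   /* phase 1/2 */
    {  -8,  62, 220, -18 },   /* phase 3/4 */
};

static uint8_t clamp8(int v) { return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v)); }

/* Convolve the four source pixels around a target position whose integer
 * part is i and whose fractional part selects the phase; the caller must
 * keep indices i-1 .. i+2 within the source row. */
uint8_t fir_interp(const uint8_t *src, int i, int phase)
{
    int acc = 0;
    for (int t = 0; t < NUM_TAPS; t++)
        acc += coeffs[phase][t] * src[i - 1 + t];
    return clamp8((acc + 128) >> 8);   /* round and rescale from 8.8 */
}
```

A sharper kernel (larger negative side lobes) increases the ringing described above, while a softer one blurs edges; the taps, not the structure, make the trade-off.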
In order to preserve local contrast in the image, scaling using a nearest neighbour algorithm is sometimes used. The nearest neighbour algorithm replicates the value of the source pixel nearest in location to the interpolated pixel. The algorithm preserves structures with high contrast, but often results in images that appear blocky when large resize factors are involved. Further, at smaller resize factors, the nearest neighbour algorithm can cause finer image structures, such as text, to appear as a mix of bold and standard fonts depending on the exact position of the target pixel during interpolation. In either case, the result is not visually pleasing.
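For comparison, a sketch of plain 1-D nearest neighbour scaling in a similar fixed-point style follows; the names and the 16.16 format are illustrative.

```c
#include <stdint.h>

/* Scale one row of src_len pixels to dst_len pixels by copying, for each
 * target pixel, the source pixel whose cell contains the target's centre. */
void nn_scale_row(const uint8_t *src, int src_len, uint8_t *dst, int dst_len)
{
    uint32_t step = ((uint32_t)src_len << 16) / (uint32_t)dst_len;  /* 16.16 */
    uint32_t pos  = step >> 1;        /* centre of the first target pixel */

    for (int x = 0; x < dst_len; x++, pos += step) {
        uint32_t i = pos >> 16;       /* source cell containing the centre */
        if (i >= (uint32_t)src_len)   /* guard against rounding overrun */
            i = (uint32_t)src_len - 1;
        dst[x] = src[i];
    }
}
```

Because every target pixel is a verbatim copy, edges stay perfectly sharp, but at large zoom factors each source pixel is repeated many times, producing the blocky appearance noted above.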
Therefore, there is a need to provide a scaling algorithm that preserves edges in high contrast images while mitigating the visually displeasing effects described above. Thus, it is an object of the present invention to obviate or mitigate at least some of the above-mentioned disadvantages.
In accordance with an aspect of the present invention there is provided a method for interpolating a target pixel from a plurality of source pixels in a high contrast image. The method comprises the following steps. A window of the plurality of source pixels is examined and compared with a plurality of predefined conditions for determining if a structure of significance is present within the window. A filter configuration is selected from a plurality of filter configurations in accordance with results of the comparison. The selected filter is applied to the source pixels for interpolating the target pixel. If the structure of significance is detected in the window, the selected filter best preserves the structure.
Embodiments of the invention will now be described by way of example only with reference to the following drawings in which:
For convenience, like numerals in the description refer to like structures in the drawings. An adaptive nearest neighbour (ANN) algorithm is designed to resize high contrast images such as computer-generated images, web pages and computer graphics with text. The ANN algorithm is an adaptive non-linear algorithm that examines surface structure in order to compute a target pixel.
The basic idea behind the ANN algorithm is as follows. During image resizing (either enlargement or reduction), a target pixel position is computed relative to a plurality of source pixels. Pixels in an image are arranged in a grid comprising rows and columns. The number of rows and columns in an image depends on the image resolution. Source pixels refer to pixels in the image being scaled and target pixels refer to pixels in the resulting scaled image. When enlarging, the step size is less than the distance between source pixels. That is, every source pixel is bounded by two target pixels that are closer to it than the next closest source pixel. This is true because separable interpolation in the vertical and horizontal directions forces the target pixel to be co-linear with the rows (when interpolating horizontally) and co-linear with the columns (when interpolating vertically). The target pixel nearest to the source pixel containing an edge, corner, or structure of significance, as determined by a decision engine, is used to preserve the edge. The target pixel is computed by filtering the surrounding source pixels.
Thus, the ANN algorithm examines surface structure in a pixel window (or decision window) about the target pixel position and applies a variable length filter. The decision engine DE detects structures of significance within the image. In order for this to occur, heuristics are used that examine the pixels in the pixel window to determine whether or not any important structures are present prior to interpolation.
In the present embodiment, the adaptive nearest neighbour (ANN) algorithm uses a 4×4 decision window for vertical interpolation and a 4×1 decision window for horizontal interpolation. While the present embodiment is designed to operate on windows of these sizes, a person skilled in the art will appreciate that the algorithm can be adapted to larger window sizes, in which longer filters can be applied. It has been determined during experimentation that the window sizes used in the present embodiment, and their associated rules, provide good quality results for a large range of zoom factors.
If the target pixel lies directly between two vertically aligned source pixels, only vertical interpolation is required. If the target pixel lies directly between two horizontally aligned source pixels, only horizontal interpolation is required. Otherwise, both horizontal and vertical interpolation are required. When an image requires both vertical and horizontal interpolation, it is preferable that the vertical interpolation is performed first. The decision is based on the green values of the Red Green Blue (RGB) values of the pixels.
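A sketch of this separable, vertical-first organisation is shown below. The helper resample_line() is hypothetical; in the full system it would run the 1-D ANN decision engine over one line of pixels.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 1-D resampler: interpolates dst_len target pixels from
 * src_len source pixels, reading and writing with the given strides. */
void resample_line(const uint8_t *src, int src_len, int src_stride,
                   uint8_t *dst, int dst_len, int dst_stride);

/* Scale a single-channel sw x sh image to dw x dh, vertical pass first. */
void scale_image(const uint8_t *src, int sw, int sh,
                 uint8_t *dst, int dw, int dh)
{
    uint8_t *tmp = malloc((size_t)sw * (size_t)dh);  /* sw x dh intermediate */

    for (int x = 0; x < sw; x++)                     /* vertical pass: columns */
        resample_line(src + x, sh, sw, tmp + x, dh, sw);

    for (int y = 0; y < dh; y++)                     /* horizontal pass: rows */
        resample_line(tmp + (size_t)y * sw, sw, 1,
                      dst + (size_t)y * dw, dw, 1);
    free(tmp);
}
```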
Referring to
In the present embodiment a target pixel T is illustrated between pixels X1, X2, Y1, and Y2. The position of the target pixel T can be shown to have a vertical component Fy and a horizontal component Fx. The vertical component Fy is the vertical distance from pixel X1, in the downward direction. The horizontal component Fx is the horizontal distance from pixel X1, in the rightward direction. Therefore, both a vertical and a horizontal interpolation are required for interpolating the target pixel T.
Depending on the results of the decision engine, there are six possible filter configurations that can be applied to the decision window. The filter configurations either preserve X1, preserve Y1, apply a two-tap filter to X1 and Y1, apply a three-tap filter to W1, X1, and Y1, apply a three-tap filter to X1, Y1, and Z1, or apply a four-tap filter to W1, X1, Y1, and Z1.
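One way to organise these six configurations is as an enum plus a small dispatcher, as sketched below. The tap weights for the two-, three- and four-tap cases are not specified here, so simple symmetric weights are used purely as placeholders; w, x, y, z hold the values of W1, X1, Y1, Z1, fy is the fractional position and one is the fixed-point unity defined later.

```c
typedef enum {
    PRESERVE_X1,
    PRESERVE_Y1,
    TWO_TAP_X1Y1,
    THREE_TAP_W1X1Y1,
    THREE_TAP_X1Y1Z1,
    FOUR_TAP_W1X1Y1Z1
} FilterConfig;

int apply_filter(FilterConfig f, int w, int x, int y, int z, int fy, int one)
{
    switch (f) {
    case PRESERVE_X1:      return x;                  /* replicate X1 */
    case PRESERVE_Y1:      return y;                  /* replicate Y1 */
    case TWO_TAP_X1Y1:     /* linear blend by fractional position */
        return ((one - fy) * x + fy * y) / one;
    case THREE_TAP_W1X1Y1: return (w + 2 * x + y) / 4;         /* placeholder taps */
    case THREE_TAP_X1Y1Z1: return (x + 2 * y + z) / 4;         /* placeholder taps */
    default:               return (w + 3 * x + 3 * y + z) / 8; /* placeholder taps */
    }
}
```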
The decision engine examines the pixel information on the green value of the RGB values of the pixels. Whatever is determined for the green value is repeated for red and blue. Alternately, the decision as to which filter configuration to implement is based on a surrogate luminance value Y composed of fractional components of the R, G and B channels. The reason for the surrogate Y value is as follows. If edge threshold detection is used for only one of the red, green or blue channels, and there is no edge in that channel but an edge does exist in one of the other channels, that edge would not be detected. By relying on a linear combination of red, green and blue, an edge of sufficient magnitude in any channel is enough to trigger the edge rules.
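The exact fractional weights for the surrogate Y are not given above; a common choice, used here as an assumption, is the ITU-R BT.601 integer approximation:

```c
/* Surrogate luminance from 8-bit R, G, B: Y ≈ 0.299R + 0.587G + 0.114B.
 * The weights 77/150/29 sum to 256, so the >> 8 keeps Y in 0..255. */
static inline int surrogate_luma(int r, int g, int b)
{
    return (77 * r + 150 * g + 29 * b) >> 8;
}
```

Any weighting in which all three channels contribute a significant fraction would serve the stated purpose of making an edge in any single channel visible to the threshold test.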
Before the decision engine is described in detail, several conditions and variables used by the decision engine are described. A threshold is defined for determining whether or not an edge exists between two pixels. The threshold is programmable and its default value is set to 10. If the difference between two adjacent pixels exceeds the threshold, then an edge is present between the pixels. If the difference between two pixels does not exceed the threshold, the pixels are determined to be level.
The distance between adjacent pixels in the array, in either the vertical or horizontal direction, is defined as one. Thus, since the vertical component Fy of the target pixel falls between X1 and Y1, the distance from the target pixel to X1 is a fraction. One is defined as the binary value of 1 shifted left by a number of fraction bits PositionFracBits. The number of fraction bits PositionFracBits is a predefined variable. This definition is represented by the expression:
One=1<<PositionFracBits, where “<<” is a left shift operator
A midpoint between two source pixels is determined by taking the binary value of 1 and shifting it left by one less than the number of fraction bits PositionFracBits. This definition is represented by the expression:
Midpoint=1<<(PositionFracBits−1)
A step size Ystep represents the distance between target pixels. That is, Ystep is equal to the difference between a previous target pixel and a current target pixel, as well as the difference between the current target pixel and the next target pixel.
Using the step size Ystep, two other values are determined for later use in calculations. The immediately preceding (or previous) target pixel's y-position FyBehind is determined in accordance with the expression ((fy−Ystep) & positionWholeMask), where “&” represents a bit-wise AND function. FyBehind is the absolute value of the distance between the last target pixel and, in the present embodiment, X1. Thus, for example, if Fy=0.2 and Ystep is 0.5, FyBehind=0.3. Also, the immediately following (or next) target pixel's y-position FyAhead is determined in accordance with the expression ((fy+Ystep) & positionWholeMask).
The use and definition of the FyBehind and FyAhead variables will become apparent when the implementation of the present embodiment is illustrated. The variable positionWholeMask is determined by shifting the binary value of 1 left by the number of fraction bits PositionFracBits and then subtracting 1. This definition is represented by the expression:
positionWholeMask=(1<<PositionFracBits)−1
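Collecting these fixed-point definitions in one place, a sketch in C might read as follows; PositionFracBits is shown as a compile-time constant for brevity, although the text describes it as a programmable parameter.

```c
#define PositionFracBits 6   /* illustrative value; programmable in practice */

#define One               (1 << PositionFracBits)
#define Midpoint          (1 << (PositionFracBits - 1))
#define positionWholeMask ((1 << PositionFracBits) - 1)

/* Fractional y-positions of the previous and next target pixels, derived
 * from the current fractional position fy and the step Ystep; the mask
 * discards the whole-pixel part, keeping only the fraction. */
static inline int fyBehind(int fy, int Ystep) { return (fy - Ystep) & positionWholeMask; }
static inline int fyAhead (int fy, int Ystep) { return (fy + Ystep) & positionWholeMask; }
```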
The decision engine calculates several Boolean variables for use in determining which filter configuration to select. Level variables are used for determining that there are no edges between adjacent pixels. That is, the absolute difference between adjacent pixel values is less than or equal to the threshold. Level variables are calculated for each adjacent pair of pixels in a column in accordance with the following relationships:
Level_W1X1=TRUE if abs(ΔW1X1)<=threshold
Level_X1Y1=TRUE if abs(ΔX1Y1)<=threshold
Level_Y1Z1=TRUE if abs(ΔY1Z1)<=threshold
Level_Y1X0=TRUE if abs(ΔY1X0)<=threshold
Level_Y1X2=TRUE if abs(ΔY1X2)<=threshold
That is, for example, Boolean variable Level_W1X1 is set to TRUE if the absolute value of the difference between pixel values for W1 and X1 is less than or equal to the threshold.
Similarly, edge variables are used for determining when an edge is present between adjacent pixels. Edge variables are calculated as follows:
Edge_W1X1=TRUE if abs(ΔW1X1)>threshold
Edge_X1Y1=TRUE if abs(ΔX1Y1)>threshold
Edge_Y1Z1=TRUE if abs(ΔY1Z1)>threshold
Edge_Y1Y0=TRUE if abs(ΔY1Y0)>threshold
Edge_Y1Y2=TRUE if abs(ΔY1Y2)>threshold
That is, for example, Boolean variable Edge_W1X1 is set to TRUE if the absolute value of the difference between the green values of pixels W1 and X1 is greater than the threshold.
A hybrid filtering variable HybridFiltering determines whether or not to implement hybrid filtering. Hybrid filtering is used to preserve a feature, such as a line, that is a single pixel wide. That is, for any scale factor, the width of the source feature is preserved. This variable is programmable and its default value is set to FALSE. An increase-in-pixel-intensity variable Up_X1Y1 is used for determining whether the pixel intensity has increased from X1 to Y1. Up_X1Y1 is set to TRUE if Y1>X1; otherwise it is set to FALSE. Similarly, a decrease-in-pixel-intensity variable Down_X1Y1 is used for determining whether the pixel intensity has decreased from X1 to Y1. Down_X1Y1 is set to TRUE if X1>=Y1; otherwise it is set to FALSE.
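Taken together, the tests above can be evaluated in a single pass over the vertical window, as sketched below. The struct name and layout are illustrative; the pixel arguments are the green (or surrogate-luminance) values of the window pixels, and threshold defaults to 10 as stated.

```c
typedef struct {
    int level_W1X1, level_X1Y1, level_Y1Z1, level_Y1X0, level_Y1X2;
    int edge_W1X1, edge_X1Y1, edge_Y1Z1, edge_Y1Y0, edge_Y1Y2;
    int up_X1Y1, down_X1Y1;
} WindowTests;

static int iabs(int v) { return v < 0 ? -v : v; }

WindowTests classify(int W1, int X0, int X1, int X2,
                     int Y0, int Y1, int Y2, int Z1, int threshold)
{
    WindowTests t;
    t.level_W1X1 = iabs(W1 - X1) <= threshold;   /* no edge: pixels are level */
    t.level_X1Y1 = iabs(X1 - Y1) <= threshold;
    t.level_Y1Z1 = iabs(Y1 - Z1) <= threshold;
    t.level_Y1X0 = iabs(Y1 - X0) <= threshold;
    t.level_Y1X2 = iabs(Y1 - X2) <= threshold;
    t.edge_W1X1  = !t.level_W1X1;                /* difference exceeds threshold */
    t.edge_X1Y1  = !t.level_X1Y1;
    t.edge_Y1Z1  = !t.level_Y1Z1;
    t.edge_Y1Y0  = iabs(Y1 - Y0) > threshold;
    t.edge_Y1Y2  = iabs(Y1 - Y2) > threshold;
    t.up_X1Y1    = Y1 > X1;                      /* intensity rises X1 -> Y1 */
    t.down_X1Y1  = X1 >= Y1;                     /* intensity falls (or is flat) */
    return t;
}
```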
The following algorithm describes the predefined patterns against which the decision window is compared, as well as the resulting filter configuration that is selected for each case. Due to the way in which the algorithm is implemented (a case-like structure), the order of the patterns can be significant. Therefore, it is preferable that they are compared with the decision window in the order listed below; a code sketch of the resulting case structure follows the list.
Referring to
Level_W1X1 && Level_X1Y1 && Level_Y1Z1
Referring to
(Level_W1X1 && Level_X1Y1 && Edge_Y1Z1) && (fy<=Midpoint)
Referring to
(Level_W1X1 && Level_X1Y1 && Edge_Y1Z1) && (fy>Midpoint)
Referring to
(Edge_W1X1 && Level_X1Y1 && Level_Y1Z1) && (fy<=Midpoint)
Referring to
(Edge_W1X1 && Level_X1Y1 && Level_Y1Z1) && (fy>Midpoint)
Referring to
Edge_W1X1 && Level_X1Y1 && Edge_Y1Z1
Referring to
Edge_X1Y1 && Edge_Y1Z1 && Edge_Y1Y2 && Level_Y1X2 && !((Ystep>=FyBehind) && (FyBehind>fy))
Referring to
Edge_X1Y1 && Edge_Y1Z1 && Edge_Y1Y2 && Level_Y1X2 &&((Ystep>=FyBehind) && (FyBehind>fy))
Referring to
Edge_X1Y1 && Edge_Y1Z1 && Edge_Y1Y0 && Level_Y1X0 && !((Ystep>=FyBehind) && (FyBehind>fy))
Referring to
Edge_X1Y1 && Edge_Y1Z1 && Edge_Y1Y0 && Level_Y1X0 && ((Ystep>=FyBehind) && (FyBehind>fy))
Referring to
Edge_X1Y1 && Edge_Y1Z1 && ((Fy+Ystep>=One) && !(One−Fy>FyAhead))
Referring to
Edge_X1Y1 && Edge_Y1Z1 && Up_X1Y1 && ((Ystep>=FyBehind) && (FyBehind>fy))
Referring to
Edge_X1Y1 && Edge_Y1Z1 && !(Fy+Ystep>=One) && HybridFiltering
Referring to
Edge_X1Y1 && Edge_Y1Z1
Referring to
Edge_W1X1 && Edge_X1Y1 && ((Ystep>=FyBehind) && (FyBehind>fy))
Referring to
Edge_W1X1 && Edge_X1Y1 && Down_X1Y1 && ((Fy+Ystep>=One) && !(One−Fy>FyAhead))
Referring to
Edge_W1X1 && Edge_X1Y1 && !(Ystep>=FyBehind) && HybridFiltering
Referring to
Edge_W1X1 && Edge_X1Y1
Referring to
Edge_X1Y1 && Edge_Y1Y2 && Level_W1X1 && Level_Y1Z1 && Level_Y1X2 && !((Ystep>=FyBehind) && (FyBehind>fy))
Referring to
Edge_X1Y1 && Edge_Y1Y2 && Level_W1X1 && Level_Y1Z1 && Level_Y1X2 && ((Ystep>=FyBehind) && (FyBehind>fy))
Referring to
Edge_X1Y1 && Edge_Y1Y0 && Level_W1X1 && Level_Y1Z1 && Level_Y1X0 && !((Ystep>=FyBehind) && (FyBehind>fy))
Referring to
Edge_X1Y1 && Edge_Y1Y0 && Level_W1X1 && Level_Y1Z1 && Level_Y1X0 && ((Ystep>=FyBehind) && (FyBehind>fy))
Referring to
Edge_X1Y1 && Level_W1X1 && Level_Y1Z1 && Up_X1Y1 && ((Ystep>=FyBehind) && (FyBehind>fy))
Referring to
Edge_X1Y1 && Level_W1X1 && Level_Y1Z1 && Down_X1Y1 && (Fy+Ystep>=One) && !(One−Fy>FyAhead)
Referring to
Edge_X1Y1 && Level_W1X1 && Level_Y1Z1 && Up_X1Y1 && !(Ystep>=FyBehind) && HybridFiltering
Referring to
Edge_X1Y1 && Level_W1X1 && Level_Y1Z1 && Down_X1Y1 && !(Fy+Ystep>=One) && HybridFiltering
Referring to
Edge_X1Y1 && Level_W1X1 && Level_Y1Z1 && Down_X1Y1
In the case that none of the above patterns is matched, a four-tap filter is applied to pixels W1, X1, Y1, and Z1.
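Using the types from the earlier sketches, the case structure can be written as an ordered chain of tests, as below. Only the first few patterns are shown and, since the filter chosen for each matched pattern is described only in the figures, every return value here is an assumption.

```c
/* Ordered pattern matching for the vertical pass; builds on WindowTests,
 * FilterConfig and Midpoint from the sketches above. Return values for
 * matched patterns are illustrative assumptions. */
FilterConfig select_vertical_filter(const WindowTests *t, int fy,
                                    int Ystep, int FyBehind)
{
    /* Entirely level window: no structure to preserve. */
    if (t->level_W1X1 && t->level_X1Y1 && t->level_Y1Z1)
        return TWO_TAP_X1Y1;                          /* assumed */

    /* Edge below a level run, split on which half the target falls in. */
    if (t->level_W1X1 && t->level_X1Y1 && t->edge_Y1Z1 && fy <= Midpoint)
        return PRESERVE_X1;                           /* assumed */
    if (t->level_W1X1 && t->level_X1Y1 && t->edge_Y1Z1 && fy > Midpoint)
        return THREE_TAP_W1X1Y1;                      /* assumed */

    /* Example of a position-dependent rule using Ystep and FyBehind. */
    if (t->edge_X1Y1 && t->edge_Y1Z1 && t->edge_Y1Y2 && t->level_Y1X2 &&
        !((Ystep >= FyBehind) && (FyBehind > fy)))
        return PRESERVE_Y1;                           /* assumed */

    /* ... remaining patterns, tested strictly in the order listed ... */
    return FOUR_TAP_W1X1Y1Z1;   /* default when no pattern matches */
}
```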
It can be seen from the above cases that many of the patterns contain similar elements. Therefore, in order to improve implementation speed and simplicity, elements can be grouped and defined as illustrated in Table 1 below.
These Boolean variables can then be grouped to provide the 27 patterns illustrated in
Referring to
Variables are defined for the horizontal interpolation that are similar to those described for the vertical interpolation. For example, Xstep, FxAhead, FxBehind and Up_X1X2 have definitions similar to those of the corresponding y-component variables. The following algorithm describes the predefined patterns against which the decision window is compared, as well as the resulting filter configuration that is selected for each case. Due to the way in which the algorithm is implemented (a case-like structure), the order of the patterns can be significant. Therefore, it is preferable that they are compared with the decision window in the order listed below; a code sketch of the horizontal selection follows the list.
Referring to
Level_X0X1 && Level_X1X2 && Level_X2X3
Referring to
This case is determined by the Boolean expression:
Level_X0X1 && Level_X1X2 && Edge_X2X3 && Fx<=Midpoint
Referring to
Level_X0X1 && Level_X1X2 && Edge_X2X3 && Fx>Midpoint
Referring to
Edge_X0X1 && Level_X1X2 && Level_X2X3 && Fx<=Midpoint
Referring to
Edge_X0X1 && Level_X1X2 && Level_X2X3 && Fx>Midpoint
Referring to
Edge_X0X1 && Level_X1X2 && Edge_X2X3
Referring to
Edge_X1X2 && Edge_X2X3 && (Fx+Xstep>=One) && (One−Fx<=FxAhead)
Referring to
Edge_X1X2 && Edge_X2X3 && Up_X1X2 && (Xstep>=FxBehind) && (FxBehind>Fx)
Referring to
Edge_X1X2 && Edge_X2X3 && (Fx+Xstep<One) && HybridFiltering
Referring to
Edge_X1X2 && Edge_X2X3
Referring to
Edge_X0X1 && Edge_X1X2 && (Xstep>=FxBehind) && (FxBehind>Fx)
Referring to
Edge_X0X1 && Edge_X1X2 && !Up_X1X2 && (Fx+Xstep>=One) && (One−Fx<=FxAhead)
Referring to
Edge_X0X1 && Edge_X1X2 && !(Xstep>=FxBehind) && HybridFiltering
Referring to
Edge_X0X1 && Edge_X1X2
Referring to
Level_X0X1 && Edge_X1X2 && Level_X2X3 && (Fx+Xstep>=One) && (One−Fx<=FxAhead) && !Up_X1X2
Referring to
Level_X0X1 && Edge_X1X2 && Level_X2X3 && (Xstep>=FxBehind) && (FxBehind>Fx) && Up_X1X2
Referring to
Level_X0X1 && Edge_X1X2 && Level_X2X3 && !(Xstep>=FxBehind) && Up_X1X2 && HybridFiltering
Referring to
Level_X0X1 && Edge_X1X2 && Level_X2X3 && !(Fx+Xstep>=One) && !Up_X1X2 && HybridFiltering
Referring to
Level_X0X1 && Edge_X1X2 && Level_X2X3
In the event that none of the above patterns is matched, a four-tap filter is applied to pixels X0, X1, X2, and X3.
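The horizontal selection mirrors the vertical one, but runs on the 4×1 window X0..X3 and so has no diagonal-neighbour tests. A compact sketch, reusing iabs() and Midpoint from the earlier sketches and mapping the enum's W1, X1, Y1, Z1 roles onto X0, X1, X2, X3, might look as follows; again, the filter returned for each pattern is an assumption.

```c
FilterConfig select_horizontal_filter(int X0, int X1, int X2, int X3,
                                      int fx, int threshold)
{
    int level_X0X1 = iabs(X0 - X1) <= threshold;
    int level_X1X2 = iabs(X1 - X2) <= threshold;
    int level_X2X3 = iabs(X2 - X3) <= threshold;

    if (level_X0X1 && level_X1X2 && level_X2X3)
        return TWO_TAP_X1Y1;    /* level row: blend X1, X2 (assumed) */
    if (level_X0X1 && level_X1X2 && !level_X2X3 && fx <= Midpoint)
        return PRESERVE_X1;     /* edge to the right, target near X1 (assumed) */

    /* ... remaining patterns, tested strictly in the order listed ... */
    return FOUR_TAP_W1X1Y1Z1;   /* default: four taps over X0..X3 */
}
```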
It can be seen from the above cases that many of the patterns contain similar elements. Therefore, in order to improve implementation speed and simplicity, elements can be grouped and defined as illustrated in Table 3 below.
These Boolean variables can then be grouped to provide the 19 patterns illustrated in
Although the invention has been described with reference to certain specific embodiments, various modifications thereof will be apparent to those skilled in the art without departing from the spirit and scope of the invention as outlined in the claims appended hereto.
This application is a continuation-in-part of and claims priority to U.S. patent application Ser. No. 09/948,819, filed Sep. 10, 2001, now U.S. Pat. No. 6,788,353, issued Sep. 7, 2004, which, in turn, claims priority to Canadian patent application No. 2,317,870, filed Sep. 8, 2000.
U.S. Patent Documents Cited:

Number | Name | Date | Kind
---|---|---|---
4603350 | Arbeiter et al. | Jul 1986 | A
4805129 | David | Feb 1989 | A
5559905 | Greggain et al. | Sep 1996 | A
5594676 | Greggain et al. | Jan 1997 | A
5598217 | Yamaguchi | Jan 1997 | A
5818964 | Itoh | Oct 1998 | A
5852470 | Kondo et al. | Dec 1998 | A
6144409 | Han et al. | Nov 2000 | A
6348929 | Acharya et al. | Feb 2002 | B1
6563544 | Vasquez | May 2003 | B1
6788353 | Wredenhagen et al. | Sep 2004 | B1

Prior Publication Data:

Number | Date | Country
---|---|---
20030185463 A1 | Oct 2003 | US

Related U.S. Application Data:

Relation | Application No. | Date | Country
---|---|---|---
Parent | 09948819 | Sep 2001 | US
Child | 10106060 | | US