Digital images are communicated by values that represent the luminance and chromatic attributes of an image at an array of locations throughout the image. Each value is represented by a given number of bits. When bandwidth, storage and display requirements are not restrictive, sufficient bits are available that the image can be displayed with virtually uninhibited visual clarity and realistic color reproduction. However, when bit-depth is restricted, the gradations between adjacent luminance or color levels can become perceptible and even annoying to a human observer. This effect is apparent in contouring artifacts visible in images with low bit-depth. Contour lines appear in low frequency areas with slowly varying luminance where pixel values are forced to one side or the other of a coarse gradation step.
These contouring artifacts can be “broken up” by adding noise or other dither patterns to the image, generally before quantization or other bit-depth reduction. This noise or pattern addition forces a random, pseudo-random or other variation in pixel values that reduces the occurrence and visibility of contours. Typically, the image is perceived as more natural and pleasing to a human observer.
Some of these methods can be explained with reference to
Some of these methods may be explained with reference to
In the systems illustrated in
Often it is not feasible to use a dither/noise pattern that is as big as an image file. In these cases, a smaller dither pattern can be used by repeating the pattern across the image in rows and columns. This process is often referred to as tiling. In multiple image sets, such as the frames or fields of video images, a dither pattern may be repeated from frame to frame as well. Dither patterns may be designed to minimize artifacts created by their repetitive patterns.
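The tiling lookup described above can be sketched as follows. This is a minimal illustration only: the function names and the 2×2 example tile are hypothetical and are not taken from the specification.

```python
def tiled_dither_value(tile, x, y):
    """Return the dither value for image pixel (x, y) by repeating `tile`
    across the image in rows and columns."""
    rows = len(tile)
    cols = len(tile[0])
    return tile[y % rows][x % cols]


def tiled_dither_value_3d(tiles, x, y, frame):
    """Same lookup for a stack of tiles that is also repeated from frame to frame."""
    tile = tiles[frame % len(tiles)]
    return tiled_dither_value(tile, x, y)


# Hypothetical 2x2 tile with four dither levels.
tile = [[0, 2],
        [3, 1]]
```

The modulo operations mean that the memory needed for the pattern depends only on the tile dimensions, not on the image size or the number of video frames.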
Dither structures may comprise multiple dither patterns to be used across a single image of multiple frames. A three-dimensional dither structure, as shown in
Systems and methods of embodiments of the present invention comprise the creation and/or application of dither structures. These structures may be used to reduce the visibility of contouring and other artifacts in still and video images.
The following drawings depict only typical embodiments of the present invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail through the use of the accompanying drawings, in which:
Embodiments of the present invention may be used in conjunction with displays and, in some embodiments, in display algorithms that employ properties of the visual system in their optimization. Some embodiments of the present invention may comprise methods that attempt to prevent the contouring artifacts in displays that have too few gray levels. Some of these displays include LCD or similar displays with a digital bit-depth bottleneck. They may also be used with graphics controller cards with limited video RAM (VRAM). These bit-depth limitations can arise in the LCD display itself, or its internal hardware driver.
Some embodiments of the present invention include systems and methods comprising an anti-correlated spatio-temporal dither pattern, which exhibits high-pass characteristics in the spatial and temporal domains. Methods for creating these patterns comprise generation of a series of dither tiles for multiple image characteristic channels and the temporal domain.
In a non-limiting example, as shown in
In some embodiments of the present invention, as illustrated in
An initial reference dither pattern 72, 74 & 76 may be a dither tile with a random noise pattern, a pre-set pattern, a constant value across all pixels, a blank tile or some other fixed or random pattern. A set of initial reference dither patterns 72, 74 & 76 for multiple channels of an image, such as the R, G and B channels of an RGB image, forms an initial reference frame 70. Once the initial reference pattern or frame 70 is established, pixel values in the dither pattern tiles can be generated. To ensure that the generated pattern is high-pass, a dispersion-related merit function is used to place each pixel.
In this exemplary method, a first pixel 80 is placed in the red channel tile 78 of frame 1. According to the dispersion merit function, this pixel is placed at a point that is dispersed from the location of pixel values in the initial reference frame tiles 72, 74 & 76. This dispersion merit function can relate to values in same color channel or a combination of color channels. Each color channel tile in the initial reference frame may be weighted to give different channels priority over others.
Once the first pixel has been placed, other pixels can be placed according to the dispersion merit function. These subsequent pixels will be placed in a manner that is dispersed from the first pixel 80 and may also be dispersed from pixel values in the initial reference frame 70. Generally, pixel values in the actual dither pattern 78 being developed will have greater weight than those in the initial reference frame 70; however, these weighting factors may vary for specific applications. Each dither pattern tile (e.g., 78) can be completed individually, or a set of tiles making up a frame may be generated simultaneously. For example, a pixel may be placed in a red channel tile 78, followed by a pixel placement in a green channel tile 82 of the same frame, followed by a pixel placement in the blue channel tile 84 of the same frame. Alternatively, a single color channel tile may be completed before placement of pixel values in another color channel tile of the same frame.
In this manner, each frame's dither pattern tiles are generated with reference to the patterns already established in previous frames and/or the initial reference frame. As the process continues from frame to frame, the weighting of previous frames may vary. For example, the weight given to pixel values in the closest preceding frame may be higher than that given to the next closest preceding frame. In some embodiments the initial reference frame 70 may be used only to generate the first frame 86. In other embodiments the initial reference frame 70 may be referenced in the generation of multiple successive frames with or without weighting factors.
Typically, due to memory constraints, the number of dither pattern frames is much less than the number of frames in a video clip so a series of pattern frames is reused in sequence. This cycle makes the first frame of the sequence 86 immediately follow the last frame 90. Accordingly, if these frames are not correlated, visible artifacts may develop. To avoid this, the last frames in a sequence are generated with reference to the first frame or frames as well as the previous frame or frames. This helps ensure that the pattern is continuously high-pass throughout multiple cycles.
In an exemplary embodiment of the present invention, a 32×32 spatial dither pattern tile is generated for each color channel of an RGB application. This pattern is created for 32 temporal frames, thereby yielding a 32×32×32×3 array. The size is not a factor in the overall function of some embodiments and many different dimensions may be used. A merit function is used to disperse the pixel values into a high-pass relationship. This high-pass relationship may exist spatially within a dither pattern tile, spectrally across color channel tiles and temporally across successive frames. To achieve all of these relationships, the placement of a pattern pixel value must receive feedback from other pixel values within the tile pattern, from other color channel tiles within the frame and from pixel values in adjacent frames. Dispersion or anti-correlation across color channels can help reduce fluctuation in luminance, where human vision has the highest sensitivity.
Negative feedback is a way to control the pattern so that pixel values are equally spaced in space and/or time. As a non-limiting example, if a large dither value is assigned to a position A at (i, j, k), its neighbors will be forced to take smaller values because of the negative influence of the large value at A. The further a pixel is from A, the less influence the value at A will have on that pixel's designation.
To define a dither pattern tile set, several parameters must be specified: the spatial size of each tile (i.e., M×N), the number of frames, L, and the number of color channels. Each parameter has trade-offs that must be balanced. However, embodiments of the present invention allow less resource-intensive parameters to be used without perceptible degradation of the final image. The number of levels in the dither pattern set must also be determined. A level may correspond to a luminance value, such as a gray-scale value in a monochrome image, a value for the luminance channel in image formats with specific luminance channels (e.g., LAB, LUV), or other parameters related to the visual perception of a pixel. This number may vary significantly according to specific application factors. In some embodiments, the number of levels may be determined with reference to the number of input bits and the number of output bits. In these embodiments, the number of levels may be determined by taking 2 to the power of the difference between the number of input bits and the number of output bits. In equation form this expression would be:
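n = 2^{(b_{in} - b_{out})}
where b_{in} is the number of input bits and b_{out} is the number of output bits.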
For example, for an LCD display with the capability to display 6 bits, but receiving an input signal with 10 bits of data, the number of levels would be:
n = 2^{(10 - 6)} = 2^{4} = 16
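The level calculation can be expressed as a short sketch. The function names are hypothetical. The second function shows one common way a dither value in the range 0 to n−1 could be added before truncation to the output bit-depth; that application step is an assumption made for illustration and is not stated in the specification.

```python
def dither_levels(input_bits, output_bits):
    """Number of dither levels: 2 raised to (input bits - output bits)."""
    if input_bits < output_bits:
        raise ValueError("input bit-depth must be at least the output bit-depth")
    return 2 ** (input_bits - output_bits)


def quantize_with_dither(value, dither, input_bits, output_bits):
    """ASSUMPTION (not from the specification): add a dither value in
    0..levels-1 to the input code, drop the low-order bits, and clamp
    to the largest output code."""
    shift = input_bits - output_bits
    max_out = (1 << output_bits) - 1
    return min((value + dither) >> shift, max_out)


levels = dither_levels(10, 6)  # the 10-bit input, 6-bit display example
```

With 16 levels, a 10-bit input that lies halfway between two 6-bit output codes is rounded up for half of the dither values and down for the other half, which is how the lost low-order bits are approximated on average.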
When a display is linear, the dither values may be evenly distributed among the levels. However, in many cases the display is not linear, so the levels may be distributed in a non-linear manner. When the number of output bits is greater than 4, the non-linear effect is small, so a uniform distribution does not cause a large non-linear error. Accordingly, the pixel values may generally be distributed evenly among levels. However, for lower numbers of output bits and larger non-linearities (e.g., gamma > 2), more threshold values should be distributed in the lower portion of the threshold range to compensate for the non-linear gamma effect.
Negative feedback is used to push the temporal frequency of the dither pattern into high frequencies. In some embodiments, for frame 1, since it is the first frame with no other frames to reference, the temporal feedback function, fMask, relates to an initial reference frame (IRF). The initial reference frame may comprise essentially any noise pattern. An IRF may comprise pseudo-random noise, alternating patterns, a field of constant pixel values, a blank tile or frame or any number of other “patterns.” In some embodiments, the IRF may be set to a uniform noise of amplitude 0.1.
For frame 2, frame 1 may be used as a feedback function. Frame 2 may also reference the IRF in some embodiments. For frame 3 and up, a temporal infinite impulse response (IIR) may be used in generating the feedback function, as shown in the following exemplary equation:
fMask=fMask*IIR Coef+(1−IIR Coef)*frame(T−1)
The further a frame is from the current frame, the less it contributes to the feedback function.
For the last frame, since the dither pattern will repeat itself, in order to achieve a temporal high-pass relationship between the last frame and the first frame, the contribution of the first frame may be added to the temporal IIR filtering as:
fMask=fMask*IIR coef+(1−IIR coef)*0.5*(frame(T−1)+frame(1))
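The two temporal update rules can be sketched in one function. This is an illustrative sketch: the function name, the list-of-lists tile representation and the example coefficient of 0.5 are assumptions, since the specification does not give a value for the IIR coefficient.

```python
def update_fmask(fmask, prev_frame, iir_coef, first_frame=None):
    """Return the updated temporal feedback mask.

    Regular frames:  fMask*coef + (1 - coef)*frame(T-1)
    Last frame:      fMask*coef + (1 - coef)*0.5*(frame(T-1) + frame(1))
    Pass `first_frame` only when generating the last frame of the cycle.
    """
    updated = []
    for y in range(len(fmask)):
        row = []
        for x in range(len(fmask[0])):
            recent = prev_frame[y][x]
            if first_frame is not None:
                # Blend in the first frame so the cycle wraps smoothly.
                recent = 0.5 * (prev_frame[y][x] + first_frame[y][x])
            row.append(fmask[y][x] * iir_coef + (1.0 - iir_coef) * recent)
        updated.append(row)
    return updated


# Hypothetical 1x2 tiles and coefficient, for illustration only.
regular = update_fmask([[0.0, 1.0]], [[1.0, 0.0]], 0.5)
last = update_fmask([[0.0, 1.0]], [[1.0, 0.0]], 0.5, first_frame=[[0.0, 0.0]])
```

Because each update multiplies the existing mask by the coefficient, the contribution of a frame decays geometrically with its distance from the current frame.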
While these particular embodiments have been found to perform well, many other methods may be used to disperse pixel values spatially and temporally.
The goal of spatial noise distribution is to distribute the dither values evenly so that there is minimum fluctuation in both luminance and chrominance when viewed from a certain distance. In some embodiments, the first dither value or pixel of the first level is entirely dependent on the fMask function and the initial reference tile or frame, when an IRF is used. In some embodiments, it will take the position of the maximum value in the IRF. In other embodiments, where a multiple channel IRF is used, cross-channel feedback from the IRF may cause this position to vary. Subsequent pixels are generally placed as far away as possible from all the previous pixels. This is analogous to placing charged balls in a plane. Each ball tries to repel other balls of the same charge as far as possible. The new ball will end up in the least occupied space when all values are equal. The inverse distance-squared function may be used as a repellent function, which is equivalent to the repellent force between charges of the same type. The repellent function may be implemented with a convolution kernel as
where x and y are the spatial coordinates and the constant 0.5 is used to prevent division by 0. The constant is also used to adjust cross color channel influence, as described later. Sigma (σ) defines the spatial extent of the repellent function and may be level dependent. For the first level, there are more degrees of freedom in assigning dither values, so sigma may take a larger value. At the mid level, nearly half of the cells are assigned and sigma may take a smaller value.
In some embodiments, the spatial feedback function may be referred to as the sMask function and may be expressed mathematically as
sMask(x,y,color)=img(x,y,color)**k(x,y)
where ** represents a convolution operation and img(x,y,color)=1 if a position is already assigned a dither value. To improve the speed, the convolution operation may be implemented in the frequency domain using Fourier transforms
sMask(x,y,color) = F^{-1}{ F[img(x,y,color)] · F[k(x,y)] }
where F denotes a forward Fourier transform and F^{-1} denotes an inverse Fourier transform. Whenever a new pixel is added, sMask may be recalculated to account for the presence of the new pixel value.
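A spatial feedback mask of this kind can be sketched as follows. Two points are assumptions and are not taken from the specification: the kernel form used here (a Gaussian envelope of width sigma divided by the squared distance plus 0.5) is only one form consistent with the description, and the sketch uses a direct wrap-around convolution in place of the Fourier-transform implementation. A wrap-around convolution is what a product of discrete Fourier transforms computes, and it suits a tile that repeats across the image.

```python
import math


def repellent_kernel(dx, dy, sigma):
    """ASSUMED kernel form: inverse distance-squared with a Gaussian envelope.
    The 0.5 in the denominator prevents division by zero at the origin."""
    d2 = dx * dx + dy * dy
    return math.exp(-d2 / (2.0 * sigma * sigma)) / (d2 + 0.5)


def spatial_mask(img, sigma):
    """Circular convolution of the assignment map `img` (1 = assigned)
    with the repellent kernel, computed directly rather than by FFT."""
    rows, cols = len(img), len(img[0])
    smask = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            total = 0.0
            for v in range(rows):
                for u in range(cols):
                    if img[v][u]:
                        # Wrap-around distance, because the tile repeats.
                        dy = min(abs(y - v), rows - abs(y - v))
                        dx = min(abs(x - u), cols - abs(x - u))
                        total += repellent_kernel(dx, dy, sigma)
            smask[y][x] = total
    return smask


# Hypothetical 4x4 tile with a single assigned position at (0, 0).
img = [[1, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]
smask = spatial_mask(img, sigma=2.0)
```

In this example the mask is largest at the assigned position and smallest at (2, 2), the position farthest from it under wrap-around distance, so a minimum search would place the next value there.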
Since the luminance sensitivity of human vision is higher than chrominance sensitivity, it is important to optimize multiple color dither arrays so that the luminance fluctuation is minimized. As a non-limiting example, in an RGB image, for a given gray (luminance value), if the red dither value is assigned to a position, the green dither value should also be repelled by the red dither value. Cross channel feedback can be implemented using a set of weighted spatial feedback functions, which may be implemented as follows:
where Cii is the weight of one color feedback function to another color. The three color channels contribute differently to luminance, with green contributing the most and blue the least, so in some embodiments the weights can be optimized so that Cgg is higher than Cbb. However, in many applications, this effect has been found to be small. Accordingly, in some embodiments, only two weights are implemented: off-diagonal weight C1 and diagonal weight C2. At mid levels, C1 is the smallest so that the cross channel feedback is very small. Various methods may be used to determine the best weighting values. Constant values may be used in some embodiments. These weights may also be determined using a level-dependent method. One embodiment of this is shown in the equations below.
C1 = ((level − nLevels/2)/nLevels)^2 + 0.07
C2=1−2*C1
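The level-dependent weights can be computed directly from the two equations above. In the sketch below, the first function follows those equations. The second function is an assumption: it shows one plausible way to combine per-channel spatial masks using C2 for the channel itself and C1 for each of the other channels, since the combining equation is not reproduced in this text. All function and variable names are hypothetical.

```python
def cross_channel_weights(level, n_levels):
    """Off-diagonal weight C1 and diagonal weight C2 for a given level."""
    c1 = ((level - n_levels / 2) / n_levels) ** 2 + 0.07
    c2 = 1 - 2 * c1
    return c1, c2


def cross_channel_mask(smasks, channel, c1, c2):
    """ASSUMED combination: weight the channel's own spatial mask by C2
    and each other channel's spatial mask by C1."""
    rows = len(smasks[channel])
    cols = len(smasks[channel][0])
    return [[sum((c2 if ch == channel else c1) * m[y][x]
                 for ch, m in smasks.items())
             for x in range(cols)]
            for y in range(rows)]


# Hypothetical 1x1 masks: only the red channel has an assigned value nearby.
smasks = {"r": [[1.0]], "g": [[0.0]], "b": [[0.0]]}
c1_mid, c2_mid = cross_channel_weights(8, 16)   # mid level of 16
c1_low, c2_low = cross_channel_weights(0, 16)   # first level
```

With three channels, C2 = 1 − 2·C1 makes the three weights applied to any one channel sum to 1. At the mid level C1 falls to its minimum of 0.07, so cross-channel feedback is weakest there.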
The temporal feedback function, spatial feedback function and cross-channel feedback function may be combined to form a merit function for determining the position of a dither pattern value. The location of the minimum or maximum of this merit function may be assigned a new dither value (level). When the level is small, most of the space is unassigned and it is easier to find the few positions that are already assigned. However, when the level number is close to the last level, most of the space is occupied and it is easier to find the holes that are not assigned. Thus the generation process may be divided into two steps:
For level<=nLevels
mask(x,y,color)=1−fMask(x,y,color)+cMask(x,y,color)
find(x0,y0)|mask(x0,y0,color)=min(mask(x,y,color))
TA(x0,y0)=level−1
img(x0,y0)=1
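The placement step listed above can be sketched for a single color channel as follows. The mask expression, the minimum search and the two assignments follow the listing. Skipping positions that are already assigned is an added assumption, and the function name and example values are hypothetical.

```python
def place_next_value(fmask, cmask, img, ta, level):
    """Assign `level - 1` at the unassigned position that minimizes
    mask = 1 - fMask + cMask, and mark that position as assigned."""
    best = None
    best_pos = None
    for y in range(len(img)):
        for x in range(len(img[0])):
            if img[y][x]:
                continue  # ASSUMPTION: never reuse an assigned position
            m = 1.0 - fmask[y][x] + cmask[y][x]
            if best is None or m < best:
                best, best_pos = m, (x, y)
    if best_pos is None:
        raise ValueError("no unassigned positions remain")
    x0, y0 = best_pos
    ta[y0][x0] = level - 1   # TA(x0, y0) = level - 1
    img[y0][x0] = 1          # img(x0, y0) = 1
    return best_pos


# Hypothetical 2x2 example.
fmask = [[0.0, 0.5], [0.0, 0.0]]
cmask = [[2.0, 0.1], [0.3, 0.4]]
img = [[0, 0], [0, 0]]
ta = [[None, None], [None, None]]
first = place_next_value(fmask, cmask, img, ta, level=1)
second = place_next_value(fmask, cmask, img, ta, level=1)
```

In a full generator the spatial and cross-channel masks would be recalculated after each placement, as described for sMask, before the next position is chosen. The example reuses fixed masks only to keep the sketch short.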
Some exemplary embodiments of the present invention may be explained with reference to
Initially, dither pattern tile set parameters 102 are designated to define the dimensions and characteristics of the tile set. Once the tile set is defined, each successive frame 104 is designated with reference to an initial reference frame and/or other image frames. In order to relate pixel values in a new dither pattern to other pixel values in preceding frames, an fMask function 106 is used. Depending on the position of the frame being designated, a different relationship or fMask function may be used as shown in the diagram 106, 108, 110 & 112.
In these particular embodiments, the first frame 106 will be designated with reference to an initial reference frame (IRF), which may be a random noise pattern or essentially any other pattern including a constant value tile or a blank tile. In some embodiments, the initial reference frame may simply be omitted and the first pixel value of the first frame may be placed by pseudo-random methods or other methods.
After the first frame of the dither pattern tile set has been established, the second frame may be established using an fMask function 108 that relates to the pixel values in the first frame. Subsequent frames may be established 110 with reference to one or more of the preceding frames and the IRF. The fMask function for the last frame 112 references the pixel values in the preceding frames as well as the first frame, which will be used in a cycle immediately following the last frame.
Once the fMask function for a particular frame is determined, a dither pattern tile is initialized 114 and the process for establishing the first level 116 of values is commenced. When cross-channel feedback methods employ level-dependent weighting factors, these factors may be calculated for the particular level 118.
In these exemplary embodiments, a loop is entered to designate the number of pixels that have been allocated to that particular level 120. Another loop is entered to cycle through the color channels 122. These structures are merely exemplary for some embodiments of the present invention and may be modified in many ways for alternative embodiments.
For each pixel value in a particular level within a particular color channel tile, the feedback functions are aggregated to find the location of a dither pattern pixel value 124. This operation may comprise spatial feedback, cross color-channel feedback and temporal feedback as well as other factors. Once a pixel value has been designated, the feedback values are recalculated using the new pixel value as additional input 126. Subsequent pixel values will be repelled from that newly designated value as well. In these illustrative embodiments, the next color tile is then selected 128 and a pixel value is designated in that tile. This second color pixel value is determined 130 according to the merit function taking into account the location of the first pixel value in the first color channel. This pixel designation process is repeated until all pixel values for a particular level have been designated for each of the color channels.
When a level is fully designated for all color tiles, the next level is selected 132 and pixel values for that level are designated for all color channels. When all levels have been designated for all color channels the next frame is selected 134. The process then repeats for the next frame by calculating the appropriate fMask 112 temporal feedback function, cross-channel feedback values 118 and spatial feedback factors 126 as well as other calculations. Once all frames are designated, the entire dither pattern array is stored for use in video processing 136.
It should be noted that in alternative embodiments, not illustrated in
To determine the frequency characteristics of dither pattern arrays produced with embodiments of the present invention a Fourier analysis may be used.
Some embodiments of the present invention may also employ a tile stepping method as illustrated in
In other embodiments of the present invention, the tile pattern in a particular frame may be varied beyond a sequential spatial order across the rows. In some embodiments, the tiles may be dispersed in a random spatial order across a frame. Once this random spatial pattern is established in the first frame, the tiles in the next temporal frame and subsequent frames will follow a sequential temporal order such that the tile corresponding to the position of a tile in the first frame will be the next sequential tile in the temporal order established in the dither tile structure. These embodiments are illustrated in
Some embodiments of the present invention may make use of the oblique effect of the human visual system. The contrast sensitivity function of the human visual system is dependent on the viewing orientation. Sensitivity to vertical and horizontal orientations is higher than sensitivity to diagonal orientations such as 45 degrees. To take advantage of this effect, the dither pattern may be designed so that its power spectrum peaks at 45 degrees. The convolution kernel of embodiments of the present invention can take advantage of this property. Instead of Euclidean distance, city block distance may be used in the repellent function, as shown in the equation below:
In some embodiments of the present invention, level dependent temporal feedback functions may be used such that only a small fraction of fMask is applied to the combined feedback function at mid levels. As a non-limiting example, a normalized C1 can be used in the spatial feedback function as a weighting function for fMask as well.
The terms and expressions which have been employed in the foregoing specification are used therein as terms of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding equivalents of the features shown and described or portions thereof, it being recognized that the scope of the invention is defined and limited only by the claims which follow.
This application is a division of U.S. patent application Ser. No. 10/645,952, filed Aug. 22, 2003.
Relation | Number | Date | Country |
---|---|---|---|
Parent | 10645952 | Aug 2003 | US |
Child | 13563583 | | US |