The invention relates generally to digital image processing and more particularly to directional noise filtering processes that reduce noise in the original image while also improving contrast.
Digital image processing involves the use of special-purpose hardware and computer-implemented algorithms (i.e., computer programs) to transform digital images. Digital still/video cameras, whether unitary or embedded in consumer products such as mobile telephones, generate data files whose contents represent the scene whose image was captured. Because of scene lighting, device imperfections and/or human proclivities, the captured data may not be an accurate (or desired) representation of the captured scene.
The difference between what an individual wants or likes to see in an image and the data actually captured may be thought of as “noise.” The reduction of noise is the subject of this disclosure.
In one embodiment the invention provides a method to selectively filter an image. The method includes obtaining an input image made up of pixels (each having a value); identifying a direction of a structure at one or more of the pixels; and transforming the input image into an output image by filtering the input image along the identified direction of the structure at each of the one or more pixels.
In another embodiment, the invention comprises a computer-executable program to implement the method, where the program may be tangibly recorded on any medium that is readable and executable by a computer system or programmable control device. In yet another embodiment, the invention comprises a computer system for performing the described methods. In still another embodiment, the invention comprises a system having one or more computer systems communicatively coupled wherein at least one of the computer systems is designed or programmed to perform the described methods.
The following description is presented to enable any person skilled in the art to make and use the invention as claimed and is provided in the context of the particular examples discussed below, variations of which will be readily apparent to those skilled in the art. In the interest of clarity, not all features of an actual implementation are described in this specification. It will be appreciated that in the development of any actual implementation, numerous programming and component decisions must be made to achieve the developers' specific goals (e.g., compliance with system- and business-related constraints), and that these goals will vary from one implementation to another. It will also be appreciated that such development effort might be complex and time-consuming, but would nevertheless be a routine undertaking for those of ordinary skill in the image processing field having the benefit of this disclosure. Accordingly, the claims appended hereto are not intended to be limited by the disclosed embodiments, but are to be accorded their widest scope consistent with the principles and features disclosed herein.
Directional noise filtering operations in accordance with general principles of the invention may be described as illustrated in
As used herein, the term “image” means a digital representation of a scene recorded in a digital storage media. Specifically, an image constitutes a physical entity in that the contents, format, or representation of the image in the storage media has a specific pattern. In this manner, an image is similar to a program used to cause a programmable control device to perform a task—when executed (viewed), memory within the programmable control device takes on a unique and specific pattern.
As used herein, the “direction” of pixel p refers to the direction of structure within the image centered about pixel p. Referring to
Elements in direction map 115 generally have a one-to-one correspondence to pixels in image 105. Referring to
Acts in accordance with block 130 utilize direction map 115 to selectively smooth/filter input image 105. For example, image pixels corresponding to direction map elements assigned a direction of no_direction may be filtered in a first way, image pixels corresponding to direction map elements determined to be part of a corner may be filtered in a second way and image pixels corresponding to direction map elements assigned a direction other than no_direction (e.g., 45°) may be filtered in a third way. In one embodiment, image pixels corresponding to no_direction direction map elements may be blurred with, for example, a Gaussian blur; image pixels corresponding to direction map elements identified as corners may be left unperturbed and image pixels corresponding to direction map elements assigned a specific direction (e.g., 45°) may be filtered along their assigned direction. In another embodiment, direction map 115 may be pre-filtered to eliminate spurious directional regions (e.g., small isolated collections of direction map elements surrounded by other elements having no direction) while allowing the more highly-connected directional areas (e.g., collections of direction map elements reflecting real structure in the input image) to remain intact for subsequent smoothing operations.
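By way of illustration only, the per-pixel dispatch described above may be sketched as follows. The element codes (NO_DIRECTION, CORNER) and the helper operators named in the sketch are assumptions made for readability; they are not drawn from the disclosed embodiment.

```python
NO_DIRECTION = -1   # assumed code: no structure found at this pixel
CORNER = -2         # assumed code: pixel is part of a corner

def smooth_selectively(image, direction_map, directional_filter, gaussian_blur):
    """Filter each pixel according to its direction map element (cf. block 130).

    directional_filter(image, y, x, angle) and gaussian_blur(image, y, x) are
    placeholders for the per-pixel operators described in the text.
    """
    output = image.copy()
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            element = direction_map[y, x]
            if element == CORNER:
                continue                                   # corners left unperturbed
            elif element == NO_DIRECTION:
                output[y, x] = gaussian_blur(image, y, x)  # blur non-structural (noise) pixels
            else:
                # smooth along the assigned direction (e.g., 45 degrees)
                output[y, x] = directional_filter(image, y, x, element)
    return output
```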
For a more detailed overview of image processing operations in accordance with this disclosure, see
With respect to the acts of block 410, input image 405 may be obtained in many formats. In one embodiment, for example, input image 405 comprises a YCbCr representation (having a gamma value of approximately 1.0). In another embodiment, input image 405 comprises a Y′Cb′Cr′ representation (having a gamma value of approximately 2.0). In yet another embodiment, input image 405 comprises an RGB representation. In those implementations in which image 405 comprises a YCbCr representation, only the Y (luminance) channel need be processed in accordance with the techniques described herein. In those implementations where image 405 comprises an RGB representation, each channel (R, G and B) may be individually processed in accordance with the techniques described herein. In still another embodiment, less than all channels in an RGB representation may be processed in accordance with the techniques described herein.
With respect to the acts of block 415, an image capture device's noise characteristics may generally be determined from one or more items of image metadata. In general, image capture devices embed certain metadata within each captured image (e.g., image 405). From this information, and knowledge of the device itself, it is generally possible to characterize the overall noise environment of the device. By way of example, the noise characteristics of a particular device (e.g., a digital camera embedded in a consumer device such as a mobile telephone or media player device) at a particular pixel may be a function of ISO (also known as gain) and exposure time and may be represented as:
Noise(p)=f(ISO,Exposure_Time,Pixel_Value) EQ. 1
where Noise(p) represents the noise variance at a given pixel p whose value is given by Pixel_Value.
It will be recognized that while different image capture devices may characterize their noise in different ways (i.e., linear versus non-linear and/or with different metadata items), the form EQ. 1 takes is generally known for each device. For example, assuming a linear noise relationship for function f( ) in EQ. 1, the noise at a selected pixel p may be described as follows:
Noise(p)=[(Shot_Noise)×h(p)]+(Dark_Noise) EQ. 2
where shot noise and dark noise are functions of image metadata (e.g., ISO and exposure time) and h(p) represents a value that is a function of pixel p's value. Determination of shot noise and dark noise is generally a function of the image capture device, and both are assumed known once image metadata is obtained.
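A minimal sketch of the linear model of EQ. 2 follows. The linear-in-ISO forms for shot noise and dark noise, and the constants used, are purely hypothetical stand-ins for a real device characterization, and h(p) is simplified here to the raw pixel value.

```python
def noise_variance(pixel_value, iso, exposure_time,
                   shot_gain=1.0e-4, dark_gain=2.0e-3):
    """EQ. 2 sketch: Noise(p) = Shot_Noise * h(p) + Dark_Noise.

    The calibration below (shot noise proportional to ISO, dark noise
    proportional to ISO and exposure time) is an assumption for illustration;
    an actual device would supply its own characterization.
    """
    shot_noise = shot_gain * iso                   # hypothetical calibration
    dark_noise = dark_gain * iso * exposure_time   # hypothetical calibration
    return shot_noise * pixel_value + dark_noise   # h(p) taken as the pixel value
```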
With respect to the acts of block 425,
In one embodiment, each edge test (block 505) uses a different number of pixels to disambiguate noise from structure. For example, a first test could use a 3×3 pixel neighborhood (i.e., a first resolution) while a second test could use a 5×5 pixel neighborhood (i.e., a second resolution). Only after a sufficient number of pixels have been used is it generally possible to distinguish structure from no structure in light of the noise present at the selected pixel in input image 405. (See additional discussion below.) Thus, if a first edge test using a first number of pixels is not sufficiently robust to distinguish structure (or the lack thereof) from noise, a second test using a second number of pixels may be used. Successive tests using different numbers of pixels may also be used—that is, more than two edge tests may be used.
With respect to acts in accordance with block 505,
With respect to the acts of block 510, in one embodiment a pixel is said to participate in, or be part of, a structure within image 405 if the largest variance calculated for a given edge test (e.g., the result of acts in accordance with block 620) is greater than or equal to the selected pixel's edge noise threshold, where the selected pixel's edge noise threshold is based on the pixel's noise characterization as given in EQ. 2.
Referring to EQ. 2, in one embodiment a pixel's h(p) function represents the average tonal value of pixel p. A pixel's average tonal value may be determined by taking the weighted average of a number of pixel values surrounding the pixel. Referring to
It will be recognized that the sum of products in the numerator of EQ. 3 is divided by 16 (the sum of the weight values) to generate an “average” value.
In the embodiment represented by EQ. 3 and shown in
Substituting the selected pixel's average tonal value for pixel noise function h(p) in EQ. 2 yields:
Noise(p)=((Shot_Noise)×ATV(p))+(Dark_Noise) EQ. 4
Substituting the illustrative average tonal value given by EQ. 3 into EQ. 4 yields a characterization of noise for the selected pixel and the given image capture device:
With the noise characterized at pixel p (EQ. 5), a noise threshold for pixel p may be specified. Separate noise thresholds may be used for the small edge test, the large edge test, and the corner test. In one embodiment, pixel p's noise threshold is given by:
Noise Threshold(p)=Noise(p)×T×NF(p) EQ. 6
where T represents the number of squared terms in the variance calculation discussed below vis-à-vis EQS. 7b, 8b, 9b, 10b and 14-17. The factor NF(p) represents an a priori, empirically determined noise factor, selected to reliably determine when a pixel's value is attributable to structure within the underlying image. One of ordinary skill in the art will recognize that the value of NF(p) may vary from image capture device to image capture device and from pixel to pixel, and may be time-consuming to determine, but is nevertheless straightforward to find given this disclosure.
In applications in which input image 405 is in Y′Cb′Cr′ format, image gamma is often approximately 2. In these cases, noise variance across an image has been found to be relatively constant. Accordingly, in these implementations it may be possible to calculate a single noise threshold in accordance with EQ. 6 once per image rather than for each pixel as described above.
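Taken together, EQS. 3 through 6 may be sketched as shown below. The 3×3 kernel merely stands in for the weights of EQ. 3 (which appear in the referenced figure) and was chosen only because its coefficients sum to 16; the shot noise, dark noise and noise factor inputs are assumed to be supplied by the device characterization described above.

```python
import numpy as np

# Stand-in for the EQ. 3 weights; the coefficients sum to 16 as in the text.
ATV_WEIGHTS = np.array([[1, 2, 1],
                        [2, 4, 2],
                        [1, 2, 1]], dtype=float)

def average_tonal_value(image, y, x):
    """Weighted average of the 3x3 neighborhood about pixel p (cf. EQ. 3)."""
    patch = image[y - 1:y + 2, x - 1:x + 2].astype(float)
    return float((patch * ATV_WEIGHTS).sum() / 16.0)

def noise(image, y, x, shot_noise, dark_noise):
    """EQ. 4: Noise(p) = Shot_Noise * ATV(p) + Dark_Noise."""
    return shot_noise * average_tonal_value(image, y, x) + dark_noise

def noise_threshold(image, y, x, shot_noise, dark_noise, num_terms, noise_factor):
    """EQ. 6: Noise Threshold(p) = Noise(p) * T * NF(p).

    num_terms (T) is the number of squared terms in the relevant variance
    calculation and noise_factor (NF) is the empirically determined factor.
    """
    return noise(image, y, x, shot_noise, dark_noise) * num_terms * noise_factor
```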
Referring again to
The described small edge test uses a smaller neighborhood than does the large edge test. As a result, the large edge test is less sensitive to noise. Conceptually, a small edge test may be performed first, as it is less computationally costly than a large edge test. If the structure is “significant” enough to be detected by the small edge test, processing continues. If the structure is not detected by the small edge test, a large edge test may be performed (as it is in the illustrative embodiment described herein) to improve the chances that a structure is detected.
In the following, each edge test assumes the pixel being evaluated (pixel p) is the center pixel in a 5×5 array of pixels.
Beginning with a small edge test, the variance for pixel p across an edge aligned to the 0° direction may be visualized as the sum of the squares of differences along the 90° (perpendicular) direction, centered about pixel p. In
where V(x) represents the variance of x.
In similar fashion,
where the notation “CENTER” refers to a pixel “location” that is at the junction of two or more other pixels. See, for example, FIG. 9B's illustrative calculation and corresponding diagram for VS(H:CENTER).
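A sketch of a six-term small edge test is given below. The 0° and 90° variances are formed from adjacent-pixel differences in the 3×3 neighborhood of p; the precise pixel pairings of EQS. 7b-10b, including the diagonal and CENTER variants, appear in the referenced figures and are not reproduced here.

```python
def small_edge_variances(image, y, x):
    """Illustrative six-term small edge variances at 0 and 90 degrees.

    For an edge aligned with 0 degrees, differences are taken along the
    perpendicular (vertical) direction in each of the three columns of the
    3x3 neighborhood, giving 3 columns x 2 differences = 6 squared terms.
    The 90-degree case is the transpose of that arrangement.
    """
    n = image[y - 1:y + 2, x - 1:x + 2].astype(float)
    v0 = sum((n[r, c] - n[r + 1, c]) ** 2 for r in range(2) for c in range(3))
    v90 = sum((n[r, c] - n[r, c + 1]) ** 2 for r in range(3) for c in range(2))
    return v0, v90

def small_edge_test(image, y, x, small_edge_threshold):
    """True when the largest computed variance meets the pixel's small edge
    noise threshold (cf. block 510)."""
    return max(small_edge_variances(image, y, x)) >= small_edge_threshold
```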
Referring once more to
In accordance with one embodiment of a large edge test, the selected pixel p's neighborhood is divided into 3 regions (labeled “a,” “b” and “c” in
With average values known for each region in accordance with EQS. 11b, 12b and 13b, the large edge variance values for each of the inspected directions are given by:
As in the small edge test, the largest of the variances calculated in accordance with EQS. 14-17 is selected and compared against the selected pixel's large edge noise threshold. If the value is greater than the pixel's large edge noise threshold (the “YES” prong of block 510), information pertaining to determining whether the selected pixel is part of a corner structure is gathered, whereafter operations continue at block 430, where additional detail about the structure in which the selected pixel participates is collected. If the test of block 510 fails, another edge test is selected if available in accordance with block 515. Since, in the current example, no additional edge tests are specified, operations continue at block 435.
In EQS. 11b, 12b and 13b the average region value was calculated. It has been found beneficial, without significant impact on the described process, to calculate a(average), b(average) and c(average) values using denominator values that are even powers of two. Accordingly, EQS. 11b, 12b and 13b may also be calculated as follows:
A corresponding adjustment to the large edge test noise factor must be made: it should be multiplied by (5/8)², which is 25/64.
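As an illustration, a two-term large edge variance at 0° might be sketched as follows. The five-pixel, row-shaped regions and the middle-difference form are assumptions consistent with the two-term count and the 25/64 adjustment noted above; the disclosed region shapes and EQS. 14-17 appear in the referenced figures.

```python
def large_edge_variance_0deg(image, y, x, power_of_two_denominator=True):
    """Illustrative large edge variance at 0 degrees for pixel p.

    The 5x5 neighborhood is split into three assumed 5-pixel regions a, b and
    c (the rows above, at and below p), and the variance is the two-term
    middle-difference sum (a_avg - b_avg)**2 + (b_avg - c_avg)**2.
    """
    denom = 8.0 if power_of_two_denominator else 5.0   # the 8-versus-5 adjustment
    a_avg = image[y - 1, x - 2:x + 3].astype(float).sum() / denom
    b_avg = image[y,     x - 2:x + 3].astype(float).sum() / denom
    c_avg = image[y + 1, x - 2:x + 3].astype(float).sum() / denom
    return (a_avg - b_avg) ** 2 + (b_avg - c_avg) ** 2

# When the power-of-two denominator is used, the large edge noise factor is
# multiplied by (5/8)**2 = 25/64 so the threshold comparison stays consistent.
LARGE_EDGE_NF_ADJUSTMENT = 25.0 / 64.0
```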
It will be recognized that both the small and large edge tests described above are arranged to accomplish “middle difference” summing. Thus, a more positive indication of structure is obtained by computing the sum of multiple differences from a middle ground where the edge presumably lies. It will further be recognized that other edge tests may sum differences across the center without using the value at the center. These types of edge tests (e.g., the Sobel gradient test) tend to miss thin lines for this reason.
As noted above, when an edge test is successful additional data is gathered to assist in determining whether the selected pixel is part of a corner structure. (As used here, the term “successful” means the calculated variance for the selected pixel is greater than the pixel's edge noise threshold.) While a different arrangement of pixels is collected depending upon which direction had the largest variance, determining whether a given pixel is part of a corner structure in image 405 is the same regardless of which edge test is being used (e.g., small or large). Referring to
where V(corner) represents the calculated variance for the selected pixel assuming it is part of a corner structure.
The above corner accommodation technique may be utilized to minimize the rounding-off of corners during a subsequent smoothing operation—see block 470. For example, the lower left corner of a rectangular object will have direction values of 90° on its left side and direction values of 0° on its lower side. At the corner, however, the direction value is 45°. Accordingly, looking along the 45° direction of any corner having sufficient curvature yields highly differing pixel values. The above test exploits this result: pixels meeting the above test may be marked as corners and ignored during smoothing.
Referring again to
As previously noted, noise factor NF(p) (first identified in EQ. 6) may be experimentally determined and is chosen so that the selected pixel's edge noise threshold accurately distinguishes between structure and the lack of structure at the selected pixel. In general, noise factor NF(p) may be chosen so that edges are positively identified. The nature of noise in images, however, causes some random alignment of pixel data in areas of pure noise. These become false positive edge indications. Such areas typically do not line up or connect like real edges. To recognize and eliminate such false positives, direction map 440 may be filtered in accordance with block 460 (see discussion below).
It has been found that one factor in determining a value for NF(p) is the number of terms used to calculate the edge variance (e.g., EQS. 7b, 8b, 9b, 10b, 14-17 and 21c). Thus, for the illustrative small edge test described above (see EQS. 7b, 8b, 9b, 10b), this would be 6. For the illustrative large edge test described above (see EQS. 14-17), this would be 2. And for the illustrative corner test described above (see EQ. 21c), this would also be 2.
Referring again to
Referring to
With respect to determining the variance along each specified direction (block 1105), the variance of a pixel may be determined by finding the sum of the squares of the length vectors between adjacent pixels in the selected pixel's neighborhood along the specified direction. This approach may be substantially the same as that used with respect to the acts of block 605. By way of example, FIGS. 12A-12D illustrate one technique to calculate the directional structure variance at 0°, 18°, 27° and 45° when the structure at the selected pixel was identified using the above-described small edge test (block 425). Similarly,
In general, a selected pixel's variance along a given direction (α) in accordance with the acts of block 430 may be represented as:
V(at direction α)=Σ(Individual Variances at α) EQ. 22
Directional variances calculated in accordance with EQ. 22 when the selected pixel's structure was identified using a small edge test (see block 425,
VSD(at direction α)=Σ(Individual Variances at α) EQ. 23
In similar fashion, when the selected pixel's structure was identified using a large edge test (see block 425,
VLD(at direction α)=Σ(Individual Variances at α) EQ. 24
One of ordinary skill in the art will understand, given the above discussion and the accompanying figures, how to calculate variance values along the other specified directions. It will also be understood that the values used to modify sums within certain calculations (see, for example,
In addition, it will be noted that pixel edge test operations in accordance with block 505 identify 0° using vertical inter-pixel paths (see
Table 1 shows the number of terms and the vector length associated with each total variance value for the illustrative embodiment. The results presented in Table 1 may be derived directly from
Referring to
It can be noted that a pixel's variance at a specified angle or direction represents the sum of the squared length differences along a hypothetical edge at that angle. Accordingly, the magnitude of a pixel's variance for a given angle may be influenced by (1) the length of the difference vectors for that angle and (2) the number of squared difference terms for the given angle. Based on this recognition, acts in accordance with block 1400 may group the calculated directional variances in accordance with the length of their difference vectors. For the illustrative example being described herein, Table 2 shows the result of this grouping.
Angle variances (calculated by the sum of squared difference vectors as described herein) may be freely compared when they have equal difference vector lengths and an equal number of terms—i.e., within a class as defined above. For this reason, acts in accordance with block 1405 may simply select the smallest value variance (and its associated angle) from each group, with the variance/angle pair so selected being referred to as a “class representative” variance/angle pair.
With respect to block 1410, when comparing variance values between classes, both the number of terms and the difference vector lengths must be accounted for. This may be done by normalizing each variance value to both the number of terms and the difference vector length. In one embodiment, this may be calculated on a pair-wise basis by multiplying the ratio of the squared difference vector lengths by the ratio of the numbers of terms for the two variances being compared. Thus, the normalization factor used to compare a variance value from class A to a variance value from class B may be computed as:
where the comparison is made by comparing the value from class A with the value obtained by multiplying the class B value by the Normalization Factor. One of ordinary skill in the art will understand that the expression given in EQ. 25 describes one of many possible ways of normalizing two factors.
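A sketch of the class-representative selection (block 1405) and the inter-class comparison (block 1410) follows. Because EQ. 25 itself is not reproduced in this text, the orientation of the two ratios in the normalization factor is an assumption.

```python
def class_representative(variance_angle_pairs):
    """Within one class (equal vector lengths and term counts), keep the
    smallest variance and its angle (cf. block 1405)."""
    return min(variance_angle_pairs, key=lambda pair: pair[0])

def normalization_factor(length_a, terms_a, length_b, terms_b):
    """Assumed reading of EQ. 25: the squared ratio of difference vector
    lengths multiplied by the ratio of term counts."""
    return (length_a ** 2 / length_b ** 2) * (terms_a / terms_b)

def preferred_class(var_a, length_a, terms_a, var_b, length_b, terms_b):
    """Compare a class A representative against a class B representative after
    scaling the class B value onto the class A basis (cf. block 1410)."""
    normalized_b = var_b * normalization_factor(length_a, terms_a, length_b, terms_b)
    return "A" if var_a <= normalized_b else "B"
```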
For the illustrative embodiment described herein, operations in accordance with block 1410 are illustrated in
As previously discussed, direction map 440 comprises a number of elements equal to the number of pixels in image 405. Accordingly, acts in accordance with block 445 update the selected pixel's corresponding entry in direction map 440 to reflect the direction or angle identified during the acts of block 1410 and whether the selected pixel is “small” or “large” as determined during acts in accordance with block 425. Referring again to
With respect to block 460, it has been found beneficial to despeckle or filter direction map 440 prior to using it to smooth image 405. In one embodiment, direction map 440 may be subjected to a 3-pass filter operation in accordance with
With respect to passes 2 and 3, the phrase “surrounding elements” means the 4(n−1) elements along the periphery of the selected n×n neighborhood. In the example implementation shown in Table 3, pass 2 uses a 7×7 neighborhood (n=7) so that the “surrounding elements” comprise 24 elements. Similarly, pass 3 uses a 15×15 neighborhood (n=15) so that the “surrounding elements” comprise 56 elements. In like manner, the “middle” elements are all those elements that are not “surrounding elements.” In general, for any n×n array of direction map elements, there will be 4(n−1) elements on the periphery and (n²−4n+4) elements in the “middle.”
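The periphery/middle bookkeeping may be checked with a short helper; this is a counting sketch only and does not reproduce the pass rules of Table 3.

```python
def split_neighborhood(n):
    """Partition an n x n index set into 'surrounding' (periphery) and 'middle'
    elements: 4*(n - 1) on the periphery and (n - 2)**2 = n*n - 4*n + 4 in the
    middle (24 and 25 for n = 7; 56 and 169 for n = 15)."""
    surrounding, middle = [], []
    for r in range(n):
        for c in range(n):
            if r in (0, n - 1) or c in (0, n - 1):
                surrounding.append((r, c))
            else:
                middle.append((r, c))
    return surrounding, middle
```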
In general, operations in accordance with
With respect to acts in accordance with block 470,
Once the selected pixel has been processed as having a specific direction (block 1715), no direction (block 1720) or as being part of a corner structure (the “YES” prong of block 1705), a check is made to determine if image 405 has pixels that have not been processed in accordance with blocks 1705-1720. If additional pixels remain to be processed (the “YES” prong of block 1725), the next pixel and corresponding direction map element are selected (block 1730), whereafter processing continues at block 1705. If all pixels in image 405 have been processed (the “NO” prong of block 1725), generation of output image 465 is complete.
With respect to the acts of block 1715,
Directional means in accordance with block 1800 may be determined by applying weighting factors to each selected pixel's neighborhood and averaging the results. For example, pixels identified as “small” may use a weighting vector
In like fashion, pixels identified as “large” may use a weighting factor
One of ordinary skill in the art will recognize these weighting factors implement a digital approximation to a small Gaussian blur—“small” pixels using a 3-pixel range and “large” pixels using a 5-pixel range. It will also be recognized by those of ordinary skill that other averaging operators may be used.
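One possible sketch of the directional mean follows. The weight vectors ([1 2 1]/4 for “small” pixels and [1 4 6 4 1]/16 for “large” pixels) and the nearest-pixel stepping along the angle are assumptions; the disclosed weights and any sub-pixel interpolation appear in the referenced figures. The perpendicular mean M2 may be obtained by evaluating the same function at the angle plus 90°.

```python
import numpy as np

# Assumed Gaussian-approximating weights (3-pixel and 5-pixel ranges).
SMALL_WEIGHTS = np.array([1.0, 2.0, 1.0]) / 4.0
LARGE_WEIGHTS = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def directional_mean(image, y, x, angle_deg, weights):
    """Weighted mean (M1) of pixel p and its neighbors along the structure
    direction; pixel p is assumed far enough from the image border."""
    k = len(weights) // 2
    dy = np.sin(np.radians(angle_deg))   # sign conventions follow the figures
    dx = np.cos(np.radians(angle_deg))
    samples = [float(image[int(round(y + i * dy)), int(round(x + i * dx))])
               for i in range(-k, k + 1)]
    return float(np.dot(weights, samples))
```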
Using the neighborhood layout first presented in
It will be recognized that for the 12-direction embodiment discussed here, the following equivalence relationships exist: M1(p at 0°)=M2(p at 90°); M1(p at 90°)=M2(p at 0°); M1(p at 18°)=M2(p at 108°); M1(p at 108°)=M2(p at 18°); M1(p at 27°)=M2(p at 117°); M1(p at 117°)=M2(p at 27°); M1(p at 45°)=M2(p at 135°); M1(p at 135°)=M2(p at 45°); M1(p at 63°)=M2(p at 153°); M1(p at 153°)=M2(p at 63°); M1(p at 72°)=M2(p at 162°); and M1(p at 162°)=M2(p at 72°).
With respect to the acts of block 1805, one illustrative contrast-corrected value M3 may be generated by altering the directional mean M1 as follows:
M3(α)=M1(α)+[M1(α)−M2(α)]×CF(d) EQ. 26
where M1(α) and M2(α) represent directional variance means at the specified angle α discussed above and illustrated in
As to factor CF(d), it is noted that since some angles require more interpolation to produce their M1 and M2 values than others, these angles will also produce softer results. It has been found that the amount of interpolation required also varies along the same lines as the angle classes. Thus, only one contrast factor per angle class is needed. In one embodiment, the value of CF(d) is empirically determined and fixed for each directional variance angle class shown in Table 2. Specific values for CF(d), determined using visual inspection of the results on a “target” test chart, include: 0.4 for angle class 1; 0.4 for angle class 2; 0.6 for angle class 3; and 0.8 for angle class 4.
With respect to the acts of block 1810, the selected pixel's M3 value may be used to smooth pixel p as follows:
p(smoothed)=p+[M3(α)−p]×SF EQ. 27
where SF represents a smoothing factor. In accordance with EQ. 27, if SF=0, p(smoothed)=p (no smoothing). If, on the other hand, SF=1, pixel p is said to be totally smoothed. In practice, it has been found that a smoothing factor of approximately 0.5 yields visually appealing images. Using SF=0.5 has been found to reduce the noise associated with pixel p by approximately 6 dB. One of ordinary skill in the art will recognize that other smoothing factor values may be used depending upon the desired result.
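EQS. 26 and 27 may be combined into a single per-pixel helper; the contrast factors and the smoothing factor of 0.5 are the values given above.

```python
# Empirically determined contrast factors per angle class (from the text).
CONTRAST_FACTORS = {1: 0.4, 2: 0.4, 3: 0.6, 4: 0.8}

def smooth_pixel(p, m1, m2, angle_class, smoothing_factor=0.5):
    """Contrast-correct the directional mean (EQ. 26) and smooth pixel p toward
    it (EQ. 27); m1 and m2 are the means along and across the structure."""
    cf = CONTRAST_FACTORS[angle_class]
    m3 = m1 + (m1 - m2) * cf                   # EQ. 26
    return p + (m3 - p) * smoothing_factor     # EQ. 27
```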
For clarity, one illustrative image smoothing operation in accordance with block 470 is described via pseudo-code in Table 4.
In one embodiment, image processing techniques in accordance with this disclosure may process YCbCr or Y′Cb′Cr′ image data. It will be recognized that image processing on this type of data is typically used in embedded environments (e.g., mobile phones, personal digital assistant devices and other portable image capture devices). Referring to
In another embodiment, image processing techniques in accordance with this disclosure may process RAW image data. It will be recognized that RAW image processing is typically used in non-embedded environments (e.g., stand-alone workstations, desktop and laptop computer systems). Referring to
In this disclosure, operations have been described in terms of neighborhoods centered about a selected pixel (for an image) or element (for a direction map). It will be recognized that pixels (elements) at the edge of an image (direction map) do not have surrounding pixels (elements). These conditions are often referred to as boundary conditions. In such circumstances, one of ordinary skill in the art will recognize that there are a number of different approaches that may be taken to “fill in” neighborhood pixels (elements). In a first approach, pixels (elements) within the image (direction map) are simply duplicated across the edge. For example: pixels (elements) on the edge are duplicated to create a temporary row or column of pixels (elements); pixels (elements) one row or column away from the edge may then be duplicated to create a second row or column of pixels (elements). This process may be repeated until the needed number of rows or columns has been filled in. In another approach, pixels (elements) that do not physically exist may be assumed to have zero values. While only two techniques to fill in missing pixel (element) values have been described here, those of ordinary skill will recognize many others are possible.
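Both fill-in strategies correspond to standard padding operations. Assuming NumPy arrays, a sketch is:

```python
import numpy as np

def pad_by_edge_replication(image, border):
    """Duplicate edge pixels (elements) outward, one row/column at a time,
    as in the first approach described above."""
    return np.pad(image, border, mode="edge")

def pad_with_zeros(image, border):
    """Assume non-existent pixels (elements) have zero values (second approach)."""
    return np.pad(image, border, mode="constant", constant_values=0)
```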
Various changes in the materials, components, and circuit elements, as well as in the details of the illustrated operational methods, are possible without departing from the scope of the following claims. For example, not all steps of operations outlined in