As manufacturing capabilities for image sensor devices have improved, it has become possible to place more pixels in a fixed-size area of silicon. As a consequence, pixel size is shrinking. From a signal processing perspective, more pixels mean that the scene is sampled at a higher rate, providing higher spatial resolution. Smaller pixels, however, collect less light (fewer photons), which, in turn, leads to lower per-pixel signal-to-noise ratios (SNRs). As light levels decrease, the SNR of a small-pixel camera therefore falls faster than the SNR of a large-pixel camera. Thus, the extra resolution provided by a smaller-pixel image sensor comes at the expense of increased noise.
A side effect of placing more pixels into a fixed-size silicon sensor is lower pixel well capacity. As pointed out earlier, fewer photons result in a reduced signal everywhere. The impact is particularly severe in blue regions of the image, such as the sky. Because each pixel element receives fewer photons, the red channel signal in blue regions is particularly weak (due to the use of Bayer color filter arrays), which, after amplification from white balancing, color correction, and local tone mapping, manifests itself as noise in blue regions of the image. One approach to this problem would be to increase the noise reduction strength for blue pixels. This would mitigate noise in blue regions such as sky, but would also remove texture in other blue regions such as ripples in water, ocean waves, and blue jeans or shirts. Another approach would be to extract regions of the image that contain large, relatively smooth blue areas (e.g., sky) using image segmentation techniques and a learning-based method to separate these types of regions from the rest of the image; noise reduction strengths could then be increased in those regions. Image segmentation is, however, a time-consuming and processor-intensive process that is not feasible to implement in a camera pipeline.
Sharpness and noise are arguably the two most important quality considerations for an image. Camera manufacturers would like to deliver an image that is sharp with very low noise. Because edges/texture and noise overlap in frequency, these are often conflicting goals: noise reduction typically results in a softer image, while classical sharpening methods enhance high-frequency content (both signal and noise). The challenge is to devise a methodology that removes noise in smooth areas, where it is most visible, while enhancing sharpness in texture-rich regions.
In one embodiment, the disclosed concepts provide a method to perform multi-band fusion. The method includes receiving an image, the image including a first type of channel (e.g., luma Y) and a plurality of other types of channels, each channel type being different (e.g., Cb and Cr); applying multi-band noise reduction to generate a multi-band pyramidal representation for each channel, wherein each channel's multi-band noise reduction is based on channel-specific and band-specific noise models (e.g., multi-level pyramidal representations of the Y, Cb, and Cr channels); determining a texture metric value for the first channel type, the texture metric value based on the first channel type's multi-band pyramidal representation (e.g., to identify pixels in smooth and not-smooth areas of the image); determining a blue-chroma metric value based on the plurality of other channel types, the blue-chroma metric value based on the multi-band pyramidal representations of the plurality of other channel types (e.g., to identify blue and not-blue areas of the image); de-noising aggressively and sharpening conservatively at least some of the pixels in the image's first (e.g., luma) channel having a texture metric value indicative of a smooth region and a blue-chroma metric value indicative of a blue pixel; de-noising conservatively and sharpening aggressively at least some of the pixels in the image's first (e.g., luma) channel having a texture metric value indicative of a not-smooth region and a blue-chroma metric value indicative of a not-blue pixel; combining, after de-noising, the first type of channel and the plurality of other types of channels to generate a filtered image (e.g., re-integrating the image's individual channels to create a single image); and storing the filtered image in a memory. For example, the filtered image may be stored in memory as a YCbCr image or an RGB image. In another embodiment, the filtered image may be compressed (e.g., as a JPEG image) before being stored in the memory. In another embodiment, the method may further comprise de-noising at least some of the pixels in each of the image's plurality of other types of channels. In one embodiment, the texture metric value may be based on a gradient between different pixels in the first channel's multi-band pyramidal representation (the pixels may be in the same or different bands within the pyramidal representation). In yet another embodiment, the blue-chroma metric value may be based on a partition of a chromaticity space using one or more threshold values. A computer-executable program to implement the method may be stored in any media that is readable and executable by a computer system (e.g., in a non-transitory computer-readable memory prior to execution).
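To make the flow of the claimed method concrete, the following is a deliberately small end-to-end sketch, not the patented implementation: the box-filter pyramid, the nearest-neighbor up-sampling, and every numeric threshold (t_cb, t_cr, tau) are assumptions added for illustration, and the image dimensions are assumed divisible by 2^(n_bands-1).

```python
import numpy as np

def toy_multiband_fusion(y, cb, cr, n_bands=4, t_cb=0.1, t_cr=0.0, tau=0.02):
    """Toy version of the claimed flow. Channels are assumed to be
    full-resolution floats: Y in [0, 1], Cb/Cr in [-1, 1]."""
    def down(a):   # band-split stand-in: 2x2 box filter + down-sample by 2
        return 0.25 * (a[::2, ::2] + a[1::2, ::2] + a[::2, 1::2] + a[1::2, 1::2])

    def up(a):     # 2x nearest-neighbor up-sample
        return np.repeat(np.repeat(a, 2, axis=0), 2, axis=1)

    # Multi-band (pyramidal) representation for each channel.
    pyr = {c: [img.astype(float)] for c, img in (("y", y), ("cb", cb), ("cr", cr))}
    for bands in pyr.values():
        for _ in range(n_bands - 1):
            bands.append(down(bands[-1]))

    # Rebuild luma coarse-to-fine, scaling each band's high-frequency
    # detail with a gain chosen from the texture and blue-chroma tests.
    for i in range(n_bands - 2, -1, -1):
        base = up(pyr["y"][i + 1])
        detail = pyr["y"][i] - base                              # detail band
        textured = np.abs(detail) > tau                          # texture metric
        blue = (pyr["cb"][i] >= t_cb) & (pyr["cr"][i] <= t_cr)   # blue metric
        gain = np.where(~textured & blue, 0.25,                  # smooth blue: de-noise hard
               np.where(textured & blue, 1.5, 1.0))              # textured blue: sharpen hard
        pyr["y"][i] = base + gain * detail

    # Chroma passes through unchanged here; de-noising the other
    # channels is described as a further embodiment in the text above.
    return pyr["y"][0], pyr["cb"][0], pyr["cr"][0]
```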
This disclosure pertains to systems, methods, and computer readable media to remove noise from, and optionally sharpen, a digital image. In general, techniques are disclosed that use a multi-band noise filter and a unique combination of texture and chroma metrics. More particularly, a novel texture metric may be used during multi-band filter operations on an image's luma channel to determine whether a given pixel is associated with a textured or a smooth (not-textured) region of the image. A novel chroma metric may be used during the same multi-band filter operation to determine whether the same pixel is associated with a blue or not-blue region of the image. Pixels identified as being associated with a smooth blue region may be aggressively de-noised and conservatively sharpened. Pixels identified as being associated with a textured blue region may be conservatively de-noised and aggressively sharpened. By coupling texture constraints with chroma constraints, it has been shown possible to mitigate noise in an image's smooth blue regions without affecting the edges/texture of other blue objects.
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the disclosed concepts. As part of this description, some of this disclosure's drawings represent structures and devices in block diagram form in order to avoid obscuring the novel aspects of the disclosed concepts. In the interest of clarity, not all features of an actual implementation are described. Moreover, the language used in this disclosure has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter, resort to the claims being necessary to determine such inventive subject matter. Reference in this disclosure to “one embodiment” or to “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosed subject matter, and multiple references to “one embodiment” or “an embodiment” should not be understood as necessarily all referring to the same embodiment.
It will be appreciated that in the development of any actual implementation (as in any software and/or hardware development project), numerous decisions must be made to achieve the developers' specific goals (e.g., compliance with system- and business-related constraints), and that these goals may vary from one implementation to another. It will also be appreciated that such development efforts might be complex and time-consuming, but would nonetheless be a routine undertaking for those of ordinary skill in the design and implementation of a graphics processing system having the benefit of this disclosure.
In the previously cited work (incorporated by reference below), sharpening factors and de-noising strengths that use the multi-band decomposition and noise reduction technology described above have been disclosed. While superior to other methods, the earlier sharpening and de-noising factors may be fixed for a given capture/image. That is, the earlier approaches provide no natural or obvious mechanism to address the added noise in smooth blue regions such as the sky (e.g., due to sensor 105's low well capacity and its attendant weak red channel signal). That prior work is herein extended so that it can better differentiate between smooth and edge/texture regions. In the approach taken here, it can be helpful to think of the multi-band noise reduction (MBNR) filters 200 (luma channel), 245 (Cb channel), and 255 (Cr channel) as generating a pyramidal decomposition of their input images 205, 250, and 260, respectively. In this way, images (e.g., individual channels) may be manipulated based on pixels within a single band/layer or between different bands/layers.
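By way of illustration, such a per-channel pyramidal decomposition might be sketched as follows; the five-tap binomial kernel and the default of four bands are assumptions made for the example, not parameters taken from the prior work.

```python
import numpy as np

def blur(img):
    """Separable five-tap binomial low-pass filter (an illustrative
    stand-in for whatever band-split filter an implementation uses)."""
    k = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    img = np.apply_along_axis(lambda col: np.convolve(col, k, mode="same"), 0, img)
    return np.apply_along_axis(lambda row: np.convolve(row, k, mode="same"), 1, img)

def build_pyramid(channel, n_bands=4):
    """Return [band 0, ..., band n-1], band 0 being the full-resolution
    input and each later band a filtered, 2x down-sampled version of
    the band above it."""
    bands = [channel.astype(float)]
    for _ in range(n_bands - 1):
        bands.append(blur(bands[-1])[::2, ::2])
    return bands
```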
To determine whether a pixel $p_i(x, y)$ in band i belongs to a smooth region or to an edge/texture portion of an image, horizontal and vertical gradients on the luma (Y) channel may be determined:

$$dx = Y_i(x+1, y) - Y_i(x, y), \quad\text{and} \tag{EQ. 1A}$$
$$dy = Y_i(x, y+1) - Y_i(x, y), \tag{EQ. 1B}$$
where dx represents the horizontal or 'x' gradient, dy represents the vertical or 'y' gradient, x and y represent the coordinates of the pixel whose gradients are being found, and $Y_i(x, y)$ represents the luma channel value of the pixel at location (x, y) in the i-th band. In one embodiment, a textureness metric may be taken as the maximum of the two gradient values, max(dx, dy). In other embodiments, a textureness metric could be the mean, the median, or the Euclidean norm $\sqrt{dx^2 + dy^2}$ of the two gradient values. In practice, any measure appropriate for a given implementation may be used; Sobel- and Canny-type edge detectors, for example, may also be employed.
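As a concrete sketch of EQ. 1 and the max(dx, dy) variant: taking absolute values and the 0.02 cut-off (for luma normalized to [0, 1]) are assumptions added here for illustration.

```python
import numpy as np

def textureness(y_i, threshold=0.02):
    """EQ. 1-style forward differences on one luma band.
    Returns the per-pixel metric and a boolean 'textured' mask."""
    dx = np.zeros_like(y_i, dtype=float)
    dy = np.zeros_like(y_i, dtype=float)
    dx[:, :-1] = np.abs(y_i[:, 1:] - y_i[:, :-1])   # Y_i(x+1, y) - Y_i(x, y)
    dy[:-1, :] = np.abs(y_i[1:, :] - y_i[:-1, :])   # Y_i(x, y+1) - Y_i(x, y)
    metric = np.maximum(dx, dy)                     # the max(dx, dy) variant
    return metric, metric > threshold
```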
To reduce its sensitivity to noise, the textureness metric may instead be based on a scaled-up version of the next band (pyramid level):
$$dx = Y_{i+1}(x+1, y) - Y_{i+1}(x, y), \quad\text{and} \tag{EQ. 2A}$$
$$dy = Y_{i+1}(x, y+1) - Y_{i+1}(x, y), \tag{EQ. 2B}$$
where $Y_{i+1}(x, y)$ represents the luma channel value at location (x, y) in band i+1. Since each band is a filtered and down-sampled version of the band immediately above it (e.g., compare output band Y4 240 to output band Y3 235), determining an edge/texture metric on a scaled-up version of the next lower (coarser) band yields a textureness metric that captures only significant edges and textures. This allows an MBNR filter to de-noise smooth areas more and sharpen them less, while de-noising textured regions less and sharpening them more. EQS. 1 and 2 provide a metric wherein the degree of sharpening may be proportional to the textureness value: a higher value may be used to apply more sharpening. EQS. 1 and 2 likewise provide a measure wherein the degree of de-noising may be inversely proportional to the edge/texture strength (edge/texture pixels are de-noised less, while smooth pixels are de-noised more).
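Re-using the build_pyramid and textureness helpers sketched above, the EQ. 2 variant might look like this; nearest-neighbor up-sampling is assumed for brevity (a real pipeline would likely interpolate):

```python
import numpy as np

def upsample(band, shape):
    """2x nearest-neighbor up-sample, trimmed to `shape`."""
    out = np.repeat(np.repeat(band, 2, axis=0), 2, axis=1)
    return out[:shape[0], :shape[1]]

def robust_textureness(y_pyramid, i, threshold=0.02):
    """EQ. 2: measure the gradients on a scaled-up version of band i+1,
    so that noise present in band i does not register as texture."""
    coarse = upsample(y_pyramid[i + 1], y_pyramid[i].shape)
    return textureness(coarse, threshold)   # re-uses the EQ. 1 helper above
```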
Another metric that may be used to determine whether a pixel belongs to a smooth region is the difference between a pixel in the i-th band and the corresponding pixel in the up-sampled version of the next lower (i+1) band:
$$\Delta_{band} = Y_i(x, y) - \left[Y_{i+1}(x, y)\right]\uparrow_N \tag{EQ. 3}$$
where $\uparrow_N$ denotes up-sampling by a factor of N.
A low $\Delta_{band}$ value may indicate that the pixel belongs to a smooth region, while a large value may indicate that the pixel belongs to an edge/texture region. The edge-strength measure described earlier, coupled with the high-frequency estimate of EQ. 3, can provide a very robust technique for determining whether a pixel lies in an edge/texture region. With these extensions, smooth areas may again be de-noised more and sharpened less, while edge/texture regions may be de-noised less and sharpened more.
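The EQ. 3 estimate falls out of the same helpers; taking the magnitude for the smooth/texture decision and the 0.02 cut-off are, again, assumptions of the example (here with N = 2, matching the 2x pyramid sketched earlier):

```python
import numpy as np

def band_difference(y_pyramid, i, threshold=0.02):
    """EQ. 3: delta_band = Y_i - (Y_{i+1} up-sampled), re-using the
    upsample helper above. Small |delta_band| suggests a smooth region,
    large values an edge/texture."""
    delta = y_pyramid[i] - upsample(y_pyramid[i + 1], y_pyramid[i].shape)
    return delta, np.abs(delta) > threshold   # signed estimate + 'textured' mask
```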
Turning to the chroma channels, whether a pixel belongs to a blue region may be determined by partitioning the chromaticity space with one or more threshold values. In one embodiment, a pixel may be identified as blue if:
$$f(T_{Cb}) \leq Cb \leq 1, \quad\text{and} \tag{EQ. 4A}$$
$$-1 \leq Cr \leq g(T_{Cr}), \tag{EQ. 4B}$$
where $T_{Cb}$ and $T_{Cr}$ represent Cb and Cr chromaticity thresholds, respectively, f(·) represents a first threshold function, g(·) represents a second threshold function, and Cb and Cr refer to the chroma of the pixel being de-noised.
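A minimal sketch of the EQ. 4 test, assuming chroma values normalized to [-1, 1], identity threshold functions f(·) and g(·), and purely illustrative threshold values:

```python
import numpy as np

def is_blue(cb, cr, t_cb=0.1, t_cr=0.0):
    """EQ. 4: a pixel is 'blue' when f(T_Cb) <= Cb <= 1 and -1 <= Cr <= g(T_Cr).
    Here f and g are taken as identity and the thresholds are examples only."""
    return (cb >= t_cb) & (cb <= 1.0) & (cr >= -1.0) & (cr <= t_cr)
```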
Threshold values for $T_{Cb}$ and $T_{Cr}$ (EQ. 4) may be unique to each implementation. Once determined, however, these values may be used in combination with MBNR filter elements (e.g., 310, 330, and 350) within both the luma (e.g., Y 205) and chroma (e.g., Cb 250 and Cr 260) channels to generate a de-noised and (optionally) sharpened image. More specifically, threshold values for each luma channel band may be used in conjunction with each band's noise model (e.g., via an MBNR filter element) to determine whether a pixel is textured or not textured. Similarly, threshold values for each band of each chroma channel (e.g., chroma channels Cb 250 and Cr 260) may be used in conjunction with each chroma band's noise model (also via an MBNR filter element) to determine whether a pixel is blue or not blue. In one embodiment, when a pixel is determined to be associated with a smooth blue region of an image, it may be heavily (aggressively) de-noised and moderately (conservatively) sharpened. If a pixel is determined to be associated with a textured blue region of an image, it may be conservatively de-noised and aggressively sharpened. For pixels that do not satisfy the aforementioned "blue" criteria, de-noising and sharpening strengths may be based on edge/texture strength and high-frequency measures. That said, the approach described herein may be easily extended to other colors, such as skin tones, or to high-level features, such as faces.
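Combining the two per-pixel decisions into de-noising and sharpening strengths could then look like the sketch below; the numeric strengths are placeholders, since the text fixes only their ordering (smooth blue: aggressive de-noise and conservative sharpen; textured blue: the reverse):

```python
import numpy as np

def select_strengths(textured, blue):
    """Map boolean texture/blue masks to per-pixel (de-noise, sharpen)
    strengths; all numeric values are illustrative placeholders."""
    denoise = np.where(~textured & blue, 1.00,        # smooth blue: aggressive
              np.where(textured & blue, 0.25, 0.50))  # textured blue: conservative
    sharpen = np.where(~textured & blue, 0.25,        # smooth blue: conservative
              np.where(textured & blue, 1.00, 0.50))  # textured blue: aggressive
    # Per the text, not-blue pixels would instead scale with the
    # edge/texture strength itself; a constant stands in here.
    return denoise, sharpen
```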
It is to be understood that the above description is intended to be illustrative, and not restrictive. The material has been presented to enable any person skilled in the art to make and use the disclosed subject matter as claimed and is provided in the context of particular embodiments, variations of which will be readily apparent to those skilled in the art (e.g., some of the disclosed embodiments may be used in combination with each other). For example, MBNR operations are not restricted to implementations using four bands; other embodiments may use fewer or more than four.
This application claims priority to U.S. Patent Application Ser. No. 62/214,514, entitled "Advanced Multi-Band Noise Reduction," filed Sep. 4, 2015, and U.S. Patent Application Ser. No. 62/214,534, entitled "Temporal Multi-Band Noise Reduction," filed Sep. 4, 2015, both of which are incorporated herein by reference. In addition, U.S. patent application Ser. No. 14/474,100, entitled "Multi-band YCbCr Noise Modeling and Noise Reduction based on Scene Metadata," and U.S. patent application Ser. No. 14/474,103, entitled "Multi-band YCbCr Locally-Adaptive Noise Modeling and Noise Reduction based on Scene Metadata," both filed Aug. 30, 2014, and U.S. Patent Application Ser. No. 61/656,078, entitled "Method of and Apparatus for Image Enhancement," filed Jun. 6, 2012, are incorporated herein by reference.