The present invention relates to CMOS imagers and, more particularly, to methods and systems for forming high dynamic range (HDR) images.
Color digital imaging systems, such as digital cameras, typically employ a single image sensor, such as a charge-coupled device (CCD) or a complementary metal oxide semiconductor (CMOS) device, to digitally capture a scene of interest. Image sensors typically include an array of optical detectors, such as photodiodes, that generate an electrical response in proportion to the intensity of incident light. The dynamic range of an individual optical detector is bounded at the low end by the minimum amount of light required to generate an electrical response and at the high end by the maximum amount of light beyond which the electrical response of the optical detector does not change (i.e. its saturation point).
The dynamic range of an image sensor is an important characteristic when capturing high contrast images. When bright and/or dark areas of an image exceed the dynamic range of an image sensor, the quality of the captured image may be degraded. If the sensitivity of the image sensor is adjusted, such as by decreasing the exposure time so that the features of the bright areas are sufficiently captured, then the features of the dark areas are not sufficiently captured.
One technique for capturing high contrast images with a digital sensor involves capturing two images of the same scene in rapid succession, with the sensitivity of the image sensor set to capture the bright areas in a first image and the dark areas in a second image. The two images may then be used to create a composite image that includes the features of both the bright and dark areas.
Although the two-image technique may extend the dynamic range of an image sensor, changes in the scene between the time of capturing the first and second images may introduce motion artifacts that degrade the quality of the combined image.
The invention may be understood from the following detailed description when read in connection with the accompanying drawing. Included in the drawing are the following figures:
Aspects of the present invention relate to the capture of high dynamic range (HDR) images with objects, such as people, included in a scene. An HDR image that includes objects is generated by using a low dynamic range exposure of the object in the scene, as well as a high dynamic range exposure of the scene itself (i.e. without the object). Because an image of people generally tends to have a limited dynamic range, a single exposure at a suitable exposure setting may be used to cover the entire dynamic range for the people in the scene.
Shutter 102, imaging optics 104 and image sensor 106 are conventional devices that are well known in digital imaging devices. Display 110 and memory 114 are also well known in digital imaging devices. Image sensor 106 may be a CCD or a CMOS device. Image sensor 106 includes an array of optical detectors, referred to as pixels, which individually detect light and generate electrical responses in proportion to received light. In order to capture color images, image sensor 106 also includes color filters, for example, a color filter array (CFA) that separates incident light into different colors, such as red, blue and green, and passes the filtered light to the pixels. In an embodiment, shutter 102, imaging optics 104, display 110, user interface 112 and memory 114 are integrated with image sensor 106 and image processor 108 into a digital camera.
Image data generated from image sensor 106 is provided to image processor 108 as captured images and stored in memory 114. Image processor 108 uses the captured images to generate an HDR image that may be displayed on display 110 and/or provided to memory 114. In particular, image processor 108 uses the captured images of a scene over a range of exposure settings and another image of an object in the scene to generate the HDR image, described further below with respect to
An HDR image is typically captured using multiple exposures so that both the low and high scene luminance components may be captured with sufficient accuracy. Constructing an HDR image therefore typically takes longer than generating an image from a single exposure. For substantially stationary objects, such as a house, a longer period for capturing multiple images is not a concern. For scenes containing objects such as people and/or animals, however, a long capture period and longer exposure times may create problems with motion artifacts.
Frequently, image registration is used during the merging of multiple images. Image registration refers to a process of providing point by point correspondence among multiple images in a scene. An example of image registration is provided in U.S. Pat. No. 7,142,723 issued to Kang et al. The image registration process, however, may create artifacts (e.g., edge blurs or ghost images) in the resulting HDR image in regions where objects move between various exposures.
In general, imaging device 100 is typically placed on a tripod (not shown) to minimize the effects of any camera motion during image capture. By providing imaging device 100 on a tripod, the scene may be considered substantially constant over time. It may be appreciated that, if the scene is substantially constant, image processor 108 may generate an HDR image from multiple captured images of the scene over a range of exposure times, without needing to detect motion, apply motion compensation processing or perform image registration. The exposure for each image may be controlled by varying the shutter speed, the F-number (i.e. the lens aperture setting, which affects the brightness of the image) or the exposure time of image sensor 106.
As described above, high exposure settings tend to represent dark areas more accurately, whereas low exposure settings tend to represent bright areas more accurately. Images captured over the range of low to high exposure settings may be combined into a single high dynamic range image. The “dynamic range” of an image, as used herein, refers to the range from the lowest pixel brightness value in the image (corresponding to the lowest detected intensity) to the highest pixel brightness value in the image (corresponding to the highest detected intensity).
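As an informal illustration only (not part of the described embodiments), the dynamic range of a captured image may be expressed in decibels from its highest and lowest non-zero pixel brightness values; the sketch below assumes an 8-bit single-channel image and the common 20·log10 convention:

```python
import numpy as np

def dynamic_range_db(image):
    """Dynamic range of an image in decibels, taken here as
    20 * log10(highest brightness / lowest non-zero brightness)."""
    values = image[image > 0].astype(np.float64)
    return 20.0 * np.log10(values.max() / values.min())

# Example with a synthetic 8-bit image spanning values 1..255
img = np.random.randint(1, 256, size=(480, 640)).astype(np.uint8)
print(f"dynamic range = {dynamic_range_db(img):.1f} dB")
```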
If, however, an object, such as a person, is included in the scene and the object changes more quickly than the scene or is prone to motion, it may be difficult to capture an HDR image without motion compensating the object. For example, a relatively stationary person may be captured in an image with a short exposure time of about 1/30 of a second without detectable movement. When the exposure time increases to about 1 second, for example, even a relatively stationary person may move enough during the capture to degrade the HDR image quality.
The entire HDR capture process may take about 1 to 2 minutes, including all the exposures and preparation time. Accordingly, if a person (relatively stationary) is included in a scene (substantially constant) and an HDR image is generated from multiple images over a range of exposure times, the quality of the image may be degraded by motion artifacts. When people are in the scene during such an HDR capture process, direct merging of the individual images in the HDR sequence may blur the edges of the moving elements and degrade the quality of the HDR image.
In order to include objects that may move in a scene, imaging device 100 captures multiple images of the scene (referred to herein as multiple scene images) over a range of exposure times T1, T2, . . . , TN (i.e., over a high dynamic range). For example, the range of exposure times may extend from T1 of about 1/8000 of a second to TN of about 20 seconds, over 14 different exposure times at a fixed aperture value.
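For illustration, an exposure ladder matching the example above (14 exposure times from about 1/8000 of a second to about 20 seconds at a fixed aperture) can be generated as a geometric sequence; equal ratios between successive exposure times are an assumption made for this sketch, not a requirement stated above:

```python
import numpy as np

def exposure_ladder(t_min=1/8000, t_max=20.0, n=14):
    """Exposure times T1..TN spaced geometrically between t_min and t_max
    (equal-ratio spacing is an illustrative assumption)."""
    return np.geomspace(t_min, t_max, n)

for i, t in enumerate(exposure_ladder(), start=1):
    print(f"T{i}: {t:.6f} s")
```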
In addition, an object may be placed in the scene and then imaging device 100 may capture another image of that object (referred to herein as an object image) at a single low exposure time suitable for capturing the object. The single object image may be captured with a low dynamic range.
Image processor 108 forms an HDR image using the multiple scene images that do not include the object and merges the other image that includes the object. By merging the multiple scene images with the single object image, the process of the invention uses images with substantially no motion. Accordingly, motion artifacts are reduced.
User interface 112 may be used to selectively capture the multiple scene images or the object image. Depending upon the settings selected by user interface 112, image processor 108 may be directed to adjust one or more of the speed of shutter 102, the focus of imaging optics 104, or the exposure time of image sensor 106. Although image processor 108 is illustrated as controlling adjustment of shutter 102, imaging optics 104 and image sensor 106, it will be understood that imaging device 100 may be configured by other means to adjust shutter speed, focus, F-number, exposure time and/or any other parameter in order to capture images with different exposure settings and provide both high and low dynamic range images.
A predetermined range of exposure settings may be used to capture the multiple scene images. For example, the range of exposure settings may be stored in memory 114. To capture another single object image, imaging device 100 may determine a suitable exposure setting or exposure time Ti for the object, depending upon lighting conditions, distance to the object, focusing of the object, etc. For example, imaging device 100 may capture a test image and adjust the exposure settings to optimize capture of the single object image. The exposure settings for the object image may be adjusted, for example, based on a lookup table stored in memory 114. It will be understood that exposure time Ti of the object image may be different from exposure times of the multiple captured scene images. In one embodiment, a range of illumination for the multiple scene images may be greater than 100 dB (decibels) whereas a range of illumination for the single object image may be less than about 30 dB.
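One hypothetical way to derive the single exposure time Ti for the object image from a test capture is to scale a reference exposure toward a target mean brightness; this is only a sketch of the idea, not the lookup-table approach described above, and the target level and clamp limits are assumptions:

```python
import numpy as np

def object_exposure_time(test_image, test_exposure_s,
                         target_mean=0.45, t_min=1/8000, t_max=1/30):
    """Estimate a single low exposure time for the object image by scaling the
    test exposure so that the mean brightness approaches target_mean (on a
    0..1 scale). The target level and the clamp range are illustrative only."""
    mean = test_image.astype(np.float64).mean() / 255.0
    t = test_exposure_s * target_mean / max(mean, 1e-6)
    return float(np.clip(t, t_min, t_max))

# Example: a mid-gray test frame captured at 1/125 s
test = np.full((480, 640), 90, dtype=np.uint8)
print(object_exposure_time(test, 1/125))
```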
Controller 202 controls radiance image generator 204, mask image generator 206 and merge generator 208 for generating an HDR image from multiple captured scene images (over a range of exposure settings) and another object image (at a low exposure setting). Controller 202 may also receive selection information from user interface 112 and adjust one or more settings of shutter 102, imaging optics 104 or image sensor 106, according to whether multiple scene images or a single object image is selected.
Radiance image generator 204 receives the multiple images of the scene (having a high dynamic range) and forms a first radiance image from the multiple images. Radiance image generator 204 also receives the other image of the object in the scene (having a low dynamic range) and forms a second radiance image. An example of radiance image generation is described in U.S. Pat. No. 7,142,723 issued to Kang et al. In general, input images are converted to a radiance image using a known exposure value and a known imaging device response function Fresponse. A final radiance value at a pixel is typically computed as a weighted sum of the corresponding pixels in the radiance images. The radiance of an individual pixel is typically provided as:

Rad = Fresponse^-1(p) / exp
where Rad is the radiance of a pixel, p is the pixel intensity and exp is an exposure level. The imaging device response function may account for known imaging device parameters used to capture the multiple scene images or the object image, for example, the speed of shutter 102 and the aperture settings of imaging optics 104. In one embodiment, suitable imaging device parameters are stored, for example, as metadata information associated with each of the images. Referring to
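A minimal sketch of the per-pixel radiance computation Rad = Fresponse^-1(p)/exp, assuming a simple gamma curve in place of the calibrated device response function (which is device-specific and not given here):

```python
import numpy as np

def inverse_response(p, gamma=2.2):
    """Illustrative inverse response: maps 8-bit pixel values back to a
    linear exposure estimate. A real device response would be calibrated."""
    return (p.astype(np.float64) / 255.0) ** gamma

def pixel_radiance(p, exposure_s, gamma=2.2):
    """Rad = Fresponse^-1(p) / exp, applied to every pixel of one image."""
    return inverse_response(p, gamma) / exposure_s
```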
Accordingly, a first radiance image is generated from the multiple scene images, by combining the pixel radiances from the multiple scene images, and a second radiance image is generated from the single object image. With respect to generation of the first radiance image, a weighting function is typically used to average the linear exposures together. Referring to
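To illustrate how the first radiance image may be formed from the multiple scene images, the sketch below uses a triangular (“hat”) weighting function that favors mid-range pixel values, in the spirit of Debevec et al.; the specific weighting function and the gamma response are assumptions, since only the use of a weighting function is stated above:

```python
import numpy as np

def hat_weight(p):
    """Triangular weight: largest for mid-range pixel values, smallest near
    the noise floor (0) and near saturation (255)."""
    p = p.astype(np.float64)
    return np.minimum(p, 255.0 - p) / 127.5

def scene_radiance_image(images, exposures, gamma=2.2):
    """Weighted average of per-exposure radiance estimates.
    images: list of 8-bit arrays; exposures: matching list of times in seconds.
    Uses the same illustrative gamma response as the earlier radiance sketch."""
    num = np.zeros(images[0].shape, dtype=np.float64)
    den = np.zeros_like(num)
    for p, t in zip(images, exposures):
        w = hat_weight(p)
        rad = ((p.astype(np.float64) / 255.0) ** gamma) / t  # Rad = F^-1(p)/exp
        num += w * rad
        den += w
    return num / np.maximum(den, 1e-6)
```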
Mask image generator 206 receives at least one of the multiple scene images and the object image and subsequently generates a masked image. In one embodiment, controller 202 selects one of the multiple scene images having an exposure time similar to the exposure time of the object image. In another embodiment, mask image generator 206 receives the multiple scene images and selects one of them based on exposure time. In one embodiment, the selected scene image and the object image are subtracted from each other to form a differential image. The difference between the selected scene image and the object image emphasizes the object in the scene, because features that are common to both images (i.e. the remainder of the scene) are substantially minimized.
To generate a masked image, a boundary is formed around the object in the differential image. The regions that include the object may be given maximum pixel values (e.g. 1), whereas regions that do not include the object may be given minimum pixel values (e.g. 0). A second (i.e. inverse) masked image is generated from the inverse of the masked image. It may be appreciated that the boundary between the regions excluding/including the object may have a sharp (i.e. binary) transition (e.g. 0 to 1) or a soft transition (e.g. a slope from 0 to 1). For example, a soft transition may make the merged object in the HDR image appear more natural to the human eye.
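A simplified sketch of the mask generation: subtract the selected scene image from the object image, threshold the difference to bound the object region, and feather the boundary for a soft transition. The threshold value and the Gaussian feathering (SciPy is assumed to be available) are illustrative choices, not the method prescribed above:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_masks(scene_image, object_image, threshold=30, feather_sigma=5.0):
    """Masked image: approaches 1 over the object and 0 over the rest of the
    scene, with a soft (feathered) boundary. Also returns the inverse mask."""
    diff = np.abs(object_image.astype(np.float64) -
                  scene_image.astype(np.float64))
    mask = (diff > threshold).astype(np.float64)     # 1 where the object differs
    mask = np.clip(gaussian_filter(mask, sigma=feather_sigma), 0.0, 1.0)
    return mask, 1.0 - mask
```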
In one embodiment, mask image generator 206 receives the first and second radiance images generated by radiance image generator 204 and, using the masked image and inverse masked image, forms respective first and second masked radiance images. In another embodiment, merge generator 208 forms the first and second masked radiance images, using the masked image and inverse masked image generated by mask image generator 206. In a further embodiment, controller 202 may form the first and second masked radiance images, using the masked image and inverse masked image, and subsequently provide the masked first and second radiance images to merge generator 208.
The first masked radiance image is formed by multiplying the inverse masked image and the first radiance image. In this manner, the scene, without the object, is included in the masked first radiance image. The second masked radiance image is formed by multiplying the masked image and the second radiance image. Accordingly, the object, without the scene, is included in the second masked radiance image.
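Continuing the sketch, the two masked radiance images follow by element-wise multiplication, using the mask convention above (all function and variable names are illustrative):

```python
def masked_radiance_images(first_radiance, second_radiance, mask, inverse_mask):
    """first_radiance: HDR radiance of the scene; second_radiance: radiance of
    the object image. The inverse mask (0 over the object, 1 elsewhere) keeps
    the scene in the first masked radiance image; the mask (1 over the object)
    keeps the object in the second masked radiance image."""
    return inverse_mask * first_radiance, mask * second_radiance
```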
Merge generator 208 receives the first and second radiance images, along with the masked image and the inverse masked image (or the masked first and second radiance images), and generates a merged HDR image. As one example, the masked first and second radiance images may be summed together. In one embodiment, the summation at the borders of the two regions is processed differently from the rest of the image. A gradient weighting function is applied so that a smooth transition is achieved between the two regions. Additional adjustments in white balance, exposure or tonal mapping may be applied to either or both of the two regions to achieve the most pleasing final HDR image.
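A minimal sketch of the merge: sum the two masked radiance images. Because a soft mask and its inverse add up to one at every pixel, the feathered boundary already acts as a gradient-weighted blend between the two regions; white balance, exposure and tonal-mapping adjustments are omitted except for an illustrative Reinhard-style tone curve used only to produce a displayable result:

```python
import numpy as np

def merge_hdr(first_masked, second_masked):
    """Merged HDR radiance: scene region plus object region. With a soft mask,
    pixels near the boundary are a gradient-weighted blend of the two images."""
    return first_masked + second_masked

def tone_map(radiance):
    """Illustrative global tone mapping (r / (1 + r) after mean normalization);
    the choice of tonal mapping is left open in the description above."""
    r = radiance / max(float(radiance.mean()), 1e-9)
    return np.clip(255.0 * r / (1.0 + r), 0, 255).astype(np.uint8)
```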
Referring now to
The steps illustrated in
At step 300, multiple scene images are captured and stored in memory 114 (
Referring back to
Referring again to
At step 312, the first radiance image is masked by an inverse masked image, for example, by mask image generator 206 (
Referring to
Although the invention is illustrated and described herein with reference to specific embodiments, the invention is not intended to be limited to the details shown. Rather, various modifications may be made in the details within the scope and range of equivalents of the claims and without departing from the invention.
U.S. Patent Documents

Number | Name | Date | Kind |
---|---|---|---|
5144442 | Ginosar et al. | Sep 1992 | A |
5309243 | Tsai | May 1994 | A |
5828793 | Mann | Oct 1998 | A |
5914748 | Parulski et al. | Jun 1999 | A |
7057650 | Sakamoto | Jun 2006 | B1 |
7061524 | Liu et al. | Jun 2006 | B2 |
7127084 | Mauk | Oct 2006 | B1 |
7142723 | Kang et al. | Nov 2006 | B2 |
7239805 | Uyttendaele et al. | Jul 2007 | B2 |
7301563 | Kakinuma et al. | Nov 2007 | B1 |
7349119 | Tsukioka | Mar 2008 | B2 |
7443533 | Lin | Oct 2008 | B2 |
7495699 | Nayar et al. | Feb 2009 | B2 |
7519907 | Cohen et al. | Apr 2009 | B2 |
7612804 | Marcu et al. | Nov 2009 | B1 |
20020154829 | Tsukioka | Oct 2002 | A1 |
20030095192 | Horiuchi | May 2003 | A1 |
20030151689 | Murphy | Aug 2003 | A1 |
20040100565 | Chen et al. | May 2004 | A1 |
20050275747 | Nayar et al. | Dec 2005 | A1 |
20060002611 | Mantiuk et al. | Jan 2006 | A1 |
20060017837 | Sorek et al. | Jan 2006 | A1 |
20060083440 | Chen | Apr 2006 | A1 |
20060114333 | Gokturk et al. | Jun 2006 | A1 |
20060192867 | Yosefin | Aug 2006 | A1 |
20060192873 | Yaffe | Aug 2006 | A1 |
20060209204 | Ward | Sep 2006 | A1 |
20060251216 | Allred et al. | Nov 2006 | A1 |
20070002164 | Ward et al. | Jan 2007 | A1 |
20070242900 | Chen et al. | Oct 2007 | A1 |
20070257184 | Olsen et al. | Nov 2007 | A1 |
20080023623 | Davidovici | Jan 2008 | A1 |
Foreign Patent Documents

Number | Date | Country |
---|---|---|
2005-175870 | Jun 2005 | JP |
2007-221423 | Aug 2007 | JP |
2001-0036040 | May 2001 | KR |
WO 2006073875 | Jul 2006 | WO |
Other Publications

Debevec et al., “Recovering High Dynamic Range Radiance Maps from Photographs”, Proceedings of SIGGRAPH 1997, Computer Graphics Proceedings, Annual Conference Series, pp. 369-378.
Mann et al., “On Being ‘Undigital’ with Digital Cameras: Extending Dynamic Range by Combining Differently Exposed Pictures”, In Proc. IS&T's 48th Annual Conference, pp. 422-428, Washington, D.C., May 7-11, 1995.
Eden et al., “Seamless Image Stitching of Scenes with Large Motions and Exposure Differences”, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, 2006.
Xiao et al., “High Dynamic Range Imaging of Natural Scenes”, Tenth Color Imaging Conference: Color Science.
Capra et al., “Adaptive Image Data Fusion for Consumer Devices Application”, IEEE 6th Workshop on Multimedia Signal Processing, 2004, pp. 243-246.
Khan et al., “Ghost Removal in High Dynamic Range Images”, IEEE International Conference, 2006.
Cho et al., “Extending Dynamic Range of Two Color Images under Different Exposures”, IEEE Proceedings of the 17th International Conference on Pattern Recognition, 2004.
Razlighi et al., “Correction of Over-Exposed Images Captured by Cell-Phone Cameras”, IEEE International Symposium on Publication, 2007.
Number | Date | Country |
---|---|---|
20090274387 A1 | Nov 2009 | US |