An aspect of the invention relates to a method of processing a set of images that have been successively captured. The method may be applied in, for example, digital photography so as to subjectively improve an image that has been captured with flashlight. Other aspects of the invention relate to an image processor, an image-capturing apparatus, and a computer-program product for an image processor.
The article entitled “Flash Photography Enhancement via Intrinsic Relighting” by Elmar Eisemann et al., Siggraph 2004, Los Angeles, USA, Aug. 8-12, 2004, Volume 23, Issue 3, pages: 673-678, describes a method of enhancing photographs shot in dark environments. A picture taken with the available light is combined with one taken with a flash. A bilateral filter decomposes the pictures into detail and large scale. An image is reconstructed using the large scale of the picture taken with the available light, on the one hand, and the detail of the picture taken with the flash, on the other hand. Accordingly, the ambience of the original lighting is combined with the sharpness of the flash image. It is mentioned that advanced approaches could be used to compensate for subject motion.
According to an aspect of the invention, a set of images that have been successively captured comprises a plurality of images that have been captured under substantially similar light conditions, and an image that has been captured under substantially different light conditions. A motion indication is derived from at least two images that have been captured under substantially similar light conditions. The image that has been captured under substantially different light conditions is processed on the basis of the motion indication derived from the at least two images that have been captured under substantially similar light conditions.
The invention takes the following aspects into consideration. When an image is captured with a camera, one or more objects that form part of the image may move with respect to the camera. For example, an object that forms part of the image may move with respect to another object that also forms part of the image. The camera can track one of those objects only. All objects that form part of the image will generally move if the person holding the camera has a shaky hand.
An image may be processed in a manner that takes into account respective motions of objects that form part of the image. Such motion-based processing may enhance image quality as perceived by human beings. For example, blur caused by one or more moving objects can be prevented. Motion can be compensated when a combination is made of two or more images captured at different instants. Motion-based processing may further be used to encode the image so that a relatively small amount of data can represent the image with satisfactory quality. Motion-based image processing generally requires some form of motion estimation, which provides indications of respective motions in various parts of the image.
Motion estimation may be carried out in the following manner. The image of interest is compared with a so-called reference image, which has been captured at a different instant, for example, just before or just after the image of interest has been captured. The image of interest is divided into several blocks of pixels. For each block of pixels, a block of pixels in the reference image is searched that best matches the block of pixels of interest. In case of motion, there will be a relative displacement between the two aforementioned blocks of pixels. The relative displacement provides a motion indication for the block of pixels of interest. Accordingly, a motion indication can be established for each block of pixels in the image of interest. The respective motion indications constitute a motion indication for the image as a whole. Such motion estimation is commonly referred to as block-matching motion estimation. Video encoding in accordance with a Moving Pictures Expert Group (MPEG) standard typically uses block-matching motion estimation.
Block-matching motion estimation will generally be unreliable when the image of interest and the reference image have been captured under different light conditions. This may be the case, for example, if the image of interest has been captured with ambient light whereas the reference image has been captured with flashlight, or vice versa. Block-matching motion estimation takes luminance into account when searching for the best match between a block of pixels in the image of interest and a block of pixels in the reference image. Consequently, block-matching motion estimation may find that, in the image of interest, a block of pixels, which has a given luminance, best matches a block of pixels that has similar luminance in the reference image. However, the respective blocks of pixels may belong to different objects.
For example, let it be assumed that a first image is captured with ambient light and a second image is captured with flashlight. In the first image, there is an object X that appears to be light gray and another object Y that appears to be dark gray. In the second image, which is captured with flashlight, the object X may appear to be white and the object Y may appear to be light gray. There is a serious risk that a block-matching motion estimation finds that a light-gray block of pixels in the first image, which belongs to object X, best matches with a similar light-gray block of pixels in the second image, which belongs to object Y. The block-matching motion estimation will thus produce a motion indication that relates to the location of object X in the first image with respect to the location of object Y in the second image. The block-matching motion estimation has thus confused objects. The motion indication is wrong.
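By way of illustration only, the confusion described above can be reproduced with a one-dimensional toy example. The pixel values, the helper name `best_match_offset`, and the use of a sum of absolute differences as matching cost are assumptions made for this sketch; they are not taken from the description.

```python
import numpy as np

def best_match_offset(block, reference, start):
    """Return the displacement of the best match of `block` in `reference`.

    The cost is the sum of absolute differences (an assumed, common choice).
    `start` is the position of the block in the image of interest.
    """
    n = len(block)
    costs = [np.abs(reference[p:p + n] - block).sum()
             for p in range(len(reference) - n + 1)]
    return int(np.argmin(costs)) - start

# One row of pixels; nothing moves between the two captures.
# Object X sits at positions 2-3, object Y at positions 6-7.
ambient = np.array([20, 20, 160, 160, 20, 20, 80, 80, 20, 20])     # X light gray, Y dark gray
flash = np.array([20, 20, 250, 250, 20, 20, 160, 160, 20, 20])     # X white, Y light gray

block_x = ambient[2:4]  # block of pixels that belongs to object X

same_light = best_match_offset(block_x, ambient, start=2)   # 0: correct, no motion
mixed_light = best_match_offset(block_x, flash, start=2)    # 4: X is matched to Y
```

In this sketch the match against an image with the same light conditions yields a zero displacement, whereas the match against the flashlight row points at object Y although no object has moved.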
It is possible to apply a different motion estimation technique, which is less sensitive to differences in light conditions under which respective images have been captured. For example, the motion estimation operation may be arranged so that luminance or brightness information is ignored. Only color information is taken into account. Nevertheless, such color-based motion estimation generally does not provide sufficiently precise motion indications. The reason for this is that color comprises less detail than luminance. Another possibility is to base motion estimation on edge information. A high pass filter can extract edge information from an image. Variations in pixel values are considered rather than the pixel values themselves. Even such edge-based motion estimation provides relatively imprecise motion indications in quite a number of cases. The reason for this is that edge information is generally affected too when light conditions change. In general, any motion estimation technique is to a certain extent sensitive to different light conditions, which may lead to erroneous motion indications.
In accordance with the aforementioned aspect of the invention, a motion indication is derived from at least two images that have been captured under substantially similar light conditions. An image that has been captured under substantially different light conditions is then processed on the basis of the motion indication derived from the at least two images that have been captured under substantially similar light conditions.
The motion indication is relatively precise with respect to the at least two images that have been captured under substantially similar light conditions. This is because motion estimation has not been disturbed by differences in light conditions. However, the motion indication derived from the at least two images that have been captured under substantially similar light conditions does not directly relate to the image that has been captured under substantially different light conditions. This is because the latter image has not been taken into account in the process of motion estimation. This may introduce some imprecision. In fact, it is assumed that motion is substantially continuous throughout an interval of time during which the images are captured. In general, this assumption is sufficiently correct in a great number of cases, so that any imprecision will generally be relatively modest. This is particularly true compared with imprecision due to differences in light conditions, as explained hereinbefore. Consequently, the invention allows a more precise indication of motion in an image that has been captured under substantially different light conditions. As a result, the invention allows relatively good image quality.
The invention may advantageously be applied in, for example, digital photography. A digital camera may be programmed to capture at least two images with ambient light in association with an image captured with flashlight. The digital camera derives a motion indication from the at least two images captured with ambient light. The digital camera can use this motion indication to make a high-quality combination of the image captured with flashlight and at least one of the two images captured with ambient light.
Another advantage of the invention relates to the following aspects. In accordance with the invention, the motion indication for an image that has been captured under substantially different light conditions need not be derived from that image itself. The invention therefore does not require a motion estimation technique that is relatively insensitive to differences in light conditions. Such motion estimation techniques, which have been described hereinbefore, generally require complicated hardware or software, or both. The invention allows satisfactory results with a relatively simple motion estimation technique, such as, for example, a block-matching motion estimation technique. Already existing hardware and software can be used, which is cost-efficient. For those reasons, the invention allows cost-efficient implementations.
These and other aspects of the invention will be described in greater detail hereinafter with reference to drawings.
The optical pickup unit OPU captures an image in a substantially conventional manner. A shutter, which forms part of the lens-and-shutter system LSY, opens for a relatively short interval of time. The image sensor SNS receives optical information during that interval of time. Lenses, which form part of the lens-and-shutter system LSY, project the optical information on the image sensor SNS in a suitable manner. Focus and aperture are parameters that define lens settings. The image sensor SNS converts the optical information into analog electrical information. The image interface-circuit IIC converts the analog electrical information into digital electrical information. Accordingly, a digital image is obtained which represents the optical information as a set of digital values. This is the image captured.
The flash unit FLU may provide flashlight FLSH illuminating objects that are relatively close to the digital camera DCM. Such objects will reflect a portion of the flashlight FLSH. A reflected portion of the flashlight FLSH will contribute to the optical information that reaches the image sensor SNS. Consequently, the flashlight FLSH may enhance the luminosity of objects that are relatively close to the digital camera DCM. However, the flashlight FLSH may cause optical effects that appear unnatural, such as, for example, red eyes, and may also cause the image to have a flat and harsh appearance. An image of a scene that has been captured with sufficient ambient light is generally considered more pleasant than an image of the same scene captured with flashlight. However, an ambient-light image may be noisy and blurred if there is insufficient ambient light, in which case a flashlight image is generally preferred.
In step ST1, the control-and-processing circuit CPC detects that a user has depressed the flash button FB and the image-shot button SB (FB↓ & SB↓). In response to this, the control-and-processing circuit CPC causes the digital camera DCM to carry out the steps described hereinafter (the digital camera DCM may also carry out these steps if the user has depressed the image-shot button SB only and the control-and-processing circuit CPC detects that there is insufficient ambient light).
In step ST2, the optical pickup unit OPU captures a first ambient-light image IM1a at an instant t0 (OPU: IM1a @ t0). The control-and-processing circuit CPC stores the first ambient-light image IM1a in the image storage medium ISM (IM1a→ISM). In step ST3, the optical pickup unit OPU captures a second ambient-light image IM2a at an instant t0+ΔT (OPU: IM2a @ t0+ΔT), with ΔT denoting the time interval between the instant when the first ambient-light image IM1a is captured and the instant when the second ambient-light image IM2a is captured. The control-and-processing circuit CPC stores the second ambient-light image IM2a in the image storage medium ISM (IM2a→ISM).
In step ST4, the flash unit FLU produces flashlight (FLSH). The digital camera DCM carries out step ST5 during the flashlight. In step ST5, the optical pickup unit OPU captures a flashlight image IMFa at an instant t0+2ΔT (OPU: IMFa @ t0+2ΔT). Thus, the flashlight occurs just before the instant t0+2ΔT. The time interval between the instant when the second ambient-light image IM2a is captured and the instant when the flashlight image IMFa is captured is substantially equal to ΔT. The control-and-processing circuit CPC stores the flashlight image IMFa in the image storage medium ISM (IMFa→ISM).
In step ST6, the control-and-processing circuit CPC carries out a motion estimation on the basis of the first ambient-light image IM1a and the second ambient-light image IM2a, which are stored in the image storage medium ISM (MOTEST[IM1a,IM2a]). One or more objects that form part of these images may be in motion. The motion estimation provides an indication of such motion. The indication typically is in the form of motion vectors (MV).
There are many different manners to carry out the motion estimation in step ST6. A suitable manner is for example the so-called three-dimensional (3D) recursive search, which is described in the article “Progress in motion estimation for video format conversion” by G. de Haan, IEEE Transactions on Consumer Electronics, Vol. 46, No. 3, August 2000, pp. 449-459. An advantage of the 3D recursive search is that this technique generally provides motion vectors that accurately reflect the motion within the image of interest.
In step ST6, it is also possible to carry out a block-matching motion estimation. An image to be encoded is divided into several blocks of pixels. For a block of pixels in the image to be encoded, a block of pixels in a previous or subsequent image is searched that best matches the block of pixels in the image to be encoded. In case of motion, there will be a relative displacement between the two aforementioned blocks of pixels. A motion vector represents the relative displacement. Accordingly, a motion vector can be established for each block of pixels in the image to be encoded.
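The block-matching procedure described above may be sketched as follows. This is a minimal exhaustive-search illustration, not the implementation of the digital camera DCM: the function name `block_matching`, the block size, the search range, and the sum-of-absolute-differences cost are all assumptions chosen for the example.

```python
import numpy as np

def block_matching(current, reference, block=8, search=4):
    """Exhaustive block matching on two grayscale images of equal size.

    Returns an array of shape (rows, cols, 2). Entry [r, c] holds (dy, dx),
    the displacement from the block's position in `current` to its best
    match in `reference`. Block size and search range are assumed values.
    """
    h, w = current.shape
    rows, cols = h // block, w // block
    cur = current.astype(np.int64)    # avoid overflow with 8-bit pixel data
    ref = reference.astype(np.int64)
    vectors = np.zeros((rows, cols, 2), dtype=int)
    for r in range(rows):
        for c in range(cols):
            y, x = r * block, c * block
            target = cur[y:y + block, x:x + block]
            # Start from zero displacement so that ties favour "no motion".
            best = np.abs(ref[y:y + block, x:x + block] - target).sum()
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue  # candidate block falls outside the image
                    cost = np.abs(ref[yy:yy + block, xx:xx + block] - target).sum()
                    if cost < best:
                        best = cost
                        vectors[r, c] = (dy, dx)
    return vectors

# Usage: a textured 8x8 patch displaced by (2, 3) pixels between two images.
patch = np.arange(1, 65).reshape(8, 8)
image_of_interest = np.zeros((32, 32), dtype=np.int64)
reference_image = np.zeros((32, 32), dtype=np.int64)
image_of_interest[8:16, 8:16] = patch
reference_image[10:18, 11:19] = patch
motion_vectors = block_matching(image_of_interest, reference_image)
```

Each entry of `motion_vectors` plays the role of one motion vector per block of pixels. An exhaustive search is used here only because it is short to write down; it is not suggested as the most economical search strategy.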
Either 3D recursive search or block-matching motion estimation can be implemented at relatively low cost. The reason for this is that hardware and software already exist for these types of motion estimation in various consumer-electronics applications. An implementation of the digital camera DCM, which is illustrated in
In step ST7, the control-and-processing circuit CPC carries out a motion compensation on the basis of the second ambient-light image IM2a and the motion vectors MV that the motion estimation has produced in step ST6 (MOTCMP[IM2a,MV]). The motion compensation provides a motion-compensated ambient-light image IM2aMC, which may be stored in the image storage medium ISM. The motion compensation should compensate for motion between the second ambient-light image IM2a and the flashlight image IMFa. That is, the motion compensation is carried out relative to the flashlight image IMFa.
Ideally, identical objects in the motion-compensated ambient-light image IM2aMC and the flashlight image IMFa have identical positions. That is, all objects should ideally be aligned if the aforementioned images are superposed. The only difference should reside in luminance and color information of the respective objects. The objects in the motion-compensated ambient-light image IM2aMC will appear darker with respect to those in the flashlight image IMFa, which has been captured with flashlight.
In practice, the motion compensation will not perfectly align the images. A relatively small error may remain. This is due to the fact that the motion vectors relate to motion in the second ambient-light image IM2a relative to the first ambient-light image IM1a. That is, the motion vectors do not directly relate to the flashlight image IMFa. Nevertheless, the motion compensation can provide a satisfactory alignment on the basis of these motion vectors.
Alignment will be precise if the motion in the second ambient-light image IM2a relative to the first ambient-light image IM1a, is similar to the motion in the flashlight image IMFa relative to the second ambient-light image IM2a. This will generally be the case if the images are captured in a relatively quick succession. For example, let it be assumed that the images concern a scene that comprises an accelerating object. The object will have a substantially similar speed at respective instants when the images are captured if the time interval is relatively short with respect to the object's acceleration.
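Under the assumption stated above, namely that the displacement between the two ambient-light images repeats over the next interval ΔT, a motion compensation relative to the flashlight image may be sketched as below. The function name `compensate_towards_flash`, the block-wise forward shift, and the block size are assumptions for illustration; the description does not prescribe a particular compensation scheme.

```python
import numpy as np

def compensate_towards_flash(im2a, vectors, block=8):
    """Shift each moving block of `im2a` by its motion vector.

    vectors[r, c] = (dy, dx) is taken to be the displacement of the block's
    content from the first to the second ambient-light image. It is assumed
    to repeat between the second ambient-light image and the flashlight image.
    """
    h, w = im2a.shape
    out = im2a.copy()
    for r in range(vectors.shape[0]):
        for c in range(vectors.shape[1]):
            dy, dx = int(vectors[r, c][0]), int(vectors[r, c][1])
            if dy == 0 and dx == 0:
                continue  # still block: keep it where it is
            y, x = r * block, c * block
            yy, xx = y + dy, x + dx
            if 0 <= yy and 0 <= xx and yy + block <= h and xx + block <= w:
                out[yy:yy + block, xx:xx + block] = im2a[y:y + block, x:x + block]
            # Note: the area vacated by a moving block keeps its old content
            # in this sketch; a practical implementation would fill it in.
    return out

# Usage: one bright block that moved 8 pixels to the right per interval.
im2a = np.zeros((16, 16), dtype=np.uint8)
im2a[0:8, 0:8] = 5
mv = np.zeros((2, 2, 2), dtype=int)
mv[0, 0] = (0, 8)
im2a_mc = compensate_towards_flash(im2a, mv)
```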
In step ST8, which is illustrated in
The combination, which is made in step ST8, also offers the possibility to correct for any red eyes that may appear in the flashlight image IMFa. When an image is captured of a living being with eyes and flashlight is used, the eyes may appear red, which is unnatural. Such red eyes may be detected by comparing the motion-compensated ambient-light image IM2aMC with the flashlight image IMFa. Let it be assumed that the control-and-processing circuit CPC detects the presence of red eyes in the flashlight image IMFa. In that case, eye-color information of the motion-compensated ambient-light image IM2aMC defines the color of the eyes in the enhanced flashlight image IMFaE. It is also possible that a user detects and corrects red eyes. For example, the user of the digital camera DCM illustrated in
In step ST9, the control-and-processing circuit CPC stores the enhanced flashlight image IMFaE in the image storage medium ISM (IMFaE→ISM). Accordingly, the enhanced flashlight image IMFaE may be transferred to an image display apparatus at a later moment. Optionally, in step ST10, the control-and-processing circuit CPC deletes the ambient-light images IM1a, IM2a and the flashlight image IMFa, which are present in the image storage medium ISM (DEL[IM1a,IM2a,IMFa]). The motion-compensated ambient-light image IM2aMC may also be deleted. However, it may be useful to keep the aforementioned images in the image storage medium ISM so that these can be processed at a later moment.
Ambient-light images IM1a, IM2a appear to be substantially similar. Both images are taken with ambient light. Each object has similar luminosity and color in both images. The only difference concerns the ball BL, which has moved. Consequently, the motion estimation in step ST6, which has been described hereinbefore, will provide motion vectors that indicate the same. The second ambient-light image IM2a comprises one or more groups of pixels that substantially belong to the ball BL. A motion vector for such a group of pixels indicates the displacement, i.e. the motion, of the ball BL. In contradistinction, a group of pixels that substantially belongs to an object other than the ball BL will have a motion vector that indicates no motion. For example, a group of pixels that substantially belongs to the vase VA will indicate that this is a still object.
The flashlight image IMFa is relatively different from the ambient-light images IM1a, IM2a. In the flashlight image IMFa, foreground objects such as the table TA, the ball BL, the vase VA with the flower FL, are more clearly lit than in the ambient-light images IM1a, IM2a. These objects have a higher luminosity and more vivid colors. The flashlight image IMFa differs from the second ambient-light image IM2a not only because of different light conditions. The motion of the ball BL also causes the flashlight image IMFa to be different from the second ambient-light image IM2a. There are thus two main causes that account for differences between the flashlight image IMFa and the second ambient-light image IM2a: light conditions and motion.
The motion vectors, which are derived from the ambient-light images IM1a, IM2a, allow a relatively precise distinction between differences due to light conditions and differences due to motion. This is substantially due to the fact that the ambient-light images IM1a, IM2a have been captured under substantially similar light conditions. The motion vectors are therefore not affected by any differences in light conditions. Consequently, it is possible to enhance the flashlight image IMFa on the basis of differences in light conditions only. The motion compensation, which is based on the motion vectors, prevents the enhanced flashlight image IMFaE from being blurred.
In step ST101, the control-and-processing circuit CPC detects that a user has depressed the flash button FB and the image-shot button SB (FB↓ & SB↓). In response to this, the control-and-processing circuit CPC causes the digital camera DCM to carry out the steps described hereinafter (the digital camera DCM may also carry out these steps if the user has depressed the image-shot button SB only and the control-and-processing circuit CPC detects that there is insufficient ambient light).
In step ST102, the optical pickup unit OPU captures a first ambient-light image IM1b at an instant t1 (OPU: IM1b @ t1). The control-and-processing circuit CPC stores the first ambient-light image IM1b in the image storage medium ISM. A time label that indicates the instant t1 is stored in association with the first ambient-light image IM1b (IM1b & t1→ISM).
In step ST103, the flash unit FLU produces flashlight (FLSH). The digital camera DCM carries out step ST104 during the flashlight. In step ST104, the optical pickup unit OPU captures a flashlight image IMFb at an instant t2 (OPU: IMFb @ t2). Thus, the flashlight occurs just before the instant t2. The control-and-processing circuit CPC stores the flashlight image IMFb in the image storage medium ISM. A time label that indicates the instant t2 is stored in association with the flashlight image IMFb (IMFb & t2→ISM).
The digital camera DCM carries out step ST105 when the flashlight has dimmed and ambient light conditions have returned. In step ST105, the optical pickup unit OPU captures a second ambient-light image IM2b at an instant t3 (OPU: IM2b @ t3). The control-and-processing circuit CPC stores the second ambient-light image IM2b in the image storage medium ISM. A time label that indicates the instant t3 is stored in association with the second ambient-light image IM2b (IM2b & t3→ISM).
In step ST106, the control-and-processing circuit CPC carries out a motion estimation on the basis of the first ambient-light image IM1b and the second ambient-light image IM2b, which are stored in the image storage medium ISM (MOTEST[IM1b,IM2b]). The motion estimation provides motion vectors MV1,3 that indicate motion of objects that form part of the first ambient-light image IM1b and the second ambient-light image IM2b.
In step ST107, the control-and-processing circuit CPC adapts the motion vectors MV1,3 that the motion estimation has provided in step ST106 (ADP[MV1,3;IM1b,IMFb]). Accordingly, adapted motion vectors MV1,2 are obtained. The adapted motion vectors MV1,2 relate to motion in the flashlight image IMFb relative to the first ambient-light image IM1b. To that end, the control-and-processing circuit CPC takes into account the respective instants t1, t2, and t3 when the first ambient-light image IM1b, the flashlight image IMFb, and the second ambient-light image IM2b have been captured.
The motion vectors MV1,3 can be adapted in a relatively simple manner. For example, let it be assumed that a motion vector has a horizontal component and a vertical component. The horizontal component can be scaled with a scaling factor equal to the time interval between instant t1 and instant t2 divided by the time interval between instant t1 and instant t3. The vertical component can be scaled in the same manner. Accordingly, a scaled horizontal component and a scaled vertical component are obtained. In combination, these scaled components constitute an adapted motion vector, which relates to the motion in the flashlight image IMFb relative to the first ambient-light image IM1b.
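The scaling described above can be written down in a few lines. The function name `adapt_motion_vector` and the tuple representation of a motion vector are assumptions made for this sketch; the scaling factor (t2 − t1) / (t3 − t1) is the one given in the text.

```python
def adapt_motion_vector(mv13, t1, t2, t3):
    """Scale a motion vector MV1,3 into an adapted motion vector MV1,2.

    mv13 is a (horizontal, vertical) pair describing motion between the
    instants t1 and t3. Motion is assumed to be substantially continuous
    over that interval.
    """
    if t3 == t1:
        raise ValueError("t1 and t3 must be different instants")
    scale = (t2 - t1) / (t3 - t1)
    horizontal, vertical = mv13
    return (horizontal * scale, vertical * scale)

# Usage: the flashlight image is captured a quarter of the way between
# the two ambient-light images.
mv12 = adapt_motion_vector((12.0, -6.0), t1=0.0, t2=1.0, t3=4.0)  # (3.0, -1.5)
```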
In step ST108, which is illustrated in
In step ST109, the control-and-processing circuit CPC makes a combination of the flashlight image IMFb and the motion-compensated ambient-light image IM1bMC (COMB[IMFb,IM1bMC]). The combination results in an enhanced flashlight image IMFbE in which unnatural and less pleasant effects, which the flashlight may cause, are reduced. In step ST110, the control-and-processing circuit CPC stores the enhanced flashlight image IMFbE in the image storage medium ISM (IMFbE→ISM). Optionally, in step ST111, the control-and-processing circuit CPC deletes the ambient-light and flashlight images IM1b, IM2b, IMFb that are present in the image storage medium ISM (DEL[IM1b,IM2b,IMFb]). The motion-compensated ambient-light image IM1bMC may also be deleted.
The image processing apparatus IMPA may process a set of images that relate to a same scene. At least two images have been captured with ambient light. At least one image has been captured with flashlight.
For example, let it be assumed that the digital camera DCM is programmed to carry out steps ST1-ST5, but not step ST10 (see
Alternatively, the digital camera DCM may be programmed to carry out steps ST101-ST105, but not step ST111 (see
The enhanced flashlight image will have a quality that substantially depends on motion-estimation precision. As mentioned hereinbefore, 3D-recursive search allows relatively good precision. A technique known as Content Adaptive Recursive Search is a good alternative. Complex motion estimation techniques may be used that can account for tilt as well as translation between images. Furthermore, it is possible to first carry out a global motion estimation, which relates to an image as a whole, and, subsequently, a local motion estimation, which relates to various different parts of the image. Sub-sampling the image simplifies the global motion estimation. It should also be noted that the motion estimation can be segment-based instead of block-based. A segment-based motion estimation takes into account that an object may have a form that is quite different from that of a block. A motion vector may relate to an arbitrary-shaped group of pixels, not necessarily a block. Accordingly, a segment-based motion estimation can be relatively precise.
The following rule generally applies. The greater the number of images on which the motion estimation is based, the more precise the motion estimation will be. In the description hereinbefore, the motion estimation was based on two images captured with ambient light. A more precise motion estimation can be obtained if more than two images are captured with ambient light and subsequently used for estimating motion. For example, it is possible to estimate the speed of an object on the basis of two images that have been successively captured, but not the acceleration of the object. Three images allow acceleration estimation. Let it be assumed that three ambient-light images are captured in association with a flashlight image. In that case, a more precise estimation can be made of where objects will be at the instant when the flashlight image is captured compared with when two ambient light images are captured.
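The difference between two and three ambient-light images can be illustrated numerically. The sketch below assumes that the images are captured at equally spaced instants and that the flashlight image follows one further interval later; the function names and the finite-difference formulas are assumptions for illustration, not part of the description.

```python
def extrapolate_two(p1, p2):
    """Predict the next position from two positions (constant speed assumed)."""
    return tuple(2 * b - a for a, b in zip(p1, p2))

def extrapolate_three(p0, p1, p2):
    """Predict the next position from three positions (constant acceleration assumed).

    The last displacement (p2 - p1) is corrected by the change in displacement
    (p2 - p1) - (p1 - p0), which gives 3*p2 - 3*p1 + p0.
    """
    return tuple(3 * c - 3 * b + a for a, b, c in zip(p0, p1, p2))

# Usage: an object accelerating along the horizontal axis, x = t**2,
# observed at t = 0, 1, 2. Its true position at t = 3 is x = 9.
from_two = extrapolate_two((1, 0), (4, 0))              # (7, 0): acceleration is missed
from_three = extrapolate_three((0, 0), (1, 0), (4, 0))  # (9, 0): matches the true position
```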
The detailed description hereinbefore with reference to the drawings illustrates the following characteristics. A set of images that have successively been captured comprises a plurality of images that have been captured under substantially similar light conditions (first and second ambient-light images IM1a, IM2a,
The detailed description hereinbefore further illustrates the following optional characteristics. At least two images are first captured with ambient light and, subsequently, an image is captured with flashlight (operation in accordance with
The detailed description hereinbefore further illustrates the following optional characteristics. The images are successively captured at respective instants with a fixed time interval (ΔT) between these instants (operation in accordance with
The detailed description hereinbefore further illustrates the following optional characteristics. An image is captured with ambient light, subsequently, an image is captured with flashlight, and subsequently, a further image is captured with ambient light (operation in accordance with
The detailed description hereinbefore further illustrates the following optional characteristics. The motion indication comprises an adapted motion vector (MV1,2) which is obtained as follows (
The detailed description hereinbefore further illustrates the following optional characteristics. The motion-estimation step establishes a motion vector that belongs to a group of pixels in a manner that takes into account a motion vector that has been established for another group of pixels. This is the case, for example, in 3D recursive search. The aforementioned characteristic allows accurate motion estimation compared with simple block-matching motion estimation techniques. Motion vectors will truly indicate motion of an object to which the relevant group of pixels belongs. This contributes to a good image quality.
The aforementioned characteristics can be implemented in numerous different manners. In order to illustrate this, some alternatives are briefly indicated. The set of images may form a motion picture instead of a still picture. For example, the set of images to be processed may be captured by means of a camcorder. The set of images may also result from a digital scan of a set of conventional paper photos. The set of images may comprise more than two images that have been captured under substantially similar light conditions. The set may also comprise more than one image that has been captured under substantially different light conditions. The images may be located anywhere with respect to each other. For example, a flashlight image may have been captured first followed by two ambient-light images. A motion indication may be derived from the two ambient-light images, on the basis of which the flashlight image can be processed. Alternatively, two flashlight images may have been captured first and, subsequently, an ambient-light image. A motion indication is derived from the flashlight images. In this case, the flashlight images constitute the images that have been taken under substantially similar light conditions.
There are numerous different manners to process the set of images. Processing need not necessarily include image enhancement as described hereinbefore. The processing may include, for example, image encoding. In case the processing includes image enhancement, there are many ways to do so. In the description hereinbefore, a motion-compensated ambient-light image is first established. Subsequently, a flashlight image is enhanced on the basis of the motion-compensated ambient-light image. Alternatively, the flashlight image may directly be enhanced on a block-by-block basis. A block of pixels in the flashlight image may be enhanced on the basis of a motion vector for that block of pixels, which indicates a corresponding block of pixels in an ambient-light image. Accordingly, respective blocks of pixels in the flashlight image may be successively enhanced. In such an implementation, there is no need to first establish a motion-compensated ambient-light image.
The set of images need not necessarily comprise time labels that indicate respective instants when respective images have been captured. Time labels are not required, for example, if there are fixed time intervals between these respective instants. Time intervals need not be identical; it is sufficient that they are known.
There are numerous ways of implementing functions by means of items of hardware or software, or both. In this respect, the drawings are very diagrammatic, each representing only one possible embodiment of the invention. Moreover, although a drawing shows different functions as different blocks, this by no means excludes that a single item of hardware or software carries out several functions or that an assembly of items of hardware or software or both carry out a function.
The remarks made hereinbefore demonstrate that the detailed description, with reference to the drawings, illustrates rather than limits the invention. There are numerous alternatives, which fall within the scope of the appended claims. Any reference sign in a claim should not be construed as limiting the claim. The word “comprising” does not exclude the presence of other elements or steps than those listed in a claim. The word “a” or “an” preceding an element or step does not exclude the presence of a plurality of such elements or steps.
| Number | Date | Country | Kind |
|---|---|---|---|
| 04300738.4 | Oct 2004 | EP | regional |

| Filing Document | Filing Date | Country | Kind | 371c Date |
|---|---|---|---|---|
| PCT/IB05/53491 | 10/25/2005 | WO | 00 | 4/24/2007 |