The present invention relates generally to image stabilization and, more particularly, to image stabilization by image processing and registration.
The problem of image stabilization dates back to the beginning of photography. It arises because an image sensor needs a sufficient exposure time to form a reasonably good image. Any motion of the camera during the exposure time causes a shift of the image projected on the image sensor, resulting in a degradation of the formed image. This motion-related degradation is called motion blur. When a camera is held in one or both hands while taking a picture, it is almost impossible to avoid unwanted camera motion during a reasonably long exposure or integration time. Motion blur is particularly likely when the camera is set at a high zoom ratio, where even a small motion can significantly degrade the quality of the acquired image. One of the main difficulties in restoring motion-blurred images is that the motion blur differs from one image to another, depending on the actual camera motion that took place during the exposure time.
The ongoing development and miniaturization of consumer devices that have image acquisition capabilities increases the need for robust and efficient image stabilization solutions. The need is driven by two main factors:
1. The difficulty of avoiding unwanted motion during the integration time when using a small hand-held device (like a camera phone).
2. The need for longer integration times due to the small pixel area resulting from the miniaturization of the image sensors in conjunction with the increase in image resolution. The smaller the pixel area, the fewer photons per unit time the pixel can capture, so a longer integration time is needed for good results.
Image stabilization is usually carried out by either a single-frame method or a multi-frame method. In the single-frame method, optical image stabilization generally involves laterally shifting the image while the image is projected on the image sensor by optical or mechanical means in order to compensate for the camera motion. The single-frame method requires a complex actuator mechanism to effect the image shifting. The actuator mechanism is generally expensive and large in size. It would be advantageous and desirable to provide a method and system for image stabilization using the multi-frame method.
The present invention involves a multi-frame solution. The solution is based on dividing a long exposure time into several shorter intervals and capturing several image frames of the same scene. The exposure time for each frame is reasonably short in order to reduce the motion blur degradation of the individual frames. The final output image is obtained by combining the individual frames either during the time of their capturing or after they are all captured. The operations involved in the process of generating the final image from the individual frames are as follows:
1. Reference frame selection: Select a reference image frame among the available frames.
2. Global image registration: Register each image frame with respect to the reference frame.
3. Corresponding pixel identification and weighting: Identify the pixels in the given frames that correspond to the pixels of the reference image. Weight each pixel in the given frames according to the degree of similarity between the pixel and the corresponding reference pixel.
4. Pixel fusion: Calculate the final value of each image pixel in the given frames by combining its value in the reference image with its corresponding values in the other frames.
Thus, the first aspect of the present invention provides a method of image stabilization. The method comprises:
adjusting geometrically the plurality of image frames in reference to a reference frame for providing a plurality of adjusted image frames, wherein each of the reference frame and the adjusted image frames comprises a plurality of pixels, each pixel having a pixel value, wherein each of the pixels in at least an image section of each adjusted image frame has a corresponding pixel in the reference frame; and
determining a weighting factor for each pixel in said at least image section based on similarity between the pixel values of said each pixel and a corresponding pixel for generating a resulting image frame based on the pixel value of said each pixel adjusted by the weighting factor and the pixel value of the corresponding pixel in the reference frame.
The method further comprises selecting the reference frame and said plurality of image frames among a plurality of input frames.
According to one embodiment of the present invention, the reference frame is selected based on a sharpness measure of the input frames.
According to another embodiment of the present invention, the reference frame is selected as the frame that has the shortest exposure time among the input frames. The frame that has the shortest exposure time can be the first frame of the input frames.
According to a different embodiment, the reference frame is selected as the first frame among the input frames that meets a certain sharpness criterion. The frames that do not meet the sharpness criterion can be removed in order to save memory.
According to one embodiment of the present invention, the resulting image is generated based on a weighted average of the pixel value of said each pixel adjusted by the weighting factor in each of the plurality of image frames and the pixel value of the corresponding pixel in the reference frame.
According to one embodiment of the present invention, the image frames are adjusted based on a geometrical or coordinate transformation, which may include rotation, translation, affine transformation, nonlinear warping, enlarging, shrinking or any combination thereof. An image registration or comparison operation may be used to determine how each of the image frames is adjusted.
According to one embodiment of the present invention, the image registration or comparison operation may include low-pass filtering each of the plurality of other frames for providing a plurality of smoothed image frames and low-pass filtering the reference frame for providing a smoothed reference frame; and comparing a portion of each smoothed image frame to a corresponding portion of the smoothed reference frame for providing an error image portion so as to determine how each of the image frames is adjusted.
The second aspect of the present invention provides an image processing system which includes a processor configured for receiving a plurality of image frames and a memory unit communicative to the processor, wherein the memory unit has a software application, the software application having programming codes for carrying out the image stabilization method.
The third aspect of the present invention provides an imaging device, such as a stand-alone digital camera, a digital camera disposed in a mobile phone or the like. The imaging device includes an image sensor, an image forming module for forming a plurality of image frames on the image sensor, a processor configured for receiving a plurality of image frames for generating a resulting image; and a memory unit communicative to the processor, wherein the memory unit has a software application, the software application having programming codes for carrying out the image stabilization method.
The fourth aspect of the present invention provides a software application product embodied in a computer readable storage medium having programming codes to carry out the image stabilization method.
The present invention will become apparent upon reading the description taken in conjunction with
The present invention provides a method and system for multi-frame image stabilization. The method can be further improved by estimating parameters of the geometrical transformation for use in image registration.
A general algorithmic description of the multi-frame image stabilization method, according to one embodiment of the present invention, is illustrated in the flowchart 100 in
Select a reference image frame R among K available frames of the same scene, as shown at step 110. The selection can be based on image sharpness, for example. Image sharpness can be quantified by a sharpness measure. For example, the sharpness measure can be expressed as the sum of absolute values of the image after applying a band-pass filter:
$$\text{Sharpness}(I) = \sum_{x,y} \left| I^{(bp)}(x,y) \right|$$
where I(bp) denotes the band-pass filtered image. The filtered image can be obtained by filtering the original image in the frequency domain or in the spatial domain.
According to one embodiment of the present invention, the band-pass filtered image version is calculated as the difference between two differently smoothed versions of the original image:
$$I^{(bp)}(x,y) = \left| \ddot{I}_{L_1}(x,y) - \ddot{I}_{L_2}(x,y) \right|$$
where L1 and L2 are different levels of image smoothness, and Ïl denotes the smoothed image resulting after l smoothing iterations. For example, L1=4 (level 4) and L2=1 (level 1) can be used in the calculation of the band-pass filtered image version. Level 0 corresponds to the original image. Computation of the smoothed image version at different levels of smoothness is presented in more detail later herein.
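The sharpness measure described above can be sketched as follows. This is a minimal illustration only: the function names `smooth_once`, `smooth` and `sharpness` are hypothetical, the smoothing kernel is assumed to be the 3-tap filter with taps 1/4, 1/2, 1/4 mentioned later in this description, and clamping coordinates at the image border is an assumption not stated in the text.

```python
from typing import List

Image = List[List[float]]


def smooth_once(img: Image) -> Image:
    """One smoothing iteration: 3-tap filter [1/4, 1/2, 1/4] along rows, then columns.

    Border pixels are handled by clamping coordinates (an assumption).
    """
    h, w = len(img), len(img[0])
    rows = [[0.25 * img[y][max(x - 1, 0)] + 0.5 * img[y][x]
             + 0.25 * img[y][min(x + 1, w - 1)]
             for x in range(w)] for y in range(h)]
    return [[0.25 * rows[max(y - 1, 0)][x] + 0.5 * rows[y][x]
             + 0.25 * rows[min(y + 1, h - 1)][x]
             for x in range(w)] for y in range(h)]


def smooth(img: Image, level: int) -> Image:
    """Smoothed image after `level` iterations; level 0 is the original image."""
    out = [row[:] for row in img]
    for _ in range(level):
        out = smooth_once(out)
    return out


def sharpness(img: Image, l1: int = 4, l2: int = 1) -> float:
    """Sum over all pixels of the absolute difference between two smoothing levels."""
    a, b = smooth(img, l1), smooth(img, l2)
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
```

Under these assumptions a perfectly flat image scores zero, and a frame containing a sharp edge scores higher than a blurred copy of the same frame.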
Reference frame selection can be carried out in at least three ways. In a system where memory is sufficient, the image that exhibits the least blur or the highest sharpness among all available frames can be selected as the reference frame.
In a system where memory is strictly limited, it is usually not possible to store all intermediate images but only a few of them (e.g. 2 or 3) plus the final result image. In such a case, the first image whose sharpness exceeds a certain threshold value is selected as the reference image. Moreover, it is possible that the system automatically removes all frames whose sharpness measure is below a predetermined value as soon as they are captured.
A third option is to impose a shorter exposure time for one of the frames, such as the first frame, so as to reduce the risk of having it blurred by possible camera motion in that frame. The frame with a shorter exposure time can be selected as the reference frame.
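The three selection options above can be sketched as follows, assuming that a sharpness score and an exposure time are already available for each frame. The function names and the use of plain Python sequences are illustrative choices, not part of the described method.

```python
from typing import Optional, Sequence


def select_sharpest(scores: Sequence[float]) -> int:
    """Sufficient memory: index of the frame with the highest sharpness."""
    return max(range(len(scores)), key=lambda i: scores[i])


def select_first_above(scores: Sequence[float], threshold: float) -> Optional[int]:
    """Limited memory: index of the first frame whose sharpness exceeds the
    threshold, or None if no frame qualifies."""
    for i, score in enumerate(scores):
        if score > threshold:
            return i
    return None


def select_shortest_exposure(exposures: Sequence[float]) -> int:
    """Third option: index of the frame captured with the shortest exposure time."""
    return min(range(len(exposures)), key=lambda i: exposures[i])
```

In the memory-limited case, frames scanned before the first qualifying one can be discarded as they arrive, which is why that function stops at the first match instead of looking for the best frame.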
After the reference frame is selected among the K available frames, the remaining frames are re-labelled as Ik, k=1, . . . , K−1, as shown at step 120. For each of the remaining frames, the following steps are carried out through the loop with k<K (steps 140 through 170), starting with k=1 (step 130).
a. Global Image Registration:
Global image registration, as illustrated at step 150, comprises two tasks:
i) Estimate a warping function (or the registration parameters) to be used in registering the image frame Ik with respect to the reference image R; and
ii) Warp the input image Ik based on the registration parameters estimated at the previous point. The warped input image is denoted as Jk. By warping, the input image Ik is adjusted by a geometrical or coordinate transformation. The transformation can be linear or non-linear, and the transformation may include rotation, translation, affine transformation, nonlinear warping, enlarging, shrinking or any combination thereof.
The objective of the global image registration process is to compare the corresponding pixels in two images, R and Jk, by overlapping them. In practice, exact pixel correspondence may not always be achievable in all image regions. For example, in the regions representing moving objects in the scene, or image regions that cannot be mapped by the assumed global motion model, exact pixel correspondence may not be achievable. For that reason, the step of corresponding pixel identification is also carried out.
b. Corresponding Pixel Identification and Weighting:
As illustrated at step 160, identification of corresponding pixels and assignment of weight are carried out separately:
i) For each pixel x=(x,y) in the reference image R, identify the corresponding pixel xk=(xk,yk) from the warped input image Jk.
To improve the process, nearby pixels may also be used to aid the identification of the corresponding pixels. As illustrated in
After the corresponding pixels are brought into close proximity to each other, a search for the corresponding pixel xk=(xk,yk) is carried out only in a restricted search space around the coordinates x=(x,y). During the search, the corresponding pixel xk=(xk,yk) is selected as the pixel whose neighborhood NJk has a minimum distance DF(NR,NJk) with respect to the reference pixel neighborhood NR. The search algorithm is summarized in the flowchart of
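A restricted neighborhood search of this kind might look like the sketch below. The distance function DF is not specified in the text, so the sum of absolute differences is used here as an assumed choice. The function names, the default search radius, the neighborhood size and the border clamping are likewise hypothetical, and both images are assumed to have the same size.

```python
from typing import List, Tuple

Image = List[List[float]]


def neighborhood_distance(r: Image, j: Image, x: int, y: int,
                          xk: int, yk: int, half: int) -> float:
    """Sum of absolute differences between the neighborhood of r at (x, y)
    and the neighborhood of j at (xk, yk). SAD is an assumed choice for DF."""
    h, w = len(r), len(r[0])
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            # Clamp coordinates at the image border (assumption).
            ry, rx = min(max(y + dy, 0), h - 1), min(max(x + dx, 0), w - 1)
            jy, jx = min(max(yk + dy, 0), h - 1), min(max(xk + dx, 0), w - 1)
            total += abs(r[ry][rx] - j[jy][jx])
    return total


def find_corresponding(r: Image, j: Image, x: int, y: int,
                       search: int = 2, half: int = 1) -> Tuple[int, int, float]:
    """Return (xk, yk, distance) of the best match for reference pixel (x, y),
    searching only within `search` pixels of (x, y) in the warped image j."""
    h, w = len(r), len(r[0])
    best = (x, y, neighborhood_distance(r, j, x, y, x, y, half))
    for yk in range(max(y - search, 0), min(y + search, h - 1) + 1):
        for xk in range(max(x - search, 0), min(x + search, w - 1) + 1):
            d = neighborhood_distance(r, j, x, y, xk, yk, half)
            if d < best[2]:
                best = (xk, yk, d)
    return best
```

The returned distance can be reused directly when the pixel is weighted, so the neighborhoods need not be compared a second time.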
Alternatively, corresponding pixel identification is carried out simultaneously in blocks of pixels, called inner blocks, instead of individual pixels. The inner blocks are illustrated in
During the search, the inner block J′k is selected as the block whose neighborhood or outer block NJk has a minimum distance DF(NR, NJk), with respect to the outer block NR of the inner block R′ in the reference image. The search algorithm using the inner blocks is summarized in the flowchart of
In general, the process as illustrated in
ii) Weight the importance of each input image pixel Jk(xk) in the restoration of the reference image.
At this point each input image pixel has already been assigned a corresponding pixel in the reference image. However, this correspondence may still be false in some image regions (e.g. regions containing moving objects). For that reason, a weight Wk(xk) may be assigned to each input image pixel in the pixel fusion process. It is possible to assign the same weight to all input image pixels that belong to the same inner block, in which case the weight is calculated based on a measure of similarity between the inner block and the best matching block from the reference image.
For instance, the measure of similarity can be represented by a function Wk(xk)=exp(−λ·DF(NR, NJk)), where λ is a real constant value. It is also possible that a small weight value is assigned to those pixels Jk(xk) that do not have corresponding pixels in the reference image. These pixels could be pixels belonging to regions of the scene that have changed since the capture of the reference image (e.g. moving objects), or pixels belonging to regions of the input image that are very different from the reference image (e.g. blurred image regions, since the reference image was selected to be the sharpest frame). This weighting process is useful in that the better regions from each input image are selected for the construction of the output image. Optionally, in order to reduce subsequent computations, a minimum acceptable similarity threshold between two corresponding pixels can be set such that all the weights Wk(xk) that are smaller than the threshold are set to zero.
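The exponential weighting with an optional cut-off can be sketched as below. The function name `similarity_weight` and the default values of `lam` and `min_weight` are illustrative assumptions; the text does not give concrete values for λ or for the threshold.

```python
import math


def similarity_weight(distance: float, lam: float = 0.05,
                      min_weight: float = 0.0) -> float:
    """Weight W = exp(-lam * distance); weights below min_weight become zero."""
    weight = math.exp(-lam * distance)
    return weight if weight >= min_weight else 0.0
```

A distance of zero gives the maximum weight of 1, and larger distances decay toward zero. Setting `min_weight` above zero lets poorly matching pixels be skipped entirely in the later fusion step.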
After the steps for global image registration, corresponding pixel identification and weighting on all remaining K−1 images are completed, pixel fusion is carried out at step 180 so as to produce an output image based on the reference frame and the similarity values in the warped images.
In pixel fusion, each pixel x of the output image O is calculated as a weighted average of its value in the reference image R and the corresponding values in the K−1 warped images. The task is to calculate the final value of each pixel O(x). In this operation, all pixels in the reference image R are given the same weight W0, whereas the corresponding pixels in the warped images have the weights Wk(xk) as assigned in step b(ii) above. The final image pixel is given by:
$$O(\mathbf{x}) = \frac{W_0\,R(\mathbf{x}) + \sum_{k=1}^{K-1} W_k(\mathbf{x}_k)\,J_k(\mathbf{x}_k)}{W_0 + \sum_{k=1}^{K-1} W_k(\mathbf{x}_k)}$$
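For a single output pixel, the weighted average described above can be sketched as follows. The function name `fuse_pixel` and the representation of the warped-frame contributions as (value, weight) pairs are assumptions made for illustration; the reference weight W0 is assumed to be strictly positive so that the denominator never vanishes.

```python
from typing import Iterable, Tuple


def fuse_pixel(ref_value: float, w0: float,
               candidates: Iterable[Tuple[float, float]]) -> float:
    """Weighted average of the reference pixel (weight w0 > 0) and the
    corresponding pixels of the warped frames, given as (value, weight) pairs."""
    numerator = w0 * ref_value
    denominator = w0
    for value, weight in candidates:
        numerator += weight * value
        denominator += weight
    return numerator / denominator
```

A candidate with zero weight leaves the result unchanged, so pixels rejected by the similarity threshold simply drop out of the average.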
As mentioned earlier, a measure of similarity is used to assign the weight Wk(xk) for a corresponding pixel in a warped input image. For efficiency, pixels can be grouped into small blocks (inner blocks) of size 2×2 or larger, and all the pixels in such a block are treated unitarily, in the sense that they are all together declared correspondent with the pixels belonging to a similar inner block in the other image (see
It is possible to speed up the process of corresponding pixel identification by:
A second aspect of the present invention provides a method for the estimation of the image registration parameters.
In the estimation process, only a smoothed version of each image is used for estimating the image registration parameters. The smoothed image is obtained by low-pass filtering the original image. Because a smoothed image represents an over-sampled version of the image, not all the pixels in the smoothed image are needed in the registration process. It is sufficient to use only a subset of the smoothed image pixels in the registration process. Moreover, various image warping operations needed during the estimation of the registration parameters can be achieved by selecting different sets of pixels inside the smoothed image area, without performing interpolation. In this way, the smoothed image is used only as a “reservoir of pixels” for different warped low-resolution versions of the image, which may be needed at different iterations.
The above-described estimation method is more effective when the images are degraded by blur (for example, out of focus blur and undesirable motion blur) and noise.
The smoothed image can be calculated by applying a low-pass filter to the original image, either in the frequency domain or in the spatial domain. The original image I can be iteratively smoothed in order to obtain progressively smoother versions of the image. Let Ïl denote the smoothed image resulting after l smoothing iterations. At each such iteration, a one-dimensional low-pass filter is applied along the image rows and columns in order to smooth the current image further. Thus, assuming Ï0=I, the smoothed image at the l-th iteration is obtained in two steps of one-dimensional filtering:
where hk are the taps of the low-pass filter used. For example, it is possible to use a filter of size 3 having taps h_{−1}=2^{−2}, h_0=2^{−1}, h_1=2^{−2}. Selecting filter taps that are powers of 2 reduces the computational complexity, since each multiplication can be carried out as a bit shift (e.g. in a shift register). At the end of this pre-processing step, a smoothed image ÏL of the same size as the original image is obtained. This smoothed image is used in the registration operation.
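The power-of-two taps can be applied with integer shifts, as in the sketch below. This is only an illustration of the shift-based arithmetic: the exact two-step filtering equation is not reproduced in this text, so the sketch assumes a plain 3-tap filter applied to adjacent pixels, integer pixel values, and clamping at the image border. The function names are hypothetical.

```python
from typing import List

Image = List[List[int]]


def smooth_1d(line: List[int]) -> List[int]:
    """Apply taps (1/4, 1/2, 1/4) using shifts: (a + 2*b + c) >> 2.

    Neighbors outside the line are replaced by the nearest border value
    (an assumption)."""
    n = len(line)
    return [(line[max(i - 1, 0)] + (line[i] << 1) + line[min(i + 1, n - 1)]) >> 2
            for i in range(n)]


def smooth_iteration(img: Image) -> Image:
    """One iteration: filter every row, then every column."""
    rows = [smooth_1d(row) for row in img]
    cols = [smooth_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def smooth(img: Image, levels: int) -> Image:
    """Smoothed image after `levels` iterations, same size as the input."""
    out = [row[:] for row in img]
    for _ in range(levels):
        out = smooth_iteration(out)
    return out
```

Summing the three terms before shifting, rather than shifting each term separately, loses less precision to integer truncation.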
After the smoothed versions of the input and reference images are obtained, a set of sampling points can be selected in each image for image comparison. For simplicity, the sampling points are selected at the vertices of a rectangular lattice with horizontal and vertical period of D=2^L pixels. The selection of sampling points is illustrated in
In accordance with one embodiment of the present invention, warping of the input low-resolution image Î is performed by changing the position of the sampling points xn,k inside the smooth image area (see
A warping function can be selected in different ways. The selection of an appropriate parametric model for the warping function should be done in accordance with the expected camera motion and scene content. For instance, a simple model could be the two-parameter translational model:
W(x;p)=x+p, (5)
where the parameter vector p=[p1 p2]^T includes the translation values along x and y image coordinates. Another example of warping functions that can be used in image registration applications is the rigid transformation:
The rigid transformation consists of translation plus rotation.
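The two warping models can be sketched as follows. The translational model follows Equation 5 directly. For the rigid model, the exact form of Equation 6 is not reproduced in this text, so the sketch assumes that p1 and p2 are the translation, p3 is the rotation angle in radians, and the rotation is counterclockwise about the coordinate origin. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def warp_translation(x: Point, p: Sequence[float]) -> Point:
    """Two-parameter translational model: W(x; p) = x + p."""
    return (x[0] + p[0], x[1] + p[1])


def warp_rigid(x: Point, p: Sequence[float]) -> Point:
    """Rigid model: rotation by p[2] followed by translation (p[0], p[1]).

    Parameter ordering and rotation direction are assumptions."""
    c, s = math.cos(p[2]), math.sin(p[2])
    return (c * x[0] - s * x[1] + p[0], s * x[0] + c * x[1] + p[1])
```

With a rotation angle of zero the rigid model reduces to the translational one, which is a convenient sanity check for whichever sign convention is actually used.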
Assuming a rigid warping function (Equation 6), the registration algorithm for registering an input image with respect to the reference image can be formulated as follows:
Input: the two images plus an initial guess of the parameter vector p=[p1 p2 p3]^T.
Output: the parameter vector that best overlaps the input image over the reference image.
a. Calculate the smoothed images ÏL and R̈L.
b. Set the initial position of the sampling points xn,k at the vertices of a rectangular lattice of period D=2^L, as exemplified in
c. Construct the low-resolution reference image by collecting the pixels of the smoothed reference image at the sampling points, i.e. R̂(n,k)=R̈L(xn,k).
d. Approximate the gradient of the reference image by
$$\hat{R}_x(n,k) = \hat{R}(n+1,k) - \hat{R}(n,k) + \hat{R}(n+1,k+1) - \hat{R}(n,k+1),$$
and
$$\hat{R}_y(n,k) = \hat{R}(n,k+1) - \hat{R}(n,k) + \hat{R}(n+1,k+1) - \hat{R}(n+1,k).$$
e. Calculate an image for each parameter pi of the warping function.
f. Calculate the 3×3 Hessian matrix:
g. Calculate the inverse of the Hessian matrix: H⁻¹.
a. Warp the sampling points in accordance with the current warping parameters: x′n,k=round(W(xn,k;p)).
b. Construct the warped low-resolution image by collecting the pixels of the input smoothed image in the sampling points: Î(n,k)=ÏL(x′n,k).
c. Calculate the error image: eo(n,k)=Î(n,k)−{circumflex over (R)}(n,k).
d. Smooth the error image:
e(n,k)=(eo(n,k)+eo(n+1,k)+eo(n,k+1)+eo(n+1,k+1))/4
e. Calculate the 3×1 vector of elements:
f. Calculate the update of the vector parameter: Δp=H⁻¹g.
g. Update the parameter vector such that:
Δp, where D is the period of the rectangular sampling lattice defined earlier in the sub-section A(b) above.
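The sampling, warping and error-image steps listed above can be sketched as follows. The gradient images, the Hessian matrix and the parameter update are left out because their equations are not reproduced in this text. All function names are hypothetical, the images are assumed to be stored as row-major lists, and clamping warped points that fall outside the image is an assumption.

```python
import math
from typing import Callable, List, Optional, Tuple

Image = List[List[float]]
Point = Tuple[float, float]


def lattice(width: int, height: int, period: int) -> List[List[Point]]:
    """Sampling points at the vertices of a rectangular lattice; grid[k][n] = (x, y)."""
    return [[(n * period, k * period) for n in range((width - 1) // period + 1)]
            for k in range((height - 1) // period + 1)]


def sample(smoothed: Image, points: List[List[Point]],
           warp: Optional[Callable[[Point], Point]] = None) -> Image:
    """Low-resolution image built by reading the smoothed image at the
    (optionally warped) sampling points, rounded to the nearest pixel.
    No interpolation is performed."""
    h, w = len(smoothed), len(smoothed[0])
    out = []
    for row in points:
        out_row = []
        for point in row:
            x, y = warp(point) if warp is not None else point
            # Round half up, then clamp to the image area (assumption).
            xi = min(max(math.floor(x + 0.5), 0), w - 1)
            yi = min(max(math.floor(y + 0.5), 0), h - 1)
            out_row.append(smoothed[yi][xi])
        out.append(out_row)
    return out


def error_image(warped: Image, reference: Image) -> Image:
    """e0(n,k) = warped(n,k) - reference(n,k)."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(warped, reference)]


def smooth_error(e0: Image) -> Image:
    """e(n,k) = mean of the 2x2 block of e0 whose top-left corner is (n,k)."""
    return [[(e0[k][n] + e0[k][n + 1] + e0[k + 1][n] + e0[k + 1][n + 1]) / 4
             for n in range(len(e0[0]) - 1)] for k in range(len(e0) - 1)]
```

Because warping only moves the sampling points, the smoothed image itself is never resampled; each iteration reads a different set of pixels from the same stored image.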
Thus, the present invention provides a method for image stabilization to improve the image quality of an image captured in a long exposure time. According to the present invention, the long exposure time is divided into several shorter intervals for capturing several image frames of the same scene. The exposure time for each frame is reasonably short in order to reduce the motion blur degradation of the individual frames. The final output image is obtained by combining the individual frames either during the time of their capturing or after they are all captured. The operations involved in the process of generating the final image from the individual frames are as follows:
1. Reference frame selection: Select a reference image frame among the available frames.
2. Global image registration: Register each image frame with respect to the reference frame.
3. Corresponding pixel identification and weighting: Identify the pixels in the given frames that correspond to the pixels of the reference image. Weight each pixel in the given frames according to the degree of similarity between the pixel and the corresponding reference pixel.
4. Pixel fusion: Calculate the final value of each image pixel in the given frames by combining its value in the reference image with its corresponding values in the other frames.
In sum, the method of image stabilization, according to the present invention, can be summarized in two operations as follows:
adjusting geometrically a plurality of image frames in reference to a reference frame for providing a plurality of adjusted image frames, wherein each of the reference frame and the adjusted image frames comprises a plurality of pixels, each pixel having a pixel value, wherein each of the pixels in at least an image section of each adjusted image frame has a corresponding pixel in the reference frame; and
determining a weighting factor for each pixel in said at least image section based on similarity between the pixel values of said each pixel and a corresponding pixel for generating a resulting image frame based on the pixel value of said each pixel adjusted by the weighting factor and the pixel value of the corresponding pixel in the reference frame.
If the reference image is not already selected, the reference frame can be selected among a plurality of input frames, based on different methods:
1) the reference frame is selected based on a sharpness measure of the input frames.
2) the reference frame is selected as the frame that has the shortest exposure time among the input frames. The frame that has the shortest exposure time can be the first frame of the input frames.
3) the reference frame is selected as the first frame among the input frames that meets a certain sharpness criterion. The frames that do not meet the sharpness criterion can be removed in order to save memory.
According to one embodiment of the present invention, the resulting image is generated based on a weighted average of the pixel value of said each pixel adjusted by the weighting factor in each of the plurality of image frames and the pixel value of the corresponding pixel in the reference frame.
According to one embodiment of the present invention, the image frames are adjusted based on a geometrical or coordinate transformation, which may include rotation, translation, affine transformation, nonlinear warping, enlarging, shrinking or any combination thereof. An image registration or comparison operation may be used to determine how each of the image frames is adjusted.
According to one embodiment of the present invention, the image registration or comparison operation may include low-pass filtering each of the plurality of other frames for providing a plurality of smoothed image frames and low-pass filtering the reference frame for providing a smoothed reference frame; and comparing a portion of each smoothed image frame to a corresponding portion of the smoothed reference frame for providing an error image portion so as to determine how each of the image frames is adjusted.
In order to carry out the image stabilization method, according to the various embodiments of the present invention, an image processing system is required. An exemplary image processing system is illustrated in
The resulting image frame as generated by the processor and the software application can be conveyed to a storage medium 252 for storage, to a transmitter module 254 for transmitting, to a display unit 256 for displaying, or to a printer 258 for printing.
The electronic device 1 can be a stand-alone digital camera, a digital camera disposed in a mobile phone or the like.
Thus, although the present invention has been described with respect to one or more embodiments thereof, it will be understood by those skilled in the art that the foregoing and various other changes, omissions and deviations in the form and detail thereof may be made without departing from the scope of this invention.
This application is based on and claims priority to a pending U.S. Provisional Patent Application Ser. No. 60/747,167, filed May 12, 2006, assigned to the assignee of the present invention.
Number | Date | Country
---|---|---
60747167 | May 2006 | US