This invention relates to digital image processing that stabilizes video.
One disadvantage of video stabilization is that, when the frames are cropped and resized repeatedly, the user may experience a zoom-in and zoom-out effect as the video is displayed. On the other hand, if the frames are not cropped and resized, the transformation that removes the unwanted camera motion may leave blank areas in the frames that are displayed to the user. Thus, what is needed is a method for stabilizing video that addresses these challenges.
Use of the same reference numbers in different figures indicates similar or identical elements.
In one embodiment of the invention, a method for stabilizing a video includes transforming a current frame to remove an unwanted camera motion from the current frame, cropping a portion of the transformed current frame located outside a field of view, transforming preceding and subsequent frames to place them into the local coordinate system of the current frame and to remove the unwanted camera motion from the preceding and the subsequent frames, and filling at least one blank area of the field of view with at least one of the transformed preceding and subsequent frames.
A line 306 interpolated (linearly or nonlinearly) through objects 302 in frames 1 to 7 represents the idealized camera motion, which is the actual camera motion minus any unwanted camera motion. Once the idealized camera motion is determined, an Affine transform can be determined for each frame that places that frame along the idealized camera motion 306 (hereafter referred to as “stabilizing transform”).
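As an illustration of the paragraph above, the following is a minimal sketch that fits a straight line through the tracked object positions and derives a per-frame correction. It makes several assumptions that are not part of the description: the interpolation is a least-squares linear fit, only the translation part of the stabilizing transform is computed (rotation is ignored), and the function names `fit_line` and `stabilizing_translations` are hypothetical.

```python
def fit_line(ts, vs):
    """Least-squares line v = a * t + b through the samples (ts, vs)."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_v = sum(vs) / n
    var_t = sum((t - mean_t) ** 2 for t in ts)
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs))
    a = cov / var_t
    return a, mean_v - a * mean_t


def stabilizing_translations(positions):
    """positions: one (x, y) object position per frame, frame 1 first.

    Returns one (tx, ty) per frame that moves the object onto the fitted
    line, which stands in here for the idealized camera motion.
    Assumption: translation only; rotation is not modeled.
    """
    ts = list(range(1, len(positions) + 1))
    ax, bx = fit_line(ts, [p[0] for p in positions])
    ay, by = fit_line(ts, [p[1] for p in positions])
    return [(ax * t + bx - x, ay * t + by - y)
            for t, (x, y) in zip(ts, positions)]
```

With positions that already lie on a straight line, every returned translation is approximately zero; with shaky positions, adding each translation to its position yields evenly spaced points along the fitted line.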
Once frames 1 to 7 are placed along the idealized camera motion 306, portions of frames outside of their original fields of view (FOVs) 308 (illustrated as dashed boxes in
In step 502, seven frames of a video are retrieved. For example, frames 1, 2, 3, 4, 5, 6, and 7 (
In step 504, the inter-frame transforms between consecutive frames are determined or retrieved if they have been previously determined. As described above, the inter-frame transforms can be determined from common POIs between consecutive frames.
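One possible way to obtain an inter-frame transform from common POIs is a least-squares fit of a rotation and a translation to the matched points. The sketch below is an assumption and not necessarily the estimator used in the described method; the counterclockwise rotation convention and the names `apply_rigid` and `estimate_rigid_transform` are also hypothetical.

```python
import math


def apply_rigid(theta, tx, ty, point):
    """Rotate point by theta (counterclockwise, assumed) and translate."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = point
    return (c * x - s * y + tx, s * x + c * y + ty)


def estimate_rigid_transform(src, dst):
    """Least-squares (theta, tx, ty) mapping POIs in src onto dst.

    src[i] and dst[i] are the coordinates of the same POI in two
    consecutive frames.
    """
    n = len(src)
    cx_s = sum(p[0] for p in src) / n
    cy_s = sum(p[1] for p in src) / n
    cx_d = sum(p[0] for p in dst) / n
    cy_d = sum(p[1] for p in dst) / n
    s_cos = s_sin = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        xs, ys, xd, yd = xs - cx_s, ys - cy_s, xd - cx_d, yd - cy_d
        s_cos += xs * xd + ys * yd   # proportional to cos(theta)
        s_sin += xs * yd - ys * xd   # proportional to sin(theta)
    theta = math.atan2(s_sin, s_cos)
    # The translation maps the rotated source centroid onto the destination centroid.
    rx, ry = apply_rigid(theta, 0.0, 0.0, (cx_s, cy_s))
    return theta, cx_d - rx, cy_d - ry
```

At least two distinct matched POIs are needed for the rotation to be determined; in practice more points would be used so that errors in individual matches average out.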
In step 506, the stabilizing transform for current frame 4 is determined or retrieved if it has been previously determined. As described above, the stabilizing transform can be determined from the idealized camera motion 306.
In step 508, current frame 4 is transformed using the stabilizing transform to remove the unwanted camera motion from current frame 4.
In step 510, current frame 4 is cropped to remove portions outside FOV 308. This leaves blank area 310 in FOV 308. Current frame 4 may have more than one blank area under other circumstances.
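Steps 508 and 510 can be pictured with a small sketch that reports which pixels of the FOV become blank once the frame is moved by its stabilizing transform. This is an illustration only: it assumes a rotation-plus-translation transform with a counterclockwise rotation convention, treats the FOV as a `width` by `height` pixel grid with its origin at the top-left corner, and uses the hypothetical name `blank_pixels`.

```python
import math


def blank_pixels(width, height, theta, tx, ty):
    """Return the set of (x, y) FOV pixels left uncovered after the frame
    is rotated by theta and translated by (tx, ty).

    Each FOV pixel is mapped back into the untransformed frame; if it
    lands outside that frame, no source pixel exists for it and it is blank.
    """
    c, s = math.cos(theta), math.sin(theta)
    blank = set()
    for y in range(height):
        for x in range(width):
            dx, dy = x - tx, y - ty
            # Inverse rotation (by -theta) of the translated-back position.
            sx = c * dx + s * dy
            sy = -s * dx + c * dy
            if not (0 <= sx <= width - 1 and 0 <= sy <= height - 1):
                blank.add((x, y))
    return blank
```

For example, shifting a 5 by 3 frame two pixels to the right with no rotation leaves the two leftmost columns of the FOV blank.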
In step 512, one of preceding frames 1, 2, 3 and subsequent frames 5, 6, 7 is selected.
In step 514, an Affine transform that places the selected frame in the local coordinate system of current frame 4 and removes the unwanted camera motion from the selected frame is determined (hereafter referred to as “compensating transform”). The compensating transform is determined from the known inter-frame transforms and the known stabilizing transform.
The inter-frame transform between frames 3 and 4 is:
where x3 and y3 are the coordinates of a pixel in frame 3, θ(3,4) is the rotation from frame 3 to frame 4, tx(3,4) and ty(3,4) are the translation from frame 3 to frame 4, and x4 and y4 are the coordinates of the pixel from frame 3 in the local coordinate system of frame 4.
The stabilizing transform for current frame 4 is:
where θ(4) is the rotation of frame 4 to remove unwanted camera motion, tx(4) and ty(4) are the translation of frame 4 to remove unwanted camera motion, and x4′ and y4′ are the coordinates of a transformed pixel from frame 4 after the removal of the unwanted camera motion.
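For reference, the inter-frame transform and the stabilizing transform can be written in vector form so that they agree term by term with equations 5 and 6. The explicit rotation matrix below assumes the conventional counterclockwise sign convention, which is an assumption made here for illustration; the symbols follow the definitions given above.

```latex
\begin{aligned}
\vec{X}_4  &= R(3,4)\,\vec{X}_3 + \vec{t}(3,4),
  &\quad \vec{X}_3 &= \begin{pmatrix} x_3 \\ y_3 \end{pmatrix},
  &\quad \vec{t}(3,4) &= \begin{pmatrix} t_x(3,4) \\ t_y(3,4) \end{pmatrix}, \\
\vec{X}_4' &= R(4)\,\vec{X}_4 + \vec{t}(4),
  &\quad \vec{X}_4 &= \begin{pmatrix} x_4 \\ y_4 \end{pmatrix},
  &\quad \vec{t}(4) &= \begin{pmatrix} t_x(4) \\ t_y(4) \end{pmatrix}, \\
R(\cdot)   &= \begin{pmatrix} \cos\theta(\cdot) & -\sin\theta(\cdot) \\
                              \sin\theta(\cdot) & \cos\theta(\cdot) \end{pmatrix}
  &&&&\text{(assumed sign convention)}.
\end{aligned}
```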
Thus, equation 1 is substituted in equation 3 to determine a compensating transform for frame 3 as follows:
$\vec{X}_4' = R(4)\left(R(3,4)\,\vec{X}_3 + \vec{t}(3,4)\right) + \vec{t}(4)$, or (5)
$\vec{X}_4' = R(4)\,R(3,4)\,\vec{X}_3 + R(4)\,\vec{t}(3,4) + \vec{t}(4)$. (6)
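Equation 6 amounts to composing two rotation-plus-translation transforms into one. The following is a minimal sketch of that composition; the representation of a transform as a pair of a 2x2 rotation matrix and a translation vector, the counterclockwise rotation convention, and the names `rotation`, `apply_transform`, and `compose` are assumptions made for illustration.

```python
import math


def rotation(theta):
    """2x2 rotation matrix (counterclockwise convention, assumed)."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))


def apply_transform(transform, point):
    """Apply X' = R X + t to a single (x, y) point."""
    r, t = transform
    x, y = point
    return (r[0][0] * x + r[0][1] * y + t[0],
            r[1][0] * x + r[1][1] * y + t[1])


def compose(outer, inner):
    """Single transform equal to applying inner first, then outer.

    With outer = (R(4), t(4)) and inner = (R(3,4), t(3,4)) this yields
    the rotation R(4) R(3,4) and the translation R(4) t(3,4) + t(4).
    """
    ro, to = outer
    ri, ti = inner
    r = tuple(tuple(sum(ro[i][k] * ri[k][j] for k in range(2))
                    for j in range(2)) for i in range(2))
    t = (ro[0][0] * ti[0] + ro[0][1] * ti[1] + to[0],
         ro[1][0] * ti[0] + ro[1][1] * ti[1] + to[1])
    return (r, t)
```

Applying the composed transform to a pixel of frame 3 gives the same result as applying the inter-frame transform and then the stabilizing transform, so each selected frame only needs to be resampled once.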
As one skilled in the art understands, the selection of frames that are more than once removed from current frame 4 would require the substitution of that frame's inter-frame transform into one or more additional inter-frame transforms of its neighboring frames up to current frame 4.
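The chaining described above can be sketched as follows. The sketch assumes rotation-plus-translation transforms stored as `(theta, tx, ty)` tuples with a counterclockwise rotation convention, and it assumes that `inter[i]` maps frame `i` into frame `i + 1`. For subsequent frames it applies the inverses of the inter-frame transforms, which is an inference from the geometry and is not spelled out in the description. All function names are hypothetical.

```python
import math


def apply_transform(tf, p):
    """Apply rotation theta and translation (tx, ty) to point p."""
    th, tx, ty = tf
    c, s = math.cos(th), math.sin(th)
    return (c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + ty)


def compose(outer, inner):
    """Transform equal to applying inner first, then outer."""
    rx, ry = apply_transform((outer[0], 0.0, 0.0), (inner[1], inner[2]))
    return (outer[0] + inner[0], rx + outer[1], ry + outer[2])


def invert(tf):
    """Inverse of X' = R X + t, namely X = R^-1 X' - R^-1 t."""
    rx, ry = apply_transform((-tf[0], 0.0, 0.0), (tf[1], tf[2]))
    return (-tf[0], -rx, -ry)


def compensating_transform(k, current, inter, stabilizing):
    """Map frame k into the stabilized coordinates of the current frame."""
    tf = (0.0, 0.0, 0.0)  # identity
    if k < current:
        # Preceding frame: chain k -> k+1 -> ... -> current.
        for i in range(k, current):
            tf = compose(inter[i], tf)
    else:
        # Subsequent frame: undo k-1 -> k, then k-2 -> k-1, down to current.
        for i in range(k - 1, current - 1, -1):
            tf = compose(invert(inter[i]), tf)
    return compose(stabilizing, tf)
```

For frame 2 and current frame 4, the result equals the inter-frame transform from 2 to 3, followed by the one from 3 to 4, followed by the stabilizing transform of frame 4.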
In step 516, the selected frame is transformed using the compensating transform.
In step 518, it is determined if there is any remaining preceding or subsequent frame. If so, then step 518 is followed by step 512 and method 500 repeats until all of the preceding and subsequent frames are placed in the local coordinate system of current frame 4 and the unwanted camera motion is removed from them. If there is no remaining preceding or subsequent frame, then step 518 is followed by step 520.
In step 520, a combination of the preceding and subsequent frames that uses the fewest frames to fill in blank area 310 in FOV 308 is selected. For simplicity, assume that only frames 1, 2, and 5 appear in blank area 310 as illustrated in
In step 522, for each overlapping area in blank area 310, the frame that is the closest in time to current frame 4 is selected. If two frames are equally close in time, then one of the frames is selected randomly. As illustrated in
In step 524, edges between current frame 4 and the filled in blank area 310 are blended to create a more natural merge of the different frames in the resulting frame 4.
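The selection logic of steps 520 and 522 can be sketched as below. The sketch assumes that the blank area is represented as a set of pixel coordinates and that, for each transformed frame, the set of blank pixels it can fill is already known. The exhaustive search over combinations and the names `smallest_covering_combo` and `pick_source_frames` are assumptions made for illustration, not the method's stated implementation.

```python
import itertools
import random


def smallest_covering_combo(coverage):
    """coverage: {frame_number: set of blank pixels that frame can fill}.

    Returns the smallest tuple of frames that fills every blank pixel
    that any frame can fill (exhaustive search, smallest sizes first).
    """
    target = set().union(*coverage.values())
    if not target:
        return ()
    frames = sorted(coverage)
    for size in range(1, len(frames) + 1):
        for combo in itertools.combinations(frames, size):
            if set().union(*(coverage[f] for f in combo)) >= target:
                return combo
    return ()


def pick_source_frames(coverage, combo, current, rng=random):
    """For each fillable pixel, pick the frame in combo closest in time
    to the current frame; ties are broken randomly."""
    chosen = {}
    for pixel in set().union(*(coverage[f] for f in combo)):
        candidates = [f for f in combo if pixel in coverage[f]]
        best = min(abs(f - current) for f in candidates)
        nearest = [f for f in candidates if abs(f - current) == best]
        chosen[pixel] = nearest[0] if len(nearest) == 1 else rng.choice(nearest)
    return chosen
```

With frames 1, 2, and 5 each covering part of the blank area and current frame 4, a pixel covered by both frames 2 and 5 is taken from frame 5, and a pixel covered by both frames 1 and 2 is taken from frame 2. Any pixel that no frame can fill is left for the crop-and-resize of step 526.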
In step 526, the resulting frame 4 is cropped and resized if there are any remaining blank areas in the field of view. Referring back to
Various other adaptations and combinations of features of the embodiments disclosed are within the scope of the invention. Numerous embodiments are encompassed by the following claims.
This application is related to U.S. application Ser. No. 10/003,329, attorney docket no. M-12237 US (ARC-P109), entitled “VIDEO STABILIZER,” filed Oct. 31, 2001, which is commonly assigned and incorporated by reference in its entirety.