The invention generally relates to image processing and, more particularly, the invention relates to image stabilization.
Image data, such as a video data stream, often can have artifacts introduced when a data capture device (e.g., a video camera) shakes while recording or otherwise capturing the image. Removal of such artifacts improves image fidelity.
In various embodiments of the invention, an apparatus and method stabilize video in real time. In one embodiment of the invention, averages of pixel position, weighted by the intensity or the hue associated with each pixel in a video image, are calculated. The weighted averages in the horizontal and vertical directions determine a location in the image called a centroid. The centroid is first calculated for a reference frame of the video data stream. Subsequent frames of the video are then translated so that their centroids coincide with the centroid of the reference frame. Thus, artifacts in the image due to camera “shake” are removed. In another embodiment of the invention, the video image frame is broken into regions or tiles. As before, the centroids of the tiles in a reference video image frame are calculated. The centroids of the tiles in each subsequent frame are then calculated. A simple curve fitting technique is used to determine the affine transform that will cause the image to coincide with the reference image. This embodiment of the invention can deal effectively with motion of the video capture device or camera that involves translation and rotation in a plurality of directions simultaneously.
Illustrative embodiments of the invention are implemented as a computer program product having a computer usable medium with computer readable program code thereon. The computer readable code may be read and utilized by a computer system in accordance with conventional processes.
The foregoing features of the invention will be more readily understood by reference to the following detailed description, taken with reference to the accompanying drawing, in which:
In various embodiments of the invention, an apparatus and method stabilize video in real time. In one embodiment of the invention, averages of pixel position, weighted by the intensity or the hue associated with each pixel in a video image, are calculated. The weighted averages in the horizontal and vertical directions determine a location in the image called a centroid. The centroid is first calculated for a reference frame of the video data stream. Subsequent frames of the video are then translated so that their centroids coincide with the centroid of the reference frame. Thus, artifacts in the image due to camera “shake” are removed. In another embodiment of the invention, the video image frame is broken into regions or tiles. As before, the centroids of the tiles in a reference video image frame are calculated. The centroids of the tiles in each subsequent frame are then calculated. A simple curve fitting technique is used to determine the affine transform that will cause the image to coincide with the reference image. This embodiment of the invention can deal effectively with motion of the camera that involves translation and rotation in a plurality of directions simultaneously.
Illustrative embodiments of the invention may be implemented as a computer program product having a computer usable medium with computer readable program code thereon. The computer readable code may be read and utilized by a computer system in accordance with conventional processes. Details of illustrative embodiments are discussed below.
System Operation
In an embodiment of the invention, as shown in
Centroid Calculation
The centroid is calculated using a weighted average. The centroid is defined as follows:
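The defining expression does not appear in this text. A form consistent with the surrounding description, in which ValueOfPixel weights each pixel position and the sums run over the range of interest, would be the following. The symbols X_c, Y_c, and R (the range of interest) are labels assumed here for illustration and are not taken from the original.

```latex
X_c = \frac{\sum_{(x,y) \in R} x \cdot \mathrm{ValueOfPixel}(x,y)}
           {\sum_{(x,y) \in R} \mathrm{ValueOfPixel}(x,y)},
\qquad
Y_c = \frac{\sum_{(x,y) \in R} y \cdot \mathrm{ValueOfPixel}(x,y)}
           {\sum_{(x,y) \in R} \mathrm{ValueOfPixel}(x,y)}
```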
The ValueOfPixel could be the intensity or hue of the pixel or another value calculated from appropriate attributes of each pixel. The summation takes place over the user-defined range of interest, which may be all or just a portion of the video frame. The centroid of the image is then compared to the previously calculated centroid of a reference frame (250). The image is then translated in the X and Y directions as needed so that the centroid of the image and the centroid of the reference image coincide (260, 270). These operations are then repeated on subsequent images in the video stream.
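The centroid comparison and translation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any embodiment: the frame is a list of rows of pixel values, the range of interest is the whole frame, the shift is rounded to whole pixels, and the names `centroid`, `translate`, and `stabilize` are hypothetical.

```python
from typing import List, Sequence, Tuple

Frame = List[List[float]]  # frame[y][x] holds the ValueOfPixel (e.g., intensity)


def centroid(frame: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return the (x, y) centroid of the frame, weighting positions by pixel value."""
    total = sum_x = sum_y = 0.0
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            total += value
            sum_x += x * value
            sum_y += y * value
    if total == 0:
        raise ValueError("centroid is undefined when all pixel values are zero")
    return sum_x / total, sum_y / total


def translate(frame: Sequence[Sequence[float]], dx: int, dy: int,
              fill: float = 0.0) -> Frame:
    """Shift the frame by (dx, dy) whole pixels; uncovered pixels receive `fill`."""
    height, width = len(frame), len(frame[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_x, src_y = x - dx, y - dy
            if 0 <= src_x < width and 0 <= src_y < height:
                out[y][x] = frame[src_y][src_x]
    return out


def stabilize(frame: Sequence[Sequence[float]],
              reference_centroid: Tuple[float, float]) -> Frame:
    """Translate the frame so its centroid lands on the reference centroid.

    Assumption: the shift is rounded to the nearest whole pixel.
    """
    cx, cy = centroid(frame)
    dx = round(reference_centroid[0] - cx)
    dy = round(reference_centroid[1] - cy)
    return translate(frame, dx, dy)
```

In use, the centroid of the reference frame would be computed once with `centroid(reference)`, and each subsequent frame would be passed through `stabilize` with that value. Sub-pixel shifts would need interpolation, which this sketch omits.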
Curve Fitting
To deal with complex motion of the video capture source, the image is broken into tiles. Each tile's centroid is then calculated as above. Each tile's centroid is compared to the corresponding centroid in a reference frame to determine its movement. These values are input into a curve fitting routine that determines the values of an affine transform. The transform accounts for translation, scale, rotation, yaw, and pitch. If each of these values is calculated successfully, then the video frame is passed to a transformation routine that rotates, scales, and translates the image appropriately. If a solution to the curve fit is not found, then the points of the reference frame are correlated against the points of the current frame. The “N” by “N” correlation yields the pairings of points with the highest correlation. All points whose correlation is above a user-selected threshold are used to establish a curve fit. If all attempts to establish a curve fit fail, then a new reference frame is established, and the process starts over.
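As an illustration of the curve fitting step, the sketch below fits a six-parameter 2D affine map from the tile centroids of the current frame to those of the reference frame. The text does not name a fitting method, so ordinary least squares is an assumption here, as are the function names `solve3`, `fit_affine`, and `apply_affine`. A return value of `None` stands in for the case in which no solution to the curve fit is found.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Affine = Tuple[Tuple[float, ...], Tuple[float, ...]]


def solve3(matrix: List[List[float]], rhs: List[float]) -> Optional[List[float]]:
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            return None  # singular system, e.g. collinear centroids
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]


def fit_affine(current: Sequence[Point],
               reference: Sequence[Point]) -> Optional[Affine]:
    """Least-squares affine map taking current centroids onto reference centroids.

    Returns ((a, b, c), (d, e, f)) with x' = a*x + b*y + c and
    y' = d*x + e*y + f, or None when no fit can be established.
    """
    if len(current) != len(reference) or len(current) < 3:
        return None
    normal = [[0.0] * 3 for _ in range(3)]
    rhs_x = [0.0] * 3
    rhs_y = [0.0] * 3
    # Accumulate the normal equations for rows of the form [x, y, 1].
    for (x, y), (ref_x, ref_y) in zip(current, reference):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                normal[i][j] += row[i] * row[j]
            rhs_x[i] += row[i] * ref_x
            rhs_y[i] += row[i] * ref_y
    params_x = solve3(normal, rhs_x)
    params_y = solve3(normal, rhs_y)
    if params_x is None or params_y is None:
        return None
    return tuple(params_x), tuple(params_y)


def apply_affine(transform: Affine, point: Point) -> Point:
    """Map a point through the fitted transform."""
    (a, b, c), (d, e, f) = transform
    x, y = point
    return a * x + b * y + c, d * x + e * y + f
```

A caller would treat a `None` result as the cue to fall back to the correlation step or, failing that, to establish a new reference frame. The general affine form used here covers translation, scale, and rotation; how an embodiment would parameterize yaw and pitch is not specified in the text and is not modeled.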
It should be noted that discussion of video data streams is exemplary and not intended to limit the scope of all embodiments. Rather, various embodiments apply to image data that can be represented graphically and recorded to some medium. In illustrative embodiments, the image data is recordable in 2D. Further, in various embodiments of the invention, the system can cause a new reference image frame to be designated periodically and its centroid or centroids calculated. This can be done based on a time parameter, the amount of translation of the image frames that occurs, or some other criterion.
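One way to express the criteria for designating a new reference frame is sketched below. The class name `ReferencePolicy`, its fields, and the default thresholds are hypothetical values chosen for illustration. The time parameter is counted in frames, and the amount of translation is measured as the length of the correction shift.

```python
from dataclasses import dataclass


@dataclass
class ReferencePolicy:
    """Illustrative thresholds for designating a new reference frame."""
    max_age_frames: int = 300        # assumed time parameter, counted in frames
    max_shift_pixels: float = 40.0   # assumed limit on the correction shift

    def needs_new_reference(self, frames_since_reference: int,
                            dx: float, dy: float) -> bool:
        """Return True when the reference is too old or the shift is too large."""
        too_old = frames_since_reference >= self.max_age_frames
        too_far = (dx * dx + dy * dy) ** 0.5 > self.max_shift_pixels
        return too_old or too_far
```

When the check returns true, the current frame would become the reference and its centroid or centroids would be recalculated.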
Various embodiments of the invention may be implemented at least in part in any conventional computer programming language. For example, some embodiments may be implemented in a procedural programming language (e.g., “C”), or in an object oriented programming language (e.g., “C++”). Other embodiments of the invention may be implemented as preprogrammed hardware elements (e.g., application specific integrated circuits, FPGAs, and digital signal processors), or other related components.
In some embodiments, the disclosed apparatus and methods may be implemented as a computer program product for use with a computer system. Such an implementation may include a series of computer instructions either fixed on a tangible medium, such as a computer readable medium (e.g., a diskette, CD-ROM, ROM, or fixed disk), or transmittable to a computer system via a modem or other interface device, such as a communications adapter connected to a network over a medium. The medium may be either a tangible medium (e.g., optical or analog communications lines) or a medium implemented with wireless techniques (e.g., Wi-Fi, microwave, infrared, or other transmission techniques). The series of computer instructions can embody all or part of the functionality previously described herein with respect to the system.
Those skilled in the art should appreciate that such computer instructions can be written in a number of programming languages for use with many computer architectures or operating systems. Furthermore, such instructions may be stored in any memory device, such as semiconductor, magnetic, optical or other memory devices, and may be transmitted using any communications technology, such as optical, infrared, microwave, or other transmission technologies.
Among other ways, such a computer program product may be distributed as a removable medium with accompanying printed or electronic documentation (e.g., shrink wrapped software), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over the network (e.g., the Internet or World Wide Web). Of course, some embodiments of the invention may be implemented as a combination of both software (e.g., a computer program product) and hardware. Still other embodiments of the invention are implemented as entirely hardware, or entirely software.
Although the above discussion discloses various exemplary embodiments of the invention, it should be apparent that those skilled in the art can make various modifications that will achieve some of the advantages of the invention without departing from the true scope of the invention.
This application claims priority from U.S. provisional patent application No. 60/603,768, filed Aug. 23, 2004, entitled “Real-Time Image Stabilization,” which is incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
20060061661 A1 | Mar 2006 | US |
Number | Date | Country | |
---|---|---|---|
60603768 | Aug 2004 | US |