The invention relates to stabilizing video obtained by a camera. More particularly, this invention relates to controlling a moveable camera and stabilizing video output from the moveable camera.
Tracking an object with a moving camera is a difficult task. When the camera is moving, simple change detection algorithms conventionally used to detect motion in fixed cameras cannot be used to detect object motion. These simple algorithms do not work in moving cameras because the moving camera produces changes all over the image. On the other hand, determining motion models using optical flow techniques can be rather inaccurate. Control commands for moveable cameras based on inaccurate motion models lead to the appearance of unwanted shaking during display of the resulting video.
What is needed is a system that can track a moving object with a moveable camera or can compensate for vehicle motion without producing unwanted shaking in the resulting video.
An exemplary embodiment of the invention may include a system for tracking motion of an object, a system for maintaining a moveable camera in an initial direction, and a method for tracking a moving object.
The system includes a moveable camera adapted to obtain a sequence of video images of an object; a determiner adapted to identify an object border in a current video image of the sequence of video images, the determiner being adapted to determine an object area and a background area based on the object border; an estimator adapted to estimate a camera motion estimate of the moveable camera and to estimate an object motion estimate of the object, the estimator being adapted to generate a camera motion model from the camera motion estimate and being adapted to generate an object motion model from at least one of the object motion estimate and the camera motion model; a stabilizer adapted to adjust at least one video image within the sequence of video images based on the camera motion model; and a controller adapted to control the moveable camera to track the object based on the object motion model and the camera motion model.
The system may further include an output device adapted to receive the adjusted video image, wherein the stabilizer warps the at least one video image during the adjustment of the at least one video image, wherein the stabilizer is adapted to stabilize the sequence of video images produced by the moveable camera while the controller simultaneously controls the moveable camera to track the object, wherein the controller is adapted to control the moveable camera to maintain the object within the outer border of each video image while the object is within the range of the camera, and wherein the moveable camera is adapted to perform at least one of pan, tilt, and zoom.
The system may also include wherein the stabilizer is adapted to generate a correction model from the camera motion model, wherein the stabilizer is adapted to filter the camera motion model and to generate the correction model based on a comparison of the camera motion model and the filtered camera motion model, wherein the stabilizer adjusts at least one of the sequence of video images using the correction model, and wherein the determiner is adapted to adjust the object border in a next video image based on the object motion model.
A system may also include a moveable camera adapted to obtain a sequence of video images, an estimator adapted to generate a camera motion estimate of motion of the moveable camera and to generate a camera motion model from the camera motion estimate; a controller adapted to receive an initial direction of an optical axis, the controller controlling the moveable camera to maintain the optical axis in the initial direction based on the initial direction and the camera motion model; and a stabilizer adapted to adjust the sequence of video images based on the camera motion model to stabilize the sequence of video images.
The system may also include wherein the stabilizer is adapted to stabilize the sequence of video images while the controller simultaneously controls the moveable camera to maintain the optical axis of the moveable camera in the initial direction, wherein the controller is adapted to control the moveable camera to at least one of pan, tilt, and zoom, wherein the stabilizer is adapted to generate a correction model based on the camera motion model, and wherein the stabilizer is adapted to adjust at least one of the sequence of video images using the correction model.
A method may include obtaining, at a moveable camera, a sequence of video images of a scene having an object; identifying an object border within a current video image that substantially surrounds the object; determining a background area and an object area of the current video image in the sequence of video images based on the object border; determining optical flow data of the background area and of the object area; calculating a camera motion model based on the optical flow data of the background area; calculating an object motion model based on the optical flow data of the object area; adjusting the object border based on the object motion model; calculating a correction model based on the camera motion model; and adjusting the current video image based on the correction model.
The method may also include wherein the adjusting the object border further comprises warping the object border based on the object model, wherein adjusting the current video image comprises warping the current video image based on the correction model, controlling the moveable camera to track the object based on the object motion model and on the camera motion model, and outputting the adjusted current video image to an output device.
Moreover, the above features and advantages of the invention are illustrative, and not exhaustive, of those which can be achieved by the invention. Thus, these and other features and advantages of the invention will be apparent from the description herein, both as embodied herein and as modified in view of any variations which will be apparent to those skilled in the art.
The foregoing and other features and advantages of the invention will be apparent from the following, more particular description of the exemplary embodiments of the invention, as illustrated in the accompanying drawings. The leftmost digits in each reference number indicate the drawing in which an element first appears.
Embodiments of the invention are explained in greater detail by way of the drawings, where the same reference numerals refer to the same or analogous features.
Exemplary embodiments of the invention are discussed in detail below. In describing embodiments, specific terminology is employed for the sake of clarity. However, the invention is not intended to be limited to the specific terminology so selected. While specific exemplary embodiments are discussed, it should be understood that this is done for illustration purposes only. A person skilled in the relevant art will recognize that other components and configurations can be used without departing from the spirit and scope of the invention.
An exemplary embodiment of the invention may relate to a system for fast automatic object tracking or for vehicle motion compensation using a moveable camera and for presenting a smooth visual impression at a display by stabilizing video output from the moveable camera. In an exemplary embodiment, tracking of the object and stabilization of the output video may be based on a simultaneous determination of camera motion and object motion. While the stabilization of the output video is achieved by compensating for the camera motion, both the camera motion and the object motion are needed to control the moveable camera in order to track the object. In an exemplary embodiment, each video image in a sequence of video images may be captured by the moveable camera and may be partitioned into an object area and a background area through the use of, e.g., but not limited to, optical flow estimations that may be used to create models for object motion and for camera motion, or through a user drawing an object border around an object using a selection device, such as, but not limited to, a mouse or joystick. The video sequence, which may be obtained by the moveable camera, may be stabilized by adjusting individual video images with motion corrections, which may be derived from temporally low-pass filtering, e.g., but not limited to, a camera motion model to smooth out changes between each video image caused by motion of the moveable camera, according to an exemplary embodiment. Additionally, in an exemplary embodiment, the moveable camera may receive control commands, which may be derived from both the camera motion model and the object motion model, and which may be used to track the object of interest and to keep the object of interest within the outer border of each video image while the object is within the range of the camera.
The invention is initially described with reference to
The Moveable Camera 101 may be adapted to record a scene and to produce a sequence of video images of the scene, where each video image may represent items and objects within the scene at a particular time. In an exemplary embodiment, the Moveable Camera 101 may be a video camera and may be able to perform one or more of, e.g., but not limited to, panning, zooming, and/or tilting, etc. In a further exemplary embodiment, the Moveable Camera 101 may be a Pan-Tilt-Zoom (PTZ) Camera. The Moveable Camera 101 may output the sequence of video images to, e.g., but not limited to, the Background Determiner 102 and to the Object Determiner 103 in an exemplary embodiment.
The Object Determiner 103 may receive the sequence of video images from the Moveable Camera 101 and may create a border 203 substantially around the outside of an object 210 of interest within the current video image (see
The Background Determiner 102 may receive the object area information on the border 203 from the Object Determiner 103 and may receive the sequence of video images from the Moveable Camera 101. Alternatively, the Object Determiner 103 may forward the sequence of video images to the Background Determiner 102. For the current video image, the Background Determiner 102 may use the border 203, which may be identified by the Object Determiner 103, to determine the complement of the object area as background area. In an exemplary embodiment, the area outside of the border 203 within the current video image may be a background area 202 (i.e., the complement), and the area inside of the border 203 within the current video image may be an object area 204 (see
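As an illustrative sketch only (assuming Python with NumPy and a simple rectangular border; the function and parameter names are hypothetical and not from this specification), the partition of an image into the complementary object area and background area can be expressed as a pair of boolean masks:

```python
import numpy as np

def split_areas(img_shape, border_rect):
    """Return (object_mask, background_mask) for a rectangular border.
    border_rect is (x0, y0, x1, y1); the object area is the inside of
    the border and the background area is its complement."""
    x0, y0, x1, y1 = border_rect
    obj = np.zeros(img_shape, bool)
    obj[y0:y1, x0:x1] = True      # inside the border: object area 204
    return obj, ~obj              # complement: background area 202
```

A non-rectangular border would be handled the same way, with the mask filled from the border polygon instead of a rectangle.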
The Camera Motion Estimator 104 may receive and use the background area information on the background area 202 to estimate a model of the camera motion based on optical flow data calculated between the current video image and one or more of the previous video images. The Camera Motion Estimator 104 then may forward parameters describing the camera motion model to the Video Stabilizer 106, which may temporally filter the parameters and may adjust the current video image in order to, e.g., but not limited to, generate a substantially smooth, non-shaking sequence of video images for display at the Display 108. The Camera Motion Estimator 104 also may forward the camera motion model parameters to the Object Determiner 103 and to the Control Unit 107, in an exemplary embodiment.
While the object area information is being forwarded to the Background Determiner 102, the Object Determiner 103 may simultaneously, independently, and/or consecutively forward the object area information to the Object Motion Estimator 105. The Object Motion Estimator 105 may use the object area information of the object area 204 to estimate a model of object motion by identifying optical flow of the object 210 within the sequence of video images. In one exemplary embodiment, the estimate may use at least the current video image and one or more of the previous video images. Once calculated, the Object Motion Estimator 105 may output the object motion model to the Object Determiner 103 for adjusting the object border in the next image and for controlling motion of the Moveable Camera 101, according to an exemplary embodiment.
As illustrated in the exemplary embodiment of
In
During operation, all of the pixel blocks in the background area 202 may be used by the Camera Motion Estimator 104 to estimate the camera motion model for the motion of the Moveable Camera 101 based on optical flow data. In one embodiment, the estimate of the camera motion model may be based on motion vectors (or displacements) obtained by matching the selected pixel blocks within the current video image with pixel blocks in one or more of the previous video images. This matching process may be supported by additional similar matching processes in corresponding images with rougher spatial resolution. In a robust least squares algorithm, a camera motion model (transformation) Tcurrent is determined, which maps the original positions of the pixel blocks to the displaced positions of the matching pixel blocks. The camera motion model may be described by four parameters, which may be used to identify translation, rotation, and scale. An advantage of using a robust least squares algorithm is that it may guarantee that the contribution of the false dark blocks, belonging to the object 210, will be detected as outliers and thus will not contribute to the camera motion model. Any other algorithm producing a camera motion model, which maps the original positions of the pixel blocks to the displaced positions of the matching pixel blocks, may also be used, as will be appreciated by those skilled in the art.
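One way such a robust least squares fit of the four-parameter model might be sketched (Python/NumPy; the specification does not detail the robust weighting, so a simplified median-based trimming scheme stands in for it, and all names are illustrative):

```python
import numpy as np

def fit_similarity(src, dst, n_iters=3, trim=2.5):
    """Fit a 4-parameter similarity transform (a, b, tx, ty) mapping
    src block positions to dst positions:
        u = a*x - b*y + tx,  v = b*x + a*y + ty
    Outliers (e.g. blocks that actually belong to the moving object)
    are trimmed iteratively using a median-based residual threshold."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    keep = np.ones(len(src), bool)
    params = np.array([1.0, 0.0, 0.0, 0.0])   # identity transform
    for _ in range(n_iters):
        x, y = src[keep, 0], src[keep, 1]
        A = np.zeros((2 * keep.sum(), 4))
        A[0::2] = np.c_[x, -y, np.ones_like(x), np.zeros_like(x)]
        A[1::2] = np.c_[y,  x, np.zeros_like(x), np.ones_like(x)]
        params, *_ = np.linalg.lstsq(A, dst[keep].reshape(-1), rcond=None)
        a_, b_, tx_, ty_ = params
        # Residuals over ALL blocks, then trim the large ones as outliers
        pred = np.c_[a_*src[:, 0] - b_*src[:, 1] + tx_,
                     b_*src[:, 0] + a_*src[:, 1] + ty_]
        res = np.linalg.norm(pred - dst, axis=1)
        new_keep = res < trim * (np.median(res) + 1e-3)
        if new_keep.sum() < 4:
            break
        keep = new_keep
    return params
```

Blocks belonging to the object 210 produce displacements inconsistent with the dominant background motion and are rejected by the trimming step, so they do not contribute to the final parameters.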
Likewise, all of the pixel blocks in the object area 204 may be used by the Object Motion Estimator 105 to estimate an object motion model for the motion of object 210 based on optical flow data. In one embodiment, the estimate of the object motion model may be based on motion vectors (or displacements) obtained by matching the selected pixel blocks within the current video image with pixel blocks in one or more of the previous video images. This matching process may be supported by additional similar matching processes in corresponding images with rougher spatial resolution. In a robust least squares algorithm, an object motion model (transformation) Mcurrent is determined, which maps the original positions of the pixel blocks to the displaced positions of the matching pixel blocks. The object motion model may be described by four parameters, which may be used to identify translation, rotation, and scale, or by just two parameters representing translation only. An advantage of using a robust least squares algorithm is that it may guarantee that the contribution of the false bright blocks, belonging to the real background (e.g., the pixels in the video image that do not correspond to the object 210), will be detected as outliers and thus will not contribute to the object motion model. Additionally, the camera motion model, calculated beforehand, can be used to identify outliers belonging to the real background by comparing the motion vectors detected in the object area 204 with corresponding motion vectors calculated from the camera motion model. In one embodiment, the pixel blocks with small deviations may be identified as outliers and may not be used in the least squares algorithm. Alternatively, the object motion model may be derived from the camera motion model and the object border 203.
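The comparison against the camera motion model described above might be sketched as follows (Python/NumPy; names and the fixed tolerance are illustrative assumptions, not from the specification). A block whose measured motion vector deviates only slightly from the vector the camera motion model predicts is flagged as belonging to the real background:

```python
import numpy as np

def background_outliers(positions, measured_vecs, cam_params, tol=1.0):
    """Flag blocks in the object area whose measured motion agrees with
    the camera motion model (a, b, tx, ty). True means the block likely
    belongs to the real background and should be excluded from the
    object motion fit."""
    positions = np.asarray(positions, float)
    measured = np.asarray(measured_vecs, float)
    a, b, tx, ty = cam_params
    # Motion vector the camera model predicts at each block position
    pred = np.c_[a*positions[:, 0] - b*positions[:, 1] + tx,
                 b*positions[:, 0] + a*positions[:, 1] + ty] - positions
    dev = np.linalg.norm(measured - pred, axis=1)
    return dev < tol   # small deviation -> background outlier
```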
The Camera Position Correction Calculator 404 may also receive the actual camera position that may be forwarded from the Moveable Camera 101 through the Camera Position Receiver 403. In an exemplary embodiment, after receiving the actual camera position, the Camera Position Correction Calculator 404 may compare the actual camera position with the camera position estimate to calculate a position correction. The comparison to calculate the position correction may determine an error in the camera position estimate, and the New Camera Position Estimator 402 may use the error in the camera position estimate to, e.g., but not limited to, update the camera position, and to minimize error between the actual camera position and the camera position estimate. After the position correction is calculated, the position correction may be transferred to the Moveable Camera 101 by a Control Data Sender 405 to, e.g., but not limited to, adjust the position of the Moveable Camera 101. It is noted that
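A minimal sketch of the position correction step, under the assumption of a simple proportional correction on pan/tilt angles (the specification does not prescribe the control law, and the names are hypothetical):

```python
def position_correction(estimated_pan_tilt, actual_pan_tilt, gain=1.0):
    """Correction to send to the camera so that the actual pan/tilt
    converges on the estimated target position; the difference is the
    error in the camera position estimate."""
    est_pan, est_tilt = estimated_pan_tilt
    act_pan, act_tilt = actual_pan_tilt
    return (gain * (est_pan - act_pan), gain * (est_tilt - act_tilt))
```

A real controller would typically add integral/derivative terms and account for command latency, but the error-driven structure is the same.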
Initially, the sequence of video input images may be received and stored in an Image Buffer 507, which may be adapted to output each stored video image to an Image Adjuster 504. The sequence of camera motion models may also be received at a separate input and may be stored at a Camera Motion Model Buffer 501. The Camera Motion Model Buffer 501 may be adapted to output the current camera motion model to a Camera Motion Model Filter 502. The Camera Motion Model Filter 502 may temporally filter the parameters of the current camera motion model (e.g., but not limited to, generated from the current video image and at least one or more of the previous video images) together with the corresponding parameters of previous camera motion models from a sequence of previous images stored in the Camera Motion Model Buffer 501 using, e.g., but not limited to, a special low-pass Finite Impulse Response (FIR) filter. The special low-pass FIR filter may be, for example, but not limited to, a Blackman filter. In general, any low-pass filter may be used, as will be appreciated by those skilled in the art. The special low-pass FIR filter may filter the parameters of the current camera motion model with the corresponding parameters of a sequence of previous images to remove any large parameter fluctuations or differences between the parameters of the current camera motion model and of the previous camera motion models.
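The temporal low-pass filtering of a single motion-model parameter with a Blackman window might be sketched as follows (Python/NumPy; the buffer length and the per-parameter treatment are illustrative assumptions):

```python
import numpy as np

def blackman_lowpass(history):
    """Low-pass filter one motion-model parameter over a buffer of its
    recent values (oldest first, current value last) using normalized
    Blackman window coefficients as FIR filter taps."""
    h = np.asarray(history, float)
    if len(h) < 3:
        return float(h[-1])       # not enough history to filter
    w = np.blackman(len(h))
    w /= w.sum()                  # so a constant sequence passes unchanged
    return float(np.dot(w, h))
```

Applied independently to each of the four model parameters, this suppresses frame-to-frame fluctuations (shake) while preserving the slow, intentional component of the camera motion.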
After filtering the current camera motion model, the Camera Motion Model Filter 502 may output the filtered current camera motion model Tfiltered together with the current camera motion model Tcurrent to a Correction Model Calculator 503. The Correction Model Calculator 503 may calculate the current correction model ΔTcurrent from the current camera motion model Tcurrent and the filtered current camera motion model Tfiltered together with the previous correction model ΔTprevious by the composition ΔTcurrent = Tfiltered ∘ ΔTprevious ∘ Tcurrent⁻¹. The correction model may be adapted to correct, e.g., shaking in the sequence of video images that may appear as a result of movement by, e.g., the Moveable Camera 101 and/or a vehicle or a device on which the Moveable Camera 101 is mounted.
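Representing each four-parameter model as a 3×3 homogeneous matrix, the composition ΔTcurrent = Tfiltered ∘ ΔTprevious ∘ Tcurrent⁻¹ becomes a matrix product (Python/NumPy sketch; function names are illustrative):

```python
import numpy as np

def similarity_matrix(a, b, tx, ty):
    """3x3 homogeneous matrix for the 4-parameter model (a, b, tx, ty)."""
    return np.array([[a, -b, tx],
                     [b,  a, ty],
                     [0,  0, 1.0]])

def correction_model(T_filtered, dT_previous, T_current):
    """Current correction model: T_filtered composed with the previous
    correction and the inverse of the current camera motion model."""
    return T_filtered @ dT_previous @ np.linalg.inv(T_current)
```

Intuitively, Tcurrent⁻¹ undoes the measured camera motion for this frame, ΔTprevious keeps the accumulated correction consistent with earlier frames, and Tfiltered reapplies only the smoothed motion, so the residual shake is removed.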
Once calculated, the Correction Model Calculator 503 may forward the correction model to the Image Adjuster 504. The Image Adjuster 504 may adjust the current video image using the correction model to produce a video output such that successive video images in the video output to Display 108 may appear substantially smooth and non-shaking to a user. The Image Adjuster 504 may adjust the individual video images using the correction model, e.g., but not limited to, by a correction warp. A correction warp is a transformation of a video image to a corrected image, wherein the value of each pixel in the corrected image is interpolated from the values of the pixels in the neighborhood of a point in the image, displaced from the pixel by a motion vector derived from the correction model.
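The correction warp described above might be sketched for a grayscale image as backward mapping with bilinear interpolation (Python/NumPy; border handling by clamping is an illustrative choice, not from the specification):

```python
import numpy as np

def warp_image(img, T):
    """Warp a grayscale image by the 3x3 correction model T: each output
    pixel is bilinearly interpolated from the input image at the source
    location obtained by applying T's inverse (backward mapping)."""
    H, W = img.shape
    inv = np.linalg.inv(T)
    ys, xs = np.mgrid[0:H, 0:W]
    # Source coordinates for every output pixel
    sx = inv[0, 0]*xs + inv[0, 1]*ys + inv[0, 2]
    sy = inv[1, 0]*xs + inv[1, 1]*ys + inv[1, 2]
    x0 = np.clip(np.floor(sx).astype(int), 0, W - 2)
    y0 = np.clip(np.floor(sy).astype(int), 0, H - 2)
    fx = np.clip(sx - x0, 0, 1)
    fy = np.clip(sy - y0, 0, 1)
    # Bilinear blend of the four neighboring input pixels
    return (img[y0,     x0    ] * (1 - fx) * (1 - fy) +
            img[y0,     x0 + 1] * fx       * (1 - fy) +
            img[y0 + 1, x0    ] * (1 - fx) * fy +
            img[y0 + 1, x0 + 1] * fx       * fy)
```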
In contrast with the exemplary embodiment depicted in
The exemplary embodiment and examples discussed herein are non-limiting examples.
The invention is described in detail with respect to exemplary embodiments, and it will now be apparent from the foregoing to those skilled in the art that changes and modifications may be made without departing from the invention in its broader aspects, and the invention, therefore, as defined in the claims is intended to cover all such changes and modifications as fall within the true spirit of the invention.
This application claims priority to provisional U.S. patent application Ser. No. 60/616,857, filed Oct. 8, 2004, which is hereby incorporated by reference in its entirety.