There are many ways in which motion may be estimated between two images. This motion may be described by a set of motion parameters that describe motion of luminance of pixels from a first image to a second image. These motion parameters may be defined at a time associated with either or both of the first and second images, or may be defined at a time between the first and second images. Thus, a vector for each pixel describes the motion of the pixel from one image to the next.
This motion estimate may be computed by using a gradient-based method, of which an example is a technique referred to as computing the “optical flow” between the images, or by using a correlation-based method. The “constant brightness constraint,” the assumption underlying the computation of optical flow, may be violated because of a change in an object's position relative to light sources, an object's specularity, an overall luminance change, a lack of similarity between the input images, an object entering or leaving a scene, or an object becoming revealed or occluded. If a violation of the constant brightness constraint occurs, the motion vectors from one coherent area of motion may spread into another coherent area of motion.
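By way of illustration, a dense gradient-based estimate may be obtained with any of a number of off-the-shelf routines; the following sketch uses the Farneback optical flow implementation in OpenCV, which is one possible choice and not a requirement of the techniques described herein.

# A minimal sketch of dense, gradient-based motion estimation between two
# 8-bit grayscale images, using OpenCV's Farneback optical flow. Any estimator
# that yields a per-pixel vector map may be substituted.
import cv2

def estimate_motion(image_a, image_b):
    # flow[y, x] holds (u, v), the motion of the pixel at (x, y) from
    # image_a to image_b.
    flow = cv2.calcOpticalFlowFarneback(image_a, image_b, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    return flow  # shape (height, width, 2), dtype float32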
If erroneous motion information is used to perform various image processing operations, such as morphing, warping, interpolation, motion effects, and motion blurring, visible artifacts are seen in the resulting output images. For example, in some cases, the foreground may appear to stretch or distort, or the background may appear to stretch or distort, or both.
Visible artifacts in images created using image processing based on motion vector maps may be reduced by providing one or more mechanisms for correcting the vector map. In general, the set of motion vectors is changed by selecting one or more portions of the image. The vectors corresponding to the selected one or more portions are modified. Various image processing operations, such as motion compensated interpolation, may be performed using the changed set of motion vectors. Various mechanisms for obtaining a changed set of motion vectors may be used separately or in combination by a user.
In one method, a region in an image may be defined. The region may be segmented into foreground and background regions. A tracker then may be used to track either the foreground region or the background region or both. A single motion vector or a parameterized motion model obtained from the tracker may be assigned to the tracked region.
In another method, a combination map may be defined to control which pixels of the input images are used to contribute to each pixel of an output image based on how a motion vector transforms a pixel from the input image to the output image. The combination map is used with a specified region to which one or more motion vectors are assigned. Such combination maps may be used in combination with the tracker described above.
In another method, a color image is generated using the motion vectors. The color image then may be modified using conventional color modification tools. The modified color image then may be converted back to motion vectors.
In another method, a user-specified transform between the two images may be used to define a set of vectors that correspond in time and resolution to the motion vectors estimated between the two images. The set of vectors may be combined with the estimated motion vectors to produce a set of vectors used for image processing operations.
Accordingly, in one aspect, motion vectors are generated by determining a set of motion vectors that describes motion between the first and second images. The set of motion vectors is changed by selecting one or more regions in the image and modifying the vectors corresponding to the selected one or more regions. Image processing operations may be performed on the images using the changed set of motion vectors.
The set of motion vectors may be changed by identifying a foreground region and a background region in the first and second images. Tracking is performed on at least one of the foreground region and the background region to determine a motion model for the tracked region. The set of motion vectors corresponding to the tracked region is changed according to the motion model for the tracked region.
The motion vectors may be changed by identifying a foreground region and a background region in the first and second images. A combination map is defined to limit how the vector map is applied to transform a pixel from the input image to the output image.
The set of motion vectors may be changed by receiving an indication of a user specified transform between the first and second images. Vectors are computed using the user specified transform and corresponding in time and in resolution with the motion vectors defined by motion estimation. The computed vectors are combined with the set of motion vectors. In one embodiment, the user specified transform may be defined by at least one point in a first image and at least one corresponding point in the second image. A transform for warping the first image to the second image that maintains correspondence between the at least one point in the first image and the at least one point in the second image may be determined. The vectors are computed by determining, for each pixel, a set of transform vectors that describe the spatial transform of the region of the first image to the corresponding region in the second image. The user specified transform also may be defined by at least one line, and/or by at least one region.
The set of motion vectors may be changed by displaying to the user a color image defined by the set of motion vectors. The user is allowed to modify the color image defined by the set of motion vectors. The set of motion vectors is then changed according to the modified color image.
There are several techniques that may be used to reduce visible artifacts in images processed using motion estimates. In general, the vector map that estimates the motion between images may be corrected to change the motion vectors that are causing the artifacts. In particular, one or more regions in the image may be selected. The vectors corresponding to the selected one or more regions are modified. Various image processing functions then may be performed using the changed set of motion vectors.
A first example technique involves identifying foreground and background regions, and applying the motion vectors differently depending on how they relate to the foreground and background in the two images. This technique is described below in connection with
Referring now to
The following are three approaches for fixing the set of motion vectors using the segmented regions of the image. The particular approach used to remove an artifact may be selected by providing a mechanism through a user interface through which the user may indicate the kind of artifact that is present in the image.
The first approach, as shown in the flow chart of
Any suitable tracker, such as those used for stabilization and object tracking applications, may be used. The result of the tracker is a parameterized motion model describing the motion of the tracked region. The parameterized motion model may be a single motion vector, which describes translational motion, an affine motion model defined by six parameters, or a projective motion model defined by eight parameters. The motion model is used to generate a new set of per-pixel motion vectors.
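By way of illustration, a six-parameter affine model reported by a tracker may be expanded into per-pixel vectors as sketched below; the parameter layout shown is an assumption made for the example, since trackers report their models in different forms.

import numpy as np

def affine_model_to_vectors(params, height, width):
    # params = (a11, a12, a21, a22, tx, ty): assumed layout mapping a pixel
    # (x, y) in the first image to (a11*x + a12*y + tx, a21*x + a22*y + ty)
    # in the second image.
    a11, a12, a21, a22, tx, ty = params
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    u = (a11 * xs + a12 * ys + tx) - xs   # horizontal motion per pixel
    v = (a21 * xs + a22 * ys + ty) - ys   # vertical motion per pixel
    return np.dstack([u, v])              # new vectors for the tracked region

A purely translational result reduces to a11 = a22 = 1 and a12 = a21 = 0, in which case every pixel of the tracked region receives the same vector (tx, ty).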
Each pixel in the entire background region, excluding region 110, is then assigned a motion vector according to the parameterized motion model provided by the tracker (step 206). Next, for each pixel in the bounding box 110, it is determined whether the original motion vector for the pixel or the new motion vector for the background is to be used. As shown in
Given the modified set of motion vectors, a “combination map,” such as shown in
In a second approach, shown in
In a third approach, as shown in the flow chart of
It should be understood that the combination map as described above may be used with motion vectors assigned through another process to a region, without using a tracker. Given an identified region, the combination map limits how the vector map is applied to the identified region to transform a pixel from the input image to the output image.
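One way such a combination map might be constructed, assuming for illustration a binary mask for the identified region and an output image located a fraction t of the way between the input images, is sketched below; the forward-mapping rule used here is an assumption made for the example, not a requirement.

import numpy as np

def build_combination_map(region_mask, vectors, t=0.5):
    # region_mask: boolean (height, width) mask of the identified region.
    # vectors: (height, width, 2) motion vectors assigned to that region.
    # Marks each output-image pixel that the region lands on when its pixels
    # are moved a fraction t of the way along their vectors; only those output
    # pixels take a contribution governed by the region's vectors.
    height, width = region_mask.shape
    combination_map = np.zeros((height, width), dtype=bool)
    ys, xs = np.nonzero(region_mask)
    dest_x = np.clip(np.round(xs + t * vectors[ys, xs, 0]).astype(int), 0, width - 1)
    dest_y = np.clip(np.round(ys + t * vectors[ys, xs, 1]).astype(int), 0, height - 1)
    combination_map[dest_y, dest_x] = True
    return combination_map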
Another technique for changing motion vector maps uses a user interface to display a representation of the motion vectors to allow a user to correct one or more individual motion vectors. In particular, the motion vectors may be encoded as a color image that is displayed to the user. The results of changes to the motion vectors may be shown interactively by updating the output image generated by processing the two input images using the updated motion vectors. A number of options may be presented to the user to change the set of motion vectors. For instance, a user may be permitted to define a region of vectors. The user may provide a single value for the whole region, or a separate value for each of several individual pixels in the region. Alternatively, a single value could be assigned automatically to the region, for example by computing an average value of a different region of vectors, or other values may be assigned automatically. As another example, a planar perspective model could be applied to a planar patch to generate vectors.
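As an illustration of the automatic option, the sketch below replaces the vectors of a selected region with the average vector of a user-chosen reference region; both masks are assumed inputs introduced only for this example.

import numpy as np

def fill_region_with_reference_average(vectors, selected_mask, reference_mask):
    # vectors: (height, width, 2) motion vector map.
    # selected_mask, reference_mask: boolean (height, width) masks chosen by
    # the user; the reference region is assumed to be non-empty.
    corrected = vectors.copy()
    mean_vector = vectors[reference_mask].mean(axis=0)  # average (u, v) over the reference region
    corrected[selected_mask] = mean_vector              # one value for the entire selected region
    return corrected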
Referring to
In one embodiment, the vector color encoder maps the magnitude of each vector to a luma component and the direction to the chroma components. Given the motion estimate for a pixel as a vector with components u and v in the x and y directions, the following equations may be used to compute a corresponding luma value (e.g., Y) and chroma values (e.g., U, V):
Y=sqrt(u*u+v*v);
theta=−atan2(v,u);
U=f(u,v)*cos(theta); and
V=f(u,v)*sin(theta).
If saturation is constant, then f(u,v)=0.5. If saturation is not constant, f(u,v) may be, for example, defined by the following formula:
(|v|−|v|min)/(|v|max−|v|min)
where |v| is the magnitude of a vector, |v|min is the minimum magnitude from among the vectors, and |v|max is the maximum magnitude from among the vectors. With these values calculated for the motion vector for each pixel, a color image is generated.
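The encoding above, for the constant-saturation case, may be sketched as follows; the array layout is an assumption made for the example.

import numpy as np

def vectors_to_yuv(vectors):
    # vectors: (height, width, 2) array of per-pixel motion components (u, v).
    u = vectors[..., 0]
    v = vectors[..., 1]
    Y = np.sqrt(u * u + v * v)     # luma carries the vector magnitude
    theta = -np.arctan2(v, u)      # direction determines the chroma phase
    saturation = 0.5               # constant-saturation case: f(u, v) = 0.5
    U = saturation * np.cos(theta)
    V = saturation * np.sin(theta)
    return np.dstack([Y, U, V])    # color image representing the vector map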
It should be understood that the color encoder may convert the motion vectors into values in any color space, including but not limited to RGB, HSL, HSV, YUV, YIQ and CMYK.
The inverse operations may be performed to take a color image and create a vector map. Formulas describing these inverse operations are:
theta=acos(U/f(u,v));
u=Y*cos(theta); and
v=Y*sin(theta).
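A sketch of the inverse follows; it recovers the angle from both chroma components with atan2 rather than from U alone, a small deviation from the arc-cosine form above that avoids losing the sign of the vertical component.

import numpy as np

def yuv_to_vectors(color_image):
    # color_image: (height, width, 3) array of (Y, U, V) values produced by
    # the forward encoding sketched above.
    Y = color_image[..., 0]
    U = color_image[..., 1]
    V = color_image[..., 2]
    theta = np.arctan2(V, U)
    u = Y * np.cos(theta)
    v = -Y * np.sin(theta)   # the forward encoding used theta = -atan2(v, u)
    return np.dstack([u, v])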
In another embodiment, the color image representing the motion vectors may use the luminance of an image as its luma, with the Cr and Cb values computed in the manner described above. The image may be one of the input images, a blend of the input images, a blend of the input images warped using the motion vectors, or one of the input images warped using the motion vectors.
In any conversion of the motion vectors to a color, the range of the vectors that are converted into color also may be limited, by defining a minimum and maximum value for the vector magnitude. In this embodiment, the luma component of a pixel in the color image representing the motion vectors is the luminance of the original image. Instead of using the value 0.5 as a coefficient in the equations above, the coefficient is computed as a percentage within the range of values between the minimum and maximum values. For example, if the magnitude of the vector is outside of the range of values between the minimum and maximum values, the chroma components may be zero (indicating zero saturation), or one (indicating maximum saturation), or may be clipped to the minimum or maximum value. If the magnitude of the vector is inside the range of values between the minimum and maximum values, the coefficient is defined by the fraction of the difference between the magnitude and the minimum value over the difference between the maximum and the minimum value. A user interface may be provided to allow a user to adjust the minimum and maximum values.
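A sketch of this range-limited coefficient, using the clipping option mentioned above for out-of-range magnitudes, follows.

import numpy as np

def saturation_coefficient(magnitude, min_magnitude, max_magnitude):
    # Fraction of the way the vector magnitude sits within the user-set range;
    # magnitudes outside the range are clipped to its endpoints here (zero or
    # full saturation are the other options mentioned above). Assumes
    # max_magnitude > min_magnitude.
    clipped = np.clip(magnitude, min_magnitude, max_magnitude)
    return (clipped - min_magnitude) / (max_magnitude - min_magnitude)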
To allow a user to understand the color image and how it conveys information about the motion vectors that it represents, a color legend, such as illustrated by
Various visual effects also may be provided by the color image generated from the vector maps. For example, the color image that is created may be blended with the luminance of the original image. The color in the blended image highlights characteristics of motion in the original image. For some image sequences, the color image created from the vector maps may be stored as a clip. This conversion of the motion information into an image may provide an interesting visual artistic effect. These effects also may be viewed by a user to provide another way of visualizing the motion information that has been estimated for the images.
Another way to correct artifacts using motion vectors involves using spatial warping and morphing techniques. Such techniques involve defining a mesh on an image, transforming the mesh, and generating an image using the transformed mesh to sample the original image. The mesh is defined by basic shapes called facets, which may be a polygon, such as a quadrilateral or a triangle. The transform of the mesh may be specified manually. A mesh also may be specified by creating a mesh around points, lines and shapes. A user may specify start and destination positions of these points, lines and shapes. The transform of the mesh is derived from the specified mesh, and the specified start and destination positions of points, lines, curves and shapes that are used to define the mesh. The transform is used to warp two images (representing the start and the destination) towards each other; the warped images are then blended together to produce an output image. Such techniques are described, for example, in Digital Image Warping, by George Wolberg, IEEE Computer Society Press, 1990, and “Smooth interpolation to scattered data by bivariate piecewise polynomials of odd degree,” by R. H. J. Gmelig Meyling et al., Computer Aided Geometric Design, Vol. 7, pp. 439–458, 1990, and “Piecewise Cubic Mapping Functions for Image Registration,” by A. Goshtasby, Pattern Recognition, Vol. 20, No. 5, pp. 525–533, 1987, and “A method of bivariate interpolation and smooth surface fitting for irregularly distributed data points,” by H. Akima, ACM Transactions on Mathematical Software, Vol. 4, No. 2, June 1978, pp. 148–159, and “A triangle-based C1 interpolation method,” by R. J. Renka et al., Rocky Mountain Journal of Mathematics, Vol. 14, No. 1, 1984, pp. 223–237, and “Approximation and geometric modeling with simplex B-splines associated with irregular triangles,” by S. Auerbach et al., Computer Aided Geometric Design, Vol. 8, pp. 67–87, 1991, and “A piecewise linear mapping function for image registration,” by A. Goshtasby, Pattern Recognition, Vol. 19, pp. 459–466, 1986.
The transforms that are derived for mesh warping and morphing can be used to generate a set of motion vectors that describe the transform. For example, referring now to
Typically the motion vectors are centered on one of the images or may be centered at the midpoint or other location between the images. For example, as described in U.S. patent application Ser. No. 09/657,699 filed Sep. 8, 2000, and hereby incorporated by reference, this point may be the midpoint between the two images, as indicated in
In the following example, the motion vectors are centered at the midpoint between two images. Using the transform specified by the mesh for the warping operation, given a point (x,y) in the output image, the location (x1,y1) in the source image that is resampled to generate that output point is determined. The motion vector (u,v) for point (x,y) is computed as twice the difference between the pixel location (x,y) and the pixel location (x1,y1). In particular, u=2*(x−x1) and v=2*(y−y1). The difference in locations is doubled because the motion vectors represent motion between the two images and are centered at the midpoint between them. Thus, as indicated in
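Assuming the warp is available as a sampling map giving, for each output pixel (x,y), the source location (x1,y1) that it resamples, the doubling described above may be sketched as follows.

import numpy as np

def warp_map_to_midpoint_vectors(source_x, source_y):
    # source_x, source_y: (height, width) arrays giving, for each output pixel
    # (x, y), the location (x1, y1) in the source image that the warp resamples.
    height, width = source_x.shape
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    u = 2.0 * (xs - source_x)   # u = 2 * (x - x1)
    v = 2.0 * (ys - source_y)   # v = 2 * (y - y1)
    return np.dstack([u, v])    # vectors centered at the midpoint between the images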
Using this technique, any user specified spatial transform may be used to generate vectors that correspond to the vector map generated using motion estimation.
Referring now to
An illustration of such correction will now be described in connection with
An alternative approach that also uses user-specified transforms involves coarsely aligning objects by warping the input images with a user-specified transform, using conventional warping techniques. Motion between the warped images is then estimated, and the warped images are processed using the estimated motion to perform a fine alignment of the two objects. This process is illustrated in
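A sketch of this coarse-then-fine sequence follows, assuming the user-specified transform is available as a 2x3 affine matrix and reusing the off-the-shelf flow estimator from the earlier sketch; both are assumptions made for the example, not requirements of the approach.

import cv2
import numpy as np

def coarse_then_fine_alignment(image_a, image_b, user_affine):
    # image_a, image_b: 8-bit grayscale images; user_affine: 2x3 affine matrix.
    height, width = image_a.shape[:2]
    # Coarse step: warp image_a toward image_b with the user-specified transform.
    coarse_a = cv2.warpAffine(image_a, user_affine, (width, height))
    # Estimate the residual motion between the coarsely aligned image and image_b.
    flow = cv2.calcOpticalFlowFarneback(coarse_a, image_b, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    # Fine step: sample image_b along the residual vectors so that it is
    # finely aligned with the coarsely warped image_a.
    xs, ys = np.meshgrid(np.arange(width, dtype=np.float32),
                         np.arange(height, dtype=np.float32))
    aligned_b = cv2.remap(image_b, xs + flow[..., 0], ys + flow[..., 1],
                          cv2.INTER_LINEAR)
    return coarse_a, aligned_b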
Having now described a few embodiments, it should be apparent to those skilled in the art that the foregoing is merely illustrative and not limiting, having been presented by way of example only. Numerous modifications and other embodiments are within the scope of one of ordinary skill in the art and are contemplated as falling within the scope of the invention.