1. Field of the Invention
The invention relates generally to apparatus and methods for generating three dimensional surface models of moving objects.
2. Background of the Invention
The generation of three dimensional models of moving objects has uses in a wide variety of areas, including motion pictures, computer graphics, video game production, human movement analysis, orthotics, prosthetics, surgical planning, surgical evaluation, sports medicine, sports performance, product design, military training, and ergonomic research.
Two existing technologies are currently used to generate these moving 3D models. Motion capture techniques are used to determine the motion of the object, using retro-reflective markers such as those produced by Motion Analysis Corporation or Vicon Ltd., active markers such as those produced by Charnwood Dynamics, magnetic field detectors such as those produced by Ascension Technologies, direct measurement such as that provided by MetaMotion, or the tracking of individual features such as that performed by Peak Performance or Simi. While these technologies are able to capture motion, they do not produce a full surface model of the moving object; rather, they track a number of distinct features that represent a few points on the surface of the object.
To supplement the data generated by these motion capture technologies, a 3D surface model of the static object can be generated. For these static objects, a number of technologies can be used for the generation of full surface models: laser scanning such as that accomplished by CyberScan, light scanning such as that provided by Inspeck, direct measurement such as that accomplished by Direct Dimensions, and structured light such as that provided by Eyetronics or Vitronic.
While it may be possible to use existing technologies in combination, only a static model of the surface of the object is captured. A motion capture system must then be used to determine the dynamic motion of a few features on the object. The motion of the few feature points can be used to extrapolate the motion of the entire object. In graphic applications, such as motion pictures or video game production applications, it is possible to mathematically transform the static surface model of the object from a body centered coordinate system to a global or world coordinate system using the data acquired from the motion capture system.
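As a minimal sketch of the body-to-world transformation just described, the following Python applies a single rotation and translation to every vertex of a static surface model. The function names, the sample vertices, and the pose values are hypothetical and chosen only for illustration; in practice the pose would be derived from the motion capture data for each frame.

```python
import math


def rotation_z(theta):
    """Return a 3x3 rotation matrix for a rotation of theta radians about Z."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def body_to_world(points, rotation, translation):
    """Apply p_world = R * p_body + t to every vertex of the static model."""
    world = []
    for p in points:
        world.append(tuple(
            sum(rotation[r][k] * p[k] for k in range(3)) + translation[r]
            for r in range(3)
        ))
    return world


# Hypothetical static surface model (three vertices in body coordinates).
static_model = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# Hypothetical pose for one frame, as a motion capture system might supply it.
pose_rotation = rotation_z(math.pi / 2)
pose_translation = (10.0, 0.0, 0.0)

world_model = body_to_world(static_model, pose_rotation, pose_translation)
```

Because every vertex receives the same rotation and translation, the transformed model keeps the exact shape of the static scan.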
All of these surface generation systems are designed to operate on static objects. Even when used in combination with a motion capture system, as described above, an object that is not a strictly rigid body is not correctly transformed from a body centered coordinate system, as a single static surface model does not adequately represent the non-rigid motion of the object. Therefore, there exists a need for systems and methods that can produce a model of the surface of a three dimensional object, with the object possibly in motion and the object possibly deforming in a non-rigid manner.
A device is needed that has the capability to operate on an object that is moving relative to the device while the device itself is possibly in motion. In order to achieve this goal, a novel device is provided wherein the optical characteristics of the device are under the control of an operator. In one embodiment a device that projects a pattern onto one aspect of the object is disclosed. The projection is in focus regardless of the distance between the object and the device, within a given range. In addition, the size of the pattern projected onto the object is under operator control. Furthermore, the imaging components of the device may also be under operator control, so that the zoom, focus, and aperture of the imaging components can be set to operate optimally, depending on where the object is in relation to the device.
The drawings illustrate the design and utility of preferred embodiments of the invention, in which similar elements are referred to by common reference numerals and in which:
Various embodiments of the invention are described hereinafter with reference to the figures. It should be noted that the figures are not drawn to scale and elements of similar structures or functions are represented by like reference numerals throughout the figures. It should also be noted that the figures are only intended to facilitate the description of specific embodiments of the invention. The embodiments are not intended as an exhaustive description of the invention or as a limitation on the scope of the invention. In addition, an aspect described in conjunction with a particular embodiment of the invention is not necessarily limited to that embodiment and can be practiced in any other embodiment of the invention.
Turning to the drawings,
Each of the video cameras 150, 160 has a lens 170 with electronic zoom, aperture and focus control. Also contained within the mounting panel 110 is a projection system 180. The projection system 180 has a lens 190 with zoom and focus control. The projection system 180 allows an image, generated by the imaging device, to be cast on the object of interest, such as an actor or an inanimate object. The projection system could be a slide projector, which includes a light source, slide holder, slide or patterned glass, and lens(es); a liquid crystal display (LCD) projector, which includes a light source, one or three liquid crystal displays, and lens(es); a digital light processing (DLP) projector, which includes a light source, an array of micro-mirrors, and lens(es); or any other type of projector that allows an image to be projected on an object.
Control signals are transmitted to the imaging device 100 through a communications channel 122. Data is downloaded from the imaging device 100 through another communications channel 124. Power is distributed to the imaging device 100 through a power system 126. The power system 126 may be internal to the imaging device 100 or an external power system may be used. The imaging device 100 is controlled by a computer (not shown). The computer may be a remote computer, for example a laptop, desktop or workstation, or the imaging device 100 may have an internal computer.
After the imaging unit 320 is positioned and the zoom, focus and aperture of the digital video cameras 350 and projection system 380 are set, a trigger signal is sent through the control channel. The trigger signal causes the projector to turn on, projecting the desired pattern 420 out into the viewing volume and onto a subject 410 as illustrated in
A computer, either external or internal to the imaging device as described above, reads these images of the object with the superimposed pattern. For each of the various camera images, the pattern features are detected. One approach that may be used for the pattern detection is a mesh extraction technique such as that described in “Bayesian Grid Matching”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 25, No. 2, February 2003, pp. 162-173, by K. Hartelius and J. M. Carstensen, which is hereby incorporated by reference in its entirety. In this algorithm, the projection pattern is a grid of black lines on a clear background. Since the pattern is known, it is possible using computer vision techniques to quickly extract these features.
Once the features are extracted from all of the digital video cameras on the imaging device, the features can be corresponded (that is, features on the object observed in one camera image are related to the same features as observed in another camera) across the digital video cameras. Corresponding the features can be accomplished in a variety of ways. One such approach is to make a number of the features distinct from the rest of the feature field. These features are detected in each of the digital video cameras and serve as a template for corresponding the rest of the features. Another approach uses a technique known as maximization of mutual information as detailed in “Multi-modality image registration by maximization of mutual information”, Mathematical Methods in Biomedical Image Analysis. IEEE Computer Society Press, 1996, by A. Maes, et al., (the “Maes technique”) which is incorporated herein by reference in its entirety. Using the Maes technique, equation 1 below is calculated for each of the possible correspondence patterns.
MI(X,Y)=H(X)+H(Y)−H(X,Y) (1)
Where MI is the mutual information between image X and image Y, H(X) and H(Y) are the entropies of images X and Y respectively, and H(X,Y) is the joint entropy of the two images. The correspondence pattern that maximizes the mutual information indicates the correct correspondence pattern. The mutual information in this case is taken as a measure for the efficacy of the correspondence between images. When the two images are correctly matched, the mutual information will be maximized.
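Equation 1 can be illustrated with a short Python sketch that estimates the entropies from intensity histograms and selects the candidate correspondence with the largest mutual information. This is only an illustration under simplifying assumptions: the images are represented as flat sequences of discrete intensity values, and the function names and sample data are hypothetical rather than taken from the Maes technique as published.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy in bits of a histogram given as a Counter."""
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mutual_information(x, y):
    """MI(X,Y) = H(X) + H(Y) - H(X,Y) for two equal-length intensity sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    h_x = entropy(Counter(x), n)
    h_y = entropy(Counter(y), n)
    h_xy = entropy(Counter(zip(x, y)), n)  # joint histogram of paired samples
    return h_x + h_y - h_xy


def best_correspondence(reference, candidates):
    """Return (index of the candidate with maximal MI, list of all MI scores)."""
    scores = [mutual_information(reference, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical binarized samples along a projected grid line in camera A.
reference = [0, 0, 1, 1, 0, 1, 1, 0]

# Two hypothetical correspondence patterns for camera B.
candidates = [
    [0, 0, 1, 1, 0, 1, 1, 0],  # samples paired with the matching features
    [0, 1, 0, 1, 0, 1, 0, 1],  # samples paired with unrelated features
]

best_index, scores = best_correspondence(reference, candidates)
```

In this toy data the first candidate shares all of its information with the reference, while the second is statistically independent of it, so the first is selected.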
Once the features have been corresponded, the known relative positions of the digital video cameras can be used to calculate the three dimensional location of the feature points detected in more than one digital video camera. There are multiple approaches to performing this three dimensional location determination. One such approach, as explained below, is commonly known as the DLT technique and is detailed in “Direct Linear Transformation from Comparator Coordinates into Object Space Coordinates in Close-Range Photogrammetry” (1971) by Y. I. Abdel-Aziz and H. M. Karara. Proceedings of the Symposium on Close-Range Photogrammetry (pp. 1-18). Falls Church, Va.: American Society of Photogrammetry.
Using the DLT technique, for each of the feature points detected in more than one digital video camera, a set of simultaneous equations is set up. These equations are:
Where X,Y,Z are the 3D coordinates of the point, L1-L11 are the direct linear transform parameters determined from digital video camera calibration, m is the number of digital video cameras, u(i) and w(i) are the pixel coordinates of the feature in digital video camera i, and R(i) is given by
R(i)=L9(i)X+L10(i)Y+L11(i)Z+1 (3)
A simultaneous equation solving technique (such as the linear least squares technique) is used to determine the unknowns in these equations. The solution to these equations is the three dimensional location of the feature point. This procedure is repeated for all of the points detected by more than one digital video camera. This produces a cloud of three dimensional points 710 representing a model of the surface of the three dimensional object as illustrated in
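The least squares step can be sketched in Python as follows. The sketch assumes the standard, unweighted form of the DLT projection, u = (L1·X + L2·Y + L3·Z + L4)/R and w = (L5·X + L6·Y + L7·Z + L8)/R with R as in equation 3, rearranged so that each camera contributes two linear equations in X, Y and Z. Any weighting that the full set of simultaneous equations may apply is omitted. The function names and the camera parameter values are hypothetical; real parameters would come from camera calibration.

```python
def project(L, point):
    """Forward DLT: map a 3D point to pixel coordinates (u, w).

    L holds the 11 DLT parameters, with L[0]..L[10] standing for L1..L11.
    """
    X, Y, Z = point
    r = L[8] * X + L[9] * Y + L[10] * Z + 1.0
    u = (L[0] * X + L[1] * Y + L[2] * Z + L[3]) / r
    w = (L[4] * X + L[5] * Y + L[6] * Z + L[7]) / r
    return u, w


def det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def solve3(M, v):
    """Solve the 3x3 linear system M x = v by Cramer's rule."""
    d = det3(M)
    if abs(d) < 1e-12:
        raise ValueError("camera geometry is degenerate for this point")
    solution = []
    for col in range(3):
        Mc = [row[:] for row in M]
        for r in range(3):
            Mc[r][col] = v[r]
        solution.append(det3(Mc) / d)
    return tuple(solution)


def triangulate(cameras, pixels):
    """Linear least squares estimate of (X, Y, Z) from two or more cameras."""
    rows, rhs = [], []
    for L, (u, w) in zip(cameras, pixels):
        rows.append([L[0] - u * L[8], L[1] - u * L[9], L[2] - u * L[10]])
        rhs.append(u - L[3])
        rows.append([L[4] - w * L[8], L[5] - w * L[9], L[6] - w * L[10]])
        rhs.append(w - L[7])
    # Normal equations: (A^T A) x = A^T b
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(3)]
    return solve3(AtA, Atb)


# Hypothetical DLT parameters for two cameras (illustrative values only).
cameras = [
    [100.0, 0.0, 0.0, 320.0, 0.0, 100.0, 0.0, 240.0, 0.0, 0.0, 0.1],
    [0.0, 0.0, 100.0, 320.0, 0.0, 100.0, 0.0, 240.0, 0.1, 0.0, 0.0],
]

true_point = (1.0, 2.0, 3.0)
pixels = [project(L, true_point) for L in cameras]  # simulated detections
estimate = triangulate(cameras, pixels)
```

With noise-free simulated detections the estimate should agree with the original point to within floating point error. Repeating the call for every corresponded feature would yield the cloud of points.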
Any of a large number of data processing algorithms can be applied to this cloud of points 710 to calculate a surface model more amenable to the specific task. One approach would be to fit a spline surface through this cloud of points 710. Another approach is to use a cluster of super-quadrics. Regardless of which of these point-cloud-to-surface algorithms is used, the essential task of constructing a model of one aspect of the three dimensional surface is complete.
The embodiments described herein have been presented for purposes of illustration and are not intended to be exhaustive or limiting. Many variations and modifications are possible in light of the foregoing teaching. The system is limited only by the following claims.
Number | Date | Country | |
---|---|---|---|
20070076090 A1 | Apr 2007 | US |
Number | Date | Country | |
---|---|---|---|
60723903 | Oct 2005 | US |