The present invention generally relates to three dimensional modeling and more particularly to a two pass approach to three dimensional reconstruction of film sets.
There are a number of known techniques that either capture 3D information directly, for example using a laser range finder, or recover 3D information from one or more 2D images, as stereo techniques do. These and other known single techniques do not perform well in all situations. Some techniques perform well only in indoor environments, while others work only in static scenes. Fully recovering the complete geometry of a scene in one pass is computationally expensive and unreliable. Three-dimensional (3D) acquisition techniques in general can be classified as active or passive approaches, single-view or multi-view approaches, and geometric or photometric methods.
Passive approaches acquire 3D geometry from images or videos taken under regular lighting conditions. The 3D geometry is computed using geometric or photometric features extracted from the images and videos. Active approaches use special light sources, such as lasers, structured light or infrared light. They compute the geometry based on the response of the objects and scenes to the special light projected onto the surface.
Single-view approaches recover 3D geometry using one image taken from a single camera viewpoint. Examples include photometric stereo and depth from defocus. Multi-view approaches recover 3D geometry from multiple images taken from multiple camera viewpoints, resulting from object motion, or taken with different light source positions. Stereo matching is an example of multi-view 3D recovery: pixels in the left and right images of a stereo pair are matched to obtain the depth information of the pixels.
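The stereo matching step described above can be illustrated with a minimal sketch. It assumes a rectified stereo pair, so that corresponding pixels lie on the same scanline, and a simple sum-of-absolute-differences cost; the function names, the window size, and the pinhole relation depth = focal length x baseline / disparity are illustrative assumptions and are not taken from this description.

```python
def match_scanline(left, right, window=1, max_disparity=4):
    """Estimate one disparity per pixel of `left` on a single rectified scanline.

    For each left pixel x, every shift d in [0, max_disparity] is tried and the
    shift with the lowest sum of absolute differences (SAD) is kept.
    """
    n = len(left)
    disparities = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disparity + 1):
            cost = 0
            for k in range(-window, window + 1):
                xl = min(max(x + k, 0), n - 1)      # clamp at the image border
                xr = min(max(x + k - d, 0), n - 1)  # left pixel x maps to right pixel x - d
                cost += abs(left[xl] - right[xr])
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Convert a disparity in pixels to a depth in metres (assumed pinhole model)."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity corresponds to a point at infinity
    return focal_px * baseline_m / disparity_px


# Hypothetical scanline: a small textured patch shifted by 2 pixels between views.
left_row = [0, 0, 0, 9, 5, 7, 0, 0, 0, 0]
right_row = [0, 9, 5, 7, 0, 0, 0, 0, 0, 0]
row_disparities = match_scanline(left_row, right_row)
```

In textureless regions the cost is identical for every shift, so the sketch simply keeps the first candidate; practical matchers handle such ambiguity more carefully.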
Geometric methods recover 3D geometry by detecting geometric features such as corners, lines or contours in single or multiple images. The spatial relationship among the extracted corners, lines or contours can be used to infer the 3D coordinates of the pixels in the images. Photometric methods recover 3D geometry based on the shading or shadows of image patches resulting from the orientation of the scene surface.
A solution is needed for recovering three dimensional geometries of objects and scenes that overcomes problems due to the movement of subjects, large depth discontinuity between foreground and background, and complicated lighting conditions.
An inventive method includes scanning a static background for background three dimensional information, scanning a dynamic foreground for foreground three dimensional information and combining the background and foreground three dimensional information to obtain a three dimensional model.
In an alternative embodiment of the invention, a method includes acquiring three dimensional information of a static scene, acquiring three dimensional information of a dynamic scene, and combining the three dimensional information recovered for the static and dynamic scenes.
The advantages, nature, and various additional features of the invention will appear more fully upon consideration of the illustrative embodiments now to be described in detail in connection with accompanying drawings wherein:
It should be understood that the drawings are for purposes of illustrating the concepts of the invention and are not necessarily the only possible configuration for illustrating the invention.
Unlike the ideal conditions of a laboratory, in a real-world scene subjects may be moving, lighting may be complicated, and the depth range may be large. It is difficult for prior techniques to handle these real-world conditions. For instance, if there is a large depth discontinuity between the foreground and background objects, the search range of stereo matching has to be significantly increased, which can result in high computational cost and more depth estimation errors. Therefore, it is desirable to treat foreground and background objects separately.
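The effect of depth range on the stereo search range can be made concrete with a short sketch. It assumes a rectified pinhole stereo rig, for which disparity equals focal length x baseline / depth; the focal length, baseline and depths used below are hypothetical values chosen only for illustration.

```python
import math


def disparity_search_range(focal_px, baseline_m, z_near_m, z_far_m):
    """Return (d_min, d_max), the integer disparities covering depths z_near..z_far."""
    d_max = focal_px * baseline_m / z_near_m  # nearest point gives the largest disparity
    d_min = focal_px * baseline_m / z_far_m   # farthest point gives the smallest disparity
    return math.floor(d_min), math.ceil(d_max)


def candidate_count(search_range):
    """Number of disparities a matcher must test per pixel for a given range."""
    d_min, d_max = search_range
    return d_max - d_min + 1


# Hypothetical rig: 800 px focal length, 0.5 m baseline.
foreground_only = disparity_search_range(800, 0.5, z_near_m=2.0, z_far_m=4.0)
whole_scene = disparity_search_range(800, 0.5, z_near_m=2.0, z_far_m=40.0)
```

With these assumed numbers, matching only subjects between 2 m and 4 m requires 101 candidate disparities per pixel, whereas covering a background that extends to 40 m in the same pass requires 191.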
The invention is a two-pass technique for the recovery of three-dimensional (3D) information. A first pass recovers the 3D geometry of a static scene using a low speed, high accuracy technique. The static scene scan needs to be repeated whenever new items are introduced into the static scene. A second pass uses a high speed, less accurate technique to recover 3D information of dynamic scenes. The results of the two passes are combined to obtain a complete 3D model of the environment.
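One simple way to picture the combination of the two passes is as a per-pixel merge of depth maps. The sketch below assumes that both passes have already been expressed as depth maps from the same viewpoint, and that the dynamic pass marks pixels without a moving object as None; this representation and the function name are assumptions made for illustration, not details given in this description.

```python
def combine_depth_maps(background, foreground):
    """Merge two registered depth maps (lists of rows, depths in metres).

    Where the dynamic pass recovered a depth, that value is used; elsewhere
    the high accuracy depth from the static pass is kept.
    """
    combined = []
    for bg_row, fg_row in zip(background, foreground):
        combined.append([fg if fg is not None else bg
                         for bg, fg in zip(bg_row, fg_row)])
    return combined


# Hypothetical 2 x 4 depth maps: a wall at 10 m and an actor at about 3 m.
static_pass = [[10.0, 10.0, 10.0, 10.0],
               [10.0, 10.0, 10.0, 10.0]]
dynamic_pass = [[None, 3.1, 3.0, None],
                [None, 3.2, 3.0, None]]
full_model = combine_depth_maps(static_pass, dynamic_pass)
```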
The invention deals with the problem of recovering 3D geometries of objects and scenes. Recovering the geometry of a real-world scene is a challenging problem due to the movement of subjects, large depth discontinuity between foreground and background, and complicated lighting conditions. Fully recovering the complete geometry of a scene in one pass is computationally expensive and unreliable. Moreover, prior techniques for accurate 3D acquisition, such as laser scanning, are unacceptable in many situations due to the presence of human subjects. The inventive two-pass approach provides more opportunities to use high accuracy reconstruction approaches, such as laser scanning or structured light, to recover the geometry of the background.
The inventive two-pass approach recovers the geometry of the static background and the dynamic foreground separately using different methods. Once the background geometry is acquired, it can be used as prior information to acquire the 3D geometry of moving subjects. This can reduce computational cost and increase reconstruction accuracy by restricting the computation to regions of interest. For instance, stereo-based methods for range image acquisition often need to search for corresponding points in the left and right images. If the background geometry is available, the boundary of the foreground objects can be easily obtained. The boundaries can then be used to reduce the correspondence search range, resulting in lower computational cost and higher correspondence accuracy.
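A minimal sketch of how a background prior might narrow the correspondence search is given below. It assumes a rectified stereo pair, a per-pixel background disparity derived from the static model, and a foreground mask that is already available; it also assumes that a foreground object always lies in front of the background, so its disparity is never smaller than the background disparity. All names and values are hypothetical.

```python
def candidate_disparities(bg_disparity, is_foreground, d_max):
    """Build the list of disparities to test for each pixel of one scanline.

    Background pixels need no search: their disparity is known from the
    static model. Foreground pixels are searched only from the background
    disparity up to d_max, because the subject is assumed to be nearer the
    camera than the background behind it.
    """
    candidates = []
    for d_bg, foreground in zip(bg_disparity, is_foreground):
        if foreground:
            candidates.append(list(range(d_bg, d_max + 1)))
        else:
            candidates.append([d_bg])
    return candidates


def total_candidates(candidates):
    """Total number of matching cost evaluations implied by a candidate list."""
    return sum(len(c) for c in candidates)


# Hypothetical scanline of 4 pixels; the two middle pixels belong to a subject.
bg = [5, 5, 5, 5]
mask = [False, True, True, False]
restricted = candidate_disparities(bg, mask, d_max=8)
exhaustive = len(bg) * (8 + 1)  # every pixel tested against disparities 0..8
```

In this toy case the restricted search evaluates 10 candidates where an exhaustive search over disparities 0 to 8 would evaluate 36.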
The inventive multi-pass 3D acquisition approach, as noted above, is motivated by the lack of a single method capable of reliably capturing 3D information for large environments. Some methods work well indoors but not outdoors, while others require a static scene. Computational complexity and accuracy also vary substantially between methods. The inventive 3D reconstruction defines a framework for capturing 3D information that takes advantage of the available techniques and their strengths to obtain the best 3D structure information. Combining multiple methods creates the need for new techniques to register the output of each method in a common coordinate system. The invention presents a simple manual technique to register the views obtained from each method.
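Registering outputs in a common coordinate system amounts to applying a rigid transform (a rotation and a translation) to the points produced by each method. The sketch below shows only that final step and assumes the rotation and translation are already known, for example from correspondences chosen by an operator; how they are obtained is not shown, and the function names are hypothetical.

```python
import math


def rotation_about_z(angle_rad):
    """3 x 3 rotation matrix for a rotation of angle_rad about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def to_common_frame(points, rotation, translation):
    """Map 3D points from a scanner's own frame into the common frame.

    Each point p becomes rotation * p + translation.
    """
    registered = []
    for p in points:
        registered.append(tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return registered


# Hypothetical pose: the second scanner is rotated 90 degrees about z
# and raised 2 m relative to the common frame.
scan_points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.5)]
registered_points = to_common_frame(
    scan_points, rotation_about_z(math.pi / 2), (0.0, 0.0, 2.0))
```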
The inventive multi-pass 3D acquisition framework will be discussed in the context of film set applications, but can be readily applied to other 3D reconstruction applications. In film set applications, 3D information is acquired in two basic scanning phases.
In a static scan phase, a high accuracy 3D acquisition approach is used to construct a 3D model of a static scene with no subjects present. In this static scan phase, a highly accurate, possibly low speed method is used to acquire 3D data. Possible low speed scan methods include laser scanning and structured light methods. These methods produce highly accurate results in static environments without time constraints. Multiple viewpoints need to be acquired to construct a complete 3D reconstruction of the set.
In a dynamic scan phase, the dynamic acquisition of 3D information needs to be performed with a fast, possibly less accurate method of scanning. In this dynamic scan phase, it is assumed that actors or other moving objects will also be present. This constrains the use of some methods: laser scanners raise safety concerns, and structured light patterns disrupt the film shooting. The most suitable method for this phase is stereo scanning, since it satisfies the requirements above with no safety or distraction problems. The resulting stereo pair can also be used directly for real time broadcast. Although the use of stereo is emphasized in the dynamic scan phase because of the advantages above, other techniques, such as photometric methods, can be combined with stereo or replace it to improve performance.
The results obtained in the static scan phase can significantly improve the speed and accuracy of stereo matching. The speed improvement is achieved by searching only in areas with motion, using the static model obtained in the static scan phase as a reference. The accuracy is improved by using the known 3D structure obtained in the static scan phase to obtain more accurate point matching and possibly denser 3D data.
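Restricting the search to areas with motion can be sketched as a comparison of the current frame against a reference view of the empty set. The sketch assumes that such a reference image is available for the same camera viewpoint (for example, captured or rendered from the static model) and uses a plain intensity threshold; the threshold, the margin and the function names are illustrative assumptions.

```python
def motion_mask(reference_row, current_row, threshold=15):
    """Flag pixels whose intensity differs from the static reference."""
    return [abs(cur - ref) > threshold
            for ref, cur in zip(reference_row, current_row)]


def columns_to_match(mask, margin=1):
    """Return the sorted columns on which stereo matching should run.

    The flagged columns are widened by `margin` pixels on each side so that
    the boundary of a moving subject is not cut off.
    """
    n = len(mask)
    selected = set()
    for x, moving in enumerate(mask):
        if moving:
            for k in range(max(0, x - margin), min(n, x + margin + 1)):
                selected.add(k)
    return sorted(selected)


# Hypothetical scanline: an actor covers columns 2 and 3 of the empty set view.
reference = [50, 52, 51, 49, 50, 50]
current = [50, 52, 120, 125, 50, 50]
mask = motion_mask(reference, current)
active_columns = columns_to_match(mask)
```

Only the returned columns would be passed to the stereo matcher; the remaining pixels keep the depth already known from the static model.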
Referring now to the diagram 100 of
In
In the dynamic scan phase, with actors and subjects performing, 3D information is obtained using stereo scanning. The results obtained in this dynamic scan phase must be registered with the 3D model obtained in the low speed static scan to obtain a complete 3D model view of the film set. This registration can be done using a technique similar to that used in registering multiple views described above. In
The diagram 400 in
Having described a preferred embodiment for the multi-pass approach to 3D acquisition in a film set application, it is noted that modifications and variations can be made by persons skilled in the art in light of the above teachings. It is therefore to be understood that changes may be made in the particular embodiments of the invention disclosed which are within the scope and spirit of the invention as outlined by the appended claims. Having thus described the invention with the details and particularity required by the patent laws, what is claimed and desired protected by Letters Patent is set forth in the appended claims.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2006/022215 | 6/8/2006 | WO | 00 | 12/5/2008 |