The invention relates generally to computer graphics. More specifically, the invention relates to a system and methods for creating and editing three-dimensional models from image panoramas.
One objective in the field of computer graphics is to create realistic images of three-dimensional environments using a computer. These images and the models used to generate them have an incredible variety of applications, from movies, games, and other entertainment applications, to architecture, city planning, design, teaching, medicine, and many others.
Traditional techniques in computer graphics attempt to create realistic scenes using geometric modeling, reflection and material modeling, light transport simulation, and perceptual modeling. Despite the tremendous advances that have been made in these areas in recent years, such computer modeling techniques are not able to create convincing photorealistic images of real and complex scenes.
An alternate approach, known as image-based modeling and rendering (IBMR), is becoming increasingly popular, both in computer vision and graphics. IBMR techniques focus on the creation of three-dimensional rendered scenes starting from photographs of the real world. Often, to capture a continuous scene (e.g., an entire room, a large landscape, or a complex architectural scene), multiple photographs taken from various viewpoints can be stitched together to create an image panorama. The scene can then be viewed from various directions, but the viewer cannot move in space, since there is no geometric information.
Existing IBMR techniques have focused on the problems of modeling and rendering captured scenes from photographs, while little attention has been given to the problems of interactively creating and editing image-based representations and objects within the images. While numerous software packages (such as ADOBE PHOTOSHOP, by Adobe Systems Incorporated, of San Jose, Calif.) provide photo-editing capabilities, none of these packages adequately addresses the problems of interactively creating or editing image-based representations of three-dimensional scenes including objects using panoramic images as input.
What is needed is editing software that includes familiar photo-editing tools adapted to create and edit an image-based representation of a three-dimensional scene captured using panoramic images.
The invention provides a variety of tools and techniques for authoring photorealistic three-dimensional models by adding geometry information to panoramic photographic images, and for editing and manipulating panoramic images that include geometry information. The geometry information can be interactively created, edited, and viewed on a display of a computer system, while the corresponding pixel-level depth information used to render the information is stored in a database. The storing of the geometry information to the database is done in two different representations: vector-based and pixel-based. Vector-based geometry stores the vertices and triangle geometry information in three-dimensional space, while pixel-based representation stores the geometry as a depth map. A depth map is similar to a texture map, however it stores the distance from the camera position (i.e. the point of acquisition of the image) instead of color information. Because each data representation can be converted to the other, the terms pixel-based and vector-based geometry are used synonymously.
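The relationship between the pixel-based and vector-based representations can be illustrated with a minimal Python sketch. It assumes an equirectangular panorama layout, a camera at the origin, and a y-up axis convention; none of these, nor the function names, are specified in the text, so they are illustrative assumptions only.

```python
import math

def pixel_to_direction(col, row, width, height):
    """Unit view direction for a pixel center (assumed equirectangular layout, y up)."""
    lon = ((col + 0.5) / width) * 2.0 * math.pi - math.pi    # -pi..pi around the y axis
    lat = math.pi / 2.0 - ((row + 0.5) / height) * math.pi   # +pi/2 (up) .. -pi/2 (down)
    return (math.cos(lat) * math.sin(lon), math.sin(lat), math.cos(lat) * math.cos(lon))

def depth_map_to_points(depth):
    """Pixel-based to vector-based: turn a depth map into 3D points, row by row."""
    height, width = len(depth), len(depth[0])
    points = []
    for row in range(height):
        for col in range(width):
            dx, dy, dz = pixel_to_direction(col, row, width, height)
            d = depth[row][col]  # distance from the camera position
            points.append((d * dx, d * dy, d * dz))
    return points

def points_to_depth(points):
    """Vector-based to pixel-based: distance of each point from the camera at the origin."""
    return [math.sqrt(x * x + y * y + z * z) for (x, y, z) in points]
```

Converting a depth map to points and back returns the original distances, which is the sense in which each representation can be derived from the other.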
The software tools for working with such images include tools for specifying a reference coordinate system that describes a point of reference for modeling and editing, aligning certain features of image panoramas to the reference coordinate system, “extruding” elements of the image from the aligned features for using vector-based geometric primitives such as triangles and other three-dimensional shapes to define pixel-based depth in a two-dimensional image, and tools for “clone brushing” portions of an image with depth information while taking the depth information and lighting into account when copying from one portion of the image to another. The tools also include re-lighting tools that separate illumination information from texture information.
This invention relates to extending image-based modeling techniques discussed above, and combining them with novel graphical editing techniques to produce and edit photorealistic three-dimensional computer graphics models from generalized panoramic image data. Preferably, the present invention comprises one or more tools useful with a computing device having a graphical user interface to facilitate interaction with one or more images, represented as image data, as described below. In general, the systems and methods of the invention display results quickly, for use in interactively modeling and editing a three dimensional scene using one or more image panoramas as input.
In one aspect, the invention provides a computerized method for creating a three dimensional model from one or more panoramas. The method includes steps of receiving one or more image panoramas representing a scene having one or more objects, determining a directional vector for each image panorama that indicates an orientation of the scene with respect to a reference coordinate system, transforming the image panoramas such that the directional vectors are substantially aligned with the reference coordinate system, aligning the transformed image panoramas to each other, and creating a three dimensional model of the scene from the transformed image panoramas using the reference coordinate system and comprising depth information describing the geometry of one or more objects contained in the scene. Thus, objects in the scene can be edited and manipulated from an interactive viewpoint, but the visual representations of the edits will remain consistent with the reference coordinate system.
In some embodiments, the determination of a directional vector is based at least in part on instructions received from a user of the computerized method. In some embodiments, the instructions identify two or more visual features in the image panorama that are substantially parallel. In some embodiments, the instructions identify two sets of substantially parallel features in the image panorama. In some embodiments, the instructions identify and manipulate a horizon line of the image panorama. In some embodiments, the instructions identify two or more areas within the image that contain one or more elements, and the elements contained in the areas are identified automatically. In some embodiments, the automatic detection can be done using techniques such as edge detection and other image processing techniques. In some embodiments, the image panoramas are aligned with respect to each other according to instructions from a user.
In some embodiments, the panorama transformation step includes aligning the directional vectors such that they are at least substantially parallel to the reference coordinate system. In some embodiments, the transformation step includes aligning the directional vectors such that they are at least substantially orthogonal to the reference coordinate system.
In another aspect, the invention provides a computerized method of interactively editing objects in a panoramic image. The method includes the steps of receiving an image panorama with a defined point source, creating a three-dimensional model of the scene using features of the visual scene and the point source, receiving an edit to an object in the image panorama, transforming the edit relative to a viewpoint defined by the point source, and projecting the transformed edit onto the object.
In some embodiments, the three-dimensional model includes either depth information, geometry information, or in some embodiments, both. In some embodiments, receiving an edit includes receiving an edit to the color information associated with objects of the image, or to the alpha (i.e., transparency) information associated with objects of the image. In some embodiments, receiving an edit includes receiving an edit to the depth or geometry information associated with objects of the image. In these embodiments, the method may include providing a user with one or more interactive drawing tools or interactive modeling tools for specifying edits to the depth and geometry information, color and texture information of objects in the image. The interactive tools can be one or more of an extrusion tool, a ground plane tool, a depth chisel tool, and a non-uniform rational B-spline tool. In some embodiments, the interactive drawing and geometric modeling tools select a value or values for the depth of an object of the image. In some embodiments the interactive depth editing tools add to or subtract from the depth for an object of the image.
In another aspect, the invention provides a method for projecting texture information onto a geometric feature within an image panorama. The method includes receiving instructions from a user identifying a three-dimensional geometric surface within an image panorama having features with one or more textures; determining a directional vector for the geometric surface, creating a geometric model of the image panorama based at least in part on the surface and the directional vector, and applying the textures to the features in the image panorama based on the geometric model.
In some embodiments, the instructions are received using an interactive drawing tool. In some embodiments, the geometric surface is one of a wall, a floor, or a ceiling. In some embodiments, the directional vector is substantially orthogonal to the surface. In some embodiments, the texture information comprises color information, and in some embodiments the texture information comprises luminance information.
In another aspect, the invention provides a method for creating a three-dimensional model of a visual scene from a set of image panoramas. The method includes receiving multiple image panoramas, arranging each image panorama to a common reference system, receiving information identifying features common to two or more of the arranged panoramas, aligning two or more image panoramas to each other using the identified features, and creating a three-dimensional model from the aligned image panoramas.
In some embodiments, the instructions are received using an interactive drawing tool, which in some embodiments is used to identify four or more features common to the two or more image panoramas.
In another aspect, the invention provides a system for creating a three-dimensional model from one or more image panoramas. The system includes a means for receiving one or more image panoramas representing a visual scene having one or more objects, a means for allowing a user to interactively determine a directional vector for each image panorama, a means for aligning the image panoramas relatively to each other, and a means for creating a three-dimensional model from the aligned panoramas.
In some embodiments, the input images comprise two-dimensional images, and in some embodiments, the input images comprise three-dimensional images including one or more of depth information and geometry information. In some embodiments, the image panoramas are globally aligned with respect to each other.
In another aspect, the invention provides a system for interactively editing objects in a panoramic image. The system includes a receiver for receiving one or more image panoramas, where the image panoramas represent a visual scene and have one or more objects and a point source. The system further includes a modeling module for creating a three-dimensional model of the visual scene such that the model includes depth information describing the objects, one or more interactive editing tools for providing an edit to the objects, a transformation module for transforming the edit to a viewpoint defined by the point source, and a rendering module for projecting the transformed edit onto the objects.
In some embodiments, the interactive editing tools include a ground plane tool, an extrusion tool, a depth chisel tool, and a non-uniform rational B-spline tool.
The above and further advantages of the invention may be better understood by referring to the following description taken in conjunction with the accompanying drawings, in which:
a is a diagram illustrating a cube panorama in accordance with one embodiment of the invention.
b is a diagram illustrating a cube panorama in accordance with one embodiment of the invention.
c is a diagram illustrating a sphere panorama in accordance with one embodiment of the invention.
a is a diagram illustrating a camera positioned within a room for taking panoramic photographs in accordance with one embodiment of the invention.
b is a diagram illustrating a spherical image panorama representation of the room of
a is a diagram illustrating the local alignment of a panorama in accordance with one embodiment of the invention.
b is a photograph with features identified illustrating the local alignment of a panorama in accordance with one embodiment of the invention.
a is a diagram illustrating the spherical image panorama of
b is the photograph of
FIGS. 11a, 11b, and 11c are diagrams illustrating local alignment with two sets of parallel lines in accordance with one embodiment of the invention.
FIGS. 14a and 14b are two panoramas to be used in creating a three-dimensional model in accordance with one embodiment of the invention.
FIGS. 15a and 15b are images being edited to create a three-dimensional model in accordance with one embodiment of the invention.
FIGS. 16a, 16b, and 16c are diagrams illustrating the global alignment process in accordance with one embodiment of the invention.
FIGS. 17a, 17b, and 17c are diagrams illustrating the global alignment process in accordance with one embodiment of the invention.
FIGS. 18a, 18b, and 18c are diagrams illustrating the global alignment process in accordance with one embodiment of the invention.
FIGS. 22a, 22b, and 22c are diagrams illustrating the positioning of a reference plane in accordance with one embodiment of the invention.
FIGS. 26a and 26b are diagrams illustrating the rotation of a reference plane in accordance with one embodiment of the invention.
FIGS. 27a and 27b are diagrams illustrating locating a reference plane based on the selection of points in a plane in accordance with one embodiment of the invention.
FIGS. 28a, 28b, and 28c are diagrams of a screen view, two-dimensional top view, and three-dimensional view respectively illustrating the use of an interactive ground-plane tool to extrude depth information in accordance with one embodiment of the invention.
FIGS. 29a, 29b, and 29c are diagrams of a screen view, two-dimensional top view, and three-dimensional view respectively illustrating further use of an interactive ground-plane tool to extrude depth information in accordance with one embodiment of the invention.
FIGS. 30a, 30b, and 30c are diagrams of a screen view, two-dimensional top view, and three-dimensional view respectively illustrating further use of an interactive ground-plane tool to extrude depth information in accordance with one embodiment of the invention.
FIGS. 31a, 31b, and 31c are diagrams of a screen view, two-dimensional top view, and three-dimensional view respectively illustrating further use of an interactive ground-plane tool to extrude depth information in accordance with one embodiment of the invention.
FIGS. 32a, 32b, and 32c are diagrams of a screen view, two-dimensional top view, and three-dimensional view respectively illustrating the use of an interactive vertical tool to extrude depth information in accordance with one embodiment of the invention.
FIGS. 33a, 33b, and 33c are diagrams illustrating a screen view, two-dimensional top view, and three-dimensional view respectively of a modeled room in accordance with one embodiment of the invention.
FIGS. 34a, 34b, and 34c are diagrams illustrating three-dimensional views and a screen view of a modeled image panorama in accordance with one embodiment of the invention.
FIGS. 41a, 41b, and 41c are images illustrating texture mapping in accordance with one embodiment of the invention.
The receiving step 100 includes receiving the original panorama. Alternatively, the computer system can accept for editing a 3D panoramic image that already has some geometric or depth information. 3D images represent a three-dimensional scene, and may include three-dimensional objects, but may be displayed to a user as a 2D image on, for example, a computer monitor. Such images may be acquired from a variety of laser, optical, or other depth measuring techniques for a given field of view. The image may be input by way of a scanner, electronic transfer, via a computer-attached digital camera, or other suitable input mechanism. The image can be stored in one or more memory devices, including local ROM or RAM, which can be permanent to or removable from a computer. In some embodiments, the image can be stored remotely and manipulated over a communications link such as a local or wide area network, an intranet, or the Internet using wired, wireless, or any combination of connection protocols.
Other panorama types such as spherical panoramas or conical panoramas can also be used in accordance with the methods and systems of this invention. For example,
Referring again to
In one embodiment, the features designated by the user generally may comprise any two architectural features, decorative features, or other elements of the image that are substantially parallel to each other. Examples include, but are not necessarily limited to, the intersection line of two walls, the sides of columns, edges of windows, lines on wallpaper, edges of wall hangings, or, in the case of outdoor scenes, trees or buildings. Alternatively, in some embodiments, the detection of the elements used for the local alignment step 105 may be done automatically. For example, a user may specify a region or regions that may or may not contain elements to be used for local alignment, and elements are identified using image processing techniques such as snapping, Gaussian edge detection, and other filtering and detection techniques.
FIGS. 7a and 7b illustrate one embodiment of the manner in which an image panorama of the room 200 is represented to the user as a spherical panorama. The user, typically using a tripod, takes a series of photographs from a single position while rotating the camera 210 through a full 360 degrees, as shown in
FIGS. 8a and 8b illustrate one embodiment of the local alignment step 105. The image panorama is presented to the user with the axes of global reference 300 imposed onto the image. However, at this point, the “up” vector of the image has not been identified, and therefore the features of the image are not aligned with the global reference 300. Using one or more interactive alignment tools, the user identifies two vertical features of the scene that the user believes to be substantially parallel, 810 and 820. Given that two parallel lines, when extended to infinity, meet at a point defined as their “vanishing point,” the system can extend the features 810 and 820 around the entire panorama, creating circles 830 and 840. The circles 830 and 840 intersect at point y′ 850, the vanishing point for the two lines 830 and 840 in three-dimensional coordinates. A reference line 860 is then created connecting the point y′ 850 with the point source 310, creating an “up” vector for the panorama. When the image is rotated by an angle α 870 such that the reference line 860 is aligned with the y axis 330 of the global reference 300, the features become locally aligned with the y axis 330, as depicted in
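The vanishing-point construction described above can be sketched in Python as follows. The sketch assumes each user-identified feature is given as two view directions from the point source, and that the reference y axis is (0, 1, 0); the function names are hypothetical and the code is an illustration under those assumptions, not the system's actual implementation.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def normalize(v):
    n = math.sqrt(dot(v, v))
    return (v[0] / n, v[1] / n, v[2] / n)

def up_vector(feature_a, feature_b):
    """Each feature is a pair of view directions; returns the estimated 'up' direction."""
    # The great circle through a feature lies in the plane whose normal is
    # the cross product of the feature's two view directions.
    n_a = cross(feature_a[0], feature_a[1])
    n_b = cross(feature_b[0], feature_b[1])
    # Two great circles meet at +/- (n_a x n_b); keep the candidate pointing upward.
    v = normalize(cross(n_a, n_b))
    return v if v[1] >= 0 else (-v[0], -v[1], -v[2])

def alignment_rotation(up):
    """Axis and angle of the rotation that brings 'up' onto the reference y axis."""
    y_axis = (0.0, 1.0, 0.0)
    angle = math.acos(max(-1.0, min(1.0, dot(up, y_axis))))
    axis = cross(up, y_axis)  # zero vector when the panorama is already aligned
    return axis, angle
```

The returned angle plays the role of the angle α in the description: rotating the panorama by that angle about the returned axis aligns the estimated "up" vector with the y axis.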
In some embodiments, more than two features can be used to align the image panorama. For example, where three features are identified, three intersection points can be determined, one for each pair of lines. A true vanishing point can then be linearly interpolated from the three intersection points. This approach can be extended to include additional features as needed or as identified by the user.
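One way to combine more than two features is sketched below in Python. The text does not say how the interpolation is performed, so the sketch simply averages the pairwise intersection directions and renormalizes; that choice, the input format (each feature as two view directions), and the function name are assumptions.

```python
import math
from itertools import combinations

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    return (v[0] / n, v[1] / n, v[2] / n)

def estimate_vanishing_direction(features):
    """Average the pairwise great-circle intersections of three or more features."""
    normals = [cross(a, b) for a, b in features]
    total = [0.0, 0.0, 0.0]
    for n1, n2 in combinations(normals, 2):
        v = normalize(cross(n1, n2))
        if v[1] < 0:  # resolve the +/- ambiguity toward the upper hemisphere
            v = (-v[0], -v[1], -v[2])
        for k in range(3):
            total[k] += v[k]
    return normalize(total)
```

With noise-free input every pair yields the same direction, so the average equals that direction; with imprecise tracing the average smooths out the individual errors.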
In another embodiment of the local alignment step 105, the system can determine the horizon line based on the user's identification of horizontal features in the original panorama. Similar to the local alignment step described above, the user traces horizontal features that exist in the original panorama. Referring to
Referring to
In another embodiment, a user indicates a horizon line by directly specifying the line segment that represents the horizon. This approach is useful when features of the image are not known to be parallel, or the image is of an outdoor scene such as
In another embodiment of the local alignment step 105, a user employs a manual local alignment tool to rotate the original panorama to be aligned with the global reference coordinate system. The user uses a mouse or other pointing and dragging device such as a track ball to orient the panorama to the true horizon, i.e. a concentric circle around the panorama position that is parallel to the XZ plane.
Once a set of image panoramas is locally aligned to a global reference 300, the global alignment step 110 aligns multiple panoramas to each other by matching features in one panorama to corresponding features in other panoramas. Generally, if a user can determine that a line representing the intersection of two planes in panorama 1 is substantially vertical, and can identify a similar feature in panorama 2, the correspondence of the two features allows the system to determine the proper rotation and translation necessary to align panorama 1 and panorama 2. Initially, the multiple image panoramas must be properly rotated such that the global reference 300 is consistent (i.e., the x, y and z axes are aligned), and once rotated, the image must be translated such that the relationship between the first camera position and the second camera position can be calculated.
a illustrates an image panorama 1400 of a building 1430 taken from a known first camera position.
FIGS. 15a and 15b illustrate a step in the global alignment step 110. Using a drawing tool, tracing tool, pointing tool, or some other interactive device, a user identifies points 1, 2, 3, and 4 in the first panorama 1400, thus associating the facade 1440 with the plane 1505. Similarly, the user identifies the same four points in image 1410, creating the same plane 1505, although viewed from a different vantage point.
Continuing with the global alignment process and referring to
Referring to
Once the panoramas are properly rotated, the second panorama can be translated to the correct position in world coordinates to match its relative position to the first panorama. As shown in
The optimization is formulated such that the closest distances between the corresponding lines from one panorama to the other are minimized, with a constraint that the panorama positions 1600 and 1700 are not equal. The unknown parameters are the X, Y, and Z position of panorama position 1700. The weights on the optimization parameters may also be adjusted accordingly. In some embodiments, the X and Z (i.e. the ground plane) parameters are given greater weight than Y, since real-world panorama acquisition often takes place at an equivalent distance from the ground.
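The quantity being minimized can be sketched in Python as a cost function over a candidate position for the second panorama. The sketch assumes each corresponding feature is represented as a ray direction from each panorama position, already rotated into the common reference; the per-axis weighting mentioned above is omitted, and all names are hypothetical.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def line_distance(p1, d1, p2, d2):
    """Closest distance between the line through p1 along d1 and the line through p2 along d2."""
    n = cross(d1, d2)
    offset = sub(p2, p1)
    n_len = math.sqrt(dot(n, n))
    if n_len < 1e-12:  # parallel lines: distance from p2 to the first line
        c = cross(offset, d1)
        return math.sqrt(dot(c, c)) / math.sqrt(dot(d1, d1))
    return abs(dot(offset, n)) / n_len

def alignment_cost(p1, rays1, p2, rays2):
    """Sum of squared closest distances between corresponding rays."""
    return sum(line_distance(p1, a, p2, b) ** 2 for a, b in zip(rays1, rays2))
```

Note that the cost is also zero when the candidate position coincides with the first panorama position, because every pair of rays then shares a point. This degenerate solution is what the constraint that the two positions are not equal rules out.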
Similarly, another technique is to use an extrusion tool, as is described in detail herein, to create two separate matching facade geometries from each panorama. The system then optimizes the distance between four corresponding points to determine the X, Y, Z position of panorama 1410, as shown in
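For the point-based variant, a minimal Python sketch is given below. It assumes the two panoramas are already rotated to the common reference, that the first facade's corner points are expressed in world coordinates, and that the second facade's corner points are expressed relative to the second panorama's own position. Under those assumptions, the translation that minimizes the summed squared distance between corresponding points is the mean of the point differences. The function name is hypothetical.

```python
def panorama_position(world_points, local_points):
    """Least-squares position of the second panorama from corresponding points.

    world_points: facade corners in world coordinates (from the first panorama).
    local_points: the same corners relative to the second panorama's position.
    """
    count = len(world_points)
    return tuple(
        sum(w[k] - l[k] for w, l in zip(world_points, local_points)) / count
        for k in range(3)
    )
```

With exact correspondences the result is the true offset; with imperfectly traced points the averaging spreads the error evenly over the four correspondences.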
Aligning multiple panoramas in serial fashion allows multiple users to access and align multiple panoramas simultaneously, and avoids the need for global optimization routines that attempt to align every panorama to each other in parallel. For example, if a scene was created using 100 image panoramas, a global optimization routine would have to resolve 100^100 possible alignments. Taking advantage of the user's knowledge of the scene and providing the user with interactive tools to supply some or all of the alignment information significantly reduces the time and computational resources needed to perform such a task.
In some instances, it may be beneficial for the origin of the global reference 300 to be co-located with a particular feature in the image. For example, and referring to
In another embodiment, the user can rotate the reference plane about any axis of the global reference 300 if required by the geometry being modeled. Referring to
In another embodiment, the user can locate a reference plane by identifying three or more features on an existing geometry within the image. For example and referring to
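Locating a plane from identified points can be sketched in Python as follows. The sketch handles exactly three non-collinear 3D points (the text allows three or more) and represents the plane as a unit normal and an offset; this representation and the function names are assumptions for illustration.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def plane_from_points(p1, p2, p3):
    """Return (unit normal n, offset d) such that dot(n, x) == d for points x on the plane."""
    n = cross(sub(p2, p1), sub(p3, p1))
    length = math.sqrt(dot(n, n))
    if length < 1e-12:
        raise ValueError("points are collinear; they do not define a plane")
    n = (n[0] / length, n[1] / length, n[2] / length)
    return n, dot(n, p1)

def signed_distance(plane, point):
    """Signed distance of a point from the plane, positive on the normal's side."""
    n, d = plane
    return dot(n, point) - d
```

The signed distance gives a simple check that further points selected by the user actually lie on the located plane.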
Once the image panoramas are aligned with each other and a reference plane has been defined, the user creates a geometric model of the scene. The geometric modeling step 115 includes using one or more interactive tools to define the geometries and textures of elements within the image. Unlike traditional geometric modeling techniques where pre-defined geometric structures are associated with elements in the image in a retrofit manner, the image-based modeling methods described herein utilize visible features within the image to define the geometry of the element. By identifying the geometries that are intrinsic to elements of the image, the textures and lighting associated with the elements can be then modeled simultaneously.
After the input panoramas have been aligned, the system can start the image-based modeling process.
FIGS. 29a, 29b, and 29c illustrate the use of the reference plane tool with which the user identifies the ground plane 350. Starting at the previously identified point 2805, the user draws a line 2905 following the intersection of one wall with the floor to a point 2920 in the image representing the intersection of the floor with another wall.
FIGS. 30a, 30b, and 30c further illustrate the use of the reference plane tool with which the user identifies the ground plane 350. Continuing around the room, the user traces lines representing the intersections of the floors with the walls. In some embodiments where the room being modeled is not a quadrilateral, the user traces around the features that define the peculiarities of the room. For example, area 3005 represents a small alcove within the room which cannot be seen from some perspectives. However, lines 3010, 3015, and 3020 can be drawn to define the alcove 3005 such that the model is consistent with the actual room shape by constraining the floor-wall edge drawing to match the existing shape and feature of the room. Multiple panorama acquisition can be used to fill in the occluded information not visible from the current panoramic view. The process continues until the entire ground plane has been traced, as illustrated in
With the reference plane defined, the user can “extrude” the walls based on the known shape and alignment of the room.
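The tracing and extrusion steps can be sketched in Python as follows. The sketch assumes the camera sits at the origin at a known height above a horizontal ground plane, that each traced pixel has already been converted to a view direction, and that walls are vertical with a single height; these assumptions and the function names are illustrative only.

```python
def floor_point(direction, camera_height):
    """Intersect a view ray from the camera (at the origin) with the ground plane y = -camera_height."""
    dx, dy, dz = direction
    if dy >= 0:
        return None  # the ray never reaches the ground plane
    t = -camera_height / dy
    return (t * dx, -camera_height, t * dz)

def extrude_walls(floor_points, wall_height):
    """Turn a closed floor outline into one vertical quad per floor edge."""
    walls = []
    count = len(floor_points)
    for i in range(count):
        a = floor_points[i]
        b = floor_points[(i + 1) % count]  # wrap around to close the outline
        a_top = (a[0], a[1] + wall_height, a[2])
        b_top = (b[0], b[1] + wall_height, b[2])
        walls.append((a, b, b_top, a_top))
    return walls
```

Because each wall quad shares its bottom edge with the traced floor outline, an irregular outline such as one that includes an alcove produces a matching set of walls without any extra input.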
In some embodiments, the reference plane extrusion tool can be used without an image panorama as an input. For example, where a scene is built using geometric modeling methods not including photos, the extrusion tool can extend features of the model, and create additional geometries within the model based on user input.
In some embodiments, the reference plane tool and the extrusion tool can be used to model curved geometric elements. For example, the user can trace on the reference plane the bottom of a curved wall and use the extrusion tool to create and texture map the curved wall.
FIGS. 34a, 34b, and 34c illustrate one example of an interior scene modeled using a single panoramic input image and the reference plane tool coupled with the extrusion tool.
To define additional geometric features, the default reference plane 3610 is rotated onto the defined geometry containing the feature to be modeled such that the user can trace the feature with respect to the reference plane 3610. For example, as illustrated in
The texture projection step 120 includes using one or more interactive tools to project the appropriate textures from the original panorama onto the objects in the model. The geometric modeling step 115 and texture mapping step 120 can be done simultaneously as a single step from the user's perspective. The texture map for the modeled geometry is copied from the original panorama, but as a rectified image.
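One way to picture the rectification is sketched below in Python: for each texel of a planar wall, the sketch computes the panorama coordinates from which its color would be copied. It assumes an equirectangular panorama, a camera at the origin with y up, and a wall described by a corner and two edge vectors; these conventions and the function names are assumptions, not details given in the text.

```python
import math

def direction_to_pixel(direction, width, height):
    """Map a 3D direction from the camera to (col, row) in an equirectangular panorama."""
    x, y, z = direction
    length = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)        # angle around the y axis, -pi..pi
    lat = math.asin(y / length)   # elevation, -pi/2..pi/2
    col = (lon + math.pi) / (2.0 * math.pi) * width
    row = (math.pi / 2.0 - lat) / math.pi * height
    return (col, row)

def rectified_lookup(origin, edge_u, edge_v, res_u, res_v, width, height):
    """For each texel center of a planar quad, the panorama coordinates to sample."""
    grid = []
    for j in range(res_v):
        v = (j + 0.5) / res_v
        row_coords = []
        for i in range(res_u):
            u = (i + 0.5) / res_u
            point = tuple(origin[k] + u * edge_u[k] + v * edge_v[k] for k in range(3))
            row_coords.append(direction_to_pixel(point, width, height))
        grid.append(row_coords)
    return grid
```

Sampling on a regular grid in the plane of the wall, rather than in panorama space, is what removes the panoramic distortion from the resulting texture map.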
As shown in
The color channels are used to assign colors to pixels in the image. In one embodiment, the color channels comprise three individual color channels corresponding to the primary colors red, green and blue, but other color channels could be used. Each pixel in the image has a color represented as a combination of the color channels. The alpha channel is used to represent transparency and object masks. This permits the treatment of semi-transparent objects and fuzzy contours, such as trees or hair. A depth channel is used to assign 3D depth for the pixels in the image.
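A minimal Python sketch of such a per-pixel structure is shown below. The class name, the default values, and the simple blending rule in the paint method are hypothetical; the text specifies only that color, alpha, and depth channels are stored for the pixels.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class PanoramaLayer:
    """Per-pixel color, alpha, and depth channels for one image (illustrative layout)."""
    width: int
    height: int
    color: List[List[Tuple[float, float, float]]] = field(init=False)
    alpha: List[List[float]] = field(init=False)
    depth: List[List[float]] = field(init=False)

    def __post_init__(self):
        self.color = [[(0.0, 0.0, 0.0)] * self.width for _ in range(self.height)]
        self.alpha = [[1.0] * self.width for _ in range(self.height)]            # fully opaque
        self.depth = [[float("inf")] * self.width for _ in range(self.height)]  # no depth assigned yet

    def paint(self, col, row, rgb, opacity):
        """Blend a color over a pixel; the depth channel is left untouched."""
        old = self.color[row][col]
        self.color[row][col] = tuple(o * (1.0 - opacity) + n * opacity for o, n in zip(old, rgb))
```

Keeping depth in its own channel means a color edit such as painting does not disturb the geometry, and a depth edit does not disturb the colors.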
With the image panoramas stored in the data structure, the image can be viewed using a display 4215. Using the display 4215 and a set of interactive tools 4220, the user interacts with the image causing the edits to be transformed into changes to the data structures. This organization makes it easy to add new functionality. Although the features of the system are presented sequentially, all processes are naturally interleaved. For example, editing can start before depth is acquired, and the representation can be refined while the editing proceeds.
In some embodiments, the functionality of the systems and methods described above can be implemented as software on a general-purpose computer. In such an embodiment, the program can be written in any one of a number of high-level languages, such as FORTRAN, PASCAL, C, C++, C#, LISP, JAVA, or BASIC. Further, the program can be written in a script, macro, or functionality embedded in commercially available software, such as VISUAL BASIC. The program may also be implemented as a plug-in for commercially or otherwise available image editing software, such as ADOBE PHOTOSHOP. Additionally, the software could be implemented in an assembly language directed to a microprocessor resident on a computer. For example, the software could be implemented in Intel 80x86 assembly language if it were configured to run on an IBM PC or PC clone. The software can be embedded on an article of manufacture including, but not limited to, a “computer-readable medium” such as a floppy disk, a hard disk, an optical disk, a magnetic tape, a PROM, an EPROM, or CD-ROM.
While the invention has been particularly shown and described with reference to specific embodiments, it should be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. The scope of the invention is thus indicated by the appended claims and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced.
This application claims the benefit of U.S. Provisional Application No. 60/447,652, entitled “Photorealistic 3D Content Creation and Editing From Generalized Panoramic Image Data,” filed Feb. 14, 2003, and U.S. application Ser. No. 10/780,500, entitled “Modeling and Editing Image Panoramas,” filed Feb. 17, 2004, the contents of which are hereby incorporated by reference in their entirety.
Number | Date | Country | |
---|---|---|---|
60447652 | Feb 2003 | US |
 | Number | Date | Country
---|---|---|---
Parent | 10780500 | Feb 2004 | US
Child | 14062544 | | US