Many users may create image data using various devices, such as digital cameras, tablets, mobile devices, smart phones, etc. For example, a user may capture a set of images depicting a beach using a mobile phone while on vacation. The user may organize the set of images into an album, a cloud-based photo sharing stream, a visualization, etc. In an example of a visualization, the set of images may be stitched together to create a panorama of a scene depicted by the set of images. In another example of a visualization, the set of images may be used to create a spin-movie. Unfortunately, navigating the visualization may be unintuitive and/or overly complex due to the set of images depicting the scene from various viewpoints.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key factors or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Among other things, one or more systems and/or techniques for generating a synth packet and/or for providing an interactive view navigation experience utilizing the synth packet are provided herein.
In some embodiments of generating a synth packet, a navigation model associated with a set of input images depicting a scene may be identified. The navigation model may correspond to a capture pattern associated with positional information and/or rotational information of a camera used to capture the set of input images. For example, the capture pattern may correspond to one or more viewpoints from which the input images were captured. In an example, a user may walk down a street while taking pictures of building facades every few feet, which may correspond to a strafe capture pattern. In another example, a user may walk around a statue in a circular motion while taking pictures of the statue, which may correspond to a spin capture pattern.
A local graph structured according to the navigation model may be constructed. The local graph may specify relationship information between respective input images within the set of images. For example, the local graph may comprise a first node representing a first input image and a second node representing a second input image. A first edge may be created between the first node and the second node based upon the navigation model indicating that the second image has a relationship with the first image (e.g., the user may have taken the first image of the statue, walked a few feet, and then taken the second image of the statue, such that a current view of the scene may be visually navigated from the first image to the second image). The first edge may represent translational view information between the first input image and the second input image, which may be used to generate a translated view of the scene based upon image data contributed from the first image and the second image. In another example, the navigation model may indicate that a third image was taken from a viewpoint that is substantially far away from the viewpoint from which the first image and the second image were taken (e.g., the user may have to walk halfway around the statue before taking the third image). Thus, the first node and the second node may not be connected to a third node representing the third image within the local graph because visually navigating from the first image or the second image to the third image may result in various visual quality issues (e.g., blur, jumpiness, incorrect depiction of the scene, seam lines, and/or other visual error).
A synth packet comprising the set of input images and the local graph may be generated. The local graph may be used to navigate between the set of input images during an interactive view navigation of the scene (e.g., a visualization). A user may be capable of continuously navigating the scene in one-dimensional space and/or two-dimensional space using interactive view navigation input (e.g., one or more gestures on a touch device that translate into direct manipulation of a current view of the scene). The interactive view navigation of the scene may appear to the user as a single navigable visualization (e.g., a panorama, a spin movie around an object, moving down a corridor, etc.) as opposed to navigating between individual input images. In some embodiments, the synth packet comprises a camera pose manifold (e.g., view perspectives from which the scene may be viewed), a coarse geometry (e.g., a multi-dimensional representation of a surface of the scene upon which one or more input images may be projected), and/or other image information.
In some embodiments of providing an interactive view navigation experience, the synth packet comprises the set of input images, the camera pose manifold, the coarse geometry, and the local graph. The interactive view navigation experience may display one or more current views of the scene depicted by a set of input images (e.g., a facial view of the statue). The interactive view navigation experience may allow a user to continuously and/or seamlessly navigate the scene in multidimensional space based upon interactive view navigation input. For example, the user may visually “walk around” the statue as though the scene of the statue was a single multi-dimensional visualization, as opposed to visually transitioning between individual input images. The interactive view navigation experience may be provided based upon navigating the local graph within the synth packet. For example, responsive to receiving interactive view navigation input, the local graph may be navigated (e.g., traversed) from a first portion (e.g., a first node or a first edge) to a second portion (e.g., a second node or a second edge) based upon the interactive view navigation input (e.g., navigation from a first node, representing a first image depicting the face of the statue, to a second node representing a second image depicting a left side of the statue). The current view of the scene (e.g., the facial view of the statue) may be transitioned to a new current view of the scene corresponding to the second portion of the local graph (e.g., a view of the left side of the statue). Transitioning between nodes and/or edges may be translated into seamless three-dimensional navigation of the scene.
To the accomplishment of the foregoing and related ends, the following description and annexed drawings set forth certain illustrative aspects and implementations. These are indicative of but a few of the various ways in which one or more aspects may be employed. Other aspects, advantages, and novel features of the disclosure will become apparent from the following detailed description when considered in conjunction with the annexed drawings.
The claimed subject matter is now described with reference to the drawings, wherein like reference numerals are generally used to refer to like elements throughout. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide an understanding of the claimed subject matter. It may be evident, however, that the claimed subject matter may be practiced without these specific details. In other instances, structures and devices are illustrated in block diagram form in order to facilitate describing the claimed subject matter.
An embodiment of generating a synth packet is illustrated by an exemplary method 100 of
Because the set of input images may not depict every aspect of the scene at a desired quality and/or resolution, a suggested camera position, derived from the navigation model and one or more previously captured input images, may be provided during capture of an input image for inclusion within the set of input images. The suggested camera position may correspond to a view of the scene not depicted by the one or more previously captured input images. For example, the navigation model may correspond to a spin capture pattern where a user walked around the house taking pictures of the house. However, the user may not have adequately captured a second story side view of the house, which may be identified based upon the spin capture pattern and the one or more previously captured input images of the house. Accordingly, a suggested camera position corresponding to the second story side view may be provided. In another example, a new input image may be automatically captured for inclusion within the set of input images based upon the new input image (e.g., a current camera view of the scene) depicting the scene from a view, associated with the navigation model, not depicted by the set of input images.
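By way of a non-limiting illustration, the suggestion of a camera position for a spin capture pattern may be sketched as follows. The sketch assumes that each previously captured input image has been reduced to a single angle, in degrees, around the object; the function name `suggest_spin_angle` and this one-dimensional reduction are hypothetical and are not taken from the description above.

```python
def suggest_spin_angle(captured_angles_deg):
    """Return the angle (degrees) at the middle of the largest uncovered arc.

    Assumption: every captured image is summarised by one angle around the
    object. Returns None when nothing has been captured yet.
    """
    if not captured_angles_deg:
        return None
    # Normalise to [0, 360) and drop duplicates so equal angles form no gap.
    angles = sorted({a % 360.0 for a in captured_angles_deg})
    best_gap, best_start = -1.0, 0.0
    for i, start in enumerate(angles):
        end = angles[(i + 1) % len(angles)]
        # A single captured angle leaves the whole circle uncovered.
        gap = (end - start) % 360.0 or 360.0
        if gap > best_gap:
            best_gap, best_start = gap, start
    return (best_start + best_gap / 2.0) % 360.0
```

In this sketch, the returned angle may be presented to the user as the suggested camera position, or compared against a current camera view to decide whether a new input image may be captured automatically.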
In an example, the navigation model may correspond to a capture pattern associated with positional information and/or rotational information of a camera used to capture at least one input image of the set of input images. The navigation model may be identified based upon the capture pattern.
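As one hedged illustration of identifying a navigation model from a capture pattern, the following sketch distinguishes a strafe capture pattern from a spin capture pattern using positional and rotational information. The pose representation `(x, y, heading_deg)`, the tolerance value, and the heuristic itself are assumptions made for illustration only.

```python
import math


def classify_capture_pattern(poses, heading_tol_deg=15.0):
    """Classify camera poses given as (x, y, heading_deg) tuples.

    Heuristic (an assumption, not a prescribed method): a camera that moves
    while keeping a near-constant heading is treated as a strafe capture; a
    camera that moves while its heading changes is treated as a spin capture.
    """
    if len(poses) < 2:
        return "unknown"

    def wrap(delta):
        # Wrap an angular difference into [-180, 180).
        return (delta + 180.0) % 360.0 - 180.0

    first_heading = poses[0][2]
    spread = max(abs(wrap(p[2] - first_heading)) for p in poses)
    travelled = sum(
        math.dist(poses[i][:2], poses[i + 1][:2]) for i in range(len(poses) - 1)
    )
    if travelled == 0.0:
        return "unknown"  # no positional change; outside this sketch's scope
    return "strafe" if spread <= heading_tol_deg else "spin"
```

A user walking down a street while facing the building facades would yield poses with nearly identical headings, whereas a user circling a statue would yield headings that sweep around the object.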
At 106, a local graph is constructed. The local graph is structured according to the navigation model (e.g., the navigation model may provide insight into how to navigate from a first input image to a second input image because the first input image and the second input image were taken from relatively similar viewpoints of the scene; how to create a current view of the scene from a transitional view corresponding to multiple input images; and/or that navigating from the first input image to a third input image may produce visual error because the first input image and the third input image were taken from relatively different viewpoints of the scene). The local graph may specify relationship information between respective input images within the set of input images, which may be used during navigation of the scene. For example, a current view may correspond to a front portion of a house depicted by a first input image. Interactive view navigation input corresponding to a rotational sweep from the front portion of the house to a side portion of the house may be detected. The local graph may comprise relationship information indicating that a second input image (e.g., or a translational view derived from multiple input images being projected onto a coarse geometry) may be used to provide a new current view depicting the side portion of the house.
In an example, the local graph comprises one or more nodes connected by one or more edges. For example, the local graph may comprise a first node representing a first input image (e.g., depicting the front portion of the house), a second node representing a second input image (e.g., depicting the side portion of the house), a third node representing a third input image (e.g., depicting a back portion of the house), and/or other nodes. A first edge may be created between the first node and the second node based upon the navigation model specifying a view navigation relationship between the first image and the second image (e.g., the first input image and the second input image were taken from relatively similar viewpoints of the scene). However, the first node may not be connected to the third node by an edge based upon the navigation model (e.g., the first input image and the third input image were taken from relatively different viewpoints of the scene). In an example, a current view of the front portion of the house may be seamlessly navigated to a new current view of the side portion of the house (e.g., the first image may be displayed, then one or more transitional views based upon the first image and the second image may be displayed, and finally the second image may be displayed) based upon traversing the local graph from the first node to the second node along the first edge. Because the local graph does not have an edge between the first node and the third node, the current view of the front portion of the house cannot be directly transitioned to the back portion of the house, which may otherwise produce visual errors and/or a “jagged or jumpy” transition.
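A minimal sketch of constructing such a local graph is given below, assuming that each input image is associated with a two-dimensional viewpoint and that "relatively similar viewpoints" may be approximated by a distance threshold. The function name `build_local_graph`, the adjacency-set representation, and the threshold are hypothetical choices for illustration.

```python
import math
from itertools import combinations


def build_local_graph(viewpoints, max_distance):
    """Build an undirected graph over input images.

    viewpoints: dict mapping an image identifier to an (x, y) camera position.
    max_distance: assumed threshold below which two viewpoints are treated as
    relatively similar and therefore joined by an edge.
    Returns a dict mapping each image identifier to a set of neighbours.
    """
    graph = {image_id: set() for image_id in viewpoints}
    for a, b in combinations(viewpoints, 2):
        if math.dist(viewpoints[a], viewpoints[b]) <= max_distance:
            graph[a].add(b)
            graph[b].add(a)
    return graph
```

With viewpoints for a front, a side, and a back image of a house, a suitable threshold may connect the front image to the side image and the side image to the back image while leaving the front image and the back image unconnected.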
Instead, the graph may be traversed from the first node to the second node, and then from the second node to the third node based upon a second edge connecting the second node to the third node (e.g., the first image may be displayed, then one or more transitional views between the first image and the second image may be displayed, then the second image may be displayed, then one or more transitional views between the second image and the third image may be displayed, and then finally the third image may be displayed). In this way, a user may seamlessly navigate and/or explore the scene of the house by transitioning between input images along edges connecting nodes representing such images within the local graph.
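The traversal described above may be sketched, under stated assumptions, as a breadth-first search over the local graph. The adjacency-set representation and the function name `navigation_path` are hypothetical; breadth-first search is merely one way of finding a sequence of edges between two nodes.

```python
from collections import deque


def navigation_path(graph, start, goal):
    """Return a list of nodes from start to goal that follows only existing
    edges, or None when no such sequence exists.

    graph: dict mapping a node to a set of neighbouring nodes (assumed form).
    """
    if start not in graph or goal not in graph:
        return None
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None
```

For a graph in which the front image is connected only to the side image, and the side image to the back image, a request to navigate from the front image to the back image passes through the side image, so that transitional views may be displayed along each edge in turn.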
At 108, a synth packet comprising the set of input images and the local graph is generated. In some embodiments, the synth packet comprises a single file (e.g., a file comprising information that may be used to construct a visualization of the scene and/or provide a user with an interactive view navigation of the scene). In some embodiments, the synth packet comprises the camera pose manifold and/or the coarse geometry. The synth packet may be used to provide an interactive view navigation experience, as illustrated by
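As a hedged illustration of a synth packet stored as a single file, the following sketch groups the components named above into one structure that may be serialized to text. The field layout, the use of JSON, and the class name `SynthPacket` are assumptions; an actual packet would also carry image data rather than only image identifiers.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SynthPacket:
    """Hypothetical container for the components of a synth packet."""

    input_images: list                 # image identifiers or file names
    local_graph: dict                  # node -> list of neighbouring nodes
    camera_pose_manifold: list = field(default_factory=list)  # optional
    coarse_geometry: list = field(default_factory=list)       # optional

    def to_json(self):
        # Serialise every field into one text document (one "file").
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))
```

A packet written with `to_json` may be read back with `from_json` by a viewer that provides the interactive view navigation experience.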
In some embodiments, the packet generation component 404 is configured to construct a coarse geometry 412 of the scene. Because the coarse geometry 412 may initially represent a non-textured multi-dimensional surface of the scene, one or more input images within the set of input images 402 may be projected onto the coarse geometry 412 to texture (e.g., assign color values to geometry pixels) the coarse geometry, resulting in textured coarse geometry. Because a current view of the scene may not directly correspond to a single input image, the current view may be derived from the coarse geometry 412 (e.g., the textured coarse geometry) from a view perspective defined by the camera pose manifold 410. In this way, the packet generation component 404 may generate the synth packet 408 comprising the set of input images 402, the camera pose manifold 410, the coarse geometry 412, and/or the local graph 414. The synth packet 408 may be used to provide an interactive view navigation experience of the scene. For example, a user may visually explore the outside of the house in three-dimensional space as though the house were represented by a single visualization, as opposed to individual input images (e.g., one or more current views of the scene may be constructed by navigating the local graph 414).
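The texturing of a coarse geometry by projection may be illustrated, in simplified form, as follows. The sketch assumes a pinhole camera model with the principal point at the image center, vertices expressed in camera coordinates, and no occlusion handling; none of these assumptions is prescribed by the description above, and the function names are hypothetical.

```python
def project_vertex(vertex, focal_length, image_width, image_height):
    """Project a vertex (x, y, z) in camera coordinates to pixel coordinates.

    Returns (u, v), or None when the vertex lies behind the camera or
    projects outside the image.
    """
    x, y, z = vertex
    if z <= 0:
        return None
    u = focal_length * x / z + image_width / 2.0
    v = focal_length * y / z + image_height / 2.0
    if 0 <= u < image_width and 0 <= v < image_height:
        return (u, v)
    return None


def texture_geometry(vertices, image, focal_length):
    """Assign a color value to each vertex by sampling the input image.

    image: list of rows, each row a list of color values.
    Vertices that the image does not cover receive None.
    """
    height, width = len(image), len(image[0])
    colors = []
    for vertex in vertices:
        pixel = project_vertex(vertex, focal_length, width, height)
        if pixel is None:
            colors.append(None)
        else:
            u, v = pixel
            colors.append(image[int(v)][int(u)])  # nearest-pixel lookup
    return colors
```

Vertices left without a color value by one input image may be textured from another input image, which is one reason a current view may draw upon more than one input image.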
An embodiment of providing an interactive view navigation experience utilizing a synth packet is illustrated by an exemplary method 600 of
The view navigation experience may correspond to a presentation of an interactive visualization (e.g., a panorama, a spin movie, a multi-dimensional space representing the scene, etc.) that a user may navigate in multi-dimensional space to explore the scene depicted by the set of input images. The view navigation experience may provide a 3D experience by navigating from input image to input image, along edges within the local graph, in 3D space (e.g., allowing continuous navigation between input images as though the visualization of the scene was a single navigable entity as opposed to individual input images). That is, the set of input images within the synth packet may be continuously and/or intuitively navigable as a single visualization unit (e.g., a user may continuously navigate through the scene by merely swiping across the visualization, and may intuitively navigate through the scene where navigation input may translate into direct navigation manipulation of the scene). In particular, the scene may be explored as a single visualization because the set of input images are represented on a single continuous manifold within a simple topology, such as the local graph (e.g., spinning around an object, looking at a panorama, moving down a corridor, and/or other visual navigation experiences of a single visualization). Navigation may be simplified because the dimensionality of the scene may be reduced to merely one or more dimensions of the local graph. Thus, navigation of complex image configurations may become feasible on various computing devices, such as a touch device where a user may navigate in 3D space using left/right gestures for navigation in a first dimension and up/down gestures for navigation in a second dimension. The user may be able to zoom into areas and/or navigate to a second scene depicted by a second synth packet using other gestures, for example.
At 602, the method starts. At 604, an interactive view navigation input associated with the interactive view navigation experience may be received. At 606, the local graph may be navigated from a first portion of the local graph (e.g., a first node representing a first image used to generate a current view of the scene; a first edge representing a translated view of the scene derived from a projection of one or more input images onto the coarse geometry from a view perspective defined by the camera pose manifold; etc.) to a second portion of the local graph (e.g., a second node representing a second image that may depict the scene from a viewpoint corresponding to the interactive view navigation input; a second edge representing a translated view depicting the scene from a viewpoint corresponding to the interactive view navigation input; etc.) based upon the interactive view navigation input. In an example, a current view of a northern side of a house may have been derived from a first input image represented by a first node. A first edge may connect the first node to a second node representing a second input image depicting a northeastern side of the house. For example, the first edge may connect the first node and the second node because the first image and the second image were captured from relatively similar viewpoints of the house. The first edge may be traversed to the second node because the interactive view navigation input may correspond to a navigation of the scene from the northern side of the house to a northeastern side of the house (e.g., a simple gesture may be used to seamlessly navigate to the northeastern side of the house from the northern side). At 608, a current view of the scene (e.g., depicting the northern side of the house) corresponding to the first portion of the local graph may be transitioned to a new current view of the scene (e.g., depicting the northeastern side of the house) corresponding to the second portion of the local graph.
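The mapping of gestures to dimensions of the local graph may be sketched as follows, assuming, for illustration only, that nodes of the local graph are laid out on a two-dimensional grid of integer coordinates. The gesture names, the grid layout, and the function name `apply_gesture` are hypothetical.

```python
# Assumed mapping: left/right gestures move along the first dimension,
# up/down gestures move along the second dimension.
GESTURE_STEPS = {
    "left": (-1, 0),
    "right": (1, 0),
    "up": (0, 1),
    "down": (0, -1),
}


def apply_gesture(nodes, current, gesture):
    """Return the node reached from current by the given gesture.

    nodes: set of (i, j) grid coordinates that exist in the local graph.
    The current node is returned unchanged when the gesture is not
    recognised or when no node exists in the requested direction.
    """
    step = GESTURE_STEPS.get(gesture)
    if step is None:
        return current
    candidate = (current[0] + step[0], current[1] + step[1])
    return candidate if candidate in nodes else current
```

Because movement is restricted to nodes that exist in the graph, a gesture toward an uncaptured viewpoint leaves the current view unchanged in this sketch rather than producing a view with visual error.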
In an example, the interactive view navigation input corresponds to the second node within the local graph. Accordingly, the new current view is displayed based upon the second image represented by the second node. In another example, the interactive view navigation input corresponds to the first edge connecting the first node and the second node. The new current view may be displayed based upon a projection of the first image, the second image and/or other images onto the coarse geometry (e.g., thus generating a textured coarse geometry) utilizing the camera pose manifold. The new current view may correspond to a view of the textured coarse geometry from a view perspective defined by the camera pose manifold. At 610, the method ends.
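The distinction between navigation input corresponding to a node and navigation input corresponding to an edge may be illustrated by the following sketch. It assumes that a position along an edge is expressed as a parameter `t` between 0 and 1, and it substitutes simple linear interpolation of the viewpoint and of per-image blend weights for the projection onto the coarse geometry described above; the names and the returned structure are hypothetical.

```python
def resolve_view(first, second, t):
    """Describe the view at position t along the edge from first to second.

    first, second: (image_id, (x, y)) pairs for the two connected nodes.
    t: 0.0 selects the first node, 1.0 the second node, and values in
    between select a transitional view on the edge.
    """
    (id_a, pos_a), (id_b, pos_b) = first, second
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if t == 0.0:
        return {"source": "node", "images": {id_a: 1.0}, "viewpoint": pos_a}
    if t == 1.0:
        return {"source": "node", "images": {id_b: 1.0}, "viewpoint": pos_b}
    # Stand-in for the camera pose manifold: interpolate the viewpoint.
    viewpoint = tuple((1.0 - t) * a + t * b for a, b in zip(pos_a, pos_b))
    # Stand-in for projection onto the coarse geometry: blend weights.
    return {
        "source": "edge",
        "images": {id_a: 1.0 - t, id_b: t},
        "viewpoint": viewpoint,
    }
```

At a node the view is taken from a single input image, whereas on an edge the view draws upon both connected input images in proportion to the position along the edge.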
The system 700 may comprise an image viewing interface component 704. The image viewing interface component 704 may be configured to display a current view of the scene based upon navigation within the visualization 706. It may be appreciated that in an example, navigation of the visualization 706 may correspond to multi-dimensional navigation, such as three-dimensional navigation, and that merely one-dimensional and/or two-dimensional navigation are illustrated for simplicity. The current view may correspond to a second node, representing the second input image 710 depicting the portion of the cloud and the portion of the sun, within the local graph. Responsive to receiving interactive view navigation input 716 (e.g., a gesture swiping right across a touch device), the local graph may be traversed from the second node, across a second edge, to a third node representing the third image 712. A new current view may be displayed based upon the third image 712. In this way, a user may seamlessly navigate the visualization 706 as though the visualization 706 was a single navigable entity (e.g., based upon structured movement along edges and/or between nodes within the local graph) as opposed to individual input images.
Still another embodiment involves a computer-readable medium comprising processor-executable instructions configured to implement one or more of the techniques presented herein. An example embodiment of a computer-readable medium or a computer-readable device that is devised in these ways is illustrated in
As used in this application, the terms “component”, “module,” “system”, “interface”, and the like are generally intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution. For example, a component includes a process running on a processor, a processor, an object, an executable, a thread of execution, a program, or a computer. By way of illustration, both an application running on a controller and the controller can be a component. One or more components may reside within a process or thread of execution, and a component may be localized on one computer or distributed between two or more computers.
Furthermore, the claimed subject matter is implemented as a method, apparatus, or article of manufacture using standard programming or engineering techniques to produce software, firmware, hardware, or any combination thereof to control a computer to implement the disclosed subject matter. The term “article of manufacture” as used herein is intended to encompass a computer program accessible from any computer-readable device, carrier, or media. Of course, many modifications may be made to this configuration without departing from the scope or spirit of the claimed subject matter.
Generally, embodiments are described in the general context of “computer readable instructions” being executed by one or more computing devices. Computer readable instructions are distributed via computer readable media as will be discussed below. Computer readable instructions are implemented as program modules, such as functions, objects, Application Programming Interfaces (APIs), data structures, and the like, that perform particular tasks or implement particular abstract data types. Typically, the functionality of the computer readable instructions is combined or distributed as desired in various environments.
In other embodiments, device 912 includes additional features or functionality. For example, device 912 also includes additional storage such as removable storage or non-removable storage, including, but not limited to, magnetic storage, optical storage, and the like. Such additional storage is illustrated in
The term “computer readable media” as used herein includes computer storage media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions or other data. Memory 918 and storage 920 are examples of computer storage media. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVDs) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by device 912. Any such computer storage media is part of device 912.
The term “computer readable media” includes communication media. Communication media typically embodies computer readable instructions or other data in a “modulated data signal” such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” includes a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
Device 912 includes input device(s) 924 such as keyboard, mouse, pen, voice input device, touch input device, infrared cameras, video input devices, or any other input device. Output device(s) 922 such as one or more displays, speakers, printers, or any other output device are also included in device 912. Input device(s) 924 and output device(s) 922 are connected to device 912 via a wired connection, wireless connection, or any combination thereof. In some embodiments, an input device or an output device from another computing device are used as input device(s) 924 or output device(s) 922 for computing device 912. Device 912 also includes communication connection(s) 926 to facilitate communications with one or more other devices.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter of the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
Various operations of embodiments are provided herein. The order in which some or all of the operations are described should not be construed as to imply that these operations are necessarily order dependent. Alternative ordering will be appreciated by one skilled in the art having the benefit of this description. Further, it will be understood that not all operations are necessarily present in each embodiment provided herein.
It will be appreciated that layers, features, elements, etc. depicted herein are illustrated with particular dimensions relative to one another, such as structural dimensions and/or orientations, for example, for purposes of simplicity and ease of understanding and that actual dimensions of the same differ substantially from that illustrated herein, in some embodiments.
Further, unless specified otherwise, “first,” “second,” or the like are not intended to imply a temporal aspect, a spatial aspect, an ordering, etc. Rather, such terms are merely used as identifiers, names, etc. for features, elements, items, etc. For example, a first object and a second object generally correspond to object A and object B or two different or two identical objects or the same object.
Moreover, “exemplary” is used herein to mean serving as an example, instance, illustration, etc., and not necessarily as advantageous. As used in this application, “or” is intended to mean an inclusive “or” rather than an exclusive “or”. In addition, “a” and “an” as used in this application are generally to be construed to mean “one or more” unless specified otherwise or clear from context to be directed to a singular form. Also, at least one of A and B and/or the like generally means A or B or both A and B. Furthermore, to the extent that “includes”, “having”, “has”, “with”, or variants thereof are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term “comprising”.
Also, although the disclosure has been shown and described with respect to one or more implementations, equivalent alterations and modifications will occur to others skilled in the art based upon a reading and understanding of this specification and the annexed drawings. The disclosure includes all such modifications and alterations and is limited only by the scope of the following claims.