This relates generally to user interfaces that enable a user to scan real-world objects on an electronic device.
Extended reality settings are environments where at least some objects displayed for a user's viewing are generated using a computer. In some uses, a user may create or modify extended reality settings, such as by inserting extended reality objects that are based on physical objects into an extended reality setting.
Some embodiments described in this disclosure are directed to methods for electronic devices to scan a physical object for the purpose of generating a three-dimensional object model of the physical object. Some embodiments described in this disclosure are directed to methods for electronic devices to display capture targets for scanning a physical object. The full descriptions of the embodiments are provided in the Drawings and the Detailed Description, and it is understood that this Summary does not limit the scope of the disclosure in any way.
For a better understanding of the various described embodiments, reference should be made to the Detailed Description below, in conjunction with the following drawings in which like reference numerals refer to corresponding parts throughout the figures.
In the following description of embodiments, reference is made to the accompanying drawings, which form a part of this Specification and in which are shown, by way of illustration, specific embodiments that are within the scope of the present disclosure. It is to be understood that other embodiments are also within the scope of the present disclosure and that structural changes can be made without departing from the scope of the disclosure.
As used herein, the phrases “the,” “a,” and “an” include both the singular form (e.g., one element) and the plural form (e.g., a plurality of elements), unless explicitly indicated or the context indicates otherwise. The term “and/or” encompasses any and all possible combinations of the listed items (e.g., including embodiments that include any one or more of the listed items). The terms “comprises” and/or “includes” specify the inclusion of stated elements but do not exclude the addition of other elements (e.g., the existence of other elements that are not explicitly recited does not, in and of itself, prevent an embodiment from “including” or “comprising” an explicitly recited element). As used herein, the terms “first,” “second,” etc. are used to describe various elements, but these terms should not be interpreted as limiting the various elements; they are used merely to distinguish one element from another (e.g., to distinguish two elements of the same type from each other). The term “if” can be interpreted to mean “when,” “upon” (e.g., optionally including a temporal element), or “in response to” (e.g., without requiring a temporal element).
Physical settings are those in the world where people can sense and/or interact without use of electronic systems (e.g., the real-world environment, the physical environment, etc.). For example, a room is a physical setting that includes physical elements, such as, physical chairs, physical desks, physical lamps, and so forth. A person can sense and interact with these physical elements of the physical setting through direct touch, taste, sight, smell, and hearing.
In contrast to a physical setting, an extended reality (XR) setting refers to a computer-produced environment that is partially or entirely generated using computer-produced content. While a person can interact with the XR setting using various electronic systems, this interaction utilizes various electronic sensors to monitor the person's actions, and translates those actions into corresponding actions in the XR setting. For example, if an XR system detects that a person is looking upward, the XR system may change its graphics and audio output to present XR content in a manner consistent with the upward movement. XR settings may incorporate laws of physics to mimic physical settings.
Concepts of XR include virtual reality (VR) and augmented reality (AR). Concepts of XR also include mixed reality (MR), which is sometimes used to refer to the spectrum of realities between physical settings (but not including physical settings) at one end and VR at the other end. Concepts of XR also include augmented virtuality (AV), in which a virtual or computer-produced setting integrates sensory inputs from a physical setting. These inputs may represent characteristics of a physical setting. For example, a virtual object may be displayed in a color captured, using an image sensor, from the physical setting. As another example, an AV setting may adopt current weather conditions of the physical setting.
Some electronic systems for implementing XR operate with an opaque display and one or more imaging sensors for capturing video and/or images of a physical setting. In some implementations, when a system captures images of a physical setting and displays a representation of the physical setting on an opaque display using the captured images, the displayed images are called a video pass-through. Some electronic systems for implementing XR operate with an optical see-through display that may be transparent or semi-transparent (and optionally with one or more imaging sensors). Such a display allows a person to view a physical setting directly through the display, and allows for virtual content to be added to the person's field-of-view by superimposing the content over an optical pass-through of the physical setting (e.g., overlaid over portions of the physical setting, obscuring portions of the physical setting, etc.). Some electronic systems for implementing XR operate with a projection system that projects virtual objects onto a physical setting. The projector may project a hologram into a physical setting, or may project imagery onto a physical surface, or may project onto the eyes (e.g., retina) of a person, for example.
Electronic systems providing XR settings can have various form factors. A smartphone or a tablet computer may incorporate imaging and display components to present an XR setting. A head-mountable system may include imaging and display components to present an XR setting. These systems may provide computing resources for generating XR settings, and may work in conjunction with one another to generate and/or present XR settings. For example, a smartphone or a tablet can connect with a head-mounted display to present XR settings. As another example, a computer may connect with home entertainment components or vehicular systems to provide an on-window display or a heads-up display. Electronic systems displaying XR settings may utilize display technologies such as LEDs, OLEDs, QD-LEDs, liquid crystal on silicon, a laser scanning light source, a digital light projector, or combinations thereof. Display technologies can employ substrates, through which light is transmitted, including light waveguides, holographic substrates, optical reflectors and combiners, or combinations thereof.
Embodiments of electronic devices, user interfaces for such devices, and associated processes for using such devices are described. In some embodiments, the device is a portable communications device, such as a mobile telephone, that also contains other functions, such as PDA and/or music player functions. Other portable electronic devices, such as laptops, tablet computers with touch-sensitive surfaces (e.g., touch screen displays and/or touch pads), or wearable devices, are, optionally, used. It should also be understood that, in some embodiments, the device is not a portable communications device, but is a desktop computer or a television with a touch-sensitive surface (e.g., a touch screen display and/or a touch pad). In some embodiments, the device does not have a touch screen display and/or a touch pad, but rather is capable of outputting display information (such as the user interfaces of the disclosure) for display on a separate display device, and capable of receiving input information from a separate input device having one or more input mechanisms (such as one or more buttons, a touch screen display and/or a touch pad). In some embodiments, the device has a display, but is capable of receiving input information from a separate input device having one or more input mechanisms (such as one or more buttons, a touch screen display and/or a touch pad).
In the discussion that follows, an electronic device that includes a display and a touch-sensitive surface is described. It should be understood, however, that the electronic device optionally includes one or more other physical user-interface devices, such as a physical keyboard, a mouse and/or a joystick. Further, as described above, it should be understood that the described electronic device, display and touch-sensitive surface are optionally distributed amongst two or more devices. Therefore, as used in this disclosure, information displayed on the electronic device or by the electronic device is optionally used to describe information outputted by the electronic device for display on a separate display device (touch-sensitive or not). Similarly, as used in this disclosure, input received on the electronic device (e.g., touch input received on a touch-sensitive surface of the electronic device) is optionally used to describe input received on a separate input device, from which the electronic device receives input information.
The device typically supports a variety of applications, such as one or more of the following: a drawing application, a presentation application, a word processing application, a website creation application, a disk authoring application, a spreadsheet application, a gaming application, a telephone application, a video conferencing application, an e-mail application, an instant messaging application, a workout support application, a photo management application, a digital camera application, a digital video camera application, a web browsing application, a digital music player application, a television channel browsing application, and/or a digital video player application.
The various applications that are executed on the device optionally use at least one common physical user-interface device, such as the touch-sensitive surface. One or more functions of the touch-sensitive surface as well as corresponding information displayed on the device are, optionally, adjusted and/or varied from one application to the next and/or within a respective application. In this way, a common physical architecture (such as the touch-sensitive surface) of the device optionally supports the variety of applications with user interfaces that are intuitive and transparent to the user.
Attention is now directed toward embodiments of portable or non-portable devices with touch-sensitive displays, though the devices need not include touch-sensitive displays or displays in general, as described above.
Device 200 includes communication circuitry 202. Communication circuitry 202 optionally includes circuitry for communicating with electronic devices, networks, such as the Internet, intranets, a wired network and/or a wireless network, cellular networks and wireless local area networks (LANs). Communication circuitry 202 optionally includes circuitry for communicating using near-field communication and/or short-range communication, such as Bluetooth®.
Processor(s) 204 include one or more general processors, one or more graphics processors, and/or one or more digital signal processors. In some examples, memory 206 includes one or more non-transitory computer-readable storage media (e.g., flash memory, random access memory) that store computer-readable instructions configured to be executed by processor(s) 204 to perform the techniques, processes, and/or methods described below (e.g., with reference to
Device 200 includes display(s) 224. In some examples, display(s) 224 include a single display. In some examples, display(s) 224 includes multiple displays. In some examples, device 200 includes touch-sensitive surface(s) 220 for receiving user inputs, such as tap inputs and swipe inputs. In some examples, display(s) 224 and touch-sensitive surface(s) 220 form touch-sensitive display(s) (e.g., a touch screen integrated with device 200 or external to device 200 that is in communication with device 200).
Device 200 includes image sensor(s) 210 (e.g., capture devices). Image sensor(s) 210 optionally include one or more visible light image sensors, such as charged coupled device (CCD) sensors, and/or complementary metal-oxide-semiconductor (CMOS) sensors operable to obtain images of physical objects from the real environment. Image sensor(s) 210 also optionally include one or more infrared (IR) sensor(s), such as a passive IR sensor or an active IR sensor, for detecting infrared light from the real environment. For example, an active IR sensor includes an IR emitter, such as an IR dot emitter, for emitting infrared light into the real environment. Image sensor(s) 210 also optionally include one or more event camera(s) configured to capture movement of physical objects in the real environment. Image sensor(s) 210 also optionally include one or more depth sensor(s) configured to detect the distance of physical objects from device 200. In some examples, information from one or more depth sensor(s) can allow the device to identify and differentiate objects in the real environment from other objects in the real environment. In some examples, one or more depth sensor(s) can allow the device to determine the texture and/or topography of objects in the real environment.
In some examples, device 200 uses CCD sensors, event cameras, and depth sensors in combination to detect the physical environment around device 200. In some examples, image sensor(s) 210 include a first image sensor and a second image sensor. The first image sensor and the second image sensor work in tandem and are optionally configured to capture different information of physical objects in the real environment. In some examples, the first image sensor is a visible light image sensor and the second image sensor is a depth sensor. In some examples, device 200 uses image sensor(s) 210 to detect the position and orientation of device 200 and/or display(s) 224 in the real environment. For example, device 200 uses image sensor(s) 210 to track the position and orientation of display(s) 224 relative to one or more fixed objects in the real environment.
In some examples, device 200 includes microphone(s) 218. Device 200 uses microphone(s) 218 to detect sound from the user and/or the real environment of the user. In some examples, microphone(s) 218 includes an array of microphones (including a plurality of microphones) that optionally operate in tandem, such as to identify ambient noise or to locate the source of sound in space of the real environment.
Device 200 includes location sensor(s) 214 for detecting a location of device 200 and/or display(s) 224. For example, location sensor(s) 214 can include a GPS receiver that receives data from one or more satellites and allows device 200 to determine the device's absolute position in the world.
Device 200 includes orientation sensor(s) 216 for detecting orientation and/or movement of device 200 and/or display(s) 224. For example, device 200 uses orientation sensor(s) 216 to track changes in the position and/or orientation of device 200 and/or display(s) 224, such as with respect to physical objects in the real environment. Orientation sensor(s) 216 optionally include one or more gyroscopes and/or one or more accelerometers.
Device 200 is not limited to the components and configuration of
Attention is now directed towards examples of user interfaces (“UI”) and associated processes that are implemented on an electronic device, such as portable multifunction device 100, device 200, device 300, device 400, device 500, or device 600.
The examples described below provide ways in which an electronic device scans a real-world object, for instance to generate a three-dimensional object model of the scanned physical object. The embodiments herein improve the speed and accuracy of object scanning operations, thereby enabling the creation of accurate computer models.
Referring back to
In some examples, user interface 301 is a camera-style user interface that displays a real-time view of the real-world environment 310 captured by the one or more sensors of device 300. For example, the one or more sensors capture the vase and a portion of table 320, and thus user interface 301 displays a representation 330 of the vase and a representation of the portion of table 320 that is captured by the one or more sensors (e.g., an XR environment). In some examples, user interface 301 includes reticle 302 that indicates the center position or focus position of the one or more sensors. In some examples, reticle 302 provides the user with a guide and/or target and allows a user to indicate to device 300 what object the user desires to be scanned. As will be described in further detail below, when reticle 302 is placed over a real-world object (e.g., device 300 is positioned such that the one or more sensors are centered on and capture the desired object), device 300 identifies the object of interest separate from other objects in the real-world environment (e.g., using data received from the one or more sensors) and initiates the process of scanning the object.
In some examples, as will be described in further detail below, the process of scanning the object involves performing multiple captures of the respective object from multiple angles and/or perspectives. In some examples, using the data from the multiple captures, device 300 constructs a partial or complete three-dimensional scan of the respective object. In some examples, device 300 processes the three-dimensional scan and generates a three-dimensional model of the object. In some examples, device 300 sends the three-dimensional scan data to a server to generate the three-dimensional model of the object. In some examples, processing the three-dimensional scan and generating a three-dimensional model of the object includes performing one or more photogrammetry processes. In some examples, the three-dimensional model can be used in an XR setting creation application. In some examples, device 300 is able to perform the process of scanning the object without requiring the user to place the object on, in, or next to a particular reference pattern (e.g., a predetermined pattern, such as a hashed pattern) or reference object (e.g., a predetermined object), or at a reference location (e.g., a predetermined location). For example, device 300 is able to identify the object separate from other objects in the environment and scan the object without any external reference.
In some examples, device 400 performs one or more captures of the vase using the one or more capture devices. In some examples, the one or more capture devices capture a subset of the total environment that is displayed on user interface 401. For example, the one or more capture devices may capture only a small radius at or near the center of the capture devices (e.g., the focal point), such as at or near the location of reticle 402, while user interface 401 displays a larger view of the real-world environment 410. In some examples, the one or more capture devices capture one or more of the color(s), shape, size, texture, depth, topography, etc. of a respective portion of the object. In some examples, while performing directed captures of the object, the one or more capture devices continue to capture the real world environment, for the purpose of displaying the real world environment in user interface 401, for example.
In some examples, a capture of a portion of the object is accepted if and/or when the capture satisfies one or more capture criteria. For example, the one or more capture criteria includes a requirement that the one or more capture devices be at a particular position with respect to the portion of the object being captured. In some examples, the capture devices must be at certain angles with respect to the portion being captured (e.g., at a “normal” angle, at a perpendicular angle, optionally with a tolerance of 5 degrees, 10 degrees, 15 degrees, 30 degrees, etc. in any direction from the “normal” angle). In some examples, the capture devices must be more than a certain distance from the portion being captured (e.g., more than 3 inches away, 6 inches away, 12 inches away, 2 feet away, etc.), and/or less than a certain distance from the portion being captured (e.g., less than 6 feet away, 3 feet away, 1 foot away, 6 inches away, etc.). In some examples, the distance(s) at which the captures satisfy the criteria depend on the size of the object. For example, a large object requires scans from further away and a small object requires scans from closer. In some examples, the distance(s) at which the captures satisfy the criteria does not depend on the size of the object (e.g., is the same regardless of the size of the object). In some examples, the one or more capture criteria includes a requirement that the camera be held at the particular position for more than a threshold amount of time (e.g., 0.5 seconds, 1 second, 2 seconds).
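By way of illustration only, the following Swift sketch shows one way such capture criteria could be evaluated. The `CaptureCriteria` and `CandidateCapture` types, the specific tolerances, and the geometric test are assumptions made for illustration; the disclosure does not specify an implementation.

```swift
import Foundation
import simd

// Illustrative sketch only: bundles example tolerances like those discussed above
// (angle from the surface normal, a distance range, and a hold time).
struct CaptureCriteria {
    var maxAngleFromNormal: Float = 15 * .pi / 180   // e.g., a 15-degree tolerance
    var minDistance: Float = 0.15                    // e.g., roughly 6 inches, in meters
    var maxDistance: Float = 1.0                     // e.g., roughly 3 feet, in meters
    var minHoldTime: TimeInterval = 0.5              // e.g., 0.5 seconds
}

// Hypothetical description of a candidate capture's geometry and hold duration.
struct CandidateCapture {
    var cameraPosition: SIMD3<Float>
    var surfacePoint: SIMD3<Float>      // point on the portion of the object being captured
    var surfaceNormal: SIMD3<Float>     // unit normal of that portion
    var heldDuration: TimeInterval      // how long the device has been held steady here
}

// Returns true when the capture is within the angle tolerance from the surface normal,
// within the distance range, and has been held at the position long enough.
func satisfiesCaptureCriteria(_ capture: CandidateCapture, _ criteria: CaptureCriteria) -> Bool {
    let toCamera = simd_normalize(capture.cameraPosition - capture.surfacePoint)
    let angle = acos(max(-1, min(1, simd_dot(toCamera, capture.surfaceNormal))))
    let distance = simd_distance(capture.cameraPosition, capture.surfacePoint)
    return angle <= criteria.maxAngleFromNormal
        && distance >= criteria.minDistance
        && distance <= criteria.maxDistance
        && capture.heldDuration >= criteria.minHoldTime
}
```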
In some examples, the one or more capture criteria include a requirement that the portion of the object captured by the capture overlaps with portions of the object captured by previous captures by a threshold amount (e.g., 10% of the new capture overlaps with previous captures, 25% overlap, 30% overlap, 50% overlap, etc.). In some examples, if a new capture does not overlap with a previous capture by the threshold amount, the one or more capture criteria are not satisfied. In some examples, overlapping the captures allows device 400 (or optionally a server that generates the three-dimensional model) to align the new capture with previous captures.
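As an illustration of the overlap requirement, the following sketch assumes that captured coverage is tracked as a set of quantized surface cells (a hypothetical `VoxelIndex` type); the disclosure does not specify how overlap is measured, so this is only one possible measure.

```swift
// Illustrative sketch: the fraction of the new capture's cells that are already in
// the accumulated coverage approximates the overlap percentage discussed above.
struct VoxelIndex: Hashable {
    var x: Int
    var y: Int
    var z: Int
}

func overlapFraction(newCapture: Set<VoxelIndex>, accumulated: Set<VoxelIndex>) -> Double {
    guard !newCapture.isEmpty else { return 0 }
    return Double(newCapture.intersection(accumulated).count) / Double(newCapture.count)
}

// e.g., require at least 25% overlap so the new capture can be aligned with prior captures:
// let alignable = overlapFraction(newCapture: cells, accumulated: coverage) >= 0.25
```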
In some examples, captures of a portion of the object that satisfy the one or more capture criteria are accepted by device 400. In some examples, captures of a portion of the object that do not satisfy the one or more criteria are rejected by device 400 and a user may be required to perform another capture of the portion of the object (e.g., an indication or prompt may be displayed on the user interface, or the interface does not display an indication that the capture was successful). In some examples, captures that are accepted by device 400 are saved and/or merged with previous captures of the object. In some examples, captures that do not satisfy the one or more capture criteria are discarded (e.g., not saved and not merged with previous captures of the object). In some examples, if the one or more capture criteria are not satisfied, user interface 401 can display one or more indications to instruct and/or guide the user. For example, user interface 401 can display a textual indication instructing the user to slow down, move closer, move further, move to a new location, etc.
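The guidance described above could be selected from the reason a capture was rejected, as in the following sketch; the `CaptureFailure` cases and the message strings are illustrative assumptions rather than the described behavior.

```swift
// Illustrative sketch: map a hypothetical rejection reason to a textual instruction
// like those mentioned above (slow down, move closer, move farther, move elsewhere).
enum CaptureFailure {
    case movingTooFast
    case tooClose
    case tooFar
    case alreadyCaptured
}

func guidanceText(for failure: CaptureFailure) -> String {
    switch failure {
    case .movingTooFast:   return "Slow down"
    case .tooClose:        return "Move farther from the object"
    case .tooFar:          return "Move closer to the object"
    case .alreadyCaptured: return "Move to a new location"
    }
}
```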
Referring back to
In some examples, as the user moves around the vase and/or changes angles and/or positions with respect to vase (and user interface 401 is updated to show different angles or portions of the vase due to device 400 moving to different positions and angles), device 400 continually performs additional captures of the vase (e.g., every 0.25 seconds, 0.5 seconds, every 1 second, every 5 seconds, every 10 seconds, every 30 seconds, etc.). In some examples, additional captures are performed in response to detecting that the device has moved to a new position, that the device position has stabilized (e.g., has moved less than a threshold for more than a time threshold), and/or that the device is able to capture a new portion of the object (e.g., has less than a threshold amount of overlap with a previous capture), etc. In some examples, in response to the additional captures of the vase and in accordance with a determination that the additional captures satisfy the one or more capture criteria (e.g., with respect to uncaptured portions of the vase), device 400 displays additional sets of voxels corresponding to the portions of the vase that were captured by the additional captures. For example, for each capture, device 400 determines whether the capture satisfies the capture criteria and if so, the capture is accepted.
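The triggers listed above could be combined in several ways; the sketch below assumes hypothetical state values and example thresholds, and combines an interval trigger with a stabilization-plus-new-portion trigger purely for illustration.

```swift
import Foundation

// Illustrative sketch: example inputs for deciding whether to attempt another capture.
struct CaptureTriggerState {
    var timeSinceLastCapture: TimeInterval
    var movementSinceLastSample: Float      // meters moved during the last sample window
    var timeStable: TimeInterval            // how long movement has stayed below a threshold
    var overlapWithPreviousCaptures: Double // fraction of the current view already covered
}

func shouldAttemptCapture(_ state: CaptureTriggerState) -> Bool {
    let intervalElapsed = state.timeSinceLastCapture >= 0.5            // e.g., every 0.5 seconds
    let stabilized = state.movementSinceLastSample < 0.01 && state.timeStable >= 0.5
    let newPortionVisible = state.overlapWithPreviousCaptures < 0.75   // mostly uncaptured view
    return intervalElapsed || (stabilized && newPortionVisible)
}
```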
For example, a user may move device 400 such that reticle 402 is positioned over a second portion of vase 430 (e.g., a portion that was not fully captured by the first capture). In response to determining that the user has moved device 400 such that reticle 402 is over the second portion of the vase (e.g., in response to determining that reticle 402 is over the second portion of the vase), device 400 performs a capture of the second portion of the vase. In some examples, if the second capture satisfies the one or more capture criteria, then the second capture is accepted and device 400 displays a second set of voxels on representation 430 of the vase corresponding to the second portion of the vase that was captured.
As described above, in some examples, device 400 performs captures of the object in response to determining that device 400 is positioned over an uncaptured portion of the object (e.g., a not fully captured portion of the object or a partially captured portion of the object). In some examples, device 400 performs continuous captures of the object (e.g., even if the user has not moved device 400) and accepts captures that satisfy the one or more capture criteria (e.g., position, angle, distance, etc.).
In some examples, when device 400 determines that the user is interested in scanning the vase (e.g., such as after the techniques discussed with reference to
In some examples, when device 400 determines that the user is interested in scanning the vase, representation 430 of the vase is displayed without modifying (e.g., darkening) representation 430 of the vase. In such examples, as device 400 performs successful captures of the vase, the portions of representation 430 corresponding to the captured portions of the vase are modified to have a different visual characteristic than the original unmodified representation of the vase (e.g., displayed darker, lighter, with a different color, etc.).
In some examples, shape 550 is not displayed in user interface 501 (e.g., exists only in software and is displayed in
In some examples, targets 552 (e.g., targets 552-1 to 552-5) are displayed in user interface 501 around representation 530 of the vase. In some examples, targets 552 are placed on the surface of shape 550 such that targets 552 are floating in three-dimensional space around representation 530 of the vase. In some examples, each of the targets is a discrete visual element placed at a discrete location around representation 530 of the vase (e.g., the elements are not contiguous and do not touch each other). In some examples, targets 552 are circular. In some examples, targets 552 can be any other shape (e.g., rectangular, square, triangular, oval, etc.). In some examples, targets 552 are angled to face representation 530 of the vase (e.g., each of the targets 552 is at a normal angle to the center of representation 530 of the vase). As shown in
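One way to place such targets, assuming for illustration that the bounding shape is a sphere around the object (the disclosure does not limit shape 550 to a sphere), is to sample evenly spaced rings on the sphere's surface and orient each target toward the center, as in the following sketch; the types and counts are hypothetical.

```swift
import Foundation
import simd

// Illustrative sketch: a capture target is a position on the bounding surface plus
// the inward-facing direction toward the object's center.
struct CaptureTarget {
    var position: SIMD3<Float>
    var facing: SIMD3<Float>   // unit vector pointing at the object's center
}

// Samples `rings` horizontal rings between the poles, `targetsPerRing` targets per ring,
// all on a sphere of the given radius around the object's center.
func placeTargets(center: SIMD3<Float>, radius: Float,
                  rings: Int = 3, targetsPerRing: Int = 8) -> [CaptureTarget] {
    var targets: [CaptureTarget] = []
    for ring in 1...rings {
        let polar = Float(ring) / Float(rings + 1) * .pi        // strictly between the poles
        for slot in 0..<targetsPerRing {
            let azimuth = Float(slot) / Float(targetsPerRing) * 2 * .pi
            let offset = SIMD3<Float>(sin(polar) * cos(azimuth),
                                      cos(polar),
                                      sin(polar) * sin(azimuth)) * radius
            let position = center + offset
            targets.append(CaptureTarget(position: position,
                                         facing: simd_normalize(center - position)))
        }
    }
    return targets
}
```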
Referring back to
In
In some examples, as shown in
In some examples, after the capture has successfully completed, target 552-1 ceases to be displayed in user interface 501, as shown in
Thus, as described above, in some examples, only captures that are taken when reticle 502 is aligned (or partially aligned) with a target are accepted and saved (e.g., optionally only if the capture satisfies the one or more capture criteria described above when reticle 502 is aligned with a target).
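Alignment between the reticle and a target could, for example, be judged by the angle between the camera's view direction and the direction to the target, as in the sketch below; the tolerance value and function signature are illustrative assumptions.

```swift
import Foundation
import simd

// Illustrative sketch: the reticle is considered aligned with a target when the camera's
// forward direction points at the target's position within a small angular tolerance.
func isReticleAligned(cameraPosition: SIMD3<Float>,
                      cameraForward: SIMD3<Float>,
                      targetPosition: SIMD3<Float>,
                      toleranceDegrees: Float = 10) -> Bool {
    let toTarget = simd_normalize(targetPosition - cameraPosition)
    let cosAngle = simd_dot(simd_normalize(cameraForward), toTarget)
    return cosAngle >= cos(toleranceDegrees * .pi / 180)
}
```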
In some examples, as shown in
In some examples, preview 560 is scaled such that the object being scanned fits entirely within preview 560. For example, as shown in
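Fitting the object entirely within the preview can be expressed as choosing the largest uniform scale that keeps the object's bounding-box extent within the preview's extent, as in the following sketch; the extents, the uniform-scaling choice, and the function name are assumptions for illustration.

```swift
// Illustrative sketch: the smallest per-axis ratio between the preview's extent and the
// object's bounding-box extent is the largest uniform scale at which the object still fits.
func previewScale(objectExtent: SIMD3<Float>, previewExtent: SIMD3<Float>) -> Float {
    // Guard against degenerate (zero-size) object extents before dividing.
    let safeExtent = SIMD3<Float>(max(objectExtent.x, 1e-6),
                                  max(objectExtent.y, 1e-6),
                                  max(objectExtent.z, 1e-6))
    let ratios = previewExtent / safeExtent   // element-wise division
    return min(ratios.x, ratios.y, ratios.z)
}
```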
In
In some examples, capture 564 has the same or similar visual characteristics as the portions of the vase that have been captured and/or the same or similar visual characteristics as the final three-dimensional model will have. For example, instead of displaying a set of voxels or displaying the vase as darker or lighter than the capture (e.g., such as in the main portion of user interface 501), capture 564 displays a rendering of the actual capture of the object, including the color(s), shape, size, texture, depth, and/or topography, etc. of the three-dimensional model of the vase to be generated. In some examples, as additional captures are taken and accepted, capture 564 is updated to include the new captures (e.g., expands to include the additional captures).
It is understood that, in some examples, preview 560 can be displayed in any user interface for capturing an object, such as user interface 301 and/or 401. In some examples, preview 560 is not displayed in the user interface before, during, or after capturing an object.
Returning to
For similar reasons, in some examples, when device 500 determines that the user is interested in scanning the vase, device 500 can determine, based on the initial capture of the vase, that certain portions of the object require additional captures (e.g., in addition to the regularly spaced targets that are displayed on the surface of a bounding volume). In some examples, in response to determining that additional captures are required, device 500 can place one or more additional targets on the surface of the bounding volume or inside or outside of the surface of the bounding volume. Thus, in this way, device 500 can determine, at the outset, that additional targets are required, and display them in the user interface at the appropriate positions and/or angles around the representation of the object. It is understood that, in this example, the device is also able to dynamically place additional targets as necessary while the user is performing captures of the object.
It is understood that the process described above can be repeated and/or performed multiple times, as necessary, to fully capture the object. For example, after performing a partial (e.g., capturing a subset of all the targets) or full capture of the object (e.g., capturing all of the targets), based on information captured, device 500 can determine (e.g., generate, identify, etc.) a new or additional bounding volume around the representation of the object and place new targets on the new or additional bounding volume. In this way, device 500 is able to indicate to the user that another pass is required to fully capture the details of the object.
In some examples, a user is able to prematurely end the capture process (e.g., before capturing all of the targets). In such an example, device 500 can discard the captures and terminate the process for generating the three-dimensional model. For example, if a threshold number of captures have not been captured (e.g., less than 50% captured, less than 75% captured, less than 90% captured, etc.), it may not be possible to generate a satisfactory three-dimensional model, and device 500 can terminate the process for generating the three-dimensional model. In some examples, device 500 can preserve the captures that have been captured so far and attempt to generate a three-dimensional model using the data captured so far. In such examples, the resulting three-dimensional model may have a lower resolution or may have a lower level of detail, than otherwise would be achieved by a full capture. In some examples, the resulting three-dimensional model may be missing certain surfaces that have not been captured.
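The decision described above, between discarding a partial scan and attempting a reduced-detail model, could be driven by a completeness fraction such as the one sketched below; the threshold, outcome names, and function signature are illustrative assumptions.

```swift
// Illustrative sketch: decide what to do when the user ends the capture early, based on
// the fraction of targets that were successfully captured.
enum PartialScanOutcome {
    case generateReducedDetailModel
    case discardCaptures
}

func outcomeForEarlyTermination(capturedTargets: Int,
                                totalTargets: Int,
                                minimumFraction: Double = 0.5) -> PartialScanOutcome {
    guard totalTargets > 0 else { return .discardCaptures }
    let completeness = Double(capturedTargets) / Double(totalTargets)
    return completeness >= minimumFraction ? .generateReducedDetailModel : .discardCaptures
}
```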
In
In some examples, device 600 changes a visual characteristic of the suggested target for capture to visually highlight and differentiate the suggested target from the other targets. In some examples, changing a visual characteristic includes changing one or more of color, shading, brightness, pattern, size and/or shape. For example, the suggested target can be displayed with a different color (e.g., the target can be filled with a particular color, or the border of the target can be changed to a particular color). In the example illustrated in
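Consistent with the "closest capture target to a reticle" criterion described later in this disclosure, the suggested target could be chosen as the nearest uncaptured target to the point the reticle indicates, as in this sketch; the function and parameter names are hypothetical.

```swift
import simd

// Illustrative sketch: among targets that have not yet been captured, return the index of
// the one closest to the point currently indicated by the reticle (nil if none remain).
func suggestedTargetIndex(targetPositions: [SIMD3<Float>],
                          captured: Set<Int>,
                          reticlePoint: SIMD3<Float>) -> Int? {
    targetPositions.indices
        .filter { !captured.contains($0) }
        .min(by: { simd_distance(targetPositions[$0], reticlePoint)
                 < simd_distance(targetPositions[$1], reticlePoint) })
}
```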
In
In some examples, as shown in
As shown in
In some examples, a user is able to physically change the orientation of the object being scanned (e.g., the vase) and device 600 is able to detect the change in orientation and adjust accordingly. For example, a user is able to turn the vase upside down such that the bottom of the vase is facing upwards (e.g., revealing a portion of the vase that was previously not capturable). In some examples, device 600 is able to determine that the orientation of the vase has changed and, in particular, that the bottom of the vase is now facing upwards. In some examples, in response to this determination, preview 660 is updated such that captures 664 are displayed upside down, thus providing the user a visualization of areas that have not been captured (e.g., namely the bottom of the vase). In some examples, because the main portion of user interface 601 is displaying a live view of the real-world environment, representation 630 is also displayed upside down. In some examples, the indications of capture progress (e.g., the voxels) are displayed in the appropriate position on representation 630 (e.g., are also displayed upside down). In another example, the user is able to turn the vase sideways, and preview 660 is updated such that capture 664 is sideways and representation 630 and its accompanying voxels are also displayed sideways. Thus, in some examples, a user is able to walk around an object and scan the object from different angles, and then turn the object to scan areas that were hidden, such as the bottom. Alternatively, the user can stay within a relatively small area and continue to physically rotate the object to scan portions of the object that were hidden (e.g., the back side/far side of the object). In some examples, the targets displayed around representation 630 also rotate, move, or otherwise adjust based on the determined change in orientation.
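One way to keep previously captured data registered to the object after it is physically reoriented is to apply the detected rotation about the object's center to each stored point (e.g., voxel centers, preview geometry, target positions), as sketched below; representing the detected change as a quaternion is an assumption for illustration.

```swift
import simd

// Illustrative sketch: rotate stored points about the object's center by the detected
// change in the object's orientation, so progress indications follow the object.
func reorient(points: [SIMD3<Float>],
              about center: SIMD3<Float>,
              by rotation: simd_quatf) -> [SIMD3<Float>] {
    points.map { center + rotation.act($0 - center) }
}

// e.g., turning the object upside down is roughly a half-turn about a horizontal axis:
// let flip = simd_quatf(angle: .pi, axis: SIMD3<Float>(1, 0, 0))
// let flippedVoxels = reorient(points: voxelCenters, about: objectCenter, by: flip)
```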
It is understood that although
In some examples, the process for scanning/capturing a real-world object to generate a three-dimensional model of the object is initiated in response to a request to insert a virtual object in an extended reality (XR) setting. For example, an electronic device (e.g., device 100, 200, 300, 400, 500, 600) can execute and/or display an XR setting creation application. While manipulating, generating, and/or modifying an XR setting (e.g., a CGR environment) in the XR setting creation application, a user may desire to insert an object for which a three-dimensional object model does not exist. In some examples, a user is able to request the insertion of said object and, in response to the request, the device initiates a process to scan/capture the appropriate real-world object and displays a user interface for scanning/capturing the real-world object (e.g., such as user interface 301, 401, 501, 601 described above). In some examples, after completing the process for scanning/capturing the real-world object, a placeholder model (e.g., temporary model) can be generated and inserted into the XR setting using the XR setting creation application. In some examples, the placeholder model is based on the general size and shape of the object captured during the capture process. In some examples, the placeholder model is the same or similar to the preview discussed above with respect to
In some examples, after the process for capturing the object is complete, the capture data is processed to generate the complete three-dimensional model. In some examples, processing the data includes transmitting the data to a server and the generation of the model is performed at the server. In some examples, when the three-dimensional object model of the object is completed (e.g., by the device or by the server), the XR setting creation application automatically replaces the placeholder object with the completed three-dimensional model of the object. In some examples, the completed three-dimensional model includes the visual details that were missing in the placeholder model, such as the color and/or textures. In some examples, the completed three-dimensional model is a higher resolution object than the placeholder object.
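The placeholder-replacement flow could be as simple as swapping the model reference on the scene entry that holds the placeholder once the completed model is available, as in the following sketch; the `SceneObject` type and identifier-based lookup are illustrative assumptions, not a described API.

```swift
// Illustrative sketch: a scene entry that can hold either a placeholder or a final model.
struct SceneObject {
    let id: Int
    var modelName: String
    var isPlaceholder: Bool
}

// When the completed three-dimensional model becomes available (e.g., returned by a
// server-side photogrammetry pass), replace the placeholder entry with the final model.
func replacePlaceholder(in scene: inout [SceneObject], id: Int, with finalModelName: String) {
    guard let index = scene.firstIndex(where: { $0.id == id && $0.isPlaceholder }) else { return }
    scene[index].modelName = finalModelName
    scene[index].isPlaceholder = false
}
```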
As described below, the method 700 provides ways of scanning a real-world object in accordance with some embodiments of the disclosure (e.g., as discussed above with respect to
In some examples, an electronic device in communication with a display (e.g., a display generation component, a display integrated with the electronic device (optionally a touch screen display), and/or an external display such as a monitor, projector, television, etc.) and one or more cameras (e.g., a mobile device (e.g., a tablet, a smartphone, a media player, or a wearable device), or a computer, optionally in communication with one or more of a visible light camera, a depth camera, a depth sensor, an infrared camera, and/or a capture device, etc.), while receiving, via the one or more cameras, one or more captures of a real world environment, including a first real world object, wherein the one or more captures includes a first set of captures (702): displays (704), using the display, a representation of the real world environment, including a representation of the first real world object, wherein a first portion of the representation of the first real world object is displayed with a first visual characteristic; and in response to receiving, via the one or more cameras, a first capture of the first set of captures of the first real world object that includes a first portion of the first real world object corresponding to the first portion of the representation of the first real world object (706), in accordance with a determination that the first capture satisfies one or more object capture criteria, updates the representation of the first real world object to indicate a scanning progress of the first real world object, including modifying (708), using the display, the first portion of the representation of the first real world object from having the first visual characteristic to having a second visual characteristic.
Additionally or alternatively, in some examples, the one or more cameras includes a visible light camera. Additionally or alternatively, in some examples, the one or more cameras includes a depth sensor. Additionally or alternatively, in some examples, modifying the first portion of the representation of the first real world object from having the first visual characteristic to having the second visual characteristic includes changing a shading of the first portion of the representation of the first real world object. Additionally or alternatively, in some examples, modifying the first portion of the representation of the first real world object from having the first visual characteristic to having the second visual characteristic includes changing a color of the first portion of the representation of the first real world object.
Additionally or alternatively, in some examples, the electronic device receives, via the one or more cameras, a second capture of the first set of captures of the first real world object that includes a second portion of the first real world object, different from the first portion. Additionally or alternatively, in some examples, in response to receiving the second capture and in accordance with a determination that the second capture satisfies the one or more object capture criteria, the electronic device modifies, using the display, a second portion of the representation of the first real world object corresponding to the second portion of the first real world object from having a third visual characteristic to having a fourth visual characteristic.
Additionally or alternatively, in some examples, the one or more object capture criteria include a requirement that a respective capture is within a first predetermined range of angles relative to a respective portion of the first real world object. Additionally or alternatively, in some examples, the one or more object capture criteria include a requirement that the capture is within a first predetermined range of distances. Additionally or alternatively, in some examples, the one or more object capture criteria include a requirement that the capture is held for a threshold amount of time. Additionally or alternatively, in some examples, the one or more object capture criteria include a requirement that the capture is not of a portion that has already been captured. Additionally or alternatively, in some examples, determining whether the one or more object capture criteria are satisfied can be performed using data that is captured by the one or more cameras (e.g., by analyzing the images and/or data to determine whether it satisfies the criteria and/or has an acceptable level of quality, detail, information, etc.).
Additionally or alternatively, in some examples, in response to receiving the first capture of the first portion of the first real world object and in accordance with a determination that the first capture does not satisfy the one or more object capture criteria, the electronic device forgoes modifying the first portion of the representation of the first real world object. Additionally or alternatively, in some examples, the electronic device discards the data corresponding to the first capture if the first capture does not satisfy the one or more object capture criteria.
Additionally or alternatively, in some examples, while receiving the one or more captures of the real world environment, the electronic device displays using the display, a preview of a model of the first real world object, including captured portions of the first real world object. Additionally or alternatively, in some examples, the preview of the model does not include uncaptured portions of the first real world object.
Additionally or alternatively, in some examples, while displaying the preview of the model of the first real world object, the electronic device detects a change in an orientation of the first real world object. Additionally or alternatively, in some examples, in response to detecting the change in the orientation of the first real world object, the electronic device updates the preview of the model of the first real world object based on the change in orientation of the first real world object, including revealing uncaptured portions of the first real world object and maintaining display of captured portions of the first real world object.
Additionally or alternatively, in some examples, the one or more captures includes a second set of captures, before the first set of captures. Additionally or alternatively, in some examples, the electronic device receives, via the one or more cameras, a first capture of the second set of captures of the real world environment, including the first real world object. Additionally or alternatively, in some examples, in response to receiving the first capture of the second set of captures, the electronic device identifies the first real world object in the real world environment, separate from other objects in the real world environment, and determines a shape and size of the first real world object.
Additionally or alternatively, in some examples, the first capture of the second set of captures is received via a capture device of a first type (e.g., a depth sensor). Additionally or alternatively, in some examples, the first capture of the first set of captures is received via a capture device of a second type, different from the first type (e.g., a visible light camera).
Additionally or alternatively, in some examples, while displaying a virtual object creation user interface (e.g., an XR setting creation user interface, a user interface for generating, designing, and/or creating a virtual or XR setting, a user interface for generating, designing and/or creating virtual objects and/or XR objects, etc.), the electronic device receives a first user input corresponding to a request to insert a first virtual object corresponding to the first real world object at a first location in a virtual environment (e.g., an XR environment), wherein a virtual model (e.g., an XR model) of the first real world object is not available on the electronic device. Additionally or alternatively, in some examples, in response to receiving the first user input, the electronic device initiates a process for generating the virtual model of the first real world object, including performing, using the one or more cameras, the one or more captures of the real world environment, including the first real world object, and displays a placeholder object at the first location in the virtual environment, wherein the placeholder object is based on an initial capture of the one or more captures of the first real world object. Additionally or alternatively, in some examples, the electronic device receives a second user input corresponding to a request to insert a second virtual object of a second real world object at a second location in the virtual environment, wherein a virtual model (e.g., an XR model) of the second real world object is available on the electronic device, and in response to receiving the second user input, the electronic device displays a representation of the virtual model of the second real world object at the second location in the virtual environment, without initiating a process for generating a virtual model of the second real world object.
Additionally or alternatively, in some examples, after initiating the process for generating the virtual model of the first real world object, the electronic device determines that generation of the virtual model of the first real world object has completed. Additionally or alternatively, in some examples, in response to determining that generation of the virtual model of the first real world object has been completed, the electronic device replaces the placeholder object with a representation of the virtual model of the first real world object.
Additionally or alternatively, before updating the representation of the first real world object to indicate the scanning progress of the first real world object, the representation of the first real world object is a photorealistic representation of the first real world object at the time of the first capture. For example, the device captures a photorealistic representation of the first real world object using the one or more cameras (e.g., a visible light camera) and displays the photorealistic representation in the representation of the real world environment (e.g., before scanning the first real world object). In some embodiments, modifying the first portion of the representation of the first real world object from having the first visual characteristic to having the second visual characteristic indicates the scanning progress of the first real world object (e.g., the second visual characteristic indicates that a portion of the first real world object corresponding to the first portion of the representation of the first real world object has been scanned, has been marked for scanning, or will be scanned). In some embodiments, the second visual characteristic is a virtual modification of the representation of the first real world object (e.g., an augmented reality modification) and not a result of a change in the visual characteristic of the first real world object that is captured by the one or more cameras (e.g., and is optionally reflected in the representation of the first real world object). In some embodiments, after modifying the first portion of the first real world object to have the second visual characteristic, the first portion of the first real world object is no longer a photorealistic representation of the first portion of the first real world object (e.g., due to having the second visual characteristic).
It should be understood that the particular order in which the operations in
The operations in the information processing methods described above are, optionally, implemented by running one or more functional modules in an information processing apparatus such as general-purpose processors (e.g., as described with respect to
As described below, the method 800 provides ways to display capture targets in accordance with some embodiments of the disclosure (e.g., as discussed above with respect to
In some examples, an electronic device in communication with a display (e.g., a display generation component, a display integrated with the electronic device (optionally a touch screen display), and/or an external display such as a monitor, projector, television, etc.) and one or more cameras (e.g., a mobile device (e.g., a tablet, a smartphone, a media player, or a wearable device), or a computer, optionally in communication with one or more of a visible light camera, a depth camera, a depth sensor, an infrared camera, and/or a capture device, etc.), while displaying, using the display, a representation of a real world environment, including a representation of a first real world object, receives (802) a request to capture the first real world object. In some examples, in response to receiving the request to capture the first real world object (804), the electronic device determines (804) a bounding volume around the representation of the first real world object, and displays (806), using the display, a plurality of capture targets on a surface of the bounding volume, wherein one or more visual characteristics of each of the capture targets indicates a device position for capturing a respective portion of the first real world object associated with the respective capture target.
Additionally or alternatively, in some examples, the request to capture the first real world object includes placing a reticle over the representation of the real world object (optionally for a threshold amount of time). Additionally or alternatively, in some examples, determining the bounding volume around the representation of the first real world object includes: identifying the first real world object in the real world environment, separate from other objects in the real world environment, and determining a physical characteristic (e.g., shape and/or size) of the first real world object.
Additionally or alternatively, in some examples, while displaying the plurality of capture targets on the surface of the bounding volume, the electronic device determines that a first camera of the one or more cameras is aligned with a first capture target of the one or more capture targets associated with the first portion of the first real world object. Additionally or alternatively, in some examples, in response to determining that the first camera is aligned with the first capture target, the electronic device performs, using the first camera, one or more captures of the first portion of the first real world object associated with the first capture target.
Additionally or alternatively, in some examples, in response to performing the one or more captures of the first portion of the first real world object, the electronic device modifies the first capture target to indicate a progress of the capture. Additionally or alternatively, in some examples, generating the bounding volume around the representation of the real world object includes receiving, via one or more input devices, a user input modifying a size of the bounding volume.
Additionally or alternatively, in some examples, while displaying the plurality of capture targets on the surface of the bounding volume, the electronic device suggests a first capture target of the plurality of capture targets, including modifying, via the display generation device, the first capture target to have a first visual characteristic. Additionally or alternatively, in some examples, while displaying the first capture target with the first visual characteristic, the electronic device determines that a first camera of the one or more cameras is aligned with the first capture target.
Additionally or alternatively, in some examples, in response to determining that the first camera is aligned with the first capture target and while the first camera is aligned with the first capture target, the electronic device modifies, via the display generation device, the first capture target to have a second visual characteristic, different from the first visual characteristic, and performs, using the first camera, one or more captures of the first portion of the first real world object associated with the first capture target. Additionally or alternatively, in some examples, after performing the one or more captures of the first portion of the first real world object, the electronic device modifies, via the display generation device, the first capture target to have a third visual characteristic, different from the first visual characteristic and the second visual characteristic.
Additionally or alternatively, in some examples, suggesting the first capture target of the plurality of capture targets includes determining that the first capture target is a closest capture target to a reticle displayed by the display generation device. Additionally or alternatively, in some examples, modifying the first capture target to have the first visual characteristic includes changing a color of a portion of the first capture target. Additionally or alternatively, in some examples, modifying the first capture target to have the second visual characteristic includes changing the color of the portion of the first capture target. Additionally or alternatively, in some examples, modifying the first capture target to have the third visual characteristic includes ceasing display of the first capture target.
It should be understood that the particular order in which the operations in
The operations in the information processing methods described above are, optionally, implemented by running one or more functional modules in an information processing apparatus such as general-purpose processors (e.g., as described with respect to
The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best use the invention and various described embodiments with various modifications as are suited to the particular use contemplated.
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/US2021/020062 | 2/26/2021 | WO |

Number | Date | Country
---|---|---
62984242 | Mar 2020 | US