The present disclosure generally relates to visualizing inputs and, in particular, to systems, devices, and methods for visualizing multi-modal inputs.
Various scenarios may involve selecting a user interface (UI) element based on gaze direction and head motion (e.g., nodding). However, a user may not be aware that head motion controls the UI element.
So that the present disclosure can be understood by those of ordinary skill in the art, a more detailed description may be had by reference to aspects of some illustrative implementations, some of which are shown in the accompanying drawings.
In accordance with common practice the various features illustrated in the drawings may not be drawn to scale. Accordingly, the dimensions of the various features may be arbitrarily expanded or reduced for clarity. In addition, some of the drawings may not depict all of the components of a given system, method, or device. Finally, like reference numerals may be used to denote like features throughout the specification and figures.
Various implementations disclosed herein include devices, systems, and methods for visualizing multi-modal inputs. According to some implementations, the method is performed at a computing system including non-transitory memory and one or more processors, wherein the computing system is communicatively coupled to a display device and one or more input devices. The method includes: displaying, via the display device, a first user interface element within an extended reality (XR) environment; determining a gaze direction based on first input data from the one or more input devices; in response to determining that the gaze direction is directed to the first user interface element, displaying, via the display device, a focus indicator with a first appearance in association with the first user interface element; detecting, via the one or more input devices, a change in pose of at least one of a head pose or a body pose of a user of the computing system; and in response to detecting the change of pose, modifying the focus indicator by changing the focus indicator from the first appearance to a second appearance different from the first appearance.
Various implementations disclosed herein include devices, systems, and methods for visualizing multi-modal inputs. According to some implementations, the method is performed at a computing system including non-transitory memory and one or more processors, wherein the computing system is communicatively coupled to a display device and one or more input devices. The method includes: presenting, via the display device, a user interface (UI) element within a UI; and obtaining a gaze vector based on first input data from the one or more input devices, wherein the gaze vector is associated with a gaze direction of a user. In accordance with a determination that the gaze vector satisfies an attention criterion associated with the UI element, the method also includes: obtaining a head vector based on second input data from the one or more input devices, wherein the head vector is associated with a head pose of the user; and presenting, via the display device, a head position indicator at a first location within the UI. The method further includes: after presenting the head position indicator at the first location, detecting, via the one or more input devices, a change to one or more values of the head vector; updating presentation of the head position indicator from the first location to a second location within the UI based on the change to the one or more values of the head vector; and in accordance with a determination that the second location for the head position indicator coincides with a selectable region of the UI element, performing an operation associated with the UI element.
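By way of a non-limiting illustration, the following Python sketch approximates the gaze-gated head position indicator flow summarized above; the data layout, the 2-D projection of the head vector, and the 0.5-second attention criterion are assumptions chosen for the example only and are not part of the disclosed implementation.

```python
# Non-limiting sketch: gaze-gated head position indicator (hypothetical names).
from dataclasses import dataclass


@dataclass
class UIElement:
    selectable_region: tuple  # (x_min, y_min, x_max, y_max) in UI coordinates

    def perform_operation(self):
        print("operation associated with the UI element performed")


def inside(point, region):
    x, y = point
    x0, y0, x1, y1 = region
    return x0 <= x <= x1 and y0 <= y <= y1


def attention_criterion_met(gaze_point, element, dwell_s, dwell_threshold_s=0.5):
    # Assumed criterion: the gaze lands on the element and dwells long enough.
    return inside(gaze_point, element.selectable_region) and dwell_s >= dwell_threshold_s


def project_head_vector(head_vector, ui_plane_distance=1.0):
    """Project a 3-D head vector onto the UI plane to obtain a 2-D indicator location."""
    hx, hy, hz = head_vector
    scale = ui_plane_distance / max(abs(hz), 1e-6)
    return (hx * scale, hy * scale)


def handle_frame(gaze_point, dwell_s, head_vector, element, indicator_shown):
    if not indicator_shown and attention_criterion_met(gaze_point, element, dwell_s):
        indicator_shown = True  # present the head position indicator at a first location
    if indicator_shown:
        location = project_head_vector(head_vector)  # updated as head vector values change
        if inside(location, element.selectable_region):
            element.perform_operation()  # location coincides with the selectable region
    return indicator_shown
```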
In accordance with some implementations, an electronic device includes one or more displays, one or more processors, a non-transitory memory, and one or more programs; the one or more programs are stored in the non-transitory memory and configured to be executed by the one or more processors and the one or more programs include instructions for performing or causing performance of any of the methods described herein. In accordance with some implementations, a non-transitory computer readable storage medium has stored therein instructions, which, when executed by one or more processors of a device, cause the device to perform or cause performance of any of the methods described herein. In accordance with some implementations, a device includes: one or more displays, one or more processors, a non-transitory memory, and means for performing or causing performance of any of the methods described herein.
In accordance with some implementations, a computing system includes one or more processors, non-transitory memory, an interface for communicating with a display device and one or more input devices, and one or more programs; the one or more programs are stored in the non-transitory memory and configured to be executed by the one or more processors and the one or more programs include instructions for performing or causing performance of the operations of any of the methods described herein. In accordance with some implementations, a non-transitory computer readable storage medium has stored therein instructions which, when executed by one or more processors of a computing system with an interface for communicating with a display device and one or more input devices, cause the computing system to perform or cause performance of the operations of any of the methods described herein. In accordance with some implementations, a computing system includes one or more processors, non-transitory memory, an interface for communicating with a display device and one or more input devices, and means for performing or causing performance of the operations of any of the methods described herein.
Numerous details are described in order to provide a thorough understanding of the example implementations shown in the drawings. However, the drawings merely show some example aspects of the present disclosure and are therefore not to be considered limiting. Those of ordinary skill in the art will appreciate that other effective aspects and/or variants do not include all of the specific details described herein. Moreover, well-known systems, methods, components, devices, and circuits have not been described in exhaustive detail so as not to obscure more pertinent aspects of the example implementations described herein.
A person can interact with and/or sense a physical environment or physical world without the aid of an electronic device. A physical environment can include physical features, such as a physical object or surface. An example of a physical environment is a physical forest that includes physical plants and animals. A person can directly sense and/or interact with a physical environment through various means, such as hearing, sight, taste, touch, and smell. In contrast, a person can use an electronic device to interact with and/or sense an extended reality (XR) environment that is wholly or partially simulated. The XR environment can include mixed reality (MR) content, augmented reality (AR) content, virtual reality (VR) content, and/or the like. With an XR system, some of a person's physical motions, or representations thereof, can be tracked and, in response, characteristics of virtual objects simulated in the XR environment can be adjusted in a manner that complies with at least one law of physics. For instance, the XR system can detect the movement of a user's head and adjust graphical content and auditory content presented to the user similarly to how such views and sounds would change in a physical environment. In another example, the XR system can detect movement of an electronic device that presents the XR environment (e.g., a mobile phone, tablet, laptop, or the like) and adjust graphical content and auditory content presented to the user similarly to how such views and sounds would change in a physical environment. In some situations, the XR system can adjust characteristic(s) of graphical content in response to other inputs, such as a representation of a physical motion (e.g., a vocal command).
Many different types of electronic systems can enable a user to interact with and/or sense an XR environment. A non-exclusive list of examples includes heads-up displays (HUDs), head mountable systems, projection-based systems, windows or vehicle windshields having integrated display capability, displays formed as lenses to be placed on users' eyes (e.g., contact lenses), headphones/earphones, input systems with or without haptic feedback (e.g., wearable or handheld controllers), speaker arrays, smartphones, tablets, and desktop/laptop computers. A head mountable system can have one or more speaker(s) and an opaque display. Other head mountable systems can be configured to accept an opaque external display (e.g., a smartphone). The head mountable system can include one or more image sensors to capture images/video of the physical environment and/or one or more microphones to capture audio of the physical environment. A head mountable system may have a transparent or translucent display, rather than an opaque display. The transparent or translucent display can have a medium through which light is directed to a user's eyes. The display may utilize various display technologies, such as uLEDs, OLEDs, LEDs, liquid crystal on silicon, laser scanning light source, digital light projection, or combinations thereof. An optical waveguide, an optical reflector, a hologram medium, an optical combiner, combinations thereof, or other similar technologies can be used for the medium. In some implementations, the transparent or translucent display can be selectively controlled to become opaque. Projection-based systems can utilize retinal projection technology that projects images onto users' retinas. Projection systems can also project virtual objects into the physical environment (e.g., as a hologram or onto a physical surface).
In some implementations, the controller 110 is configured to manage and coordinate an XR experience (sometimes also referred to herein as an “XR environment” or a “virtual environment” or a “graphical environment”) for a user 150 and optionally other users. In some implementations, the controller 110 includes a suitable combination of software, firmware, and/or hardware. The controller 110 is described in greater detail below with respect to
In some implementations, the electronic device 120 is configured to present audio and/or video (A/V) content to the user 150. In some implementations, the electronic device 120 is configured to present a user interface (UI) and/or an XR environment 128 to the user 150. In some implementations, the electronic device 120 includes a suitable combination of software, firmware, and/or hardware. The electronic device 120 is described in greater detail below with respect to
According to some implementations, the electronic device 120 presents an XR experience to the user 150 while the user 150 is physically present within a physical environment 105 that includes a table 107 within the field-of-view (FOV) 111 of the electronic device 120. As such, in some implementations, the user 150 holds the electronic device 120 in their hand(s). In some implementations, while presenting the XR experience, the electronic device 120 is configured to present XR content (sometimes also referred to herein as “graphical content” or “virtual content”), including an XR cylinder 109, and to enable video pass-through of the physical environment 105 (e.g., including the table 107) on a display 122. For example, the XR environment 128, including the XR cylinder 109, is volumetric or three-dimensional (3D).
In one example, the XR cylinder 109 corresponds to display-locked content such that the XR cylinder 109 remains displayed at the same location on the display 122 as the FOV 111 changes due to translational and/or rotational movement of the electronic device 120. As another example, the XR cylinder 109 corresponds to world-locked content such that the XR cylinder 109 remains displayed at its origin location as the FOV 111 changes due to translational and/or rotational movement of the electronic device 120. As such, in this example, if the FOV 111 does not include the origin location, the XR environment 128 will not include the XR cylinder 109. For example, the electronic device 120 corresponds to a near-eye system, mobile phone, tablet, laptop, wearable computing device, or the like.
In some implementations, the display 122 corresponds to an additive display that enables optical see-through of the physical environment 105 including the table 107. For example, the display 122 corresponds to a transparent lens, and the electronic device 120 corresponds to a pair of glasses worn by the user 150. As such, in some implementations, the electronic device 120 presents a user interface by projecting the XR content (e.g., the XR cylinder 109) onto the additive display, which is, in turn, overlaid on the physical environment 105 from the perspective of the user 150. In some implementations, the electronic device 120 presents the user interface by displaying the XR content (e.g., the XR cylinder 109) on the additive display, which is, in turn, overlaid on the physical environment 105 from the perspective of the user 150.
In some implementations, the user 150 wears the electronic device 120 such as a near-eye system. As such, the electronic device 120 includes one or more displays provided to display the XR content (e.g., a single display or one for each eye). For example, the electronic device 120 encloses the FOV of the user 150. In such implementations, the electronic device 120 presents the XR environment 128 by displaying data corresponding to the XR environment 128 on the one or more displays or by projecting data corresponding to the XR environment 128 onto the retinas of the user 150.
In some implementations, the electronic device 120 includes an integrated display (e.g., a built-in display) that displays the XR environment 128. In some implementations, the electronic device 120 includes a head-mountable enclosure. In various implementations, the head-mountable enclosure includes an attachment region to which another device with a display can be attached. For example, in some implementations, the electronic device 120 can be attached to the head-mountable enclosure. In various implementations, the head-mountable enclosure is shaped to form a receptacle for receiving another device that includes a display (e.g., the electronic device 120). For example, in some implementations, the electronic device 120 slides/snaps into or otherwise attaches to the head-mountable enclosure. In some implementations, the display of the device attached to the head-mountable enclosure presents (e.g., displays) the XR environment 128. In some implementations, the electronic device 120 is replaced with an XR chamber, enclosure, or room configured to present XR content in which the user 150 does not wear the electronic device 120.
In some implementations, the controller 110 and/or the electronic device 120 cause an XR representation of the user 150 to move within the XR environment 128 based on movement information (e.g., body pose data, eye tracking data, hand/limb/finger/extremity tracking data, etc.) from the electronic device 120 and/or optional remote input devices within the physical environment 105. In some implementations, the optional remote input devices correspond to fixed or movable sensory equipment within the physical environment 105 (e.g., image sensors, depth sensors, infrared (IR) sensors, event cameras, microphones, etc.). In some implementations, each of the remote input devices is configured to collect/capture input data and provide the input data to the controller 110 and/or the electronic device 120 while the user 150 is physically within the physical environment 105. In some implementations, the remote input devices include microphones, and the input data includes audio data associated with the user 150 (e.g., speech samples). In some implementations, the remote input devices include image sensors (e.g., cameras), and the input data includes images of the user 150. In some implementations, the input data characterizes body poses of the user 150 at different times. In some implementations, the input data characterizes head poses of the user 150 at different times. In some implementations, the input data characterizes hand tracking information associated with the hands of the user 150 at different times. In some implementations, the input data characterizes the velocity and/or acceleration of body parts of the user 150 such as their hands. In some implementations, the input data indicates joint positions and/or joint orientations of the user 150. In some implementations, the remote input devices include feedback devices such as speakers, lights, or the like.
In some implementations, the one or more communication buses 204 include circuitry that interconnects and controls communications between system components. In some implementations, the one or more I/O devices 206 include at least one of a keyboard, a mouse, a touchpad, a touchscreen, a joystick, one or more microphones, one or more speakers, one or more image sensors, one or more displays, and/or the like.
The memory 220 includes high-speed random-access memory, such as dynamic random-access memory (DRAM), static random-access memory (SRAM), double-data-rate random-access memory (DDR RAM), or other random-access solid-state memory devices. In some implementations, the memory 220 includes non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. The memory 220 optionally includes one or more storage devices remotely located from the one or more processing units 202. The memory 220 comprises a non-transitory computer readable storage medium. In some implementations, the memory 220 or the non-transitory computer readable storage medium of the memory 220 stores the following programs, modules and data structures, or a subset thereof described below with respect to
The operating system 230 includes procedures for handling various basic system services and for performing hardware dependent tasks.
In some implementations, a data obtainer 242 is configured to obtain data (e.g., captured image frames of the physical environment 105, presentation data, input data, user interaction data, camera pose tracking information, eye tracking information, head/body pose tracking information, hand/limb/finger/extremity tracking information, sensor data, location data, etc.) from at least one of the I/O devices 206 of the controller 110, the I/O devices and sensors 306 of the electronic device 120, and the optional remote input devices. To that end, in various implementations, the data obtainer 242 includes instructions and/or logic therefor, and heuristics and metadata therefor.
In some implementations, a mapper and locator engine 244 is configured to map the physical environment 105 and to track the position/location of at least the electronic device 120 or the user 150 with respect to the physical environment 105. To that end, in various implementations, the mapper and locator engine 244 includes instructions and/or logic therefor, and heuristics and metadata therefor.
In some implementations, a data transmitter 246 is configured to transmit data (e.g., presentation data such as rendered image frames associated with the XR environment, location data, etc.) to at least the electronic device 120 and optionally one or more other devices. To that end, in various implementations, the data transmitter 246 includes instructions and/or logic therefor, and heuristics and metadata therefor.
In some implementations, a privacy architecture 408 is configured to ingest input data and filter user information and/or identifying information within the input data based on one or more privacy filters. The privacy architecture 408 is described in more detail below with reference to
In some implementations, an eye tracking engine 412 is configured to obtain (e.g., receive, retrieve, or determine/generate) an eye tracking vector 413 (sometimes also referred to herein as a “gaze vector” or a “gaze direction”) as shown in
For example, the eye tracking vector 413 corresponds to or includes a UI element (or an identifier associated therewith) that has been selected, identified or targeted by the eye tracking engine 412 based on the gaze direction. As such, in some implementations, the eye tracking vector 413 indicates the target or focus of the eye tracking engine 412 such as a specific UI element, XR content portion, or the like. The eye tracking engine 412 is described in more detail below with reference to
In some implementations, a body/head pose tracking engine 414 is configured to determine a pose characterization vector 415 based on the input data and update the pose characterization vector 415 over time. For example, as shown in
In some implementations, a content selector 422 is configured to select XR content (sometimes also referred to herein as “graphical content” or “virtual content”) from a content library 425 based on one or more user requests and/or inputs (e.g., a voice command, a selection from a user interface (UI) menu of XR content items, and/or the like). The content selector 422 is described in more detail below with reference to
In some implementations, the content library 425 includes a plurality of content items such as audio/visual (A/V) content and/or XR content, objects, items, scenery, etc. As one example, the XR content includes 3D reconstructions of user captured videos, movies, TV episodes, and/or other XR content. In some implementations, the content library 425 is pre-populated or manually authored by the user 150. In some implementations, the content library 425 is located locally relative to the controller 110. In some implementations, the content library 425 is located remotely from the controller 110 (e.g., at a remote server, a cloud server, or the like).
In some implementations, a content manager 430 is configured to manage and update the layout, setup, structure, and/or the like for the XR environment 128 including one or more of XR content, one or more user interface (UI) elements associated with the XR content, and a focus indicator in association with one of the one or more UI elements. The content manager 430 is described in more detail below with reference to
In some implementations, a focus visualizer 432 is configured to generate a focus indicator in association with a respective UI element when the gaze direction (e.g., the eye tracking vector 413 in
In some implementations, the focus visualizer 432 is configured to generate a head position indicator based on a head vector associated with the pose characterization vector 415 (e.g., a ray emanating from a predefined portion of the head of the user such as their chin, nose, center of forehead, centroid of face, center point between eyes, etc.) when the gaze direction (e.g., the eye tracking vector 413 in
In some implementations, a pose displacement determiner 434 is configured to detect a change in pose of at least one of a head pose or a body pose of the user 150 and determine an associated displacement value or difference between pose characterization vectors 415 over time. In some implementations, the pose displacement determiner 434 is configured to determine that the displacement value satisfies a threshold displacement metric and, in response, cause an operation associated with the respective UI element to be performed. To that end, in various implementations, the pose displacement determiner 434 includes instructions and/or logic therefor, and heuristics and metadata therefor.
In some implementations, in response to the change in pose of at least one of a head pose or a body pose of the user 150, the content updater 436 is configured to modify an appearance of the focus indicator from a first appearance to a second appearance to indicate a magnitude of the change in the pose of at least one of the head pose or the body pose of the user 150. Various examples of changes to the appearance of the focus indicator are described below with reference to the sequences of instances in
In some implementations, in response to the change in pose of at least one of a head pose or a body pose of the user 150, the content updater 436 is configured to modify a location of the head position indicator from a first location to a second location. Various examples of changes to the head position indicator are described below with reference to
In some implementations, a feedback engine 438 is configured to generate sensory feedback (e.g., visual feedback such as text or lighting changes, audio feedback, haptic feedback, etc.) when the focus indicator is displayed, when the appearance of the focus indicator changes, when the focus indicator is removed, and/or the like. Various examples of sensory feedback are described below with reference to the sequences of instances in
In some implementations, a rendering engine 450 is configured to render an XR environment 128 (sometimes also referred to herein as a “graphical environment” or “virtual environment”) or image frame associated therewith as well as the XR content, one or more UI elements associated with the XR content, and/or a focus indicator in association with one of the one or more UI elements. To that end, in various implementations, the rendering engine 450 includes instructions and/or logic therefor, and heuristics and metadata therefor. In some implementations, the rendering engine 450 includes a pose determiner 452, a renderer 454, an optional image processing architecture 462, and an optional compositor 464. One of ordinary skill in the art will appreciate that the optional image processing architecture 462 and the optional compositor 464 may be present for video pass-through configurations but may be removed for fully VR or optical see-through configurations.
In some implementations, the pose determiner 452 is configured to determine a current camera pose of the electronic device 120 and/or the user 150 relative to the A/V content and/or XR content. The pose determiner 452 is described in more detail below with reference to
In some implementations, the renderer 454 is configured to render the A/V content and/or the XR content according to the current camera pose relative thereto. The renderer 454 is described in more detail below with reference to
In some implementations, the image processing architecture 462 is configured to obtain (e.g., receive, retrieve, or capture) an image stream including one or more images of the physical environment 105 from the current camera pose of the electronic device 120 and/or the user 150. In some implementations, the image processing architecture 462 is also configured to perform one or more image processing operations on the image stream such as warping, color correction, gamma correction, sharpening, noise reduction, white balance, and/or the like. The image processing architecture 462 is described in more detail below with reference to
In some implementations, the compositor 464 is configured to composite the rendered A/V content and/or XR content with the processed image stream of the physical environment 105 from the image processing architecture 462 to produce rendered image frames of the XR environment 128 for display. The compositor 464 is described in more detail below with reference to
Although the data obtainer 242, the mapper and locator engine 244, the data transmitter 246, the privacy architecture 408, the eye tracking engine 412, the body/head pose tracking engine 414, the content selector 422, the content manager 430, and the rendering engine 450 are shown as residing on a single device (e.g., the controller 110), it should be understood that in other implementations, any combination of the data obtainer 242, the mapper and locator engine 244, the data transmitter 246, the privacy architecture 408, the eye tracking engine 412, the body/head pose tracking engine 414, the content selector 422, the content manager 430, and the rendering engine 450 may be located in separate computing devices.
In some implementations, the functions and/or components of the controller 110 are combined with or provided by the electronic device 120 shown below in
In some implementations, the one or more communication buses 304 include circuitry that interconnects and controls communications between system components. In some implementations, the one or more I/O devices and sensors 306 include at least one of an inertial measurement unit (IMU), an accelerometer, a gyroscope, a magnetometer, a thermometer, one or more physiological sensors (e.g., blood pressure monitor, heart rate monitor, blood oximetry monitor, blood glucose monitor, etc.), one or more microphones, one or more speakers, a haptics engine, a heating and/or cooling unit, a skin shear engine, one or more depth sensors (e.g., structured light, time-of-flight, LiDAR, or the like), a localization and mapping engine, an eye tracking engine, a body/head pose tracking engine, a hand/limb/finger/extremity tracking engine, a camera pose tracking engine, or the like.
In some implementations, the one or more displays 312 are configured to present the XR environment to the user. In some implementations, the one or more displays 312 are also configured to present flat video content to the user (e.g., a 2-dimensional or “flat” AVI, FLV, WMV, MOV, MP4, or the like file associated with a TV episode or a movie, or live video pass-through of the physical environment 105). In some implementations, the one or more displays 312 correspond to touchscreen displays. In some implementations, the one or more displays 312 correspond to holographic, digital light processing (DLP), liquid-crystal display (LCD), liquid-crystal on silicon (LCoS), organic light-emitting field-effect transistor (OLET), organic light-emitting diode (OLED), surface-conduction electron-emitter display (SED), field-emission display (FED), quantum-dot light-emitting diode (QD-LED), micro-electro-mechanical system (MEMS), and/or the like display types. In some implementations, the one or more displays 312 correspond to diffractive, reflective, polarized, holographic, etc. waveguide displays. For example, the electronic device 120 includes a single display. In another example, the electronic device 120 includes a display for each eye of the user. In some implementations, the one or more displays 312 are capable of presenting AR and VR content. In some implementations, the one or more displays 312 are capable of presenting AR or VR content.
In some implementations, the image capture device 370 corresponds to one or more RGB cameras (e.g., with a complementary metal-oxide-semiconductor (CMOS) image sensor or a charge-coupled device (CCD) image sensor), IR image sensors, event-based cameras, and/or the like. In some implementations, the image capture device 370 includes a lens assembly, a photodiode, and a front-end architecture. In some implementations, the image capture device 370 includes exterior-facing and/or interior-facing image sensors.
The memory 320 includes high-speed random-access memory, such as DRAM, SRAM, DDR RAM, or other random-access solid-state memory devices. In some implementations, the memory 320 includes non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. The memory 320 optionally includes one or more storage devices remotely located from the one or more processing units 302. The memory 320 comprises a non-transitory computer readable storage medium. In some implementations, the memory 320 or the non-transitory computer readable storage medium of the memory 320 stores the following programs, modules and data structures, or a subset thereof including an optional operating system 330 and a presentation engine 340.
The operating system 330 includes procedures for handling various basic system services and for performing hardware dependent tasks. In some implementations, the presentation engine 340 is configured to present media items and/or XR content to the user via the one or more displays 312. To that end, in various implementations, the presentation engine 340 includes a data obtainer 342, a presenter 470, an interaction handler 520, and a data transmitter 350.
In some implementations, the data obtainer 342 is configured to obtain data (e.g., presentation data such as rendered image frames associated with the user interface or the XR environment, input data, user interaction data, head tracking information, camera pose tracking information, eye tracking information, hand/limb/finger/extremity tracking information, sensor data, location data, etc.) from at least one of the I/O devices and sensors 306 of the electronic device 120, the controller 110, and the remote input devices. To that end, in various implementations, the data obtainer 342 includes instructions and/or logic therefor, and heuristics and metadata therefor.
In some implementations, the interaction handler 420 is configured to detect user interactions with the presented A/V content and/or XR content (e.g., gestural inputs detected via hand tracking, eye gaze inputs detected via eye tracking, voice commands, etc.). To that end, in various implementations, the interaction handler 420 includes instructions and/or logic therefor, and heuristics and metadata therefor.
In some implementations, the presenter 470 is configured to present and update A/V content and/or XR content (e.g., the rendered image frames associated with the user interface or the XR environment 128 including the XR content, one or more UI elements associated with the XR content, and a focus indicator in association with one of the one or more UI elements) via the one or more displays 312. To that end, in various implementations, the presenter 470 includes instructions and/or logic therefor, and heuristics and metadata therefor.
In some implementations, the data transmitter 350 is configured to transmit data (e.g., presentation data, location data, user interaction data, head tracking information, camera pose tracking information, eye tracking information, hand/limb/finger/extremity tracking information, etc.) to at least the controller 110. To that end, in various implementations, the data transmitter 350 includes instructions and/or logic therefor, and heuristics and metadata therefor.
Although the data obtainer 342, the interaction handler 420, the presenter 470, and the data transmitter 350 are shown as residing on a single device (e.g., the electronic device 120), it should be understood that in other implementations, any combination of the data obtainer 342, the interaction handler 420, the presenter 470, and the data transmitter 350 may be located in separate computing devices.
Moreover,
As shown in
Similarly, as shown in
According to some implementations, the privacy architecture 408 ingests the local sensor data 403 and the remote sensor data 405. In some implementations, the privacy architecture 408 includes one or more privacy filters associated with user information and/or identifying information. In some implementations, the privacy architecture 408 includes an opt-in feature where the electronic device 120 informs the user 150 as to what user information and/or identifying information is being monitored and how the user information and/or the identifying information will be used. In some implementations, the privacy architecture 408 selectively prevents and/or limits content delivery architecture 400 or portions thereof from obtaining and/or transmitting the user information. To this end, the privacy architecture 408 receives user preferences and/or selections from the user 150 in response to prompting the user 150 for the same. In some implementations, the privacy architecture 408 prevents the content delivery architecture 400 from obtaining and/or transmitting the user information unless and until the privacy architecture 408 obtains informed consent from the user 150. In some implementations, the privacy architecture 408 anonymizes (e.g., scrambles, obscures, encrypts, and/or the like) certain types of user information. For example, the privacy architecture 408 receives user inputs designating which types of user information the privacy architecture 408 anonymizes. As another example, the privacy architecture 408 anonymizes certain types of user information likely to include sensitive and/or identifying information, independent of user designation (e.g., automatically).
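The following is a minimal, non-limiting sketch of one way such a privacy filter pass could be structured; the field names and the hash-based anonymization are assumptions for illustration and not the disclosed privacy architecture 408.

```python
# Non-limiting sketch: filtering and anonymizing user information before delivery.
import hashlib

SENSITIVE_FIELDS = {"face_image", "voice_sample", "user_id"}  # assumed field names


def apply_privacy_filters(input_data: dict, consented: bool) -> dict:
    """Block delivery without consent; anonymize designated sensitive fields otherwise."""
    if not consented:
        return {}  # prevent obtaining/transmitting user information until informed consent
    filtered = {}
    for key, value in input_data.items():
        if key in SENSITIVE_FIELDS:
            # Anonymize (here: one-way hash) rather than pass the raw value through.
            filtered[key] = hashlib.sha256(repr(value).encode()).hexdigest()
        else:
            filtered[key] = value
    return filtered
```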
According to some implementations, the eye tracking engine 412 obtains the local sensor data 403 and the remote sensor data 405 after having been subjected to the privacy architecture 408. In some implementations, the eye tracking engine 412 obtains (e.g., receives, retrieves, or determines/generates) an eye tracking vector 413 (sometimes also referred to herein as a “gaze vector” or a “gaze direction”) based on the input data and updates the eye tracking vector 413 over time. For example, the eye tracking vector 413 corresponds to or includes a UI element (or an identifier associated therewith) that has been selected, identified or targeted by the eye tracking engine 412 based on the gaze direction. As such, in some implementations, the eye tracking vector 413 indicates the target or focus of the eye tracking engine 412 such as a specific UI element, XR content portion, or the like.
For example, the gaze direction indicates a point (e.g., associated with x, y, and z coordinates relative to the physical environment 105 or the world-at-large), a physical object, or a region of interest (ROI) in the physical environment 105 at which the user 150 is currently looking. As another example, the gaze direction indicates a point (e.g., associated with x, y, and z coordinates relative to the XR environment 128), an XR object, or a region of interest (ROI) in the XR environment 128 at which the user 150 is currently looking.
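As a non-limiting illustration, the following sketch resolves which UI element a gaze direction is directed to by comparing the gaze ray against element centers within a small angular tolerance; the tolerance value and the element representation are assumptions for the example.

```python
# Non-limiting sketch: resolving the gaze target among candidate UI elements.
import math


def angle_between(v1, v2):
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(b * b for b in v2))
    if n1 == 0.0 or n2 == 0.0:
        return math.pi
    return math.acos(max(-1.0, min(1.0, dot / (n1 * n2))))


def targeted_element(gaze_origin, gaze_direction, element_centers, max_angle_rad=0.05):
    """Return the id of the element the gaze direction is directed to, or None."""
    best_id, best_angle = None, max_angle_rad
    for element_id, center in element_centers.items():
        to_center = tuple(c - o for c, o in zip(center, gaze_origin))
        angle = angle_between(gaze_direction, to_center)
        if angle < best_angle:
            best_id, best_angle = element_id, angle
    return best_id
```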
According to some implementations, the body/head pose tracking engine 414 obtains the local sensor data 403 and the remote sensor data 405 after it has been subjected to the privacy architecture 408. In some implementations, the body/head pose tracking engine 414 determines a pose characterization vector 415 based on the input data and updates the pose characterization vector 415 over time.
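As a non-limiting illustration, the pose characterization vector 415 could be represented by a record along the following lines; the specific fields are assumptions chosen for the example rather than the structure shown in the figures.

```python
# Non-limiting sketch: one possible record layout for a pose characterization vector.
from dataclasses import dataclass, field
import time


@dataclass
class PoseCharacterizationVector:
    timestamp: float = field(default_factory=time.monotonic)
    head_rotation: tuple = (0.0, 0.0, 0.0)      # pitch, yaw, roll in radians (assumed)
    head_translation: tuple = (0.0, 0.0, 0.0)   # meters, relative to a device origin (assumed)
    body_joint_positions: dict = field(default_factory=dict)  # joint name -> (x, y, z)


def updated_vector(previous: PoseCharacterizationVector, sample: dict) -> PoseCharacterizationVector:
    """Produce a new vector from the latest (hypothetical) sensor sample."""
    return PoseCharacterizationVector(
        head_rotation=sample.get("rotation", previous.head_rotation),
        head_translation=sample.get("translation", previous.head_translation),
        body_joint_positions=sample.get("joints", previous.body_joint_positions),
    )
```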
According to some implementations, the interaction handler 420 obtains (e.g., receives, retrieves, or detects) one or more user inputs 421 provided by the user 150 that are associated with selecting A/V content and/or XR content for presentation. For example, the one or more user inputs 421 correspond to a gestural input selecting XR content from a UI menu detected via hand tracking, an eye gaze input selecting XR content from the UI menu detected via eye tracking, a voice command selecting XR content from the UI menu detected via a microphone, and/or the like. In some implementations, the content selector 422 selects XR content 427 from the content library 425 based on one or more user inputs 421 (e.g., a voice command, a selection from a menu of XR content items, and/or the like).
In various implementations, the content manager 430 manages and updates the layout, setup, structure, and/or the like for the XR environment 128 including one or more of XR content, one or more user interface (UI) elements associated with the XR content, and a focus indicator in association with one of the one or more UI elements. To that end, the content manager 430 includes the focus visualizer 432, the pose displacement determiner 434, the content updater 436, and the feedback engine 438.
In some implementations, the focus visualizer 432 generates a focus indicator in association with a respective UI element when the eye tracking vector 413 is directed to the respective UI element for at least a threshold time period (e.g., a dwell threshold time). Various examples of the focus indicator are described below with reference to the sequences of instances in
In some implementations, the pose displacement determiner 434 detects a change in pose of at least one of a head pose or a body pose of the user 150 and determines an associated displacement value or difference between pose characterization vectors 415 over time. In some implementations, the pose displacement determiner 434 determines that the displacement value satisfies a threshold displacement metric and, in response, causes an operation associated with the respective UI element to be performed.
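As a non-limiting illustration, the displacement value and the threshold check could be computed as follows; the rotation-only metric and the 0.12-radian threshold are assumptions for the example.

```python
# Non-limiting sketch: displacement value between two head rotations and a threshold check.
import math


def displacement_value(rotation_a, rotation_b):
    """Magnitude of head rotation change between two pose samples, in radians."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(rotation_a, rotation_b)))


def satisfies_threshold(rotation_a, rotation_b, threshold_rad=0.12):
    """True when the change in pose satisfies the (assumed) threshold displacement metric."""
    return displacement_value(rotation_a, rotation_b) >= threshold_rad
```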
In some implementations, in response to the change in pose of at least one of a head pose or a body pose of the user 150, the content updater 436 modifies an appearance of the focus indicator from a first appearance to a second appearance to indicate a magnitude of the change in pose. Various examples of changes to the appearance of the focus indicator are described below with reference to the sequences of instances in
In some implementations, the feedback engine 438 generates sensory feedback (e.g., visual feedback such as text or lighting changes, audio feedback, haptic feedback, etc.) when the focus indicator is displayed, when the appearance of the focus indicator changes, when the focus indicator is removed, and/or the like.
According to some implementations, the pose determiner 452 determines a current camera pose of the electronic device 120 and/or the user 150 relative to the XR environment 128 and/or the physical environment 105. In some implementations, the renderer 454 renders the XR content 427, one or more UI elements associated with the XR content, and a focus indicator in association with one of the one or more UI elements according to the current camera pose relative thereto.
According to some implementations, the optional image processing architecture 462 obtains an image stream from an image capture device 370 including one or more images of the physical environment 105 from the current camera pose of the electronic device 120 and/or the user 150. In some implementations, the image processing architecture 462 also performs one or more image processing operations on the image stream such as warping, color correction, gamma correction, sharpening, noise reduction, white balance, and/or the like. In some implementations, the optional compositor 464 composites the rendered XR content with the processed image stream of the physical environment 105 from the image processing architecture 462 to produce rendered image frames of the XR environment 128. In various implementations, the presenter 470 presents the rendered image frames of the XR environment 128 to the user 150 via the one or more displays 312. One of ordinary skill in the art will appreciate that the optional image processing architecture 462 and the optional compositor 464 may not be applicable for fully virtual environments (or optical see-through scenarios).
As shown in
In other words, in some implementations, the electronic device 120 is configured to present XR content and to enable optical see-through or video pass-through of at least a portion of the physical environment 105 on the display 122 (e.g., the door 115). For example, the electronic device 120 corresponds to a mobile phone, tablet, laptop, near-eye system, wearable computing device, or the like.
As shown in
As shown in
In response to detecting that the gaze direction of the user 150 has been directed to the UI element 504A for at least a threshold amount of time (e.g., X seconds), the electronic device 120 presents a focus indicator 512A with a first appearance in association with the UI element 504A. As shown in
In response to detecting a change in a head pose of the user 150 while the gaze direction is still directed at the UI element 504A, the electronic device 120 modifies the focus indicator to indicate a magnitude of the change in the head pose of the user 150 by changing the focus indicator from the first appearance to a second appearance. As shown in
In response to detecting a further change in the head pose of the user 150 while the gaze direction is still directed at the UI element 504A, the electronic device 120 modifies the focus indicator to indicate the magnitude of the change in the head pose of the user 150 by changing the focus indicator from the second appearance to a third appearance. As shown in
In response to determining that the displacement value 524C exceeds the threshold displacement metric 526, the electronic device 120 activates the UI element 504A or, in other words, performs an operation associated with the UI element 504A. As shown in
As shown in
In response to detecting that the first gaze direction of the user 150 has been directed to the UI element 504A for at least a threshold amount of time (e.g., X seconds), the electronic device 120 presents a focus indicator 612A with a first appearance in association with the UI element 504A. As shown in
In response to detecting that the gaze direction of the user 150 is no longer directed to the UI element 504A, the electronic device 120 removes the focus indicator 612A from the XR environment 128. As shown in
In response to detecting that the second gaze direction of the user 150 has been directed to the UI element 504C for at least the threshold amount of time (e.g., X seconds), the electronic device 120 presents a focus indicator 642A with a first appearance in association with the UI element 504C. As shown in
As shown in
In response to detecting that the gaze direction of the user 150 has been directed to the UI element 504A for at least a threshold amount of time (e.g., X seconds), the electronic device 120 presents a focus indicator 712A with a first appearance in association with the UI element 504A. As shown in
In response to detecting a change in a head pose of the user 150 while the gaze direction is still directed at the UI element 504A, the electronic device 120 modifies the focus indicator to indicate a magnitude of the change in the head pose of the user 150 by changing the focus indicator from the first appearance to a second appearance. As shown in
In response to detecting a further change in the head pose of the user 150 while the gaze direction is still directed at the UI element 504A, the electronic device 120 modifies the focus indicator to indicate the magnitude of the change in the head pose of the user 150 by changing the focus indicator from the second appearance to a third appearance. As shown in
In response to determining that the displacement value 724C exceeds the threshold displacement metric 526, the electronic device 120 activates the UI element 504A or, in other words, performs an operation associated with the UI element 504A. As shown in
While
As represented by block 802, the method 800 includes displaying a user interface (UI) element. As represented by block 804, the method 800 includes determining whether a gaze direction (e.g., the eye tracking vector 413 in
As represented by block 806, the method 800 includes presenting a focus indicator in association with the UI element. As one example,
As represented by block 808, the method 800 includes determining whether the gaze direction is still directed to the UI element. If the gaze direction (e.g., the eye tracking vector 413 in
As represented by block 812, the method 800 includes determining whether a change in pose (e.g., the body and/or head pose of the user 150) is detected (based on the pose characterization vector(s) 415) while the gaze direction (e.g., the eye tracking vector 413 in
As represented by block 814, the method 800 includes modifying the focus indicator by changing its appearance, sound, haptics, or the like. As one example, in response to the change in the head pose of the user 150,
As represented by block 816, the method 800 includes determining whether a displacement value associated with the change in the pose satisfies a threshold displacement metric. If the change in the pose satisfies the threshold displacement metric (“Yes” branch from block 816), the method 800 continues to block 818. If the change in the pose does not satisfy the threshold displacement metric (“No” branch from block 816), the method 800 continues to block 806.
As represented by block 818, the method 800 includes performing an operation associated with the UI element. As one example,
As discussed above, various scenarios may involve selecting a user interface (UI) element by focusing on a UI element (e.g., based on the gaze direction) and performing a secondary action such as nodding. However, a user may not be aware that the nod input controls the UI element or that the nod input is successful. As such, in various implementations, an abstraction of the nod (e.g., a dynamic visual slide bar) is displayed in association with the UI element to indicate the progress and completion of the nod input.
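As a non-limiting illustration, the flow of blocks 802-818 above could be organized as follows; the sensor and rendering hooks are injected as hypothetical callables because the disclosure does not fix a concrete API for them, and the threshold value is an assumption.

```python
# Non-limiting sketch of blocks 802-818; sensor/rendering hooks are hypothetical callables.
def run_selection_loop(ui_element, get_gaze_target, get_pose, displacement_value,
                       render_focus_indicator, modify_focus_indicator,
                       remove_focus_indicator, perform_operation,
                       threshold_displacement=0.12, frames=1000):
    baseline_pose = None
    for _ in range(frames):
        if get_gaze_target() != ui_element:          # blocks 804 / 808 ("No" branch)
            remove_focus_indicator(ui_element)
            baseline_pose = None
            continue
        render_focus_indicator(ui_element)           # block 806
        pose = get_pose()
        if baseline_pose is None:
            baseline_pose = pose                     # establish a reference pose
            continue
        modify_focus_indicator(ui_element, pose)     # blocks 812-814
        if displacement_value(baseline_pose, pose) >= threshold_displacement:
            perform_operation(ui_element)            # blocks 816-818
            baseline_pose = None
```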
As represented by block 902, the method 900 includes displaying, via the display device, a first user interface element within an extended reality (XR) environment. In some implementations, the XR environment includes the first user interface element and at least one other user interface element. In some implementations, the XR environment includes XR content, and the first user interface element is associated with performing a first operation on the XR content. For example,
In some implementations, the first UI element is associated with XR content that is also overlaid on the physical environment. For example, the first UI element is operable to perform an operation on the XR content, manipulate the XR content, change/modify the XR content, and/or the like. In some implementations, the UI element is one of world-locked (e.g., anchored to a physical object in the physical environment 105), head-locked (e.g., anchored to a predefined position in the user's FOV), body-locked, and/or the like. As one example, if the UI element is head-locked, the UI element remains in the FOV 111 of the user 150 when he/she locomotes about the physical environment 105. As another example, if the UI element is world-locked, the UI element remains anchored to a physical object in the physical environment 105 when the user 150 locomotes about the physical environment 105.
For example, with reference to
In some implementations, the display device includes a transparent lens assembly, and wherein the XR content and the first user interface element are projected onto the transparent lens assembly. In some implementations, the display device includes a near-eye system, and wherein presenting the XR content and the first user interface element includes compositing the XR content and the first user interface element with one or more images of a physical environment captured by an exterior-facing image sensor. In some implementations, the XR environment corresponds to AR content overlaid on the physical environment. In one example, the XR environment is associated with an optical see-through configuration. In another example, the XR environment is associated with a video pass-through configuration. In some implementations, the XR environment corresponds to a VR environment with VR content.
In some implementations, the method 900 includes: displaying, via the display device, a gaze indicator within the XR environment associated with the gaze direction. For example,
As represented by block 904, the method 900 includes determining a gaze direction based on first input data from the one or more input devices. For example, the first input data corresponds to images from one or more eye tracking cameras. In some implementations, the computing system determines that the first UI element is the intended focus/ROI from among a plurality of UI elements based on the gaze direction. In some implementations, the computing system or a component thereof (e.g., the eye tracking engine 412 in
For example,
As represented by block 906, in response to determining that the gaze direction is directed to the first user interface element, the method 900 includes displaying, via the display device, a focus indicator with a first appearance in association with the first user interface element. In some implementations, the computing system also determines whether the gaze direction has been directed to the first user interface element for at least a predefined amount of time (e.g., X seconds). In some implementations, the computing system or a component thereof (e.g., the focus visualizer 432 in
As one example,
As represented by block 908, the method 900 includes detecting, via the one or more input devices, a change in pose of at least one of a head pose or a body pose of a user of the computing system. In some implementations, the computing system or a component thereof (e.g., the body/head pose tracking engine 414 in
As represented by block 910, in response to detecting the change of pose, the method 900 includes modifying the focus indicator by changing the focus indicator from the first appearance to a second appearance different from the first appearance. In some implementations, in response to the change in pose of at least one of a head pose or a body pose of the user 150, the computing system or a component thereof (e.g., the content updater 436 in
In some implementations, the first appearance corresponds to a first position within the XR environment and the second appearance corresponds to a second position within the XR environment different from the first position. For example, the computing system moves the first UI element relative to one axis such as up/down or left/right. As another example, the computing system moves the first UI element relative to two or more axes. As one example, in response to the change in the head pose of the user 150,
In some implementations, the first appearance corresponds to a first size for the focus indicator and the second appearance corresponds to a second size for the focus indicator different from the first size. For example, the computing system increases or decreases the size of the focus indicator. As another example, in response to the change in the head pose of the user 150,
In some implementations, modifying the focus indicator includes movement of the focus indicator based on the magnitude of the change in pose. In some implementations, a sensitivity value for the movement may be preset or adjusted by the user 150, which corresponds to the proportionality or mapping therebetween. As one example, 1 cm of head pose movement may correspond to 1 cm of focus indicator movement. As another example, 1 cm of head pose movement may correspond to 5 cm of focus indicator movement. As yet another example, 5 cm of head pose movement may correspond to 1 cm of focus indicator movement.
In some implementations, the movement of the focus indicator is proportional to the magnitude of the change in pose. For example, the computing system modifies the focus indicator based on one-to-one movement between head pose and focus indicator. In some implementations, the movement of the focus indicator is not proportional to the magnitude of the change in pose. For example, the movement between head pose and focus indicator is not one-to-one and corresponds to a function or mapping therebetween.
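As a non-limiting illustration, a proportional mapping with a sensitivity value, and one possible non-proportional mapping, could be expressed as follows; both functions are assumptions chosen for the example.

```python
# Non-limiting sketch: proportional and non-proportional head-to-indicator mappings.
def proportional_delta(head_delta_cm: float, sensitivity: float = 1.0) -> float:
    """sensitivity 1.0 -> 1 cm of head motion moves the indicator 1 cm;
    sensitivity 5.0 -> 1 cm moves it 5 cm; sensitivity 0.2 -> 5 cm moves it 1 cm."""
    return head_delta_cm * sensitivity


def nonlinear_delta(head_delta_cm: float, gain: float = 2.0, exponent: float = 1.5) -> float:
    """A non-proportional mapping: small motions are damped, larger motions amplified."""
    sign = 1.0 if head_delta_cm >= 0 else -1.0
    return sign * gain * (abs(head_delta_cm) ** exponent)
```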
In some implementations, the method 900 includes: prior to detecting the change in pose, determining a first pose characterization vector based on second input data from the one or more input devices, wherein the first pose characterization vector corresponds to one of an initial head pose or an initial body pose of the user of the computing system (e.g., an initial body/head pose); and, after detecting the change in pose, determining a second pose characterization vector based on the second input data from the one or more input devices, wherein the second pose characterization vector corresponds to one of a subsequent head pose or a subsequent body pose of the user of the computing system.
In some implementations, the method 900 includes: determining a displacement value between the first and second pose characterization vectors; and in accordance with a determination that the displacement value satisfies a threshold displacement metric, performing an operation associated with the first user interface element within the XR environment. For example, the operation is performed on associated XR content within the XR environment. In some implementations, the computing system or a component thereof (e.g., the pose displacement determiner 434 in
As one example,
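By way of illustration only, the following simplified sketch shows one way a displacement value between two pose characterization vectors might be computed and compared against a threshold displacement metric; the six-value vector layout, the threshold, and the function names are assumptions.

```python
# Illustrative sketch only: computes a displacement value between two pose
# characterization vectors and checks it against a threshold displacement
# metric. The vector layout (x, y, z, pitch, roll, yaw) is an assumption.
import math

def displacement(first_pose, second_pose):
    """Euclidean displacement over the positional components."""
    return math.dist(first_pose[:3], second_pose[:3])

def should_perform_operation(first_pose, second_pose, threshold_cm=2.0):
    return displacement(first_pose, second_pose) >= threshold_cm

initial_pose = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)        # head at rest
subsequent_pose = (0.0, -3.0, 0.0, 5.0, 0.0, 0.0)    # head nodded downward
print(should_perform_operation(initial_pose, subsequent_pose))  # True
```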
In some implementations, the method 900 includes: determining a change of the gaze direction based on first input data from the one or more input devices; and in response to determining that the gaze direction is not directed to the first user interface element due to the change of the gaze direction, ceasing display of the focus indicator in association with the first user interface element. In some implementations, the computing system or a component thereof (e.g., the pose displacement determiner 434 in
As shown in
In other words, in some implementations, the electronic device 120 is configured to present XR content and to enable optical see-through or video pass-through of at least a portion of the physical environment 105 on the display 122 (e.g., the door 115). For example, the electronic device 120 corresponds to a mobile phone, tablet, laptop, near-eye system, wearable computing device, or the like.
As shown in
As shown in
As shown in
As shown in
One of ordinary skill in the art will appreciate that the visualization 1008 may be removed in various implementations or replaced with other forms or configurations in various other implementations. As shown in
As shown in
According to some implementations, the second location for the head position indicator 1042 coincides with the activation region 1044 (e.g., the selectable region) of the affordance 1014 (e.g., the UI element) in accordance with a determination that at least a portion of the head position indicator 1042 breaches the activation region 1044 (e.g., the selectable region) of the affordance 1014. According to some implementations, the second location for the head position indicator 1042 coincides with the activation region 1044 of the affordance 1014 (e.g., the UI element) in accordance with a determination that the head position indicator 1042 is fully within the activation region 1044.
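By way of illustration only, the following simplified sketch contrasts the two determinations described above (partial breach versus fully within) for a circular head position indicator and a rectangular activation region; the 2D geometry and all names are assumptions.

```python
# Illustrative sketch only: two ways the "coincides" determination might be
# evaluated for a circular head position indicator and a rectangular
# activation region. The geometry and names are assumptions.
from dataclasses import dataclass

@dataclass
class Rect:
    x_min: float
    y_min: float
    x_max: float
    y_max: float

@dataclass
class Circle:
    cx: float
    cy: float
    r: float

def breaches(indicator: Circle, region: Rect) -> bool:
    """True if at least a portion of the indicator overlaps the region."""
    nearest_x = min(max(indicator.cx, region.x_min), region.x_max)
    nearest_y = min(max(indicator.cy, region.y_min), region.y_max)
    dx, dy = nearest_x - indicator.cx, nearest_y - indicator.cy
    return dx * dx + dy * dy <= indicator.r * indicator.r

def fully_within(indicator: Circle, region: Rect) -> bool:
    """True only if the indicator lies entirely inside the region."""
    return (region.x_min + indicator.r <= indicator.cx <= region.x_max - indicator.r
            and region.y_min + indicator.r <= indicator.cy <= region.y_max - indicator.r)

region = Rect(0, 0, 10, 4)
print(breaches(Circle(10.5, 2, 1), region))      # True: partial overlap
print(fully_within(Circle(10.5, 2, 1), region))  # False: not entirely inside
```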
As shown in
As shown in
As shown in
As shown in
As shown in
As shown in
As shown in
In some implementations, the electronic device 120 presents the selectable region 1076 before the dwell timer 1005 has been satisfied. In some implementations, the electronic device 120 presents the selectable region 1076 according to a determination that the dwell timer 1005 has been satisfied. In some implementations, the electronic device 120 presents the bounding box 10128 within the XR environment 128 before the dwell timer 1005 has been satisfied. In some implementations, the electronic device 120 presents the bounding box 10128 within the XR environment 128 according to a determination that the dwell timer 1005 has been satisfied.
As shown in
As shown in
As shown in
As shown in
As shown in
As shown in
Various scenarios involve selecting a user interface element based on gaze direction and/or the like. However, using gaze alone as an input modality, which is inherently jittery and inaccurate, may lead to false positives when interacting with a user interface (UI) and also with UI elements therein. As such, in various implementations, when a gaze direction satisfies a dwell timer, a head position indicator is provided which may directly track a current head vector or indirectly track the current head vector with some offset therebetween. Thereafter, the head position indicator may be used as a cursor to activate user interface elements and/or otherwise interact with an XR environment. As such, as described herein, a user may activate a UI element and/or otherwise interact with the UI using a head position indicator (e.g., a head position cursor or focus indicator) that surfaces in response to satisfying a gaze-based dwell timer associated with the UI element.
As represented by block 1102, the method 1100 includes presenting, via the display device, a user interface (UI) element within a UI. For example, the UI element includes one or more selectable regions such as a selectable affordance, an activation affordance, a radio button, a slider, a knob/dial, and/or the like. As one example,
In some implementations, the UI element is presented within an extended reality (XR) environment. As shown in
For example, the UI element is operable to perform an operation on the XR content, manipulate the XR content, animate the XR content, change/modify the XR content, and/or the like. In some implementations, the UI element is one of world-locked (e.g., anchored to a physical object in the physical environment 105), body-locked (e.g., anchored to a predefined portion of the user's body), and/or the like. As one example, if the UI element is world-locked, the UI element remains anchored to a physical object or a point within the physical environment 105 when the user 150 locomotes about the physical environment 105.
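By way of illustration only, the following simplified sketch contrasts world-locked and body-locked anchoring for a UI element, with transforms reduced to 2D positions for brevity; the mode names and offsets are assumptions.

```python
# Illustrative sketch only: a minimal way to distinguish world-locked and
# body-locked anchoring for a UI element. Transforms are reduced to 2D
# positions for brevity; names are hypothetical.

def ui_element_position(anchor_mode, world_anchor, user_position, body_offset):
    if anchor_mode == "world-locked":
        # Remains fixed to a point in the physical environment.
        return world_anchor
    if anchor_mode == "body-locked":
        # Follows the user at a fixed offset from their body.
        return (user_position[0] + body_offset[0],
                user_position[1] + body_offset[1])
    raise ValueError(f"unknown anchor mode: {anchor_mode}")

# The user locomotes from (0, 0) to (3, 0); only the body-locked element moves.
print(ui_element_position("world-locked", (5, 2), (3, 0), (0, 1)))  # (5, 2)
print(ui_element_position("body-locked", (5, 2), (3, 0), (0, 1)))   # (3, 1)
```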
As represented by block 1104, the method 1100 includes obtaining (e.g., receiving, retrieving, or generating/determining) a gaze vector based on first input data from the one or more input devices, wherein the gaze vector is associated with a gaze direction of a user. In some implementations, as represented by block 1104, the method 1100 includes updating a pre-existing gaze vector based on the first input data from the one or more input devices, wherein the gaze vector is associated with the gaze direction of the user. For example, with reference to
For example,
For example, the first input data corresponds to images from one or more image sensors or eye tracking cameras integrated with or separate from the computing system. In some implementations, the computing system includes an eye tracking engine that maintains the gaze vector (sometimes also referred to herein as an “eye tracking vector”) based on images that include the pupils of the user from one or more interior-facing image sensors. In some implementations, the gaze vector corresponds to an intersection of rays emanating from each of the eyes of the user or a ray emanating from a center point between the user's eyes.
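By way of illustration only, the following simplified sketch derives a single gaze vector from per-eye rays by locating the point where the two rays pass closest to one another (a closest-point-between-rays computation) and casting a ray toward it from the midpoint between the eyes; the NumPy-based geometry and all names are assumptions.

```python
# Illustrative sketch only: combines per-eye rays into a single gaze vector.
# All names, origins, and directions are assumptions.
import numpy as np

def closest_point_between_rays(o1, d1, o2, d2):
    """Midpoint of the shortest segment connecting two (possibly skew) rays."""
    d1, d2 = d1 / np.linalg.norm(d1), d2 / np.linalg.norm(d2)
    w0 = o1 - o2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w0, d2 @ w0
    denom = a * c - b * b
    if abs(denom) < 1e-9:          # nearly parallel rays: fix s = 0
        s, t = 0.0, e / c
    else:
        s = (b * e - c * d) / denom
        t = (a * e - b * d) / denom
    return 0.5 * ((o1 + s * d1) + (o2 + t * d2))

def gaze_vector(left_origin, left_dir, right_origin, right_dir):
    left_origin, right_origin = np.asarray(left_origin, float), np.asarray(right_origin, float)
    fixation = closest_point_between_rays(left_origin, np.asarray(left_dir, float),
                                          right_origin, np.asarray(right_dir, float))
    origin = 0.5 * (left_origin + right_origin)   # center point between the eyes
    direction = fixation - origin
    return origin, direction / np.linalg.norm(direction)

# Both eyes converge on a point roughly 1 m in front of the user.
o, v = gaze_vector([-0.03, 0, 0], [0.03, 0, 1], [0.03, 0, 0], [-0.03, 0, 1])
print(o, v)   # origin between the eyes, direction roughly (0, 0, 1)
```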
As represented by block 1106, the method 1100 includes determining whether the gaze vector satisfies an attention criterion associated with the UI element. In some implementations, the attention criterion is satisfied according to a determination that the gaze vector satisfies an accumulator threshold associated with the UI element. In some implementations, the attention criterion is satisfied according to a determination that the gaze vector is directed to the UI element for at least a threshold time period. As one example, the threshold time period corresponds to a predefined dwell timer. As another example, the threshold time period corresponds to a non-deterministic dwell timer that is dynamically determined based on user preferences, usage information, eye gaze confidence, and/or the like. For example,
If the gaze vector satisfies the attention criterion associated with the UI element (“Yes” branch from block 1106), the method 1100 continues to block 1108. If the gaze vector does not satisfy the attention criterion associated with the UI element (“No” branch from block 1106), the method 1100 continues to block 1104 and updates the gaze vector for a next frame, instance, iteration, time period, cycle, or the like. As such, in some implementations, in accordance with the determination that the gaze vector does not satisfy the attention criterion associated with the UI element, the method 1100 includes forgoing presenting the head position indicator at the first location.
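By way of illustration only, the following simplified sketch shows a per-frame accumulator that could implement a dwell-timer style attention criterion of the kind described above; the threshold value and the class name are assumptions.

```python
# Illustrative sketch only: a per-frame accumulator that decides whether a
# gaze vector has satisfied a dwell-timer style attention criterion for a UI
# element. The threshold and the hit test are assumptions.

class DwellTimer:
    def __init__(self, threshold_s=0.5):
        self.threshold_s = threshold_s
        self.accumulated_s = 0.0

    def update(self, gaze_on_element: bool, dt_s: float) -> bool:
        """Returns True once the gaze has rested on the element long enough;
        the accumulator resets whenever the gaze leaves the element."""
        self.accumulated_s = self.accumulated_s + dt_s if gaze_on_element else 0.0
        return self.accumulated_s >= self.threshold_s

timer = DwellTimer(threshold_s=0.5)
for frame in range(60):                      # simulated frames at ~90 Hz
    satisfied = timer.update(gaze_on_element=True, dt_s=1 / 90)
    if satisfied:
        break
print(frame, satisfied)                      # attention criterion met mid-loop
```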
As represented by block 1108, in accordance with the determination that the gaze vector satisfies the attention criterion associated with the UI element, the method 1100 includes obtaining (e.g., receiving, retrieving, or generating/determining) a head vector based on second input data from the one or more input devices, wherein the head vector is associated with a head pose of the user. In some implementations, as represented by block 1108, the method 1100 includes updating a pre-existing head vector based on the input data from the one or more input devices, wherein the head vector is associated with a head pose of the user. In some implementations, the method 1100 includes updating at least one of the gaze vector or the head vector in response to a change in the input data from the one or more input devices. For example, with reference to
For example, the second input data corresponds to IMU data, accelerometer data, gyroscope data, magnetometer data, image data, etc. from sensors integrated with or separate from the computing system. In some implementations, the head vector corresponds to a ray emanating from a predefined portion of the head of the user such as their chin, nose, center of forehead, centroid of face, center point between eyes, or the like. For example,
In some implementations, the computing system obtains the first and second input data from at least one overlapping sensor. In some implementations, the computing system obtains the first and second input data from different sensors. In some implementations, the first and second input data include overlapping data. In some implementations, the first and second input data include mutually exclusive data.
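By way of illustration only, the following simplified sketch constructs a head vector as a ray emanating from a predefined point on the head along the head's forward axis, given a head pose expressed as a position and a unit quaternion; the axis convention and all names are assumptions.

```python
# Illustrative sketch only: derives a head vector (origin + direction) from a
# head pose. The forward-axis convention and names are assumptions.
import numpy as np

def quat_rotate(q, v):
    """Rotate vector v by unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    u = np.array([x, y, z], float)
    v = np.asarray(v, float)
    return v + 2.0 * np.cross(u, np.cross(u, v) + w * v)

def head_vector(head_position, head_orientation, forward=(0.0, 0.0, -1.0)):
    origin = np.asarray(head_position, float)       # e.g., point between the eyes
    direction = quat_rotate(head_orientation, forward)
    return origin, direction / np.linalg.norm(direction)

# Identity orientation: the head vector points along the nominal forward axis.
print(head_vector([0.0, 1.6, 0.0], (1.0, 0.0, 0.0, 0.0)))
```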
As represented by block 1110, in accordance with the determination that the gaze vector satisfies the attention criterion associated with the UI element, the method 1100 includes presenting, via the display device, a head position indicator at a first location within the UI. For example, with reference to
As one example, with reference to
In some implementations, the head position indicator corresponds to XR content presented within the XR environment. In some implementations, the computing system presents the head position indicator at a default location relative to the UI element such as the center of the UI element, an edge of the UI element, or the like. In some implementations, the computing system presents the head position indicator at a location where the head vector intersects with the UI element or another portion of the UI. Thus, for example, the head position indicator may start outside of or exit a volumetric region associated with the UI element.
In some implementations, the computing system ceases display of the head position indicator according to a determination that a disengagement criterion has been satisfied. As one example, the disengagement criterion is satisfied when the gaze vector is no longer directed to the UI element (e.g., quick deselection, but may accidentally trigger with jittery gaze tracking). As another example, the disengagement criterion is satisfied when the gaze vector is no longer directed to the UI element for at least the threshold time period. As yet another example, the disengagement criterion is satisfied when the gaze vector no longer fulfills an accumulator threshold for the UI element.
According to some implementations, as represented by block 1110A, the first location for the head position indicator corresponds to a default location associated with the UI element. As one example, the default location corresponds to a center or centroid of the UI element. As another example, the default location corresponds to an edge of the UI element. As shown in
According to some implementations, as represented by block 1110B, the first location for the head position indicator corresponds to a point along the head vector. In some implementations, the head position indicator tracks the head vector. For example, while the head vector is directed to the UI element, the first location corresponds to an intersection between the head vector and the UI element. As shown in
According to some implementations, as represented by block 1110C, the first location for the head position indicator corresponds to a spatial offset relative to a point along the head vector. According to some implementations, as represented by block 1110D, the first location for the head position indicator corresponds to a point along the gaze vector. According to some implementations, as represented by block 1110E, the first location for the head position indicator corresponds to a spatial offset relative to a point along the gaze vector.
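By way of illustration only, the following simplified sketch selects a first location for the head position indicator according to the variants of blocks 1110A-1110E; the 2D coordinates and the mode names are assumptions.

```python
# Illustrative sketch only: chooses the first location for the head position
# indicator among the variants of blocks 1110A-1110E. Names are assumptions.

def first_location(mode, element_center, element_edge, head_hit, gaze_hit,
                   offset=(0.0, 0.0)):
    if mode == "default-center":          # block 1110A (center of UI element)
        return element_center
    if mode == "default-edge":            # block 1110A (edge of UI element)
        return element_edge
    if mode == "head-vector":             # block 1110B (point along head vector)
        return head_hit
    if mode == "head-vector-offset":      # block 1110C (offset from head vector)
        return (head_hit[0] + offset[0], head_hit[1] + offset[1])
    if mode == "gaze-vector":             # block 1110D (point along gaze vector)
        return gaze_hit
    if mode == "gaze-vector-offset":      # block 1110E (offset from gaze vector)
        return (gaze_hit[0] + offset[0], gaze_hit[1] + offset[1])
    raise ValueError(mode)
```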
In some implementations, in accordance with the determination that the gaze vector satisfies the attention criterion associated with the UI element, the method 1100 also includes presenting, via the display device, an activation region associated with the selectable region of the UI element. For example, the activation region corresponds to a collider/hit area associated with the UI element (or a portion thereof). As such, in some implementations, the computing system presents the activation region in accordance with the determination that the gaze vector lingers on the UI element (or a volumetric region associated therewith) for at least the threshold time period.
As one example, in
As represented by block 1112, after presenting the head position indicator at the first location, the method 1100 includes detecting, via the one or more input devices, a change to one or more values of the head vector. For example, the change to one or more values of the head vector corresponds to displacement in x, y, and/or z positional values and/or in pitch, roll, and/or yaw rotational values. As one example, the computing system detects a change to one or more values of the head vector between
As represented by block 1114, the method 1100 includes updating presentation of the head position indicator from the first location to a second location within the UI based on the change to the one or more values of the head vector. In some implementations, while the head vector intersects with the UI element, the head position indicator tracks the location of the head vector. In some implementations, the head position indicator is offset in one or more spatial dimensions relative to the head vector, and the head position indicator moves as the head vector changes while preserving the offset.
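By way of illustration only, the following simplified sketch updates the head position indicator by intersecting the head vector with a planar UI and re-applying a fixed spatial offset, so that the indicator tracks the head vector while preserving the offset; the plane representation and all names are assumptions.

```python
# Illustrative sketch only: ray-plane intersection plus a preserved offset for
# updating the head position indicator. Names are assumptions.
import numpy as np

def intersect_plane(ray_origin, ray_dir, plane_point, plane_normal):
    ray_origin, ray_dir = np.asarray(ray_origin, float), np.asarray(ray_dir, float)
    plane_point, plane_normal = np.asarray(plane_point, float), np.asarray(plane_normal, float)
    denom = ray_dir @ plane_normal
    if abs(denom) < 1e-9:
        return None                       # head vector parallel to the UI plane
    t = ((plane_point - ray_origin) @ plane_normal) / denom
    return ray_origin + t * ray_dir if t >= 0 else None

def updated_indicator_location(head_origin, head_dir, ui_point, ui_normal, offset):
    hit = intersect_plane(head_origin, head_dir, ui_point, ui_normal)
    return None if hit is None else hit + np.asarray(offset, float)

# Head vector aimed at a UI plane 1 m in front of the user, with a small offset.
print(updated_indicator_location([0, 1.6, 0], [0, 0, -1],
                                 [0, 0, -1], [0, 0, 1], [0.05, 0.0, 0.0]))
```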
As one example, with reference to
As represented by block 1116, the method 1100 includes determining whether the second location for the head position indicator coincides with the selectable region of the UI element. As one example, in
If the second location for the head position indicator coincides with the selectable region of the UI element (“Yes” branch from block 1116), the method 1100 continues to block 1118. If the second location for the head position indicator does not coincide with the selectable region of the UI element (“No” branch from block 1116), the method 1100 continues to block 1108 and updates the head vector for a next frame, instance, iteration, time period, cycle, or the like. As such, in some implementations, in accordance with a determination that the second location for the head position indicator does not coincide with the selectable region of the UI element, the method 1100 includes foregoing performance of the operation associated with the UI element.
As represented by block 1118, in accordance with a determination that the second location for the head position indicator coincides with the selectable region of the UI element, the method 1100 includes performing an operation associated with the UI element (or a portion thereof). As one example, the operation corresponds to one of toggling on/off a setting if the selectable region corresponds to a radio button, displaying XR content within the XR environment (e.g., the VA customization menu 1062 in
In some implementations, the operation associated with the UI element (or the portion thereof) is performed in accordance with the determination that the second location for the head position indicator coincides with the selectable region of the UI element and in accordance with a determination that the change to the one or more values of the head vector corresponds to a movement pattern. As one example, the movement pattern corresponds to a predefined pattern such as a substantially diagonal movement, a substantially z-like movement, a substantially v-like movement, a substantially upside-down v-like movement, or the like. As another example, the movement pattern corresponds to a non-deterministic movement pattern that is dynamically determined based on user preferences, usage information, head pose confidence, and/or the like.
In some implementations, the method 1100 includes: in accordance with a determination that a magnitude of the change to the one or more values of the head vector satisfies a displacement criterion, performing the operation associated with the UI element; and in accordance with a determination that the magnitude of the change to the one or more values of the head vector does not satisfy the displacement criterion, foregoing performance of the operation associated with the UI element. In some implementations, the displacement criterion corresponds to a predefined or non-deterministic amount of horizontal head movement. In some implementations, the displacement criterion corresponds to a predefined or non-deterministic amount of vertical head movement. In some implementations, the displacement criterion corresponds to a predefined or non-deterministic amount of diagonal (e.g., vertical and horizontal) head movement. In some implementations, the displacement criterion corresponds to a predefined pattern of head movement.
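By way of illustration only, the following simplified sketch evaluates horizontal, vertical, and diagonal displacement criteria against the magnitude of the change to the head vector; the thresholds and axis conventions are assumptions.

```python
# Illustrative sketch only: checks the magnitude of the change to the head
# vector against a displacement criterion before the operation associated
# with the UI element is performed. Thresholds are assumptions.

def satisfies_displacement_criterion(dx, dy, criterion="horizontal",
                                     threshold=0.02):
    """dx/dy are horizontal/vertical head displacements in meters."""
    if criterion == "horizontal":
        return abs(dx) >= threshold
    if criterion == "vertical":
        return abs(dy) >= threshold
    if criterion == "diagonal":            # both vertical and horizontal movement
        return abs(dx) >= threshold and abs(dy) >= threshold
    raise ValueError(criterion)

# Example: 3 cm of horizontal head movement satisfies a 2 cm criterion.
print(satisfies_displacement_criterion(0.03, 0.0, "horizontal"))  # True
print(satisfies_displacement_criterion(0.03, 0.0, "diagonal"))    # False
```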
While various aspects of implementations within the scope of the appended claims are described above, it should be apparent that the various features of implementations described above may be embodied in a wide variety of forms and that any specific structure and/or function described above is merely illustrative. Based on the present disclosure one skilled in the art should appreciate that an aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method may be practiced using any number of the aspects set forth herein. In addition, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to or other than one or more of the aspects set forth herein.
It will also be understood that, although the terms “first”, “second”, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first media item could be termed a second media item, and, similarly, a second media item could be termed a first media item, without changing the meaning of the description, so long as the occurrences of the “first media item” are renamed consistently and the occurrences of the “second media item” are renamed consistently. The first media item and the second media item are both media items, but they are not the same media item.
The terminology used herein is for the purpose of describing particular implementations only and is not intended to be limiting of the claims. As used in the description of the implementations and the appended claims, the singular forms “a”, “an”, and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
As used herein, the term “if” may be construed to mean “when” or “upon” or “in response to determining” or “in accordance with a determination” or “in response to detecting,” that a stated condition precedent is true, depending on the context. Similarly, the phrase “if it is determined [that a stated condition precedent is true]” or “if [a stated condition precedent is true]” or “when [a stated condition precedent is true]” may be construed to mean “upon determining” or “in response to determining” or “in accordance with a determination” or “upon detecting” or “in response to detecting” that the stated condition precedent is true, depending on the context.
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/US2022/011922 | 1/11/2022 | WO |
| Number | Date | Country |
|---|---|---|
| 63286188 | Dec 2021 | US |
| 63137204 | Jan 2021 | US |