This disclosure relates generally to eye tracking, and in particular to illumination for eye tracking.
Current 3D sensing technologies (such as structured light), especially for the near range, suffer from noisy, lower-quality depth maps. The noise and quality deficit of depth maps can be attributed to significant down-sampling of camera images.
Non-limiting and non-exhaustive embodiments of the invention are described with reference to the following figures, wherein like reference numerals refer to like parts throughout the various views unless otherwise specified.
Embodiments of a fringe projector for depth sensing are described herein. In the following description, numerous specific details are set forth to provide a thorough understanding of the embodiments. One skilled in the relevant art will recognize, however, that the techniques described herein can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring certain aspects.
Reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Throughout this specification, several terms of art are used. These terms are to take on their ordinary meaning in the art from which they come, unless specifically defined herein or the context of their use would clearly suggest otherwise.
In some embodiments of the disclosure, “near-eye” may be defined as including an optical element that is configured to be placed within 35 mm of an eye of a user while a near-eye optical device such as a head mounted device is being utilized.
Eye region or eyebox region may be defined as a region of interest that includes an area or a volume next to a lens assembly where an eye of a user is located or may be located while using a head mounted device. The eye region or eyebox region may have height dimensions of 8 mm to 20 mm, width dimensions of 10 mm to 20 mm, and depth dimensions of 5 mm to 25 mm, in some implementations.
Current 3D sensing technologies (such as structured light), especially for the near range, suffer from noisy, lower-quality depth maps. The noise and low quality are mainly due to significant down-sampling of camera images. Fringe-illumination-based (3D) depth sensing can provide denser and more accurate depth maps. To obtain more accurate 3D shapes using fringe-based depth sensing, embodiments of the disclosure include a compact, versatile, and efficient fringe projector that can be implemented in head mounted devices (e.g., smart glasses, augmented reality headsets, or virtual reality headsets). The fringe projectors may be used for eye tracking, face tracking, and/or hand tracking.
Implementations of this disclosure propose compact and inconspicuous fringe projectors directly formed on a transparent substrate that can be positioned in a field-of-view (FOV) of a user without obscuring or blocking a user's see-through view of a scene. Thus, implementations of the disclosure may be suitable for AR/VR applications. Photonic integrated circuits that may provide these features may include static phase shifting, a tunable laser for phase shifting, and/or wavelength multiplexing.
A head mounted device may include a depth sensing system to support eye tracking operations. The depth sensing system may include a fringe projector to generate a fringe pattern, a sensor (e.g., image sensor), and a controller.
The fringe projector may include a laser, a waveguide structure, and two outcoupling elements to generate a fringe pattern, for example, on an eye of a user. The laser may be positioned within a frame of the head mounted device and may be configured to emit a light beam into a transparent layer of a lens assembly. The light beam may be a Gaussian beam. The wavelength of the light beam may be chirped (e.g., ramped up and/or down over time) or adjusted to alter characteristics of the fringe pattern.
The waveguide structure may include two channels. Each of the channels may be optically coupled to one of two outcoupling elements. The channels propagate light from the laser to the outcoupling elements. In one implementation, the waveguide structure has two independent channels that couple the light source to the outcoupling elements. In another implementation, the waveguide structure has a shared channel that is split into two independent channels, which enables separation of the outcoupling elements. One of the channels may be elongated to optically delay the phase of light passing through the elongated channel. Delaying the phase of light in one channel and not the other may be combined with sweeping the wavelength of the laser to shift (e.g., move left, right, up, or down) the interference lines of the fringe pattern.
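The combined effect of a fixed channel delay and a wavelength sweep can be illustrated with a small numerical sketch (the emitter spacing, working distance, and delay length below are illustrative assumptions, not values taken from this disclosure):

```python
import numpy as np

def fringe_intensity(x, wavelength, d=25e-6, z=20e-3, delta_l=0.0):
    # Two coherent point emitters separated by d, observed on a
    # plane at distance z; delta_l is an extra optical path (the
    # elongated channel) inserted before one emitter.
    k = 2 * np.pi / wavelength
    r1 = np.sqrt((x - d / 2) ** 2 + z ** 2)
    r2 = np.sqrt((x + d / 2) ** 2 + z ** 2)
    return 2 + 2 * np.cos(k * (r2 - r1 + delta_l))

# With no delay, the central fringe sits at x = 0 for any wavelength;
# with a delay in one channel, sweeping the wavelength moves the
# fringes across the observation plane.
center_a = fringe_intensity(0.0, 850e-9, delta_l=200e-6)
center_b = fringe_intensity(0.0, 851e-9, delta_l=200e-6)
```

Without the delay, a wavelength sweep only rescales the fringe period; the asymmetric path length is what converts the sweep into a sideways shift of the interference lines.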
The outcoupling elements may be placed directly in front of the eye of a user to enable fringe projection from in front of the eye. The outcoupling elements may be fabricated in a transparent layer of the lens assembly and may be unnoticeably small (e.g., 3 μm×3 μm). The outcoupling elements may be mirrors, grating couplers, or some other optical output coupler. The outcoupling elements are configured to redirect light beams from the waveguide structure to an eye region.
A sensor may be coupled to the frame and may be oriented towards the eye region to capture reflections of a fringe pattern off of an eye. The sensor may provide image data that is representative of the reflections to a controller. The controller may determine or generate a depth map of the eye region based on the image data, according to an embodiment.
The depth sensing system may have a variety of configurations. For example, one lens assembly (or optical element) may include two or more waveguide structures that direct light to two or more pairs of outcoupling elements. Each waveguide structure may be coupled to its own laser. The lasers may be operated concurrently or sequentially. The lasers may be configured to operate at two different wavelengths (e.g., 850 nm and 940 nm) to provide fringe patterns of different wavelengths to improve depth sensing. The outcoupling elements of multiple fringe projectors may be interleaved and inline with each other, may be perpendicular to each other, or may be offset from one another at a variety of angles. Multiple pairs of outcoupling elements may be positioned close to each other in a lens assembly so as to generate fringe patterns with approximately the same center point. One or more sensors may be positioned around the frame of the head mounted device to image the fringe patterns. The one or more sensors may be configured to image different wavelengths, may be configured to image one or two specific wavelengths (e.g., 850 nm and 940 nm), and/or may be configured to image one or more fringe patterns from different locations on the frame.
The apparatus, system, and method for depth sensing that are described in this disclosure may enable improvements in eye tracking technologies, for example, to support operations of a head mounted device. These and other embodiments are described in more detail in connection with
The illustrated example of head mounted device 100 is shown as including a frame 102, temple arms 104A and 104B, and optical elements 106A and 106B.
As shown in
In further examples, some or all of optical elements 106A and 106B may be incorporated into a virtual reality headset where the transparent nature of optical elements 106A and 106B allows the user to view an electronic display (e.g., a liquid crystal display (LCD), an organic light emitting diode (OLED) display, a micro-LED display, etc.) incorporated in the virtual reality headset. In this context, display layer 120 may be replaced by the electronic display.
Illumination layer 110 may be configured to project fringe pattern 116 from in-field (within a field-of-view) and from directly in front of eye region 115, in accordance with aspects of the disclosure. Illumination layer 110 includes a transparent layer that may be formed of optical polymers, glasses, silicon dioxide (SiO2), transparent wafers (such as high-purity semi-insulating SiC wafers) or any other transparent materials used for this purpose. Illumination layer 110 includes a waveguide structure 108 and outcoupling elements 111A and 111B. Waveguide structure 108 and outcoupling elements 111A and 111B are configured to receive light from light source 114 and are configured to emit light beams 113A and 113B to generate fringe pattern 116.
Light beams 113A and 113B may be near-infrared light beams, in some aspects. Light source 114 generates non-visible light (e.g., a laser beam) for waveguide structure 108 and may include one or more of a vertical cavity surface emitting laser (VCSEL), on-chip integrated laser, hybrid integrated laser, or an edge emission laser. Light source 114 may be a VCSEL configured to emit at a wavelength in the near infrared range, such as 750-1550 nm. Light source 114 may be a VCSEL that has a single transverse mode. Light source 114 may be a VCSEL with a single junction or with multiple junctions. Light source 114 may be an edge emission laser that is implemented as a FP (Fabry-Perot) or a DFB (distributed feedback) laser. Light source 114 may be a light source that is coupled to waveguide 108 with a micro-optical bench. Light source 114 may be a VCSEL with emitters facing toward waveguide structure 108, and light from light source 114 may be incoupled into waveguide structure 108 through a grating coupler or some other input coupler. Light source 114 may be an edge emission laser that is coupled with the waveguides through butt coupling. Light source 114 may be enclosed in frame 102 to be out of the field-of-view (FOV) of a user.
A waveguide structure 108 is configured to receive non-visible light from light source 114, which is coupled to frame 102. In one implementation, waveguide structure 108 may include two independent channels from light source 114 to outcoupling elements 111A and 111B. In one implementation, waveguide structure 108 may include a waveguide splitter that distributes the light from a common channel (e.g., segment 127 shown in
Waveguide structure 108 is configured to deliver the non-visible light from light source 114 to outcoupling elements 111A and 111B. Only one waveguide structure 108 and one pair of outcoupling elements 111A and 111B are illustrated in
As shown in
Head mounted device 100 may include a sensor 117 that is configured to capture reflections of fringe pattern 116 from eye region 115, according to an embodiment. Sensor 117 may be oriented towards eye region 115 and may be configured to capture reflections directly from an eye (e.g., eye 126 shown in
In some implementations, optical element 106A may have a curvature for focusing light (e.g., display light 123) to the eye of the user. The curvature may be included in the transparent layer of illumination layer 110. Thus, optical element 106A may be referred to as a lens. In some aspects, optical element 106A may have a thickness and/or curvature that corresponds to the specifications of a user. In other words, optical element 106A may be considered a prescription lens.
Head mounted device 100 includes a controller 130 that is configured to provide control signals and receive image data 133, in accordance with aspects of the disclosure. Controller 130 may be coupled to light source 114 through a communication channel 131. Controller 130 may provide control signals/instructions to light source 114 that synchronize the operation of light source 114 with the operation of sensor 117. Controller 130 may provide control signals/instructions to light source 114 to define or modify the emission wavelength of light source 114. Controller 130 may be coupled to sensor 117 through a communication channel 132. Controller 130 may receive image data 133 of fringe pattern 116 from sensor 117. Controller 130 may be configured to generate a depth map of eye 126 based on image data 133. Controller 130 may include one or more processors, a field-programmable gate array (FPGA) chip, and one or more memories. Controller 130 may be coupled to or embedded in frame 102 or in one or more arms 104A and 104B.
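One way a controller can turn captured fringe images into a depth-related quantity is the standard four-step phase-shifting recipe (the disclosure does not fix the number of phase steps; four is an assumption here):

```python
import numpy as np

def wrapped_phase(frames):
    # frames: four images of the fringe pattern captured with
    # 0, 90, 180, and 270 degree fringe shifts.  The recovered
    # wrapped phase encodes surface shape once it is unwrapped
    # and triangulated against the projector/sensor geometry.
    i0, i90, i180, i270 = frames
    return np.arctan2(i270 - i90, i0 - i180)
```

The wrapped phase must still be unwrapped and converted to depth using the calibration of the particular device; those steps are device-specific and are not sketched here.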
Fringe projector 119 may be used to perform depth sensing of objects other than an eye. In some implementations, fringe projector 119 may be used to project fringe pattern 116 on an eye, a hand, or a face for depth sensing and/or imaging purposes.
As shown in
Fringe projector 404 includes a light source 408, a waveguide structure 410, and outcoupling elements 412A and 412B, according to an embodiment. Light source 408 may be a tunable laser with an emission wavelength that may be varied or chirped. A portion of the channel of waveguide structure 410 may be curved to at least partially interleave outcoupling elements 412A and 412B of fringe projector 404 with the outcoupling elements of fringe projector 406.
Fringe projector 406 includes a light source 414, a waveguide structure 416, and outcoupling elements 418A and 418B, according to an embodiment. Light source 414 may be a tunable laser with an emission wavelength that may be varied or chirped, which may cause the fringe pattern generated by fringe projector 406 to shift on an eye. A first segment (or channel) of waveguide structure 416 may be longer than a second segment (or channel) to insert an optical delay to shift the phase of the light beam output from outcoupling element 418A, with respect to the phase of the light beam output from outcoupling element 418B. Changing the frequency or wavelength of light source 414 may cause the output light beams to constructively and destructively interfere at different locations—causing the dark and light interference patterns of the generated fringe pattern to shift (e.g., left and right).
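The wavelength (or optical frequency) sweep needed to walk the fringes through one full period follows directly from the inserted delay: the inter-channel phase is k·ΔL, so Δφ = 2π·ΔL·Δν/c and a 2π shift requires Δν = c/ΔL. A sketch with an assumed delay length:

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def sweep_for_one_fringe_period(delta_l):
    # Optical-frequency sweep (Hz) that changes the inter-channel
    # phase k * delta_l by exactly 2*pi, shifting the fringe
    # pattern by one full period.
    return C / delta_l

dv = sweep_for_one_fringe_period(1e-3)  # assumed 1 mm path delay
```

A 1 mm delay calls for roughly a 300 GHz sweep (about 0.7 nm at 850 nm); a longer delay proportionally relaxes the required tuning range of the laser.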
Controller 130 may be coupled to light source 408, light source 414, and sensor 117. Controller 130 may be communicatively coupled to light source 408 through a communication channel 420 to send control signals to turn light source 408 on and off and to set or adjust the operating wavelength or frequency of light source 408. Controller 130 may be communicatively coupled to light source 414 through a communication channel 422 to send control signals to turn light source 414 on and off and to set or adjust the operating wavelength or frequency of light source 414.
Fringe projector 504 includes a light source 508, a waveguide structure 510, and outcoupling elements 512A and 512B, according to an embodiment. Light source 508 may be a laser configured to emit light at a first wavelength (e.g., 850 nm) to cause the outcoupling elements 512A and 512B to generate a first fringe pattern at the first wavelength.
Fringe projector 506 includes a light source 514, a waveguide structure 516, and outcoupling elements 518A and 518B, according to an embodiment. Light source 514 may be a laser configured to emit light at a second wavelength (e.g., 940 nm) to cause the outcoupling elements 518A and 518B to generate a second fringe pattern at the second wavelength. The spacing of the outcoupling elements 518A and 518B may be different than the spacing of outcoupling elements 512A and 512B. For example, outcoupling elements 512A and 512B may be spaced 25 μm apart and outcoupling elements 518A and 518B may be spaced 40 μm apart. In one implementation, the spacing between the different outcoupling elements is the same.
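The different spacings matter because, to first order, the fringe period on the eye scales as λz/d for emitter separation d and working distance z (a far-field two-slit estimate; the 20 mm working distance below is an assumed value, not one given in this disclosure):

```python
def fringe_period(wavelength, d, z):
    # Approximate fringe period produced on a plane at distance z
    # by two coherent emitters separated by d.
    return wavelength * z / d

p_850 = fringe_period(850e-9, 25e-6, 20e-3)  # ~0.68 mm
p_940 = fringe_period(940e-9, 40e-6, 20e-3)  # ~0.47 mm
```

Two patterns with distinct periods can help disambiguate phase wraps when the controller fuses the two wavelength channels into one depth map.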
Fringe projectors 504 and 506 may have a variety of configurations. For example, the fringe projectors are illustrated as being perpendicular to each other. However, the angle between the fringe projectors may be greater than or less than 90°. Additionally, more than two fringe projectors may be used to generate fringe patterns with more than two wavelengths. For example, optical element 502 may be coupled to four different fringe projectors and each of the four fringe projectors may be configured to generate a fringe pattern at a different wavelength (e.g., 850 nm, 890 nm, 940 nm, 990 nm). Additionally, the features of one or more of any other fringe projector disclosed herein may be combined with fringe projector 504 or 506 to generate fringe patterns at different wavelengths, chirped fringe patterns, and/or fringe patterns generated by selectively using an optical delay in one channel.
Controller 130 may be coupled to light source 508, light source 514, and sensor 524. Controller 130 may be communicatively coupled to light source 508 through a communication channel 520 to send control signals to turn light source 508 on and off and to set or adjust the operating wavelength or frequency of light source 508. Controller 130 may be communicatively coupled to light source 514 through a communication channel 522 to send control signals to turn light source 514 on and off and to set or adjust the operating wavelength or frequency of light source 514. Sensor 524 may include a color filter array (CFA) that passes a first wavelength of light for some of the pixels (e.g., half of the pixels), and that passes a second wavelength of light for others of the pixels. The CFA may have a checkerboard pattern of filters for the first and second wavelengths. Controller 130 may be coupled to sensor 524 through a communication channel 526 to receive image data 133. Controller 130 is configured to use image data 133 to determine a depth map (e.g., a 3D depth map) of the eye of a user, according to an embodiment. Controller 130 may operate light sources 508 and 514 at the same time (concurrently) or sequentially (one after the other, repeatedly). In one implementation, two sensors are used, where a first sensor images light of a first wavelength (e.g., 850 nm) and a second sensor images light of a second wavelength (e.g., 940 nm).
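Demultiplexing a raw checkerboard-CFA frame into per-wavelength images can be sketched as follows (the exact pixel layout is an assumption; the disclosure only specifies a checkerboard pattern):

```python
import numpy as np

def split_checkerboard(raw):
    # Split a raw frame into two sparse images: sites where
    # (row + col) is even are assumed to carry the first
    # wavelength (e.g., 850 nm), the remaining sites the
    # second (e.g., 940 nm).
    h, w = raw.shape
    yy, xx = np.mgrid[0:h, 0:w]
    first = (yy + xx) % 2 == 0
    img_a = np.where(first, raw, 0.0)
    img_b = np.where(first, 0.0, raw)
    return img_a, img_b
```

In practice the empty sites would be interpolated (demosaiced) before phase recovery, analogous to color demosaicing in conventional image pipelines.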
In an implementation of the disclosure, a fringe projector disposed on a transparent substrate utilizes photonic integrated circuits for AR/VR applications such as eye-tracking and/or face tracking with depth sensing. Multiple light sources may be used. Multiple light sources may be integrated on a same substrate with photonic waveguides. In some implementations, a metallic mirror is used as the outcoupling elements and the outcoupling elements are fabricated on the same substrate.
In an implementation of the disclosure, a device includes an optical path that induces a delay for phase shifting to support depth sensing.
In an implementation of the disclosure, a device includes a light source whose wavelength can be tuned to manipulate phase shifting.
Implementations of the disclosure include a wavelength- and time-multiplexed system configuration for accurate and fast 3D sensing. The fringe projectors may be realized with photonic integrated circuits and disposed in a FOV of a user (“in-field”). Because of the small size of the fringe projector and the close proximity to the user's eye, the fringe projector may not be noticeable to the user even though it is disposed within the user's FOV.
At process block 802, process 800 emits, with a laser, a light beam into a waveguide embedded in a lens assembly of a head mounted device, according to an embodiment. Process block 802 proceeds to process block 804, according to an embodiment.
At process block 804, process 800 splits, with the waveguide, the light beam into a first light beam and a second light beam, according to an embodiment. Process block 804 proceeds to process block 806, according to an embodiment.
At process block 806, process 800 redirects, with mirrors, the first light beam and the second light beam towards an eyebox region of the head mounted device to generate a fringe pattern on an eye of a user of the head mounted device, according to an embodiment. Process block 806 proceeds to process block 808, according to an embodiment.
At process block 808, process 800 captures one or more images of the fringe pattern, according to an embodiment. Process block 808 proceeds to process block 810, according to an embodiment.
At process block 810, process 800 generates a depth map based on the one or more images, according to an embodiment.
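Taken together, process blocks 802-810 amount to a capture loop followed by reconstruction, which can be sketched with hypothetical stand-ins for the laser, sensor, and controller (none of these object interfaces are defined by the disclosure):

```python
def depth_sensing_cycle(laser, sensor, controller, n_steps=4):
    # One illustrative cycle: shift the fringe pattern n_steps
    # times (e.g., via a wavelength chirp working against a
    # channel delay), image each pattern, then reconstruct a
    # depth map from the captured frames.
    frames = []
    for step in range(n_steps):
        laser.set_phase_step(step)       # reposition the fringes
        frames.append(sensor.capture())  # image pattern on the eye
    return controller.depth_map(frames)
```

The number of phase steps and the division of work between hardware and the controller are design choices; four steps matches the common phase-shifting recipe.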
Embodiments of the invention may include or be implemented in conjunction with an artificial reality system. Artificial reality is a form of reality that has been adjusted in some manner before presentation to a user, which may include, e.g., a virtual reality (VR), an augmented reality (AR), a mixed reality (MR), a hybrid reality, or some combination and/or derivatives thereof. Artificial reality content may include completely generated content or generated content combined with captured (e.g., real-world) content. The artificial reality content may include video, audio, haptic feedback, or some combination thereof, and any of which may be presented in a single channel or in multiple channels (such as stereo video that produces a three-dimensional effect to the viewer). Additionally, in some embodiments, artificial reality may also be associated with applications, products, accessories, services, or some combination thereof, that are used to, e.g., create content in an artificial reality and/or are otherwise used in (e.g., perform activities in) an artificial reality. The artificial reality system that provides the artificial reality content may be implemented on various platforms, including a head-mounted display (HMD) connected to a host computer system, a standalone HMD, a mobile device or computing system, or any other hardware platform capable of providing artificial reality content to one or more viewers.
The term “processing logic” (e.g. controller 130) in this disclosure may include one or more processors, microprocessors, multi-core processors, application-specific integrated circuits (ASICs), and/or field-programmable gate arrays (FPGAs) to execute operations disclosed herein. In some embodiments, memories (not illustrated) are integrated into the processing logic to store instructions to execute operations and/or store data. Processing logic may also include analog or digital circuitry to perform the operations in accordance with embodiments of the disclosure.
A “memory” or “memories” described in this disclosure may include one or more volatile or non-volatile memory architectures. The “memory” or “memories” may be removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Example memory technologies may include RAM, ROM, EEPROM, flash memory, CD-ROM, digital versatile disks (DVD), high-definition multimedia/data storage disks, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information for access by a computing device.
A network may include any network or network system such as, but not limited to, the following: a peer-to-peer network; a Local Area Network (LAN); a Wide Area Network (WAN); a public network, such as the Internet; a private network; a cellular network; a wireless network; a wired network; a wireless and wired combination network; and a satellite network.
Communication channels (e.g., 131 and 132) may include or be routed through one or more wired or wireless communication channels utilizing IEEE 802.11 protocols, short-range wireless protocols, SPI (Serial Peripheral Interface), I2C (Inter-Integrated Circuit), USB (Universal Serial Bus), CAN (Controller Area Network), cellular data protocols (e.g. 3G, 4G, LTE, 5G), optical communication networks, Internet Service Providers (ISPs), a peer-to-peer network, a Local Area Network (LAN), a Wide Area Network (WAN), a public network (e.g. “the Internet”), a private network, a satellite network, or otherwise.
A computing device may include a desktop computer, a laptop computer, a tablet, a phablet, a smartphone, a feature phone, a server computer, or otherwise. A server computer may be located remotely in a data center or be stored locally.
The processes explained above are described in terms of computer software and hardware. The techniques described may constitute machine-executable instructions embodied within a tangible or non-transitory machine (e.g., computer) readable storage medium, that when executed by a machine will cause the machine to perform the operations described. Additionally, the processes may be embodied within hardware, such as an application specific integrated circuit (“ASIC”) or otherwise.
A tangible non-transitory machine-readable storage medium includes any mechanism that provides (i.e., stores) information in a form accessible by a machine (e.g., a computer, network device, personal digital assistant, manufacturing tool, any device with a set of one or more processors, etc.). For example, a machine-readable storage medium includes recordable/non-recordable media (e.g., read only memory (ROM), random access memory (RAM), magnetic disk storage media, optical storage media, flash memory devices, etc.).
The above description of illustrated embodiments of the invention, including what is described in the Abstract, is not intended to be exhaustive or to limit the invention to the precise forms disclosed. While specific embodiments of, and examples for, the invention are described herein for illustrative purposes, various modifications are possible within the scope of the invention, as those skilled in the relevant art will recognize.
These modifications can be made to the invention in light of the above detailed description. The terms used in the following claims should not be construed to limit the invention to the specific embodiments disclosed in the specification. Rather, the scope of the invention is to be determined entirely by the following claims, which are to be construed in accordance with established doctrines of claim interpretation.
This application claims priority to U.S. Provisional Application No. 63/319,772 filed Mar. 15, 2022, which is hereby incorporated by reference.