Extended reality (XR) technologies, which include virtual reality (VR), augmented reality (AR), and mixed reality (MR) technologies, quite literally extend the reality that users experience. XR technologies may employ head-mountable displays (HMDs). An HMD is a display device that can be worn on the head. In VR technologies, the HMD wearer is immersed in an entirely virtual world, whereas in AR technologies, the HMD wearer's direct or indirect view of the physical, real-world environment is augmented. In MR, or hybrid reality, technologies, the HMD wearer experiences the merging of real and virtual worlds.
As noted in the background, a head-mountable display (HMD) can be employed as an extended reality (XR) technology to extend the reality experienced by the HMD's wearer. An HMD can include a small display panel in front of each eye of the wearer, as well as various sensors to detect or sense the wearer and/or the wearer's environment. Images on the display panels convincingly immerse the wearer within an XR environment, be it virtual reality (VR), augmented reality (AR), mixed reality (MR), or another type of XR.
An HMD can include one or multiple cameras, which are image-capturing devices that capture still or motion images. For example, one camera of an HMD may be employed to capture images of the wearer's lower face, including the mouth. Two other cameras of the HMD may each be employed to capture images of a respective eye of the HMD wearer and a portion of the wearer's face surrounding the eye.
Based on such captured images, the wearer's facial expressions, as well as correspondingly derived information like facial cues and emotions, can be assessed. Detecting facial features of the wearer of an HMD provides for even fuller immersion within an XR. For instance, the XR may suitably respond to the wearer's facial expressions, facial cues, and emotions, including changing any graphical representation of the wearer within the XR, such as an avatar, in correspondence with the changes in the wearer's actual facial expressions.
For accuracy, such HMD camera-based facial expression detection should be calibrated for the current wearer of the HMD before the wearer's facial expressions are actually detected based on images captured by the cameras. Calibration is performed based on an image of a neutral facial expression of the wearer of the HMD. That is, once a neutral facial expression image has been captured, subsequently captured images can be compared to the neutral facial expression image to identify the current facial expression of the wearer. The neutral facial expression image thus serves as a baseline image.
Capture of the neutral facial expression image is usually manually initiated with wearer involvement and interaction. After a user has mounted the HMD on his or her head, the user, who is now the wearer of the HMD, may be prompted to exhibit a neutral facial expression and provide an input once the wearer is exhibiting the neutral facial expression. Responsive to receiving the input, the HMD captures an image of the wearer's face, which is presumed to be an image of the wearer's neutral facial expression. HMD camera-based facial expression detection can then be calibrated based on the captured image.
However, such manual initiation of camera-based facial expression detection calibration can be cumbersome to the wearer. The wearer may ignore the prompting for capturing a neutral facial expression image, resulting in a suboptimal XR experience. The wearer may exhibit a facial expression that he or she thinks is a neutral facial expression but that is actually a forced simulation of his or her actual neutral facial expression, which can deleteriously affect subsequent camera-based facial expression detection. At a minimum, manual initiation of camera-based facial expression detection calibration can be inconvenient to the HMD wearer, who may be eager to experience XR with a minimum of setup.
Techniques described herein provide for automated capture of a baseline neutral facial expression image of the wearer of an HMD, on which basis HMD camera-based detection of facial expressions of the wearer can be calibrated. The techniques leverage facial electromyographic (fEMG) sensors of the HMD. The fEMG sensors output signals that measure facial muscle activity by detecting and amplifying small electrical impulses that muscle fibers generate when they contract. Upon detection of a neutral facial expression based on such fEMG sensor signals, a baseline neutral facial image of the wearer is automatedly captured.
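By way of illustration, the following non-limiting Python sketch shows one conventional way such raw fEMG signals can be conditioned into a muscle-activity envelope suitable for neutral-expression detection. The sampling rate, filter band, and function names are assumptions for the example, not particulars of the techniques described herein.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def emg_envelope(raw: np.ndarray, fs: float = 1000.0) -> np.ndarray:
    """Condition one raw fEMG channel into a muscle-activity envelope."""
    # Band-pass 20-450 Hz: the band in which surface-EMG energy typically lies.
    b, a = butter(4, [20.0, 450.0], btype="bandpass", fs=fs)
    band = filtfilt(b, a, raw)
    rectified = np.abs(band)  # full-wave rectification
    # Low-pass the rectified signal at 6 Hz to obtain a smooth envelope.
    b, a = butter(2, 6.0, btype="lowpass", fs=fs)
    return filtfilt(b, a, rectified)

# A quiet channel (little muscle activity) yields a near-zero envelope.
fs = 1000.0
quiet = 0.01 * np.random.randn(int(2 * fs))  # stand-in for a relaxed face
print(emg_envelope(quiet, fs).mean())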
The HMD camera-based detection of the facial expressions of the wearer can thus be calibrated without the wearer of the HMD having to explicitly initiate calibration. That is, the wearer's neutral facial expression is detected and the baseline neutral facial expression image responsively automatedly captured without explicitly prompting the wearer to first exhibit a neutral facial expression. The wearer may indeed be unaware that such a baseline neutral facial expression has been captured. Such capture may occur within a short time period after the wearer first puts on the HMD, such as within the first few minutes of wearing the HMD, for instance.
The techniques described herein therefore leverage an HMD's fEMG sensors just for purposes of automated capture of a baseline neutral facial expression image that can be used for subsequent camera-based facial expression detection. The fEMG sensors are not used for detecting facial expressions of the wearer of the HMD beyond initial (e.g., one-time) detection of a neutral facial expression for purposes of calibrating camera-based facial expression detection. While fEMG sensors may be used for facial expression detection, in some cases such techniques may not be as accurate as camera-based detection and can be more computationally intensive and complicated than camera-based detection.
The HMD 100 can include a display panel 107 inside the other end of the main body 105 and that is positionable incident to the eyes 152 of the wearer 102. The display panel 107 may in actuality include a right display panel incident to and viewable by the wearer 102's right eye 152A, and a left display panel incident to and viewable by the wearer 102's left eye 152B. By suitably displaying images on the display panel 107, the HMD 100 can immerse the wearer 102 within an XR.
The HMD 100 can include an adjustable strap 108 that can be tightened once the HMD 100 has been placed on the wearer 102, to maintain the gasket 106 fixably positioned against the wearer 102's face 104 in a hands-free manner. The depicted strap 108 is a behind-the-head strap. The HMD 100 may also include another strap, such as an over-the-head strap, to even more securely maintain positioning of the HMD 100 on the wearer 102.
The HMD 100 can include eye cameras 110A and 110B and/or a mouth camera 110C, which are collectively referred to as the cameras 110. While just one mouth camera 110C is shown, there may be multiple mouth cameras 110C. Similarly, whereas just one eye camera 110A and one eye camera 110B are shown, there may be multiple eye cameras 110A and/or multiple eye cameras 110B. The cameras 110 capture images of different portions of the face 104 of the wearer 102 of the HMD 100, on which basis the facial expression of the wearer 102 can be detected.
The eye cameras 110A and 110B are inside the main body 105 of the HMD 100 and are directed towards respective eyes 152. The right eye camera 110A captures images of the facial portion including and around the wearer 102's right eye 152A, whereas the left eye camera 110B captures images of the facial portion including and around the wearer 102's left eye 152B. The mouth camera 110C is exposed at the outside of the body 105 of the HMD 100, and is directed towards the mouth 154 of the wearer 102.
The HMD 100 can include fEMG sensors 156A, 156B, 156C, and 156D, which are collectively referred to as the fEMG sensors 156, as well as fEMG sensors 158A, 158B, 158C, and 158D, which are collectively referred to as the fEMG sensors 158. The fEMG sensors 156 and 158 are disposed within the gasket 106 of the HMD 100. The fEMG sensors 156 and 158 are externally exposed at the gasket 106, so that the sensors 156 and 158 come into contact with the skin of the wearer 102's face 104 when the HMD 100 is worn by the wearer 102.
The fEMG sensors 156 are positioned above the wearer 102's eyes 152 and thus can be referred to as above-eye fEMG sensors 156, whereas the fEMG sensors 158 are positioned below the wearer 102's eyes 152 and thus can be referred to as below-eye fEMG sensors 158. In the depicted example there are four fEMG sensors 156 and four fEMG sensors 158, such that there are two fEMG sensors 156 and two fEMG sensors 158 around each eye 152 of the wearer 102. If less sensitivity is sufficient, there may be just one fEMG sensor 156 and one fEMG sensor 158 for each eye 152, in which case the sensors 156 and 158 for an eye 152 may be respectively centered above and below that eye 152. If even less sensitivity is sufficient, there may be just one fEMG sensor 156 and one fEMG sensor 158, in which case the sensors 156 and 158 may be respectively centered above and below in-between the eyes 152 of the wearer 102.
The fEMG sensors 156 and 158 output signals measuring facial muscle activity of the wearer 102 and can be used to detect a neutral facial expression of the wearer 102. The above-eye fEMG sensors 156 specifically output signals measuring muscle activity primarily in an upper facial portion above the wearer 102's eyes 152, whereas the below-eye fEMG sensors 158 output signals measuring muscle activity primarily in a lower facial portion below the wearer 102's eyes 152. The fEMG sensors 156 can therefore be employed to detect a neutral facial expression of the upper facial portion of the wearer 102 and the fEMG sensors 158 can be employed to detect a neutral facial expression of the lower facial portion of the wearer 102.
The fEMG sensors 156 and 158 may be able to respectively detect the neutral facial expression of the wearer's 102 upper and lower facial portions independently of one another. That is, the fEMG sensors 156 may be able to detect a neutral facial expression of the upper facial portion independent of the fEMG sensors 158 detecting a neutral facial expression of the lower facial portion, and vice-versa. Furthermore, if the neutral facial expression of just the upper or lower facial portion of the wearer 102 is of interest, then the HMD 100 may respectively include just the fEMG sensors 156 or 158. The HMD 100 may therefore similarly include just the eye cameras 110A and 110B or just the mouth camera 110C in this case.
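To illustrate this independence, the minimal sketch below aggregates per-channel activity envelopes into one activity level per facial portion, so that upper and lower neutrality can be judged separately; the channel indices and region names are hypothetical stand-ins for the sensors 156 and 158.

```python
from typing import Dict, List
import numpy as np

# Hypothetical channel layout: channels 0-3 for the above-eye sensors 156,
# channels 4-7 for the below-eye sensors 158.
REGION_CHANNELS: Dict[str, List[int]] = {
    "upper": [0, 1, 2, 3],
    "lower": [4, 5, 6, 7],
}

def region_activity(envelopes: np.ndarray) -> Dict[str, float]:
    """Reduce a (channels x samples) array of activity envelopes to one
    activity level per facial portion, so that upper and lower neutrality
    can be detected independently of one another."""
    return {region: float(envelopes[channels].mean())
            for region, channels in REGION_CHANNELS.items()}
```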
The processing includes detecting a neutral facial expression of the wearer 102 of the HMD 100 based on signals received from one or multiple fEMG sensors 156 and 158 of the HMD 100 (204). Two example ways in which a neutral facial expression can be detected based on signals from the fEMG sensors 156 and 158 are described later in the detailed description. Most generally, an overall neutral facial expression of the wearer 102 can be detected. However, as noted above, neutral facial expressions of the upper and lower facial portions may be independently detected (at different times even) and the neutral facial expression of just the upper or lower facial portion may be detected.
The processing includes responsively (i.e., in response to detection of the neutral facial expression by the fEMG sensors 156 and 158) automatedly capturing a baseline neutral facial expression image of the wearer 102 using the cameras 110 of the HMD 100 (206). The capture of the baseline neutral facial expression image is automated in that the wearer 102 does not have to explicitly initiate capture of the image, and the wearer 102 does not have to be explicitly prompted to exhibit a neutral facial expression. Rather, the baseline neutral facial expression image is automatically captured once the fEMG sensors 156 and 158 detect that the wearer 102 is exhibiting a neutral facial expression. As noted above, the wearer 102 may even be unaware that such image capture is taking place, particularly in the context of camera 110-based facial expression detection calibration.
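A minimal sketch of this automated capture flow, under the assumption of hypothetical `femg.read()`, `cam.capture()`, and `is_neutral()` interfaces (none of which are particulars of the HMD 100), might look as follows.

```python
import time

def auto_capture_baseline(femg, cameras, is_neutral, timeout_s=180.0):
    """Poll fEMG activity and, the first time a neutral expression is
    detected (204), capture baseline images from the HMD cameras (206)
    without prompting the wearer or awaiting any wearer input."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        signals = femg.read()        # latest window of fEMG sensor signals
        if is_neutral(signals):      # fEMG-based neutral-expression detection
            return {cam.name: cam.capture() for cam in cameras}
        time.sleep(0.1)              # poll at roughly 10 Hz
    return None                      # no neutral expression observed in time
```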
The processing includes calibrating camera 110-based facial expression detection based on the captured baseline neutral facial expression image (208). In one implementation, calibration can include just storing the captured baseline neutral facial expression image for subsequent usage to detect facial expressions of the wearer 102 of the HMD 100 based on facial images of the wearer 102 captured by the cameras 110. In other implementations, to the extent that camera 110-based facial expression detection involves algorithms or models that have parameters changed or that are otherwise adjusted for a given baseline neutral facial expression image, calibration can include changing such parameters or otherwise adjusting the algorithms or models on the basis of the captured image. As noted above, camera 110-based facial expression detection calibration can occur without the wearer 102 having to initiate the calibration and without the wearer 102 even having to be aware that the calibration is occurring.
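The following sketch illustrates both calibration styles; the `set_neutral_reference` hook is a hypothetical name for whatever per-baseline adjustment a given algorithm or model supports.

```python
class ExpressionCalibrator:
    """Calibrate camera-based facial expression detection (208) against a
    captured baseline neutral facial expression image."""

    def __init__(self, model):
        self.model = model      # the camera-based detection algorithm or model
        self.baseline = None

    def calibrate(self, neutral_image):
        # Simplest implementation: just store the baseline image for later
        # comparison against subsequently captured facial images.
        self.baseline = neutral_image
        # If the model exposes per-baseline parameters, adjust them so the
        # baseline image itself maps to the neutral expression.
        if hasattr(self.model, "set_neutral_reference"):
            self.model.set_neutral_reference(neutral_image)
```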
The processing can then include (calibrated) camera 110-based detection of facial expressions of the wearer 102 of the HMD 100 (210). The upper and lower facial expressions of the wearer 102 may be independently detected, such as in a case in which neutral upper and lower facial expressions are captured in separate baseline images responsive to independent detection of the neutral facial expressions of the upper and lower facial portions of the wearer 102. Similarly, the detected facial expression can include just the upper facial expression using images from just the eye cameras 110A and 110B, such as if the HMD 100 does not include the mouth camera 110C. Likewise, the detected facial expression can include just the lower facial expression using images from just the mouth camera 110C, such as if the HMD 100 does not include the eye cameras 110A and 110B.
The facial expressions of the wearer 102 of the HMD 100 can be detected in a (calibrated) camera 110-based approach as follows. The cameras 110, subsequent to camera 110-based detection calibration, capture facial images of the wearer 102 (212). The changing facial expressions of the wearer 102 over time can then be detected by comparing these subsequently captured facial images to the previously captured neutral facial expression image (214). For example, a calibrated algorithm or a model may be applied to the subsequently captured facial images of the wearer 102. The baseline neutral facial expression image may also be input to the algorithm or model at the same time, particularly if the algorithm or model was not previously adjusted based on the baseline neutral facial expression image.
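A sketch of this comparison step follows; `predict` and `uses_baseline_input` are assumed names for the model's inference call and for a flag indicating whether the baseline must accompany each frame.

```python
def detect_expression(model, frame, baseline):
    """Detect the wearer's current facial expression from a newly captured
    facial image (212) by comparison against the baseline image (214)."""
    if getattr(model, "uses_baseline_input", False):
        # The model was not pre-adjusted, so it receives the baseline
        # alongside each frame and performs the comparison internally.
        return model.predict(frame, baseline)
    # The model was already calibrated to the baseline during step 208.
    return model.predict(frame)
```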
The processing can include performing one or multiple actions related to the HMD 100 based on the facial expressions of the wearer 102 that have been detected (216). For instance, the emotions of the wearer 102 may be predicted based on the detected facial expressions. As another example, the images displayed by the HMD 100 on the display panel 107 for viewing by the wearer 102 of the HMD 100 may be changed based on the detected facial expressions of the wearer 102.
As a concrete example of the latter, a displayed avatar corresponding to the wearer 102 may have its facial expressions changed in correspondence with the changing actual facial expressions of the wearer 102 as detected. The avatar may additionally or instead be displayed to users other than the wearer 102, such as in the case in which the wearer 102 is participating in a video meeting or other collaboration with the other users. In this case, the action in question still is related to the HMD 100 insofar as the action pertains to the wearer 102 of the HMD 100.
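As a non-limiting sketch of such actions, the function below predicts an emotion and mirrors the detected expression on the wearer's avatar; the `emotion_predictor` and `avatar.set_expression` interfaces are hypothetical.

```python
def apply_expression_actions(expression, avatar, emotion_predictor=None):
    """Perform actions related to the HMD based on a detected facial
    expression (216)."""
    emotion = None
    if emotion_predictor is not None:
        emotion = emotion_predictor(expression)  # e.g., "happy", "surprised"
    # Change the displayed images: update the avatar corresponding to the
    # wearer so its expression tracks the wearer's actual expression.
    avatar.set_expression(expression)
    return emotion
```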
In one example method 300, a neutral facial expression of the wearer 102 is detected based on the signals received from the fEMG sensors 156 and 158. Specifically, a neutral facial expression is detected if the facial muscle activity measured by the signals remains below a threshold throughout a time period.
The time period in question may be several seconds in length or more. If the upper and lower facial expressions of the wearer 102 are to be independently detected, then the wearer 102's neutral facial expressions may likewise be independently detected. For example, the method 300 may be performed to detect a neutral upper facial expression of the wearer 102 based on just the signals from the above-eye sensors 156, and may be separately (e.g., concurrently) performed to detect a neutral lower facial expression of the wearer 102 based on just the signals from the below-eye sensors 158.
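A minimal sketch of this threshold-based detection follows; the particular threshold value and the several-second period are example parameters, not prescribed values.

```python
import numpy as np

def neutral_by_threshold(envelope: np.ndarray, fs: float,
                         threshold: float, period_s: float = 3.0) -> bool:
    """Method-300-style detection: report a neutral expression only if the
    muscle-activity envelope stays below the threshold for the entire
    time period."""
    window = int(period_s * fs)
    if envelope.size < window:
        return False  # not enough signal has been observed yet
    return bool(np.all(envelope[-window:] < threshold))
```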
In another example method 350, a trained probabilistic model is applied to the signals received from the fEMG sensors 156 and 158 to determine a probability that the wearer 102 is currently exhibiting a neutral facial expression. A neutral facial expression is detected if this probability remains sufficiently high throughout a time period.
As in the method 300, the time period in question may be several seconds in length or more in the method 350. Also as in the method 300, neutral upper and lower facial expressions of the wearer 102 may be detectable independently of one another. There may be a different trained probabilistic model for determining the probability of a neutral facial expression for each of the upper and lower facial portions of the wearer 102. In another implementation, the same trained probabilistic model may be used in each case.
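A sketch of this probabilistic detection follows, using logistic regression as one stand-in choice of trained probabilistic model; the feature representation and probability threshold are assumptions for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def neutral_by_probability(model: LogisticRegression,
                           feature_windows: np.ndarray,
                           threshold: float = 0.9) -> bool:
    """Method-350-style detection: a trained model scores each fEMG feature
    window, and a neutral expression is reported only if P(neutral) stays
    above the threshold across the whole time period."""
    # With training labels {0: non-neutral, 1: neutral}, predict_proba's
    # second column is P(neutral) for each window.
    probs = model.predict_proba(feature_windows)[:, 1]
    return bool(np.all(probs >= threshold))
```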
Techniques have been described for automated capture of a baseline neutral facial expression image of the wearer of an HMD, on which basis HMD camera-based facial expression detection can be calibrated. Such automated capture is triggered by fEMG sensor-based detection of the wearer currently exhibiting a neutral facial expression. The wearer of the HMD therefore does not have to be explicitly prompted to exhibit a neutral facial expression, and does not have to manually initiate calibration of HMD camera-based facial expression detection.