1. Field
The present disclosure relates generally to augmented reality (AR) devices, e.g., AR eyeglasses, having optical see-through head mounted displays (HMD), and more particularly, to enabling remote screen sharing using such AR devices. AR is a technology in which a user's view of the real world is enhanced with additional information generated from a computer model. The enhancements may include labels, 3D rendered models, or shading and illumination changes. AR allows a user to work with and examine the physical real world, while receiving additional information about the objects in it.
2. Background
AR devices typically include an optical see-through HMD and one or more user input mechanisms that allow users to simultaneously see and interact with their surroundings while interacting with applications, such as e-mail and media players. User input mechanisms may include one or more of gesture recognition technology, eye tracking technology, and other similar mechanisms.
In optical see-through HMD with AR, virtual objects augment the user's view of real world objects such that both virtual and real-world objects are properly aligned. For example, a person in the field of view of a user may be augmented with her name, an artwork may be augmented with descriptive information, and a book may be augmented with its price.
It may be desirable for a user of an AR device with an optical see-through HMD to share what he is seeing through the device with remote users. To this end, a user's view, including both the real-world scene and the augmented reality content, may be captured, transmitted to a remote device over a network, and reconstructed at the remote device in real time. This capability is beneficial for different use cases, such as supervised heating, ventilation, and air conditioning (HVAC) troubleshooting, user interaction research, live demonstration of HMD apps, etc. Such remote observation of a user's augmented view is referred to herein as “remote screen sharing in HMDs”.
Remote screen sharing in an optical see-through HMD is challenging because image data of the user's view are formed on the user's retina, as opposed to a video see-through HMD, where image data are directly accessible. As such, it is difficult to replicate, for remote display, what the user is viewing through their eyes.
In an aspect of the disclosure, a method, an apparatus, and a computer program product for constructing an augmented view as perceived by a user of an augmented reality (AR) device having an optical see-through head mounted display (HMD) with AR, for display at a remote device are provided. An apparatus obtains scene data corresponding to a real-world scene visible through the optical see-through HMD, and screen data of at least one of a first augmented object displayed on the optical see-through HMD, and a second augmented object displayed on the optical see-through HMD. The apparatus determines to apply at least one of a first offset to the first augmented object relative to an origin of the real-world scene, and a second offset to the second augmented object relative to the origin. The apparatus then generates augmented-view screen data for displaying the augmented view on an HMD remote from the AR device. The augmented-view screen data is based on at least one of the first offset and the second offset.
The detailed description set forth below in connection with the appended drawings is intended as a description of various configurations and is not intended to represent the only configurations in which the concepts described herein may be practiced. The detailed description includes specific details for the purpose of providing a thorough understanding of various concepts. However, it will be apparent to those skilled in the art that these concepts may be practiced without these specific details. In some instances, well known structures and components are shown in block diagram form in order to avoid obscuring such concepts.
Several aspects of remote screen sharing through an AR device having an optical see-through HMD will now be presented with reference to various apparatus and methods. These apparatus and methods will be described in the following detailed description and illustrated in the accompanying drawings by various blocks, modules, components, circuits, steps, processes, algorithms, etc. (collectively referred to as “elements”). These elements may be implemented using electronic hardware, computer software, or any combination thereof. Whether such elements are implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system.
By way of example, an element, or any portion of an element, or any combination of elements may be implemented with a “processing system” that includes one or more processors. Examples of processors include microprocessors, microcontrollers, digital signal processors (DSPs), field programmable gate arrays (FPGAs), programmable logic devices (PLDs), state machines, gated logic, discrete hardware circuits, and other suitable hardware configured to perform the various functionality described throughout this disclosure. One or more processors in the processing system may execute software. Software shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise.
Accordingly, in one or more exemplary embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or encoded as one or more instructions or code on a computer-readable medium. Computer-readable media includes computer storage media. Storage media may be any available media that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise a random-access memory (RAM), a read-only memory (ROM), an electrically erasable programmable ROM (EEPROM), compact disk ROM (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Disk and disc, as used herein, includes CD, laser disc, optical disc, digital versatile disc (DVD), and floppy disk where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
“Remote device” as used herein is a device that is separate from the AR device 102 that generated the shared image data. The remote device 108 may be a computer, smartphone, tablet, laptop, etc. As described above, the HMD remote application 106 receives screen data and scene data from the AR device 102. As described further below, the HMD remote application 106 processes the screen data and the scene data to generate an image corresponding to the image viewed by the user of the AR device 102. The HMD remote application 106 sends the image to the remote device 108. Although the HMD remote application 106 is illustrated as a separate element, the application may be part of the remote device 108.
The processing system 206 and the eye tracking components provide eye tracking capability. Depending on the eye tracking technology being employed, eye tracking components may include one or both of eye cameras and infra-red emitters, e.g. diodes. The processing system 206 and the scene camera 208 provide gesture tracking capability.
The feedback devices 210 provide perception feedback to the user in response to certain interactions with the AR device. Feedback devices 210 may include a speaker or a vibration device. Perception feedback may also be provided by visual indication through the HMD.
The transceiver 212 facilitates wireless communication between the processing system 206 and remote devices, systems or networks. For example, the AR device may communicate with remote servers through the transceiver 212 for purposes of remote processing, such as on-line searches through remote search engines, or remote sharing of image data.
As mentioned above, the AR device 200 allows a user to view real-world scenes through optical see-through HMDs together with content displayed on the HMDs. For example, with reference to
User interaction with the AR device 200 is provided by one or more user input mechanisms, such as a gesture tracking module or an eye-gaze tracking module. Gesture tracking is provided by the scene camera 208 in conjunction with a gesture tracking module of the processing system 206. With gesture tracking, a user may attempt to activate an application by placing his finger on an application icon 304, 306, 308 in the field of view of the AR device. The scene camera 208 captures an image of the finger and sends the image to the gesture tracking module. The gesture tracking module processes the image and determines coordinates of a gesture point corresponding to where the user is pointing. The processing system 206 compares the coordinate location of the gesture point to the coordinate location of the icon on the display. If the locations match, or are within a threshold distance of each other, the processing system 206 determines that the user has selected the icon 304, 306, 308 and accordingly, launches the application.
Eye-gaze tracking is provided by the eye tracking components (not visible) in conjunction with an eye tracking module of the processing system 206. A user may attempt to activate an application by gazing at an application icon 304, 306, 308 in the field of view of the AR device. The eye tracking components capture images of the eyes, and provide the images to the eye tracking module. The eye tracking module processes the images and determines coordinates of an eye-gaze point corresponding to where the user is looking. The processing system 206 compares the coordinate location of the eye-gaze point to the coordinate location of the icon on the display. If the locations match, or are within a threshold distance of each other, the processing system 206 determines that the user has selected the icon 304, 306, 308 and accordingly, launches the application. Often, such eye-gaze based launching is coupled with another form of input, e.g., gesture, to confirm the user's intention of launching the application.
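Both the gesture-based and eye-gaze-based selections therefore reduce to a proximity test between a tracked point and an icon's coordinate location on the display. A minimal Python sketch of such a test follows; the function name and the threshold value are illustrative assumptions rather than details of the tracking modules described above.

    import math

    def is_icon_selected(point_xy, icon_xy, threshold_px=20.0):
        """Return True when a gesture point or eye-gaze point matches, or lies
        within a threshold distance of, an icon's display coordinates."""
        dx = point_xy[0] - icon_xy[0]
        dy = point_xy[1] - icon_xy[1]
        return math.hypot(dx, dy) <= threshold_px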
The AR device 400 includes an on-board processing system 410, which in turn includes one or more of an eye tracking module 412 and a gesture tracking module 414. An object selection module 416 processes the outputs of the one or more tracking modules to determine user interactions and tracking module accuracy. A tracking calibration module 418 calibrates the one or more tracking modules if the tracking module is determined to be inaccurate.
The on-board processing system 410 may also include a scene camera calibration module 420, a graphical user interface (GUI) adjustment module 422, a perception feedback module 424, and a sharing module 436. The scene camera calibration module 420 calibrates the AR device so that the AR content is aligned with real world objects. The GUI adjustment module 422 may adjust the parameters of GUI objects displayed on the HMD to compensate for eye-tracking or gesture-tracking inaccuracies detected by the object selection module 416. Such adjustments may precede, supplement, or substitute for the actions of the tracking calibration module 418. The feedback module 424 controls one or more feedback devices 426 to provide perception feedback to the user in response to one or more types of user interactions. For example, the feedback module may command a feedback device 426 to output sound when a user selects an icon in the field of view using a gesture or eye gaze. The sharing module 436 receives scene data from the scene camera 408, captures screen data from the HMD 402, and transmits the data to a remote HMD application 438 for further processing as described in detail below.
The AR device 400 further includes memory 428 for storing program code to implement the foregoing features of the on-board processing system 410. A communications module 430 and transceiver 432 facilitate wireless communications with remote devices, systems and networks.
With further respect to eye tracking capability, the diodes 404 and eye cameras 406, together with the eye tracking module 412, provide eye tracking capability as generally described above. In the example implementation of
The scene camera 408, together with the gesture tracking module 414, provides gesture tracking capability using a known technology as generally described above. In the example implementation of
The object selection processor 416 functions to determine whether interactions of the user, as characterized by one or more of the eye tracking module 412 and the gesture tracking module 414, correspond to a selection of an object, e.g., application icon, displayed on the HMD 402 and visible in the field of view. If an interaction does correspond to a selection by the user, for example, a selection of an icon to launch an application 434, the object selection processor 416 outputs a command to the application 434.
As previously mentioned, reconstructing the user's view, such as the real world scene together with the augmented object as shown in
Disclosed herein are methods and apparatuses that enable remote screen sharing in optical see-through HMD by constructing the user's augmented view. “Augmented view” as used herein means the view of the user through the AR device including both the real-world scene as seen by the user and augmented reality objects as also seen by the user.
The AR device 400 disclosed herein may enable such remote screen sharing of augmented views. Components of the AR device that facilitate such sharing include the scene camera 408, the HMDs 402, the sharing module 436, and the communication module 430. The scene camera 408 is configured to capture the real-world scene component of the augmented view that the user of the AR device is seeing through the optical see-through HMD lens of the glasses.
The sharing module 436 includes an HMD screen capture, or screen shot, function that is configured to capture the augmented reality component of the augmented view seen by the user. The augmented reality component includes the augmented reality objects displayed in front of the user on the optical see-through HMDs of the AR device.
Proper reconstruction of the user's augmented view at a remote device, however, cannot be achieved by simply superimposing screen pixels captured in the HMD screen over scene pixels captured by the scene camera.
In order to provide accurate remote viewing by others of a user's augmented view, methods and apparatuses disclosed herein reconstruct the user's augmented view. More specifically, to adjust for this misalignment, methods and apparatuses disclosed herein dynamically compute alignment offsets for both the left and right eyes of the user and then superimpose the adjusted augmentation over the scene image. The disclosed framework takes the following data as input (sketched as a data container after the list below) and produces a correct augmented view as an output, such as shown in
1) Scene camera image, also referred to as scene data (shown in
2) HMD screen dump, also referred to as screen data (shown in
3) Projection matrix of both eyes (PR and PL) (defines the transformation from scene camera to user's eye) and camera (Pc)
4) Current modelview matrix M related to the marker (defines the transformation from marker to scene camera)
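A minimal sketch of these four inputs packaged as a single payload for transmission to the HMD remote application 106 is shown below, in Python; the class and field names are illustrative assumptions.

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class SharedViewInputs:
        """Illustrative container for the four inputs listed above."""
        scene_buf: np.ndarray   # scene camera image, shape (Iy, Ix, 3)
        screen_buf: np.ndarray  # HMD screen dump, shape (Sy, Sx, 3)
        P_R: np.ndarray         # 4x4 right-eye projection matrix
        P_L: np.ndarray         # 4x4 left-eye projection matrix
        P_C: np.ndarray         # 4x4 scene camera projection matrix
        M: np.ndarray           # 4x4 modelview matrix (marker -> scene camera)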
The above inputs can be sent to the HMD remote application 106 over the communications network 104. The HMD remote application 106 constructs the user's augmented view using the following algorithm. For ease of description, the screen resolution (Sx, Sy) and scene resolution (Ix, Iy) are assumed identical.
For the inputs, PR is the projection matrix for the right eye, PL is the projection matrix for the left eye, Pc is the projection matrix for the scene camera, M is the modelview matrix, screen_buf is the screen capture from the HMD screen, and scene_buf is the scene capture from the scene camera.
The following code corresponds to line 3 (xR, yR, xL, yL = get_aligned_offsets(PR, PL, Pc, M)) of Algorithm 1.
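One way such an offset computation could be realized is sketched below in Python. It assumes that each offset is the pixel displacement from the marker origin as projected through an eye's matrices (PR·M or PL·M) to the marker origin as projected through the scene camera's matrices (Pc·M); the explicit width and height arguments and the NDC-to-pixel mapping are additions for the sketch rather than part of the signature quoted above.

    import numpy as np

    def _project_to_pixels(P, M, width, height):
        """Project the marker origin (0, 0, 0, 1) through P @ M and map the
        resulting normalized device coordinates to pixel coordinates."""
        clip = P @ M @ np.array([0.0, 0.0, 0.0, 1.0])
        ndc = clip[:3] / clip[3]                     # perspective divide
        px = (ndc[0] * 0.5 + 0.5) * width            # NDC x in [-1, 1] -> pixels
        py = (1.0 - (ndc[1] * 0.5 + 0.5)) * height   # flip y for image coordinates
        return px, py

    def get_aligned_offsets(P_R, P_L, P_C, M, width, height):
        """Compute the per-eye alignment offsets (xR, yR, xL, yL) as the
        displacement from each eye's projection of the marker origin to the
        scene camera's projection of the marker origin."""
        cx, cy = _project_to_pixels(P_C, M, width, height)
        rx, ry = _project_to_pixels(P_R, M, width, height)
        lx, ly = _project_to_pixels(P_L, M, width, height)
        return cx - rx, cy - ry, cx - lx, cy - ly

Because the screen resolution and scene resolution are assumed identical, the same width and height are used for both projections.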
For the output, scene_buf is the aligned scene produced by the algorithm; this buffer overrides the input scene buffer. xR, yR are the aligned offsets computed for the right eye, and xL, yL are the aligned offsets computed for the left eye. The origin for the offsets is the center of the real-world object as provided by the scene camera.
The algorithm scans through each pixel on the screen, e.g., the screen in
Once left or right eye augmentation is determined, the proper xR, yR or xL, yL offsets are applied to the corresponding coordinates of the pixel, and the augmented pixel is superimposed on the scene image by overriding the corresponding pixel in the scene buffer with the offset screen pixel. The algorithm scans the screen data by starting at pixel (x, 0) and running through all values of x, then moving to pixel (x, 1) and running through all values of x, and so on.
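A minimal Python sketch of this scanning loop, using the get_aligned_offsets helper sketched above, is given below. The assumption that the left half of the screen buffer holds the left-eye augmentation is illustrative only; the determination of left or right eye augmentation may be made in other ways.

    import numpy as np

    def reconstruct_augmented_view(scene_buf, screen_buf, P_R, P_L, P_C, M):
        """Superimpose offset-corrected HMD screen pixels onto the scene image.

        scene_buf and screen_buf are assumed to be equally sized (H, W, 3)
        arrays; scene_buf is overridden in place, as described above."""
        Sy, Sx = screen_buf.shape[:2]
        xR, yR, xL, yL = get_aligned_offsets(P_R, P_L, P_C, M, Sx, Sy)

        for y in range(Sy):                      # rows: (x, 0), then (x, 1), ...
            for x in range(Sx):
                pixel = screen_buf[y, x]
                if not pixel.any():              # skip black (unaugmented) pixels
                    continue
                # Pick the offsets for the eye this pixel belongs to
                # (assumed layout: left half of the screen buffer = left eye).
                dx, dy = (xL, yL) if x < Sx // 2 else (xR, yR)
                tx, ty = int(round(x + dx)), int(round(y + dy))
                if 0 <= tx < Sx and 0 <= ty < Sy:
                    scene_buf[ty, tx] = pixel    # override the scene pixel
        return scene_buf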
All inputs vary from user to user, not from HMD application to application; thus, the framework disclosed herein does not require support from individual HMD applications. The projection matrices and the modelview matrix are globally available in the HMD environment for a user using an HMD. Therefore, this framework can be implemented as a separate service in an HMD environment. This separate service may collect the input data, reconstruct the user's augmented view following the above algorithm, and send it to the HMD remote application 106 over the network 104 on behalf of any arbitrary HMD application.
At step 802, the remote application 438 obtains scene data corresponding to a real-world scene visible through the optical see-through HMD. The scene data may be obtained from the AR device through which the user is seeing the augmented view. For example, the scene camera of the AR device may capture the real-world scene.
At step 804, the remote application obtains screen data of at least one of a first augmented object displayed on the optical see-through HMD, and a second augmented object displayed on the optical see-through HMD. The screen data may be obtained from the AR device through which the user is seeing the augmented view. For example, a sharing module 436 of the AR device may capture the screen data displayed on the optical see-through HMD.
At step 806, the remote application determines to apply at least one of a first offset to the first augmented object relative to an origin of the real-world scene, and a second offset to the second augmented object relative to the origin. In one configuration, the screen data includes a plurality of pixels, and the remote application determines to apply offsets by determining whether a pixel is non-black. For a non-black pixel, the remote application then determines whether the pixel corresponds to the first augmented object or the second augmented object. If the pixel corresponds to the first augmented object, the remote application applies the first offset to the pixel. If the pixel corresponds to the second augmented object, the remote application applies the second offset to the pixel.
The optical see-through HMD may correspond to a right lens of the AR device, in which case the first offset includes an x coordinate offset and a y coordinate offset for the user's right eye. The optical see-through HMD may correspond to a left lens of the AR device, in which case the second offset includes an x coordinate offset and a y coordinate offset for the user's left eye.
The first offset and the second offset may be respectively based on a first projection matrix and second projection matrix, together with one or more of a scene camera projection matrix defining a transformation from the scene camera to a first eye of the user, and a model view matrix defining a transformation from a marker to the scene camera.
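Under one reading of this computation, with o denoting the marker-origin point (0, 0, 0, 1) and proj(·) denoting perspective division followed by mapping to pixel coordinates, the offsets may be written as follows; this formalization is illustrative and corresponds to the get_aligned_offsets sketch above.

    (xR, yR) = proj(Pc · M · o) − proj(PR · M · o)
    (xL, yL) = proj(Pc · M · o) − proj(PL · M · o)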
At step 808, the remote application generates augmented-view screen data for displaying the augmented view on an HMD remote from the AR device. The augmented-view screen data is based on at least one of the first offset and the second offset. Generating the augmented-view screen data includes, for each offset pixel, replacing the corresponding pixel in the scene data with the offset pixel. In doing so, the image data output by the HMD remote application produces an image on a remote HMD corresponding to the augmented view of the user. In other words, the remote HMD displays the image of
The apparatus 902 also includes an offset application determination module 908 that determines to apply at least one of a first offset to the first augmented object relative to an origin of the real-world scene, and a second offset to the second augmented object relative to the origin. The apparatus 902 further includes an augmented-view screen data generating module 910 that generates augmented-view screen data for displaying the augmented view on an HMD remote from the AR device. The augmented-view screen data is based on at least one of the first offset and the second offset.
The remote HMD application, as illustrated in
The processing system 1014 includes a processor 1004 coupled to a computer-readable medium/memory 1006. The processor 1004 is responsible for general processing, including the execution of software stored on the computer-readable medium/memory 1006. The software, when executed by the processor 1004, causes the processing system 1014 to perform the various functions described supra for any particular apparatus. The computer-readable medium/memory 1006 may also be used for storing data that is manipulated by the processor 1004 when executing software. The processing system further includes at least one of the modules 904, 906, 908 and 910. The modules may be software modules running in the processor 1004, resident/stored in the computer readable medium/memory 1006, one or more hardware modules coupled to the processor 1004, or some combination thereof.
In one configuration, the apparatus 902/902′ includes means for obtaining scene data corresponding to a real-world scene visible through the optical see-through HMD, means for obtaining screen data of at least one of a first augmented object displayed on the optical see-through HMD, and a second augmented object displayed on the optical see-through HMD, means for determining to apply at least one of a first offset to the first augmented object relative to an origin of the real-world scene, and a second offset to the second augmented object relative to the origin, and means for generating augmented-view screen data for displaying the augmented view on an HMD remote from the AR device, the augmented-view screen data based on at least one of the first offset and the second offset. The aforementioned means may be one or more of the aforementioned modules of the apparatus 902 and/or the processing system 1014 of the apparatus 902′ configured to perform the functions recited by the aforementioned means.
A method of reconstructing a user's view through an optical see-through AR device for display at a remote device includes obtaining data corresponding to a scene image of a real-world object visible through the AR device, obtaining data corresponding to a first screen image of a first augmented object displayed on the AR device, and a second screen image of a second augmented object displayed on the AR device, determining a first offset for the first screen image relative to an origin provided by the scene image, and a second offset for the second screen image relative to the origin, and generating display data based on the first offset and the second offset, wherein the display data provides a display of the real-world object aligned with the first augmented object and the second augmented object. The first screen image corresponds to the right lens of the AR device and the first offset comprises an x coordinate offset and a y coordinate offset. The second screen image corresponds to the left lens of the AR device and the second offset comprises an x coordinate offset and a y coordinate offset.
A corresponding apparatus for reconstructing a user's view through an optical see-through AR device for display at a remote device includes means for obtaining data corresponding to a scene image of a real-world object visible through the AR device, means for obtaining data corresponding to a first screen image of a first augmented object displayed on the AR device, and a second screen image of a second augmented object displayed on the AR device, means for determining a first offset for the first screen image relative to an origin provided by the scene image, and a second offset for the second screen image relative to the origin, and means for generating display data based on the first offset and the second offset, wherein the display data provides a display of the real-world object aligned with the first augmented object and the second augmented object.
Another apparatus for reconstructing a user's view through an optical see-through AR device for display at a remote device includes a memory, and at least one processor coupled to the memory and configured to obtain data corresponding to a scene image of a real-world object visible through the AR device, to obtain data corresponding to a first screen image of a first augmented object displayed on the AR device, and a second screen image of a second augmented object displayed on the AR device, to determine a first offset for the first screen image relative to an origin provided by the scene image, and a second offset for the second screen image relative to the origin, and to generate display data based on the first offset and the second offset, wherein the display data provides a display of the real-world object aligned with the first augmented object and the second augmented object.
A computer program product for reconstructing a user's view through an optical see-through AR device for display at a remote device includes a computer-readable medium comprising code for obtaining data corresponding to a scene image of a real-world object visible through the AR device, code for obtaining data corresponding to a first screen image of a first augmented object displayed on the AR device, and a second screen image of a second augmented object displayed on the AR device, code for determining a first offset for the first screen image relative to an origin provided by the scene image, and a second offset for the second screen image relative to the origin, and code for generating display data based on the first offset and the second offset, wherein the display data provides a display of the real-world object aligned with the first augmented object and the second augmented object.
It is understood that the specific order or hierarchy of steps in the processes disclosed is an illustration of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the processes may be rearranged. Further, some steps may be combined or omitted. The accompanying method claims present elements of the various steps in a sample order, and are not meant to be limited to the specific order or hierarchy presented.
The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects. Thus, the claims are not intended to be limited to the aspects shown herein, but are to be accorded the full scope consistent with the language of the claims, wherein reference to an element in the singular is not intended to mean “one and only one” unless specifically so stated, but rather “one or more.” The word “exemplary” is used herein to mean “serving as an example, instance, or illustration.” Any aspect described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects. Unless specifically stated otherwise, the term “some” refers to one or more. Combinations such as “at least one of A, B, or C,” “at least one of A, B, and C,” and “A, B, C, or any combination thereof” include any combination of A, B, and/or C, and may include multiples of A, multiples of B, or multiples of C. Specifically, combinations such as “at least one of A, B, or C,” “at least one of A, B, and C,” and “A, B, C, or any combination thereof” may be A only, B only, C only, A and B, A and C, B and C, or A and B and C, where any such combinations may contain one or more member or members of A, B, or C. All structural and functional equivalents to the elements of the various aspects described throughout this disclosure that are known or later come to be known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the claims. Moreover, nothing disclosed herein is intended to be dedicated to the public regardless of whether such disclosure is explicitly recited in the claims. No claim element is to be construed as a means plus function unless the element is expressly recited using the phrase “means for.”
This application claims the benefit of U.S. Provisional Application Ser. No. 61/867,536, entitled “Enabling Remote Screen Sharing in Optical See-Through Head Mounted Display with Augmented Reality” and filed on Aug. 19, 2013, which is expressly incorporated by reference herein in its entirety.