The technical field generally relates to computer interface interaction, and more particularly relates to using a virtualized environment to provide computer interface interaction for users.
Virtual reality (VR) is an artificial, computer-generated simulation of a real-life environment or situation. It immerses the user by making the user feel as if they are experiencing the simulated reality firsthand, primarily by stimulating their vision and hearing through a sensor-packed wearable device, such as HTC's Vive™ virtual reality system. Augmented reality (AR) takes a user's view of the real world and adds digital information on top of it. This might be as simple as numbers or text notifications, or as complex as a simulated screen.
Applications of VR and AR technology have allowed users to be inserted within a digital environment in varying degrees of immersion, such as within a gaming environment. However, applications of VR and AR technology to interacting with computer user interfaces and their associated programs have experienced technological limitations. This has resulted in a limited user experience in interacting with such applications within virtualized environments (e.g., virtual reality environments, augmented reality environments, mixed reality environments, etc.).
In accordance with the teachings provided herein, systems, methods, apparatuses, and non-transitory computer-readable media for operation upon data processing devices are provided for capturing screen contents of a plurality of applications. The applications operate on a processor-implemented device. The applications' screen contents are rendered for user interaction in a virtualized environment. A server continuously updates images of the windows of the multiple applications that are open for use within the virtualized environment.
As another example, a system and method includes maintaining on a server a list of windows and their associated process identifiers. The windows are opened by user interaction and contain the screen contents of the applications. The server requests application render updates from a host machine using a messaging command that provides dynamic compression of the windows using the identifiers associated with the windows. The server continuously updates images of the windows of the multiple applications that are open and caches the images on the server. The server asynchronously responds to requests for the images by serving the cached images for user interaction in the virtualized environment.
The following detailed description is merely exemplary in nature and is not intended to limit the application and uses. Furthermore, there is no intention to be bound by any expressed or implied theory presented in the preceding technical field, background, brief summary, or the following detailed description.
With reference to
Next, the server checks if the client needs a specific encoding (GZIP, PNG (Portable Network Graphics), raw pixel binary, etc.) at 305. The client may request encoding/compression. If the client is only running one application at 600×1024 pixels with 3 bytes per pixel, refreshing at 10 frames per second, this would require 17.58 MB/s (600 × 1024 pixels × 3 bytes per pixel × 10 frames per second, divided by 1024² to convert to megabytes). Having multiple windows open, or refreshing at a high rate (unencoded) for high-priority or high-fidelity applications (like video), may needlessly tax the network. Furthermore, most applications compress very well because they have limited color palettes and significant empty space.
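By way of illustration, a minimal sketch (in Python, purely for exposition; the disclosure does not prescribe an implementation language) of the bandwidth arithmetic above:

```python
def uncompressed_bandwidth_mb_per_s(width, height, bytes_per_pixel, fps):
    """Estimate the raw (unencoded) bandwidth needed to stream one window's pixels."""
    bytes_per_second = width * height * bytes_per_pixel * fps
    return bytes_per_second / (1024 ** 2)   # convert bytes per second to MB/s

# One 600x1024 window, 3 bytes per pixel, refreshing at 10 frames per second:
print(uncompressed_bandwidth_mb_per_s(600, 1024, 3, 10))   # ~17.58 MB/s
```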
If no encoding is necessary, then processing resumes. If the client needs a specific format, the process encodes it at 306. The updated bitmap is written to the PID to bitmap cache 107. This write occurs synchronously because different threads could be serving (reading) the underlying image to clients during the update. The thread then waits at 308 depending on the required refresh rate or priority for the application. For example, if the window requires sixty hertz updates, the bitmap update must complete in under 16.6 milliseconds (1 second / 60 Hz × 1000 to convert to milliseconds). If the thread finishes the update in 12 milliseconds, it will wait 4.6 milliseconds to maintain an exact refresh rate. This refresh rate is configurable.
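One possible shape of the per-window render loop 300 and its frame pacing is sketched below; the capture and encode callables, the cache dictionary, and the lock are illustrative assumptions rather than the actual implementation:

```python
import time
import threading

def render_loop(pid, capture_fn, bitmap_cache, cache_lock, stop_event,
                refresh_hz=60, encode_fn=None):
    """Per-window render loop sketch: capture, optionally encode, write to the
    PID-to-bitmap cache under a lock, then pace to the requested refresh rate."""
    frame_budget = 1.0 / refresh_hz                  # e.g., 1/60 s ~= 16.6 ms
    while not stop_event.is_set():
        start = time.monotonic()
        bitmap = capture_fn()                        # e.g., a window-capture helper
        if encode_fn is not None:
            bitmap = encode_fn(bitmap)               # e.g., GZIP or PNG compression
        with cache_lock:                             # synchronous write: readers may be serving this image
            bitmap_cache[pid] = bitmap
        elapsed = time.monotonic() - start
        if elapsed < frame_budget:                   # finished early (e.g., 12 ms) -> wait (~4.6 ms)
            time.sleep(frame_budget - elapsed)
```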
If the request is for the image of a window, the server performs a synchronous read from the PID to bitmap cache 107 (in case the specific bitmap is being updated from a render thread). The image will be sent to the client in the HTTP Response and the thread will close.
If the request is for sending an input event to a window, the pertinent information is parsed from the request (e.g., input type, coordinates, button, window, etc.) and sent to the window manager 500. The window manager attempts to perform the requested action and responds with either a success or error code in the HTTP Response.
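A simplified dispatch for these two request types might look as follows; the request dictionary, the window_manager interface, and the shared cache are hypothetical stand-ins for the HTTP server plumbing:

```python
import threading

bitmap_cache = {}              # the PID to bitmap cache 107 (hypothetical in-memory form)
cache_lock = threading.Lock()

def handle_request(request, window_manager):
    """Dispatch a client request: serve a cached window image or forward an input event."""
    if request["type"] == "image":
        with cache_lock:                           # synchronous read: a render thread may be writing
            image = bitmap_cache.get(request["pid"])
        return {"status": 200, "body": image}
    if request["type"] == "input":
        ok = window_manager.send_input(request["pid"],        # hypothetical window manager API
                                       request["event"],       # e.g., input type, button
                                       request.get("coords"))  # e.g., coordinates
        return {"status": 200 if ok else 500}
    return {"status": 400}
```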
If the request is for opening a new window, the window manager 500 begins by starting the requested process 502. On most operating systems, the rendered windows have different identifiers than the process ID. Specifically, in the Windows operating system, processes with a visual component have a window handle. Furthermore, some processes will immediately spawn a child process and then exit, while the child process handles the rendering. Most existing applications ask the user to select the appropriate window from a list of open windows, rather than launching an application and dynamically resolving the handle. The window manager 500 compares the list of open windows before and after the process starts, and the difference between the lists represents newly opened windows. If there are multiple new windows, the window manager 500 will also inspect the titles to find the right handle. This step relates to finding the window handle at 503 that is associated with the process that was started in 502. The PID to window handle map 504 is updated so that this conversion can be repeated later. A new thread is started from the thread pool 401, and that thread will execute its render loop 300 until the window is closed.
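On Windows, the before/after comparison of open windows could be sketched with ctypes roughly as follows; the fixed wait and the helper names are simplifying assumptions made only for illustration:

```python
import ctypes
import ctypes.wintypes
import subprocess
import time

user32 = ctypes.windll.user32

def list_visible_windows():
    """Return the set of visible top-level window handles (Windows-specific)."""
    handles = []
    EnumWindowsProc = ctypes.WINFUNCTYPE(ctypes.wintypes.BOOL,
                                         ctypes.wintypes.HWND,
                                         ctypes.wintypes.LPARAM)
    def callback(hwnd, _lparam):
        if user32.IsWindowVisible(hwnd):
            handles.append(hwnd)
        return True
    user32.EnumWindows(EnumWindowsProc(callback), 0)
    return set(handles)

def open_window(command):
    """Start a process and infer its new top-level window handle by diffing
    the window list before and after launch, as described above."""
    before = list_visible_windows()
    process = subprocess.Popen(command)
    time.sleep(2.0)   # simplification: a real manager would poll until a window appears
    new_handles = list_visible_windows() - before
    # If several windows appeared, the window titles could be inspected to disambiguate.
    return process.pid, (new_handles.pop() if new_handles else None)
```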
If the request is to close a window, the process is killed 505. The PID to window handle map 504 is updated accordingly. The associated render loop 300 thread will terminate on its next iteration.
If the request is to send an input event to a window, the appropriate window handle is retrieved from the PID to window handle map 504. In the Windows operating system, User32.dll can be used to fire the input event to the window at 506.
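The disclosure only states that User32.dll can be used; one possible way to post a left-button click to a specific window handle is sketched below (the message constants are standard Windows values, but this particular message-posting approach is an assumption):

```python
import ctypes

user32 = ctypes.windll.user32

WM_LBUTTONDOWN = 0x0201
WM_LBUTTONUP   = 0x0202
MK_LBUTTON     = 0x0001

def send_left_click(hwnd, x, y):
    """Post a left mouse click to a window handle via User32.
    Client-area pixel coordinates are packed into lParam (low word = x, high word = y)."""
    lparam = (y << 16) | (x & 0xFFFF)
    user32.PostMessageW(hwnd, WM_LBUTTONDOWN, MK_LBUTTON, lparam)
    user32.PostMessageW(hwnd, WM_LBUTTONUP, 0, lparam)
```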
Next, the update window images process 800 runs. After that, the update loop 600 will initiate the render process 900, to send relevant data to the GPU and produce an image for the output device (headset). The update loop 600 will wait for V-Sync 605 so that images are being sent to the output headset exactly as fast as the hardware refresh rate.
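The ordering described above can be summarized in a small sketch; the three steps are passed in as callables, an assumption made only to keep the example self-contained:

```python
def client_update_loop(update_window_images, render_stereo, wait_for_vsync, running):
    """Main client loop: refresh window images, render to the headset, then
    block on V-Sync so output is paced to the hardware refresh rate."""
    while running():
        update_window_images()   # process 800: request/decode updated bitmaps
        render_stereo()          # process 900: GPU draw producing the output image
        wait_for_vsync()         # step 605: wait for the display refresh
```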
Using the equation:

∥u∥ ∥v∥ cos(θ) = u · v

this can be rewritten in terms of θ (the angle between the two vectors u and v):

θ = arccos((u · v) / (∥u∥ ∥v∥))
The plane's surface normal can be generated by multiplying (0, 0, 1, 0) by the associated model matrix to convert to world space. The plane-to-camera vector can be generated from the positions derived from the model matrices: subtracting the plane's position from the camera's position yields the plane-to-camera vector. If the dot product of the plane-to-camera vector and the surface normal is positive, they are facing the same direction.
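A compact sketch of this facing test, assuming column-vector model matrices and using NumPy for the linear algebra:

```python
import numpy as np

def plane_faces_camera(model_matrix, camera_position):
    """Return True if the window plane faces the camera.
    The surface normal is (0, 0, 1, 0) transformed by the model matrix;
    the plane-to-camera vector runs from the plane's position to the camera."""
    normal = (model_matrix @ np.array([0.0, 0.0, 1.0, 0.0]))[:3]
    plane_position = (model_matrix @ np.array([0.0, 0.0, 0.0, 1.0]))[:3]
    to_camera = np.asarray(camera_position, dtype=float) - plane_position
    return float(np.dot(normal, to_camera)) > 0.0   # positive dot product: facing the same direction
```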
The coordinate conversion may be done from the local coordinate space of the 3D window (represented by a rectangle). The Z value of the vertices is 0. The bounding corners can be given by (X0, Y0, 0) and (X1, Y1, 0). Given the resolution of the windowed application (width and height) in pixels, the 3D mouse location (Xm, Ym, 0) can be converted to 2D screen space using, for example, the mapping sketched below.
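One plausible form of that mapping is given here; whether the Y axis must be flipped depends on conventions not specified in the text, so it is exposed as a parameter:

```python
def local_to_screen(xm, ym, x0, y0, x1, y1, width, height, flip_y=True):
    """Map a 3D hit point (xm, ym, 0) on the window rectangle, bounded by
    (x0, y0, 0) and (x1, y1, 0) in local space, to 2D pixel coordinates."""
    u = (xm - x0) / (x1 - x0)          # normalized horizontal position, 0..1
    v = (ym - y0) / (y1 - y0)          # normalized vertical position, 0..1
    if flip_y:                         # screen space commonly has Y increasing downward
        v = 1.0 - v
    return u * width, v * height
```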
If an update is needed, the process generates an HTTP request 803 with the identifying information for the current window (PID). The next steps can be handled on their own thread because networking and decompression may be too slow to occur on the main rendering thread (which typically has less than 16 milliseconds to finish a loop). The request is sent to the virtualized desktop server 100, and the response 804 will be handled. The response image is processed at 805 as the data might be compressed or encoded. Additionally, the bits may need to be realigned due to underlying formats (e.g., Microsoft Windows bitmaps use BGRA: Blue, Green, Red, Alpha, while many graphics languages use ARGB). The updated pixel array is synchronously written (in case the render loop thread is also reading) to the windows bitmaps 806 data structure.
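A sketch of performing the fetch, decode, and synchronized write off the main render thread; request_image and decode stand in for the HTTP call and the decompression/pixel-realignment step, and are assumptions rather than named parts of the disclosure:

```python
import threading

def fetch_window_image(pid, request_image, decode, window_bitmaps, bitmaps_lock):
    """Fetch and decode an updated window image without blocking the main render thread."""
    def worker():
        response = request_image(pid)    # HTTP request 803 to the virtualized desktop server 100
        pixels = decode(response)        # decompress/decode; may also swizzle BGRA -> ARGB
        with bitmaps_lock:               # synchronous write: the render loop may be reading
            window_bitmaps[pid] = pixels
    threading.Thread(target=worker, daemon=True).start()
```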
A stereoscopic draw call is sent to the GPU to produce two images (left 902 and right 903 eye) for the associated scene information. These two images are output from the HDMI on the graphics card, or sent to the headset display 904 by a similar mechanism.
Configuration 1 shows virtualized desktop server 100 and virtualized desktop client 200 running on separate computers that are networked together. This is an example of how a tethered headset may be used. Specifically, the headset does not have an on-board computer and receives images over HDMI, VGA, DVI, or a different video cable.
Configuration 2 shows virtualized desktop server 100 and virtualized desktop client 200 running on separate computers. In this case, the headset is capable of running the client software directly and does not need to transfer the final display over a cable. An example of this type of hardware is the Hololens™ technology from Microsoft.
Configuration 3 shows a setup where the virtualized desktop server 100 and virtualized desktop client 200 are running on the same computer. In this situation, bitmaps can be transferred directly through memory instead of HTTP. Output images may be sent to the headset 201 over a cable.
Configuration 4 shows a virtualized desktop client 200 rendering multiple windows from multiple virtualized desktop servers 100. The servers do not need to be running the same operating system. The display is sent to the headset 201 over a display cable.
Configuration 5 shows a virtualized desktop client 200 rendering multiple windows from multiple virtualized desktop servers 100. The servers do not need to be running the same operating system. The client may be running on the headset using hardware similar to the Hololens™ technology from Microsoft.
Each open window being managed by the virtualized desktop server 100 has its own thread constantly executing a render loop 300 and writing encoded/compressed updates to the PID to bitmap cache 107. In this manner, image requests do not wait for the calculations (e.g., the request can be handled immediately).
The update window images process 800 also leverages asynchronous behavior. The networking and decompression 805 of images may also be too slow to handle on the main render thread. If multiple images need to update on the same frame, they each use their own thread.
From the perspective of the main render loop 600, the process requests updates and handles the GPU rendering. Meanwhile, the rest of the sub-systems write their data forward from the local memory in the server render loop 300, to the PID to bitmap cache 107, to the window bitmaps structure 806, from which the client renderer 900 reads. The effect is that the bitmaps being used to produce 3D output for the headset will always be up to date without demanding processing on the main render thread.
While at least one example embodiment has been presented in the foregoing detailed description, it should be appreciated that a vast number of variations exist. It should also be appreciated that the embodiment or embodiments are only examples, and are not intended to limit the scope, applicability, or configuration of the disclosure in any way. Rather, the foregoing detailed description will provide those of ordinary skill in the art with a convenient road map for implementing the example embodiment or embodiments. It should be understood that various changes can be made in the function and arrangement of elements without departing from the scope of the disclosure as set forth in the appended claims and the legal equivalents thereof. As an example of the wide variations of the systems and methods described herein, a system can be configured such that the virtualized desktop server maintains a list of windows (and their associated PIDs) opened by the user. The server requests application render updates from the host machine using a WM_Print message. In Windows versions 8.1 (6.3.9600) and later, the PW_RENDERFULLCONTENT flag is used. Because processes may render in a child process, a controller is used to determine the correct window handle to issue WM_Print messages to. The server continually updates the bitmaps associated with the open windows and caches them. The server asynchronously responds to requests for current images by serving the latest cached value, because rendering may be too slow to perform during the request. In this example, images are served as GZIP-compressed bitmaps or PNG.
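For illustration only, a ctypes-based sketch of a PrintWindow capture using PW_RENDERFULLCONTENT on Windows 8.1 and later is shown below; error handling, DPI awareness, and explicit 64-bit handle types are omitted, and the disclosure does not specify this particular implementation:

```python
import ctypes
from ctypes import wintypes

user32 = ctypes.windll.user32
gdi32 = ctypes.windll.gdi32

PW_RENDERFULLCONTENT = 0x00000002   # available in Windows 8.1 (6.3.9600) and later

class BITMAPINFOHEADER(ctypes.Structure):
    _fields_ = [("biSize", wintypes.DWORD), ("biWidth", wintypes.LONG),
                ("biHeight", wintypes.LONG), ("biPlanes", wintypes.WORD),
                ("biBitCount", wintypes.WORD), ("biCompression", wintypes.DWORD),
                ("biSizeImage", wintypes.DWORD), ("biXPelsPerMeter", wintypes.LONG),
                ("biYPelsPerMeter", wintypes.LONG), ("biClrUsed", wintypes.DWORD),
                ("biClrImportant", wintypes.DWORD)]

def capture_window(hwnd):
    """Capture a window's client area with PrintWindow and return raw BGRA bytes
    plus (width, height). Error handling is omitted for brevity."""
    rect = wintypes.RECT()
    user32.GetClientRect(hwnd, ctypes.byref(rect))
    width, height = rect.right - rect.left, rect.bottom - rect.top

    hdc_window = user32.GetDC(hwnd)
    hdc_mem = gdi32.CreateCompatibleDC(hdc_window)
    hbitmap = gdi32.CreateCompatibleBitmap(hdc_window, width, height)
    old = gdi32.SelectObject(hdc_mem, hbitmap)

    user32.PrintWindow(hwnd, hdc_mem, PW_RENDERFULLCONTENT)
    gdi32.SelectObject(hdc_mem, old)     # deselect the bitmap before reading its bits

    # Request a 32-bit top-down DIB (negative height) in BGRA order.
    bmi = BITMAPINFOHEADER(ctypes.sizeof(BITMAPINFOHEADER), width, -height,
                           1, 32, 0, 0, 0, 0, 0, 0)
    buffer = ctypes.create_string_buffer(width * height * 4)
    gdi32.GetDIBits(hdc_mem, hbitmap, 0, height, buffer, ctypes.byref(bmi), 0)

    gdi32.DeleteObject(hbitmap)
    gdi32.DeleteDC(hdc_mem)
    user32.ReleaseDC(hwnd, hdc_window)
    return bytes(buffer), (width, height)
```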
The virtualized desktop server may also listen for TCP/IP messages describing keystrokes, mouse clicks, mouse moves, and requests to open/close applications, to provide interactivity with the images displayed to the client. The server converts these messages to operating system inputs and sends them to the associated applications.
The virtualized desktop client can operate on an edge device (e.g., a headset, or a computer which provides rendering for the headset over HDMI/VGA/DVI/etc.). In this example, the client sends requests to the server to open and interact with applications. At a rate of 20-30 Hz in this example, the client sends requests to the virtualized desktop server for an up-to-date rendering of all open applications. A rate of 1-60 Hz can also be used for a client to send requests to the server for an up-to-date rendering of all open applications. The rate can be based on a quality-of-service prioritization with respect to a user's field of view (e.g., based on what the user is currently viewing through a headset). So as not to interrupt the rendering loop, the client decompresses and/or decodes the images asynchronously into bitmaps representing pixels. As soon as decompression/decoding completes, the rendering loop sends the associated bitmaps to the GPU to be rendered on the next frame in the VR/AR device.
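The quality-of-service prioritization could be as simple as the following rule of thumb; the thresholds and inputs are assumptions for illustration, not the disclosed policy:

```python
def choose_refresh_rate(window_in_view, is_high_fidelity, base_hz=20, max_hz=60, min_hz=1):
    """Pick a per-window request rate (1-60 Hz): windows outside the user's field of
    view refresh slowly, ordinary in-view windows at a base rate, video-like windows fastest."""
    if not window_in_view:
        return min_hz
    return max_hz if is_high_fidelity else base_hz
```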
In this example, a full-screen capture technique is used for rendering the entire desktop in VR or AR, and the WM_Print methods, which are used for application sharing in a non-VR/AR context, are used to capture an entire rendered screen. Additionally, real-time transfer of multiple applications from a server to an edge device is achieved for rendering multiple applications at the same time in virtual and augmented reality.
A system and method can be configured as described herein so as not to render the entire desktop as a one-to-one pixel map, which would be transferred almost entirely in memory as a single application rather than through a client-server model.
As another example of the wide variations of the systems and methods disclosed herein, the systems and methods can be utilized for such uses as office work, use by analysts, armored vehicle operators, submariners, pilots who have limited display space, etc.
Additionally, the systems' and methods' data (e.g., associations, mappings, data input, data output, intermediate data results, final data results, etc.) may be stored and implemented in one or more different types of computer-implemented data stores, such as different types of storage devices (e.g., memory) and programming constructs (e.g., RAM, ROM, Flash memory, flat files, databases, programming data structures, programming variables, IF-THEN (or similar type) statement constructs, etc.). It is noted that data structures describe formats for use in organizing and storing data in databases, programs, memory, or other computer-readable media for use by a computer program.
Still further, the systems and methods may be provided on many different types of computer-readable storage media including computer storage mechanisms (e.g., non-transitory media, such as CD-ROM, diskette, RAM, flash memory, computer's hard drive, etc.) that contain instructions (e.g., software) for use in execution by a processor to perform the methods' operations and implement the systems described herein.