A wireless communication link can be used to send a video stream from a computer (or other device) to a virtual reality (VR) headset (or head-mounted display (HMD)). Transmitting the VR video stream wirelessly eliminates the need for a cable connection between the computer and the user wearing the HMD, thus allowing for unrestricted movement by the user. A traditional cable connection between a computer and HMD typically includes one or more data cables and one or more power cables. Allowing the user to move around without a cable tether and without having to be cognizant of avoiding the cable creates a more immersive VR system. Sending the VR video stream wirelessly also allows the VR system to be utilized in a wider range of applications than previously possible.
Wireless VR video streaming applications typically have high resolution and high frame rates, which equate to high data rates. However, the link quality of the wireless link over which the VR video is streamed has capacity characteristics that can vary from system to system and fluctuate due to changes in the environment (e.g., obstructions, other transmitters, radio frequency (RF) noise). The VR video content is often rendered using a stereoscopic technique. As used herein, the term “stereoscopic” is defined as creating or enhancing the illusion of depth in a frame by displaying the same scene at slightly different angles such that when the two scenes are viewed together by the left and right eyes, the scene gains an impression of depth and solidity. It can be challenging to compress VR video for transmission over a low-bandwidth wireless link while minimizing any perceived reduction in video quality by the end user.
The advantages of the methods and mechanisms described herein may be better understood by referring to the following description in conjunction with the accompanying drawings, in which:
In the following description, numerous specific details are set forth to provide a thorough understanding of the methods and mechanisms presented herein. However, one having ordinary skill in the art should recognize that the various implementations may be practiced without these specific details. In some instances, well-known structures, components, signals, computer program instructions, and techniques have not been shown in detail to avoid obscuring the approaches described herein. It will be appreciated that for simplicity and clarity of illustration, elements shown in the figures have not necessarily been drawn to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements.
Various systems, apparatuses, methods, and computer-readable mediums for implementing stereoscopic interleaved compression techniques (generally referred to herein as an “interleaved transmission scheme”) are disclosed herein. In one implementation, a system includes a transmitter sending a video stream over a wireless link to a receiver. The transmitter compresses frames of the video stream prior to sending the frames to the receiver. For each pair of frames, the transmitter encodes a left-half of a first frame of the pair with an amount of compression less than a first threshold and encodes a right-half of the first frame with an amount of compression greater than a second threshold. In various embodiments, the pair of frames are consecutive frames of the video. For a second frame of the pair, the transmitter encodes a right-half of the second frame with an amount of compression less than the first threshold and encodes a left-half of the second frame with an amount of compression greater than the second threshold. The transmitter conveys encoded half-frames and indications of an amount of compression for each half-frame to a receiver. The receiver receives, decodes, and drives the encoded half-frames to a display.
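The per-pair compression assignment described above can be sketched as follows. This is a minimal illustration, not an implementation from the disclosure: the `EncodedHalf` structure and the "low"/"high" labels (standing in for compression amounts below the first threshold and above the second threshold, respectively) are hypothetical.

```python
# Illustrative sketch of the interleaved transmission scheme: within each
# pair of consecutive frames, one half is encoded with low compression and
# the other with high compression, and the favored halves swap on the next
# frame. The data structures here are assumptions for illustration only.

from dataclasses import dataclass

@dataclass
class EncodedHalf:
    side: str          # "left" or "right"
    compression: str   # "low" (below first threshold) or "high" (above second)

def assign_compression(frame_index: int):
    """Return the (left, right) compression assignment for one frame."""
    if frame_index % 2 == 0:
        # First frame of the pair: left half favored.
        return [EncodedHalf("left", "low"), EncodedHalf("right", "high")]
    # Second frame of the pair: right half favored.
    return [EncodedHalf("left", "high"), EncodedHalf("right", "low")]

def encode_stream(num_frames: int):
    return [assign_compression(i) for i in range(num_frames)]
```

The indications of the amount of compression applied to each half-frame (the `compression` field here) would accompany the encoded halves so the receiver can decode them correctly.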
Referring now to
Wireless communication devices that operate within extremely high frequency (EHF) bands, such as the 60 GHz frequency band, are able to transmit and receive signals using relatively small antennas. However, such signals are subject to high atmospheric attenuation when compared to transmissions over lower frequency bands. In order to reduce the impact of such attenuation and boost communication range, EHF devices typically incorporate beamforming technology. For example, the IEEE 802.11ad specification details a beamforming training procedure, also referred to as sector-level sweep (SLS), during which a wireless station tests and negotiates the best transmit and/or receive antenna combinations with a remote station. In various implementations, transmitter 105 and receiver 110 perform periodic beamforming training procedures to determine the optimal transmit and receive antenna combinations for wireless data transmission.
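The sector-level sweep described above can be summarized as an exhaustive search over transmit and receive sector combinations. The sketch below is a simplification under stated assumptions: `measure_link_quality` stands in for real per-sector signal measurements, and the actual 802.11ad procedure involves a structured frame exchange between the two stations rather than a single local loop.

```python
# Simplified illustration of sector-level sweep (SLS) beamforming training:
# test candidate transmit/receive sector pairs and keep the best-performing
# combination. measure_link_quality is a hypothetical stand-in for real
# signal measurements (e.g., SNR in dB).

def sector_level_sweep(num_tx_sectors, num_rx_sectors, measure_link_quality):
    """Test sector pairs and return the best (tx, rx) combination."""
    best_pair, best_quality = None, float("-inf")
    for tx in range(num_tx_sectors):
        for rx in range(num_rx_sectors):
            q = measure_link_quality(tx, rx)
            if q > best_quality:
                best_pair, best_quality = (tx, rx), q
    return best_pair, best_quality
```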
In one implementation, transmitter 105 and receiver 110 have directional transmission and reception capabilities, and the exchange of communications over the link utilizes directional transmission and reception. Each directional transmission is a transmission that is beamformed so as to be directed towards a selected transmit sector of antenna 140. Similarly, directional reception is performed using antenna settings optimized for receiving incoming transmissions from a selected receive sector of antenna 160. The link quality can vary depending on the transmit sectors selected for transmissions and the receive sectors selected for receptions. The transmit sectors and receive sectors which are selected are determined by system 100 performing a beamforming training procedure.
Transmitter 105 and receiver 110 are representative of any type of communication devices and/or computing devices. For example, in various implementations, transmitter 105 and/or receiver 110 can be a mobile phone, tablet, computer, server, head-mounted display (HMD), television, another type of display, router, or other types of computing or communication devices. In one implementation, system 100 executes a virtual reality (VR) application for wirelessly transmitting frames of a rendered virtual environment from transmitter 105 to receiver 110. In other implementations, other types of applications can be implemented by system 100 that take advantage of the methods and mechanisms described herein.
In one implementation, transmitter 105 includes at least radio frequency (RF) transceiver module 125 including an interface configured to transmit data, processor 130, memory 135, and antenna 140. RF transceiver module 125 transmits and receives RF signals. In one implementation, RF transceiver module 125 is a mm-wave transceiver module operable to wirelessly transmit and receive signals over one or more channels in the 60 GHz band. RF transceiver module 125 converts baseband signals into RF signals for wireless transmission, and RF transceiver module 125 converts RF signals into baseband signals for the extraction of data by transmitter 105. It is noted that RF transceiver module 125 is shown as a single unit for illustrative purposes. It should be understood that RF transceiver module 125 can be implemented with any number of different units (e.g., chips) depending on the implementation. Similarly, processor 130 and memory 135 are representative of any number and type of processors and memory devices, respectively, that are implemented as part of transmitter 105. In one implementation, processor 130 includes encoder 132 to encode (i.e., compress) a video stream prior to transmitting the video stream to receiver 110. In other implementations, encoder 132 is implemented separately from processor 130. In various implementations, encoder 132 is implemented using any suitable combination of hardware and/or software.
Transmitter 105 also includes antenna 140 for transmitting and receiving RF signals. Antenna 140 represents one or more antennas, such as a phased array, a single element antenna, a set of switched beam antennas, etc., that can be configured to change the directionality of the transmission and reception of radio signals. As an example, antenna 140 includes one or more antenna arrays, where the amplitude or phase for each antenna within an antenna array can be configured independently of other antennas within the array. Although antenna 140 is shown as being external to transmitter 105, it should be understood that antenna 140 can be included internally within transmitter 105 in various implementations. Additionally, it should be understood that transmitter 105 can also include any number of other components which are not shown to avoid obscuring the figure. Similar to transmitter 105, the components implemented within receiver 110 include at least RF transceiver module 145, processor 150, decoder 152, memory 155, and antenna 160, which are analogous to the components described above for transmitter 105. It should be understood that receiver 110 can also include or be coupled to other components (e.g., a display).
Turning now to
Computer 210 and HMD 220 each include circuitry and/or components to communicate wirelessly. It is noted that while computer 210 is shown as having an external antenna, this is shown merely to illustrate that the video data is being sent wirelessly. It should be understood that computer 210 can have an antenna which is internal to the external case of computer 210. Additionally, while computer 210 can be powered using a wired power connection, HMD 220 is typically battery powered. Alternatively, computer 210 can be a laptop computer (or another type of device) powered by a battery.
In one implementation, computer 210 includes circuitry which dynamically renders a representation of a VR environment to be presented to a user wearing HMD 220. For example, in one implementation, computer 210 includes one or more graphics processing units (GPUs) executing program instructions so as to render a VR environment. In other implementations, computer 210 includes other types of processors, including a central processing unit (CPU), application specific integrated circuit (ASIC), field programmable gate array (FPGA), digital signal processor (DSP), or other processor types. HMD 220 includes circuitry to receive and decode a compressed bit stream sent by computer 210 to generate frames of the rendered VR environment. HMD 220 then drives the generated frames to the display integrated within HMD 220.
In one implementation, computer 210 encodes the frames of the rendered VR environment in a manner that results in a decreased data-rate for the resultant encoded bitstream. For example, in one implementation, computer 210 encodes the rendered VR environment by interleaving half-frames of the rendered VR environment while dropping (or “discarding”) half a frame of each frame of the rendered VR environment. In one embodiment, dropping half a frame means content that would normally be included in that half of the frame (e.g., the video or image content) is not included and is replaced with other content. For example, for a first frame, computer 210 encodes the scene 225L on the left-side for transmission while dropping the scene 225R on the right-side. In this manner, little or no content of the original right-side of the frame is retained in the encoded frame that is transmitted to the receiver (e.g., the HMD 220). For the second frame, computer 210 encodes the scene 225R on the right-side of HMD 220 while dropping the scene 225L on the left-side of HMD 220. Computer 210 follows this pattern for subsequent frames of the rendered VR environment, alternating which half of the screen gets dropped. Computer 210 then sends the encoded half-frames and indications of which half-frames were dropped to HMD 220. HMD 220 decodes the encoded bitstream and reconstructs each frame by displaying one half-frame on the screen in the normal fashion and displaying all black or some other suitable representation on the screen for the dropped half-frame. In one implementation, computer 210 increases the resolution for the encoded half-frame using the savings from not sending the dropped half-frame. This will result in a higher quality image at HMD 220 than would otherwise be possible. 
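The receiver-side reconstruction described above (placing the decoded half-frame normally and filling the dropped half with black or another representation) can be sketched as below. Frames are modeled as simple grids of pixel values; actual bitstream decoding is omitted, and the function name and signature are illustrative rather than from the disclosure.

```python
# Minimal sketch of HMD-side frame reconstruction: the transmitted half is
# placed in its normal position and the dropped half is filled with a
# constant value (black by default). half_pixels is the decoded half-frame
# as a list of rows, each (width // 2) pixels wide.

def reconstruct_frame(half_pixels, width, height, dropped_side, fill=0):
    """Rebuild a full frame from one decoded half; fill the dropped half."""
    half_w = width // 2
    frame = [[fill] * width for _ in range(height)]
    # If the left half was dropped, the kept half belongs on the right.
    x0 = half_w if dropped_side == "left" else 0
    for y in range(height):
        for x in range(half_w):
            frame[y][x0 + x] = half_pixels[y][x]
    return frame
```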
As an alternative to dropping half of the frame, one half of the frame may be encoded as a constant value or otherwise encoded in a manner that consumes relatively little bandwidth when transmitted (e.g., one half of the frame may be more highly compressed than the other half). As used herein, a dropped portion (e.g., half in one embodiment) of a frame, more compressed portion of a frame, or any other approach that results in one portion of a frame being generated or encoded so that it will consume less bandwidth on transmission is referred to as having been encoded with a lower resolution than that of the other portion of the frame. Consequently, the other portion of the frame is deemed to have been encoded with a higher resolution than the lower-resolution portion.
Referring now to
In one implementation, if a first condition is detected, then video stream 301 is encoded using an interleaved transmission scheme by interleaving half-frames rather than encoding and transmitting the entire frame. For example, one half of each frame is encoded while the other half is dropped and then the halves are swapped on the subsequent frame. In other words, the left-half of a first frame is encoded while the right-half of the first frame is dropped, and then the left-half of a second frame is dropped while the right-half of the second frame is encoded. In this example, the second frame follows immediately after the first frame in the video sequence. This pattern continues for subsequent pairs of frames, such that the left-half of the frame is encoded and displayed every other frame and the right-half of the frame is encoded and displayed every other frame, with only one half of the frame active for a given frame. It is noted that the half-frame that is displayed can be referred to herein as the “active half” while the half-frame that is dropped can be referred to herein as the “inactive half”.
In one implementation, the first condition is the frame rate being greater than a threshold. In another implementation, the first condition is the bandwidth requirements of the encoded video stream exceeding the link capacity of the wireless link. In a further implementation, the first condition is detecting a request to interleave half-frames by an application or a user. The dropped half of the frame can be presented to the user in a variety of ways, with the presentation technique varying according to the implementation. For example, in one implementation, the dropped half of the frame is displayed as a black half-screen. In another implementation, the dropped half of the screen is displayed as an average of the other half of the current frame or the same half of the previous frame. In a further implementation, a transfer function is applied to the intensity and/or brightness of the active half in order to generate the inactive half. In a still further implementation, the corresponding half of the previously received frame is replayed in place of the dropped half of the current frame. In other implementations, the dropped half of the frame is displayed in other suitable manners. Based on the way the human visual system works, the eyes and brain will combine the presented half-frame with the dropped half-frame in a non-conflicting manner such that the user will not notice the dropped half-frame. By sending only half of a frame, the bandwidth required to send the video stream is reduced by half.
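The inactive-half presentation options listed above can each be expressed as a small function of the decoded active half (and, for the replay option, the corresponding half of the previous frame). Halves are modeled here as flat lists of pixel intensities; the function names and signatures are illustrative assumptions, not from the disclosure.

```python
# Sketches of the different ways the dropped (inactive) half can be
# presented, per the implementations described above.

def inactive_black(active_half):
    # Display the dropped half as a black half-screen.
    return [0] * len(active_half)

def inactive_average(active_half):
    # Display the dropped half as an average of the active half.
    avg = sum(active_half) // len(active_half)
    return [avg] * len(active_half)

def inactive_replay(previous_same_half):
    # Replay the same half of the previously received frame.
    return list(previous_same_half)

def inactive_transfer(active_half, scale=0.5):
    # Apply a simple intensity transfer function to the active half
    # (the actual transfer function would vary by implementation).
    return [int(p * scale) for p in active_half]
```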
Turning now to
An encoder receives a plurality of frames of a video sequence to encode (block 405). In one implementation, the encoder is part of a transmitter or coupled to a transmitter. The transmitter can be any type of computing device, with the type of computing device varying according to the implementation. In one implementation, the transmitter renders frames of a video stream as part of a virtual reality (VR) environment. In other implementations, the video stream is generated for other environments. In one implementation, the encoder and the transmitter are part of a wireless VR system. In other implementations, the encoder and the transmitter are included in other types of systems. In one implementation, the encoder and the transmitter are integrated together into a single device. In other implementations, the encoder and the transmitter are located in separate devices.
For every pair of frames, the encoder encodes (i.e., compresses) a first frame of the pair by encoding the left-half portion of the first frame and dropping the right-half portion of the first frame (block 410). In one implementation, the left-half portion refers to the half of the frame that will be presented to the left eye while the right-half portion refers to the half of the frame that will be presented to the right eye. Next, the encoder encodes a second frame of the pair by encoding the right-half frame and dropping the left-half frame (block 415). Then, the encoder sends each encoded frame to a receiver to be displayed (block 420). Next, the receiver decodes each encoded half-portion and displays the decoded version of the encoded half-portion while displaying any of various representations for the dropped half-frame (block 425). For example, in one implementation, the receiver displays all black pixels for the dropped half-frame. Next, if there are more frames to encode (conditional block 430, “yes” leg), then method 400 returns to block 410. Otherwise, if there are no more frames to encode (conditional block 430, “no” leg), then method 400 ends.
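The encoder-side loop of blocks 410 and 415 can be sketched as follows. `encode_half` is a hypothetical placeholder for a real compressor, and frames are modeled as (left_half, right_half) tuples; these names are assumptions for illustration, not elements of the disclosure.

```python
# Sketch of the per-pair encoding loop: for the first frame of each pair,
# keep the left half and drop the right; for the second frame, keep the
# right half and drop the left.

def encode_half(half):
    return ("encoded", half)   # stand-in for real compression

def encode_interleaved(frames):
    out = []
    for i, (left, right) in enumerate(frames):
        if i % 2 == 0:
            # First frame of the pair (block 410): keep left, drop right.
            out.append({"kept": "left", "data": encode_half(left)})
        else:
            # Second frame of the pair (block 415): keep right, drop left.
            out.append({"kept": "right", "data": encode_half(right)})
    return out
```

The `"kept"` field plays the role of the indication sent to the receiver of which half-frame was dropped.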
It is noted that in other implementations, the encoder can repeat this pattern at other intervals rather than every pair of frames. For example, in another implementation, the encoder could drop the left half-frame once every three frames and drop the right half-frame once every three frames. In another implementation, the encoder could drop the left half-frame once every four frames and drop the right half-frame once every four frames. In other implementations, the encoder could drop half-frames at other frequencies, such as twice every five frames, three times every eight frames, or any other desired scheduling pattern.
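The alternative intervals mentioned above can be expressed as a repeating per-frame schedule of which half (if any) to drop. This is a minimal sketch; the pattern entries and the helper are illustrative assumptions, with "L" dropping the left half, "R" dropping the right half, and None keeping the full frame.

```python
# Sketch of generalized drop scheduling: cycle through a repeating pattern
# of which half to drop on each frame.

def drop_schedule(pattern, num_frames):
    """Return the half to drop for each frame, cycling through the pattern."""
    return [pattern[i % len(pattern)] for i in range(num_frames)]

# Every pair of frames: alternate dropped halves (the default scheme).
pair_pattern = ["R", "L"]
# Each half dropped once every three frames: drop R, drop L, keep full frame.
thirds_pattern = ["R", "L", None]
```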
Referring now to
Turning now to
If the first condition has been detected (conditional block 610, “yes” leg), then the transmitter interleaves half-frames of the video stream when generating the encoded bitstream (block 615). Examples of different ways of interleaving half-frames of the video stream when generating the encoded bitstream are described in methods 400 and 500 (of
In various implementations, program instructions of a software application are used to implement the methods and/or mechanisms described herein. For example, program instructions executable by a general or special purpose processor are contemplated. In various implementations, such program instructions can be represented by a high-level programming language. In other implementations, the program instructions can be compiled from a high-level programming language to a binary, intermediate, or other form. Alternatively, program instructions can be written that describe the behavior or design of hardware. Such program instructions can be represented by a high-level programming language, such as C. Alternatively, a hardware design language (HDL) such as Verilog can be used.
In various implementations, the program instructions are stored on any of a variety of non-transitory computer readable storage mediums. The storage medium is accessible by a computing system during use to provide the program instructions to the computing system for program execution. Generally speaking, such a computing system includes at least one or more memories and one or more processors configured to execute program instructions.
It should be emphasized that the above-described implementations are only non-limiting examples of implementations. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.