This disclosure generally relates to streaming and playback of video, and more particularly to optimization of video encoding and decoding methods in video encoders and decoders for dynamic adaptive streaming of video.
In addition to the more traditional televisions and projector-based systems connected to Internet-provider networks at the home, many playback devices today are mobile devices, such as tablets, smartphones, laptops, and the like, which are usually connected to a network over an unreliable wireless connection with widely variable network conditions. Transmitting high-quality video over the network poses a great challenge. To cope with this problem, a solution called adaptive bitrate streaming has been used. A video presentation encoded for adaptive streaming is conventionally split into parts. Each part contains a certain number of frames and each part can be decoded independently. Each of these parts (or period of frames) is encoded in several versions where each version uses a different encoding bitrate. Depending on the varying bitrate available during streaming in the transmission channel, an adaption algorithm is used to decide which version of each period of frames should be transmitted and decoded according to variations in channel conditions.
The quality of the video in a period of frames generally increases with an increasing encoding bitrate. However, the reconstruction quality of different periods of frames encoded at the same bitrate is not constant and varies depending on the content of the video encoded within the period of frames. In the prior art approach for adaptive streaming, a streaming adaption algorithm optimizes quality by selecting each period at the highest possible bitrate allowed by the channel conditions. In this case, the perceived quality of the video over time may vary significantly depending on the video content, even when the periods of frames are encoded at the same or even higher bitrates. This behavior over time is undesirable.
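The prior art behavior described above can be illustrated with a minimal sketch. The ladder values, the quality scale, and the function name `greedy_select` are hypothetical and chosen only for illustration; they do not come from this disclosure.

```python
from typing import Dict, List

# Hypothetical ladder type: encoding bitrate in kbit/s -> quality score.
Ladder = Dict[int, float]


def greedy_select(ladders: List[Ladder], channel_kbps: List[int]) -> List[int]:
    """For each period, pick the highest encoded bitrate the channel allows."""
    chosen = []
    for ladder, capacity in zip(ladders, channel_kbps):
        feasible = [bitrate for bitrate in ladder if bitrate <= capacity]
        # Fall back to the lowest version when no version fits the channel.
        chosen.append(max(feasible) if feasible else min(ladder))
    return chosen


# Two periods with made-up quality scores: the second period has content
# that is harder to encode, so the same bitrate yields a lower quality.
ladders = [
    {1000: 80.0, 2000: 90.0, 4000: 95.0},
    {1000: 55.0, 2000: 70.0, 4000: 85.0},
]
channel = [2500, 2500]  # assumed constant channel capacity in kbit/s

picked = greedy_select(ladders, channel)
qualities = [ladders[i][bitrate] for i, bitrate in enumerate(picked)]
print(picked, qualities)
```

With these assumed numbers, both periods are sent at 2000 kbit/s, yet the quality scores differ (90.0 versus 70.0), which is the variation in perceived quality that the paragraph above calls undesirable.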
According to various embodiments, a method and system for streaming video presentations is provided. According to one embodiment, a method is provided for optimizing buffering of periods of frames of a streaming video presentation while minimizing variation in perceptual quality of the video presentation. In this embodiment, the method comprises buffering a plurality of periods of frames of a video presentation for transmission in a stream. Each period of frames in the plurality of periods of frames includes a metadata portion with metadata descriptive of an expected visual quality of the period of frames and a set of following periods of frames. The method further comprises analyzing the metadata in a current period of frames to determine a first transmission bitrate for the current period of frames and a second transmission bitrate for a period of frames in the set of following periods of frames. In this embodiment, the first transmission bitrate and the second transmission bitrate are selected to maintain a substantially uniform visual quality based on the expected visual quality.
According to this embodiment, the method also includes transmitting the current period of frames at the first transmission bitrate and transmitting the period of frames in the set of following periods of frames at the second transmission bitrate. In this embodiment, the first transmission bitrate is different than the second transmission bitrate and at least one of the first transmission bitrate or the second transmission bitrate is lower than a highest bitrate that would be achievable given a current set of channel conditions.
According to another embodiment, a system is provided with a buffer configured to buffer a plurality of periods of frames of a video presentation for transmission in a stream. In this embodiment, each period of frames in the plurality of periods of frames includes a metadata portion with metadata descriptive of an expected visual quality of the period of frames and a set of following periods of frames. The system further includes a processor configured for controlling transmissions out of the buffer and to analyze the metadata in a current period of frames to determine a first transmission bitrate for the current period of frames and a second transmission bitrate for a period of frames in the set of following periods of frames. In this embodiment, the first transmission bitrate and the second transmission bitrate are selected to maintain a substantially uniform visual quality based on the expected visual quality. The system further includes a network interface for streaming the video presentation and that is configured to transmit the current period of frames at the first transmission bitrate and to transmit the period of frames in the set of following periods of frames at the second transmission bitrate.
In this embodiment, the first transmission bitrate is different than the second transmission bitrate and at least one of the first transmission bitrate or the second transmission bitrate is lower than a highest bitrate that would be achievable given a current set of channel conditions.
According to embodiments, the metadata portion may be signaled within a video bitstream at a beginning of each period of frames.
In embodiments, the metadata includes a quality indicator calculated from one or more of the quality metrics consisting of PSNR, SSIM, and VMAF.
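As an illustration of one of the named metrics, the following sketch computes PSNR between a reference and a decoded sequence of 8-bit samples. How a PSNR, SSIM, or VMAF value is mapped to the signaled quality indicator is not specified here, so the function `psnr` and its flat-sequence input are assumptions made only for this example.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], decoded: Sequence[int], peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized sample sequences."""
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy example: two samples, each off by 10 code values (MSE = 100).
example_db = psnr([100, 100], [110, 90])
print(round(example_db, 2))
```

A per-period indicator could then be derived, for instance, by averaging such per-frame values over the period of frames; that aggregation is likewise an assumption rather than something stated in this description.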
According to other aspects of some embodiments, each period of frames is represented in a plurality of bitrate versions. In these embodiments, the metadata portion may be signaled within a video bitstream at a beginning of each version of each period of frames.
According to these embodiments, a method may also include determining the current set of channel conditions, determining the highest bitrate that would be achievable for the current set of channel conditions, and determining a version of the current period of frames to be transmitted and decoded according to the current set of channel conditions. In such embodiments, the analyzing of the metadata in the current period of frames is based on the version of the current period of frames to be transmitted. Similarly, in systems according to these embodiments, the processor may be configured to determine the current set of channel conditions, the highest bitrate that would be achievable for the current set of channel conditions, and a version of the current period of frames to be transmitted and decoded according to the current set of channel conditions. The processor may also be configured to analyze the metadata in the current period of frames based on the version of the current period of frames to be transmitted.
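The three determinations above can be sketched as follows. This description does not say how channel conditions are measured, so the harmonic mean of recent throughput samples, the 0.9 safety factor, and the names `estimate_achievable_kbps` and `pick_version` are all assumptions made for illustration.

```python
from typing import List


def estimate_achievable_kbps(samples_kbps: List[float], safety: float = 0.9) -> float:
    """Assumed estimator: harmonic mean of recent throughput samples, scaled down."""
    if not samples_kbps:
        raise ValueError("at least one throughput sample is required")
    harmonic = len(samples_kbps) / sum(1.0 / s for s in samples_kbps)
    return safety * harmonic


def pick_version(version_kbps: List[int], achievable_kbps: float) -> int:
    """Index of the highest-bitrate version that fits, else the lowest-bitrate one."""
    indices = range(len(version_kbps))
    fitting = [i for i in indices if version_kbps[i] <= achievable_kbps]
    if fitting:
        return max(fitting, key=lambda i: version_kbps[i])
    return min(indices, key=lambda i: version_kbps[i])


# Hypothetical measurements and bitrate versions, both in kbit/s.
achievable = estimate_achievable_kbps([2000.0, 4000.0])
version_index = pick_version([1000, 2000, 4000], achievable)
print(round(achievable), version_index)
```

The selected version identifies which copy of the metadata portion is read, since the metadata may be signaled at the beginning of each version of each period of frames.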
Non-transitory computer-readable media are also provided containing computer program code, which can be executed by a computer processor for performing any or all of the steps, operations, or processes described.
The following description describes certain embodiments by way of illustration only. One of ordinary skill in the art will readily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles described herein. Reference will now be made in detail to several embodiments.
The above and other needs are met by the disclosed methods, a non-transitory computer-readable storage medium storing executable code, and systems for streaming and playing back video content.
To address the problem identified above, in one embodiment, a streaming adaption algorithm in a video streaming system optimizes the buffering of periods of frames of a video presentation in order to achieve a more constant perceptual quality throughout the entire video presentation. For this, the adaption algorithm chooses to transmit some periods at a lower bitrate than the channel conditions may allow while transmitting other periods at a higher bitrate in order to optimize the bitrate and the expected perceptual quality of each version of each period over time. In one embodiment, the adaption algorithm can then optimize the overall viewing experience for the entire video presentation or stream.
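One way to picture such an adaption algorithm is the brute-force sketch below. It is not the disclosed method itself: the objective (smallest quality spread, then highest total quality), the equal-duration assumption, the budget rule, and the name `smooth_select` are all assumptions made for illustration.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical ladder type: bitrate in kbit/s -> expected quality indicator.
Ladder = Dict[int, float]


def smooth_select(ladders: List[Ladder], channel_kbps: float) -> Tuple[int, ...]:
    """Choose one bitrate per period so that expected quality is as uniform as possible.

    Assumes equal-duration periods and enough buffering that only the average
    bitrate over the window must stay within the channel capacity.
    """
    budget = channel_kbps * len(ladders)
    best, best_key = None, None
    for combo in product(*(sorted(ladder) for ladder in ladders)):
        if sum(combo) > budget:
            continue  # this combination would exceed the window budget
        quality = [ladder[b] for ladder, b in zip(ladders, combo)]
        # Prefer the smallest quality spread, then the highest total quality.
        key = (max(quality) - min(quality), -sum(quality))
        if best_key is None or key < best_key:
            best, best_key = combo, key
    if best is None:
        # Nothing fits the budget: fall back to the lowest version everywhere.
        best = tuple(min(ladder) for ladder in ladders)
    return best


# Made-up window: an easy-to-encode period followed by a hard one.
window = [
    {1000: 80.0, 2000: 90.0, 4000: 95.0},
    {1000: 55.0, 2000: 70.0, 4000: 85.0},
]
plan = smooth_select(window, channel_kbps=2500)
print(plan)
```

With these assumed numbers the sketch sends the easy period at 1000 kbit/s, below what the channel allows, and spends the saved bits on the hard period at 4000 kbit/s, so the expected quality scores are 80.0 and 85.0. Ranking purely by spread can favor uniformly low quality in other inputs; a practical algorithm would likely weigh spread against average quality, and the exhaustive search is only reasonable for short windows.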
Referring to
In order to perform the proposed adaption, in one embodiment, the adaption algorithm uses information descriptive of the expected visual quality of each bitrate version for the current period and within a set of following or future periods. These quality indicators are signaled within the video bitstream at the beginning of each version of each period. First, the ID of the current representation, the length of the period in frames, and the number of different representations are signaled. The quality indicator may then be signaled for every representation of the current period. Finally, the quality indicators for a selected number of subsequent periods are also signaled and considered by the adaption algorithm to achieve a more uniform presentation quality.
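The signaling order described above can be sketched as a simple byte layout. The field widths, the byte order, the explicit count of following periods, and the names `PeriodMetadata`, `pack`, and `unpack` are hypothetical; the actual syntax elements are not reproduced here.

```python
import struct
from dataclasses import dataclass
from typing import List

# Hypothetical header: representation ID (u8), period length in frames (u16),
# number of representations (u8), number of following periods (u8), big-endian.
_HEADER = struct.Struct(">BHBB")


@dataclass
class PeriodMetadata:
    representation_id: int
    period_length_frames: int
    current_quality: List[int]          # one indicator (0-255) per representation
    following_quality: List[List[int]]  # indexed as [following period][representation]


def pack(meta: PeriodMetadata) -> bytes:
    """Serialize the metadata in the order: header, current period, following periods."""
    n = len(meta.current_quality)
    if any(len(row) != n for row in meta.following_quality):
        raise ValueError("every period needs one indicator per representation")
    out = _HEADER.pack(meta.representation_id, meta.period_length_frames,
                       n, len(meta.following_quality))
    out += bytes(meta.current_quality)
    for row in meta.following_quality:
        out += bytes(row)
    return out


def unpack(data: bytes) -> PeriodMetadata:
    """Parse the layout written by pack()."""
    rep_id, length, n, k = _HEADER.unpack_from(data, 0)
    if len(data) < _HEADER.size + n * (1 + k):
        raise ValueError("metadata portion is truncated")
    pos = _HEADER.size
    current = list(data[pos:pos + n])
    pos += n
    following = []
    for _ in range(k):
        following.append(list(data[pos:pos + n]))
        pos += n
    return PeriodMetadata(rep_id, length, current, following)


# Round trip with made-up values: 3 representations, 2 following periods.
meta = PeriodMetadata(1, 48, [80, 90, 95], [[55, 70, 85], [60, 75, 88]])
blob = pack(meta)
print(len(blob), unpack(blob) == meta)
```

Carrying the indicators for every representation at the start of each version lets a receiver or server that holds only one version still see the expected quality of the alternatives for the current and following periods.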
According to this embodiment, the method also includes transmitting 204 the current period of frames at the first transmission bitrate. For example, the current period of frames may be transmitted at a bitrate that is below the maximum bitrate achievable under current channel conditions. The method also includes transmitting 205 the period of frames in the set of following periods of frames at the second transmission bitrate, which in one embodiment may be at the maximum bitrate for the current channel conditions. In one embodiment, the first transmission bitrate is different than the second transmission bitrate and at least one of the first transmission bitrate or the second transmission bitrate is lower than a highest bitrate that would be achievable given a current set of channel conditions.
According to one embodiment, the following metadata signaling elements may be used in the bitstream:
In this embodiment,
The foregoing description of the embodiments has been presented for the purpose of illustration; it is not intended to be exhaustive or to limit the patent rights to the precise forms disclosed. Persons skilled in the relevant art can appreciate that many modifications and variations are possible in light of the above disclosure.
Some portions of this description describe the embodiments in terms of algorithms and symbolic representations of operations on information. These algorithmic descriptions and representations are commonly used by those skilled in the data processing arts to convey the substance of their work effectively to others skilled in the art. These operations, while described functionally, computationally, or logically, are understood to be implemented by computer programs or equivalent electrical circuits, microcode, or the like. Furthermore, it has also proven convenient at times, to refer to these arrangements of operations as modules, without loss of generality. The described operations and their associated modules may be embodied in software, firmware, hardware, or any combinations thereof.
Any of the steps, operations, or processes described herein may be performed or implemented with one or more hardware or software modules, alone or in combination with other devices. In one embodiment, a software module is implemented with a computer program product comprising a computer-readable medium containing computer program code, which can be executed by a computer processor for performing any or all of the steps, operations, or processes described.
Embodiments may also relate to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, and/or it may comprise a general-purpose computing device selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a non-transitory, tangible computer readable storage medium, or any type of media suitable for storing electronic instructions, which may be coupled to a computer system bus. Furthermore, any computing systems referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
Finally, the language used in the specification has been principally selected for readability and instructional purposes, and it may not have been selected to delineate or circumscribe the inventive subject matter. It is therefore intended that the scope of the patent rights be limited not by this detailed description, but rather by any claims that issue on an application based hereon. Accordingly, the disclosure of the embodiments is intended to be illustrative, but not limiting, of the scope of the patent rights.
This application claims the benefit of U.S. Provisional Application No. 62/587,184, filed Nov. 16, 2017 and which is incorporated herein by reference in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2018/061214 | 11/15/2018 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62587184 | Nov 2017 | US |