This invention relates to streaming of video content over the Internet, specifically relating to optimizations of client-side adaptive bitrate streaming players to maximize user quality-of-experience (QoE).
With more and more content providers delivering video stream services over the Internet, user-perceived quality-of-experience has become an important differentiator. The quality-of-experience metric includes duration of rebuffering, startup delay, average playback bitrate, and the stability of that bitrate. There is little to no support in the network for optimizing or controlling these characteristics, forcing the client player unit to cope with the intermittent congestion, diverse bottlenecks, and other complexities of the Internet.
Modern client video players use bitrate adaptation logic in order to achieve a high quality-of-experience. Many proprietary implementations of video players have been fielded, but the first adaptive bit-rate HTTP-based streaming solution that is an international standard is Dynamic Adaptive Streaming over HTTP (DASH), also known as MPEG-DASH. The logic that performs the MPEG-DASH bit-rate adaptation within the video player unit, while currently superior to non-bit-rate adaptive players, has not been thoroughly optimized for quality-of-experience.
MPEG-DASH works by breaking the video content into a sequence of small HTTP-based file segments, each segment containing a short interval of playback time of content that is potentially many hours in duration, such as a movie or the live broadcast of a sports event. The content is made available at a variety of different bit rates. In other words, alternative segments encoded at different bit rates and covering aligned short intervals of playback time are made available. In order to seamlessly adapt to changing network conditions and provide high-quality playback with fewer stalls or re-buffering events, the MPEG-DASH client selects the next segment to download and play back from the different bit rate alternatives based on either current network conditions or current playback buffer occupancy.
This invention offers an alternative method to choose the next segment from the different bit rate alternatives via model predictive control (MPC), a systematic combination of buffer occupancy and bandwidth predictions. This novel technique creates a video playback system whose performance is near optimal.
For purposes herein, a “video player” shall be defined as any device capable of streaming video from a network connection, including, for example, via WiFi, Bluetooth, a cellular data connection such as LTE, a hardwired connection or via any means of connecting to a server capable of serving video at mixed bitrates. Such devices include, but are not limited to, smart televisions, projectors, video streaming devices (AppleTV, ChromeCast®, Amazon Fire Stick, Roku™, etc.), video gaming systems, smart phones, tablets and software-based video players running on generic computing devices.
For a user to view video on the client-side video player, many components are required, including a video display screen, a video display subsystem with buffering, a networking interface, a processor of some sort to perform the networking functions and HTTP processing, and logic to perform the bitrate adaptation method described in detail below (implemented in an integrated circuit module and/or in software on a general-purpose processor). A component model of the adaptive video player is illustrated in
Video player 100 makes HTTP requests 102 to an internet-based video server 101, requesting video segments 104 at a specific bitrate R. As video segments 104 are received, they are placed in playback buffer 106. Buffer occupancy is determined by the difference between the rate at which video segments 104 are downloaded into playback buffer 106 and the rate at which video segments 104 are removed from playback buffer 106 for rendering on a video display screen.
Video can be modeled as a set of consecutive video segments or chunks, V={1, 2, . . . , K}, 104, each of which contains L seconds of video and is encoded at several different bitrates. Thus, the total length of the video is K×L seconds. The video player can choose to download video segment k with bitrate Rk∈R, where R is the set of all available bitrate levels. The amount of data in segment k is then L×Rk. The higher the selected bitrate, the higher the video quality perceived by the user. Let q(·): R→R+ be the function that maps the selected bitrate Rk to the video quality q(Rk) perceived by the user. The assumption is that q(·) is increasing.
The video segments are downloaded into a playback buffer, 106 as shown in
The buffer occupancy B(t) evolves as the chunks are being downloaded and the video is being played. Specifically, the buffer occupancy increases by L seconds after chunk k is downloaded and decreases as the user watches the video. Let Bk=B(tk) denote the buffer occupancy when the player starts to download chunk k. The buffer dynamics can then be formulated as:
An example of buffer dynamics is shown in
The determination of waiting time Δtk, also referred to as the chunk scheduling problem, is an equally interesting and important problem in improving fairness of multi-player video streaming. It is assumed that the player immediately starts to download chunk k+1 as soon as chunk k is downloaded. The one exception is when the buffer is full, at which time the player waits for the buffer to reduce to a level which allows chunk k to be appended. Formally,
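The buffer behavior described above can be illustrated with a minimal sketch. It assumes, for illustration only, that the buffer drains in real time during a download, that a stall lasts for whatever part of the download time exceeds the buffer level, and that the player waits just long enough for a finished chunk to fit under a maximum buffer size. The function name `step_buffer` and all parameter names are hypothetical and do not come from the specification.

```python
def step_buffer(buffer_s, chunk_bits, throughput_bps, chunk_len_s, buffer_max_s):
    """Advance the playback buffer across the download of one chunk.

    Returns (next_buffer_s, rebuffer_s, wait_s), all in seconds.
    This is an illustrative model, not the specification's exact equations.
    """
    # Time needed to download the chunk at the average throughput.
    download_s = chunk_bits / throughput_bps
    # Playback drains the buffer during the download; a stall occurs
    # for the part of the download that outlasts the buffered video.
    rebuffer_s = max(download_s - buffer_s, 0.0)
    # The downloaded chunk adds chunk_len_s seconds of video.
    after_download = max(buffer_s - download_s, 0.0) + chunk_len_s
    # If the buffer would overflow, wait until playback frees enough room.
    wait_s = max(after_download - buffer_max_s, 0.0)
    next_buffer_s = after_download - wait_s
    return next_buffer_s, rebuffer_s, wait_s


# A 4-second chunk of 4 Mbit fetched at 2 Mbit/s with 10 s already buffered.
print(step_buffer(10.0, 4e6, 2e6, 4.0, 30.0))
```

With a nearly empty buffer the second return value becomes positive (a stall), and with a nearly full buffer the third return value becomes positive (a wait).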
The ultimate goal of bitrate adaptation is to improve the QoE of users to achieve higher long-term user engagement. A flexible QoE model, as opposed to a fixed notion of QoE, is therefore used. While users may differ in their specific QoE functions, the key elements of video QoE are enumerated as:
Average Video Quality—The average per-chunk quality over all chunks:
Average Quality Variations—This tracks the magnitude of the changes in the quality from one chunk to another:
Total rebuffer Time—For each chunk k rebuffering occurs if the download time dk(Rk)/Ck is higher than the playback buffer level when the chunk download started (i.e., Bk). Thus the total rebuffer time is:
Alternatively, the number of rebufferings could be used in lieu of total rebuffer time:
Lastly, Startup Delay Ts, assuming Ts<<Bmax.
As users may have different preferences on which of the four components is more important to them, the QoE of video segments 1 through K is defined by a weighted sum of the aforementioned components:
Here λ, μ and μs are non-negative weighting parameters corresponding to video quality variations, rebuffering time and startup delay, respectively. A relatively small λ indicates that the user is not particularly concerned about video quality variability; the larger λ is, the more effort is made to achieve smoother changes of bitrates. A large μ, relative to the other parameters, indicates that a user is deeply concerned about rebuffering. In cases where users prefer low startup delay, a large μs is employed.
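The weighted combination of the four components can be sketched as follows. This is an illustration under stated assumptions: the quality and variation terms are taken as plain sums over chunks (the specification may normalize them as averages), the three penalty terms are subtracted, and q(·) defaults to the identity. The function name `qoe` and its parameters are hypothetical.

```python
def qoe(bitrates, rebuffer_times, startup_delay,
        lam=1.0, mu=1.0, mu_s=1.0, q=lambda r: r):
    """Weighted QoE over a sequence of chunks (illustrative form only).

    bitrates:       selected bitrate R_k for each chunk
    rebuffer_times: stall time in seconds incurred by each chunk
    startup_delay:  T_s in seconds
    lam, mu, mu_s:  non-negative weights for variation, rebuffering, startup
    q:              increasing map from bitrate to perceived quality
    """
    # Total perceived quality across all chunks.
    quality = sum(q(r) for r in bitrates)
    # Magnitude of quality changes between consecutive chunks.
    variation = sum(abs(q(b) - q(a)) for a, b in zip(bitrates, bitrates[1:]))
    # Total time spent stalled.
    rebuffer = sum(rebuffer_times)
    return quality - lam * variation - mu * rebuffer - mu_s * startup_delay


# Three chunks, one bitrate switch, half a second of stalling, 1 s startup.
print(qoe([1.0, 2.0, 2.0], [0.0, 0.5, 0.0], 1.0, lam=1.0, mu=2.0, mu_s=1.0))
```

Raising `mu` relative to the other weights makes any plan that stalls score much worse, which matches the described role of μ for users who are deeply concerned about rebuffering.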
This definition of QoE is very general and allows customization, so it can easily take users' preferences into account and could be extended as needed to incorporate other factors. As can be seen if
The problem of bitrate adaptation for QoE maximization can therefore be formulated in the following way:
This can be denoted as QoE_MAX1K.
The bandwidth trace Ct, t∈[t1, tK+1] serves as input to the problem. The outputs of QoE_MAX1K are bitrate decisions R1, . . . , RK, and startup time TS.
Note that the problem QoE_MAX1K is formulated assuming the video playback has not started at the time of this optimization so the start-up delay TS is a decision variable. However, this QoE maximization can also take place during video playback at time tk
A source of randomness is the bandwidth Ct: At time tk when the video player chooses bitrate Rk, only the past bandwidth {Ct, t≦tk} is available while the future values {Ct, t>tk} are not known. However a throughput predictor 110 can be used to obtain predictions for future available bandwidth 114 based on past throughput 112, defined as {Ĉt, t>tk}. Based on such predictions 114, and on buffer occupancy information 108 (which is instead known precisely) and the QoE preferences 120 of the user, the bitrate controller 116 selects bitrate 118 of the next segment k:
Rk = f(Bk, {Ĉt, t>tk}, {Ri, i<k}). (12)
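As an illustration of what a throughput predictor might look like, the sketch below uses the harmonic mean of the most recent throughput samples. This particular choice is an assumption made here for illustration; the text above only requires that predictions be derived from past throughput. The function name `predict_throughput` and the `window` parameter are hypothetical.

```python
from statistics import harmonic_mean


def predict_throughput(past_samples, window=5):
    """Predict the next chunk's throughput from recent measurements.

    past_samples: per-chunk average throughputs, oldest first (must be positive)
    window:       number of most recent samples to use
    The harmonic mean is an assumed choice; it damps the effect of a
    single unusually fast download compared with an arithmetic mean.
    """
    recent = past_samples[-window:]
    return harmonic_mean(recent)


# Two recent samples of 2000 and 4000 kbit/s give a prediction near 2667.
print(predict_throughput([1000.0, 2000.0, 4000.0], window=2))
```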
Note that the basic MPC algorithms assume the existence of an accurate throughput predictor. However, in certain severe network conditions, e.g., in cellular networks or at prime time when the Internet is congested, such accurate predictors may not be available. For example, if the predictor consistently overestimates the throughput, it may induce high rebuffering. To counteract the prediction error, a robust MPC algorithm is presented. Robust MPC optimizes the worst-case QoE assuming that the actual throughput can take any value in a range between a lower bound and an upper bound on the prediction, in contrast to a point estimate Ĉt. Robust MPC entails solving the following optimization problem at time tk to get bitrate Rk:
subject to the constraints in paragraph [0028].
In general, it may be non-trivial to solve such a max-min robust optimization problem. In this case, however, the worst-case scenario takes place when the throughput Ct is at its lower bound. Thus, the implementation of robust MPC is straightforward: instead of the point estimate Ĉt, the lower bound of the predicted throughput range is used as the input to the MPC QoE maximization problem.
To verify the invention's improved QoE over current methods, a normalized QoE metric was defined to compare the performance of available video playback systems. These systems, along with the invention, were compared to the optimal possible performance, that which could be achieved if the future bandwidth of the network were known.
For a given bandwidth trace {Ct, t∈[t1, tK+1]}, the offline optimal QoE, denoted by QoE(OPT), is the maximum QoE that can be achieved with perfect knowledge of future bandwidth over the entire time horizon.
Technically, it is calculated by solving problem QoE_MAX1K. While the assumption of knowing the entire future is not true in reality, the offline solution provides a theoretical upper bound for all systems for a particular bandwidth trace.
On the other hand, the online QoE of a bitrate selection system A is calculated under the assumption that at time tk, the bitrate controller only knows the past bandwidth {Ct, t∈[t1, tk]}. Based on this, Rk (i.e., the bitrate 118 for the next video segment) is selected. The online QoE achieved by algorithm A can be denoted by QoE(A).
Because the offline optimal solution assumes perfect knowledge about the future, for any video playback system the online QoE can never exceed the offline optimal QoE. In other words, QoE(OPT) is an upper bound on the online QoE achieved by any video playback system. To this end, the normalized QoE of A (n-QoE(A)) is defined as the performance metric for a system A:
At iteration k, the player maintains a moving horizon from chunk k to k+N−1 and carries out the following three key steps, as shown in Algorithm 1.
1. Predict: Predict throughput Ĉ[t
2. Optimize: This is the core of the MPC algorithm: Given the current buffer occupancy Bk, previous bitrate Rk−1 and throughput prediction Ĉ[t
QOE_MAX_STEADYkk+N−1
In the start-up phase, it also optimizes start-up time TS as:
[Rk, Ts]=fmpcst(Rk−1, Bk, Ĉ[t
implemented by solving
QOE_MAXkk+N−1
If practical details about computational overhead are ignored, off-the-shelf solvers such as CPLEX can be used to solve these discrete optimization problems.
3. Apply: Start to download chunk k with Rk and move the horizon forward. If the player is in start-up phase, wait for Ts before starting playback.
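The predict, optimize and apply steps can be illustrated for the steady-state case with a brute-force search in place of a solver. This is a sketch under several assumptions that are not taken from the specification: bitrates and throughput share one unit (e.g., kbit/s), q(·) is the identity, one throughput prediction is supplied per chunk in the horizon, waiting time is folded into a simple cap at the maximum buffer size, and the start-up phase is omitted. The names `horizon_qoe` and `mpc_select` are hypothetical.

```python
from itertools import product


def horizon_qoe(plan, buffer_s, prev_bitrate, predicted,
                chunk_len_s, buffer_max_s, lam, mu):
    """Score one candidate bitrate plan over the look-ahead horizon."""
    total = 0.0
    for rate, throughput in zip(plan, predicted):
        download_s = chunk_len_s * rate / throughput
        rebuffer_s = max(download_s - buffer_s, 0.0)
        # Buffer drains during the download, gains one chunk, and is capped.
        buffer_s = min(max(buffer_s - download_s, 0.0) + chunk_len_s,
                       buffer_max_s)
        total += rate - lam * abs(rate - prev_bitrate) - mu * rebuffer_s
        prev_bitrate = rate
    return total


def mpc_select(buffer_s, prev_bitrate, predicted, bitrates,
               chunk_len_s=4.0, buffer_max_s=30.0, lam=1.0, mu=3000.0):
    """Return the bitrate for the next chunk.

    Enumerates every plan over the horizon (len(predicted) chunks),
    keeps the best one, and applies only its first decision.
    """
    best_plan = max(
        product(bitrates, repeat=len(predicted)),
        key=lambda plan: horizon_qoe(plan, buffer_s, prev_bitrate, predicted,
                                     chunk_len_s, buffer_max_s, lam, mu),
    )
    return best_plan[0]


rates = [300, 750, 1200, 1850, 2850]
print(mpc_select(20.0, 300, [10000.0] * 5, rates))  # ample bandwidth
print(mpc_select(2.0, 300, [400.0] * 5, rates))     # scarce bandwidth
```

Exhaustive enumeration grows exponentially with the horizon, so this is only practical for small horizons and bitrate ladders. Passing lower-bound throughput values as `predicted` gives the robust variant described earlier.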
This workflow has several qualitative advantages compared with buffer-based (BB) and rate-based (RB) approaches. First, the MPC algorithm uses both throughput prediction and buffer information in a principled way. Second, compared to pure RB approaches, MPC smooths out prediction error at each step and is more robust to prediction errors. Specifically, by optimizing several chunks over a moving horizon, large prediction errors for one particular chunk will have lower impact on the performance. Third, MPC directly optimizes a formally defined QoE objective, while in RB and BB the tradeoff between different QoE factors is not clearly defined and therefore can only be addressed in an ad hoc qualitative manner.
Experimentation using this invention over a wide variety of network conditions has shown a higher normalized QoE compared to existing video playback systems.
Lastly, as opposed to rate-based and buffer-based algorithms, which need relatively minor computations, the challenge with MPC is that a discrete optimization problem needs to be solved at each time step. There are two practical concerns here.
(1) Computational overhead: The high computational overhead of MPC is especially problematic for low-end mobile devices, which are projected to be the dominant video consumers going forward. Since the bitrate adaptation decision logic is called before the player starts to download each chunk, excessive delay in the bitrate adaptation logic will negatively affect the QoE of the player.
(2) Deployment: Because there is no closed-form or combinatorial solution for the QoE maximization problem, a solver (e.g., CPLEX or Gurobi) will need to be used. However, it may not be possible for video players to be bundled with such solver capabilities; e.g., licensing issues may preclude distributing such software, or it may require additional plugin or software installations, which pose significant barriers to adoption.
Therefore, it is evident that the solution should be lightweight and combinatorial (i.e., not solving an LP or ILP online). As such, also presented herein is a fast and low-overhead FastMPC design that does not require any explicit solver capabilities in the video player.
At a high level, FastMPC algorithms essentially follow a table enumeration approach. Here, an offline step of enumerating the state-space and solving each specific instance is performed. Then, in the online step, these stored optimal control decisions mapped to the current operation conditions are used. That is, the algorithm will be reduced to a simple table lookup indexed by the key value closest to the current state and the output of the lookup is the optimal solution for the selected configuration.
As shown in
Unfortunately, directly using this idea will be very inefficient because of the high-dimensional state space. For instance, if there are 100 possible values for the buffer level, 10 possible bitrates, a horizon of size 5, and 1000 possible throughput values, there will be 10^18 rows in the table. There are two obvious consequences of this large state space. First, it may not be practical to explicitly store the full table in memory, causing any implementation to have a very high memory footprint along with a large startup delay, as the table will need to be downloaded to the player module. Second, it will incur a non-trivial offline computation cost that may need to be rerun as the operating conditions change.
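The row count in this example can be checked with a few lines of arithmetic. The sketch assumes that the table is keyed by the buffer level, the previous bitrate, and one throughput value for each chunk in the horizon; the function name `table_rows` is hypothetical.

```python
def table_rows(buffer_levels, bitrates, throughput_values, horizon):
    """Number of rows in a naive decision table.

    Assumes one independent throughput value per chunk in the horizon,
    so the throughput dimension contributes throughput_values ** horizon.
    """
    return buffer_levels * bitrates * throughput_values ** horizon


rows = table_rows(buffer_levels=100, bitrates=10,
                  throughput_values=1000, horizon=5)
print(f"{rows:.0e} rows")
```

Even at one byte per row, a table of this size is far beyond what a player could download or hold in memory, which is what motivates the coarsening and compression steps that follow.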
There are two key optimizations that will make FastMPC practical.
Compaction via binning: To address the offline exploration cost, it should be realized that very fine-grained values for the buffer and the throughput levels may not be needed. As a consequence, these values may be suitably coarsened into aggregate bins. Moreover, with binning, row keys do not need to be explicitly stored, as these are directly computed from the bin row indices. The challenge is to balance the granularity of binning against the loss of optimality. In practice, using approximately 100 bins for buffer level and 100 bins for throughput predictions works well and yields near-optimal performance.
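Computing a row index directly from bin indices, so that no keys need to be stored, might look like the sketch below. It assumes equal-width bins, a single throughput prediction per lookup, and a row layout ordered by previous bitrate level, then buffer bin, then throughput bin. The ranges, the layout, and the names `bin_index` and `row_index` are illustrative assumptions.

```python
def bin_index(value, low, high, n_bins):
    """Map a value to one of n_bins equal-width bins over [low, high]."""
    if value <= low:
        return 0
    if value >= high:
        return n_bins - 1
    # Clamp to guard against floating-point rounding at the upper edge.
    return min(int((value - low) / (high - low) * n_bins), n_bins - 1)


def row_index(buffer_s, throughput, prev_level, *,
              buffer_max_s=30.0, throughput_max=10000.0,
              n_buffer_bins=100, n_throughput_bins=100):
    """Compute the table row for the current state without stored keys.

    prev_level is the index of the previously selected bitrate.
    """
    b = bin_index(buffer_s, 0.0, buffer_max_s, n_buffer_bins)
    c = bin_index(throughput, 0.0, throughput_max, n_throughput_bins)
    return (prev_level * n_buffer_bins + b) * n_throughput_bins + c


# 15 s of buffer, 5000 kbit/s predicted, previous bitrate level 2.
print(row_index(15.0, 5000.0, 2))
```

The offline step would fill a flat decision vector in this same row order, so the online step reduces to one index computation and one read.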
Table compression: The decision table learned by the offline computation has significant structure. Specifically, the optimal solutions for several similar scenarios will likely be the same. Thus, this structure can be exploited in conjunction with the binning strategy by applying a simple lossless compression strategy that uses run-length encoding to store the decision vector. The optimal decision can then be retrieved online using binary search. In practice, with compression, the table occupies less than 60 kB with 100 bins for buffer levels, 100 bins for throughput predictions and 5 bitrate levels.
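Run-length encoding of the decision vector with binary-search retrieval can be sketched as follows. The representation chosen here, a list of run start positions alongside a list of run values, is one possible layout assumed for illustration; the names `rle_compress` and `rle_lookup` are hypothetical.

```python
from bisect import bisect_right


def rle_compress(decisions):
    """Run-length encode a decision vector.

    Returns (starts, values): starts[i] is the first row of run i and
    values[i] is the decision shared by every row in that run.
    """
    starts, values = [], []
    for row, decision in enumerate(decisions):
        if not values or values[-1] != decision:
            starts.append(row)
            values.append(decision)
    return starts, values


def rle_lookup(starts, values, row):
    """Retrieve the decision for a row by binary search over run starts."""
    return values[bisect_right(starts, row) - 1]


decisions = [0, 0, 0, 1, 1, 2, 2, 2, 2]
starts, values = rle_compress(decisions)
print(starts, values)
print(rle_lookup(starts, values, 4))
```

Because neighboring rows often share the same optimal bitrate, the number of runs is typically much smaller than the number of rows, and each lookup costs time logarithmic in the number of runs.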
The invention may be implemented in any video player 100, as defined herein, as, for example, a built-in feature, an add-on, a downloadable app, a piece of software, etc., or in any other way of implementation, currently known or yet to be developed.
Although the invention is illustrated and described herein with reference to specific embodiments, the invention is not intended to be limited to the details shown. Rather, various modifications may be made in the details without departing from the invention.
This application claims the benefit of U.S. Provisional Patent Application Ser. No. 62/177,904, filed Mar. 26, 2015.
This invention was made with government support under National Science Foundation No. ECCS0925964. The government has certain rights in this invention.
Number | Date | Country
---|---|---
62177904 | Mar 2015 | US