The present invention generally relates to the streaming of live media. More particularly, the invention relates to a media client and a computer-implemented method performed by the media client.
File downloading is a process in which the end-user obtains the entire file for the media content before watching or listening to it. Streaming media, on the other hand, refers to the process of continuously delivering or obtaining media content over a communication network, such as the Internet, allowing viewers to start watching media content without the need to completely download it beforehand. Live streaming more particularly corresponds to the delivery of media content in real-time over the Internet, much as a media content provider broadcasts live media content via live signals. Streaming of digital video and/or digital audio via online streaming platforms is widely popular nowadays. It is estimated that by 2020, live streaming will account for more than 80% of all Internet traffic worldwide. A large portion of the Internet traffic today already consists of such media streamed from media services to clients, typically from a media content provider such as a Content Distribution Network, also referred to as CDN, over a communication network to a large variety of media clients, such as a media player comprised in a web browser running on a PC, a tablet, a smartphone, a set-top box, a TV, etc. An end-user usually uses a media client installed on a client device to start playing the live media before the entire file has been transmitted from the streaming media provider to the client device.
End-to-end latency is critical when it comes to streaming live media, such as sports, interactive content and breaking news. End-to-end latency corresponds to the delay between the moment when live content is captured and the moment when that same content is displayed by a media client on the display of an end-user device. One of the main contributions to the end-to-end delay is a network delay resulting from the transport of the live media over the communication network infrastructure between the media content provider and the media client.
For playback of videos, most media clients tend to keep a playback buffer aimed at handling uncertainties and variations in the communication network. This playback buffer ensures that video data is available for playback even when the loading of video data is unstable, for example because of instabilities of the communication network. Examples of such playback buffers are described in U.S. Pat. No. 9,525,641B1, WO2011/038028A2 and US2017/223361A1.
Playback conditions of the media can however change over time during the streaming of the live media. For example, a media client can decide to switch to an alternative quality for streaming the live media, requiring for example higher or lower bitrates. Alternatively, the properties of the communication network can change, for example when interferences and/or network loads are introduced from another media client. Alternatively, the type of communication network can change, for example when the communication network switches from streaming over wireless Internet to streaming via a 4G connection. Alternatively, the client device may change, for example when remote screens are connected for streaming the live media and/or when casting sessions are started. All these changes in the playback conditions can additionally coexist dynamically during the streaming of the live media by the media client. It remains a challenge for media content providers to develop media clients which take changes in the communication network into account when streaming live media.
Media content providers further constantly strive to reduce latency when streaming live media. With the playback buffer representing a significant portion of the end-to-end latency, the size of the buffer is typically kept as small as possible. However, minimizing the size of the playback buffer of a media client could trigger playback stalls at the side of an end-user of the media client when the playback buffer is depleted, which could severely impact the Quality of Experience, also known as QoE, for the end-user of the media client. In case of a playback stall, for example, a media client usually switches to lower quality streams by applying algorithms such as Adaptive Bitrate streaming, which results in a reduced QoE for the end-user. On the other hand, maximizing the size of the playback buffer of a media client could cause high startup times before playback and high latency, which would also significantly jeopardize the QoE for an end-user of the media client.
The above-mentioned streaming solutions still suffer from shortcomings, especially in terms of latency and QoE for an end-user of the media client when streaming live media.
It is an objective of the present invention to disclose a media client and a computer-implemented method that overcome the above identified shortcomings of existing solutions. More particularly, it is an objective to disclose a media client and a computer-implemented method which allow the best possible user experience regardless of a playback environment and changes within this environment.
The scope of protection sought for the various embodiments of the invention is set out by the independent claims. The embodiments and features described in this specification that do not fall within the scope of the independent claims, if any, are to be interpreted as examples useful for understanding various embodiments of the invention.
The media clients described in the above-mentioned streaming solutions comprise playback buffers whose sizes are fixed and pre-configured for each media client. Relying on fixed sizes for playback buffers prevents media content providers from taking changes in the communication network into account when streaming live media. Additionally, Adaptive Bitrate streaming techniques consider neither the current size of a playback buffer nor the optimum size of the playback buffer needed to stream live media. As a result, the experience of an end-user of the media client can be greatly impacted when live media is streamed with the above-mentioned streaming solutions.
According to a first aspect of the present disclosure, the above defined objectives are realized by a media client configured to stream live media received from a communication network, wherein the media client comprises a playback buffer configured to temporarily store the live media, and wherein the media client is configured to adapt a size of the playback buffer when the media client streams the live media.
The media client according to the present disclosure measures the current size of its own playback buffer and determines what the ideal size of its own playback buffer is for streaming the live media. In other words, while playing the live media, the media client according to the present disclosure increases or decreases the size of its own playback buffer when necessary by speeding up or slowing down playback or by jumping forward in the playback. The media client therefore comprises an adaptive buffer controller which monitors and measures the current size of its own playback buffer and determines the ideal size of its own playback buffer for streaming the live media. This way, the media client according to the present disclosure can dynamically alter its behavior when necessary while streaming live media by modifying the size of its own playback buffer as a function of a state of the communication network. With the media client according to the present disclosure, an optimal balance between Quality of Experience and latency is provided to an end-user of the media client. The media client according to the present disclosure provides the best possible user experience regardless of possible changes over the communication network. When the conditions of the communication network are stable, the media client according to the present disclosure provides ultra-low latency when streaming the live media with a minimized size of the playback buffer. When the conditions of the communication network are unstable, the media client according to the present disclosure further provides a playback buffer which is large enough to ensure stable playback of the live media for an end-user of the media client.
A media client according to the present disclosure is for example a video player. Alternatively, a media client according to the present disclosure is for example any type of media player software or any type of application software for playing multimedia computer files like audio and video files.
Optionally, the media client further implements an Adaptive Bitrate streaming technique to not only dynamically determine the right size of the playback buffer for streaming the live media, but also to dynamically provide feedback for potential bitrate switching decisions.
The live media received by the media client is a combination of ordered still pictures or frames that are decoded or decompressed and played one after the other within the media client. In this respect, the media client, or client device, may be any device capable of receiving a digital representation of such media over a communication network and capable of decoding the representation into a sequence of frames that can be displayed on a screen to a user. Examples of devices that are suitable as a media client are desktop and laptop computers, smartphones, tablets, set-top boxes and TVs. A media client may also refer to a media player application running on any of such devices. Streaming of live media refers to the concept that the media client can request the live media from a remote media service, such as a remote server, and start the playback of the media upon receiving the first frames without having received all the frames of the complete stream of media. A streaming service is then a remote service that can provide such live media streaming upon request of the media client to the remote server over a communication network, for example over the Internet, over a Wide Area Network (WAN) or a Local Area Network (LAN).
According to an optional aspect of the disclosure, the media client is further configured to:
This way, the media client can dynamically alter its behavior while streaming live media by modifying the size of its own playback buffer as a function of conditions of the communication network. Monitoring a live playback environment of the communication network allows the media client to determine conditions of the communication network. When the live playback environment of the communication network is stable or favorable for streaming live media, the media client provides ultra-low latency when streaming the live media with a small size of the playback buffer. When the live playback environment of the communication network is unstable or unfavorable for streaming live media, the media client further provides a playback buffer which is large enough to ensure stable playback of the live media for an end-user of the media client. It is therefore possible to provide the best possible user experience regardless of possible changes in the live playback environment of the communication network.
According to an optional aspect of the disclosure, the live playback environment comprises one or more of the following:
The media client monitors the live playback environment of the communication network when streaming the live media and determines an ideal size of the playback buffer for the live playback environment of the communication network. For example, the media client monitors network fluctuations in the communication network. For example, the media client monitors a type of the communication network when streaming the live media. It is possible that the communication network switches between a type of wireless communication network and a type of 3G connection or 4G connection or 5G connection communication network. The type of communication network as well as possible switches between different types greatly influence the live playback environment of the communication network. For example, a 3G connection usually demonstrates a higher packet loss than a 4G or 5G connection. A playback buffer is therefore needed to cancel the packet losses out. Also, each type of communication network has an inherent data latency. Larger sizes of playback buffers can compensate, for example, for the higher packet losses and/or the data latency. Additionally, the media client monitors a throughput of the communication network. When the throughput is not constant but varies, the size of the playback buffer can be used to compensate for the variation of the throughput. For example, a larger size of playback buffer can be necessary to compensate for the variations of the throughput. Also, when for example switching between two different playback qualities, there is a build-up time during which the TCP window comprises the correct size of playback buffer. The media client further monitors performance of the client device on which the media client is implemented. For example, the media client further monitors a CPU usage of a client device on which the media client runs. The media client for example also monitors a memory usage of a client device on which the media client runs.
The media client for example further monitors the number of packet losses which occur when one or more packets of data of the live media travelling across the communication network fail to reach their destination, for example the media client. Packet loss is either caused by errors in data transmission, typically across wireless networks, or by network congestion. For example, the media client further monitors a number of dropped video frames and/or of dropped audio samples, wherein the dropped video frames are video frames of the live media which could not be decoded on time by the media client. The media client for example further monitors the round-trip delay time, also referred to as RTD, and/or the round-trip time, also referred to as RTT. The media client therefore monitors and/or estimates an end-to-end latency of the live media being played on the media client and dynamically adapts the size of its own playback buffer as a function of the latency of the live media being played on the media client. Additionally, frame drops, for example drops of video frames and/or audio samples, can occur when the time difference between the time when a video frame is placed in the playback buffer and the time when that video frame is shown on the media client is insufficient for the media client to process the video frame and/or the audio sample. This can happen more frequently, for example, in situations where there is high CPU load and/or high memory usage of the client device on which the media client runs. When this situation is detected by the media client, it is desired to increase the size of the playback buffer.
According to an optional aspect of the disclosure, the media client is further configured to increase, respectively decrease, the size of the playback buffer when the live playback environment exceeds one or more predetermined thresholds.
This way, the media client increases, respectively decreases, the size of the playback buffer by applying rules that compare the live playback environment against one or more predetermined thresholds.
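The threshold rules described above could, for instance, be sketched as follows. This is only an illustrative sketch: the field names, the threshold values, the adjustment step and the buffer bounds are all assumptions made for the example, since the description does not specify concrete values.

```python
from dataclasses import dataclass


@dataclass
class PlaybackEnvironment:
    """Snapshot of monitored conditions (all field names are illustrative)."""
    packet_loss_ratio: float  # fraction of packets lost, 0.0 to 1.0
    dropped_frames: int       # frames dropped since the previous snapshot
    cpu_usage: float          # client device CPU load, 0.0 to 1.0


# Hypothetical thresholds; the description gives no concrete values.
MAX_PACKET_LOSS = 0.02
MAX_DROPPED_FRAMES = 5
MAX_CPU_USAGE = 0.90

STEP_MS = 100         # assumed adjustment step
MIN_BUFFER_MS = 100   # assumed lower bound for the target size
MAX_BUFFER_MS = 3000  # assumed upper bound for the target size


def adapt_buffer_target(current_ms: int, env: PlaybackEnvironment) -> int:
    """Return a new target playback buffer size in milliseconds."""
    degraded = (
        env.packet_loss_ratio > MAX_PACKET_LOSS
        or env.dropped_frames > MAX_DROPPED_FRAMES
        or env.cpu_usage > MAX_CPU_USAGE
    )
    # Grow the target when any threshold is exceeded, shrink it otherwise.
    new_ms = current_ms + STEP_MS if degraded else current_ms - STEP_MS
    # Clamp the result to the assumed bounds.
    return max(MIN_BUFFER_MS, min(MAX_BUFFER_MS, new_ms))
```

In this sketch a single exceeded threshold is enough to grow the target; a real media client might instead weigh the metrics or require the condition to persist for some time.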
For example, let's consider playing back live media with an average bitrate of 2 Mbps with the media client according to the present invention. In this scenario, it is assumed that the available network bandwidth is larger than 2 Mbps. However, due to packet losses and/or other disturbances and/or loads on the communication network degrading the live playback environment, there are regular lapses of 1 second during which the throughput drops from 2 Mbps to 1 Mbps. Initially, there is an unknown initial number of seconds b_0 in the playback buffer of the media client. A size of the playback buffer of the media client generally increases over time t according to equations (1), (2) and more particularly in this specific scenario according to equation (3):
A consumption of playback buffer over time t is equal to t, i.e. a second of live media is played back per second by the media client. The evolution b_t of the size of the playback buffer of the media client over the time t is given by equation (4):
When the playback buffer is empty, i.e. when the size of the playback buffer b_t is equal to 0, equations (5), (6), (7) and (8) can be obtained for the particular scenario presented above:
For the scenario presented above, when a lapse lasts 1 second maximum, it is possible to prevent stalling of the playback of the live media by relying on a playback buffer with a size equal to 500 milliseconds.
For the scenario presented above, when a lapse lasts 2 seconds maximum, it is possible to prevent stalling of the playback of the live media by relying on a playback buffer with a size equal to 1000 milliseconds. On the other hand, it could then be more interesting to play the live media in a lower quality instead.
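The equations referred to above are not reproduced in this text. One way to arrive at the two buffer sizes quoted for this scenario is the following reasoning, offered here as an assumption rather than as the original derivation: during a lapse, the media client downloads throughput / bitrate seconds of media per second while it plays back one second of media per second, so the playback buffer drains at a rate of 1 - throughput / bitrate. The function name and parameters below are illustrative.

```python
def min_buffer_seconds(media_bitrate_mbps: float,
                       lapse_throughput_mbps: float,
                       lapse_duration_s: float) -> float:
    """Smallest initial buffer (in seconds) that survives one lapse without stalling."""
    if media_bitrate_mbps <= 0:
        raise ValueError("media bitrate must be positive")
    # Seconds of media downloaded per second of wall-clock time during the lapse.
    fill_rate = lapse_throughput_mbps / media_bitrate_mbps
    # Playback consumes one second of media per second; the buffer never
    # drains if the throughput is at least the media bitrate.
    drain_rate = max(0.0, 1.0 - fill_rate)
    return lapse_duration_s * drain_rate


print(min_buffer_seconds(2.0, 1.0, 1.0))  # 0.5, i.e. 500 milliseconds
print(min_buffer_seconds(2.0, 1.0, 2.0))  # 1.0, i.e. 1000 milliseconds
```

Under these assumptions the results match the 500 milliseconds and 1000 milliseconds stated for lapses of 1 second and 2 seconds respectively.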
According to an optional aspect of the disclosure, the media client is further configured to play the live media at a lower playback rate, respectively at a higher playback rate, when the live playback environment exceeds one or more predetermined thresholds.
This way, the media client dynamically alters its playback behavior in real-time while streaming the live media. A playback rate is for example a frame rate when the live media comprises live video. A playback rate is for example an audio sample rate when the live media comprises live audio.
For example, the media client demonstrates a desired or acceptable end-to-end latency of 500 milliseconds. With a round trip time equal to 0, an ideal size for the playback buffer would then be 500 milliseconds. The media client typically comprises a range of sizes of the playback buffer which are acceptable, for example a range of 300 milliseconds to 700 milliseconds for the playback buffer in the scenario presented here. If the size of the playback buffer increases until it is larger than 700 milliseconds, for example because of issues with the performance of a client device on which the media client runs, then the speed of the playback of the live media must be increased by the media client to reduce the size of the playback buffer. If the size of the playback buffer reduces until it is smaller than 300 milliseconds, for example in the presence of a drop in the throughput of the communication network, then the speed of the playback of the live media must be reduced by the media client to increase the size of the playback buffer. When the size of the playback buffer is comprised within the range of sizes which are acceptable, i.e. in this example between 300 milliseconds and 700 milliseconds, then the speed of the playback of the live media can be set back to 1.
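The rule of this example can be sketched as a small function. The 300 millisecond and 700 millisecond bounds come from the example above; the slow and fast rates of 0.95 and 1.05 are assumptions chosen for illustration, as the description does not state by how much the playback speed is changed.

```python
def select_playback_rate(buffer_ms: float,
                         low_ms: float = 300.0,
                         high_ms: float = 700.0,
                         slow_rate: float = 0.95,
                         fast_rate: float = 1.05) -> float:
    """Pick a playback speed that steers the buffer back into [low_ms, high_ms]."""
    if buffer_ms > high_ms:
        # Buffer too large: play faster so that it drains.
        return fast_rate
    if buffer_ms < low_ms:
        # Buffer too small: play slower so that it refills.
        return slow_rate
    # Buffer within the acceptable range: normal speed.
    return 1.0


print(select_playback_rate(800.0))  # 1.05
print(select_playback_rate(500.0))  # 1.0
print(select_playback_rate(200.0))  # 0.95
```

Keeping the speed at 1 anywhere inside the acceptable range, rather than only at the ideal size, avoids constant small speed changes around the target.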
According to an optional aspect of the disclosure, the media client is further configured to skip one or more frames of the live media and/or to skip one or more audio samples of the live media, thereby decreasing the size of the playback buffer.
This way, the media client can reduce the size of its playback buffer. For example, when the live media comprises audio, the speed of speech in the live media can be increased, thereby decreasing the size of the playback buffer. This can be done by time stretching, which changes the speed or the duration of an audio signal without affecting its pitch. Alternatively, pitch scaling can be used, according to which the pitch is changed without affecting the speed. Alternatively, pitch shifting, which affects both pitch and speed by slowing down or speeding up the live media comprising audio, can be used.
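For the frame skipping mentioned above, a minimal sketch of how many video frames could be skipped to bring the playback buffer back to a target size is given below. The function and its parameters are hypothetical, and the sketch assumes a constant frame rate.

```python
import math


def frames_to_skip(buffer_ms: float, target_ms: float, frame_rate_fps: float) -> int:
    """Number of whole frames to skip so that the buffer shrinks towards target_ms."""
    if frame_rate_fps <= 0:
        raise ValueError("frame rate must be positive")
    excess_ms = buffer_ms - target_ms
    if excess_ms <= 0:
        # Nothing to skip when the buffer is at or below the target.
        return 0
    frame_duration_ms = 1000.0 / frame_rate_fps
    # Round down so that the buffer never drops below the target.
    return math.floor(excess_ms / frame_duration_ms)


print(frames_to_skip(900.0, 500.0, 25.0))  # 10
```

With inter-frame video compression, frames generally cannot be dropped at arbitrary positions without affecting the decoding of dependent frames, so a practical client would likely align the skip with a suitable frame boundary; that detail is left out of this sketch.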
According to an optional aspect of the disclosure, the media client is further configured to monitor how often the playback buffer is empty when playing the live media and wherein the media client is further configured to increase the size of the playback buffer when a number of times that the playback buffer is empty when playing the live media exceeds a limit threshold.
This way, the media client dynamically alters its playback behavior in real-time while streaming the live media. When the number of times during the playback of the live media that the playback buffer is empty exceeds a predetermined number of times, or limit threshold, this is an indication for the media client that the playback buffer runs empty too fast and that the size of the playback buffer is too small. The media client therefore increases the QoE of a user of the media client by increasing the size of the playback buffer, thereby preventing playback freezes or playback stalls during playback of the live media.
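The counting logic described above might look as follows. The class name, the default limit threshold, the step size and the upper bound are assumptions for illustration; the description only states that the buffer size is increased once the number of empty-buffer events exceeds a limit threshold.

```python
class StallMonitor:
    """Counts empty-buffer events and grows the buffer target past a limit."""

    def __init__(self, limit_threshold: int = 3, step_ms: int = 200,
                 max_buffer_ms: int = 3000) -> None:
        self.limit_threshold = limit_threshold  # allowed number of empty-buffer events
        self.step_ms = step_ms                  # assumed growth per adjustment
        self.max_buffer_ms = max_buffer_ms      # assumed upper bound
        self.stall_count = 0

    def on_buffer_empty(self, target_buffer_ms: int) -> int:
        """Register one empty-buffer event and return the (possibly larger) target."""
        self.stall_count += 1
        if self.stall_count > self.limit_threshold:
            # Too many stalls: enlarge the buffer and start counting again.
            self.stall_count = 0
            return min(self.max_buffer_ms, target_buffer_ms + self.step_ms)
        return target_buffer_ms
```

Resetting the counter after each adjustment is a design choice of this sketch; counting stalls over a sliding time window would be an equally plausible variant.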
According to an optional aspect of the disclosure, the media client is further configured to monitor a number of dropped frames of the live media; and the media client is further configured to increase the size of the playback buffer when the number of dropped frames of the live media exceeds a dropped frame threshold.
Dropped frames of the live media are to be understood as video frames and/or audio samples of the live media which are dropped by the media client, i.e. video frames and/or audio samples which could not be decoded on time by the media client to be played back. Frame drops, for example drops of video frames and/or audio samples, can occur when the time difference between the time when a video frame is placed in the playback buffer of the media client and the time when that video frame is shown on the media client is insufficient for the media client to process the video frame and/or the audio sample. This can happen more frequently, for example, in situations where there is high CPU load and/or high memory usage of the client device on which the media client runs. When this situation is detected by the media client, the media client increases the size of the playback buffer.
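A dropped frame threshold could, for example, be evaluated over a recent time window, as in the sketch below. The use of a sliding window, the class name and the default values are assumptions; the description does not specify over which period dropped frames are counted.

```python
from collections import deque


class DroppedFrameWindow:
    """Tracks frame drops over a sliding time window (illustrative sketch)."""

    def __init__(self, window_s: float = 10.0, dropped_frame_threshold: int = 5) -> None:
        self.window_s = window_s
        self.dropped_frame_threshold = dropped_frame_threshold
        self._drops: deque = deque()  # timestamps (in seconds) of recent drops

    def record_drop(self, now_s: float) -> None:
        """Register one dropped video frame or audio sample."""
        self._drops.append(now_s)

    def should_grow_buffer(self, now_s: float) -> bool:
        """True when the drops within the window exceed the threshold."""
        # Forget drops that fall outside the window.
        while self._drops and now_s - self._drops[0] > self.window_s:
            self._drops.popleft()
        return len(self._drops) > self.dropped_frame_threshold
```

A media client could call `record_drop` from its decoder callback and poll `should_grow_buffer` periodically to decide whether to increase the size of the playback buffer.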
According to a second aspect of the present disclosure, there is provided a computer-implemented method for streaming live media over a communication network to a media client comprising a playback buffer configured to temporarily store the live media, wherein the method comprises the following step performed by the media client: adapting a size of the playback buffer when the media client streams the live media.
The computer-implemented method according to the present disclosure comprises the steps of measuring the current size of the playback buffer and determining what the ideal size of the playback buffer is for streaming live media. In other words, the method according to the present disclosure comprises the steps of increasing or decreasing the size of the playback buffer, when necessary and when playing the live media, by speeding up or slowing down playback or by jumping forward in the playback. The method therefore comprises the steps of monitoring and measuring the current size of the playback buffer and determining the ideal size of the playback buffer for streaming the live media. This way, the method according to the present disclosure can dynamically alter the behavior of the media client while the media client is streaming live media by modifying the size of the playback buffer as a function of a state of the communication network. With the method according to the present disclosure, an optimal balance between Quality of Experience and latency is provided to an end-user of the media client. The method according to the present disclosure provides the best possible user experience regardless of possible changes over the communication network. When the conditions of the communication network are stable, the method according to the present disclosure provides ultra-low latency when streaming the live media with a small size of the playback buffer. When the conditions of the communication network are unstable, the method according to the present disclosure further provides a playback buffer which is large enough to ensure stable playback of the live media for an end-user of the media client.
According to an optional aspect of the disclosure, the method further comprises the steps of:
According to a third aspect of the disclosure, the disclosure relates to a media client comprising at least one processor and at least one memory including computer program code, the at least one memory and computer program code configured to, with the at least one processor, cause the media client to perform the method according to the second aspect of the disclosure.
According to a fourth aspect of the disclosure, the disclosure relates to a computer program product comprising computer-executable instructions for causing a media client to perform at least the method according to the second aspect of the disclosure.
According to a fifth aspect of the disclosure, the disclosure relates to a computer readable storage medium comprising computer-executable instructions for performing the method according to the second aspect of the disclosure when the program is run on a computer.
According to an embodiment shown in
According to an embodiment shown in
According to an embodiment shown in
According to an embodiment of the method steps shown in
According to an embodiment of the method steps shown in
As used in this application, the term “circuitry” may refer to one or more or all of the following:
(a) hardware-only circuit implementations such as implementations in only analog and/or digital circuitry and
(b) combinations of hardware circuits and software, such as (as applicable):
(c) hardware circuit(s) and/or processor(s), such as microprocessor(s) or a portion of a microprocessor(s), that requires software (e.g. firmware) for operation, but the software may not be present when it is not needed for operation.
This definition of circuitry applies to all uses of this term in this application, including in any claims. As a further example, as used in this application, the term circuitry also covers an implementation of merely a hardware circuit or processor (or multiple processors) or portion of a hardware circuit or processor and its (or their) accompanying software and/or firmware. The term circuitry also covers, for example and if applicable to the particular claim element, a baseband integrated circuit or processor integrated circuit for a mobile device or a similar integrated circuit in a server, a cellular network device, or other computing or network device.
Although the present invention has been illustrated by reference to specific embodiments, it will be apparent to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied with various changes and modifications without departing from the scope thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the scope of the claims are therefore intended to be embraced therein.
It will furthermore be understood by the reader of this patent application that the words “comprising” or “comprise” do not exclude other elements or steps, that the words “a” or “an” do not exclude a plurality, and that a single element, such as a computer system, a processor, or another integrated unit may fulfil the functions of several means recited in the claims. Any reference signs in the claims shall not be construed as limiting the respective claims concerned. The terms “first”, “second”, “third”, “a”, “b”, “c”, and the like, when used in the description or in the claims are introduced to distinguish between similar elements or steps and are not necessarily describing a sequential or chronological order. Similarly, the terms “top”, “bottom”, “over”, “under”, and the like are introduced for descriptive purposes and not necessarily to denote relative positions. It is to be understood that the terms so used are interchangeable under appropriate circumstances and embodiments of the invention are capable of operating according to the present invention in other sequences, or in orientations different from the one(s) described or illustrated above.
Number | Date | Country | Kind |
---|---|---|---|
19187362.9 | Jul 2019 | EP | regional |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2020/069339 | 7/9/2020 | WO | |