Various example embodiments relate to controlled video streaming from a remote video service to a video playback client wherein the remote video service is configured to make a video stream available to the client upon request in at least a temporal independent version and a temporal dependent version.
Media streaming is immensely popular nowadays. It allows viewers to start watching media content without the need to completely download the content beforehand. A large portion of Internet traffic consists of such media streamed from media services to clients, typically from a content distribution network, CDN, to a video player application running on a PC, a tablet, a smartphone, a set-top box, a TV, etc. In media streaming, the video is delivered on demand, i.e. at the request of the client. The request then specifies a certain starting point in time from which the video should start. This starting point may be explicit, e.g. by specifying it in the request, or implicit, e.g. where the starting point is derived from the time of the request, which is the case for live streaming. From a technology point of view, media streaming is completely different from traditional broadcasting technology. Media streaming is a request or pull based unicast technology whereas traditional broadcasting is a push based broadcast technology.
While the two technologies are different, they share the common aim of video control from the origin side. This means that the stream provider wants to keep control over the video stream that is delivered to the client playing the video and, hence, over the video as watched by the viewer. More particularly, the provider needs to be able to interrupt an ongoing video stream at some point in time, switch to another video stream and then return to the initial video stream. This is useful for dynamically introducing personalized commercials, news flashes or other video content, or for switching from one video stream to another one.
One available streaming technology is chunked or segmented streaming. The media is then divided into smaller chunks or segments which are downloaded and played by the client one after the other. Such protocols may also offer adaptive bitrate streaming allowing the client to switch between different bit rates, resolutions or codecs depending on the available resources. To achieve this, versions of the streams or representations, each with a different bit rate, resolution or codec, are made available on the server for the client. Information on the different representations and their segmenting is then available by means of a manifest file that is updated regularly. Examples of such HTTP Adaptive Streaming, HAS, protocols are MPEG-DASH published as ISO/IEC 23009-1:2012, HTTP Dynamic Streaming by Adobe, HTTP Live Streaming (HLS) by Apple and Smooth Streaming, a Microsoft IIS Media Services extension.
Different solutions have been proposed for providing stream control in segment based streaming protocols. One of them is referred to as server-side ad insertion, SSAI, wherein a personalized manifest file is dynamically created within the content distribution network, CDN, whenever another intermediate stream is to be inserted. As the client is fetching regular updates of the manifest file, it will retrieve this personalized manifest file and play the intermediate stream as specified in the personalized manifest file.
Another available streaming technology is disclosed in EP3515075 wherein a stream is not further subdivided into smaller, independently playable chunks or segments. Instead, a stream is made available in an independent version or representation and one or more dependent versions or representations. The independent version then provides a stream of temporal independent frames, i.e. frames that are decodable independently from each other. A certain dependent version then provides a compressed stream according to a certain representation with a certain bit rate and can have any type of frames. Upon playback, a client playing the stream first retrieves, by a first independent request, a first video packet from the independent version to build up the image in the video player and then retrieves the subsequent frames by a single dependent request from an available dependent version. An advantage of this method is that a video player can start playback from any moment in time and is not limited to the boundaries of segments. This greatly reduces the start-up time for live streaming and reduces seeking times for video on demand. Further, there is no need to continuously update manifest files as the stream progresses because the dependent request may be performed by a single byte range request. It therefore suffices that the client retrieves the manifest one time when setting up the stream. A problem with this streaming technology is that it does not allow for stream control and, hence, does not allow for server-side ad insertion.
The scope of protection sought for various embodiments of the invention is set out by the independent claims.
The embodiments and features described in this specification that do not fall within the scope of the independent claims, if any, are to be interpreted as examples useful for understanding various embodiments of the invention.
Amongst others, it is an object of embodiments of the invention to provide controlled video streaming from a remote video service to a video playback client wherein the remote video service is configured to make a video stream available to the client upon request in at least a temporal independent version and a temporal dependent version.
This object is achieved, according to a first example aspect of the present disclosure, by a method for controlled video streaming from a remote video service to a video playback client. The remote video service is configured to make a first video stream available to the client upon request in at least a temporal independent version and a temporal dependent version. The method further comprises:
In other words, the remote video service makes at least two versions of the first video stream available to the video playback client. The temporal independent version comprises key frames. A key frame is a frame that is decodable independently from other frames in the video stream. A key frame does not comprise temporal dependencies but may comprise spatial dependencies. A key frame is sometimes referred to as an I-frame. The dependent version of the video stream also comprises dependent frames, i.e. frames for which information of other frames is needed in order to decode them. Frames of the dependent version may thus have temporal dependencies. The video service makes these two versions available to the video playback client, i.e. the video playback client may retrieve any chosen frame from the two versions upon request. When a video playback client requests a stream of the video at an arbitrary point in time, the server provides at least the first frame in an independent version as an initial video packet and the following frames from the dependent version of the video. Further information on the video stream is available through a manifest file, e.g. information on the location and quality metrics of one or more independent versions and dependent versions of the first stream. The location of the manifest file, e.g. its URL, may be provided by the video service, e.g. by including the URL in an HTML document.
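The way the two versions combine into one playable sequence can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that both versions are in-memory lists indexed by frame sequence number; the function name `build_stream` and the tuple layout are hypothetical and not part of the disclosure.

```python
from typing import List, Tuple

Frame = Tuple[str, int]  # (frame type, sequence number), an illustrative layout


def build_stream(independent: List[Frame], dependent: List[Frame], start: int) -> List[Frame]:
    """Return the frames a client would play when starting at position `start`."""
    # First request: a single key frame taken from the temporal independent version.
    initial = independent[start]
    # Second request: every later frame, taken from the temporal dependent version.
    following = dependent[start + 1:]
    return [initial] + following


# Independent version: every position is available as a key frame.
independent_version = [("I", n) for n in range(6)]
# Dependent version: any frame types; here one key frame followed by P frames.
dependent_version = [("I", 0)] + [("P", n) for n in range(1, 6)]

stream = build_stream(independent_version, dependent_version, start=2)
print(stream)
```

Starting at position 2 yields the key frame for that position followed by the dependent frames 3 to 5, so playback is not tied to a segment boundary.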
As upon the second request the first video stream is provided in a continuous manner, there is no need for the video playback client to retrieve updated versions of the manifest file on a regular basis as is the case for segmented video streaming. Therefore, the video service notifies the video playback client of an update of the manifest file. Thereupon, the video playback client retrieves the updated manifest file. The above method thus provides a way for stream control by means of an updated manifest file for video streaming methods that do not rely on fast updating of the manifest file. In other words, updates of the manifest file are determined by changes in the control of the video stream rather than by the continuous updating of references to individual chunks or segments.
According to an example embodiment, the notifying is performed by an in-stream information field contained in the first video stream.
In other words, this field is contained within the served continuous stream of dependent frames. This may for example be done by foreseeing such a field in a video container format holding a dependent frame. Alternatively, this may be done by inserting a separate packet containing the information field in between the video packets holding the respective dependent frames. In any case, when the video playback client encounters the information field, it is notified of the updated manifest file and retrieves the manifest file accordingly. By foreseeing an in-stream information field, no further control channel between the video client and video service is required thereby minimizing the overhead.
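The variant in which the information field travels as a separate packet between the video packets can be sketched as follows. This is a minimal illustration only: the `Packet` class, the packet kinds "frame" and "info", and the payload value "manifest-updated" are assumptions made for the example and are not defined by the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Packet:
    kind: str      # "frame" or "info" (hypothetical packet kinds)
    payload: str


def play(packets: Iterable[Packet], fetch_manifest: Callable[[], None]) -> List[str]:
    """Play frame packets in order; fetch the manifest when an info field is seen."""
    played = []
    for packet in packets:
        if packet.kind == "info":
            if packet.payload == "manifest-updated":
                fetch_manifest()   # no separate control channel is needed
            continue               # info packets are never handed to the decoder
        played.append(packet.payload)
    return played


fetches: List[str] = []
packets = [
    Packet("frame", "P1"),
    Packet("info", "manifest-updated"),
    Packet("frame", "P2"),
]
played = play(packets, lambda: fetches.append("manifest"))
print(played, fetches)
```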
According to an example embodiment, the notifying is performed by a notification message outside the first video stream. This way of notification may be advantageous for unforeseen stream control because no manipulation of the video stream is required for the notification.
According to an example embodiment, the updated manifest file comprises an identification of at least one other video stream in at least a temporal independent version and a temporal dependent version and comprises an instruction for the playback of the at least one other video stream; the method further comprising, by the client, streaming the at least one other video stream.
In other words, the updated manifest file is used by the video service to notify the video playback client that it needs to switch to another, e.g. a second, video stream. This second video stream is identified by another temporal independent version and temporal dependent version.
This updated manifest file may further comprise an instruction for resuming the first video stream after the playback of the at least one other video stream. The method then further comprises, by the client, resuming the streaming of the first video stream after the playback of the second video stream.
This way, by a single information field and accompanying updated manifest file, the video service can control the insertion of a second stream into the playback of the first stream.
Alternatively, the switching back from the at least one other stream may be performed as follows:
In other words, the video service notifies the video playback client again during the playback of the second stream that another update of the manifest is available.
According to an example embodiment, the method further comprises:
According to example embodiments, a video stream may be subdivided in separate portions, i.e. the initial manifest file only allows the playback of a first portion of the first video stream, e.g. the first 10 or 20 minutes. The updated manifest file then comprises an identification of a subsequent portion of the first video stream. The video playback client then streams the subsequent portion of the first video stream by, upon a first request, receiving and playing an initial video packet of the subsequent portion from the temporal independent version, and, upon a second request continuously receiving and playing frames of the subsequent portion from the temporal dependent version.
This allows controlling the playback of a single video stream by the video service. For example, certain conditions may be checked first by the video service before making the updated manifest file available.
According to example embodiments, the updated manifest file comprises an identification of a metadata event. The video playback client then fetches the metadata event while continuing the continuously receiving and playing frames from the temporal dependent version.
The metadata event may be any kind of event that does not interrupt the playback of the first video stream, e.g. a notification for the viewer, information on subtitles, availability of other qualities etc.
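A minimal sketch of fetching a metadata event without interrupting playback is given below. For brevity it collapses the manifest update and the event identification into a single hypothetical marker of the form "EVENT:<id>" in the packet sequence, whereas in the embodiment the client learns of the event through the updated manifest file. The function `fetch_metadata_event` is a stand-in for a network fetch and is likewise an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def fetch_metadata_event(event_id: str) -> Dict[str, str]:
    # Stand-in for a network fetch of the metadata event (hypothetical content).
    return {"id": event_id, "text": "subtitles available"}


def play_with_metadata(packets: List[str]) -> Tuple[List[str], List[Dict[str, str]]]:
    """Keep consuming dependent frames while metadata events load in the background."""
    played, pending = [], []
    with ThreadPoolExecutor(max_workers=1) as pool:
        for packet in packets:
            if packet.startswith("EVENT:"):
                # Submit the fetch and carry on; playback is not interrupted.
                pending.append(pool.submit(fetch_metadata_event, packet[len("EVENT:"):]))
            else:
                played.append(packet)
        events = [future.result() for future in pending]
    return played, events


played_frames, events = play_with_metadata(["P1", "EVENT:e1", "P2", "P3"])
print(played_frames, events)
```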
According to a second example aspect, the disclosure relates to a method for controlled video streaming from a remote video service to a video playback client; and wherein the remote video service is configured to make a first video stream available to the client upon request in at least a temporal independent version and a temporal dependent version; the method comprising, by the video playback client:
According to a third example aspect, the disclosure relates to a method for controlled video streaming from a remote video service to a video playback client; and wherein the remote video service is configured to make a first video stream available to the client upon request in at least a temporal independent version and a temporal dependent version; the method comprising, by the remote video service:
According to a fourth example aspect, the disclosure relates to a networking device comprising at least one processor and at least one memory including computer program code, the at least one memory and computer program code configured to, with the at least one processor, cause the controller to perform any of the steps performed by the video playback client or the remote video service according to the first to third example aspect.
According to a fifth example aspect, the disclosure relates to a computer program product comprising computer-executable instructions for causing the performance of the networking device according to the fourth example aspect.
According to a sixth example aspect, the disclosure relates to a computer readable storage medium comprising computer-executable instructions for performing the steps by the video playback client and/or the video service according to the first to third example aspect when the program is run on a computer.
Some example embodiments will now be described with reference to the accompanying drawings.
The present disclosure relates to the streaming of video from a video service to a video playback client, further also referred to as client. A video received by a client is a combination of ordered still pictures or frames that are decoded or decompressed and played one after the other within a video application. In this respect, a client may be any device capable of receiving a digital representation of a video over a communication network and capable of decoding the representation into a sequence of frames that can be displayed on a screen to a user. Examples of devices that are suitable as a client are desktop and laptop computers, smartphones, tablets, set-top boxes and TVs. A client may also refer to a video player application running on any of such devices. Streaming of video refers to the concept that the client can request a video from a server and start the playback of the video upon receiving the first frames without having received all the frames of the video. A streaming server is then a server that can provide such streaming of videos upon request of a client to the client over a communication network, for example over the Internet, over a Wide Area Network (WAN) or a Local Area Network (LAN).
Video received from a streaming server may be compressed according to a video compression specification or standard such as H.265/MPEG-H HEVC, H.264/MPEG-4 AVC, H.263/MPEG-4 Part 2, H.262/MPEG-2, SMPTE 421M (VC-1), AOMedia Video 1 (AV1) and VP9. According to such standards, the video frames are compressed in size by using spatial image compression and temporal motion compensation. Frames on which only spatial image compression is applied or no compression is applied are referred to as temporal independent frames, key frames, independent frames or I-frames. A key frame is thus a frame that is decodable independently from other frames in the video. Frames to which temporal motion compensation is applied, either alone or in combination with image compression, are referred to as temporal dependent frames or, in short, dependent frames. Dependent frames are thus frames for which information of other frames is needed to decompress them. Dependent frames are sometimes further categorized in P frames and B frames. P frames can use data from previous frames to decode and are thus more compressible than I frames. B frames can use both previous and forward frames to decode and may therefore achieve the highest amount of data compression.
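Why playback must begin with a key frame can be illustrated with a small dependency check. The sketch below is a simplification under stated assumptions: each frame is given a hypothetical name and an explicit list of the frames it references, which real codecs signal in their own bitstream syntax.

```python
from typing import Dict, List, Set


def decodable_frames(references: Dict[str, List[str]], received: Set[str]) -> Set[str]:
    """Return the received frames that can be decoded, given their references."""
    decoded: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for name in received:
            # A frame is decodable once all frames it depends on are decoded.
            if name not in decoded and all(ref in decoded for ref in references[name]):
                decoded.add(name)
                changed = True
    return decoded


# I0 has no temporal dependencies; P frames look back; B2 looks back and forward.
references = {"I0": [], "P1": ["I0"], "P3": ["P1"], "B2": ["P1", "P3"]}

without_key_frame = decodable_frames(references, {"P1", "B2", "P3"})
with_key_frame = decodable_frames(references, {"I0", "P1", "B2", "P3"})
print(without_key_frame, with_key_frame)
```

Without the key frame none of the dependent frames can be decoded, which is the reason the client first requests an initial packet from the temporal independent version.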
Thereupon, the server 100 receives the request at step 110. The server then determines the key frame which corresponds to the requested starting time 121 from a temporal independent version 170 of the video. In the embodiment of
Then, the client 150 proceeds to step 154 in which it requests the subsequent frames of the dependent version 160 of the video. Alternatively, step 154 may also be done in parallel with the first request 152 to further ensure the timely delivery of the dependent frames. At the server 100, the request is received at step 112 upon which the server proceeds to step 113 to retrieve the requested dependent frames. In this respect, the server retrieves the first dependent frame 164 subsequent to the key frame 173 and, thereafter, sends the dependent frame 164 to the client in response. Steps 113 and 114 are then continuously repeated until the last dependent frame 166 of the request is received by the client 150. If there is no end frame or time specified in the request of the client 150, then the server sends the subsequent dependent frames up to the end of the video or up to a certain predefined maximum playing time before the end of the video. A request 154 for subsequent frames or a temporal dependent version is further referred to as a dependent request.
At the client 150 side, similar steps 155 and 156 are continuously repeated, i.e. in step 155, the client 150 receives the next dependent frame from the server 100 and forwards the frame to the player 159. As a result, the video player 159 receives a video stream 180 comprising a first key frame 173 followed by the dependent frames 164 to 166.
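The repeated retrieve-and-send loop on the server side and the receive-and-forward loop on the client side can be sketched with a generator. The names `dependent_frames`, `key_index` and `end_index` are illustrative assumptions, and the dependent version is modelled as a plain list rather than a stored file.

```python
from typing import Iterator, List, Optional


def dependent_frames(version: List[str], key_index: int,
                     end_index: Optional[int] = None) -> Iterator[str]:
    """Server side: yield the dependent frames that follow the requested key frame."""
    # Without an explicit end, send frames up to the end of the video.
    stop = len(version) if end_index is None else end_index + 1
    for index in range(key_index + 1, stop):
        yield version[index]


def receive_all(frames: Iterator[str]) -> List[str]:
    """Client side: receive each frame and forward it to the player, here a list."""
    player = []
    for frame in frames:
        player.append(frame)
    return player


dependent_version = ["d0", "d1", "d2", "d3", "d4"]
to_end = receive_all(dependent_frames(dependent_version, key_index=1))
bounded = receive_all(dependent_frames(dependent_version, key_index=1, end_index=3))
print(to_end, bounded)
```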
Advantageously, the requests and responses between the client 150 and the server are performed according to the Hypertext Transfer Protocol (HTTP), i.e. by an HTTP GET request from the client and an HTTP response from the server. More advantageously, the second request 154 for the subsequent frames establishes a chunked transfer encoding session with the server, allowing the dependent frames to be streamed over a single persistent connection. Support for chunked transfer encoding was introduced in HTTP/1.1. Even more advantageously, the dependent request 154 for the subsequent frames is a byte range request wherein the requested byte range corresponds with the range of dependent frames starting after the requested key frame 173. Support for byte range requests was also introduced in HTTP/1.1 and is specified in detail in the IETF's RFC 7233 of June 2014. Information on the availability of the video in both the independent and dependent version may be provided in the form of a URL to a manifest file that is available on the server, for example a manifest file following the Common Media Application Format (CMAF) for segmented media according to ISO/IEC 23000-19.
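A dependent request expressed as an open-ended HTTP byte range request could look like the following sketch, which only builds the request object and does not send it. The URL path and the byte offset 4096 are made-up values for illustration; the `Range: bytes=<first>-` form is the open-ended byte range syntax of RFC 7233.

```python
from urllib.request import Request


def dependent_request(url: str, first_byte: int) -> Request:
    """Build a GET request for all bytes from `first_byte` to the end of the resource."""
    # Open-ended range: the server keeps sending until the resource ends.
    return Request(url, headers={"Range": f"bytes={first_byte}-"}, method="GET")


# Hypothetical location of the dependent version and offset of the next frame.
req = dependent_request("http://streaming.service.com/media_identification/dependent", 4096)
print(req.get_method(), req.full_url, req.get_header("Range"))
```

Because a single open-ended range covers all subsequent frames, the client does not need to issue a new request per segment.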
The independent request 152 contains address information for addressing the video service 101, identifying the media that is to be streamed and specifying the starting time within the media. The address information may be in the form of a uniform resource identifier, URI, or a uniform resource locator, URL. All three components may be embedded in the path of the URL, e.g. as ‘http://streaming.service.com/media_identification/starting_time’. The starting time and/or the media identification may also be provided as a query in the URL. The starting time may be implicit, e.g. the starting time is the beginning when it is not specified. The starting time may also be specified in a predetermined format, e.g. in a certain unit relative to the beginning of the media. For live streaming, a specific ‘now’ starting time may be defined, i.e. to retrieve the latest available ‘first package’ for the identified media, e.g. as ‘http://streaming.service.com/media_identification/now’. Advantageously, the starting time is provided to the video service in the form of a sequence number. If the player received the starting time as an absolute or relative time, then the client 150 first converts the time value to the appropriate sequence number. The information for performing the conversion may be derived from information provided in the manifest file, e.g. by deriving the sequence number from the framerate.
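The conversion from a relative time to a sequence number and the construction of the request URL can be sketched as below. The sketch assumes that sequence numbers count frames from zero at the beginning of the media and that the framerate is constant; both are assumptions for illustration, as are the function names.

```python
from typing import Union


def time_to_sequence_number(seconds: float, framerate: float) -> int:
    """Convert a time relative to the start of the media into a frame sequence number."""
    # Assumes a constant framerate and sequence numbers starting at zero.
    return int(seconds * framerate)


def independent_request_url(service: str, media_id: str, start: Union[int, str]) -> str:
    """Embed service, media identification and starting time in the URL path."""
    return f"{service}/{media_id}/{start}"


sequence_number = time_to_sequence_number(12.0, framerate=25.0)
url = independent_request_url("http://streaming.service.com", "media_identification", sequence_number)
live_url = independent_request_url("http://streaming.service.com", "media_identification", "now")
print(sequence_number, url, live_url)
```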
Frames from the independent version 170 may further be packaged in an initialization or initial video packet. Apart from the independent frame 173, such a packet may further comprise i) a field with a binary pointer to the subsequent portion of the video, ii) a field with timing information needed for the playback of the frame in the first package, iii) one or more dependent frames. The timing information may also be provided separately for each frame by embedding it within the frame itself. The initial video packet may also comprise only the first independent frame 173 and no further dependent frames. In that case, the pointer refers to the location of the frame subsequent to the independent frame. The initial video packet may also comprise a URL to the manifest file that is available on the video service 101. The creation of the initial video packet may be done by the video service upon the reception of the request. Alternatively, the initial video packet may also be stored onto the storage 120 by storing each independent frame 171-176 already as such an initial video packet. In that case, video service 100 only retrieves the first package from the storage 120.
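One possible in-memory shape for such an initial video packet is sketched below. The field names are hypothetical, and the pointer is computed under the assumption that the dependent version is stored as one contiguous byte sequence, so that the pointer is simply a byte offset into it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InitialVideoPacket:
    independent_frame: bytes
    next_offset: int                  # binary pointer to the subsequent portion of the video
    presentation_time: float          # timing information for the frame in this packet
    dependent_frames: List[bytes] = field(default_factory=list)  # optional extra frames
    manifest_url: Optional[str] = None


def next_frame_offset(dependent_sizes: List[int], index: int) -> int:
    """Byte offset, in the dependent version, of the frame following position `index`."""
    return sum(dependent_sizes[: index + 1])


# Illustrative frame sizes in bytes for positions 0..3 of the dependent version.
sizes = [900, 120, 130, 110]
packet = InitialVideoPacket(
    independent_frame=b"key-frame-at-position-1",
    next_offset=next_frame_offset(sizes, 1),
    presentation_time=1 / 25,
)
print(packet.next_offset)
```

The offset stored in `next_offset` could then serve as the first byte of a byte range request for the dependent version.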
The above embodiments illustrate the streaming of a single video stream, i.e. a single continuous stream intended for complete playback when starting the playback from the video playback client 150. The playback of such a single video stream is performed based on the two requests 152, 154 and by retrieval of the manifest file to obtain further information on the available representations. As a result, regardless of the length of the single video stream, the streaming may be performed by only downloading the manifest file once. As there is no notion of short independently playable chunks or segments, there is no need to constantly retrieve updates of the manifest file. According to an embodiment, the manifest file comprises information on one or more video streams or video portions of such video streams that are to be played one after the other by the video player. Per video stream or portion, further also referred to as a presentation, the manifest file may then further comprise:
Upon request 322, service 351 starts sending 322 the sequence of frames to the video player 350. Near the end of the sequence, the video service updates the manifest file 241. To notify the video player 350 of this update, the video service 351 adds the information field 231 (M1) to the sequence. Upon receiving such information field 231, the video player 350 starts execution block 330 and parses 331 the received information field 231. As the information field indicates the update of the manifest file, the video player 350 requests the updated manifest file 242 upon which the video service 351 sends 322 the updated manifest file 242 to the video player 350. By the updated manifest file, the video player 350 is made aware of the new portion 220 of the video stream, e.g. by its start time 243 and reference to the stream. Based on this information, the video player 350 issues a new request 341 to video service 351 for the initial video packet 221 and initializes 343 the player after receipt 342 of the packet 221. Then, video player 350 also requests 344 the subsequent frames 222-224 from the dependent version of the stream. Execution block 320 may be repeated during playback of the video, i.e. a sequence of presentations may be played by notifying the video player 350 when an updated version of the manifest file is available.
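The portion-by-portion flow described above can be sketched with an in-memory stand-in for the video service. The dictionaries `MANIFESTS` and `PORTIONS`, the portion names and the marker "INFO:manifest-updated" are all hypothetical; a real client would issue network requests where the sketch does dictionary lookups.

```python
from typing import List

# Successive manifest versions; each names the portion made available by it.
MANIFESTS = [{"portion": "part1"}, {"portion": "part2"}]

# Per portion: the initial video packet and the continuous dependent sequence.
PORTIONS = {
    "part1": {"initial": "I1", "dependent": ["P1", "P2", "INFO:manifest-updated"]},
    "part2": {"initial": "I2", "dependent": ["P3", "P4"]},
}


def play_all() -> List[str]:
    played = []
    version = 0
    while version < len(MANIFESTS):
        portion = PORTIONS[MANIFESTS[version]["portion"]]
        played.append(portion["initial"])      # independent request
        updated = False
        for item in portion["dependent"]:      # single dependent request
            if item == "INFO:manifest-updated":
                updated = True                 # parse the in-stream information field
            else:
                played.append(item)
        if not updated:
            break                              # no notification, so no further portion
        version += 1                           # retrieve the updated manifest file
    return played


sequence = play_all()
print(sequence)
```

The manifest is retrieved once per portion rather than at a fixed polling interval.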
The combination of
Upon request 521, first video service 591 starts sending 522 the sequence of frames to the video player 590. At a certain moment in time, it may be decided to interrupt the playback of the first video stream and to insert a second video stream. In order to do so, an updated version of the manifest file, e.g. manifest file 442, is created by second video service 592. In order to inform the video player of the updated manifest file, an in-stream information field, e.g. field 451, is present in the first video stream before the interruption. Upon receiving the in-stream information field, the video player 590 starts execution block 530 and parses 531 the received in-stream information field. As the information field indicates the update of the manifest file, the video player 590 requests the updated manifest file upon which the second video service 592 sends 533 the updated manifest file to the video player 590. By the updated manifest file, the video player 590 is made aware of the second video stream, e.g. by its start time and reference to the second stream. Based on this information, the video player 590 starts execution block 534 by issuing an independent request 534 to second video service 592 for the initial video packet and initializes 536 the player after receipt 535 of the packet. Then, video player 590 also requests the subsequent frames from the dependent version of the stream by dependent request 541 and receives them by means of response 542.
After the playback of the second video stream, the video player 590 should resume playback of the first video stream. In order to do so, an updated version of the manifest file, e.g. manifest file 443, is created by second video service 592. In order to inform the video player 590 of the updated manifest file, an in-stream information field, e.g. field 451, is present in the second video stream before the end of the second stream. Upon receiving the in-stream information field, the video player 590 starts execution block 546 and parses 543 the received in-stream information field. As the information field indicates the update of the manifest file, the video player 590 requests 544 the updated manifest file upon which the second video service 592 sends 545 the updated manifest file to the video player 590. By the updated manifest file, the video player 590 is made aware of the resumption of the first video stream, e.g. by its start time and reference to the first stream. Based on this information, the video player 590 starts execution block 547 by issuing an independent request 548 to first video service 591 for the initial video packet of the first stream and initializes 550 the player after receipt 549 of the packet. Then, video player 590 also requests the subsequent frames from the dependent version of the first stream by dependent request 561 and receives them by means of response 562, thereby resuming the playback 560 of the first video stream. During the playback 560, further updates of the manifest file may be made available by means of in-stream information fields whereupon the video player 590 retrieves the update by parsing step 564, requesting step 565 and receiving step 566.
By the steps according to sequence diagram 500, video playback in video player 590 may be controlled by the video service 593. By means of the in-stream information fields, there is no need for the video player 590 to constantly check for updates of the manifest file. Further, the manifest file may be dynamically updated, e.g. the second video stream may be selected by the second video service 592 according to received metrics of the video player 590, e.g. based on a user profile, a location, and a time of viewing. Sequence diagram 500 allows for server-side stream insertion by the second video service 592. The second video stream may for example correspond to advertisements, thereby achieving server-side advertisement insertion. Execution block 540 may be executed more than once in order to show more than one video stream from second video service 592 before resuming the playback 560 of first video stream.
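The insertion and resumption driven by two successive manifest updates can be sketched as a small client loop. The data in `STREAMS` and `UPDATES` is invented for the example, packets are modelled as (sequence number, kind) pairs, and the distinction between the independent and dependent request after each switch is left out to keep the control flow visible.

```python
from typing import List

# Packets per stream as (sequence number, kind); "info" marks an in-stream field.
STREAMS = {
    "first": [(0, "frame"), (1, "frame"), (1, "info"), (2, "frame"), (3, "frame")],
    "second": [(0, "frame"), (1, "frame"), (1, "info")],
}

# Manifest updates in the order the (hypothetical) second video service serves them.
UPDATES = [
    {"stream": "second", "start": 0},   # insert the second video stream
    {"stream": "first", "start": 2},    # resume the first video stream
]


def play() -> List[str]:
    played = []
    updates = iter(UPDATES)
    stream, start = "first", 0
    while True:
        switched = False
        for seq, kind in STREAMS[stream]:
            if seq < start:
                continue                      # before the requested starting point
            if kind == "info":
                manifest = next(updates)      # request the updated manifest file
                stream, start = manifest["stream"], manifest["start"]
                switched = True
                break                         # leave the current stream
            played.append(f"{stream}:{seq}")
        if not switched:
            return played                     # stream ended without a notification


timeline = play()
print(timeline)
```

The client never polls for the manifest; it only reacts when an information field appears in whichever stream is currently playing.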
The combination of
Upon request 721, first video service 791 starts sending 722 the sequence of frames to the video player 790. At a certain moment in time, it may be decided to interrupt the playback of the first video stream and to insert a second video stream. In order to do so, an updated version of the manifest file, e.g. manifest file 742, is created by second video service 792. In order to inform the video player of the updated manifest file, an in-stream information field, e.g. field 751, is present in the first video stream before the interruption. Upon receiving the in-stream information field, the video player 790 starts execution block 730 and parses 731 the received in-stream information field. As the information field indicates the update of the manifest file, the video player 790 requests the updated manifest file upon which the second video service 792 sends 733 the updated manifest file to the video player 790. By the updated manifest file, the video player 790 is informed and instructed on the playback of the second video stream and the resumption of the first video stream thereafter. Based on this information, the video player 790 starts execution block 734 by issuing an independent request 734 to second video service 792 for the initial video packet and initializes 736 the player after receipt 735 of the packet. Then, video player 790 also requests the subsequent frames from the dependent version of the stream by dependent request 741 and receives them by means of response 742 as shown in execution block 740.
After the playback of the second video stream, the video player 790 should resume playback of the first video stream as per the instruction in the already received manifest file. In order to do so, video player 790 issues an independent request 744 to first video service 791 for the initial video packet of the first stream and initializes 746 the player after receipt 745 of the video packet. Then, video player 790 also requests the subsequent frames from the dependent version of the first stream by dependent request 751 and receives them by means of response 752, thereby resuming the playback 750 of the first video stream. During the playback 750, further updates of the manifest file may be made available by means of in-stream information fields whereupon the video player 790 retrieves the update by parsing step 754, requesting step 755 and receiving step 756.
By the steps according to sequence diagram 700, video playback in video player 790 may be controlled by the video service 793. By means of the in-stream information fields, there is no need for the video player 790 to constantly check for updates of the manifest file. Further, the manifest file may be dynamically updated, e.g. the second video stream may be selected by the second video service 792 according to received metrics of the video player 790, e.g. based on a user profile, a location, and a time of viewing. Sequence diagram 700 allows for server-side stream insertion by the second video service by a single update of the manifest file. The second video stream may for example correspond to advertisements, thereby achieving server-side advertisement insertion. Execution block 740 may be executed more than once in order to show more than one video stream from second video service 792 before resuming the playback 750 of the first video stream.
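When a single updated manifest file carries both the insertion and the resume instruction, the client can derive its whole request sequence from that one file. The sketch below assumes a hypothetical manifest layout with a "presentations" list and integer starting positions; the actual manifest syntax is not specified here.

```python
from typing import Dict, List, Tuple

Request = Tuple[str, str, int]  # (request type, stream reference, position)


def playback_plan(manifest: Dict[str, List[Dict]]) -> List[Request]:
    """Derive the ordered independent and dependent requests from one manifest."""
    plan: List[Request] = []
    for presentation in manifest["presentations"]:
        stream, start = presentation["stream"], presentation["start"]
        plan.append(("independent", stream, start))      # initial video packet
        plan.append(("dependent", stream, start + 1))    # frames after the key frame
    return plan


# One update instructs the insertion and the later resumption of the first stream.
updated_manifest = {
    "presentations": [
        {"stream": "second", "start": 0},     # inserted stream
        {"stream": "first", "start": 250},    # resume instruction
    ]
}

plan = playback_plan(updated_manifest)
print(plan)
```

Compared with the two-update flow, no information field is needed in the inserted stream to trigger the return to the first stream.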
The above described embodiments illustrated solutions for the controlled playback of video streams by use of in-stream notification fields signalling an updated manifest file. This mechanism may also be used for signalling other updates available through the manifest file. To this respect,
The above embodiments described an in-stream information field either included in a video packet or in a separate dedicated packet for notification of an updated manifest file. According to an embodiment, an in-stream information field may be replaced with an out-of-band notification, i.e. a message that is not included in the video stream.
The above embodiments have been described with reference to video streams. It should be understood that the present disclosure may be applied to any kind of media stream, including but not limited to audio and metadata such as subtitles. Thus, a media stream may comprise any one of: one or more audio streams, one or more metadata streams, and one or more video streams.
Although the present invention has been illustrated by reference to specific embodiments, it will be apparent to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied with various changes and modifications without departing from the scope thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the scope of the claims are therefore intended to be embraced therein.
It will furthermore be understood by the reader of this patent application that the words “comprising” or “comprise” do not exclude other elements or steps, that the words “a” or “an” do not exclude a plurality, and that a single element, such as a computer system, a processor, or another integrated unit may fulfil the functions of several means recited in the claims. Any reference signs in the claims shall not be construed as limiting the respective claims concerned. The terms “first”, “second”, “third”, “a”, “b”, “c”, and the like, when used in the description or in the claims are introduced to distinguish between similar elements or steps and are not necessarily describing a sequential or chronological order. Similarly, the terms “top”, “bottom”, “over”, “under”, and the like are introduced for descriptive purposes and not necessarily to denote relative positions. It is to be understood that the terms so used are interchangeable under appropriate circumstances and embodiments of the invention are capable of operating according to the present invention in other sequences, or in orientations different from the one(s) described or illustrated above.
Number | Date | Country | Kind |
---|---|---|---|
20170744.5 | Apr 2020 | EP | regional |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2021/059467 | 4/12/2021 | WO | |