This invention relates to the field of multicast streaming, and in particular to the generating of a multicast stream comprising a plurality of chunks for synchronising with a unicast stream.
Currently live television delivered over IP networks uses one of two quite different networking technologies: one based on multicast and the other based on unicast. With multicast transmission, a single multicast stream carrying the content is pushed from a content server to multiple network nodes simultaneously, with those network nodes duplicating the content and forwarding to any subsequent nodes or clients as required. With unicast transmission, multiple streams of content are pulled from the server, one for each device consuming the content, typically using HTTP over TCP and Adaptive Bit Rate technology.
Multicast makes efficient use of the network when delivering the same content at the same time to many end devices, but often requires continual allocation of network resources regardless of the amount of viewing. In addition, many end devices, such as some tablets and smartphones, do not currently support multicast.
Unicast suffers from sending multiple copies of the same content through the network, but requires no usage-independent allocation of network resources. Moreover, unicast is capable of delivering to all end devices, even in the presence of low or variable network throughput to the end device, which is a frequent occurrence for devices connected by wireless technology for example.
US patent application 2013/0024582 describes a system and method for dynamically switching between unicast and multicast delivery of media content in response to changes in concurrent demand for access to the media content. Furthermore, sequence numbers included in the video frames are used to align between unicast and multicast stream content.
It is the aim of embodiments of the present invention to provide a method of generating multicast streams for carrying video content that supports improved switching to and from unicast streams.
According to one aspect of the present invention, there is provided a method of multicast video delivery comprising:
The first segment identifier may be a sequence number associated with a segment, wherein the value of the sequence number is different for different segments, and each transport protocol packet carrying a given segment is marked with the sequence number associated with that segment.
The method may further comprise marking each transport protocol packet with a second segment identifier, wherein the second identifier is an offset comprising a numerical value that is incremented with each transport protocol packet carrying a given segment, and is reset for the first packet of a new segment. The offset used to mark a given packet may indicate the total number of bytes of data carried in preceding packets for the given segment.
The segment identifier may be a transport protocol payload header field. The transport protocol may be a real time transport protocol.
The multicast stream may comprise the transport protocol packets encapsulated with the user datagram protocol in an IP packet.
Each of the segments may be carried in the form of a transport stream chunk, wherein each transport stream chunk comprises a plurality of transport stream packets.
Examples of the invention allow multicast and unicast to be used together to deliver live TV content more smoothly and effectively than using either technology alone. Switching between multicast and unicast is improved by marking chunk boundaries, which is done at the transport layer level, and thus avoids the need to inspect the video content itself, and the need to synchronise at the frame or group of picture level.
In alternative examples, a proxy is introduced in the path between the content server and the client, and allows for delivery of the content to that proxy by unicast or multicast. The proxy may be located in a router or hub. The choice of whether to use multicast or unicast can be made according to various factors, such as the network conditions, as well as the popularity of the content being viewed in terms of the total number of clients viewing the content. The proxy communicates with the content server to determine whether the content requested by the client is available by unicast and/or multicast. The proxy determines which is the most suitable form to use, based on its knowledge of such factors as the network throughput to the client, and in the case of selecting multicast delivery, performs the necessary functions, e.g. IGMP join, to receive the multicast stream, buffers it, and can then present it to the client as a unicast source. Doing this makes it possible to use multicast delivery to the proxy for popular content where unicast would make inefficient use of network capacity, while also allowing subsequent delivery from the proxy to clients by unicast if multicast is not supported by those clients.
For a better understanding of the present invention reference will now be made by way of example only to the accompanying drawings, in which:
The present invention is described herein with reference to particular examples. The invention is not, however, limited to such examples.
Examples of the present invention present a method of generating a multicast stream for transporting video content such as live TV. First, the video content is encoded, and segmented into temporal chunks. Each chunk is then encapsulated in one or more RTP packets, depending on the size of the chunk, and each RTP packet is marked with a chunk marker to indicate in which of the packets the boundaries between chunks lie. The multicast stream is then generated by encapsulating the RTP packets, preferably using UDP in IP packets. The chunk marker is provided by a special field in the RTP payload header. The chunk marker can be a chunk index or a chunk offset. Both, individually and in combination, can be used to determine the boundary between chunks.
The content server 104 also includes a mechanism for switching between unicast and multicast delivery methods, and for generating a multicast stream, during the delivery of any given encoded content, such as a TV program or film.
The content generator 102 and content server 104 are shown in more detail in
As shown in
Next in step 302, the encoded video stream and encoded audio stream (or each encoded video and audio stream if the content was encoded at multiple bit rates) are segmented by the segmentation module 210 into discrete video and audio segments or chunks. It is envisaged that each chunk is equivalent to between 2 and 15 seconds in duration of the uncompressed video/audio, though longer or shorter durations could be used. Whilst the segmentation module 210 is shown as operating after the encoders 206 and 208, the segmenting can be performed on the uncompressed video and audio streams prior to their encoding. Thus, the uncompressed video and audio can first be segmented, and then the resulting uncompressed segments can be encoded to generate the encoded video and audio segments.
The segmentation module 210 may select the segment duration taking into account service requirements. For example, shorter segments allow switching between streams to occur more quickly, whether between unicast and multicast streams or between different encoded bit rates. Longer segments, however, are more easily processed by system components, particularly by CDN (Content Delivery Network) nodes, but could cause slower switches between delivery modes and may introduce more end to end latency for live services.
In step 304, the video and audio segments are handled by the packaging module 212. In this example, the output of the packaging module 212 is in a so-called multiplexed format, such as the MPEG-2 Transport Stream as specified in IS 13818-1. MPEG-2 transport streams are often used for delivery of digital television in real time. The packaging module could also output in a so-called non-multiplexed format, such as the ISO Base Media File Format, as specified in IS 14496-12. MP4 fragments could also be output instead.
The MPEG-2 transport stream comprises a number of transport stream packets. Each transport stream packet carries 184 bytes of payload data, prefixed by a 4 byte header. The encoded video and audio segments are carried in the transport stream payloads, where each payload usually carries a single media type, for example audio, video or subtitle data. Typically, several transport stream packets will be required to carry each segment of audio and video. The precise number of transport stream packets required will depend on the duration of each segment of audio and video created by the segmentation module 210. The packaging module 212 will thus output multiple transport stream chunks to carry the respective video and audio segments, with each transport stream chunk comprising one or more transport stream packets. If MP4 fragments are used, then several MP4 fragments might be used to carry the same segments.
A person skilled in the art will appreciate that the functions performed by the encoders, segmentation module and packaging module can be performed by a single, suitably configured, general video encoder module.
The transport stream chunks are passed to the output interface 214, where they are in turn delivered to the content server 104 in step 306.
In addition, the content generator 102 also generates a manifest file, which describes the encoded content (the transport stream chunks in this example), and how it can be obtained, and also passes this to the content server 104. Under MPEG-DASH, the manifest is referred to as an MPD, Media Presentation Description. Apple's adaptive video streaming technology, HLS (HTTP Live Streaming), provides a manifest in the form of a playlist file (.m3u file).
As will be described later, the manifest file can be modified by the content server in an example of the invention for signalling a switch to multicast from unicast. The manifest file describes the available bit rates for each transport stream chunk, and where each is located (an address of the location where the chunk is stored in the content server 104). The manifest file is used by a client for unicast streaming.
The content server 104 receives the encoded content, in the form of transport stream chunks, and any associated manifest file from the content generator 102 at an input interface 222. The content server 104 comprises the input interface 222, a data store 224 for storing video content, a multicast stream generator 230, switch logic 232, and an output interface 234. The data store 224 may form part of a standard web server, which is able to serve individual transport stream chunks in response to unicast requests via the output interface 234. Content provided by unicast is effectively “pulled” from the web server on request by clients.
The transport stream chunks and manifest file are passed from the input interface 222 to the data store 224, where they are stored. The data store 224 can store multiple manifest files 228, one for each distinct item of video, and video content 226 (in the form of transport stream chunks). As suggested earlier, there can be multiple versions of the same video content, each encoded at different bit rates, which are reflected in an associated manifest file.
The multicast stream generator 230 is responsible for generating multicast streams, which will typically carry multiple transport stream chunks. Multicast streams are “pushed” out to clients.
A client can initiate unicast streaming by first making a request to the content server 104 for the appropriate manifest file associated with the desired content. After receipt of the manifest file, the client can make specific requests for encoded chunks, the transport stream chunks, using the location information associated with each chunk found in the manifest. The requests take the form of HTTP requests for each chunk and are handled by the content server 104, and specifically the web server component. The transport stream chunks are packaged by the web server as standard TCP/IP packets and delivered to the client over the network. The delivery mechanism is thus a reliable one. The client can also request updated manifest files as required from the content server 104. The process will be described in more detail later.
The switch logic 232 in the content server 104 determines whether to make the transport stream chunks available for delivery by multicast as well as unicast, and when necessary, will instruct the multicast stream generator 230 to generate a multicast stream and to signal that the multicast stream is available. The latter can be done by a suitable update to the manifest file as will be described below.
The switch logic 232, for each encoded content stream, determines which, if any, of the encoded chunks are to be made available by multicast delivery as well as by unicast delivery. For example, the switch logic 232 may at one point in time determine that all content chunks for a given piece of content be made available only by unicast; and at a later point in time, it may determine that the content (or a specific stream encoded at one particular bit rate) should be made available additionally by multicast; and at an even later point in time, it may determine that all content chunks be again made available only by unicast.
The decision as to when to switch to multicast from unicast might be based on the number of clients requesting a particular piece of content. If the network only allows for a single multicast stream, then the most popular content might be selected for multicast delivery to reduce the overall bandwidth used in the network. However, it may not be quite that simple, as content can be encoded at different bit rates, and the rate that the client can handle might also vary, so the switching decision can be more complicated. It is thus important for the switching logic 232 to know at all times how many clients are receiving which content via unicast, and which via multicast, to be able to make an appropriate switching decision.
When the switch logic determines that some of the stored content should be made available via multicast delivery, the content server 104 modifies the manifest file to also indicate the possibility of multicast delivery and how to receive the multicast stream. The content server then transmits, in a multicast stream, the encoded content that the switch logic has indicated for multicast delivery.
In known systems, multicast streaming of video works by encoding the content and packaging the encoded content up into transport stream packets before using a delivery mechanism such as RTP over IP. However, this approach does not lend itself to switching to and from unicast carrying video, where there is a need for precise synchronisation between the encoded content to avoid disrupting the playback of the content.
In examples of this invention, the content has been divided into segments of predetermined duration, and carried in transport stream chunks, as described above. The transport stream chunks are then encapsulated using a transport protocol such as RTP (real time transport protocol). Specifically, the transport stream chunks are carried in RTP payloads, with the RTP packets being encapsulated using UDP (user datagram protocol) in an IP packet for multicast transmission.
To illustrate, if the content is a 1 Mbit/s video stream, and segmented into 2 second chunks, each chunk will contain 2 Mbits or 250 Kbytes. Thus, each chunk would be carried over about 190 RTP payloads each containing up to seven Transport Stream packets of 188 bytes.
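By way of illustration only, the arithmetic above can be reproduced with the following sketch. The function name is hypothetical, and the sketch assumes that the chunk size in bytes is the size of the transport stream data, that 1 Kbyte is 1000 bytes, and that every RTP payload carries the full seven transport stream packets.

```python
import math

TS_PACKET_BYTES = 188      # 4 byte header plus 184 bytes of payload
TS_PACKETS_PER_RTP = 7     # upper bound quoted in the text (assumed always reached)


def rtp_payloads_per_chunk(bit_rate_bps: int, chunk_seconds: int) -> int:
    """Approximate number of RTP payloads needed to carry one chunk."""
    chunk_bytes = math.ceil(bit_rate_bps * chunk_seconds / 8)
    ts_packets = math.ceil(chunk_bytes / TS_PACKET_BYTES)
    return math.ceil(ts_packets / TS_PACKETS_PER_RTP)


# 1 Mbit/s, 2 second chunks: 250,000 bytes -> 1330 TS packets -> 190 RTP payloads
print(rtp_payloads_per_chunk(1_000_000, 2))
```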
The format of the UDP header 424 is shown in more detail in
The format of the RTP header 422 is shown in more detail in
In an example of the invention, there is proposed the use of additional RTP headers to help identify chunk boundaries in the multicast stream. This is required by the receiving client to identify individual chunks, in order to enable switching to be made cleanly between the chunks delivered over unicast and those delivered over multicast. In an example of the invention, there is proposed using some additional marking to indicate which RTP payloads are carrying which chunks, and where the chunk boundaries lie. In practice, each transport stream chunk will be carried over many RTP payloads, and so chunk boundaries will occur after many RTP payloads (see above where an example of a 2 second chunk requires about 190 RTP payloads). In the simplest solution, the RTP packet that carries the end of a chunk can be marked to indicate the end of the chunk.
However, multicast delivery is usually performed using RTP/UDP, as in this example, and is therefore unreliable: some packets transmitted by the content server 104 may not be received by the client. Usually with multicast delivery though, a retransmission server is used to retransmit lost packets, as requested by the client, using reliable TCP transmission. Failures are still possible though, as losses in retransmission may result in lost multicast data being delivered to the client, but delivered too late to be usefully decoded.
Therefore, some resilience in the signalling of which packets a chunk ends in is required for the multicast stream, as the single packet marker might reside in one of the lost multicast packets. The solution proposed is to include additional information in each RTP packet of the multicast stream, giving information about the chunk number as well as the chunk boundary by using a modified header. The additional information can be carried in the RTP Payload Header Format.
In this application, the additional information includes two additional numerical parameters, a CHUNK_INDEX parameter and a CHUNK_OFFSET parameter. The CHUNK_INDEX parameter and CHUNK_OFFSET parameter are both shown in the RTP payload header format of
The CHUNK_INDEX parameter is a sequence number that identifies which chunks are being carried in which packets, and also indicates chunk boundaries. The CHUNK_INDEX is also used to match chunks in the multicast stream with the chunks in an associated unicast stream.
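A minimal sketch of how the two parameters might be serialised ahead of the transport stream data is given below. The field widths and ordering are not taken from this description: the sketch assumes, purely for illustration, two unsigned 32 bit fields in network byte order, and the function names are hypothetical.

```python
import struct

# Assumed layout (illustrative only): CHUNK_INDEX then CHUNK_OFFSET,
# each an unsigned 32 bit integer in network byte order.
_CHUNK_HEADER = struct.Struct("!II")


def pack_chunk_header(chunk_index: int, chunk_offset: int) -> bytes:
    """Serialise the two chunk parameters into a payload header."""
    return _CHUNK_HEADER.pack(chunk_index, chunk_offset)


def unpack_chunk_header(payload: bytes) -> tuple[int, int, bytes]:
    """Return (CHUNK_INDEX, CHUNK_OFFSET, remaining payload bytes)."""
    chunk_index, chunk_offset = _CHUNK_HEADER.unpack_from(payload)
    return chunk_index, chunk_offset, payload[_CHUNK_HEADER.size:]
```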
In unicast, chunks are associated, in the manifest file, with a URL to access the file, but also in some cases, are additionally associated with a numerical parameter, for example the EXT-X-MEDIA-SEQUENCE parameter used by Apple HLS. In this invention, each unicast chunk is associated with a numerical parameter determined by analysis of the manifest file. This numerical parameter is equal to the explicit numerical value in the manifest file, derived for example from the EXT-X-MEDIA-SEQUENCE parameter, if an explicit numerical value is present. Otherwise this numerical parameter is derived from the URL of the chunk, this numerical parameter being equal to the numerical file name suffix part of the URL of the chunk, where the URL in its entirety consists of the concatenation of a file path, a root file name, and a numerical file name suffix.
This numerical value associated with a chunk corresponds in a one to one fashion with the value of the CHUNK_INDEX parameter associated with the chunk when it is transmitted by multicast. One example of such a one to one mapping is to use the numerical value as the value of CHUNK_INDEX.
The following is an example HLS manifest. EXT-X-MEDIA-SEQUENCE indicates the value associated with the first chunk in the file (2680), and thus the remaining values are derived from this first value (2681 and 2682). Note that these values are consistent with the values that can be derived from the numerical suffix of the corresponding file (which in the example below are also 2680, 2681 and 2682):
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:8
#EXT-X-MEDIA-SEQUENCE:2680
#EXTINF:7.975,
https://priv.example.com/fileSequence2680.ts
#EXTINF:7.941,
https://priv.example.com/fileSequence2681.ts
#EXTINF:7.975,
https://priv.example.com/fileSequence2682.ts
Thus, this numerical value acts as a sequence number of sorts in unicast, and in the multicast stream, assigning the CHUNK_INDEX value also follows the same convention, with a packet carrying a chunk or part of a chunk being assigned a CHUNK_INDEX equal to the sequence number assigned to the equivalent unicast chunk. The content server 104 marks the payload header of each packet with this CHUNK_INDEX.
To illustrate, if the chunk sequence number is 2680 in unicast, then all the packets used to carry that chunk are marked with a CHUNK_INDEX of 2680 for the multicast stream. Then when the next chunk, which has a sequence number of 2681 in unicast, is processed, the packets carrying that chunk have a CHUNK_INDEX of 2681.
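The derivation of the numerical parameter from a manifest, as described above, might be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, the parsing handles just the playlist lines used in the example manifest, and the fallback assumes the numerical suffix immediately precedes the file extension.

```python
import re


def chunk_indices(manifest: str) -> list[tuple[int, str]]:
    """Pair each chunk URL in an HLS playlist with its numerical parameter."""
    lines = [ln.strip() for ln in manifest.splitlines() if ln.strip()]
    urls = [ln for ln in lines if not ln.startswith("#")]
    explicit = next(
        (int(ln.split(":", 1)[1]) for ln in lines
         if ln.startswith("#EXT-X-MEDIA-SEQUENCE:")),
        None,
    )
    if explicit is not None:
        # An explicit value applies to the first chunk; later chunks count upwards.
        return [(explicit + i, url) for i, url in enumerate(urls)]
    # Otherwise use the numerical file name suffix, e.g. fileSequence2680.ts -> 2680.
    result = []
    for url in urls:
        match = re.search(r"(\d+)\.[^./]+$", url)
        if match is None:
            raise ValueError(f"no numerical suffix in {url!r}")
        result.append((int(match.group(1)), url))
    return result
```

Each value returned would then be used as the CHUNK_INDEX of the packets carrying the equivalent chunk in the multicast stream.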
Turning now to the CHUNK_OFFSET parameter. In a first example, the CHUNK_OFFSET parameter takes a numerical value that increases by one with each packet of a given chunk and is set to zero in the first packet of a new chunk. In this case, the CHUNK_OFFSET parameter can then be used to identify chunk boundaries, not only by identifying packets with the value zero as the first of a chunk, but also in the case of such a packet being lost, identifying a chunk boundary by a decrease in the value of CHUNK_OFFSET. To illustrate, the CHUNK_OFFSET for the first packet carrying a chunk can be set to 0, and then the second packet which is carrying part of the same chunk will have a CHUNK_OFFSET set to 1, and a third packet carrying the final part of the same chunk will have a CHUNK_OFFSET set to 2. Then the next packet after that, which is carrying a new chunk, will have the CHUNK_OFFSET reset to 0, or any value lower than 2. Thus, either a CHUNK_OFFSET parameter of 0 or simply a decrease from a previous CHUNK_OFFSET parameter signals the start of a new chunk.
In a second example, the CHUNK_OFFSET parameter can be used to indicate the total number of bytes of data in the payloads of all the preceding packets that carry a given chunk. The first packet of a chunk would therefore carry the value of 0, and subsequent packets would carry monotonically increasing values. As in the first example, content chunk boundaries can be identified by a CHUNK_OFFSET equal to zero, or by a decreased value of CHUNK_OFFSET.
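The two forms of CHUNK_OFFSET can be contrasted with a short sender-side sketch. The class and function names, and the default payload size of 1316 bytes (seven 188 byte transport stream packets), are assumptions made for the illustration rather than features of the described system.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class MarkedPacket:
    chunk_index: int
    chunk_offset: int
    payload: bytes


def mark_chunk(chunk_index: int, chunk: bytes, payload_size: int = 1316,
               byte_offsets: bool = False) -> Iterator[MarkedPacket]:
    """Split one chunk into payloads and attach CHUNK_INDEX and CHUNK_OFFSET.

    byte_offsets=False: offset counts packets (first example).
    byte_offsets=True:  offset is the number of bytes already sent (second example).
    """
    for n, start in enumerate(range(0, len(chunk), payload_size)):
        offset = start if byte_offsets else n
        yield MarkedPacket(chunk_index, offset, chunk[start:start + payload_size])
```

In both modes the first packet of every chunk carries the value 0, so a receiver can apply the same boundary test regardless of which form is in use.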
The use of the CHUNK_INDEX with the CHUNK_OFFSET parameter addresses the unlikely problem of losing precisely the number of packets that carry a single chunk, which would mean that the CHUNK_OFFSET parameter alone would still increment as expected. The CHUNK_INDEX acts as a sequence number for the chunk, and would highlight missing chunks, as well as providing synchronisation with the unicast stream chunks.
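On the receiving side, the combined test might be sketched as below. The function name is hypothetical; each argument is a (CHUNK_INDEX, CHUNK_OFFSET) pair taken from consecutive received packets.

```python
from typing import Optional


def boundary_before(prev: Optional[tuple[int, int]], cur: tuple[int, int]) -> bool:
    """Return True if a chunk boundary lies between packets prev and cur."""
    cur_index, cur_offset = cur
    if prev is None:
        # Nothing received yet: only an offset of zero proves a chunk start.
        return cur_offset == 0
    prev_index, prev_offset = prev
    return (
        cur_index != prev_index      # covers a lost run that hides the offset reset
        or cur_offset == 0           # first packet of a chunk
        or cur_offset < prev_offset  # first packet lost, offset still decreased
    )
```

The first condition addresses the case described above, in which the lost packets are such that CHUNK_OFFSET alone continues to increase as expected.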
In the example where the CHUNK_OFFSET indicates the total bytes of data carried in the payloads of the preceding packets for a given chunk, additional benefits are realised, in particular if the multicast stream is used to deliver encoded content in the ISO Base Media File Format and there isn't a retransmission service for packets lost during multicast transmission, or if the retransmission fails. For transport stream based packets, any lost packets can be handled by the client by seeking to the start of the next transport stream chunk using the CHUNK_INDEX for example. However, if the content is in ISO Base Media File Format, then this is not so simple, as the encoded video content is packed and requires an index table with offset values relative to the start of the chunk to unpack it. Thus, if some data is lost, then unless the amount of lost data is known, the data following the lost data cannot be used as the offset values are no longer valid. By setting the CHUNK_OFFSET parameter to indicate the number of bytes to date of content relating to a chunk, the loss of a packet does not result in an unknown amount of lost information, but rather the exact amount of lost information can be deduced, and the offsets in the index table remain usable for processing the subsequent packets of the content chunk.
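One way a client might exploit the byte-count form of CHUNK_OFFSET is sketched below. Padding the lost byte ranges with filler, so that later data stays at the position the index table expects, is an assumption of this sketch and not a step stated in the description; the function names are hypothetical. Loss of the final packets of a chunk cannot be detected from the offsets alone.

```python
def lost_bytes(prev_offset: int, prev_len: int, cur_offset: int) -> int:
    """Bytes missing between two received packets of the same chunk."""
    expected = prev_offset + prev_len
    if cur_offset < expected:
        raise ValueError("offsets do not belong to the same chunk")
    return cur_offset - expected


def reassemble(packets: list[tuple[int, bytes]], fill: bytes = b"\x00") -> bytes:
    """Rebuild a chunk from (CHUNK_OFFSET, payload) pairs, padding lost ranges."""
    out = bytearray()
    for offset, payload in packets:
        if offset < len(out):
            raise ValueError("overlapping or out-of-order packet")
        out.extend(fill * (offset - len(out)))  # keep later data at its true offset
        out.extend(payload)
    return bytes(out)
```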
The marking and generation of the IP packets for multicast transmission is handled by the multicast stream generator 230, and performed in step 308. The resulting multicast stream is output via the output interface 234, where it can be delivered to the network.
Marking the chunks of video at the transport level ensures the system is tolerant of any changes in video specifications. For example, chunk boundaries can still be determined using this method even if a new video/audio format is used.
More generally, marking chunk boundaries at the transport level avoids the need to process deeper into the chunk data, and thus requires no knowledge of video and audio bitstream specifications, and requires no knowledge of the transport container format, such as the MPEG-2 Transport Stream. It therefore supports additional and new video and audio formats. In addition, in the case that audio and/or video are encrypted, switching between unicast and multicast delivery can be performed seamlessly without the need for the client, or other processing device, to have knowledge of the decryption key.
The process of initiating a unicast stream, and then switching over to a multicast stream will now be explored with reference to one of the clients.
Processing starts with the client making an initial request for the manifest file associated with the content from the content server 104. The content server 104 returns the manifest file, which contains information identifying the location, in the data store 224, of the encoded content.
The client then starts requesting encoded content chunks via unicast in the form of HTTP requests for specific chunks as set out in the manifest from the content server 104, or more specifically the data store 224 (or web server). Thus, the client effectively pulls the content from the web server hosting the encoded content. The chunks requested are the individual transport stream chunks in this example.
The client may also make regular requests for an updated manifest from the content server 104. The content server 104 can update the manifest file associated with any given content as it receives further transport stream chunks for that content. An updated manifest is created to reflect these additional chunks received from the content generator 102, and provided to the client when requested.
After a while, the switch logic may decide to make the content currently being retrieved by unicast also available by multicast. Note that the content will remain available for unicast from the data store 224, as there may be clients that are not able to or configured to receive multicast. The content server 104 updates the manifest with an indicator of a switch to multicast. In the case of a .m3u8 manifest file, the indicator could be of the form:
Where EXT-X-SWITCH indicates there is a switch of some kind, and udp://239.1.2.3:4321 indicates that it is multicast, giving the multicast address 239.1.2.3, port number 4321.
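A client might extract the multicast address and port from the udp:// part of the indicator along the following lines. The sketch handles only the URI itself, since the full syntax of the indicator line is not reproduced here, and the function name is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def parse_multicast_uri(uri: str) -> tuple[str, int]:
    """Return (group address, port) from a URI such as udp://239.1.2.3:4321."""
    parts = urlsplit(uri)
    if parts.scheme != "udp" or parts.hostname is None or parts.port is None:
        raise ValueError(f"not a udp://address:port URI: {uri!r}")
    if not ipaddress.ip_address(parts.hostname).is_multicast:
        raise ValueError(f"{parts.hostname} is not a multicast address")
    return parts.hostname, parts.port
```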
At the same time, the multicast stream generator 230 will start generating a multicast stream as described above with special transport layer packet headers identifying chunk boundaries. Based on this indicator in the manifest above, the resulting multicast stream is output by the output interface on port 4321, with address 239.1.2.3.
The client will in time request this updated manifest including the switch indicator. However, if it is important to signal the switch to multicast immediately, then as soon as the manifest has been updated, the content server 104 can include an Event Message in the content chunks being delivered over unicast to signal an update to the manifest. The client can then make a request for the updated manifest.
Event messages for MP4 files are defined in ISO/IEC 23009-1, and are carried in the Event Message box (‘emsg’).
Event messages for Transport streams are defined in ISO/IEC 13818-1:2013 Amd.4, where it is defined that Transport Packets with PID value 0x0004 are used for carriage of adaptive streaming information data, the payload format of which is the same as for MP4 files and is therefore also specified in ISO/IEC 23009-1.
Upon reading the updated manifest file, the client will know that a multicast group is available, and attempt to join it by issuing an IGMP join request.
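On a typical host the IGMP join is not sent by the application directly but is triggered by a socket option. A minimal IPv4 sketch is given below, with hypothetical function names, on the assumption that the client uses the standard sockets interface and lets the operating system choose the receiving interface.

```python
import socket
import struct


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq structure passed to IP_ADD_MEMBERSHIP."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def join_multicast(group: str, port: int) -> socket.socket:
    """Open a UDP socket and join the multicast group on the given port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Setting this option causes the host IP stack to issue the IGMP join.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock
```

With the address and port from the example indicator, the client would call join_multicast("239.1.2.3", 4321) and then read RTP packets from the returned socket.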
The client will have read and know the chunk sequence number or index of the current unicast chunk that has been delivered, and will inspect the now flowing multicast stream for the CHUNK_INDEX parameter to identify the subsequent chunk(s) to be delivered from this source.
When the client first joins a multicast stream, the first data that it receives may not be that from the start of a chunk. The client needs to identify a point in the multicast stream that corresponds to a point in the unicast data that it has already received. One such point is the start of a chunk, identified in this invention, as described above, by observing either a reduction in the value of the CHUNK_OFFSET parameter or a change in the CHUNK_INDEX parameter. The client identifies the same point in the unicast data that it has received by a similar change in the numerical parameter associated with the unicast chunks to the change in value of the CHUNK_INDEX parameter in the multicast. The client processes unicast chunks up to the identified point, and then processes multicast chunks from that same point onwards.
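The alignment step can be sketched as follows, with hypothetical function names. The sketch assumes that the numerical parameter increases by one from chunk to chunk, as in the HLS example, and takes as input the (CHUNK_INDEX, CHUNK_OFFSET) pairs of the multicast packets in order of arrival.

```python
from typing import Iterable, Optional


def switch_index(headers: Iterable[tuple[int, int]]) -> Optional[int]:
    """Return the CHUNK_INDEX of the first chunk boundary seen, or None."""
    prev = None
    for index, offset in headers:
        if prev is None:
            if offset == 0:
                return index  # joined exactly at the start of a chunk
        elif index != prev[0] or offset < prev[1]:
            return index      # index changed or offset decreased: new chunk
        prev = (index, offset)
    return None


def source_for(chunk_index: int, switch_at: int) -> str:
    """Chunks before the switch point come from unicast, the rest from multicast."""
    return "unicast" if chunk_index < switch_at else "multicast"
```

If the client joins part way through chunk 2680, it continues to obtain 2680 by unicast and takes 2681 onwards from the multicast stream.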
A parameter can be used in the multicast stream to indicate that the multicast stream is about to terminate and that a request for the manifest for the unicast stream should be initiated.
One way to signal that multicast delivery will soon become unavailable is to signal this in the RTP payload header. This could be signalled as a one bit flag in each RTP packet, which when set to ‘1’ indicates that this content chunk is the last to be delivered by multicast; or it could be signalled as a multiple bit numerical value indicating the number of chunks, including the current one, that will be delivered by multicast, with the value of zero indicating that the end of multicast delivery is not imminent.
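The multiple bit variant might be populated by the server as in the sketch below. The function name and the warn_chunks parameter, which controls how many chunks in advance the end is announced, are assumptions introduced for the illustration.

```python
from typing import Optional


def countdown_field(chunk_index: int, last_multicast_index: Optional[int],
                    warn_chunks: int = 3) -> int:
    """Value placed in every packet of chunk_index.

    0 means the end of multicast delivery is not imminent; otherwise the value
    is the number of chunks, including the current one, still to be delivered.
    """
    if last_multicast_index is None:
        return 0
    remaining = last_multicast_index - chunk_index + 1
    if remaining < 1:
        raise ValueError("chunk lies beyond the end of multicast delivery")
    return remaining if remaining <= warn_chunks else 0
```

A value of 1 corresponds to the one bit flag being set, since it marks the current chunk as the last to be delivered by multicast.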
When the client is receiving content over unicast, it makes regular HTTP GET requests to the content server for manifest updates. These requests can be captured by the switch logic via HTTP logs, and used in helping determine whether and when to switch between unicast and multicast. However, when delivering via multicast, the client does not make regular requests, as all the information needed by the client to switch back to unicast is embedded as marker packets in the multicast stream.
Therefore, in a further example of the invention, the client is configured to make regular HTTP ‘HEAD’ requests to the content server 104 for updates to the manifest file. A HEAD request generally returns metadata associated with a requested file, in this case the manifest file. Whilst the manifest file is not actually needed during a multicast stream, forcing the client to make HEAD requests at regular intervals whilst receiving a multicast stream provides feedback to the content server 104 that the client is actively receiving the multicast stream. Thus, the switch logic is able to determine how many clients in the network are actively receiving any given content over a multicast channel.
By comparing the number of HEAD requests with the number of GET requests (made by unicast receiving clients for requesting specific chunks of content) at the content server 104, the switching logic 232 can determine at any time how many clients are receiving which content, and whether using multicast or unicast. In the light of this knowledge the switching logic 232 can make the appropriate choice of multicast or unicast for a particular piece of content.
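The comparison might be carried out over one reporting window of the HTTP log as sketched below. The log record shape, a (client, method, content) triple, and the function name are assumptions of the sketch; mapping a request path to a content identifier is left to the log parser.

```python
from collections import defaultdict
from typing import Iterable


def count_receivers(
    entries: Iterable[tuple[str, str, str]],
) -> dict[str, dict[str, int]]:
    """Count distinct clients per content item, split by delivery method."""
    unicast = defaultdict(set)
    multicast = defaultdict(set)
    for client, method, content in entries:
        if method == "GET":        # unicast clients pull chunks
            unicast[content].add(client)
        elif method == "HEAD":     # multicast clients only report in
            multicast[content].add(client)
    return {
        content: {
            "unicast": len(unicast.get(content, ())),
            "multicast": len(multicast.get(content, ())),
        }
        for content in sorted(set(unicast) | set(multicast))
    }
```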
Forcing multicast-receiving clients to send HEAD requests instead of GET requests allows the content server to easily distinguish between feedback from unicast (GET) and feedback from multicast (HEAD).
Furthermore, another major advantage of this approach is that it is independent of any lower level multicast logic. For example, even if one were to count the IGMP joins for a multicast stream, there is no way of telling which clients are still consuming the content. Clients may explicitly leave the multicast group, but they may also simply stop listening. This approach provides a solution.
Whilst the above examples have been described in relation to streaming content directly to a client using unicast or multicast, an alternative example proposes use of a client proxy. A client proxy might reside in a suitably configured router or hub local to the client, which can provide a proxy service to more than one client. The primary purpose of a client proxy is to receive content chunk data by multicast, store it locally, and advertise it to the clients as being available by unicast from the client proxy. This would enable multicast delivery to be used to deliver to the client proxy, obtaining the network efficiency benefits of multicast delivery, and enables eventual delivery to clients that might not support multicast and/or are connected to the proxy using a technology that is not well suited to deliver data by multicast (such as WiFi).
In general, it is noted herein that while the above describes examples of the invention, there are several variations and modifications which may be made to the described examples without departing from the scope of the present invention as defined in the appended claims. One skilled in the art will recognise modifications to the described examples.
Number | Date | Country | Kind |
---|---|---|---|
14250065.1 | Mar 2014 | EP | regional |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/GB2015/050872 | 3/24/2015 | WO | 00 |