Systems and Methods for Encoding Video Content

Abstract
Systems and methods for encoding a plurality of alternative streams of video content using multiple encoders in accordance with embodiments of the invention are disclosed. An encoding system includes multiple encoders. Each of the encoders receives a source stream of video content that is divided into portions. Each of the encoders generates portions of the plurality of alternative streams from the portions of the source stream. The portions of the alternative streams generated by a particular encoder are stored in a container for the particular encoder. Each encoder also generates index information for the portion of the alternative stream generated by the encoder that is stored in a manifest for the encoder.
Description
FIELD OF THE INVENTION

The present invention generally relates to adaptive streaming and more specifically to systems that encode video data into streams having different maximum bitrates and playback devices that use the streams to obtain encoded video content from the encoded streams.


BACKGROUND

The term streaming media describes the playback of media on a playback device, where the media is stored on a server and continuously sent to the playback device over a network during playback. Typically, the playback device stores a sufficient quantity of media in a buffer at any given time during playback to prevent disruption of playback due to the playback device completing playback of all the buffered media prior to receipt of the next portion of media. Adaptive bit rate streaming or adaptive streaming involves detecting the present streaming conditions (e.g. the user's network bandwidth and CPU capacity) in real time and adjusting the quality of the streamed media accordingly. Typically, the source media is encoded at multiple bit rates and the playback device or client switches between streaming the different encodings depending on available resources.


Adaptive streaming solutions typically utilize either Hypertext Transfer Protocol (HTTP), published by the Internet Engineering Task Force and the World Wide Web Consortium as RFC 2616, or Real Time Streaming Protocol (RTSP), published by the Internet Engineering Task Force as RFC 2326, to stream media between a server and a playback device. HTTP is a stateless protocol that enables a playback device to request a byte range within a file. HTTP is described as stateless, because the server is not required to record information concerning the state of the playback device requesting information or the byte ranges requested by the playback device in order to respond to requests received from the playback device. RTSP is a network control protocol used to control streaming media servers. Playback devices issue control commands, such as “play” and “pause”, to the server streaming the media to control the playback of media files. When RTSP is utilized, the media server records the state of each client device and determines the media to stream based upon the instructions received from the client devices and the client's state.
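
As a minimal illustration of the byte-range requests described above, the following Python sketch builds (but does not send) an HTTP request carrying a Range header. The helper name build_range_request and the example URL are hypothetical and are used only for illustration; the request is self-describing, so a server needs no stored client state to answer it.

```python
import urllib.request


def build_range_request(url: str, first_byte: int, last_byte: int) -> urllib.request.Request:
    """Build (without sending) an HTTP GET request for an inclusive byte range."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("invalid byte range")
    # The Range header names the bytes wanted; nothing else about the
    # client needs to be remembered by the server between requests.
    headers = {"Range": f"bytes={first_byte}-{last_byte}"}
    return urllib.request.Request(url, headers=headers)


# Hypothetical container file URL, for illustration only.
request = build_range_request("http://example.com/stream.mkv", 0, 1023)
```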


In adaptive streaming systems, the source media is typically stored on a media server as a top level index file or manifest pointing to a number of alternate streams that contain the actual video and audio data. Each stream is typically stored in one or more container files. Different adaptive streaming solutions typically utilize different index and media containers. The Synchronized Multimedia Integration Language (SMIL) developed by the World Wide Web Consortium is utilized to create indexes in several adaptive streaming solutions including IIS Smooth Streaming developed by Microsoft Corporation of Redmond, Washington, and Flash Dynamic Streaming developed by Adobe Systems Incorporated of San Jose, California. HTTP Adaptive Bitrate Streaming developed by Apple Computer Incorporated of Cupertino, California implements index files using an extended M3U playlist file (.M3U8), which is a text file containing a list of URIs that typically identify a media container file. The most commonly used media container formats are the MP4 container format specified in MPEG-4 Part 14 (i.e. ISO/IEC 14496-14) and the MPEG transport stream (TS) container specified in MPEG-2 Part 1 (i.e. ISO/IEC Standard 13818-1). The MP4 container format is utilized in IIS Smooth Streaming and Flash Dynamic Streaming. The TS container is used in HTTP Adaptive Bitrate Streaming.


The Matroska container is a media container developed as an open standard project by the Matroska non-profit organization of Aussonne, France. The Matroska container is based upon Extensible Binary Meta Language (EBML), which is a binary derivative of the Extensible Markup Language (XML). Decoding of the Matroska container is supported by many consumer electronics (CE) devices. The DivX Plus file format developed by DivX, LLC of San Diego, California utilizes an extension of the Matroska container format (i.e. is based upon the Matroska container format, but includes elements that are not specified within the Matroska format).


To provide a consistent means for the delivery of media content over the Internet, the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) have put forth the Dynamic Adaptive Streaming over HTTP (DASH) standard. The DASH standard specifies formats for the media content and the description of the content for delivery of MPEG content using HTTP. In accordance with DASH, each component of media content for a presentation is stored in one or more streams. Each of the streams is divided into segments. A Media Presentation Description (MPD) is a data structure that includes information about the segments in each of the streams and other information needed to present the media content during playback. A playback device uses the MPD to obtain the components of the media content using adaptive bit rate streaming for playback.


As the speed at which streaming content can be delivered has improved, streaming of live events, such as sporting events and concerts, has become popular. However, encoding the video content from a live event into streams for adaptive bitrate streaming is a problem. To do so, encoder server systems typically use hardware encoders that are specifically designed to encode the video content into the various streams. These specialized encoders are expensive to obtain. Thus, those skilled in the art are constantly striving to find lower cost alternatives to the specialized encoders.


SUMMARY OF THE INVENTION

Systems and methods for encoding video content into multiple streams having different maximum bitrates and obtaining the video content using playback devices in accordance with some embodiments of the invention are disclosed. The process in accordance with many embodiments is performed in the following manner. Each encoder in an encoding system receives portions of a source stream of video content from a content provider system. Each of the encoders encodes a portion of the alternative streams using the portions of the source stream received by that encoder. The portions of the alternative streams encoded by each particular one of the encoders are stored in a container for the particular one of the encoders. Each of the encoders then generates index information for the portions of the alternative streams generated by that encoder and stores the index information in a manifest for the portion of the alternative streams generated by the particular encoder.


In accordance with some embodiments of the invention, the portion of the alternative streams encoded by each of the encoders is one of the alternative streams and a stream is generated by each encoder in the following manner. The encoder receives each portion of the source stream and encodes a segment of an alternative stream from each portion of the source stream to generate the segments of the alternative stream. In accordance with a number of these embodiments, each particular encoder has a particular set of parameters for generating a stream. Each encoder also generates index information for each of the segments generated by the particular encoder and stores the index information in a manifest for the particular encoder. In accordance with several of these embodiments, the alternative stream generated by a particular encoder has a particular maximum bitrate as a parameter. In accordance with still further ones of these embodiments, at least two of the alternative streams generated by different ones of the encoders have the same maximum bitrate and at least one other parameter that is different. In accordance with many of these embodiments, the at least one other parameter is selected from a group of parameters consisting of aspect ratio, frame rate, and resolution.


In accordance with some embodiments of the invention, the system includes N encoders where N is an integer and each of the N encoders encodes 1/N of the portions of the source stream into segments of each of the alternative streams. In accordance with some embodiments, the encoding of 1/N of the portions of the source stream into segments in each of the alternative streams is performed in the following manner. Each encoder is assigned an Mth encoding order where M is an integer from 1 to N. Each encoder determines the Mth portion of the source stream received and every Nth portion received thereafter from the source stream as a set of portions of the source stream for the Mth encoder to encode. The encoding of a portion includes encoding the portion into a segment in each one of the alternative streams. The generating of the index information includes generating index information for each of the segments generated for each of the portions in each of the alternative streams and storing the index information for each of the segments generated from each portion in the set of portions in a manifest for the Mth encoder. In accordance with some of these embodiments, each Mth encoder discards each of the portions that is not in the set of portions for the Mth encoder.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 illustrates a network diagram of an adaptive bitrate streaming system in accordance with an embodiment of the invention.



FIG. 2 illustrates a block diagram of components of an encoding server system in accordance with an embodiment of the invention.



FIG. 3 illustrates a block diagram of components of a processing system in an encoder server system that encodes the video content into streams having different maximum bitrates in accordance with an embodiment of the invention.



FIG. 4 illustrates a block diagram of components of a processing system in a playback device that uses the encoded streams having different maximum bitrates to obtain the video content via adaptive bitrate streaming in accordance with an embodiment of the invention.



FIG. 5 illustrates a flow diagram for a process performed by each encoder in an encoder server system to encode video content into one of the streams used in an adaptive bitrate streaming system in accordance with an embodiment of the invention.



FIG. 6 illustrates a flow diagram for a process performed by each of N encoders in an encoder server system to encode each Nth segment of the video content in accordance with an embodiment of the invention.



FIG. 7 illustrates a flow diagram of a process performed by a playback device to obtain the manifest information for the streams and use the streams to obtain the video content using an adaptive bitrate system in accordance with an embodiment of the invention.





DETAILED DISCLOSURE OF THE INVENTION

Turning now to the drawings, systems and methods for encoding video content into streams for adaptive bitrate streaming and obtaining the streams using a playback device in accordance with some embodiments of the invention are illustrated. In accordance with some embodiments of this invention, an encoding system includes more than one encoder. In accordance with some of these embodiments, the encoders may be provided by software executed by a processing system in the encoding system. In accordance with many embodiments, the encoders may be provided by firmware in the encoding system. In accordance with a number of embodiments, the encoders are provided by hardware in the server system.


The encoding system receives a source stream of video content from a source that includes an embedded timestamp. In accordance with some embodiments, the video content is a live feed being recorded in real-time. In accordance with some of these embodiments, the source stream of video content includes a timestamp in accordance with universal time.


In accordance with some embodiments, each encoder is used to generate a single stream of a set of streams to be used for adaptive bitrate streaming of content. In accordance with some of these embodiments, all of the encoders begin receiving portions of the source stream of video content and are synchronized using the embedded timestamp within the received portions. As each encoder receives portions of the source stream of the video content from the source system, the encoders encode the received portions of the source stream of video content into segments of a stream having predefined parameters particular to each encoder. In accordance with some embodiments, the stream produced by each encoder has a different maximum bitrate (or different target average bitrate) than the streams being generated by the other encoders. In accordance with some other embodiments, other parameters including, but not limited to, aspect ratio, resolution, and frame rate may be varied in the stream being generated by the various encoders.


Each encoder stores the generated portions in one or more container files for the generated stream in accordance with some embodiments of the invention. The encoder also generates index or manifest information for each of the generated portions of the streams and adds the generated index or manifest information to an index file or manifest in accordance with many embodiments of the invention. The process is repeated until the end of the source stream is received.


In accordance with some other embodiments, the encoding system includes a number of encoders (N) and each encoder encodes a portion (e.g. 1/N) of the source stream multiple times using different sets of encoding parameters to create segments for each of the streams in an adaptation set of streams. In accordance with some of these embodiments, each encoder is assigned a position in the processing order. Each encoder then begins to receive the source stream of the video content. As portions of the source stream are received by each of the encoders, the encoder determines whether a portion is an Nth segment of the source stream assigned to the encoder. If a portion is not an Nth segment, the encoder discards the segment. If a portion is an Nth segment, the encoder encodes the received portion into segments in accordance with each of the profiles for the various streams in the set of streams and stores the segments in the container files for the appropriate streams. The encoder then generates index or manifest information for each of the generated segments and adds the information to an appropriate index file or manifest. In accordance with many of these embodiments, the index or manifest information is added to a manifest for segments produced by the encoder. In accordance with some other embodiments, the index or manifest information for each segment produced for the various streams is added to the index file or manifest maintained for the specific stream and/or stored in a database in memory for future use in generating a manifest file. The process is repeated by each of the encoders until each of the encoders receives the end of the source stream.


In accordance with some embodiments, the media content is stored in streams in accordance with the DASH standard. However, one skilled in the art will recognize that other formats such as, but not limited to, a Matroska (MKV) container file format may be used to store streams of the media content without departing from this invention.


The performance of an adaptive bitrate streaming system in accordance with some embodiments of the invention can be significantly enhanced by encoding each portion of the source video in each of the alternative streams in such a way that the segment of video is encoded in each stream as a single (or at least one) closed group of pictures (GOP) starting with an Instantaneous Decoder Refresh (IDR) frame, which is an intra frame. The playback device can switch between the alternative streams used during playback at the completion of the playback of a video segment and, irrespective of the stream from which a video segment is obtained, the first frame in the video segment will be an IDR frame that can be decoded without reference to any encoded media other than the encoded media contained within the video segment.


In a number of embodiments, the playback device obtains information concerning each of the available streams from the MPD and selects one or more streams to utilize in the playback of the media. The playback device can also request index information that indexes segments of the encoded video content stored within the relevant container files. The index information can be stored within the container files or separately from the container files in the MPD or in separate index files. The index information enables the playback device to request byte ranges corresponding to segments of the encoded video content within the container file containing specific portions of encoded video content via HTTP (or another appropriate stateful or stateless protocol) from the server. The playback device uses the index information to request segments of the video content from the alternative streams in accordance with some embodiments. Playback is continued with the playback device requesting segments of the encoded video content from a stream having video content that is encoded at a maximum bitrate that can be supported by the network conditions.
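
The stream selection described above can be sketched in Python as follows. The function name select_max_bitrate and the rule of falling back to the lowest bitrate when no stream fits are assumptions made for illustration; the described embodiments do not prescribe a particular selection rule.

```python
from typing import Sequence


def select_max_bitrate(available_bitrates: Sequence[int], measured_bandwidth: int) -> int:
    """Pick the highest stream bitrate that the measured bandwidth can support.

    Both arguments are in bits per second. If no stream fits, fall back to
    the lowest available bitrate (an assumed policy, not taken from the text).
    """
    if not available_bitrates:
        raise ValueError("no streams available")
    supported = [rate for rate in available_bitrates if rate <= measured_bandwidth]
    return max(supported) if supported else min(available_bitrates)


# Illustrative ladder of maximum bitrates (hypothetical values).
ladder = [500_000, 1_500_000, 4_000_000]
choice = select_max_bitrate(ladder, measured_bandwidth=2_000_000)
```

Because every segment in every alternative stream can begin with an IDR frame, such a choice can be re-evaluated at each segment boundary.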


In accordance with some embodiments of the invention, the playback device operates in the following manner to use the streams generated by the multiple encoders in the encoding system. The playback device requests the media content that includes the video content. In response to the request, the playback device receives the MPD or index file maintained and/or generated by each encoder. The playback device then uses embedded timestamps to join the MPD or index files from the various encoders into a combined adaptation set of index information. The playback device then uses the index information from the combined adaptation set to perform adaptive bitrate streaming to obtain the video content. In accordance with some other embodiments, the server generates an MPD from the MPD or index files generated by each encoder using the embedded timestamps and provides the MPD to the playback devices. The playback devices then use the MPD to perform adaptive bitrate streaming to obtain the video content.
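
One way to picture the joining step described above is as a merge of per-encoder index entries ordered by embedded timestamp. The Python sketch below assumes each manifest is a list of entries with the hypothetical fields start_time, stream_id, and url; these names are illustrative only and are not taken from the DASH standard or from the described embodiments.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class IndexEntry(NamedTuple):
    start_time: float  # embedded timestamp of the segment (hypothetical field)
    stream_id: str     # alternative stream the segment belongs to
    url: str           # location from which the segment can be requested


def join_manifests(manifests: Iterable[List[IndexEntry]]) -> Dict[str, List[IndexEntry]]:
    """Combine per-encoder manifests into one index per stream, in time order."""
    combined: Dict[str, List[IndexEntry]] = defaultdict(list)
    for manifest in manifests:
        for entry in manifest:
            combined[entry.stream_id].append(entry)
    # The embedded timestamps give a common ordering across encoders.
    for entries in combined.values():
        entries.sort(key=lambda entry: entry.start_time)
    return dict(combined)
```

The same merge could run on the server when the server, rather than the playback device, produces the combined MPD.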


The encoding of video content into multiple streams for use in adaptive bitrate streaming using multiple encoders and the process for obtaining the video content from the generated streams by a playback device using adaptive bitrate streaming in accordance with some embodiments of the invention is discussed further below.


Adaptive Streaming System Architecture

Turning now to FIG. 1, an adaptive streaming system that includes an encoding system that generates streams using multiple encoders in accordance with an embodiment of the invention is illustrated. The adaptive streaming system 10 includes a source encoding system 12 configured to encode source media content including video content as a number of alternative streams. In the illustrated embodiment, the source encoder is a single server. In other embodiments, the source encoder can be any processing device or group of processing devices including a processor and sufficient resources to perform the transcoding of source media (including but not limited to video, audio, and/or subtitles) into the alternative streams. Typically, the source encoding server 12 generates an MPD that includes an index indicating container files containing the streams and/or metadata information, at least a plurality of which are alternative streams. Alternative streams are streams that encode the same media content in different ways. In many instances, alternative streams encode media content (such as, but not limited to, video content and/or audio content) at different maximum bitrates. In a number of embodiments, the alternative streams of video content are encoded with different resolutions and/or at different frame rates. However, the source encoder system 12 uses multiple encoders to generate the alternative streams and each particular encoder generates an MPD for the segments of the stream or streams generated by the particular encoder. The MPDs generated by the various encoders and the container files are uploaded to an HTTP server 14. A variety of playback devices can then use HTTP or another appropriate stateless protocol to request portions of the MPDs, index files, and the container files via a network 16 such as the Internet.


In the illustrated embodiment, playback devices that can perform adaptive bitrate streaming using the MPDs from the various encoders include personal computers 18, CE players, and mobile phones 20. In accordance with some other embodiments, playback devices can include consumer electronics devices such as DVD players, Blu-ray players, televisions, set top boxes, video game consoles, tablets, virtual reality headsets, augmented reality headsets and other devices that are capable of connecting to a server via a communication protocol including (but not limited to) HTTP and playing back encoded media. Although a specific architecture is shown in FIG. 1, any of a variety of architectures including systems that perform conventional streaming and not adaptive bitrate streaming can be utilized that enable playback devices to request portions of the MPDs and the container files in accordance with various embodiments of the invention.


Encoder System

An encoder system that uses multiple encoders to encode video content into alternative streams for use in adaptive bitrate streaming in accordance with an embodiment of the invention is shown in FIG. 2. Encoding system 200 includes a router 205 and an encoding server 202 communicatively connected to router 205. One skilled in the art will recognize that any number of servers or processors may be connected to router 205 without departing from this invention and that only one server is shown for clarity and brevity. The encoding server includes multiple encoders 215-218. In accordance with some embodiments, each of the encoders 215-218 is an instantiation of software that is being executed by the processor from instructions stored in a memory to perform the decoding and/or encoding of the source content. In accordance with some other embodiments, one or more of encoders 215-218 is a particular hardware component in the server that encodes received content. In still other embodiments, one or more of the encoders may be a firmware component in which hardware and software are used to provide the encoder. The router provides an incoming source stream of video content to each of the encoders 215-218 of the server 210. In accordance with some embodiments, the router transmits a copy of the stream to each of the encoders. In accordance with some other embodiments, the server 210 receives the source stream and provides a copy of the incoming source stream to each of the encoders 215-218 as the source stream is received. The source stream includes embedded timing information.


Although a specific architecture of a server system is shown in FIG. 2, any of a variety of architectures including systems that encode video content from a received stream can be utilized in accordance with various embodiments of the invention.


Playback Device

Processes that provide the methods and systems for using the alternative streams generated by multiple encoders in accordance with some embodiments of this invention are executed by a playback device. The relevant components in a playback device that can perform the processes in accordance with an embodiment of the invention are shown in FIG. 3. One skilled in the art will recognize that playback devices may include other components that are omitted for brevity without departing from described embodiments of this invention. The playback device 300 includes a processor 305, a non-volatile memory 310, and a volatile memory 315. The processor 305 is a processor, microprocessor, controller, or a combination of processors, microprocessor, and/or controllers that performs instructions stored in the volatile memory 315 or non-volatile memory 310 to manipulate data stored in the memory. The non-volatile memory 310 can store the processor instructions utilized to configure the playback device 300 to perform processes including processes for using alternative streams encoded by multiple encoders to obtain video content using adaptive bit rate streaming in accordance with some embodiments of the invention. In accordance with various other embodiments, the playback device may have hardware and/or firmware that can include the instructions and/or perform these processes. In accordance with still other embodiments, the instructions for the processes can be stored in any of a variety of non-transitory computer readable media appropriate to a specific application.


Servers

Processes in a method and system of encoding video content into streams for adaptive bitrate streaming using multiple encoders in accordance with an embodiment of this invention are performed by an encoder such as an encoding server. The relevant components in an encoding server that perform these processes in accordance with an embodiment of the invention are shown in FIG. 4. One skilled in the art will recognize that a server may include other components that are omitted for brevity without departing from the described embodiments of this invention. The server 400 includes a processor 405, a non-volatile memory 410, and a volatile memory 415. The processor 405 is a processor, microprocessor, controller, or a combination of processors, microprocessors, and/or controllers that performs instructions stored in the volatile memory 415 or non-volatile memory 410 to manipulate data stored in the memory. The non-volatile memory 410 can store the processor instructions utilized to configure the server 400 to perform processes including processes for encoding media content and/or generating marker information in accordance with some embodiments of the invention and/or data for the processes being utilized. In accordance with various embodiments, these instructions may be in server software and/or firmware and can be stored in any of a variety of non-transitory computer readable media appropriate to a specific application. Although a specific server is illustrated in FIG. 4, any of a variety of servers configured to perform any number of processes can be utilized in accordance with various embodiments of the invention.


Encoding of Video Content Into Alternative Streams for Adaptive Bitrate Streaming Using Multiple Encoders in an Encoding System

In accordance with some embodiments, an encoding system encodes video content into alternative streams for adaptive bitrate streaming using multiple encoders. In accordance with some embodiments of the invention, the encoders are software encoders that are instantiations of software instructions read from a memory that can be performed or executed by a processor. Software encoders may be used when it is desirable to reduce the cost of the encoders and/or to improve the scalability of the system, as only processing and memory resources are needed to add additional encoders to the system. In accordance with many embodiments, one or more of the multiple encoders are hardware encoders. Hardware encoders are circuitry that is configured to perform the processes for encoding the received content into one or more streams. In accordance with a number of embodiments, one or more of the encoders may be firmware encoders. A firmware encoder combines some hardware components and some software processes to provide an encoder.


The video content may be received as a source stream from a content provider. In accordance with some embodiments, the video content is a live broadcast meaning the video content is being captured and streamed in real time. The video content may include time information. The time information may include, but is not limited to, a broadcast time, a presentation time and/or a time recorded. Each of the encoders receives the source stream of video content and generates portions of the alternative streams. In accordance with some embodiments, each of the multiple encoders produces a single stream having encoder specific parameters from the source stream. In accordance with some other embodiments, the encoding system includes a number of encoders (N) and each encoder encodes a portion (e.g. 1/N) of the source stream multiple times using different sets of encoding parameters to create segments for each of the streams in an adaptation set of streams. Processes for encoding alternative streams of video content from a source stream of video content using multiple encoders in accordance with some different embodiments of the invention are shown in FIGS. 5 and 6.


A flow chart of a process performed by at least one of a set of multiple encoders to generate a single stream of the alternative streams from the source stream of video content in accordance with an embodiment of the invention is shown in FIG. 5. In process 500, the encoder receives a portion of a source stream of video content that includes timing information (505). In accordance with some embodiments, the encoders may use time information received with the portion to determine at what point in the stream the encoder is to start encoding the stream. As the encoders are using the same timing information, the encoding performed by the encoders is synchronized such that the segments produced by each encoder include the same duration of video content to present in terms of presentation time and the segments are aligned. The encoder uses the portion of the source stream of video content to encode a segment of a stream of video content that has specified parameters particular to the encoder (510). In accordance with some embodiments, the specified parameters of the stream generated by each encoder include a different maximum bitrate.
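
The synchronization described above can be illustrated with a short Python sketch. It assumes, purely for illustration, that segment boundaries fall at whole multiples of a fixed segment duration on the shared timeline carried by the source stream; the function name first_aligned_boundary is hypothetical. Under that assumption, encoders that begin receiving the stream at different moments still start encoding at the same boundary.

```python
import math


def first_aligned_boundary(first_timestamp: float, segment_duration: float) -> float:
    """Return the earliest segment boundary at or after first_timestamp.

    Boundaries are assumed to lie at integer multiples of segment_duration
    on the timeline shared by all encoders.
    """
    if segment_duration <= 0:
        raise ValueError("segment_duration must be positive")
    return math.ceil(first_timestamp / segment_duration) * segment_duration


# Two encoders that join the stream at different times (hypothetical values)
# compute the same starting point, so their segments line up.
start_a = first_aligned_boundary(10.3, 2.0)
start_b = first_aligned_boundary(11.7, 2.0)
```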


In accordance with some other embodiments, the streams from two or more encoders have the same maximum bitrate and different aspect ratios, resolutions, and/or frame rates. The encoder also generates index or manifest information for the generated segment (515). The generated segment is stored in the container of the stream being generated by the encoder (520) and the index or manifest information is added to the manifest or index file for the stream stored in memory and/or delivered to client playback devices as an update (525). Process 500 repeats until the encoder receives the end of the stream and/or reception of the stream is halted in some other manner (530).
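
The steps of process 500 can be sketched in Python as a simple loop. The encode callable stands in for whatever encoder is actually used, and the manifest fields (segment, offset, length) are hypothetical names chosen for illustration; the reference numerals in the comments map each statement to the corresponding step described above.

```python
from typing import Callable, Dict, Iterable, List


def run_single_stream_encoder(
    portions: Iterable[bytes],
    encode: Callable[[bytes], bytes],
) -> Dict[str, list]:
    """Encode every received portion into one alternative stream."""
    container: List[bytes] = []  # stands in for the stream's container file
    manifest: List[dict] = []    # stands in for the manifest or index file
    offset = 0
    for number, portion in enumerate(portions, start=1):  # receive portion (505)
        segment = encode(portion)                         # encode segment (510)
        entry = {"segment": number, "offset": offset, "length": len(segment)}  # (515)
        container.append(segment)                         # store segment (520)
        manifest.append(entry)                            # update manifest (525)
        offset += len(segment)
    # The loop ends when the source stream ends (530).
    return {"container": container, "manifest": manifest}
```

Each encoder would run this loop with its own encode callable configured for that encoder's maximum bitrate or other parameters.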


Although various examples of processes performed by an encoder for encoding one of a set of alternative streams of video content are described above, one skilled in the art will recognize that other processes for encoding the streams may be performed in accordance with some embodiments of the invention.


In accordance with some other embodiments of the invention, the encoding system includes a number of encoders (N) and each encoder encodes a portion (e.g. 1/N) of the source stream multiple times using different sets of encoding parameters to create segments for each of the streams in an adaptation set of streams. Each encoder is assigned an encoder order position M where M is a number between 1 and N. The first encoder encodes the first portion of the source stream and every Nth portion received thereafter into segments of each of the alternative streams. The second encoder encodes the second received portion of the source stream and every Nth portion received thereafter into segments of each of the alternative streams. Likewise, the remaining encoders in the encoding order 1 through N encode the Mth received portion and every Nth portion received thereafter into segments of the various alternative streams where M is the encoding order position. Thus, each encoder only encodes 1/N of the total number of portions of the source stream into segments of the alternative streams. This type of encoding causes the availability of segments to be N*|segment duration| and not real time. Thus, the availability time of the segments may need to be added to information in the manifest to allow clients to know when the segments will be available. A flow diagram of a process performed by each of the N encoders to generate every Nth segment of the video content from the source stream in accordance with an embodiment of the invention is shown in FIG. 6.
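
The assignment of portions to encoders and the resulting availability figure can be worked through with a small Python sketch. The function names are hypothetical, and the availability value simply restates the N*|segment duration| figure given above.

```python
def encoder_for_portion(portion_number: int, n_encoders: int) -> int:
    """Return the 1-based order position M of the encoder handling a portion.

    Portions are numbered from 1 in the order they are received.
    """
    if portion_number < 1 or n_encoders < 1:
        raise ValueError("portion_number and n_encoders must be at least 1")
    return (portion_number - 1) % n_encoders + 1


def availability_interval(n_encoders: int, segment_duration: float) -> float:
    """Availability figure from the text: N multiplied by the segment duration."""
    return n_encoders * segment_duration


# With three encoders, portions 1..7 go to encoders 1, 2, 3, 1, 2, 3, 1.
assignment = [encoder_for_portion(k, 3) for k in range(1, 8)]
# With three encoders and 2-second segments the figure is 6 seconds.
interval = availability_interval(3, 2.0)
```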


In process 600, the encoder receives a portion of a source stream of video content that includes timing information (605). In accordance with some embodiments, the encoders may use time information received with the portion to determine at what point in the stream the encoder is to start encoding the stream. As the encoders are using the same timing information from the source stream, the encoding performed by the encoders is synchronized such that the segments produced by each encoder include the same amount of video content to present in terms of presentation time and the segments are aligned with subsequent segments.
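
One way encoders could derive aligned segment boundaries from shared timing information is sketched below. The sketch assumes integer timestamps in clock ticks and a fixed segment duration, neither of which is specified in the disclosure; the names `segment_index` and `next_segment_start` are hypothetical.

```python
def segment_index(pts_ticks: int, segment_ticks: int) -> int:
    """Index of the segment that contains the given source timestamp."""
    return pts_ticks // segment_ticks


def next_segment_start(pts_ticks: int, segment_ticks: int) -> int:
    """First segment boundary at or after the given source timestamp.

    An encoder that begins receiving the stream at an arbitrary point can
    wait for this boundary, so every encoder that reads the same source
    timestamps starts its segments on the same frames.
    """
    # Ceiling division using integer arithmetic only.
    return -(-pts_ticks // segment_ticks) * segment_ticks
```

Because the boundaries depend only on the source timestamps and the agreed segment duration, two encoders that start at different times still compute the same boundaries without communicating with each other.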


The encoder then determines whether the received portion is one of the Nth portions of the source stream to handle (610). In accordance with some embodiments, the determination may be performed by using a counter to count the received portions and comparing the current count to M to determine whether the count is equal to M or exceeds M by a multiple of N, where M is the encoder order position. In accordance with some other embodiments, metadata for the received portions of the source stream is used to make the determination.


If the received portion is not one of the portions the encoder is to handle, the encoder discards the received portion of the stream (615). If the received portion is determined to be one of the portions of the incoming stream the encoder is to encode, the encoder encodes the portion into segments for each of the alternative streams based upon the specific parameters for each of the alternative streams (620). The parameters of the streams include, but are not limited to, maximum bitrates, resolution, aspect ratio, and frame rate.


The encoder also generates index or manifest information for each of the generated segments (625). This includes generating manifest information for each of the alternative streams. Each of the generated segments is stored in the container(s) of the appropriate alternative stream (630) and the index or manifest information is added to an appropriate manifest or index file(s) (635). In accordance with some embodiments, manifest or index information is added to the MPD for the alternative streams stored in memory. In accordance with some other embodiments, the manifest or index information is added to an MPD for the segments encoded by the encoder. In still other embodiments, the manifest or index information is delivered to client playback devices as an update. Process 600 repeats until the encoder receives the end of the stream and/or reception of the stream is halted in some other manner (640).
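
The steps of process 600 can be sketched as a simple loop. This is an illustration under stated assumptions rather than the disclosed implementation: `fake_encode` is a placeholder standing in for a real video encoder, containers are modeled as in-memory lists, manifest entries are plain dictionaries, and all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class StreamParams:
    name: str               # label of the alternative stream (hypothetical)
    max_bitrate_kbps: int   # one of the per-stream encoding parameters


@dataclass
class EncoderState:
    position: int       # encoder order position M (1-based)
    num_encoders: int   # total number of encoders N
    containers: Dict[str, List[bytes]] = field(default_factory=dict)
    manifest: List[dict] = field(default_factory=list)


def fake_encode(portion: bytes, params: StreamParams) -> bytes:
    """Placeholder for a real encoder: tags the payload with the stream name."""
    return params.name.encode() + b":" + portion


def run_process_600(state: EncoderState,
                    portions: Iterable[Tuple[int, bytes]],
                    streams: List[StreamParams]) -> None:
    """Consume (timestamp, payload) portions until the source ends (640)."""
    for count, (timestamp, payload) in enumerate(portions, start=1):
        # 610: handle the Mth portion and every Nth portion after it.
        if (count - state.position) % state.num_encoders != 0:
            continue  # 615: discard portions assigned to other encoders
        for params in streams:
            segment = fake_encode(payload, params)            # 620
            container = state.containers.setdefault(params.name, [])
            container.append(segment)                         # 630
            state.manifest.append({                           # 625, 635
                "stream": params.name,
                "timestamp": timestamp,
                "index": len(container) - 1,
            })
```

Each encoder would run the same loop with its own order position, so the containers and manifest it builds cover only the portions assigned to it.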


Although various examples of processes performed by an encoder for encoding every Nth segment for each of the alternative streams of video content are described above, one skilled in the art will recognize that other processes for encoding the portions for the streams may be performed in accordance with some embodiments of the invention.


Process Performed by a Playback Device to Obtain Video Content Using Alternative Streams Generated by Multiple Encoders

In accordance with some embodiments of the invention, a playback device uses the streams generated by the multiple encoders to obtain the video content for playback. In accordance with some embodiments of the invention, the playback device uses adaptive bit rate streaming to obtain the media content from the alternative streams generated using multiple encoders. To do so, the playback device must receive the MPD generated by each of the encoders to generate a combined adaptation set for use in obtaining the segments using adaptive bit rate streaming. In accordance with some embodiments, the combined adaptation set is generated based upon timestamps embedded in the MPD generated by each of the encoders. A process performed by a playback device to perform adaptive bitrate streaming in accordance with an embodiment of the invention is shown in FIG. 7.


In process 700, the playback device requests the index or manifest information for the video content (705). The playback device receives the MPD or index files generated by each of the encoders as the encoders generate the segments of the alternative streams (710). The playback device generates a combined adaptation set from the index or manifest information in the MPDs from the encoders using the embedded time stamps in each of the MPDs (715). In accordance with some embodiments, the combined adaptation set has the same format as an MPD and is generated by populating the combined adaptation set with the index or manifest information from the received MPDs. The combined adaptation set is used by the playback device to perform adaptive bit rate streaming to obtain the video content for playback (720). In accordance with some embodiments, the playback device uses the combined adaptation set to request portions of the video content. In accordance with some embodiments of the invention, the playback device monitors the network bandwidth for communications over the network between the playback device and the content provider system and selects streams of the audio and/or video content that are encoded at the highest maximum bitrates that can be handled in accordance with the measured bandwidth. Systems and methods for selecting a stream and commencing playback include those disclosed in U.S. Patent Application Publication 2013/0007200 entitled “Systems and Methods for Determining Available Bandwidth and Performing Initial Stream Selection When Commencing Streaming Using Hypertext Transfer Protocol” and U.S. Pat. No. 8,832,297 entitled “Systems and Methods for Performing Multiphase Adaptive Bitrate Streaming,” the disclosures of which are hereby incorporated by reference in their entirety. More particularly, the processes performed by a playback device to obtain the video content using adaptive bit rate streaming described in these references are incorporated herein by reference.
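
The generation of a combined adaptation set and the bandwidth-based stream selection can be sketched as follows. A real MPD is an XML document; here each encoder's manifest is modeled as a list of dictionaries with hypothetical `stream`, `timestamp`, and `url` keys, and the fallback to the lowest-bitrate stream is an assumption that the disclosure does not state.

```python
from typing import Dict, List


def combine_manifests(manifests: List[List[dict]]) -> Dict[str, List[dict]]:
    """Merge per-encoder manifest entries into one timeline per stream.

    Entries are grouped by alternative stream and ordered by their embedded
    timestamps, so segments produced by different encoders interleave
    correctly.
    """
    combined: Dict[str, List[dict]] = {}
    for manifest in manifests:
        for entry in manifest:
            combined.setdefault(entry["stream"], []).append(entry)
    for entries in combined.values():
        entries.sort(key=lambda e: e["timestamp"])
    return combined


def select_stream(max_bitrates_kbps: Dict[str, int],
                  measured_kbps: float) -> str:
    """Pick the stream with the highest maximum bitrate that fits the
    measured bandwidth, falling back to the lowest-bitrate stream."""
    fitting = {name: rate for name, rate in max_bitrates_kbps.items()
               if rate <= measured_kbps}
    if fitting:
        return max(fitting, key=fitting.get)
    return min(max_bitrates_kbps, key=max_bitrates_kbps.get)
```

A playback device could call `combine_manifests` each time an updated manifest arrives from any encoder, then use `select_stream` before each segment request to choose which stream's timeline to read from.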


Although a process performed by a playback device to obtain video content by performing adaptive bit rate streaming using the alternative streams generated by multiple encoders in accordance with an embodiment of the invention is disclosed in FIG. 7, other processes may be performed by a playback device to obtain video content using alternative streams generated by multiple encoders in accordance with embodiments of the invention.


Although the present invention has been described in certain specific aspects, many additional modifications and variations would be apparent to those skilled in the art. Specifically, this invention may be used in association with trick play tracks where only certain frames of the trick-play track are shown in accordance with some embodiments of the invention. It is therefore to be understood that the present invention may be practiced otherwise than specifically described, including various changes in the implementation such as utilizing encoders and decoders that support features beyond those specified within a particular standard with which they comply, without departing from the scope and spirit of the present invention. Thus, embodiments of the present invention should be considered in all respects as illustrative and not restrictive.

Claims
  • 1. (canceled)
  • 2. A method for encoding a plurality of alternative streams of video content, the method comprising: receiving a plurality of portions of a source stream of video content in a plurality of encoders, wherein the source stream of video content comprises embedded timing information; and for each received portion of the plurality of portions of the source stream: retrieving the embedded timing information from the received portion; encoding the received portion, using an encoder of the plurality of encoders, into a set of one or more output segments, wherein each segment of the set of one or more output segments comprises at least one group of pictures (GOP); generating index information for the set of one or more output segments; synchronizing the set of one or more output segments produced by the encoder with corresponding sets of one or more output segments generated by at least one other encoder of the plurality of encoders, wherein: each of the corresponding sets of one or more output segments generated by the at least one other encoder corresponds to the received portion; and synchronizing the set of one or more output segments comprises using the retrieved timing information to: synchronize a starting frame for the set of one or more output segments produced by the encoder and starting frames of each of the corresponding sets of one or more output segments generated by the at least one other encoder; and synchronize a starting frame for each GOP of the at least one GOP of the set of one or more output segments produced by the encoder and starting frames for each GOP of the at least one GOP of the corresponding sets of one or more output segments generated by the at least one other encoder; storing a reference to the index information generated for the set of one or more output segments in a manifest; storing the synchronized set of one or more output segments in a container corresponding to one of a plurality of alternate streams of the source stream; and storing each of the synchronized corresponding sets of one or more output segments in their own individual container, wherein each individual container corresponds to an additional alternate stream of the plurality of alternate streams of the source stream.
  • 3. The method of claim 2, wherein, for each received portion of the plurality of portions of the source stream, the reference to the index information is stored in the manifest in response to a request for a particular alternate stream.
  • 4. The method of claim 2, wherein, for each received portion of the plurality of portions of the source stream, storing the synchronized set of one or more output segments further comprises storing the manifest in a data structure.
  • 5. The method of claim 2, wherein the embedded timing information comprises at least one of: timestamps; or timing information based on universal time.
  • 6. The method of claim 2, wherein, for each received portion of the plurality of portions of the source stream, each corresponding set of one or more output segments has a different unique bitrate.
  • 7. The method of claim 2, wherein: the set of one or more output segments generated by one of the plurality of encoders comprises a particular set of parameters; and at least one parameter is selected from the group consisting of aspect ratio, frame rate, and resolution.
  • 8. The method of claim 2, wherein the source stream of video content is a live stream.
  • 9. The method of claim 2, wherein, for each received portion of the plurality of portions of the source stream, the received portion of the plurality of portions of the source stream is assigned to the encoder and the at least one other encoder on a sequential basis.
  • 10. An encoding system for encoding a source stream of video content comprising: a processor; a plurality of encoders; memory accessible by the processor; and instructions stored in the memory that when read by the processor direct the processor to: receive a plurality of portions of a source stream of video content in the plurality of encoders, wherein the source stream of video content comprises embedded timing information; and for each received portion of the plurality of portions of the source stream: retrieve the embedded timing information from the received portion; encode the received portion, using an encoder of the plurality of encoders, into a set of one or more output segments, wherein each segment of the set of one or more output segments comprises at least one group of pictures (GOP); generate index information for the set of one or more output segments; synchronize the set of one or more output segments produced by the encoder with corresponding sets of one or more output segments generated by at least one other encoder of the plurality of encoders, wherein synchronizing the set of one or more output segments comprises using the retrieved timing information to: synchronize a starting frame for the set of one or more output segments produced by the encoder and starting frames of each of the corresponding sets of one or more output segments generated by the at least one other encoder; and synchronize a starting frame for each GOP of the at least one GOP of the set of one or more output segments produced by the encoder and starting frames for each GOP of the at least one GOP of the corresponding sets of one or more output segments generated by the at least one other encoder; store a reference to the index information generated for the set of one or more output segments in a manifest; store the synchronized set of one or more output segments in a container corresponding to one of a plurality of alternate streams of the source stream; and store each of the synchronized corresponding sets of one or more output segments in their own individual container, wherein each individual container corresponds to an additional alternate stream of the plurality of alternate streams of the source stream.
  • 11. The encoding system of claim 10, wherein, for each received portion of the plurality of portions of the source stream: each corresponding set of one or more output segments has a different unique bitrate; the reference to the index information is stored in the manifest in response to a request for a particular alternate stream; storing the synchronized set of one or more output segments further comprises storing the manifest in a data structure; and the received portion of the plurality of portions of the source stream is assigned to the encoder and the at least one other encoder on a sequential basis.
  • 12. The encoding system of claim 10, wherein: the embedded timing information comprises at least one of: timestamps; or timing information based on universal time; and the source stream of video content is a live stream.
  • 13. The encoding system of claim 10, wherein: the set of one or more output segments generated by one of the plurality of encoders comprises a particular set of parameters; and at least one parameter is selected from the group consisting of aspect ratio, frame rate, and resolution.
  • 14. A non-transitory computer-readable medium for encoding a plurality of alternative streams of video content, wherein program instructions are executable by one or more processors to perform a process that comprises: receiving a plurality of portions of a source stream of video content in a plurality of encoders, wherein the source stream of video content comprises embedded timing information; and for each received portion of the plurality of portions of the source stream: retrieving the embedded timing information from the received portion; encoding the received portion, using an encoder of the plurality of encoders, into a set of one or more output segments, wherein each segment of the set of one or more output segments comprises at least one group of pictures (GOP); generating index information for the set of one or more output segments; synchronizing the set of one or more output segments produced by the encoder with corresponding sets of one or more output segments generated by at least one other encoder of the plurality of encoders, wherein: each of the corresponding sets of one or more output segments generated by the at least one other encoder corresponds to the received portion; and synchronizing the set of one or more output segments comprises using the retrieved timing information to: synchronize a starting frame for the set of one or more output segments produced by the encoder and starting frames of each of the corresponding sets of one or more output segments generated by the at least one other encoder; and synchronize a starting frame for each GOP of the at least one GOP of the set of one or more output segments produced by the encoder and starting frames for each GOP of the at least one GOP of the corresponding sets of one or more output segments generated by the at least one other encoder; storing a reference to the index information generated for the set of one or more output segments in a manifest; storing the synchronized set of one or more output segments in a container corresponding to one of a plurality of alternate streams of the source stream; and storing each of the synchronized corresponding sets of one or more output segments in their own individual container, wherein each individual container corresponds to an additional alternate stream of the plurality of alternate streams of the source stream.
  • 15. The non-transitory computer-readable medium of claim 14, wherein, for each received portion of the plurality of portions of the source stream, the reference to the index information is stored in the manifest in response to a request for a particular alternate stream.
  • 16. The non-transitory computer-readable medium of claim 14, wherein storing the synchronized set of one or more output segments further comprises storing the manifest in a data structure.
  • 17. The non-transitory computer-readable medium of claim 14, wherein the embedded timing information comprises at least one of: timestamps; or timing information based on universal time.
  • 18. The non-transitory computer-readable medium of claim 14, wherein each corresponding set of one or more output segments has a different unique bitrate.
  • 19. The non-transitory computer-readable medium of claim 14, wherein: the set of one or more output segments generated by one of the plurality of encoders comprises a particular set of parameters; and at least one parameter is selected from the group consisting of aspect ratio, frame rate, and resolution.
  • 20. The non-transitory computer-readable medium of claim 14, wherein the source stream of video content is a live stream.
  • 21. The non-transitory computer-readable medium of claim 14, wherein, for each received portion of the plurality of portions of the source stream, the received portion of the plurality of portions of the source stream is assigned to the encoder and the at least one other encoder on a sequential basis.
CROSS REFERENCE TO RELATED APPLICATIONS

The current application is a continuation of U.S. patent application Ser. No. 18/449,605, filed Aug. 14, 2023, entitled “Systems and Methods for Encoding Video Content” to Amidei et al., which is a continuation of U.S. patent application Ser. No. 18/049,256, filed Oct. 24, 2022 and issued as U.S. Pat. No. 11,729,451 on Aug. 15, 2023, entitled “Systems and Methods for Encoding Video Content” to Amidei et al., which is a continuation of U.S. patent application Ser. No. 17/343,453, filed Jun. 9, 2021 and issued as U.S. Pat. No. 11,483,609 on Oct. 25, 2022, entitled “Systems and Methods for Encoding Video Content” to Amidei et al., which is a continuation of U.S. patent application Ser. No. 16/819,865, filed Mar. 16, 2020 and issued as U.S. Pat. No. 11,064,235 on Jul. 13, 2021, entitled “Systems and Methods for Encoding Video Content” to Amidei et al., which is a continuation of U.S. patent application Ser. No. 16/208,210, filed Dec. 3, 2018 and issued as U.S. Pat. No. 10,595,070 on Mar. 17, 2020, entitled “Systems and Methods for Encoding Video Content” to Amidei et al., which is a continuation of U.S. patent application Ser. No. 15/183,562, filed Jun. 15, 2016 and issued as U.S. Pat. No. 10,148,989 on Dec. 4, 2018, entitled “Systems and Methods for Encoding Video Content” to Amidei et al., the disclosures of which are expressly incorporated by reference herein in their entirety.

Continuations (6)
Number Date Country
Parent 18449605 Aug 2023 US
Child 18892206 US
Parent 18049256 Oct 2022 US
Child 18449605 US
Parent 17343453 Jun 2021 US
Child 18049256 US
Parent 16819865 Mar 2020 US
Child 17343453 US
Parent 16208210 Dec 2018 US
Child 16819865 US
Parent 15183562 Jun 2016 US
Child 16208210 US