Particular embodiments are generally related to processing of video streams.
Broadcast and on-demand delivery of digital audiovisual content has become increasingly popular in cable and satellite television networks (generally, subscriber television networks). Various specifications and standards have been developed for communication of audiovisual content, including the MPEG-2 video coding standard and the AVC video coding standard. The provision of programming in subscriber television systems often requires the ability to concatenate video segments or video sequences, for example when inserting television commercials or advertisements. For instance, for local advertisements to be inserted into national content, such as ABC news, such programming may be received at a headend (e.g., via a satellite feed), with locations in the programming allocated for insertion of local advertisements at the headend (e.g., at a headend encoder). Splicing technology that addresses the complexities of the AVC coding standard is desired.
Many aspects of the disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the disclosed embodiments. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.
Systems and methods are disclosed that, in one embodiment, provide a video stream including a portion containing a first video sequence followed by a second video sequence, and that provide first information in the video stream pertaining to pictures in the first video sequence, wherein the location of the first information in the video stream is in relation to second information in the video stream, wherein the second information pertains to the end of the first video sequence, wherein the first information corresponds to a first information type and the second information corresponds to a second information type different than the first information type, and wherein the first information corresponds to auxiliary information.
In general, certain embodiments are disclosed herein that illustrate systems and methods (collectively also referred to as a video stream emitter) that provide a video stream (e.g., a bitstream) including one or more concatenated video sequences (e.g., segments), and that provide information pertaining to the one or more concatenations to other devices, such as one or more receivers coupled over a communications medium. The video stream emitter may include video encoding capabilities (e.g., an encoder or encoding device) and/or video splicing capabilities (e.g., a splicer). In one embodiment, the video stream emitter receives a video stream including a first video sequence and splices or concatenates a second video sequence after a potential splice point in the first video sequence. The potential splice point in the first video sequence is identified by information in the video stream, said information having a corresponding information type, such as a message. The video stream emitter may include information in the video stream that pertains to the concatenation of the first video sequence followed by the second video sequence. Included information may further provide information pertaining to the concatenation, such as properties of the pictures of the first video sequence and of pictures of the second video sequence.
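The concatenation operation described above can be sketched at the picture level, under simplifying assumptions: the names `Picture` and `splice_at` are illustrative, not from any standard API, and a real splicer would operate on coded NAL units rather than picture objects. The sketch also encodes the AVC constraint that a new coded video sequence begins with an IDR picture.

```python
# Minimal sketch of concatenating two video sequences at a potential
# splice point. Picture and splice_at are hypothetical names; real
# splicing operates on coded NAL units, not picture objects.

from dataclasses import dataclass

@dataclass
class Picture:
    pic_id: int
    is_idr: bool = False   # an IDR picture can begin a new AVC sequence

def splice_at(first_seq, second_seq, splice_index):
    """Concatenate second_seq after splice_index pictures of first_seq.

    The second sequence must begin with an IDR picture so a receiver can
    decode it without references to pictures of the first sequence.
    """
    if not second_seq or not second_seq[0].is_idr:
        raise ValueError("second sequence must start with an IDR picture")
    return first_seq[:splice_index] + second_seq

# Example: splice a 2-picture ad segment after 4 pictures of the main feed.
main = [Picture(i) for i in range(6)]
ad = [Picture(100, is_idr=True), Picture(101)]
out = splice_at(main, ad, 4)
```

The IDR check reflects why the potential splice point must be signalled to the splicer: only certain locations in the first sequence leave the decoder in a state where an IDR-led second sequence can follow cleanly.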
In another embodiment, the video stream emitter receives a video stream including a first video sequence and replaces a portion of the first video sequence with a second video sequence by effectively performing two concatenations, one from the first video sequence to the second video sequence, and another from the second video sequence to the first video sequence. The two concatenations correspond to respective potential splice points, each identified in the video stream by information in the video stream having a corresponding information type. The video stream emitter may include information in the video stream that pertains to each respective concatenation of one of the two video sequences followed by the other of the two video sequences. Included information may further provide properties of pictures at the two adjoined video sequences.
An encoder, possibly in the video stream emitter, may insert information in the video stream corresponding respectively to each of one or more potential splice points in the video stream, allowing each of the one or more potential splice points to be identified by the splicer. Information provided by the encoder may further provide properties of one or more potential splice points, in a manner as described below.
It should be understood that terminology of the published ITU-T H.264/AVC standard is assumed.
Further, the MPEG-2 video coding standard can be found in the following publication, which is hereby incorporated by reference: (1) ISO/IEC 13818-2, (2000), “Information Technology—Generic coding of moving pictures and associated audio—Video.” A description of the AVC video coding standard can be found in the following publication, which is hereby entirely incorporated by reference: (2) ITU-T Rec. H.264 (2005), “Advanced video coding for generic audiovisual services.”
Additionally, it should be appreciated that certain embodiments of the various systems and methods disclosed herein are implemented at the video stream layer (as opposed to the system or MPEG transport layer).
The video stream emitter 100 and its corresponding components are configured in one embodiment as a computing device or video processing system or device. The encoding device 102 and/or splicer 104, for instance, can be implemented in software (e.g., firmware), hardware, or a combination thereof.
The video stream emitter 100 outputs plural video sequences of a video stream to the VSRAPD 108 over a communications medium (e.g., HFC, satellite, etc.), which in one embodiment may be part of a subscriber television network. The VSRAPD 108 receives and processes (e.g., decodes and outputs) the video stream for eventual presentation (e.g., on a display device, such as a television, etc.). In one embodiment, the VSRAPD 108 can be a set-top terminal, cable-ready television set, or network device.
The one or more processors that make up the encoding device 102 and splicer 104 of the video stream emitter 100 can each be configured as a hardware device for executing software, particularly software stored in memory or memory devices. The one or more processors can be any custom made or commercially available processor, a central processing unit (CPU), a graphics processing unit, a programmable DSP unit, an auxiliary processor among several processors associated with the encoding device 102 and splicer 104, a semiconductor-based microprocessor (in the form of a microchip or chip set), a macroprocessor, or generally any device for executing software instructions. Any processor capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken is included.
The memory or memory devices can include any one or combination of volatile memory elements (e.g., random access memory (RAM, such as DRAM, SRAM, SDRAM, etc.)) and nonvolatile memory elements (e.g., ROM, hard drive, tape, CDROM, etc.). Moreover, the memory may incorporate electronic, magnetic, optical, and/or other types of storage media. Note that the memory can have a distributed architecture, where various components are situated remote from one another, but can be accessed by the respective processor.
The software in memory may include one or more separate programs, each of which comprises an ordered listing of executable instructions for implementing logical functions. When functionality of the encoding device 102 and/or splicer 104 is implemented in software, it should be noted that the software can be stored on any computer readable medium for use by or in connection with any computer related system or method.
In another embodiment, where the video stream emitter 100 is implemented in hardware, the encoding device 102 and splicer 104 can be implemented with any or a combination of the following technologies, which are each well known in the art: a discrete logic circuit(s) having logic gates for implementing logic functions upon data signals, an application specific integrated circuit (ASIC) having appropriate combinational logic gates, a programmable gate array(s) (PGA), a field programmable gate array (FPGA), etc.
It should be appreciated in the context of the present disclosure that the video stream emitter functionality described herein is implemented in one embodiment as a computer-readable medium encoded with computer-executed instructions that when executed by one or more processors of an apparatus/device(s) cause the apparatus/device(s) to carry out one or more methods as described herein.
Having described an example video stream emitter 100, attention is directed to
Given that a compressed picture buffer (CPB) is subject to an initial buffering delay and offset, and that non-VCL NAL units are treated differently in different models, there is a need to specify the effective time of the end_of_stream NAL Unit 206. One consideration places the effective time of the end_of_stream NAL Unit 206 immediately prior to the picture that follows the last decoded picture (in relation to the end_of_stream NAL unit); in other words, in the first video sequence 202 at the end of the first video sequence (or at what would be the end of the first video sequence when indicated as a potential splice point). Note that the information 206 is immediately prior to the first picture of the second video sequence 204, as illustrated in
Note that one having ordinary skill in the art would recognize, in the context of the present disclosure, that since a sequence in AVC begins with an IDR picture, the end_of_stream NAL Unit 206 is not required in all implementations to indicate the end of the first video sequence 202. Thus, the end_of_stream NAL unit, or information 206, can be used by encoding device 102 to identify to the splicer 104 a location in the first video sequence that is suitable for concatenation (i.e., a potential splice point). Furthermore, the information 206 can be used to identify a location in the video stream to the VSRAPD 108 corresponding to a concatenation from the first video sequence 202 to the second video sequence 204.
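The use of the end_of_stream NAL unit as a splice-point marker can be sketched by scanning an Annex B byte stream for start codes and reading each NAL unit's type (end_of_stream is nal_unit_type 11 in H.264). The byte stream below is a simplified, hand-built example; real streams also use emulation-prevention bytes and may use 4-byte start codes, which this sketch does not handle.

```python
# Sketch: locate end_of_stream NAL units (nal_unit_type 11 in H.264)
# in a start-code-delimited (Annex B) byte stream. Simplified: no
# emulation-prevention handling; 4-byte start codes 00 00 00 01 are
# still found because they end in the 3-byte pattern 00 00 01.

def find_nal_units(data: bytes):
    """Yield (offset, nal_unit_type) for each start-code-delimited NAL."""
    i = 0
    while i < len(data) - 3:
        if data[i:i + 3] == b"\x00\x00\x01":   # 3-byte start code
            header = data[i + 3]               # NAL unit header byte
            yield i, header & 0x1F             # nal_unit_type: low 5 bits
            i += 3
        else:
            i += 1

END_OF_STREAM = 11  # H.264 nal_unit_type for end_of_stream

stream = (b"\x00\x00\x01\x67"    # SPS (type 7)
          b"\x00\x00\x01\x65"    # IDR slice (type 5)
          b"\x00\x00\x01\x0B")   # end_of_stream (type 11)
splice_points = [off for off, t in find_nal_units(stream) if t == END_OF_STREAM]
```

A splicer following the scheme above would treat each such offset as a location where the first sequence may end and an IDR-led second sequence may be appended.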
In another embodiment, illustrated by the block diagram of
In one embodiment, the effective time of the end_of_stream NAL Unit 206 can be understood in the following context:
second stream's (CPB delay + DPB delay) < first stream's (CPB delay + DPB delay).
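As a minimal sketch, the buffering condition above can be expressed as a simple predicate. The function name and the delay values in the example are illustrative; they are not part of the AVC hypothetical reference decoder's actual interface.

```python
# Sketch of the condition above: the second stream's total startup
# delay (CPB delay + DPB delay) must be strictly less than the first
# stream's for the concatenation to satisfy the stated constraint.
# Delays are in arbitrary consistent units (e.g., output intervals).

def splice_condition_holds(first_cpb, first_dpb, second_cpb, second_dpb):
    """True when second stream's (CPB + DPB) delay < first stream's."""
    return (second_cpb + second_dpb) < (first_cpb + first_dpb)
```

For example, a second stream with delays 3 + 2 may follow a first stream with delays 5 + 4, but not one with delays 2 + 2, since the strict inequality would fail.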
In one embodiment, it is beneficial if the same or different information (e.g., an SEI message) further conveys the output behavior of certain pictures of the first video sequence 202 in a decoded picture buffer (DPB), to properly specify a transition (e.g., a transition period) in which non-previously output pictures of the first video sequence 202 are output while pictures of the second video sequence 204 enter the CPB. Such behavior is preferably flexible enough to allow each non-previously output picture in the DPB at the concatenation point to be specified as output repeatedly for N output intervals, which provides the option to avoid a gap in picture output, relieve a potential bump in the bit rate, and extend the initial CPB buffering of the second video sequence 204. However, it should be noted that the encoding device 102 may opt not to provide this auxiliary information.
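The transition described above can be sketched as a scheduling helper, under hypothetical names: each non-previously-output picture held in the DPB at the concatenation point is output for N consecutive output intervals, stretching the tail of the first sequence while the second sequence fills the CPB.

```python
# Sketch of the DPB transition: repeat each of the K leftover pictures
# for N output intervals. transition_schedule is an illustrative name,
# not an AVC-defined operation.

def transition_schedule(dpb_pictures, repeat_n):
    """Output order during the transition: every non-previously-output
    DPB picture is repeated for repeat_n consecutive output intervals."""
    schedule = []
    for pic in dpb_pictures:
        schedule.extend([pic] * repeat_n)
    return schedule

# Three leftover pictures shown for 2 intervals each: the transition
# spans 6 output intervals instead of 3, buying CPB fill time for the
# second sequence.
plan = transition_schedule(["p1", "p2", "p3"], 2)
```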
In one embodiment, the second and different auxiliary information 210 (e.g., different than 208) is beneficially used to signal a potential concatenation (or splice) point in the video stream 200 (e.g., 200a, 200b). In one version, the information conveys that M pictures away there is a point in the stream in which the DPB contains K non-previously output pictures with consecutive output times, which aids concatenation devices (e.g., the splicer 104) to identify points in the stream amenable for concatenation.
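The second auxiliary information can be sketched as a simple record carrying the (M, K) pair described above. The field and function names are illustrative and do not correspond to actual AVC SEI syntax elements.

```python
# Sketch of the splice-point hint: "M pictures from here there is a
# point where the DPB holds K non-previously-output pictures with
# consecutive output times." Field names are hypothetical.

from dataclasses import dataclass

@dataclass(frozen=True)
class SplicePointHint:
    pictures_ahead: int        # M: distance to the potential splice point
    dpb_pending_pictures: int  # K: non-previously-output pictures in DPB

def splice_point_index(current_index, hint):
    """Index of the picture at which the signalled splice point occurs."""
    return current_index + hint.pictures_ahead
```

A concatenation device receiving such a hint at picture 10 with M = 5 would know that picture 15 is amenable for splicing, and that K pictures will remain to be output there, which feeds directly into the transition scheduling discussed above.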
In another embodiment, auxiliary information conveys the maximum number of out-of-output-order pictures in a low delay stream (a first processing mode or low delay mode) that can follow an anchor picture. An anchor picture is defined herein as an I picture, an IDR picture, or a forward-predicted picture that depends only on reference pictures that are in turn anchor pictures. Such a feature provided by this embodiment is beneficial for trick modes in applications such as Video-on-Demand (VOD) and Personal Video Recording (PVR).
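The anchor-picture definition above can be sketched as a recursive test, assuming a deliberately simplified picture representation (a dict of slice type and reference picture ids, rather than coded slices):

```python
# Sketch of the anchor-picture test: I and IDR pictures are anchors;
# a forward-predicted (P) picture is an anchor only if every picture
# it references is itself an anchor. Representation is illustrative.

def is_anchor(pic, pictures):
    """pic: dict with 'type' ('I', 'IDR', or 'P') and 'refs' (picture ids);
    pictures: mapping of id -> picture dict, assumed acyclic."""
    if pic["type"] in ("I", "IDR"):
        return True
    if pic["type"] == "P":
        return all(is_anchor(pictures[r], pictures) for r in pic["refs"])
    return False  # B pictures and other types are not anchors

pics = {
    0: {"type": "IDR", "refs": []},
    1: {"type": "P", "refs": [0]},     # anchor: depends only on an IDR
    2: {"type": "B", "refs": [0, 1]},  # not an anchor
    3: {"type": "P", "refs": [2]},     # not an anchor: depends on a B
}
```

In a trick-mode scenario, a player could decode only the anchor pictures plus at most the signalled number of following out-of-order pictures, which is what makes the conveyed maximum useful.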
In some embodiments, one or more items of the above-conveyed information can be complemented with provisions that extend the no_output_of_prior_pics_flag at the concatenation (or in some embodiments, the latter ability can stand alone). For instance, referring to
In view of the above-detailed description, it should be appreciated that one video stream emitter method embodiment, illustrated in
Another video stream emitter method embodiment, illustrated in
Another video stream emitter method embodiment, illustrated in
It should be appreciated that the methods described above are not limited to the architectures shown in and described in association with
Further, it should be appreciated in the context of the present disclosure that receiving and processing functionality is implied by the various methods described above.
In addition, it should be appreciated that although embodiments of the invention have been described in the context of the JVT and H.264 standard, alternative embodiments of the present disclosure are not limited to such contexts and may be utilized in various other applications and systems, whether conforming to a video coding standard, or especially designed. Furthermore, embodiments are not limited to any one type of architecture or protocol, and thus, may be utilized in conjunction with one or a combination of other architectures/protocols.
Unless specifically stated otherwise, as apparent from the following discussions, it is appreciated that throughout the specification discussions utilizing terms such as “processing,” “computing,” “calculating,” “determining” or the like, refer to the action and/or processes of a computer or computing system, or similar electronic computing device, that manipulate and/or transform data represented as physical, such as electronic, quantities into other data similarly represented as physical quantities.
In a similar manner, the term “processor” may refer to any device or portion of a device that processes electronic data, e.g., from registers and/or memory to transform that electronic data into other electronic data that, e.g., may be stored in registers and/or memory. A “computer” or a “computing machine” or a “computing platform” may include one or more processors.
Note that when a method is described that includes several elements, e.g., several steps, no ordering of such elements (e.g., steps) is implied, unless specifically stated.
The methodologies described herein are, in one embodiment, performable by one or more processors (e.g., of encoding device 102 and splicer 104 or generally, of the video stream emitter 100) that accept computer-readable (also called machine-readable) logic encoded on one or more computer-readable media containing a set of instructions that when executed by one or more of the processors carry out at least one of the methods described herein. The processing system further may be a distributed processing system with processors coupled by a network.
The term memory unit as used herein, if clear from the context and unless explicitly stated otherwise, also encompasses a storage system such as a disk drive unit. The processing system in some configurations may include a sound output device, and a network interface device.
The memory subsystem thus includes a computer-readable carrier medium that carries logic (e.g., software) including a set of instructions to cause performing, when executed by one or more processors, one or more of the methods described herein. The software may reside on the hard disk, or may also reside, completely or at least partially, within the RAM and/or within the processor during execution thereof by the computer system. Thus, the memory and the processor also constitute a computer-readable carrier medium on which logic is encoded, e.g., in the form of instructions. Furthermore, a computer-readable carrier medium may form, or be included in, a computer program product.
In alternative embodiments, the one or more processors operate as a standalone device or may be connected (e.g., networked) to other processors; in a networked deployment, the one or more processors may operate in the capacity of a server or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer or distributed network environment. The one or more processors may form a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a network router, switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine.
Thus, one embodiment of each of the methods described herein is in the form of a computer-readable carrier medium carrying a set of instructions, e.g., a computer program, that is for execution on one or more processors, e.g., one or more processors that are part of a video processing device. Thus, as will be appreciated by those skilled in the art, embodiments may be embodied as a method, an apparatus such as a special purpose apparatus, an apparatus such as a data processing system, a system, or a computer-readable carrier medium, e.g., a computer program product. The computer-readable carrier medium carries logic including a set of instructions that when executed on one or more processors cause a processor or processors to implement a method. Accordingly, embodiments of the present disclosure may take the form of a method, an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a carrier medium (e.g., a computer program product on a computer-readable storage medium) carrying computer-readable program code embodied in the medium.
The software may further be transmitted or received over a network via a network interface device. While the carrier medium is shown in an example embodiment to be a single medium, the term “carrier medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term “carrier medium” shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by one or more of the processors and that cause the one or more processors to perform any one or more of the methodologies of the present disclosure. A carrier medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media include, for example, optical disks, magnetic disks, and magneto-optical disks. Volatile media include dynamic memory, such as main memory.
Transmission media include coaxial cables, copper wire, and fiber optics, including the wires that comprise a bus subsystem. Transmission media may also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.
It will be understood that the steps of methods discussed are performed in one embodiment by an appropriate processor (or processors) of a processing (i.e., computer) system executing instructions stored in storage. It will also be understood that embodiments of the present disclosure are not limited to any particular implementation or programming technique and that the various embodiments may be implemented using any appropriate techniques for implementing the functionality described herein. Furthermore, embodiments are not limited to any particular programming language or operating system.
Reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment, though they may be. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to one of ordinary skill in the art from this disclosure, in one or more embodiments.
Similarly, it should be appreciated that in the above description of example embodiments of the disclosure, various features of the disclosure are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various concepts. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claims require more features than are expressly recited in each claim. Furthermore, while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the disclosure, and form different embodiments, as would be understood by those in the art. For example, in the following claims, any of the claimed embodiments can be used in any combination.
Furthermore, some of the embodiments are described herein as a method or combination of elements of a method that can be implemented by a processor of a computer system or by other means of carrying out the function. Thus, a processor with the necessary instructions for carrying out such a method or element of a method forms a means for carrying out the method or element of a method. Furthermore, an element described herein of an apparatus embodiment is an example of a means for carrying out the function performed by the element for the purpose of carrying out one or more of the disclosed embodiments.
Rather, as the following claims reflect, various inventive features lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the DESCRIPTION OF EXAMPLE EMBODIMENTS are hereby expressly incorporated into this DESCRIPTION OF EXAMPLE EMBODIMENTS, with each claim standing on its own as a separate embodiment of the disclosure.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments may be practiced without these specific details. In other instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
As used herein, unless otherwise specified the use of the ordinal adjectives “first”, “second”, “third”, etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
Thus, while there has been described what are believed to be the preferred embodiments, those skilled in the art will recognize that other and further modifications may be made thereto without departing from the spirit of the disclosure, and it is intended to claim all such changes and modifications as fall within the scope of the embodiments. For example, any formulas given above are merely representative of procedures that may be used. Functionality may be added to or deleted from the block diagrams, and operations may be interchanged among functional blocks. Steps may be added to or deleted from the methods described within the scope of the present disclosure.
This application claims priority to copending U.S. provisional application entitled, “SPLICING AND PROCESSING VIDEO AND OTHER FEATURES FOR LOW DELAY,” having Ser. No. 60/980,442, filed Oct. 16, 2007, which is entirely incorporated herein by reference. This application is related to copending U.S. utility application entitled, “INDICATING PICTURE USEFULNESS FOR PLAYBACK OPTIMIZATION,” having Ser. No. 11/831,916, filed Jul. 31, 2007, which is entirely incorporated herein by reference. Application Ser. No. 11/831,916 has also published on May 15, 2008 as U.S. Patent Publication No. 20080115176A1.