This application claims the priority benefit of Korean Patent Application No. 10-2015-0013615, filed on Jan. 28, 2015, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein by reference.
1. Field of the Invention
Embodiments relate to a video stream processing method and apparatus that edit a sequence of pictures, and more particularly, to a video stream processing method and apparatus that easily edit pictures included in a video stream by configuring a group of pictures (GOP) included in the video stream with an intra coded picture (I-picture) and at least one bi-prediction coded picture (B-picture) referring to the I-picture.
2. Description of the Related Art
Pictures or video included in a video stream need to be edited to produce a broadcast program. A video stream refers to data including a plurality of pictures, and the pictures included in the video stream may be encoded using intra prediction or inter prediction. To edit an inter prediction based picture which refers to another picture, the picture which the corresponding picture refers to must also be decoded. Accordingly, even when relatively few pictures are edited, the computational complexity of decoding and encoding may increase considerably.
Recently, technology that applies intra prediction to all pictures included in a video stream has been used to support editing on a frame-by-frame basis. When all pictures included in the video stream are processed using intra prediction, a picture not to be edited does not need to be decoded or re-encoded. However, when all pictures are processed using intra prediction, the size of the corresponding video stream may greatly increase, which is not suitable for a storage device with a restricted capacity.
An aspect provides a method and apparatus that may easily edit pictures and effectively increase an encoding efficiency by configuring a group of pictures (GOP) with an intra coded picture (I-picture) and at least one bi-prediction coded picture (B-picture) referring to the I-picture.
Another aspect also provides a method and apparatus that may output an editing result as an encoded video stream without performing re-encoding by configuring a GOP with an I-picture and at least one B-picture referring to the I-picture.
Still another aspect also provides a method and apparatus that may remove a computational complexity for re-encoding and enable fast processing when storing or outputting an editing result by outputting the editing result as an encoded video stream without performing re-encoding.
Yet another aspect also provides a method and apparatus that may encode a video stream in a prediction structure of ultra low delay by configuring a GOP with an I-picture and at least one B-picture referring to the I-picture.
According to an aspect, there is provided a video stream processing method including identifying a target picture to be edited among an intra coded picture (I-picture) and at least one bi-prediction coded picture (B-picture) subsequent to the I-picture, the I-picture and the at least one B-picture constituting a group of pictures (GOP) included in a video stream, and processing the target picture, wherein pictures included in the video stream may be decoded in a playback order.
Each of the at least one B-picture may be predicted by referring to the I-picture.
The processing may include decoding the target picture and a reference I-picture which the target picture refers to.
The processing may include setting a flag of the target picture and a flag of the reference I-picture as different values.
The target picture may be decoded and played back, and the reference I-picture may be decoded and not be played back.
A video stream including the processed target picture may be output, and the output video stream may include a flag indicating that a picture to be decoded and not to be played back is included.
The video stream may be encoded or decoded using high efficiency video coding (HEVC).
The I-picture may be decoded separately without referring to another picture.
According to another aspect, there is also provided a video stream processing apparatus including an identifier configured to identify a target picture to be edited among an I-picture and at least one B-picture subsequent to the I-picture, the I-picture and the at least one B-picture constituting a GOP included in a video stream, and a processor configured to process the target picture, wherein the pictures included in the video stream may be decoded in a playback order.
These and/or other aspects, features, and advantages of the invention will become apparent and more readily appreciated from the following description of embodiments, taken in conjunction with the accompanying drawings of which:
Hereinafter, embodiments are described in detail with reference to the accompanying drawings. The following specific structural or functional descriptions are exemplary to merely describe the embodiments, and the scope of the embodiments is not limited to the descriptions provided in the present specification. Various changes and modifications can be made thereto by those of ordinary skill in the art. Further, like reference numerals refer to the like elements throughout the drawings, and a known function and configuration are not described herein.
A video stream 100 provided in a prediction structure illustrated in
The video stream 100 may include at least one of an I-picture, a P-picture, and a B-picture.
The I-picture may be encoded using intra prediction. All pixels in the I-picture may be encoded. The I-picture may be encoded based on the corresponding picture only. In detail, the I-picture refers to a picture that may be encoded separately without referring to another picture positioned in the vicinity of the corresponding picture. Similarly, the I-picture may be decoded based on the corresponding picture only. The I-picture may be referred to by the P-picture or the B-picture.
The P-picture may be encoded using forward inter-picture prediction. The P-picture may be predicted by referring to a P-picture or an I-picture which is positioned temporally in advance. Thus, the P-picture may include less information than the I-picture. Similarly, the P-picture may be decoded using forward inter-picture prediction.
The B-picture may be encoded using bi-inter-picture prediction, for example, forward prediction and backward prediction, two forward predictions, or two backward predictions. The B-picture may be predicted by referring to an I-picture or a P-picture which is positioned temporally in advance, and an I-picture or a P-picture which is positioned temporally behind. Thus, the B-picture may include less information than the I-picture and the P-picture. Similarly, the B-picture may be decoded using bi-inter-picture prediction.
The I-picture, the P-picture, and the B-picture may constitute a single group of pictures (GOP). The GOP is a set of successive pictures. In general, a GOP may include a single I-picture, at least one P-picture, and at least one B-picture. Thus, the GOP may include a single I-picture and pictures included in a period before a subsequent I-picture.
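As a rough illustration of the grouping rule above, the following Python sketch splits a sequence of picture types into GOPs by starting a new group at each I-picture. The function name `split_into_gops` and the sample sequence are hypothetical and are not taken from the drawings.

```python
def split_into_gops(picture_types):
    """Group picture types so that each GOP starts at an I-picture
    and runs up to, but not including, the next I-picture."""
    gops = []
    for picture_type in picture_types:
        # Open a new GOP at every I-picture (or for a leading partial group).
        if picture_type == "I" or not gops:
            gops.append([])
        gops[-1].append(picture_type)
    return gops


# Hypothetical sequence of picture types, for illustration only.
sample = ["I", "B", "B", "P", "I", "B", "P"]
print(split_into_gops(sample))
```

Each returned list holds one I-picture followed by the pictures that precede the subsequent I-picture.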
For example, a GOP of
In detail, P1 is a P-picture which is predicted by referring to I1. To encode P1, I1 needs to be encoded first. Similarly, to decode P1, I1 needs to be decoded first. B5 is a B-picture which is predicted by referring to P2 and I2. Thus, to encode B5, P2 and I2 need to be encoded first. Similarly, to decode B5, P2 and I2 need to be decoded first.
The GOP of
A video stream 200 provided in a prediction structure illustrated in
For example, a case that an editor wants to cut from B1 to B3 is assumed in
To play back a preview of the cut editing of the editing period 1, B1, B2, P1, and B3 need to be decoded. To decode B1, B2, P1, and B3, I1 and P2 also need to be decoded. In detail, I1 needs to be decoded before P1, and P2 needs to be decoded before B3.
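The decoding cost described above can be sketched as a dependency closure over a reference map. In the following Python sketch, only the dependencies of P1 on I1 and of B3 on P2 come from the description; the remaining entries of `REFS` and the function name `pictures_to_decode` are assumptions made for illustration.

```python
# Hypothetical reference map for a conventional GOP. Only "P1 -> I1" and
# "B3 -> P2" are stated in the description; the other entries are assumed.
REFS = {
    "I1": [],
    "B1": ["I1", "P1"],
    "B2": ["I1", "P1"],
    "P1": ["I1"],
    "B3": ["P1", "P2"],
    "P2": ["P1"],
}


def pictures_to_decode(targets, refs):
    """Return the targets plus every picture they depend on,
    directly or indirectly."""
    needed = set()
    stack = list(targets)
    while stack:
        picture = stack.pop()
        if picture in needed:
            continue
        needed.add(picture)
        stack.extend(refs[picture])
    return needed


editing_period = ["B1", "B2", "P1", "B3"]
needed = pictures_to_decode(editing_period, REFS)
extra = sorted(needed - set(editing_period))
print(extra)  # ['I1', 'P2']
```

Under these assumptions, I1 and P2 must be decoded even though neither belongs to the editing period.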
To generate the output of the cut editing in
A video stream 300 provided in a prediction structure illustrated in
An I-picture refers to a picture to be encoded or decoded using intra prediction, and may be the same as the I-pictures illustrated in
However, the B-picture in
The video stream 300 may include GOPs, and each GOP may include an I-picture and at least one B-picture. The at least one B-picture included in each GOP may be predicted by referring to the I-picture included in the corresponding GOP. The I-picture included in each GOP may be disposed temporally at a foremost position, among the pictures included in the corresponding GOP. The at least one B-picture may be predicted by referring to the I-picture which is positioned temporally in advance.
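A minimal Python sketch of this GOP structure follows, assuming that every B-picture refers only to the I-picture at the front of its own GOP. The class `Gop`, the function `pictures_to_decode`, and the picture labels are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gop:
    i_picture: str          # temporally foremost picture of the GOP
    b_pictures: List[str]   # each one refers only to i_picture

    def reference_of(self, picture):
        """Return None for the I-picture, else the GOP's I-picture."""
        if picture == self.i_picture:
            return None
        if picture in self.b_pictures:
            return self.i_picture
        raise KeyError(picture)


def pictures_to_decode(gops, targets):
    """Return the targets plus the I-picture of each GOP they belong to."""
    needed = set()
    for gop in gops:
        for picture in [gop.i_picture] + gop.b_pictures:
            if picture in targets:
                needed.add(picture)
                reference = gop.reference_of(picture)
                if reference is not None:
                    needed.add(reference)
    return needed


# Hypothetical stream with two GOPs of four B-pictures each.
stream = [
    Gop("I1", ["B1", "B2", "B3", "B4"]),
    Gop("I2", ["B5", "B6", "B7", "B8"]),
]
print(sorted(pictures_to_decode(stream, {"B3", "B4"})))  # ['B3', 'B4', 'I1']
```

Because no B-picture depends on another B-picture in this model, the only extra picture needed for any target is the I-picture of its GOP.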
For example, as shown in
A video stream 400 provided in a prediction structure illustrated in
In a first case, a cut editing of the editing period 1, B3 and B4 in GOP1, is assumed in
In a second case, it is assumed that pictures included in different GOPs are edited together, such as the editing period 2 in
For example, in a case of editing to cut and merge the editing period 1 and editing period 2 in
Because the configuring of a GOP included in a video stream with an I-picture and at least one B-picture having only one reference picture as the I-picture such as
Further, the configuring of a GOP such as
Referring to
For example, in a case of editing B3 and B4 corresponding to the editing period 1 in
Referring to
For example, in a case of editing B3 and B4 corresponding to the editing period 1 of
In another example, in a case of outputting a result of cut and merge editing the editing period 1 and the editing period 2 in
The video stream processing method may be performed by a processor included in a video stream processing apparatus according to an embodiment.
Referring to
In operation 720, the video stream processing apparatus processes the target picture. The target picture may be a picture to be actually edited, and include at least one of the I-picture and the at least one B-picture.
To edit the target picture, the video stream processing apparatus may decode the target picture and a reference I-picture which the target picture refers to. In a case in which decoding the reference I-picture needs to be previously performed to decode the target picture, the video stream processing apparatus may decode the reference I-picture and then decode the target picture.
In a case of outputting an editing result as an encoded video stream, the video stream processing apparatus may set a flag of the target picture and a flag of the reference I-picture as different values. Accordingly, when playing back the encoded video stream, the target picture may be played back and the reference I-picture may not be played back. The output encoded video stream may include a flag which indicates that a picture to be decoded and not to be played back is included.
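The flag handling described above may be sketched in Python as follows. The field name `output_flag`, the class `EncodedPicture`, and the function `cut_without_reencoding` are hypothetical; the description does not specify how the flags are encoded. The sketch assumes that each B-picture refers only to the I-picture of its GOP and that encoded payloads are copied unchanged.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EncodedPicture:
    name: str
    reference: Optional[str]  # name of the reference I-picture, None for an I-picture
    payload: bytes            # encoded data, copied as-is and never re-encoded
    output_flag: int = 1      # hypothetical flag: 1 = play back, 0 = decode only


def cut_without_reencoding(gop, targets):
    """Keep the target pictures and the reference I-picture they need.
    A reference picture that is not itself a target is marked decode-only."""
    referenced = {p.reference for p in gop if p.name in targets and p.reference}
    output = []
    for picture in gop:  # playback order equals decoding order
        if picture.name in targets:
            output.append(EncodedPicture(picture.name, picture.reference, picture.payload, 1))
        elif picture.name in referenced:
            output.append(EncodedPicture(picture.name, picture.reference, picture.payload, 0))
    # Stream-level indication that a decode-only picture is present.
    has_decode_only = any(p.output_flag == 0 for p in output)
    return output, has_decode_only


# Hypothetical GOP with placeholder payloads.
gop1 = [EncodedPicture("I1", None, b"i1")] + [
    EncodedPicture(f"B{n}", "I1", f"b{n}".encode()) for n in range(1, 5)
]
result, has_decode_only = cut_without_reencoding(gop1, {"B3", "B4"})
print([(p.name, p.output_flag) for p in result], has_decode_only)
```

In this sketch the I-picture is carried in the output only so that the retained B-pictures can be decoded, and its flag value differs from that of the target pictures.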
Referring to
The identifier 810 identifies a target picture to be edited among an I-picture and at least one B-picture constituting a GOP included in a video stream. The at least one B-picture may be positioned subsequent to the I-picture, and predicted by referring to the I-picture. The I-picture may be separately decoded or encoded without referring to another picture. The pictures included in the video stream may be decoded in a playback order. The video stream may be encoded or decoded using HEVC.
The processor 820 processes the target picture. The target picture may be a picture to be actually edited, and include at least one of the I-picture and the at least one B-picture.
To edit the target picture, the processor 820 may decode the target picture and a reference I-picture which the target picture refers to. In a case in which decoding the reference I-picture needs to be previously performed to decode the target picture, the processor 820 may decode the reference I-picture and then decode the target picture.
In a case of outputting an editing result as an encoded video stream, the processor 820 may set a flag of the target picture and a flag of the reference I-picture as different values. Accordingly, when playing back the encoded video stream, the target picture may be played back and the reference I-picture may not be played back. The output encoded video stream may include a flag which indicates that a picture to be decoded and not to be played back is included.
According to an embodiment, pictures may be easily edited and an encoding efficiency may effectively increase by configuring a group of pictures (GOP) with an I-picture and at least one B-picture referring to the I-picture.
According to an embodiment, an editing result may be output as an encoded video stream without performing re-encoding by configuring a GOP with an I-picture and at least one B-picture referring to the I-picture.
According to an embodiment, a computational complexity for re-encoding may be removed and fast processing when storing or outputting an editing result may be enabled by outputting the editing result as an encoded video stream without performing re-encoding.
According to an embodiment, a video stream may be encoded in a prediction structure of ultra low delay by configuring a GOP with an I-picture and at least one B-picture referring to the I-picture.
The units and/or modules described herein may be implemented using hardware components and software components. For example, the hardware components may include microphones, amplifiers, band-pass filters, audio-to-digital converters, and processing devices. A processing device may be implemented using one or more hardware devices configured to carry out and/or execute program code by performing arithmetical, logical, and input/output operations. The processing device(s) may include a processor, a controller and an arithmetic logic unit, a digital signal processor, a microcomputer, a field programmable array, a programmable logic unit, a microprocessor or any other device capable of responding to and executing instructions in a defined manner. The processing device may run an operating system (OS) and one or more software applications that run on the OS. The processing device also may access, store, manipulate, process, and create data in response to execution of the software. For purpose of simplicity, the description of a processing device is used as singular; however, one skilled in the art will appreciate that a processing device may include multiple processing elements and multiple types of processing elements. For example, a processing device may include multiple processors or a processor and a controller. In addition, different processing configurations are possible, such as parallel processors.
The software may include a computer program, a piece of code, an instruction, or some combination thereof, to independently or collectively instruct and/or configure the processing device to operate as desired, thereby transforming the processing device into a special purpose processor. Software and data may be embodied permanently or temporarily in any type of machine, component, physical or virtual equipment, computer storage medium or device, or in a propagated signal wave capable of providing instructions or data to or being interpreted by the processing device. The software also may be distributed over network coupled computer systems so that the software is stored and executed in a distributed fashion. The software and data may be stored by one or more non-transitory computer readable recording mediums.
The methods according to the above-described example embodiments may be recorded in non-transitory computer-readable media including program instructions to implement various operations of the above-described example embodiments. The media may also include, alone or in combination with the program instructions, data files, data structures, and the like. The program instructions recorded on the media may be those specially designed and constructed for the purposes of example embodiments, or they may be of the kind well-known and available to those having skill in the computer software arts. Examples of non-transitory computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM discs, DVDs, and/or Blu-ray discs; magneto-optical media such as optical discs; and hardware devices that are specially configured to store and perform program instructions, such as read-only memory (ROM), random access memory (RAM), flash memory (e.g., USB flash drives, memory cards, memory sticks, etc.), and the like. Examples of program instructions include both machine code, such as produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter. The above-described devices may be configured to act as one or more software modules in order to perform the operations of the above-described example embodiments, or vice versa.
A number of example embodiments have been described above. Nevertheless, it should be understood that various modifications may be made to these example embodiments. For example, suitable results may be achieved if the described techniques are performed in a different order and/or if components in a described system, architecture, device, or circuit are combined in a different manner and/or replaced or supplemented by other components or their equivalents. Accordingly, other implementations are within the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2015-0013615 | Jan 2015 | KR | national |