The present invention relates to decoding video frames, and more particularly, to a method and an apparatus for decoding a multi-level video bitstream.
Conventional video coding standards generally adopt a block-based (or coding-unit-based) coding technique to exploit spatial redundancy and temporal redundancy. For example, the basic approach is to divide the whole source frame into a plurality of blocks (coding units), perform prediction on each block (coding unit), transform residues of each block (coding unit), and perform quantization and entropy encoding. In addition, a reconstructed frame is generated in a coding loop to provide reference pixel data used for coding subsequent blocks (coding units). For certain video coding standards, in-loop filter(s) may be used for enhancing the image quality of the reconstructed frame. A video encoder is used to encode video frames into a bitstream for transmission or storage. A video decoder can be used to decode the bitstream generated from the video encoder to obtain decoded video frames.
The development of video compression is of high importance given the increased use of video: effective video compression reduces both the storage space needed and the transmission bandwidth used. There is a large variety of video playback devices today, with a great diversity in screen size, bandwidth, and processing power. One multi-level video bitstream, including one video subset bitstream corresponding to a fundamental plane and at least one video subset bitstream corresponding to at least one augmented plane, can be generated from a video source device to meet different playback requirements (e.g., different screen sizes and/or different computing capabilities) of video playback devices. There is a need for an innovative video decoding design that is capable of decoding the multi-level video bitstream in an efficient way.
One of the objectives of the claimed invention is to provide a method and an apparatus for decoding a multi-level video bitstream.
According to a first aspect of the present invention, an exemplary video decoding method for decoding a multi-plane video bitstream is disclosed. The multi-plane video bitstream includes a first video subset bitstream corresponding to a fundamental plane (FP) and at least one second video subset bitstream corresponding to at least one augmented plane (AP). The exemplary video decoding method includes: decoding the first video subset bitstream, wherein decoding of a first FP frame is performed to generate a first decoded FP frame; performing resampling of one decoded FP frame to generate one resampled frame, wherein said one resampled frame serves as one reference frame for decoding one AP frame; and decoding said at least one second video subset bitstream, wherein decoding of a first AP frame is performed to generate a first decoded AP frame. A processing time of performing decoding of the first FP frame overlaps a processing time of performing resampling of said one decoded FP frame.
According to a second aspect of the present invention, an exemplary video decoding method for decoding a multi-plane video bitstream is disclosed. The multi-plane video bitstream includes a first video subset bitstream corresponding to a fundamental plane (FP) and at least one second video subset bitstream corresponding to at least one augmented plane (AP). The exemplary video decoding method includes: decoding the first video subset bitstream, wherein decoding of a first FP frame is performed to generate a first decoded FP frame; performing resampling of one decoded FP frame to generate one resampled frame, wherein said one resampled frame serves as one reference frame for decoding one AP frame; and decoding said at least one second video subset bitstream, wherein decoding of a first AP frame is performed to generate a first decoded AP frame. A processing time of performing decoding of the first AP frame overlaps a processing time of performing resampling of said one decoded FP frame.
According to a third aspect of the present invention, an exemplary video decoding apparatus for decoding a multi-plane video bitstream is disclosed. The exemplary multi-plane video bitstream includes a first video subset bitstream corresponding to a fundamental plane (FP) and at least one second video subset bitstream corresponding to at least one augmented plane (AP). The exemplary video decoding apparatus includes a video decoding circuit and a resampling circuit. The video decoding circuit is arranged to decode the first video subset bitstream, wherein decoding of a first FP frame is performed to generate a first decoded FP frame. The video decoding circuit is further arranged to decode said at least one second video subset bitstream, wherein decoding of a first AP frame is performed to generate a first decoded AP frame. The resampling circuit is arranged to perform resampling of one decoded FP frame to generate one resampled frame, wherein said one resampled frame serves as one reference frame for decoding one AP frame. A processing time of performing resampling of said one decoded FP frame overlaps at least one selected from a group of a processing time of performing decoding of the first FP frame and a processing time of performing decoding of the first AP frame.
These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.
Certain terms are used throughout the following description and claims, which refer to particular components. As one skilled in the art will appreciate, electronic equipment manufacturers may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not in function. In the following description and in the claims, the terms “include” and “comprise” are used in an open-ended fashion, and thus should be interpreted to mean “include, but not limited to . . . ”. Also, the term “couple” is intended to mean either an indirect or direct electrical connection. Accordingly, if one device is coupled to another device, that connection may be through a direct electrical connection, or through an indirect electrical connection via other devices and connections.
The demultiplexer 102 is arranged to apply demultiplexing to the first video subset bitstream BSFP and the second video subset bitstream BSAP multiplexed in the same multi-plane video bitstream BSML. For example, a video content carried by the first video subset bitstream BSFP and a video content carried by the second video subset bitstream BSAP may have different resolutions (i.e., different frame sizes), different frame rates, different signal-to-noise ratios (i.e., different quality), different color formats, and/or different bit depths.
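As a purely illustrative aid, the following minimal sketch shows one way such demultiplexing could be modeled; the Packet type and the assumption that each packet carries an explicit plane identifier are hypothetical, since the actual syntax of the multi-plane video bitstream BSML depends on the coding standard and is not specified here.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical packetized form of the multi-plane bitstream BS_ML: each packet is
// assumed to carry an identifier telling which plane it belongs to (0 = fundamental
// plane, 1..N = augmented planes). This is an assumption for illustration only.
struct Packet { int planeId; std::vector<uint8_t> payload; };

// Demultiplex BS_ML into the FP subset bitstream BS_FP (planeId == 0) and the AP
// subset bitstream BS_AP (all other planeId values), mirroring demultiplexer 102.
void Demultiplex(const std::vector<Packet>& bsMl,
                 std::vector<Packet>& bsFp, std::vector<Packet>& bsAp) {
    for (const Packet& p : bsMl) {
        (p.planeId == 0 ? bsFp : bsAp).push_back(p);
    }
}
```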
The video decoding circuit 106 is arranged to decode the first video subset bitstream BSFP to generate decoded FP frames IMGFP on the fundamental plane. The video decoding circuit 106 is further arranged to decode the second video subset bitstream BSAP to generate decoded AP frames IMGAP on the augmented plane. In some embodiments of the present invention, each FP frame decoded from the first video subset bitstream BSFP has a first resolution, and each AP frame decoded from the second video subset bitstream BSAP has a second resolution, where the first resolution is smaller than the second resolution. For example, the first resolution may be a Full High Definition (FHD) resolution, and the second resolution may be an Ultra High Definition (UHD) resolution.
The decoded FP frame IMGFP can be used to generate a reference frame used by prediction involved in decoding of one AP frame. The resampling circuit 104 is arranged to perform resampling of one decoded FP frame IMGFP to generate one resampled frame IMGRS, wherein the resampled frame IMGRS serves as one reference frame for decoding one AP frame. For example, a video content carried by the first video subset bitstream BSFP and a video content carried by the second video subset bitstream BSAP may have different resolutions and/or different bit depths. The resampling circuit 104 resamples the decoded FP frame IMGFP on the fundamental plane to generate the resampled frame IMGRS having the same resolution and bit depth as the AP frame on the augmented plane. In other words, the resampled frame IMGRS and the decoded AP frame IMGAP have the same resolution and bit depth. Moreover, the motion vector (MV) information involved in generating the decoded FP frame IMGFP may be resampled by the resampling circuit 104 for AP temporal MV prediction.
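To make the resampling step concrete, the following is a minimal sketch that converts one sample plane of a decoded FP frame to the AP resolution and bit depth. The Plane type, the ResampleFpToAp name, nearest-neighbor scaling, and the plain left-shift bit-depth conversion are all illustrative assumptions; the actual resampling circuit 104 would typically apply the codec-defined interpolation filters.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sample buffer: one luma plane, row-major.
struct Plane {
    int width, height, bitDepth;
    std::vector<uint16_t> samples;
};

// Resample a decoded FP plane to the AP resolution and bit depth. Nearest-neighbor
// scaling and a plain shift are used only to keep the sketch short; a real resampler
// would normally apply a multi-tap interpolation filter.
Plane ResampleFpToAp(const Plane& fp, int apWidth, int apHeight, int apBitDepth) {
    Plane rs{apWidth, apHeight, apBitDepth,
             std::vector<uint16_t>(static_cast<size_t>(apWidth) * apHeight)};
    const int shift = apBitDepth - fp.bitDepth;  // e.g., 10-bit AP from 8-bit FP
    for (int y = 0; y < apHeight; ++y) {
        const int srcY = y * fp.height / apHeight;
        for (int x = 0; x < apWidth; ++x) {
            const int srcX = x * fp.width / apWidth;
            const uint16_t s = fp.samples[static_cast<size_t>(srcY) * fp.width + srcX];
            rs.samples[static_cast<size_t>(y) * apWidth + x] =
                static_cast<uint16_t>(shift >= 0 ? (s << shift) : (s >> -shift));
        }
    }
    return rs;
}
```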
Since a decoded FP frame generated from decoding an FP frame is resampled to serve as a reference frame that can be used for decoding an AP frame to generate a decoded AP frame, there is a dependency between decoding of an FP frame and resampling of a decoded FP frame, and a dependency between resampling of a decoded FP frame and decoding of an AP frame. For example, a resampling process (RP) is performed after decoding of an FP frame. For another example, the RP is performed before decoding of an AP frame.
If a resampling process of a decoding result of an FP frame (e.g., “FP pic 0”) on the fundamental plane does not start until a decoding process of the FP frame (e.g., “FP pic 0”) on the fundamental plane is completed, and a decoding process of an AP frame (e.g., “AP pic 0”) on the augmented plane does not start until the resampling process of the decoding result of the FP frame (e.g., “FP pic 0”) on the fundamental plane is completed, the decoding time needed to obtain one decoded FP frame and one decoded AP frame is roughly equal to TFP+TRP+TAP, where TFP is the FP decoding time, TRP is the RP processing time, and TAP is the AP decoding time. To effectively reduce the decoding time, the present invention proposes a parallel processing scheme for decoding the multi-level video bitstream BSML.
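To illustrate the potential saving, the following sketch compares the fully sequential schedule with an overlapped schedule. The millisecond values and the 25% start fractions are purely hypothetical and are not taken from the disclosure; they merely show why the overlapped total stays below TFP+TRP+TAP.

```cpp
#include <algorithm>
#include <cstdio>

int main() {
    // Hypothetical per-picture processing times (milliseconds); actual values depend
    // on the hardware and on the FP/AP resolutions.
    const double tFp = 8.0;   // FP decoding time TFP
    const double tRp = 3.0;   // resampling time TRP
    const double tAp = 12.0;  // AP decoding time TAP

    // Fully sequential schedule: each stage waits for the previous one to finish.
    const double sequential = tFp + tRp + tAp;

    // Overlapped schedule: assume resampling may start once 25% of the FP picture is
    // decoded, and AP decoding may start once 25% of the resampled frame is available
    // (illustrative fractions only, assuming each stage then proceeds at its own rate).
    const double rpStart = 0.25 * tFp;
    const double rpEnd   = std::max(rpStart + tRp, tFp);   // RP cannot outrun FP decoding
    const double apStart = rpStart + 0.25 * tRp;
    const double apEnd   = std::max(apStart + tAp, rpEnd);  // AP cannot outrun resampling
    std::printf("sequential: %.1f ms, overlapped: %.1f ms\n", sequential, apEnd);
    return 0;
}
```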
The parallel processing scheme is managed by the control circuit 108. In this embodiment, the control circuit 108 is arranged to control operations of the video decoding circuit 106 and the resampling circuit 104. For example, the control circuit 108 is arranged to trigger a decoding process of one FP frame, a decoding process of one AP frame, and a resampling process of one decoded FP frame. For example, the control circuit 108 may be a processor, and operations of the video decoding circuit 106 and the resampling circuit 104 may be managed by firmware FW running on the control circuit 108. Decoding the first video subset bitstream BSFP includes performing decoding of one FP frame to generate one decoded FP frame IMGFP. Decoding the second video subset bitstream BSAP includes performing decoding of one AP frame to generate one decoded AP frame IMGAP. When the parallel processing scheme is enabled by the control circuit 108, a processing time of performing resampling of one decoded FP frame overlaps at least one selected from a group of a processing time of performing decoding of one FP frame and a processing time of performing decoding of one AP frame. For example, a processing time of performing resampling of one decoded FP frame overlaps a processing time of performing decoding of one FP frame. For another example, a processing time of performing resampling of one decoded FP frame overlaps a processing time of performing decoding of one AP frame. For yet another example, a processing time of performing resampling of one decoded FP frame overlaps a processing time of performing decoding of one FP frame, and further overlaps a processing time of performing decoding of one AP frame. It should be noted that the term “overlap” may mean a processing time of one operation is fully hidden in a processing time of another operation, or a processing time of one operation is partially hidden in a processing time of another operation.
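One way to picture how such overlap can be orchestrated is the following minimal sketch, which assumes the firmware on the control circuit 108 can observe decoding and resampling progress at row granularity. The identifiers, the toy frame heights, and the rule that each output row depends only on its co-located source row are illustrative assumptions, not the actual firmware interface.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>

// Row-progress counters shared between the three processes.
std::mutex m;
std::condition_variable cv;
int fpRowsDone = 0;            // rows of the decoded FP frame produced so far
int rsRowsDone = 0;            // rows of the resampled frame produced so far
constexpr int kFpRows = 4;     // toy FP frame height (assumed)
constexpr int kApRows = 8;     // toy AP frame height (assumed)

void WaitFor(int& counter, int target) {
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, [&] { return counter >= target; });
}

void Publish(int& counter) {
    { std::lock_guard<std::mutex> lk(m); ++counter; }
    cv.notify_all();
}

void DecodeFp() {                      // FP decoding, one row at a time
    for (int r = 0; r < kFpRows; ++r) {
        std::this_thread::sleep_for(std::chrono::milliseconds(5));  // stand-in for decoding
        Publish(fpRowsDone);
    }
}

void Resample() {                      // resampling starts before FP decoding finishes
    for (int r = 0; r < kApRows; ++r) {
        const int srcRow = r * kFpRows / kApRows;  // FP row this output row depends on
        WaitFor(fpRowsDone, srcRow + 1);           // a real filter would need extra margin rows
        std::this_thread::sleep_for(std::chrono::milliseconds(2));
        Publish(rsRowsDone);
    }
}

void DecodeAp() {                      // AP decoding starts before resampling finishes
    for (int r = 0; r < kApRows; ++r) {
        WaitFor(rsRowsDone, r + 1);    // assume row r only needs the co-located resampled row;
                                       // real motion compensation may need a wider window
        std::this_thread::sleep_for(std::chrono::milliseconds(4));
    }
    std::puts("AP picture decoded");
}

int main() {
    std::thread fp(DecodeFp), rp(Resample), ap(DecodeAp);
    fp.join(); rp.join(); ap.join();
    return 0;
}
```

In this toy model, resampling of an output row starts as soon as its FP source row is available, so the resampling time is largely hidden within the FP and AP decoding times, which is the sense in which the processing times "overlap".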
In one exemplary design, the video decoding circuit 106 is implemented using a single hardware video decoder, such as the video decoder 300 with the video decoder architecture 400.
The video decoder 300 with the video decoder architecture 400 can be managed by the control circuit 108 to perform FP frame decoding and AP frame decoding in a time-division manner. When the bitstream to be decoded is a part of the first video subset bitstream BSFP that is output from the demultiplexer 102, the video decoder architecture 400 is used to decode the part of the first video subset bitstream BSFP to generate one decoded frame being a decoded FP frame IMGFP, where the decoded FP frame IMGFP is stored into the DPB 412 and can be used as a reference frame for decoding FP frame(s). When the bitstream to be decoded is a part of the second video subset bitstream BSAP that is output from the demultiplexer 102, the video decoder architecture 400 is re-used to decode the part of the second video subset bitstream BSAP to generate one decoded frame being a decoded AP frame IMGAP, where the decoded AP frame IMGAP is stored into the DPB 412 and can be used as a reference frame for decoding AP frame(s). Moreover, after the decoded FP frame IMGFP is stored into the DPB 412, the resampling circuit 104 reads the decoded FP frame IMGFP from the DPB 412, resamples the decoded FP frame IMGFP to generate a resampled frame IMGRS, and stores the resampled frame IMGRS into the DPB 412, where the resampled frame IMGRS can be used as a reference frame for decoding one AP frame.
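The time-division reuse of the single decoder can be pictured with the following minimal sketch. DecodeFpPicture, ResampleFpPicture, and DecodeApPicture are illustrative stand-ins for driver calls into the video decoder 300 and the resampling circuit 104, not an actual API; the strictly sequential ordering shown here is the baseline that the parallel processing scheme described below improves upon.

```cpp
#include <cstdio>

// Illustrative stand-ins for the hardware steps described above; the function names
// and the shared "DPB 412" bookkeeping are assumptions for this sketch only.
void DecodeFpPicture(int pic)   { std::printf("decode FP pic %d -> DPB 412\n", pic); }
void ResampleFpPicture(int pic) { std::printf("resample FP pic %d -> RP pic %d (DPB 412)\n", pic, pic); }
void DecodeApPicture(int pic)   { std::printf("decode AP pic %d using RP pic %d -> DPB 412\n", pic, pic); }

int main() {
    // One hardware decoder reused in a time-division manner: the FP picture and the AP
    // picture of the same output time are decoded back to back, with resampling in between.
    for (int pic = 0; pic < 3; ++pic) {
        DecodeFpPicture(pic);
        ResampleFpPicture(pic);
        DecodeApPicture(pic);
    }
    return 0;
}
```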
Regarding the video decoding circuit 106 implemented using a single hardware video decoder, a parallel processing scheme can be managed by the control circuit 108 to effectively reduce the decoding time. For example, the decoding time can be reduced by allowing a resampling process of a decoded FP frame to start before an end of a decoding process of an FP frame that is used for generating the decoded FP frame, and/or allowing a decoding process of an AP frame to start before an end of the resampling process of the decoded FP frame. In this way, the decoding time needed to obtain one decoded FP frame and one decoded AP frame is less than TFP+TRP+TAP, where TFP is the FP decoding time, TRP is the RP processing time, and TAP is the AP decoding time.
For example, the resampling circuit 104 can start a resampling process of a decoded FP frame (which is derived from a decoding process of an FP frame “FP pic 1”) before the decoding process of the FP frame “FP pic 1” is completed, and the video decoder 300 can start a decoding process of an AP frame “AP pic 1” (which uses a resampled frame “RP pic 1” as a reference frame) before the resampling process of the decoded FP frame (which is derived from the decoding process of the FP frame “FP pic 1”) is completed. Hence, a portion of the resampled frame “RP pic 1” is generated by resampling a portion of the decoded FP frame (which is derived from the decoding process of the FP frame “FP pic 1”) before the decoding process of the FP frame “FP pic 1” is completed, and a portion of a decoded AP frame (which is derived from the decoding process of the AP frame “AP pic 1”) is generated before the resampling process of the decoded FP frame (which is derived from the decoding process of the FP frame “FP pic 1”) is completed.
In another exemplary design, the video decoding circuit 106 is implemented using multiple hardware video decoders, such as the video decoders 702 and 704.
In some embodiments of the present invention, each of the video decoders 702 and 704 may employ the video decoder architecture 400. The video decoder 702 with the video decoder architecture 400 can be managed by the control circuit 108 to perform FP decoding, where decoded FP frames IMGFP are stored into the DPB 703 and can be used as reference frames for decoding FP frame(s).
In addition, the video decoder 704 with the video decoder architecture 400 can be managed by the control circuit 108 to perform AP decoding, where decoded AP frames IMGAP are stored into the DPB 705 and can be used as reference frames for decoding AP frame(s).
Compared to the DPB 703 of the video decoder 702, the DPB 705 of the video decoder 704 further stores resampled frames IMGRS generated from resampling decoded FP frames IMGFP. Hence, after the decoded FP frame IMGFP is stored into the DPB 703, the resampling circuit 104 reads the decoded FP frame IMGFP from the DPB 703, resamples the decoded FP frame IMGFP to generate a resampled frame IMGRS, and stores the resampled frame IMGRS into the DPB 705, where the resampled frame IMGRS stored in the DPB 705 can be used as a reference frame for decoding one AP frame.
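A minimal sketch of this dual-DPB data flow is given below. The Frame and Dpb types and the dpb703/dpb705 identifiers merely mirror the reference numerals in the text; they are assumptions for illustration and not an actual driver interface.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Minimal stand-ins for frames and the two decoded picture buffers.
struct Frame { int width; int height; int bitDepth; std::vector<uint16_t> samples; };
using Dpb = std::map<int, Frame>;   // picture index -> frame

Dpb dpb703;   // DPB of video decoder 702: decoded FP frames (FP reference frames)
Dpb dpb705;   // DPB of video decoder 704: decoded AP frames plus resampled frames IMG_RS

// After FP picture 'pic' has been decoded into dpb703, the resampling circuit reads it,
// converts it to the AP resolution and bit depth (conversion details omitted; see the
// resampling sketch above), and writes the result into dpb705.
void StoreResampledReference(int pic, const Frame& resampledFromDpb703) {
    dpb705[pic] = resampledFromDpb703;
}

// The AP decoder then fetches the resampled frame from its own DPB when it needs the
// inter-plane reference for decoding AP picture 'pic'.
const Frame& FetchInterPlaneReference(int pic) {
    return dpb705.at(pic);
}
```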
Regarding the video decoding circuit 106 implemented using multiple hardware video decoders, a parallel processing scheme can be managed by the control circuit 108 to effectively reduce the decoding time. For example, the decoding time can be reduced by allowing a resampling process of a decoded FP frame to start before an end of a decoding process of an FP frame that is used for generating the decoded FP frame, and/or allowing a decoding process of an AP frame to start before an end of the resampling process of the decoded FP frame. In this way, the decoding time needed to obtain one decoded FP frame and one decoded AP frame is less than TFP+TRP+TAP, where TFP is the FP decoding time, TRP is the RP processing time, and TAP is the AP decoding time.
For example, the resampling circuit 104 can start a resampling process of a decoded FP frame (which is derived from a decoding process of an FP frame “FP pic 1”) before the decoding process of the FP frame “FP pic 1” is completed, and the video decoder 704 can start a decoding process of an AP frame “AP pic 1” (which uses a resampled frame “RP pic 1” as a reference frame) before the resampling process of the decoded FP frame (which is derived from the decoding process of the FP frame “FP pic 1”) is completed. Hence, a portion of the resampled frame “RP pic 1” is generated by resampling a portion of the decoded FP frame (which is derived from the decoding process of the FP frame “FP pic 1”) before the decoding process of the FP frame “FP pic 1” is completed, and a portion of a decoded AP frame (which is derived from the decoding process of the AP frame “AP pic 1”) is generated before the resampling process of the decoded FP frame (which is derived from the decoding process of the FP frame “FP pic 1”) is completed.
Similarly, the resampling circuit 104 can start a resampling process of a decoded FP frame (which is derived from a decoding process of an FP frame “FP pic 2”) before the decoding process of the FP frame “FP pic 2” is completed, and the video decoder 704 can start a decoding process of an AP frame “AP pic 2” (which uses a resampled frame “RP pic 2” as a reference frame) before the resampling process of the decoded FP frame (which is derived from the decoding process of the FP frame “FP pic 2”) is completed. Hence, a portion of the resampled frame “RP pic 2” is generated by resampling a portion of the decoded FP frame (which is derived from the decoding process of the FP frame “FP pic 2”) before the decoding process of the FP frame “FP pic 2” is completed, and a portion of a decoded AP frame (which is derived from the decoding process of the AP frame “AP pic 2”) is generated before the resampling process of the decoded FP frame (which is derived from the decoding process of the FP frame “FP pic 2”) is completed.
Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.
This application claims the benefit of U.S. provisional application No. 62/555,149, filed on Sep. 7, 2017 and incorporated herein by reference.