The present invention relates generally to video coding. More particularly, the present invention relates to scalable video coding.
Conventional video coding standards, such as the Moving Picture Experts Group (MPEG)-1 and H.261/263/264 standards, incorporate motion estimation and motion compensation in order to remove temporal redundancies between video frames. The scalable extension to the H.264/AVC (Advanced Video Coding) standard currently enables fine-grained scalability, according to which the quality of a video sequence may be improved by increasing the bit rate in increments of ten percent or less. Currently, fine granularity scalability (FGS) information is not considered to be a separate “layer,” but instead is stored along with the “base layer” relative to which it is encoded. However, when forming subsequent enhancement layers, it would be beneficial to have the option of basing the enhancement upon the base layer either with or without FGS.
Conventional systems, though moderately useful, include at least two substantial problems. First, scalability does not always follow a “linear” path. For example, it may be desirable to have a low spatial resolution base layer encoded at some minimal acceptable quality, with FGS used to enhance the quality. Furthermore, it may also be desirable to have a spatial enhancement encoded relative to the base layer (excluding FGS). This could be desired, for example, due to bit rate constraints on a transmission channel that does not permit the “expense” of transmitting the extra FGS data when only a spatial enhancement is desired.
In the currently planned H.264/AVC scalability extension, the FGS information is not considered to be a separate layer. In the slice header, the syntax element base_id_plus1 is used to indicate the base layer picture of an enhancement layer picture. However, there is no mechanism for specifying whether a subsequent enhancement layer is encoded relative to the base layer with or without FGS and, if with FGS, with which FGS layers. In other words, the operation must be “hard wired”.
Second, the progressive enhancement/refinement slices (i.e., FGS slices) and the corresponding base layer picture are currently envisioned as being in the same picture and therefore the same access unit. These items also have the same value for the DependencyId. This architecture is less than optimal for system-layer operations. In the media file format, e.g. the AVC file format specified in ISO/IEC 14496-15, metadata information is typically stored for each sample containing a picture or an access unit. The above picture (access unit) definition therefore requires a streaming server to parse into samples, even for non-FGS scalable streaming (i.e. when truncation of FGS slices is not needed to reach the desired scalable presentation point). From this point of view, the current design enforces a media file format for storage of scalable video content with increased complexity, which implies streaming server operations with increased complexity.
The present invention involves coding FGS information in a layer separate from its corresponding base information. According to one embodiment of the present invention, each FGS enhancement layer is made into its own picture and is assigned a unique DependencyId value. In this sense, each FGS enhancement plane or layer is treated in the same manner as other enhancement layers, such as spatial enhancement layers. The base layer picture of the FGS enhancement layer is made into another picture with its own DependencyId value. Subsequent enhancement layers can then be coded relative to either the quality base layer or an FGS enhancement layer. This system provides an improved level of flexibility in scalable video coding while keeping complexity low.
According to another embodiment of the present invention, each FGS enhancement layer is not made into its own picture and therefore is not assigned a unique DependencyId value. However, the QualityLevel value that is associated with each FGS enhancement layer is used to identify whether a subsequent enhancement layer is encoded relative to the base layer with or without FGS and, if with FGS, with which FGS layers. This can be accomplished by including a new syntax element in the bitstream, e.g., in the slice header, to indicate the QualityLevel value with which the associated FGS slice is referenced in the encoding of a subsequent enhancement layer. In this case, the base_id_plus1 in the slice header is still used to indicate the DependencyId value of the quality base layer that is referenced by both the first FGS layer and a subsequent enhancement layer.
According to another embodiment of the present invention, each FGS enhancement layer is made into its own picture and is assigned a unique DependencyId value. The DependencyId value associated with each FGS enhancement layer is used to identify whether a subsequent enhancement layer is encoded relative to the base layer with or without FGS and, if with FGS, with which FGS layers. This can be accomplished by including a new syntax element in the bitstream, e.g., in the slice header, to indicate the DependencyId value with which the associated FGS slice is referenced in the encoding of a subsequent enhancement layer. In this case, the base_id_plus1 in the slice header is still used to indicate the DependencyId value of the quality base layer that is referenced by both the first FGS layer and a subsequent enhancement layer.
These and other objects, advantages and features of the invention, together with the organization and manner of operation thereof, will become apparent from the following detailed description when taken in conjunction with the accompanying drawings, wherein like elements have like numerals throughout the several drawings described below.
For exemplification, the system 10 shown in
The exemplary communication devices of the system 10 may include, but are not limited to, a mobile telephone 12, a combination PDA and mobile telephone 14, a PDA 16, an integrated messaging device (IMD) 18, a desktop computer 20, and a notebook computer 22. The communication devices may be stationary or mobile as when carried by an individual who is moving. The communication devices may also be located in a mode of transportation including, but not limited to, an automobile, a truck, a taxi, a bus, a boat, an airplane, a bicycle, a motorcycle, etc. Some or all of the communication devices may send and receive calls and messages and communicate with service providers through a wireless connection 25 to a base station 24. The base station 24 may be connected to a network server 26 that allows communication between the mobile telephone network 11 and the Internet 28. The system 10 may include additional communication devices and communication devices of different types.
The communication devices may communicate using various transmission technologies including, but not limited to, Code Division Multiple Access (CDMA), Global System for Mobile Communications (GSM), Universal Mobile Telecommunications System (UMTS), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Transmission Control Protocol/Internet Protocol (TCP/IP), Short Messaging Service (SMS), Multimedia Messaging Service (MMS), e-mail, Instant Messaging Service (IMS), Bluetooth, IEEE 802.11, etc. A communication device may communicate using various media including, but not limited to, radio, infrared, laser, cable connection, and the like.
One embodiment of the present invention involves the removal of the QualityLevel information from the decodability_dependency_information. Instead, the present invention assigns a distinct DependencyId value to each FGS enhancement layer. Therefore, whenever an enhancement layer specifies the DependencyId value of the base layer on which it depends, either a base-quality layer or any FGS enhancement to that base-quality layer can be specified, as each has a unique value of DependencyId.
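The assignment of distinct DependencyId values can be illustrated with a minimal sketch. It assumes that layers were previously identified by a (DependencyId, QualityLevel) pair and that new values are handed out in sorted order; the function name assign_dependency_ids and the ordering rule are hypothetical and are not prescribed by this description.

```python
from typing import Dict, List, Tuple

# (old DependencyId, QualityLevel) pair that identified a layer before renumbering
LayerKey = Tuple[int, int]


def assign_dependency_ids(layers: List[LayerKey]) -> Dict[LayerKey, int]:
    """Give every quality base layer and every FGS layer its own DependencyId.

    Assumption: new values are assigned in ascending order of
    (old DependencyId, QualityLevel). Any rule that yields unique values
    would serve the same purpose.
    """
    ordered = sorted(set(layers))
    return {key: new_id for new_id, key in enumerate(ordered)}


# One base layer with a single FGS layer, plus one spatial enhancement layer.
mapping = assign_dependency_ids([(0, 0), (0, 1), (1, 0)])
```

After renumbering, QualityLevel is no longer needed to tell the layers apart, because each entry in the mapping carries a unique DependencyId.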
One embodiment of the invention for decoding scalable video data is discussed below and is depicted in
The following is a basic example showing how the embodiment of the present invention discussed above is implemented. A QCIF 48 kbps layer, which is the base quality layer, can have a DependencyID of 0, while having no BaseDependencyID (a base dependency identifier, which is used to indicate the corresponding base layer), because it is not encoded relative to another layer. A QCIF 64 kbps layer (i.e., a 16 kbps FGS layer) can have a DependencyID of 1 and a BaseDependencyID of 0, meaning that it is encoded relative to the QCIF 48 kbps layer. A CIF 84 kbps layer (a spatial enhancement layer) can have a DependencyID of 2 and a BaseDependencyID of 0, meaning that it is also encoded relative to the QCIF 48 kbps layer. On the other hand, the CIF 84 kbps layer could alternatively have a BaseDependencyID of 1, in which case it would be encoded relative to the QCIF 64 kbps layer. Because the FGS enhancement layer has a different DependencyID than the base quality layer, subsequent enhancement layers are able to be encoded relative to either the base layer or an FGS enhancement layer.
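The example above can be sketched in code as follows. This is only an illustration: the Layer record, the decoding_chain function, and the build_example helper are hypothetical names, and a missing BaseDependencyID is modeled as None.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Layer:
    name: str
    dependency_id: int
    base_dependency_id: Optional[int]  # None: not encoded relative to another layer


def decoding_chain(layers: Dict[int, Layer], target: int) -> List[int]:
    """Return the DependencyID values needed to decode `target`, base first."""
    chain: List[int] = []
    current: Optional[int] = target
    while current is not None:
        if current in chain:
            raise ValueError("cyclic layer dependency")
        chain.append(current)
        current = layers[current].base_dependency_id
    return chain[::-1]


def build_example(cif_base: int) -> Dict[int, Layer]:
    """Layer table from the example; `cif_base` is the CIF layer's BaseDependencyID."""
    table = [
        Layer("QCIF 48 kbps (base quality)", 0, None),
        Layer("QCIF 64 kbps (FGS)", 1, 0),
        Layer("CIF 84 kbps (spatial)", 2, cif_base),
    ]
    return {layer.dependency_id: layer for layer in table}


without_fgs = decoding_chain(build_example(cif_base=0), target=2)  # skips the FGS layer
with_fgs = decoding_chain(build_example(cif_base=1), target=2)     # includes the FGS layer
```

With a BaseDependencyID of 0, the CIF layer needs only layers 0 and 2, so the 16 kbps of FGS data need not be transmitted. With a BaseDependencyID of 1, all three layers are required.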
Another embodiment of the present invention involves the use of the QualityLevel value from the decodability_dependency_information in order to identify whether a subsequent enhancement layer is encoded relative to the base layer with or without FGS and, if with FGS, with which FGS layers. This can be accomplished by including a new syntax element in the bitstream, e.g., in the slice header, to indicate the QualityLevel value with which the associated FGS slice is referenced in the encoding of a subsequent enhancement layer. In this case, the base_id_plus1 in the slice header is still used to indicate the DependencyId value of the quality base layer that is referenced by both the first FGS layer and a subsequent enhancement layer.
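A minimal sketch of how a decoder might interpret such signaling is given below. Several points are assumptions made for illustration only: the new syntax element is given the hypothetical name base_quality_level, a QualityLevel of 0 is taken to denote the quality base layer without FGS, FGS layers are taken to be cumulative (level n requires levels 0 through n-1), and base_id_plus1 is taken to carry the DependencyId plus one, with 0 meaning that no base layer is referenced.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SliceHeader:
    base_id_plus1: int       # named in the description; 0 assumed to mean "no base layer"
    base_quality_level: int  # hypothetical name for the new syntax element


def referenced_slices(header: SliceHeader) -> List[Tuple[int, int]]:
    """List the (DependencyId, QualityLevel) pairs that an enhancement slice references.

    Assumes cumulative FGS layers, so referencing QualityLevel n implies
    that levels 0..n of the same DependencyId are all needed.
    """
    if header.base_id_plus1 == 0:
        return []
    dependency_id = header.base_id_plus1 - 1
    return [(dependency_id, q) for q in range(header.base_quality_level + 1)]


base_only = referenced_slices(SliceHeader(base_id_plus1=1, base_quality_level=0))
with_two_fgs = referenced_slices(SliceHeader(base_id_plus1=1, base_quality_level=2))
```

In this sketch the DependencyId is shared by the quality base layer and its FGS layers, so only the additional QualityLevel value distinguishes the two cases.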
Yet another embodiment of the present invention involves the removal of the QualityLevel information from the decodability_dependency_information. Instead, the present invention assigns a distinct DependencyId value to each FGS enhancement layer. Furthermore, the DependencyId value associated with each FGS enhancement layer is used to identify whether a subsequent enhancement layer is encoded relative to the base layer with or without FGS and, if with FGS, with which FGS layers. This can be accomplished by including a new syntax element in the bitstream, e.g., in the slice header, to indicate the DependencyId value with which the associated FGS slice is referenced in the encoding of a subsequent enhancement layer. In this case, the base_id_plus1 in the slice header is still used to indicate the DependencyId value of the quality base layer that is referenced by both the first FGS layer and a subsequent enhancement layer.
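The following sketch shows one way the two slice header fields of this embodiment could be combined. The field name fgs_ref_dependency_id is hypothetical, as is the convention that its absence (None) means the enhancement layer is encoded relative to the base layer without FGS; base_id_plus1 is assumed to carry the DependencyId plus one, with 0 meaning that no base layer is referenced.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SliceHeader:
    base_id_plus1: int                    # DependencyId of the quality base layer, plus one
    fgs_ref_dependency_id: Optional[int]  # hypothetical new element; None = without FGS


def reference_layer(header: SliceHeader) -> Optional[int]:
    """Return the DependencyId of the layer that prediction is made from."""
    if header.base_id_plus1 == 0:
        return None  # assumed: no inter-layer reference at all
    if header.fgs_ref_dependency_id is not None:
        return header.fgs_ref_dependency_id  # encoded relative to an FGS layer
    return header.base_id_plus1 - 1  # encoded relative to the quality base layer


plain = reference_layer(SliceHeader(base_id_plus1=1, fgs_ref_dependency_id=None))
via_fgs = reference_layer(SliceHeader(base_id_plus1=1, fgs_ref_dependency_id=1))
```

In both calls base_id_plus1 still identifies the quality base layer (DependencyId 0); only the additional element selects whether an FGS layer is used as the reference.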
The present invention can be implemented directly in software using any common programming language, such as C/C++, or an assembly language. The present invention can also be implemented in hardware and used in a wide variety of consumer devices.
The present invention is described in the general context of method steps, which may be implemented in one embodiment by a program product including computer-executable instructions, such as program code, executed by computers in networked environments. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of program code for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps.
Software and web implementations of the present invention could be accomplished with standard programming techniques with rule-based logic and other logic to accomplish the various database searching steps, correlation steps, comparison steps and decision steps. It should also be noted that the words “component” and “module,” as used herein and in the claims, are intended to encompass implementations using one or more lines of software code, and/or hardware implementations, and/or equipment for receiving manual inputs.
The foregoing description of embodiments of the present invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the present invention to the precise form disclosed, and modifications and variations are possible in light of the above teachings or may be acquired from practice of the present invention. The embodiments were chosen and described in order to explain the principles of the present invention and its practical application to enable one skilled in the art to utilize the present invention in various embodiments and with various modifications as are suited to the particular use contemplated.
The present application is a continuation-in-part of U.S. patent application Ser. No. 11/105,312, entitled “FGS Identification in Scalable Video Coding” and filed Apr. 13, 2005. This application is also related to U.S. patent application Ser. No. 60/676,269, entitled “FGS Identification in Scalable Video Coding” and filed on Apr. 29, 2005.
Number | Date | Country
---|---|---
60676269 | Apr 2005 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 11105312 | Apr 2005 | US
Child | 11402410 | Apr 2006 | US