The present invention relates to scalable video coding (SVC) employing a fine grain SNR scalability (FGS) motion refinement technique and an adaptive reference (AR) FGS technique.
In scalable video coding (SVC), FGS is an important feature for finely controlling video quality in the SNR dimension. When an FGS layer is removed, picture quality degradation can propagate to subsequent pictures due to the inter-frame prediction structure of an SVC video signal.
The propagation of picture quality degradation can be controlled by an adaptive reference (AR) FGS technique for improving coding efficiency. Furthermore, an FGS motion refinement technique for setting a motion vector in each FGS layer can be used to improve coding efficiency of FGS layers. However, when the AR FGS and FGS motion refinement techniques are used together, the AR FGS technique does not work properly because, under the FGS motion refinement technique, a residual signal of an FGS layer block may not be predicted from the corresponding block of a base layer (i.e. a base quality layer or a lower FGS layer).
This work was supported by the IT R&D program of MIC (Ministry of Information and Communication)/IITA (Institute for Information Technology Advancement) [2005-S-103-02, “Development of Ubiquitous Content Access Technology for Convergence of Broadcasting and Communications”].
The FGS motion refinement technique in SVC can be used to improve coding efficiency of FGS layers. The FGS motion refinement technique allows the FGS layers to have motion information and a block mode different from that of a base quality layer.
In this case, a residual signal of an FGS layer block may not be predicted from the co-located block in its base layer and a residual signal of a base quality layer is not suitable to control adaptability of AR-FGS. Furthermore, current AR-FGS considers only the property of the residual signal of the base quality layer, and thus a problem may be generated when the AR-FGS technique and the FGS motion refinement technique are simultaneously used.
Accordingly, the present invention provides alternatives for solving problems that may occur when the AR-FGS and FGS motion refinement techniques are simultaneously applied, thereby improving adaptability of AR-FGS.
According to an aspect of the present invention, there are provided alternatives capable of improving coding efficiency when the AR-FGS and FGS motion refinement techniques are simultaneously applied to SVC.
When a residual signal of a block in an FGS layer is not predicted, a prediction signal of the block in the FGS layer is predicted in the same manner as predicting a prediction signal of a base quality layer.
A scaling factor can have a non-zero value if required, and a residual signal of an FGS block for which residual signal prediction is not performed is used to determine a scaling factor of a higher FGS layer.
When interlayer residual signal prediction is always activated, an adaptation process is determined based on the residual signal of the base quality layer.
The AR-FGS and FGS motion refinement techniques are not simultaneously used for key pictures.
According to an aspect of the present invention, there is provided an SVC encoder using improved AR-FGS and FGS motion refinement techniques, comprising: a prediction signal determination unit determining a prediction signal of a current FGS layer block according to a scaling factor of a current FGS layer when interlayer prediction is not performed between a base quality layer or a lower FGS layer and the current FGS layer; and a scaling factor determination unit determining a scaling factor used to predict a higher FGS layer block corresponding to the current FGS layer block based on a residual signal of the current FGS layer block.
According to another aspect of the present invention, there is provided an SVC encoder using improved AR-FGS and FGS motion refinement techniques, comprising: an interlayer prediction setting unit setting that interlayer prediction is inevitably performed between a base layer (i.e. a base quality layer or a lower FGS layer) and each FGS layer; and a scaling factor determination unit determining a scaling factor of a higher FGS layer based on a residual signal of the base layer.
According to another aspect of the present invention, there is provided an SVC encoder using improved AR-FGS and FGS motion refinement techniques, comprising an FGS-MR inactivation unit preventing the FGS motion refinement technique from being applied to a key picture when a picture in a GOP of an input bit stream corresponds to the key picture.
According to another aspect of the present invention, there is provided an SVC encoder using improved AR-FGS and FGS motion refinement techniques, comprising a selective FGS-MR inactivation unit preventing the FGS motion refinement technique from being applied to a key picture only when the AR-FGS technique is applied to the key picture in the case where a picture in a GOP of an input bit stream corresponds to the key picture.
According to another aspect of the present invention, there is provided an SVC encoder using improved AR-FGS and FGS motion refinement techniques, comprising an FGS-MR inactivation unit preventing the FGS motion refinement technique from being applied to a key picture and an AR-FGS inactivation unit blocking the AR-FGS technique from being applied to the key picture when a picture in a GOP of an input bit stream corresponds to the key picture.
According to another aspect of the present invention, there is provided an SVC decoder using improved AR-FGS and FGS motion refinement techniques, wherein a prediction signal of a current FGS layer block is decoded according to a scaling factor of a current FGS layer when the FGS motion refinement technique is applied to the current FGS layer and interlayer prediction is not performed between the current FGS layer and a base quality layer or a lower FGS layer in an operation of decoding each FGS layer, and the scaling factor is determined by an SVC decoder based on a residual signal of the current FGS layer block.
According to another aspect of the present invention, there is provided an SVC decoder using improved AR-FGS and FGS motion refinement techniques, wherein the SVC decoder inevitably determines a scaling factor of a higher FGS layer based on a residual signal of a base layer when a received bitstream is configured such that interlayer prediction is inevitably performed between the base layer and each FGS layer.
According to another aspect of the present invention, there is provided an SVC decoder using improved AR-FGS and FGS motion refinement techniques, wherein the SVC decoder does not check a flag that represents that the FGS motion refinement technique is applied to a key picture of a GOP in a received bit stream.
According to another aspect of the present invention, there is provided an SVC decoder using an improved AR-FGS technique, wherein the SVC decoder determines a scaling factor used to predict a higher FGS layer block corresponding to a current FGS layer block based on a residual signal of a current FGS layer block when receiving a bit stream including an interlayer prediction setting signal that represents that interlayer prediction is set to be performed between a base layer and each FGS layer.
According to another aspect of the present invention, there is provided an SVC decoding method using improved AR-FGS and FGS motion refinement techniques, the SVC decoding method comprising: determining whether a current frame corresponds to a key picture; and determining whether the AR-FGS technique is applied when the current frame corresponds to the key picture and determining whether the FGS motion refinement technique is applied when the current frame does not correspond to the key picture.
The present invention improves coding efficiency when AR-FGS and FGS motion refinement are simultaneously applied to SVC. Furthermore, the present invention can solve problems generated when the AR-FGS and the FGS motion refinement are simultaneously used because adaptation of current AR-FGS considers only the property of the residual signal of the base layer.
The present invention will now be described more fully with reference to the accompanying drawings, in which exemplary embodiments of the invention are shown. The invention may, however, be embodied in many different forms and should not be construed as being limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the invention to those skilled in the art. Throughout the drawings, like reference numerals refer to like elements.
SVC is an important technique for video communication in a heterogeneous environment. Under the constraints of a terminal or network, SVC allows the original video bitstream to be truncated to provide output bitstreams corresponding to different presentations of the original content. The scalability of SVC video is supported in three dimensions, namely spatial, temporal, and SNR.
In SVC, FGS can finely control video quality. For each spatial resolution, a base quality layer is first encoded by a method similar to H.264/AVC. Then, up to three FGS layers can be added to the base quality layer in order to enhance the SNR quality of the corresponding base quality layer. These FGS layers can be extracted from an arbitrary point in order to meet a bit rate condition.
Video quality (SNR) degradation can be propagated to following pictures because of the influence of a removed FGS layer and an inter-frame prediction structure. This propagation is referred to as a drift error in SVC. To avoid the drift error, inter-frame prediction of a key picture can be obtained using only information of the base quality layer of a previous frame. However, this solution results in low coding efficiency as the best inter-frame prediction is not used.
To provide a flexible tradeoff between coding efficiency and error robustness, the AR-FGS technique has been suggested. The AR-FGS technique adaptively controls the portion of FGS information that is used to compose the inter-frame prediction, based on the characteristics of the base quality layer.
Furthermore, the FGS motion refinement technique also increases coding efficiency of FGS layers. The FGS motion refinement technique allows each FGS layer to have its own motion vector and a block mode different from those of the base quality layer.
However, because a residual signal of an FGS layer block may not be predicted from the co-located block in its base layer, it can be difficult to control the adaptability of AR-FGS, which considers only the property of a residual signal of the base quality layer.
Problems generated when the FGS motion refinement technique and the AR-FGS technique are simultaneously used in conventional SVC are described with reference to
A reconstructed signal of a block 101 includes a prediction signal 102 and a residual signal 103. The prediction signal 102 corresponds to the sum of a prediction signal that is motion-compensated from a reconstructed signal 104 of a previous picture block of the base quality layer and a predicted signal that is motion-compensated from a difference between a reconstructed signal 105 of the first FGS layer 110 and the reconstructed signal 104 of the previous picture.
The predicted signal that is motion-compensated from the difference between the reconstructed signal 105 of the first FGS layer 110 and the reconstructed signal 104 of the previous picture is multiplied by a first scaling factor S1 in an adaptive scaling unit 106. When the first scaling factor S1 is 0, the prediction signal 102 is obtained only from the base quality layer and video quality degradation does not occur even when FGS information is extracted from the block 105 of the first FGS layer 110. When the first scaling factor S1 is not zero, however, the prediction signal 102 will have better video quality if FGS information is not extracted from the block 105 of the first FGS layer 110. Two cases in which the first scaling factor S1 is controlled are explained below.
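The composition of the prediction signal 102 described above can be illustrated with a minimal Python sketch. It assumes that both inputs have already been motion-compensated and are given as flat lists of sample values; the function name `ar_fgs_prediction` and the sample values are hypothetical and are not taken from the SVC standardization document.

```python
def ar_fgs_prediction(base_ref, fgs_ref, s1):
    """Combine a base-layer reference with a scaled FGS difference signal.

    base_ref: motion-compensated samples from the base quality layer (assumed input)
    fgs_ref:  motion-compensated samples from the first FGS layer (assumed input)
    s1:       first scaling factor; 0 means the FGS layer is ignored
    """
    return [b + s1 * (f - b) for b, f in zip(base_ref, fgs_ref)]


base_ref = [100, 102, 98, 101]   # hypothetical sample values
fgs_ref = [104, 100, 99, 105]    # hypothetical sample values

base_only = ar_fgs_prediction(base_ref, fgs_ref, 0)
blended = ar_fgs_prediction(base_ref, fgs_ref, 0.5)
print(base_only)
print(blended)
```

With `s1` equal to 0 the result equals the base-layer reference, so removing the FGS layer cannot change the prediction; with a non-zero `s1` the prediction depends on FGS information.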
Firstly, inter-layer prediction from the base quality layer (or a lower FGS layer) to a higher FGS layer is considered. When the switch K11 (111) is connected to the first FGS layer 110 and thus prediction occurs from the base layer 100 to a higher FGS layer, the first scaling factor S1 is determined based on the coefficient of a residual signal 107 of the base quality layer 100. When the coefficient of the residual signal 107 is not 0 (when a switch K21 (121) is switched to a node 1), a corresponding coefficient of the prediction signal 102 is obtained by setting the first scaling factor S1 to 0. When the coefficient of the residual signal 107 is 0, the corresponding coefficient of the prediction signal 102 is determined by setting the first scaling factor S1 to a non-zero value. The non-zero value of the first scaling factor S1 depends on contents and application programs.
When all the coefficients of the residual signal 107 are 0, scaling occurs in a spatial domain. When any coefficient of the residual signal 107 is non-zero, scaling is performed in a transform domain. That is, a differential signal is converted from the spatial domain to the transform domain and then scaled.
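The first case can be sketched as follows. This is only an illustration of the rule stated above: the helper names and the value passed as `nonzero_scale` are assumptions, since the text states only that the non-zero value depends on contents and application programs.

```python
def coefficient_scaling_factors(residual_coeffs, nonzero_scale):
    """Return one scaling factor per coefficient position.

    A non-zero residual coefficient forces the factor to 0; a zero
    coefficient allows the (assumed) non-zero factor to be used.
    """
    return [0.0 if c != 0 else nonzero_scale for c in residual_coeffs]


def scaling_domain(residual_coeffs):
    """Scale in the spatial domain only when every coefficient is zero."""
    return "spatial" if all(c == 0 for c in residual_coeffs) else "transform"


coeffs = [3, 0, 0, -1]           # hypothetical residual coefficients
factors = coefficient_scaling_factors(coeffs, 0.5)
domain = scaling_domain(coeffs)
print(factors, domain)
```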
The second case, in which there is no inter-layer prediction from the base quality layer (or a lower FGS layer) to a higher FGS layer, is now considered. For example, when the switch K11 (111) is not connected to the first FGS layer 110, and thus prediction from the base quality layer 100 to a higher FGS layer is not carried out, the first scaling factor S1 is determined by the same method as in the first case except that the switch K21 (121) is switched to a node 2 and the first scaling factor S1 is set based on coefficients of the residual signal 103 of the first FGS layer 110. Accordingly, a problem arises in that the residual signal 103 of the current FGS layer itself is used to determine the scaling factor S1.
Alternatives proposed in the present invention in order to solve problems generated when the AR-FGS technique and the FGS motion refinement technique are used together will now be described with reference to
When a connection switch K1i is opened and thus an interlayer residual signal prediction is not performed, a scaling factor Si of an ith FGS layer is set to 0 such that a prediction signal of a related block in the ith FGS layer corresponds to a prediction signal of a base quality layer 200. For example, a prediction signal of a related block 202 in the first FGS layer 210 becomes identical to the prediction signal of the base quality layer.
When the switch K1i is opened and a switch K1i+1 is closed, a scaling factor S(i+1) of an (i+1)th FGS layer is determined based on a residual signal of the ith FGS layer. Additionally, when the switch K1i is opened, the switch K1i+1 is closed and a switch K1i+2 is closed, the scaling factor S(i+1) of the (i+1)th FGS layer and a scaling factor S(i+2) of an (i+2)th FGS layer are determined based on the residual signal of the ith FGS layer. For example, when i is 1, the scaling factors S(i+1) and S(i+2) are determined based on a residual signal 203.
1) The scaling factor S1 is set to 0 when residual signal prediction is inactivated between the base quality layer 200 and the first FGS layer 210. Then, the residual signal 203 of the first FGS layer 210 is obtained in the same manner as the residual signal 207 of the base quality layer 200.
For example, the prediction signal 202 of the first FGS layer 210 is identical to the prediction signal of the base quality layer 200 and the residual signal 203 of the first FGS layer 210 is encoded irrespective of the residual signal 207 of the base quality layer 200. In this case, the residual signal 203 is not encoded by using prediction but is encoded by using a quantization parameter different from the quantization parameter used to encode the residual signal 207. The residual signal 203 can be used to determine a residual signal of the second FGS layer 220 and a scaling factor S2.
2) When the switch K11 (211) is opened and the switch K12 (212) is closed, a switch K22 (222) is switched to a node 2 and the residual signal 203 of the first FGS layer 210 is used to determine the scaling factor S2 of the second FGS layer 220.
3) When the switches K12 (212) and K13 (213) are closed, a switch K23 (223) is switched to a node 1 and the scaling factors S2 and S3 are determined based on the residual signal 203.
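One possible reading of the rules above is sketched below in Python. The function and its list-based interface are hypothetical; layer 0 stands for the base quality layer, and `None` marks a layer whose scaling factor is forced to 0 because its connection switch is opened.

```python
def residual_source_layers(k1_closed):
    """Pick the layer whose residual signal drives each scaling factor Si.

    k1_closed[i - 1] is True when connection switch K1i is closed.
    Returns one entry per FGS layer: the index of the source layer
    (0 = base quality layer), or None when Si is set to 0.
    """
    sources = []
    anchor = 0  # most recent layer whose residual was coded without prediction
    for i, closed in enumerate(k1_closed, start=1):
        if closed:
            sources.append(anchor)
        else:
            sources.append(None)  # Si = 0: prediction equals the base prediction
            anchor = i            # higher layers now refer to this layer's residual
    return sources


# K11 opened, K12 and K13 closed: S2 and S3 both follow the first FGS layer.
example = residual_source_layers([False, True, True])
print(example)
```

In this sketch the entry 1 corresponds to the residual signal 203 of the first FGS layer 210, and the entry 0 corresponds to the residual signal 207 of the base quality layer 200.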
In the first alternative, the scaling factor Si of the ith FGS layer is set to 0 such that a prediction signal of a related block of the ith FGS layer becomes identical to the prediction signal of the base layer when the connection switch K1i is opened and interlayer residual prediction is not performed. The second alternative differs from the first alternative in whether the scaling factor Si must be set to 0.
That is, in the second alternative, the scaling factor Si is set to a non-zero value if required even though there is no interlayer residual prediction because the connection switch K1i is opened, and a residual signal of an FGS block for which interlayer prediction is not performed is used to determine a scaling factor of a higher FGS layer.
For example, when the switch K11 (211) is opened and the switch K12 (212) is closed, the switch K22 (222) is switched to the node 2 and the residual signal 203 of the first FGS layer 210 is used to determine the scaling factor S2 of the second FGS layer 220.
The problem described above with reference to
In
A variable sigBCoeff represents a value corresponding to a residual signal and is used to determine a scaling factor. In the current standardization document, sigBCoeff of the ith FGS layer determines the scaling factor of the ith FGS layer when motion_refinement_flag is 1 and residual_prediction_flag is 0. That is, the residual signal 103 determines the scaling factor S1. However, in the present invention, a variable sigBCoeffTem is generated and the standardization document is modified such that sigBCoeff has the residual signal value of the (i-1)th FGS layer in order to solve problems of the current standardization document.
The third alternative is proposed for AR-FGS when interlayer prediction is always performed between the base layer and each FGS layer. In the third alternative, the switches K11 (211), K12 (212) and K13 (213) are always set to be closed. That is, in the third alternative, interlayer prediction is always activated such that interlayer residual signal prediction is carried out. Accordingly, all the switches K21 (221), K22 (222) and K23 (223) are switched to the node 1, and thus the residual signal 207 of the base quality layer is always used to determine the scaling factor Si.
Parts of the standardization document, which are modified according to the third alternative, are shaded in
The AR-FGS technique is applied only to a key picture in a group of pictures (GOP) in SVC. Thus, in the fourth alternative, the FGS motion refinement is not applied to the key picture, which solves the problem generated when the AR-FGS and FGS motion refinement are simultaneously applied. Accordingly, there is no need to modify the existing AR-FGS technique in order to accommodate the FGS motion refinement technique.
Although motion_refinement_flag that represents whether the motion refinement technique is used is checked for all pictures in the conventional standardization document, in the present invention, motion_refinement_flag is checked only for a picture that is not a key picture.
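The modified flag check can be sketched as follows. This is not the syntax of the standardization document; `read_flag` is a hypothetical callable standing in for a bitstream reader, and the function name is an assumption made for illustration.

```python
def parse_motion_refinement(is_key_picture, read_flag):
    """Return whether FGS motion refinement is used for the current picture.

    For a key picture the flag is not read at all and motion refinement
    is treated as disabled; otherwise the flag is read from the bitstream.
    """
    if is_key_picture:
        return False
    return bool(read_flag())


reads = []

def fake_read_flag():
    """Hypothetical reader that records each access and returns 1."""
    reads.append(1)
    return 1


key_result = parse_motion_refinement(True, fake_read_flag)
reads_after_key = len(reads)
non_key_result = parse_motion_refinement(False, fake_read_flag)
print(key_result, non_key_result, len(reads))
```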
In the fifth alternative, the FGS motion refinement technique is not applied when the AR-FGS technique is used for a key picture, and is applied when the AR-FGS technique is not used for the key picture. The fifth alternative is distinguished from the fourth alternative in that the FGS motion refinement technique is disabled only for key pictures that use the AR-FGS technique, rather than for all key pictures.
In the sixth alternative, neither the AR-FGS technique nor the FGS motion refinement technique is applied to a key picture in SVC. In this case, bit stream complexity is decreased and video quality degradation propagation is reduced, although encoding efficiency of encoded video signals is not high.
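The difference between the fourth, fifth, and sixth alternatives for a key picture can be summarized in a short Python sketch. The function name and its boolean inputs are hypothetical; they only model which tools an encoder would keep enabled under each alternative.

```python
def key_picture_tools(alternative, ar_fgs_requested, fgs_mr_requested):
    """Return (use_ar_fgs, use_fgs_mr) for a key picture.

    alternative: 4, 5, or 6, as numbered in the description.
    The *_requested inputs model what the encoder would otherwise enable.
    """
    if alternative == 4:
        # Motion refinement is never applied to a key picture.
        return ar_fgs_requested, False
    if alternative == 5:
        # Motion refinement is dropped only when AR-FGS is in use.
        return ar_fgs_requested, fgs_mr_requested and not ar_fgs_requested
    if alternative == 6:
        # Neither tool is applied to a key picture.
        return False, False
    raise ValueError("alternative must be 4, 5, or 6")


for alt in (4, 5, 6):
    print(alt, key_picture_tools(alt, True, True))
```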
In addition to the first through sixth alternatives, the present invention proposes an improved AR-FGS application method that determines the scaling factor Si of the ith FGS layer by using the residual signal of the (i−1)th FGS layer when interlayer prediction is used for a residual signal in AR-FGS (that is, when the FGS motion refinement technique is not used, or when interlayer prediction is used for a residual signal although the FGS motion refinement technique is used). This method is based on the fact that the residual signal of the (i−1)th FGS layer is more similar to the residual signal of the ith FGS layer than the residual signal of the base quality layer is.
Specifically, when the switch K12 (212) is closed in
The improved AR-FGS application method can be combined with the third, fourth, and fifth alternatives. For example, when both the switches K11 (211) and K12 (212) are closed, the scaling factor S1 is determined by the residual signal 207 and the scaling factor S2 is determined by the residual signal 203.
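A minimal sketch of the improved method, under the assumption that layer 0 denotes the base quality layer and that a layer without interlayer residual prediction is handled elsewhere (marked `None` here); the function name is hypothetical.

```python
def improved_residual_sources(k1_closed):
    """For each FGS layer i with K1i closed, use the residual of layer i - 1.

    k1_closed[i - 1] is True when connection switch K1i is closed.
    Layers whose switch is opened are outside this method and yield None.
    """
    return [i - 1 if closed else None
            for i, closed in enumerate(k1_closed, start=1)]


# K11 and K12 closed: S1 follows layer 0 (residual 207), S2 follows layer 1 (residual 203).
both_closed = improved_residual_sources([True, True])
print(both_closed)
```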
The prediction signal determination unit 1010 determines a prediction signal of a current FGS layer block according to a scaling factor of a current FGS layer when interlayer prediction is not performed between the base quality layer or a lower FGS layer and the current FGS layer. The prediction signal of the current FGS layer block is determined according to the above-described first alternative when the scaling factor of the current FGS layer is 0, and the prediction signal of the current FGS layer block is determined according to the above-described second alternative when the scaling factor of the current FGS layer is not 0.
The scaling factor determination unit 1020 determines a scaling factor used to predict a higher FGS layer block corresponding to the current FGS layer block based on the residual signal of the current FGS layer block. In this case, interlayer prediction is set to be performed between the current FGS layer and the higher FGS layer. The detailed operation of the scaling factor determination unit 1020 relates to the first and second alternatives.
The interlayer prediction setting unit 1210 sets interlayer prediction to be always performed between the base layer and each FGS layer. The scaling factor determination unit 1220 always determines a scaling factor of a higher FGS layer based on the residual signal of the base layer. The operation of the SVC encoder illustrated in
The fourth alternative corresponds to an SVC encoder including only the FGS-MR inactivation unit 1310, and the sixth alternative corresponds to an SVC encoder including both the FGS-MR inactivation unit 1310 and the AR-FGS inactivation unit 1320. The fifth alternative corresponds to an SVC encoder that selectively uses the FGS-MR inactivation unit 1310 only when the AR-FGS technique is applied to a key picture.
As described above, the SVC encoders illustrated in
The present invention can also be embodied as computer readable codes on a computer readable recording medium. The computer readable recording medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of the computer readable recording medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, optical data storage devices, and carrier waves (such as data transmission through the Internet). The computer readable recording medium can also be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2007-0002653 | Jan 2007 | KR | national |
10-2007-0104240 | Oct 2007 | KR | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/KR2007/005065 | 10/16/2007 | WO | 00 | 4/10/2009 |