This invention relates to a technique for reducing artifacts in connection with decoding of a coded video stream.
The decoding of a video stream compressed at a low bit rate tends to cause visible artifacts noticeable to a viewer. Blockiness and structured noise patterns are common artifacts that arise when using block-based compression techniques. The human visual system has a greater sensitivity to certain types of artifacts, and thus, such artifacts appear more noticeable and objectionable than others. The addition of random noise to the decoded stream can reduce the noticeability of such compression artifacts, but the large frame-to-frame differences created by adding random noise can themselves produce artifacts that appear noticeable and objectionable.
The addition of a dither signal can reduce human sensitivity to image artifacts, for example to hide contouring and blocking artifacts. One prior art technique has proposed adding a random noise dither that is based on film grain to an image to disguise block effects. The rationale for adding such random noise is that random error is more forgiving than structured, or correlated, error. Other prior art techniques have proposed adding a dither signal to a video stream to hide compression artifacts. One past technique has proposed adding a random noise dither in the video encoding and decoding process, in the in-loop deblocking filter, for the ITU/ISO H.264 video coding standard, commonly known as the JVT coding standard. The amount of dither to be added depends on the position of a pixel with respect to a block edge. Another prior technique has proposed adding random noise subsequent to video decoding (i.e., adding noise as a “post process”), for use as comfort noise. The amount of noise added depends on the quantization parameter and on the amount of noise added to spatially neighboring pixels. The term “comfort noise” comes from the use of noise in audio compression, where it denotes a noise pattern generated at the receiver end to avoid total silence that is uncomfortable to a listener.
Past techniques for reducing artifacts by adding noise typically reduce spatial artifacts at the risk of creating temporal abnormalities, i.e., large frame-to-frame differences. Thus, there exists a need for a technique for reducing artifacts during decoding of a coded video stream that overcomes the aforementioned disadvantages.
Briefly, in accordance with a preferred embodiment of the present principles, there is provided a method for reducing artifacts in a video stream during decoding. The method commences by decoding the video stream. Following decoding, noise is added to the video stream by adding noise to each pixel in an amount correlated to the additive noise of pixels in a prior picture. Thus, in accordance with the present principles, temporal noise correlation aids in determining the additive noise to reduce large frame-to-frame differences, a disadvantage of prior noise additive techniques.
In accordance with another embodiment of the present principles, the prior picture from which the additive noise is derived comprises a displayed picture (i.e., a previously decoded picture to which noise has been added). In yet another embodiment, the prior picture comprises a previously decoded picture.
In accordance with the present principles, adding a dither signal containing random noise to an already decoded signal, in an amount correlated to the additive noise of pixels in a prior picture, serves to improve the subjective video quality.
A summing block 18 sums each decoded picture from the decoder 12 with noise from a noise generator 16. A clipper 20 then clips the resultant signal output by the summing block 18 to yield a decoded picture for display which exhibits reduced artifacts. Note that noise addition occurs after storage of decoded pictures in the reference picture store 14 since the reference pictures must remain unchanged in order to properly decode the subsequent incoming pictures.
The magnitude of the noise signal from the noise generator 16 added to each decoded picture typically depends on several different factors. To better understand the factors associated with noise addition, let N(k, x, y) represent the added noise signal, P(k, x, y) the decoded pixel, and D(k, x, y) the displayed pixel at position (x, y) of the kth picture in the video sequence. Each displayed pixel of the kth picture becomes the sum of the decoded pixel and the noise signal, clipped to the range 0 to 255, as given by the relationship
D(k, x, y)=Clip(0, 255, P(k, x, y)+N(k, x, y)) (Equation 1)
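Equation 1 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the arrangement described in the specification: pictures are assumed to be equally sized 2-D lists of integers, and the names `clip` and `display_picture` are hypothetical helpers introduced here.

```python
def clip(lo, hi, value):
    """Clamp value to the inclusive range [lo, hi]."""
    return max(lo, min(hi, value))


def display_picture(decoded, noise, lo=0, hi=255):
    """Apply Equation 1 to every pixel: D = Clip(lo, hi, P + N).

    decoded and noise are assumed to be equally sized 2-D lists of
    integers (hypothetical layout: rows indexed by y, columns by x).
    """
    return [
        [clip(lo, hi, p + n) for p, n in zip(p_row, n_row)]
        for p_row, n_row in zip(decoded, noise)
    ]


# Example: the sums 260 and -2 fall outside 0..255 and are clipped.
decoded = [[250, 3], [128, 0]]
noise = [[10, -5], [4, 0]]
displayed = display_picture(decoded, noise)
```

In this sketch the decoded picture is left unmodified and a new displayed picture is returned, which mirrors the requirement that stored reference pictures remain unchanged by the noise addition.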
The visual impact of adding a noise signal to the video sequence, rather than just to a single image, becomes a consideration in the determination of the magnitude of the noise signal. In accordance with the present principles, the noise generator 16 correlates the magnitude of additive noise signal for at least one pixel in a picture to the value of the additive noise signal of at least one pixel in at least one previously displayed picture, i.e., a decoded picture to which noise has been added. In an alternative embodiment of the current invention, the temporal correlation is based on the previously decoded picture (which contains no noise), rather than the previously displayed picture that contains noise. While noise is typically added to each pixel, comfort noise could be added to some pixels, not necessarily every pixel.
In a first embodiment, noise addition using temporal correlation makes use of a correlation factor α, 0≦α≦1, yielding the following relationship for added noise:
N(k, x, y)=αN(k−1, x, y)+(1−α)R(k, x, y) (Equation 2)
where R(k, x, y) is a random number generated using any type of random number distribution, for example a Gaussian or Laplacian distribution. Random number generation can occur by means of a lookup table. The random number R(k, x, y) can also include spatial correlation, such as that used in film grain noise generation.
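The recursion of Equation 2 might be sketched in Python as follows. The Gaussian distribution is one of the options named in the text; the standard deviation `sigma`, the function name `next_noise`, and the 2-D list layout are assumptions made only for this illustration.

```python
import random


def next_noise(prev_noise, alpha, rng, sigma=2.0):
    """Equation 2 per pixel: N(k) = alpha*N(k-1) + (1 - alpha)*R(k).

    prev_noise is the noise field N(k-1) as a 2-D list of numbers.
    R(k) is drawn here from a zero-mean Gaussian; sigma is an assumed,
    illustrative value, not one taken from the specification.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in the range [0, 1]")
    return [
        [alpha * n_prev + (1.0 - alpha) * rng.gauss(0.0, sigma)
         for n_prev in row]
        for row in prev_noise
    ]


# Example: evolve a 2x2 noise field over three pictures.
rng = random.Random(0)
field = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(3):
    field = next_noise(field, alpha=0.75, rng=rng)
```

With α close to 1 the noise field changes slowly from picture to picture, whereas α equal to 0 reduces to independent random noise in every picture.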
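The term α in Equation 2 has a value no greater than unity. To avoid fractional multiplication and division, α can be approximated by the fixed-point ratio a/2^b, so that Equation 2 can be evaluated using integer arithmetic as follows:
N(k, x, y)=(a*N(k−1, x, y)+(2^b−a)*R(k, x, y)+2^(b−1))>>b (Equation 3)
with scaling variable b, chosen based on the desired precision of the division approximation, and a=round(α*2^b). The term 2^(b−1) rounds the result to the nearest integer before the right shift by b bits.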
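The integer form of Equation 3 can be checked with a small Python sketch, reading the garbled exponents as 2^b and 2^(b−1). The function names are hypothetical, and rounding halves upward when computing a is an assumption, since the text only says "round". Python's `>>` on negative integers is an arithmetic (flooring) shift, which is also assumed here.

```python
import math


def fixed_point_alpha(alpha, b):
    """Return a = round(alpha * 2^b), rounding halves upward (assumed)."""
    return int(math.floor(alpha * (1 << b) + 0.5))


def next_noise_int(n_prev, r, a, b):
    """Equation 3 for one pixel, using integers only.

    Computes (a*N(k-1) + (2^b - a)*R(k) + 2^(b-1)) >> b, where the
    2^(b-1) term rounds to nearest before the shift. Requires b >= 1.
    """
    return (a * n_prev + ((1 << b) - a) * r + (1 << (b - 1))) >> b


# Example: alpha = 0.75 with b = 4 gives a = 12.
a = fixed_point_alpha(0.75, 4)
n = next_noise_int(8, -4, a, 4)  # floating point: 0.75*8 + 0.25*(-4) = 5
```

A larger b gives a finer approximation of α at the cost of wider intermediate products, so a fixed-width implementation would need to choose b with the noise range in mind.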
Implementing Equation 3 in a decoder arrangement, such as the decoder arrangement 1000 of
An FIR filter approach can be implemented using the decoder arrangement 100 of
The foregoing describes a technique for reducing artifacts in connection with decoding of a coded video stream by adding noise.
This application claims the benefit, under 35 U.S.C. §365 of International Application PCT/US2004/025366, filed Aug. 4, 2004, which was published in accordance with PCT Article 21(2) on Mar. 3, 2005 in English and which claims the benefit of U.S. provisional patent application No. 60/496,426, filed Aug. 20, 2003.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2004/025366 | 8/4/2004 | WO | 00 | 2/2/2006 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2005/020585 | 3/3/2005 | WO | A |
Number | Date | Country |
---|---|---|
20060256871 A1 | Nov 2006 | US |
Number | Date | Country |
---|---|---|
60496426 | Aug 2003 | US |