This relates generally to compression of images, including still images and video images, using the discrete cosine transform (DCT). More particularly, it relates to detecting artifacts resulting from the lossy quantization of discrete cosine transform coefficients.
A conventional discrete cosine transform based video encoder divides each frame into non-overlapping squares called macroblocks. Each macroblock can be processed in either an intra or an inter mode. In the intra mode, the encoder uses reconstructed data of only the current frame. In the inter mode, the encoder uses data from previously processed frames as well.
Data obtained in the intra and inter modes is denoted by the term “prediction error”. The prediction error is divided into W×W blocks for the discrete cosine transform and quantization of the transform coefficients. A typical value of W is eight. Data is recovered by applying an inverse discrete cosine transform to the dequantized DCT coefficients. Applying the quantization-dequantization procedure leads to data losses. Thus, the difference between the original and reconstructed data will be called “quantization noise”. Quantization noise leads to specific visual errors called “artifacts.” Among these artifacts are blocking; ringing artifacts, which appear as vertical or horizontal bars within a block; and mosquito noise, which looks like a cloud of insects around the strong edges in the reconstructed image.
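For illustration, the following minimal Python sketch (with a hypothetical uniform quantization step Q, not taken from this description) shows how quantizing and dequantizing the DCT coefficients of a single 8×8 block produces such quantization noise:

```python
import numpy as np
from scipy.fft import dctn, idctn

W = 8          # typical block size
Q = 16         # hypothetical uniform quantization step (illustrative only)

rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(W, W)).astype(np.float64)  # stand-in pixel data

coeffs = dctn(block, norm="ortho")                 # forward 2D DCT
dequantized = np.round(coeffs / Q) * Q             # quantize, then dequantize
reconstructed = idctn(dequantized, norm="ortho")   # inverse 2D DCT

quantization_noise = block - reconstructed         # difference: "quantization noise"
print(np.abs(quantization_noise).max())
```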
In accordance with some embodiments of the present invention, artifacts may be detected from the decoder's output. The artifacts that may be detected, in some embodiments, include ringing artifacts and mosquito noise.
A set of templates is determined for each artifact class, namely, ringing or mosquito noise artifacts. One such set of templates, according to one embodiment, is shown in
Each template has an 8×8 size, in one embodiment. The template coefficients take only the values −1 (black element) and +1 (white element). The total number of templates for all classes may be 63, in one embodiment. The templates are used for artifact class determination.
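The actual templates are defined in the figure referenced above and are not reproduced here. Purely as an illustrative assumption, motivated by the later observation that each metric estimates the level of a single DCT coefficient, a set of 63 ±1 templates could be built from the signs of the 8×8 DCT AC basis functions, as in the following sketch:

```python
# Illustrative sketch only: this construction is an assumption, not the
# template definition from the figure. One +/-1 template is derived from the
# sign pattern of each of the 63 AC basis functions of the 8x8 DCT.
import numpy as np

W = 8

def dct_basis(u, v, W=8):
    """2D DCT-II basis function for frequency pair (u, v)."""
    i = np.arange(W).reshape(-1, 1)
    j = np.arange(W).reshape(1, -1)
    return (np.cos((2 * i + 1) * u * np.pi / (2 * W)) *
            np.cos((2 * j + 1) * v * np.pi / (2 * W)))

# 63 templates: one per AC frequency (u, v) != (0, 0); values are only -1 or +1.
templates = [np.where(dct_basis(u, v) >= 0, 1, -1)
             for u in range(W) for v in range(W) if (u, v) != (0, 0)]
assert len(templates) == 63
```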
Thus, referring to
Next, in block 22, the average intensity value is calculated. The average value “A” for the current block is calculated as follows:

A = \frac{1}{W^{2}} \sum_{i=1}^{W} \sum_{j=1}^{W} p_{i,j}

where p_{i,j} is the intensity value of the pixel with coordinates (i, j) within the block in question.
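A minimal sketch of this average calculation, using the integer sum-and-right-shift form mentioned later in this description (64 additions and one right shift), could be:

```python
import numpy as np

def block_average(block):
    # block: 8x8 array of integer pixel intensities p_{i,j}
    return int(np.asarray(block).sum()) >> 6   # divide by 64 with one right shift
```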
Subsequently, in block 24, the difference from the average is calculated for the intensity of each pixel in the block as follows:
\tilde{p}_{i,j} = p_{i,j} - A
Then, in block 26, the 2D scalar product with each template T_k is calculated, i.e., the elementwise products are summed:

m_k = \sum_{i=1}^{W} \sum_{j=1}^{W} t^{k}_{i,j} \, \tilde{p}_{i,j}

Thus, in one embodiment, a metric m_k is calculated for each template using the template values t^{k}_{i,j}, defined in
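A minimal sketch of the metric calculation for a single template, with illustrative names, could be:

```python
import numpy as np

def template_metric(block, template):
    """m_k = sum over i,j of t^k_{i,j} * (p_{i,j} - A), for one 8x8 +/-1 template."""
    block = np.asarray(block, dtype=np.int64)
    A = block.sum() >> 6                      # block 22: average intensity
    centered = block - A                      # block 24: difference from the average
    return int(np.sum(np.asarray(template) * centered))  # block 26: 2D scalar product
```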
Finally, in block 28, the result of the 2D scalar product from block 26 is compared in a comparator to a threshold thr. In other words, the metric value determined in block 26 is compared to a threshold. If the result of the 2D scalar product is greater than the threshold for that particular template, then the respective artifact (i.e., either a ringing or a mosquito noise artifact) is present in the block. The thresholds may be determined empirically, based on experience.
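A minimal, self-contained sketch of the resulting decision for one artifact class, with illustrative names and threshold, could be:

```python
import numpy as np

def detect_artifact(block, templates, thr):
    """Return True if any template metric for this 8x8 block exceeds the threshold thr."""
    block = np.asarray(block, dtype=np.int64)
    centered = block - (block.sum() >> 6)                    # blocks 22 and 24
    return any(int(np.sum(np.asarray(t) * centered)) > thr   # blocks 26 and 28
               for t in templates)
```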
Since the template values are only −1 and +1, each multiplication can be replaced by a sign change for negative template values. The metric calculation then needs only 64 additions and fewer than 64 sign changes, in one embodiment. All metrics can be calculated at once for the current block, in one embodiment. Since the blocks are processed separately, they can also be handled simultaneously using parallel processing, in one embodiment. Since the calculated metric estimates the level of a single discrete cosine transform coefficient, it can be used for further filtering of the block.
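The multiplication-free form could be sketched as follows, again with illustrative names:

```python
# Because template values are only -1 or +1, each product reduces to adding or
# subtracting the centered pixel value (a sign change instead of a multiply).
import numpy as np

def template_metric_no_multiply(block, template):
    block = np.asarray(block, dtype=np.int64)
    centered = block - (block.sum() >> 6)        # average via 64 additions + 1 shift
    m = 0
    for i in range(8):
        for j in range(8):
            v = int(centered[i, j])
            m += v if template[i, j] > 0 else -v  # sign change for negative template values
    return m
```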
The output from the discrete cosine transform based decoder is transformed into the YCbCr space in the transformation unit 32 of the apparatus 30, shown in
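The exact transformation used by unit 32 is not specified here; assuming the decoder output is full-range 8-bit RGB, a JPEG/BT.601-style conversion into YCbCr could be sketched as:

```python
# Illustrative sketch only: JPEG/BT.601 full-range RGB-to-YCbCr conversion.
import numpy as np

def rgb_to_ycbcr(rgb):
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return np.stack([y, cb, cr], axis=-1)
```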
In some embodiments, the sequence shown in
In one embodiment, implemented in software, the sequence of instructions shown in
In one embodiment, the method may be implemented using field programmable gate array schemes. Detecting noise independently in each block may allow fast parallel block processing in some embodiments. Thus, the techniques described herein can be implemented with low complexity and high speed in some cases. The maximum count of operations is 64 additions and 32 sign changes per template, in one embodiment, with a maximum template count of 63. Also, 64 additions and one right shift are used for the average value calculation. Thus, the calculations may be done very economically. In some embodiments, only the current frame is used and there is no need to use previous frames.
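As a rough worked count per 8×8 block, assuming all 63 templates are evaluated:

```latex
\underbrace{64}_{\text{average}} + \underbrace{63 \times 64}_{\text{metrics}} = 4096 \ \text{additions},
\qquad 63 \times 32 = 2016 \ \text{sign changes},
\qquad 1 \ \text{right shift}.
```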
The graphics processing techniques described herein may be implemented in various hardware architectures. For example, graphics functionality may be integrated within a chipset. Alternatively, a discrete graphics processor may be used. As still another embodiment, the graphics functions may be implemented by a general purpose processor, including a multicore processor.
References throughout this specification to “one embodiment” or “an embodiment” mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation encompassed within the present invention. Thus, appearances of the phrase “one embodiment” or “in an embodiment” are not necessarily referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be instituted in suitable forms other than the particular embodiment illustrated, and all such forms may be encompassed within the claims of the present application.
While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of this present invention.