The invention lies in the field of image synthesis, and more specifically in the domain of video compression. The synthesis method applies to both the coder and the decoder.
The method consists in synthesizing the content of an image from texture patches, the patches in question being:
Moreover, on the basis of a quality metric, the result of the synthesis thus obtained is compared with the source on the coder side. The parts of the reconstructed image that do not reach a level of quality judged acceptable by the criterion are then encoded by a more conventional technique, such as for example:
Among the known synthesis methods are pixel-based techniques, in which the pixels are constructed one by one. One such algorithm was developed by L.-Y. Wei and M. Levoy, “Fast texture synthesis using tree-structured vector quantization”, Proceedings of SIGGRAPH 2000 (July 2000), 479-488. [1]
The purpose here is to synthesize a large texture area from a “patch” that is smaller but that contains all the information required concerning the patterns. The quality of the algorithm is judged by the absence of visible borders or periodicities in the synthesized image.
The comparison of neighbouring areas is done “pixel by pixel” using the L2 norm. Thus the error minimized here has the form:
where xsynth and xpatch are the values of each RGB colour component of the pixel considered in the current image and in the patch, respectively. Each pixel of the neighbouring area of the current pixel is thus compared with its counterpart in the neighbouring area of the pixel tested in the patch.
The neighbouring area consists of the pixels surrounding the current pixel and is contained in a square of given dimensions [d×d]. It is called “causal” when it only comprises pixels already synthesized in the current image. Causal neighbouring areas are therefore used here, since the non-causal part of the neighbouring area in the current area only comprises noise pixels and is of no interest for the comparison.
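By way of illustration, the exhaustive pixel-by-pixel search with a causal neighbouring area can be sketched as follows. This is a minimal sketch in the spirit of [1], not the exact procedure of that reference: the function names, the noise initialization drawn from the patch colours, the wrap-around at the image borders and the default d=5 are all assumptions made for the example.

```python
import numpy as np

def causal_offsets(d):
    """Offsets (dy, dx) inside a d x d window that precede the centre in raster order."""
    h = d // 2
    return [(dy, dx) for dy in range(-h, 1) for dx in range(-h, h + 1)
            if dy < 0 or dx < 0]

def synthesize(patch, out_shape, d=5, seed=0):
    """Exhaustive synthesis: each output pixel copies the patch pixel whose
    causal neighbouring area has the smallest squared L2 error."""
    rng = np.random.default_rng(seed)
    patch = patch.astype(np.float64)
    ph, pw, _ = patch.shape
    H, W = out_shape
    h = d // 2
    offs = causal_offsets(d)

    # Assumption: the output starts as noise drawn from the patch colours.
    idx = rng.integers(0, ph * pw, size=H * W)
    out = patch.reshape(-1, 3)[idx].reshape(H, W, 3).copy()

    # Pre-compute the causal neighbouring area of every usable patch pixel.
    cand = [(y, x) for y in range(h, ph) for x in range(h, pw - h)]
    cand_nb = np.array([[patch[y + dy, x + dx] for dy, dx in offs]
                        for y, x in cand])          # shape (N, K, 3)

    for y in range(H):
        for x in range(W):
            # Assumption: indices wrap around at the borders, so near the
            # top and left edges some neighbours are still noise pixels.
            nb = np.array([out[(y + dy) % H, (x + dx) % W] for dy, dx in offs])
            err = ((cand_nb - nb) ** 2).sum(axis=(1, 2))  # sum over pixels and RGB
            by, bx = cand[int(err.argmin())]
            out[y, x] = patch[by, bx]
    return out
```

The cost is proportional to the number of output pixels times the number of patch pixels times the size of the neighbouring area, which is why the calculation time grows quickly with d.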
The main problem raised by the exhaustive approach remains the calculation time required to synthesize images of reasonable size. Since this calculation time is correlated with the size of the neighbouring area, a multi-resolution approach enables the performance to be improved. The main idea introduced in [1] is to use images of lower resolution so that 5×5 or 3×3 neighbouring areas extend over the texture like 15×15 neighbouring areas at a single resolution. To do this, two pyramids are first created, one for the patch and one for the synthesized image, using a sub-sampling filter, as shown in
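The construction of such a pyramid can be sketched as follows. The text does not specify the sub-sampling filter, so a 2×2 box filter is assumed here, and the function names are hypothetical.

```python
import numpy as np

def downsample2(img):
    """Halve the resolution with a 2x2 box filter (assumed sub-sampling filter)."""
    H, W = img.shape[:2]
    img = img[:H - H % 2, :W - W % 2].astype(np.float64)  # crop to even size
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def build_pyramid(img, levels):
    """Return [full resolution, ..., lowest resolution]."""
    pyr = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels - 1):
        pyr.append(downsample2(pyr[-1]))
    return pyr
```

With this filter, one pixel at pyramid level k summarizes a 2^k × 2^k block of the full-resolution image, so a small neighbouring area at a coarse level covers a much larger region of the texture.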
The algorithm then synthesizes the current image pyramid, from the lowest resolution to the highest resolution, as follows
Since the purpose of the invention is to synthesize an image from texture patches with the objective of image compression, it is necessary to estimate the reconstruction quality of the synthesized image parts in comparison with the source image (on the coder side). These synthesis-based reconstruction techniques tend to give rise to a reconstructed signal that moves away from the original signal in terms of a standard distortion of SSE (sum of squared errors) type, yet offers a visual rendering that may be entirely acceptable. This is where the quality metric comes in. There is currently a lot of work on the subject; this document is directed towards a measure of a more psycho-visual character called Structural Similarity (SSIM), described for example in the document by Z. Wang, L. Lu, A. C. Bovik, “Video quality assessment based on structural distortion measure”, Signal Processing: Image Communication, vol. 19, no. 2, pp. 121-132, February 2004.
This measure is composed of three terms and enables the disparities to be estimated. The SSIM formulation is the following:
where:
SSIM is computed over an 8×8 block of the image for each pixel of the image.
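A minimal sketch of a block-wise SSIM computation is given below. It assumes the commonly used three-term form of SSIM (luminance, contrast and structure) with the usual stabilizing constants for 8-bit signals; the constants, the function names and the choice of anchoring each 8×8 window at its top-left pixel are assumptions and may differ from the formulation referred to above.

```python
import numpy as np

def ssim_block(x, y, L=255.0, k1=0.01, k2=0.03):
    """SSIM between two blocks as the product of three terms (assumed form)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    c3 = c2 / 2.0
    mx, my = x.mean(), y.mean()
    sx, sy = x.std(), y.std()
    sxy = ((x - mx) * (y - my)).mean()
    luminance = (2 * mx * my + c1) / (mx ** 2 + my ** 2 + c1)
    contrast = (2 * sx * sy + c2) / (sx ** 2 + sy ** 2 + c2)
    structure = (sxy + c3) / (sx * sy + c3)
    return luminance * contrast * structure

def ssim_map(ref, test, block=8):
    """One SSIM value per pixel position at which a full block fits."""
    H, W = ref.shape
    out = np.empty((H - block + 1, W - block + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = ssim_block(ref[y:y + block, x:x + block],
                                   test[y:y + block, x:x + block])
    return out
```

The inputs are expected to be single-channel (for example luminance) arrays of identical size.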
One of the purposes of the invention is to overcome the aforementioned disadvantages. The object of the invention is a method for image decoding using a technique for synthesis of images and image regions exploiting a synthesis algorithm that operates on a set of patches, this operation being carried out by means of a low resolution image, characterized in that it comprises the following steps:
According to a particular embodiment, the synthesis technique is of pyramidal type.
According to a particular embodiment, the low resolution image takes a spatially scalable form so that the synthesis algorithm is guided, at selected points, at pyramid levels other than the lowest resolution level.
According to a particular embodiment, the synthesis algorithm operates on an RGB image signal, a YUV image signal or a luminance signal Y alone, the signals U and V undergoing the same processing as that applied to the luminance.
The object of the invention is also a method for image compression using a technique for synthesis of images and image regions exploiting a synthesis algorithm that operates on a set of patches, this operation being carried out by means of a low resolution image, characterized in that it comprises the following steps:
According to a particular embodiment, the synthesis technique is of pyramidal type.
According to a particular embodiment, the low resolution image takes a spatially scalable form so that the synthesis algorithm is guided, at selected points, at pyramid levels other than the lowest resolution level.
According to a particular embodiment, the synthesis algorithm operates on an RGB image signal, a YUV image signal or a luminance signal Y alone, the signals U and V undergoing the same processing as that applied to the luminance.
According to a particular embodiment, the quality metric is SSIM (Structural SIMilarity).
The invention enables the synthesis of images and image regions to be improved by using a synthesis algorithm that operates on a set of patches, this operation being carried out by means of a low resolution image. Since the targeted application is video compression, a quality metric is used to decide, typically, whether to code the badly reconstructed areas of the image or to leave the areas in question as they are.
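The role of the quality metric in this decision can be illustrated by the following sketch, which flags the blocks of a quality map (for example an SSIM map with one value per pixel) that would be handed to the conventional coder. The block size, the averaging of the map over each block and the threshold value of 0.8 are assumptions chosen for the example, not values taken from the invention.

```python
import numpy as np

def blocks_to_recode(quality_map, block=8, threshold=0.8):
    """Boolean grid, True where the mean quality of a block is below the threshold."""
    H, W = quality_map.shape
    gh, gw = H // block, W // block
    # Assumption: partial blocks at the right and bottom edges are ignored.
    q = quality_map[:gh * block, :gw * block]
    q = q.reshape(gh, block, gw, block).mean(axis=(1, 3))
    return q < threshold
```

Blocks marked True would be coded conventionally, while the others would keep the synthesized content as it is.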
A first advantage of the invention is thus to enable an acceptable visual rendering (based on the quality metric) of image regions reconstructed via a synthesis algorithm, this synthesis being guided at the coder and at the decoder by a transmitted low resolution image, in order ultimately to reduce the bit rate at a given visual quality or, conversely, to improve the visual quality at a given bit rate.
It should be noted that this technique does not require a segmentation map to be transmitted to the decoder, since the synthesis algorithm naturally distributes the information contained in the different patches by means of the guiding image. In addition, the rendering imperfections left by the synthesis technique are corrected by standard coding, the imperfect areas being detected by a quality metric, which can be SSIM. A second advantage of the invention is the scalability of the representation, which enables the signal to be decoded at a chosen resolution.
Another advantage is the possibility of coding the low resolution image according to an existing coding technique, for example H.264, thus ensuring backward compatibility with these coding techniques.
The idea is to transmit to the hierarchical synthesis algorithm the sub-sampled version of the reference image, which serves as a guide for the synthesis of the lowest resolution level of the pyramid. The synthesis of this low resolution image is made with a non-causal neighbouring area. For example, the exhaustive approach of L.-Y. Wei and M. Levoy is chosen, which consists in comparing this neighbouring area with all those of the patch in order to determine the best candidate.
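One possible reading of this guided synthesis at the lowest resolution level is sketched below: since the guide image is entirely known, the full (non-causal) d×d neighbouring area of each guide pixel is compared exhaustively with every neighbouring area of the low resolution patch. The function name, the use of a single patch, the edge padding of the guide and the default d=3 are assumptions made for the example.

```python
import numpy as np

def guided_lowres_synthesis(guide, patch_low, d=3):
    """For each guide pixel, copy the patch pixel whose full d x d
    neighbouring area is closest (squared L2 error) to that of the guide."""
    guide = np.asarray(guide, dtype=np.float64)
    patch_low = np.asarray(patch_low, dtype=np.float64)
    h = d // 2
    ph, pw = patch_low.shape[:2]

    # All complete d x d neighbouring areas of the low resolution patch.
    cand = [(y, x) for y in range(h, ph - h) for x in range(h, pw - h)]
    cand_nb = np.array([patch_low[y - h:y + h + 1, x - h:x + h + 1].ravel()
                        for y, x in cand])

    # Assumption: the guide is extended by edge replication at its borders.
    pad = ((h, h), (h, h)) + ((0, 0),) * (guide.ndim - 2)
    padded = np.pad(guide, pad, mode="edge")

    out = np.empty_like(guide)
    H, W = guide.shape[:2]
    for y in range(H):
        for x in range(W):
            nb = padded[y:y + d, x:x + d].ravel()
            err = ((cand_nb - nb) ** 2).sum(axis=1)
            by, bx = cand[int(err.argmin())]
            out[y, x] = patch_low[by, bx]
    return out
```

With several patches, the candidate list would simply be extended to the neighbouring areas of all of them.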
The different steps of the method, shown by
Take for example, to illustrate this type of synthesis, an image from a football match. This reference image is shown in
The synthesized image of dimensions 768×512, shown in
In order to measure whether the texture synthesis is pertinent for the regions of the image produced, a quality metric capable of assessing the rendering of the structure is used.
Taking the previous example again with one possible metric, the SSIM, a map of the SSIM is obtained, as shown in
Several decision modes can be applied:
The applications concerned are those linked to video compression, more specifically very low and low bit rate applications (for example HD for mobile) as well as super resolution (HD and beyond).
Number | Date | Country | Kind |
---|---|---|---|
0853721 | Jun 2008 | FR | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP2009/056903 | 6/4/2009 | WO | 00 | 12/2/2010 |