The invention relates generally to image epitome extraction in the context of video compression and coding constraints. More precisely, the invention relates to a method of coding an epitome.
The epitome of an image is its condensed representation containing the essence of the textural and structural properties of the image. The epitome approach aims at reducing redundant information (texture) in the image by exploiting repeated content within an image. An epitome construction method is known from the article by Hoppe et al entitled “Factoring Repeated Content Within and Among Images”, published in the proceedings of ACM SIGGRAPH 2008 (ACM Transactions on Graphics, vol. 27, no. 3, pp. 1-10, 2008).
This epitome construction method consists in factoring an image into a texture epitome and a transform map. Once the self-similar content is determined, the following step of the algorithm consists in extracting redundant texture patches to construct epitome charts, the union of all epitome charts constituting the texture epitome. Each epitome chart represents repeated regions in the image. The construction of an epitome chart is composed of a chart initialization step followed by several chart extension steps. The transform map indicates, for each block of the image, which patch in the texture epitome is to be used for its reconstruction. The reconstruction may be a simple copy of the identified patch. If sub-pel reconstruction is used, then an interpolation is made.
More generally, encoding such a texture epitome with existing block-based encoding techniques is costly because of texture edges existing within blocks to be encoded.
The invention is aimed at alleviating at least one of the drawbacks of the prior art. To this aim, the invention relates to a method of coding an epitome of an image divided into blocks comprising the steps of:
According to a first embodiment, the texture epitome is padded after the step of creation of the epitome and wherein the method further comprises, before the coding step, a step of refining the transform map using the padded texture epitome.
According to a specific aspect of the invention, refining the transform map comprises, for each block of the image, identifying a patch in the padded epitome which best matches the block according to a criterion.
According to a second embodiment, the texture epitome is padded during the step of epitome creation.
Advantageously, the transform map is refined using the texture epitome padded during the step of epitome creation.
Other characteristics and advantages of the invention will appear through the description of a non-limiting embodiment of the invention, which will be illustrated, with the help of the enclosed drawing.
One goal is to propose a complementary tool to be used while extending an epitome by an image region. The invention takes into account, when building the epitome, the properties of the video compression algorithm used to encode it.
The invention relates to a coding method of an epitome. The epitome of an image comprises a texture epitome comprising patches of texture extracted from the image and a transform map. The texture epitome is such that all image blocks can be reconstructed from the epitome patches. The transform map is also known as assignation map or vector map in the literature. The transform map indicates, for each block of the image, the location in the texture epitome of the patch used to reconstruct it. With the texture epitome E and the transform map Φ, one is able to reconstruct an image.
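The reconstruction described above can be illustrated with a minimal sketch, assuming a simple copy of the identified patch (no sub-pel interpolation). The function name `reconstruct`, the dictionary layout of the transform map and the toy pixel values are illustrative assumptions, not part of the described method.

```python
def reconstruct(epitome, transform_map, block_size, height, width):
    """Rebuild an image by copying, for each block, the epitome patch its map entry points to."""
    image = [[0] * width for _ in range(height)]
    for (by, bx), (ey, ex) in transform_map.items():
        for dy in range(block_size):
            for dx in range(block_size):
                image[by * block_size + dy][bx * block_size + dx] = epitome[ey + dy][ex + dx]
    return image


# Toy texture epitome E holding two distinct 2x2 textures side by side.
epitome = [
    [1, 2, 7, 8],
    [3, 4, 9, 6],
]
# Toy transform map: block (row, col) on the block grid -> top-left (y, x)
# of the patch to copy from the texture epitome.
transform_map = {(0, 0): (0, 0), (0, 1): (0, 2), (1, 0): (0, 2), (1, 1): (0, 0)}
image = reconstruct(epitome, transform_map, block_size=2, height=4, width=4)
```

In this toy case a 4x4 image (16 pixels) is rebuilt from an 8-pixel texture epitome plus four map entries, because each texture is used twice.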
The present invention enables the optimization of image factorization, i.e. epitome creation, according to the future use of block-based transforms such as the DCT, by performing a texture padding in order to have a block structure in the epitome and, optionally, by operating a “refinement” of the epitome due to the addition of new pixels to the epitome by the padding process. The refinement operation comprises finding new patches in the texture epitome, taking into account the new pixels resulting from the padding process.
At step 20, an epitome, i.e. a texture epitome E and a transform map Φ, is created from the current image Icurr. The epitome of an image is its condensed representation containing the essence of the textural and structural properties of the image. Therefore, according to this specific embodiment, the current image Icurr is factorized, i.e. a texture epitome E and a transform map Φ are created for the current image. The epitome principle was first disclosed by Hoppe et al in the article entitled “Factoring Repeated Content Within and Among Images” published in the proceedings of ACM SIGGRAPH 2008 (ACM Transactions on Graphics, vol. 27, no. 3, pp. 1-10, 2008). The texture epitome E is constructed from pieces of texture (e.g. a set of charts) taken from the current image. The transform map Φ is an assignation map that keeps track of the correspondences between each block of the current image Icurr and a patch of the texture epitome E. From an image I, a texture epitome E and a transform map Φ are created such that all image blocks can be reconstructed from matched epitome patches. A matched patch is also known as a transformed patch. The transform map is also known as vector map or assignment map in the literature. With the texture epitome E and the transform map Φ, one is able to reconstruct the current image I′. In the following, the epitome designates both the texture epitome E and the transform map Φ.
Other forms of epitome have been proposed in the literature. In the document entitled “Summarizing visual data using bidirectional similarity” published in 2008 in Computer Vision and Pattern Recognition CVPR, Simakov et al disclose the creation of an image summary from a bi-directional similarity measure. Their approach aims at satisfying two requirements: containing as much visual information as possible from the input data while introducing as few new visual artifacts as possible that were not in the input data (i.e., while preserving visual coherence).
In the document entitled “Video Epitomes” published in International Journal of Computer Vision, vol. 76, No. 2, February 2008, Cheung et al disclose a statistical method for extracting an epitome. This approach is based on a probabilistic model that captures both the color information and certain spatial patterns.
At step 210, the epitome construction method comprises finding self-similarities within the current image Icurr. The current image is thus divided into a regular grid of blocks. For each block in the current image Icurr, one searches the set of patches in the same image with similar content. That is, for each block Bi (∈ block grid), a list Lmatch(Bi)={Mi,0, Mi,1, ...} of matches (or matched patches) is determined that approximate Bi with a given error tolerance ε. In the current embodiment, the procedure of matching is performed with a block matching algorithm using an average Euclidean distance. Therefore, at step 210, the patches Mj,l in the current image whose distance to the block Bi is below ε are added to the list Lmatch(Bi). The distance equals for example the absolute value of the pixel by pixel difference between the block Bi and the patch Mj,l divided by the number of pixels in Bi. According to a variant, the distance equals the SSE (Sum of Square Errors), wherein the errors are the pixel by pixel differences between the block Bi and the patch Mj,l. An exhaustive search is performed in the entire image. Once all the match lists have been created for the set of image blocks, new lists L′match(Mj,l), indicating the set of image blocks that could be represented by a matched patch Mj,l, are built at step 220. Note that the matched patches Mj,l found during the full search step are not necessarily aligned with the block grid of the image and thus belong to the “pixel grid”.
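Steps 210 and 220 can be sketched as follows, assuming the distance is the sum of absolute pixel differences divided by the number of pixels in the block, and an exhaustive search over the pixel grid. The function names, the toy image and the tolerance value are hypothetical and chosen only for illustration.

```python
def patch_distance(image, by, bx, py, px, size):
    """Mean absolute pixel difference between the block at (by, bx) and the patch at (py, px)."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(image[by + dy][bx + dx] - image[py + dy][px + dx])
    return total / (size * size)


def find_matches(image, size, eps):
    """Step 210 sketch: for each grid block, list every pixel-grid patch whose distance is below eps."""
    height, width = len(image), len(image[0])
    matches = {}
    for by in range(0, height - size + 1, size):
        for bx in range(0, width - size + 1, size):
            matches[(by, bx)] = [
                (py, px)
                for py in range(height - size + 1)
                for px in range(width - size + 1)
                if patch_distance(image, by, bx, py, px, size) < eps
            ]
    return matches


def invert_matches(matches):
    """Step 220 sketch: for each matched patch, list the grid blocks it could represent."""
    inverse = {}
    for block, patches in matches.items():
        for patch in patches:
            inverse.setdefault(patch, []).append(block)
    return inverse


# Toy 4x4 image: three identical 2x2 blocks and one flat block.
image = [
    [1, 2, 1, 2],
    [3, 4, 3, 4],
    [1, 2, 9, 9],
    [3, 4, 9, 9],
]
matches = find_matches(image, size=2, eps=0.5)
inverse = invert_matches(matches)
```

Here `matches` plays the role of the lists Lmatch(Bi) and `inverse` the role of the lists L′match(Mj,l). Blocks and patches are identified by the pixel coordinates of their top-left corner.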
At step 240, epitome charts are constructed. To this aim, texture patches are extracted, more precisely selected, in order to construct epitome charts, the union of all the epitome charts constituting the texture epitome E. Each epitome chart represents specific regions of the image in term of texture. Step 240 is detailed in the following.
At step 2400, an index n is set equal to 0, n is an integer.
At step 2402, a first epitome chart ECn is initialized. Several candidate matched patches can be used to initialize an epitome chart. Each epitome chart is initialized by the matched patch which is the most representative of the remaining, not yet reconstructed blocks. Let Y ∈ R^(N×M) denote the input image and let Y′ ∈ R^(N×M) denote the image reconstructed by a candidate matched patch and the epitome charts previously constructed. To initialize a chart, the following selection criterion based on the minimization of the Mean Square Error (MSE) criterion is used:
$$\min \; \frac{1}{N \cdot M} \sum_{i=1}^{N} \sum_{j=1}^{M} \left( Y_{i,j} - Y'_{i,j} \right)^{2}$$
The selected criterion takes into account the prediction errors on the whole image. This criterion allows the epitome to be extended by a texture pattern that allows the reconstruction of the largest number of blocks while minimizing the reconstruction error. In the current embodiment, a zero value is assigned to image pixels that have not yet been predicted by epitome patches when computing the image reconstruction error.
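The chart initialization criterion can be sketched as below, assuming the MSE is computed over every pixel of the image with zero assigned to pixels not yet predicted. For brevity the sketch covers the first chart only, so previously constructed charts are ignored. The function names and the candidate lists in the example are hypothetical.

```python
def whole_image_mse(image, reconstructed):
    """MSE over every pixel; pixels not yet predicted are expected to hold 0."""
    height, width = len(image), len(image[0])
    total = 0
    for y in range(height):
        for x in range(width):
            total += (image[y][x] - reconstructed[y][x]) ** 2
    return total / (height * width)


def select_initial_patch(image, inverse, size):
    """Pick the candidate patch whose reconstruction gives the lowest whole-image MSE."""
    height, width = len(image), len(image[0])
    best_patch, best_cost = None, float("inf")
    for (py, px), blocks in inverse.items():
        # Zero everywhere, then copy the candidate patch into each block it can represent.
        reconstructed = [[0] * width for _ in range(height)]
        for by, bx in blocks:
            for dy in range(size):
                for dx in range(size):
                    reconstructed[by + dy][bx + dx] = image[py + dy][px + dx]
        cost = whole_image_mse(image, reconstructed)
        if cost < best_cost:
            best_patch, best_cost = (py, px), cost
    return best_patch, best_cost


image = [
    [1, 2, 1, 2],
    [3, 4, 3, 4],
    [1, 2, 1, 1],
    [3, 4, 1, 1],
]
# Hypothetical candidates: patch position -> blocks it can represent.
inverse = {(0, 0): [(0, 0), (0, 2), (2, 0)], (2, 2): [(2, 2)]}
best_patch, best_cost = select_initial_patch(image, inverse, size=2)
```

Because unreconstructed pixels count as zero, a candidate that covers more blocks, or blocks with more energy, lowers the whole-image error further.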
At step 2404, the epitome chart ECn is then progressively grown by a region from the input image. The step is detailed on
In the preferred embodiment, the λ value is set to 1000. The first term of the criterion refers to the average prediction error per pixel when the input image is reconstructed by texture information contained in the current epitome Ecurr and the increment ΔE. As in the initialization step, when the image pixels are impacted neither by the current epitome Ecurr nor by the increment, a zero value is assigned to them. FCext is thus computed on the whole image and not only on the reconstructed image blocks. The second term of the criterion corresponds to a rate per pixel when constructing the epitome, which is roughly estimated as the number of pixels in the current epitome and its increment, divided by the total number of pixels in the image. After having selected the locally optimal increment ΔEopt, the current epitome chart becomes: ECn(k+1)=ECn(k)+ΔEopt. The assignation map is updated for the blocks newly reconstructed by ECn(k+1).
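The two terms of the extension criterion can be sketched as below. The sketch assumes that they are combined as distortion plus λ times rate, and that the prediction error is a squared error as in the initialization step; neither assumption is confirmed by the text above. The function names and the candidate tuples are hypothetical.

```python
LAMBDA = 1000  # value given for the preferred embodiment


def extension_cost(image, reconstructed, epitome_pixel_count, lam=LAMBDA):
    """Assumed form of the criterion: per-pixel distortion + lam * per-pixel rate."""
    height, width = len(image), len(image[0])
    n_pixels = height * width
    # First term: average squared prediction error over the whole image,
    # with unreconstructed pixels holding 0.
    distortion = sum(
        (image[y][x] - reconstructed[y][x]) ** 2
        for y in range(height)
        for x in range(width)
    ) / n_pixels
    # Second term: pixels in the current epitome plus increment, over image pixels.
    rate = epitome_pixel_count / n_pixels
    return distortion + lam * rate


def select_increment(image, candidates, lam=LAMBDA):
    """candidates: list of (label, reconstructed_image, epitome_pixel_count) tuples."""
    return min(candidates, key=lambda c: extension_cost(image, c[1], c[2], lam))


image = [[100] * 4 for _ in range(4)]
# Both candidates reconstruct the top half of the image; the bottom half stays at 0.
recon = [[100] * 4 for _ in range(2)] + [[0] * 4 for _ in range(2)]
# Candidate A grows the epitome to 6 pixels, candidate B to 8 pixels.
candidates = [("A", recon, 6), ("B", recon, 8)]
best = select_increment(image, candidates)
```

With equal distortion, the candidate that adds fewer pixels to the epitome is retained, which is the trade-off the second term is meant to capture.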
Then, the current chart is extended, during the next iteration k+1, until there are no more matched patches Mj,l which overlap the current chart ECn(k) and represent other blocks. If such overlapping patches exist, then the method continues at step 2404 with ECn(k+1). When the current chart cannot be extended anymore and when the whole image is not yet reconstructed by the current epitome (step 2406), the index n is incremented by 1 at step 2408 and another epitome chart is created at a new location in the image. The method thus continues with the new epitome chart at step 2402, i.e. the new chart is first initialized before its extension. The process ends when the whole image is reconstructed by the epitome (step 2406). The texture epitome E comprises the union of all epitome charts ECn. The assignation map indicates for each block Bi of the current image the location in the texture epitome of the patch used for its reconstruction.
Back to
At step 24, the padded texture epitome is coded. Even if more texture than needed is coded due to the padded pixels, the global texture encoding cost is lower than without padding. As an example, the texture epitome E is encoded in conformance with the H.264 standard using intra only coding mode. According to a variant, the texture epitome is encoded in conformance with the JPEG standard. According to another variant, the texture epitome is encoded in inter coding mode using as reference image a homogeneous image, e.g. an image whose pixels all equal 128. According to another variant, the texture epitome is encoded using a classical encoder (e.g. H.264, MPEG-2, etc.) using both intra and inter prediction modes. These methods usually comprise the steps of computing a residual signal from a prediction signal, DCT, quantization and entropy coding.
At step 26, the transform map Φ is encoded with a fixed length code (FLC) or a variable length code (VLC). Other codes can also be used (e.g. CABAC). The transform map is a map of vectors, also referred to as a vector map.
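A fixed length code for the vector map can be sketched as follows, assuming each vector is a pair of absolute (y, x) coordinates in the texture epitome, written in a fixed block order on a fixed number of bits. The function names and the bit-string representation are illustrative assumptions.

```python
def encode_map_flc(vectors, bits):
    """Write each (y, x) coordinate as an unsigned integer on exactly `bits` bits."""
    out = []
    for y, x in vectors:
        for value in (y, x):
            if not 0 <= value < (1 << bits):
                raise ValueError("coordinate does not fit in the chosen code length")
            out.append(format(value, "0{}b".format(bits)))
    return "".join(out)


def decode_map_flc(bitstring, bits):
    """Inverse of encode_map_flc: split into fixed-size words and pair them up."""
    values = [int(bitstring[i:i + bits], 2) for i in range(0, len(bitstring), bits)]
    return list(zip(values[0::2], values[1::2]))


# One vector per image block, listed in raster-scan order (an assumption).
vectors = [(0, 0), (0, 2), (0, 2), (0, 0)]
bitstream = encode_map_flc(vectors, bits=2)
decoded = decode_map_flc(bitstream, bits=2)
```

A variable length code would instead assign shorter codewords to the most frequent vectors, at the cost of a more involved decoder.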
The coding method comprises a step 20 of epitome creation and a step 22 of padding of the texture epitome.
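The effect of padding step 22 on the shape of the epitome can be sketched as below, assuming the texture epitome is tracked as a boolean mask over the image pixels. The sketch only determines which pixels must be added so that every block touched by the epitome is fully covered; how the added pixels are filled is not addressed here. The function name and the mask representation are hypothetical.

```python
def pad_epitome_mask(mask, block_size):
    """Extend a pixel mask so that every grid block it touches is fully covered."""
    height, width = len(mask), len(mask[0])
    padded = [row[:] for row in mask]
    for by in range(0, height, block_size):
        for bx in range(0, width, block_size):
            ys = range(by, min(by + block_size, height))
            xs = range(bx, min(bx + block_size, width))
            # A block with at least one epitome pixel is completed.
            if any(mask[y][x] for y in ys for x in xs):
                for y in ys:
                    for x in xs:
                        padded[y][x] = True
    return padded


# A 2x2 patch at columns 1-2 straddles two 2x2 blocks of the grid.
mask = [
    [False, True, True, False, False, False],
    [False, True, True, False, False, False],
    [False, False, False, False, False, False],
    [False, False, False, False, False, False],
]
padded = pad_epitome_mask(mask, block_size=2)
added_pixels = sum(map(sum, padded)) - sum(map(sum, mask))
```

In this toy case four pixels are added, so that the epitome boundary no longer cuts through a block.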
The coding method further comprises a step 23 of transform map refinement. Indeed, at step 22, the texture epitome is slightly modified, i.e. new pixels are added to the texture epitome so that the texture epitome is aligned on the block structure of the image. Consequently, the transform map created at step 20 is no longer optimized for the new texture epitome. During step 23, each block Bi in the current image Icurr is associated with the patch of the padded texture epitome that it best matches in the sense of a criterion such as a Euclidean distance. The transform map is thus modified by changing, for the current block, the identifier of the matched patch. The identifier is for example the absolute coordinates of the matched patch in the texture epitome or the coordinates of a translational vector. More complex transformations may be used to associate a block of the current image with a patch in the texture epitome.
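The refinement of step 23 can be sketched as an exhaustive search, assuming a rectangular, fully populated padded epitome, image dimensions that are multiples of the block size, and a squared Euclidean distance (which ranks patches in the same order as the Euclidean distance). The function name and the toy data are hypothetical.

```python
def refine_transform_map(image, epitome, block_size):
    """For each image block, search the padded epitome for the closest patch."""
    height, width = len(image), len(image[0])
    e_height, e_width = len(epitome), len(epitome[0])
    refined = {}
    for by in range(0, height, block_size):
        for bx in range(0, width, block_size):
            best, best_dist = None, float("inf")
            for ey in range(e_height - block_size + 1):
                for ex in range(e_width - block_size + 1):
                    dist = sum(
                        (image[by + dy][bx + dx] - epitome[ey + dy][ex + dx]) ** 2
                        for dy in range(block_size)
                        for dx in range(block_size)
                    )
                    if dist < best_dist:
                        best, best_dist = (ey, ex), dist
            # Identifier: absolute coordinates of the matched patch in the epitome.
            refined[(by, bx)] = best
    return refined


image = [
    [1, 2, 2, 5],
    [3, 4, 4, 6],
]
padded_epitome = [
    [1, 2, 5],
    [3, 4, 6],
]
refined = refine_transform_map(image, padded_epitome, block_size=2)
```

The second block is matched to a patch at column 1 of the epitome, i.e. a patch on the pixel grid rather than on the block grid.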
In this case the padding of the texture epitome and the transform map refinement are achieved on the fly, i.e. during epitome creation step 20. At iteration k of step 2404 (chart extension step), the best increment ΔEoptk is determined.
At step 2405, the current epitome is padded to have a block structure. The transform map is then no longer optimized for the new texture epitome. During step 2407, each block Bi in the current image Icurr is associated with the patch of the padded texture epitome that it best matches in the sense of a criterion such as a Euclidean distance. The transform map is thus modified by changing, for the current block, the identifier of the matched patch. The identifier is for example the coordinates of the matched patch in the texture epitome.
Then, the current chart ECn is extended, during the next iteration k+1, until there are no more matched patches Mj,l which overlap the current chart ECn(k) and represent other blocks. When the current chart cannot be extended anymore and when the whole image is not yet reconstructed by the current epitome (step 2406), the index n is incremented by 1 at step 2408 and another epitome chart is created at a new location in the image. The method thus continues with the new epitome chart at step 2402, i.e. the new chart is first initialized before its extension. The process ends when the whole image is reconstructed by the epitome (step 2406).
Compared to the state-of-the-art epitome construction method, the invention has the advantage of decreasing the epitome encoding cost relative to the initial non-padded epitome.
The epitome (E,Φ) being used to reconstruct an image from the epitome texture E and the vector map Φ, the invention offers better encoding performances in so far as:
The main targeted applications are all the domains concerned with the image epitome reduction. Applications related to video compression and representations of videos are concerned.
The module IFM is linked to a first encoding module ENC1 adapted to encode the texture epitome according to the step 24 of the method of coding into a first bitstream F1. The module IFM is further linked to a second encoding module ENC2 adapted to encode the transform map according to the step 26 of the method of coding into a second bitstream F2. Each output of the encoding modules ENC1 and ENC2 is connected to an output of the encoding device (OUT1 and OUT2). In another embodiment the coding device ENC further comprises a multiplexing module MUX connected to the outputs of both encoding modules ENC1 and ENC2. The multiplexing module MUX is adapted to multiplex both bitstreams F1 and F2 into a single bitstream. In this case the coding device comprises only one output.
In another embodiment the module IFM is adapted to both pad the texture epitome and refine the transform map on the fly according to the step of the coding method.
Number | Date | Country | Kind |
---|---|---|---|
11305063.7 | Jan 2011 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP11/58495 | 5/24/2011 | WO | 00 | 10/14/2013 |