This application claims the benefit, under 35 U.S.C. §119 of European Patent Application No. 14305132.4, filed Jan. 30, 2014.
The present disclosure relates to image epitome construction.
An epitome of an image is a condensed representation containing the essence of the textural and structural properties of the image. The epitome approach aims at reducing redundant information (texture) in the image by exploiting repeated content within the image.
It is known to factor an image into a texture epitome E and a transform map φ. The epitome principle was first disclosed by Wang et al. in the article entitled “Factoring Repeated Content Within and Among Images” published in the proceedings of ACM SIGGRAPH 2008 (ACM Transactions on Graphics, vol. 27, no. 3, pp. 1-10, 2008).
In European patent application EP2011794733, a method to construct an epitome is disclosed that comprises finding self-similarities within the image and then determining redundant texture patches to construct epitome charts. Specifically, finding self-similarities comprises, for each block Bi in the image Y, determining a set of patches in the same image with similar content, i.e. patches that approximate Bi with a given error tolerance ε. Such a solution is time consuming and memory demanding.
A method for constructing an epitome from an image divided into non-overlapping blocks is disclosed. The method comprises:
Advantageously, determining, for each block, similar patches in the image comprises:
wherein said means for determining, for each block, similar patches in the image comprises:
A method of constructing an epitome of an image divided into non-overlapping blocks is disclosed. A block is located on a block grid as depicted on the left part of
In a step 10, similar block(s) Ai,I are determined for each block Bi in the image Y, where i is an integer identifying the block Bi and I is an integer identifying the similar block Ai,I. A block is similar to the block Bi if a distance d calculated between the content of these two blocks is below a first threshold value εA. The distance d equals for example the Sum of Absolute Differences (SAD), wherein the differences are the pixel by pixel differences between the two blocks. According to a variant, the distance equals the Sum of Square Errors (SSE), wherein the errors are the pixel by pixel differences between the two blocks. Many other such metrics may be used. A block in Y can have no similar blocks, a single similar block or a plurality of similar blocks. With respect to
In a step 12, similar patches Mi,p are determined for one current block Bi and for the similar blocks Ai,I determined in step 10 for Bi, where p is an integer identifying the similar patch. As an example, the current block is the block for which the number of similar blocks Ai,I determined in step 10 is the highest. If two blocks have the same number of similar blocks Ai,I, the current block can be the first block encountered when going through the picture in a specific scan order, e.g. raster scan order (i.e. from top to bottom and from left to right). With respect to
εA=αA*εM with 0≦αA<1
The value of εA is set via the coefficient αA. In practice, an appropriate value for αA is 0.5. When the method of epitome construction is used in an encoder/decoder, the value of the parameter αA could be particularly useful in order to tune the complexity of the encoder/decoder.
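As an illustration only, step 10 and the threshold relation εA=αA*εM can be sketched as follows. The function names (`sad`, `sse`, `similar_blocks`), the representation of blocks as nested lists and the exhaustive pairwise comparison are assumptions made for this sketch and are not taken from the disclosure.

```python
from typing import Callable, List, Sequence

Block = Sequence[Sequence[int]]  # 2-D array of pixel values (assumed layout)


def sad(a: Block, b: Block) -> int:
    """Sum of Absolute Differences, computed pixel by pixel."""
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def sse(a: Block, b: Block) -> int:
    """Sum of Square Errors, computed pixel by pixel."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def similar_blocks(
    blocks: List[Block],
    eps_m: float,
    alpha_a: float = 0.5,
    dist: Callable[[Block, Block], int] = sad,
) -> List[List[int]]:
    """For each block index i, list the indices of the blocks similar to it.

    A block is similar when its distance to block i is below
    eps_a = alpha_a * eps_m, with 0 <= alpha_a < 1.
    """
    if not 0 <= alpha_a < 1:
        raise ValueError("alpha_a must satisfy 0 <= alpha_a < 1")
    eps_a = alpha_a * eps_m
    return [
        [l for l, other in enumerate(blocks) if l != i and dist(block, other) < eps_a]
        for i, block in enumerate(blocks)
    ]
```

With αA set to 0, no block has similar blocks and every block is searched individually in step 12. Larger values of αA group more blocks together and so reduce the number of searches.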
This solution advantageously reduces the number of blocks considered during step 12 for the determination of similar patches. By doing so, a patch similar to a current block Bi can have a distance to a similar block Ai,I larger than εM. According to a variant, only a subset of the patches {Mi,0, Mi,1, . . . } similar to the current block Bi are associated with the similar blocks Ai,I. Specifically, a patch Mi,p similar to a current block Bi is further associated with a similar block Ai,I when the following equation is verified: d(Mi,p; Bi)≦εM−εA, where d(Mi,p; Bi) is the distance between the contents of Bi and Mi,p. This ensures that the distance d between any similar block Ai,I and any of its matched patches is below the second threshold value εM. At the end, for each block Bi belonging to the block grid, a list Lmatch(Bi)={Mi,0, Mi,1, . . . } of matched patches is determined that approximate Bi with a given error tolerance εM.
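The patch search of step 12 and the variant criterion d(Mi,p; Bi)≦εM−εA can be sketched as below. This is a minimal sketch under stated assumptions: the search is exhaustive over all pixel positions, the distance is the SAD, the block's own position is excluded from its matches, and the names `patch_at`, `matched_patches` and `propagate` are hypothetical.

```python
def patch_at(img, y, x, size):
    """Extract the size x size patch whose top-left pixel is (y, x)."""
    return [row[x:x + size] for row in img[y:y + size]]


def sad(a, b):
    """Sum of Absolute Differences, computed pixel by pixel."""
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def matched_patches(img, block_pos, size, eps_m):
    """Return ((y, x), distance) for every patch within eps_m of the block.

    Patches are taken on the pixel grid, so they may straddle block borders.
    Assumption: the block's own position is not reported as a match.
    """
    by, bx = block_pos
    block = patch_at(img, by, bx, size)
    height, width = len(img), len(img[0])
    matches = []
    for y in range(height - size + 1):
        for x in range(width - size + 1):
            if (y, x) == (by, bx):
                continue
            d = sad(block, patch_at(img, y, x, size))
            if d <= eps_m:
                matches.append(((y, x), d))
    return matches


def propagate(matches, eps_m, eps_a, strict=True):
    """Select the patches that are also associated with the similar blocks.

    strict=True applies the variant d <= eps_m - eps_a. Since a similar block
    is closer than eps_a to the current block, the triangle inequality then
    keeps each propagated patch within eps_m of every similar block.
    strict=False propagates every matched patch.
    """
    limit = eps_m - eps_a if strict else eps_m
    return [pos for pos, d in matches if d <= limit]
```

The triangle-inequality argument holds for the SAD, which is a true metric. It does not carry over directly to the SSE, so the guarantee would need to be checked separately for that distance.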
The step 12 is repeated for a next current block for which no matched patch is determined until at least one matched patch is determined (step 14) for each block. Consequently, the step 12 is not repeated for blocks Ai,0, Ai,1, Ai,2, Ai,3 and Bj (=Ai,4) because these blocks already have matched patches, namely the patches {Mi,0, Mi,1, . . . } similar to block Bi. When matched patches have already been determined for a block, the block is removed from the list of similar blocks it belongs to. Exemplarily, Ai,0 is a block similar to Bi and Bh. Consequently, Ai,0 is removed from the list of blocks similar to Bh because the matched patches {Mi,0, Mi,1, . . . } are associated with Ai,0 when considering block Bi. According to a variant, a block An,m belonging to several lists of similar blocks is left in the list of the block to which An,m is the closest in the sense of the distance d. Exemplarily, if An,m is a block similar to Bp and Bq and d(An,m, Bq)<d(An,m, Bp) then An,m is removed from the list of blocks similar to Bp and left in the list of blocks similar to Bq.
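The repetition of step 12 described above can be illustrated with the following sketch. It assumes that the output of step 10 is available as a dictionary mapping each block identifier to the set of its similar blocks, and that the patch search is supplied as a callable. The name `assign_matches` and both data layouts are hypothetical, and the sketch propagates every matched patch without applying the εM−εA filter of the variant.

```python
def assign_matches(similar, find_matches):
    """Run the patch search once per group of similar blocks.

    similar: dict block_id -> set of similar block ids, keys in scan order.
    find_matches: callable block_id -> list of matched patches (step 12).
    Returns a dict block_id -> list of matched patches.
    """
    similar = {b: set(s) for b, s in similar.items()}  # work on a copy
    lmatch = {}
    pending = list(similar)  # blocks that have no matched patch yet
    while pending:
        # Current block: the one with the most similar blocks. max() returns
        # the first maximal element, so ties are broken by scan order.
        current = max(pending, key=lambda b: len(similar[b]))
        patches = find_matches(current)
        group = [current] + sorted(similar[current])
        for b in group:
            lmatch[b] = list(patches)
        done = set(group)
        pending = [b for b in pending if b not in done]
        # A block that already has matched patches is removed from the
        # lists of similar blocks it still belongs to.
        for b in pending:
            similar[b] -= done
    return lmatch
```

In this sketch the search is invoked once per group rather than once per block, which is where the reduction in complexity comes from.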
In a step 16, at least one epitome chart is constructed from the lists of matched patches. The method of Wang disclosed in the article entitled “Factoring Repeated Content Within and Among Images” published in the proceedings of ACM SIGGRAPH 2008 (ACM Transactions on Graphics, vol. 27, no. 3, pp. 1-10, 2008) can be used. Many other such methods for constructing epitome chart(s) using lists of matched patches may be used. According to a specific and non-limiting embodiment depicted on
In a step 162, at least one epitome chart is constructed. To this aim, matched patches are selected in order to construct epitome charts, the union of all the epitome charts constituting the texture epitome E. A matched patch selected to be part of an epitome chart is called an epitome patch. Each epitome chart represents specific regions of the image Y in terms of texture. Step 162 is detailed below.
In a step 1620, an index n is set equal to 0, where n is an integer.
In a step 1622, a first epitome chart ECn is initialized. Several candidate matched patches can be used to initialize the epitome chart ECn. Each epitome chart is initialized by the matched patch E0 which is the most representative of the not yet reconstructed, i.e. represented, remaining blocks. A block Bi is able to be reconstructed by a matched patch Mj,l if Bi belongs to the list L′match(Mj,l). Let Y ∈ R^(N×M) denote the input image and let Y′ ∈ R^(N×M) denote the image reconstructed by a candidate matched patch and the epitome charts previously constructed. To initialize a chart, a selection criterion based on the minimization of the Mean Absolute Error (MAE) (equation 1) or of the Mean Square Error (MSE) (equation 2) criterion can be used:
where Yi,j is the image value of pixel (i,j) in the image Y and Y′i,j is the image value of pixel (i,j) in the reconstructed image Y′. Other metrics can be used to compute the reconstruction error.
The selected criterion takes into account the reconstruction errors on the whole image. This criterion allows the epitome to be extended by a texture pattern that allows the reconstruction of the largest number of blocks while minimizing the reconstruction error. The reconstruction error is computed between the image Y and the image Y′ reconstructed from the current epitome. The current epitome comprises a candidate matched patch and the epitome charts previously constructed. In a specific and non-limiting embodiment, when computing the image reconstruction error, a zero value is assigned to the pixels of blocks in the image Y′ that are not yet represented by epitome patches of the current epitome. Thus, the error for these pixels is equal to the value of the pixels in the original image. The issue is that the overall distortion depends not only on the reconstructed part of the image, but also on the non-reconstructed part. According to a variant, a value different from zero is used. As an example, the value 128 is used instead of zero. According to yet another variant, the error for these pixels is set to a maximum value, e.g. 255. The latter solution tends to promote reconstruction of a larger part of the image, thus accelerating the creation of the epitome.
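The reconstruction error and the three ways of handling not yet represented pixels can be sketched as follows. The standard definitions of the mean absolute error and mean square error over an N×M image are assumed here. The function name, the use of `None` to mark unrepresented pixels and the `forced_error` parameter are hypothetical choices made for this sketch.

```python
def reconstruction_error(orig, recon, metric="mse", fill=0, forced_error=None):
    """Mean per-pixel error between the image Y (orig) and Y' (recon).

    recon holds None for pixels not yet represented by the current epitome.
    fill: value assigned to those pixels in Y' (0 by default, 128 as a variant).
    forced_error: when set (e.g. 255), the error of those pixels is forced to
    this value instead of being derived from fill.
    metric: "mae" for mean absolute error, anything else for mean square error.
    """
    n_pixels = len(orig) * len(orig[0])
    total = 0
    for row_o, row_r in zip(orig, recon):
        for p_o, p_r in zip(row_o, row_r):
            if p_r is None and forced_error is not None:
                err = forced_error
            else:
                err = abs(p_o - (fill if p_r is None else p_r))
            total += err if metric == "mae" else err * err
    return total / n_pixels
```

With `fill=0`, dark unrepresented regions contribute little error, which illustrates why the distortion depends on the non-reconstructed part. Forcing the error to 255 penalizes every unrepresented pixel equally.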
In a step 1624, the epitome chart ECn is then progressively enlarged. The step is detailed on
where D is a distortion and R a rate.
Exemplarily,
According to a variant,
In a preferred embodiment, the λ value is set to 1000. The first term of the criterion refers to the average reconstruction error per pixel when the input image is reconstructed by texture information contained in the current epitome Ecurr and the increment ΔE. As in the initialization step, when pixels are represented neither by the current epitome Ecurr, nor by the increment, nor by the inferred patches (i.e. they do not belong to a block that can be reconstructed from Ecurr, from the matched patch that contains the increment or from the inferred blocks), a zero value is assigned to them. According to a variant, a value different from zero is used. As an example, the value 128 is used instead of zero. According to yet another variant, the error for these pixels is set to a maximum value, e.g. 255. The latter solution tends to promote reconstruction of a larger part of the image, thus accelerating the creation of the epitome. The second term of the criterion corresponds to a rate per pixel when constructing the epitome, which is roughly estimated as the number of pixels in the current epitome and its increment divided by the total number of pixels in the image. After having selected the locally optimal increment ΔEopt, the current epitome chart becomes: ECn(k+1)=ECn(k)+ΔEopt. The assignation map is updated for the blocks newly reconstructed by ECn(k+1).
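The selection of the locally optimal increment can be illustrated by the sketch below. It assumes that the criterion takes the usual Lagrangian form D + λ·R, with D the average reconstruction error per pixel and R the rate per pixel as described above. That form, the function name and the tuple layout of the candidates are assumptions of this sketch.

```python
def select_increment(candidates, epitome_pixels, image_pixels, lam=1000.0):
    """Pick the candidate increment minimizing D + lam * R.

    candidates: list of (label, distortion, increment_pixels), where
      distortion is the average per-pixel reconstruction error obtained with
      the current epitome plus this increment, and increment_pixels is the
      number of pixels the increment adds to the epitome.
    epitome_pixels: number of pixels already in the current epitome.
    image_pixels: total number of pixels in the image.
    """
    def cost(candidate):
        _, distortion, increment_pixels = candidate
        # Rate per pixel: epitome size after enlargement over image size.
        rate = (epitome_pixels + increment_pixels) / image_pixels
        return distortion + lam * rate

    return min(candidates, key=cost)
```

A larger λ favors small increments even when they reduce the distortion less, whereas a smaller λ favors increments that reconstruct more of the image.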
Then, the current chart is extended, during the next iteration k+1, until there are no more matched patches Mj,l which overlap the current chart ECn(k) and represent other blocks. If such overlapping patches exist then step 1624 is repeated with ECn(k+1).
According to a specific embodiment, when the current chart ECn(k) cannot be enlarged anymore, it is padded so that the current chart ECn(k) is aligned on the block grid. To this aim, the pixels are for example padded with their value in the original picture Y. Once the current epitome chart is padded, it is checked whether the padded chart contains new inferred patches able to reconstruct new blocks. This embodiment accelerates the reconstruction of the image, especially when there are many inferred patches in the padded chart. It is preferable to pad the current epitome chart ECn(k) after its entire construction rather than after each enlargement by ΔEopt. Indeed, the latter leads to an increase of the size of the epitome chart.
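A minimal sketch of the padding operation follows. It assumes, for simplicity, that the chart is described by its bounding rectangle in image coordinates, whereas an actual chart may have an arbitrary shape. The names `pad_to_grid` and `padded_chart` are hypothetical.

```python
def pad_to_grid(y0, x0, y1, x1, block):
    """Smallest block-aligned rectangle containing rows [y0, y1), cols [x0, x1)."""
    top = (y0 // block) * block
    left = (x0 // block) * block
    bottom = -(-y1 // block) * block  # ceiling division, then back to pixels
    right = -(-x1 // block) * block
    return top, left, bottom, right


def padded_chart(image, rect, block):
    """Return the padded chart, filled with the values of the original picture."""
    top, left, bottom, right = pad_to_grid(*rect, block)
    # Clip to the picture in case its size is not a multiple of the block size.
    bottom = min(bottom, len(image))
    right = min(right, len(image[0]))
    return [row[left:right] for row in image[top:bottom]]
```

A chart that is already aligned on the block grid is returned unchanged by `pad_to_grid`.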
When the current chart cannot be extended anymore and when the whole image is not yet reconstructed by the current epitome (step 1626), the index n is incremented by 1 in a step 1628 and another epitome chart is constructed in a new location in the image. The method thus continues with the new epitome chart at step 1622, i.e. the new chart is first initialized before its enlargement. The process ends when the whole image is reconstructed/represented by the epitome (step 1626). The texture epitome E comprises the union of all epitome charts ECn. The assignation map indicates for each block Bi of the current image Y the location in the texture epitome of the epitome patch that is to be used for its reconstruction.
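The role of the assignation map can be illustrated by the following reconstruction sketch. It assumes that the map stores, for each block of the grid, the top-left position of its epitome patch in the texture epitome, and that only translations are involved. The function name and the dictionary layout are hypothetical.

```python
def reconstruct(epitome, assignation, height, width, block):
    """Rebuild an image from a texture epitome and an assignation map.

    epitome: 2-D list of pixel values (the union of the epitome charts).
    assignation: dict (block_row, block_col) -> (ey, ex), the top-left pixel
      of the epitome patch used to reconstruct that block.
    Blocks absent from the map are left at zero.
    """
    out = [[0] * width for _ in range(height)]
    for (block_row, block_col), (ey, ex) in assignation.items():
        for dy in range(block):
            for dx in range(block):
                y = block_row * block + dy
                x = block_col * block + dx
                out[y][x] = epitome[ey + dy][ex + dx]
    return out
```

Two neighboring blocks may be reconstructed from overlapping epitome patches, which is how the epitome can be smaller than the image it represents.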
The implementations described herein may be implemented in, for example, a method or a process, an apparatus, a software program, a data stream, or a signal. Even if only discussed in the context of a single form of implementation (for example, discussed only as a method or a device), the implementation of features discussed may also be implemented in other forms (for example a program). An apparatus may be implemented in, for example, appropriate hardware, software, and firmware. The methods may be implemented in, for example, an apparatus such as, for example, a processor, which refers to processing devices in general, including, for example, a computer, a microprocessor, an integrated circuit, or a programmable logic device. Processors also include communication devices, such as, for example, computers, cell phones, portable/personal digital assistants (“PDAs”), and other devices that facilitate communication of information between end-users.
Implementations of the various processes and features described herein may be embodied in a variety of different equipment or applications. Examples of such equipment include an encoder, a decoder, a post-processor processing output from a decoder, a pre-processor providing input to an encoder, a video coder, a video decoder, a video codec, a web server, a set-top box, a laptop, a personal computer, a cell phone, a PDA, and other communication devices. As should be clear, the equipment may be mobile and even installed in a mobile vehicle.
Additionally, the methods may be implemented by instructions being performed by a processor, and such instructions (and/or data values produced by an implementation) may be stored on a processor-readable medium such as, for example, an integrated circuit, a software carrier or other storage device such as, for example, a hard disk, a compact diskette (“CD”), an optical disc (such as, for example, a DVD, often referred to as a digital versatile disc or a digital video disc), a random access memory (“RAM”), or a read-only memory (“ROM”). The instructions may form an application program tangibly embodied on a processor-readable medium. Instructions may be, for example, in hardware, firmware, software, or a combination. Instructions may be found in, for example, an operating system, a separate application, or a combination of the two. A processor may be characterized, therefore, as, for example, both a device configured to carry out a process and a device that includes a processor-readable medium (such as a storage device) having instructions for carrying out a process. Further, a processor-readable medium may store, in addition to or in lieu of instructions, data values produced by an implementation.
As will be evident to one of skill in the art, implementations may produce a variety of signals formatted to carry information that may be, for example, stored or transmitted. The information may include, for example, instructions for performing a method, or data produced by one of the described implementations. For example, a signal may be formatted to carry as data the rules for writing or reading the syntax of a described embodiment, or to carry as data the actual syntax-values written by a described embodiment. Such a signal may be formatted, for example, as an electromagnetic wave (for example, using a radio frequency portion of spectrum) or as a baseband signal. The formatting may include, for example, encoding a data stream and modulating a carrier with the encoded data stream. The information that the signal carries may be, for example, analog or digital information. The signal may be transmitted over a variety of different wired or wireless links, as is known. The signal may be stored on a processor-readable medium.
A number of implementations have been described. Nevertheless, it will be understood that various modifications may be made. For example, elements of different implementations may be combined, supplemented, modified, or removed to produce other implementations. Additionally, one of ordinary skill will understand that other structures and processes may be substituted for those disclosed and the resulting implementations will perform at least substantially the same function(s), in at least substantially the same way(s), to achieve at least substantially the same result(s) as the implementations disclosed. Accordingly, these and other implementations are contemplated by this application.
The present principles find their interest in all domains concerned with image epitome reduction. Applications related to video compression and representations of videos are concerned.
Number | Date | Country | Kind |
---|---|---|---|
14305132 | Jan 2014 | EP | regional |
Number | Name | Date | Kind |
---|---|---|---|
7978906 | Jojic | Jul 2011 | B2 |
8204338 | Hoppe et al. | Jun 2012 | B2 |
8478057 | Cui et al. | Jul 2013 | B1 |
20090208110 | Hoppe | Aug 2009 | A1 |
20130142405 | Nada | Jun 2013 | A1 |
20130223529 | Amonou | Aug 2013 | A1 |
Number | Date | Country |
---|---|---|
2011794733 | Apr 2006 | EP |
2666291 | Nov 2013 | EP |
WO2012097882 | Jul 2012 | WO |
WO2012097919 | Jul 2012 | WO |
Entry |
---|
Wang et al. (“Factoring Repeated Content Within and Among Images,” ACM Transactions on Graphics, vol. 27, No. 3, Aug. 2008). |
Suzuki et al. (“Image coding by using structure-texture decomposition and an analysis of their relationship,” IEEE Workshop on Signal Processing System, Oct. 6-8, 2010, pp. 105-110). |
Wang et al. (“Intra coding and refresh based on video epitomic analysis,” IEEE International Conference on Multimedia and Expo (ICME), Jul. 19-23, 2010). |
Cherigui et al: “Epitome-based image compression using translational sub-pel mapping”, Multimedia Signal Processing (MMSP), 2011 IEEE 13th International workshop on, IEEE, Oct. 17, 2011, pp. 1-6. |
Suzuki et al: “Image coding by using structure/texture decomposition and an analysis of their relationship”, 2010 IEEE Workshop on Signal Processing System (SIPS 2010): Oct. 6-8, 2010, IEEE, Oct. 6, 2010, pp. 105-110. |
Wang et al: “Factoring repeated content within and among images”, ACM SIGGRAPH 2008 Papers (SIGGRAPH' 08), Aug. 11, 2008, pp. 1-10. |
Chu: “Epitome and its applications”, Thesis, University of Illinois, 2012; pp. 1-40. |
Aharon et al: “Sparse and redundant modeling of image content using an image signature directory”, SIAM Journal of imaging sciences, vol. 1, No. 3, pp. 228-247, Jul. 2008. |
Barnes et al: “PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing”, ACM transactions on graphics (Proc. SIGGRAPH), vol. 28, No. 3, Aug. 2009; pp. 1-10. |
Barnes et al: “The Generalized PatchMatch Correspondence Algorithm”, European conference on computer vision (ECCV), Sep. 2010; pp. 1-14. |
Bentley: “Multidimensional binary search trees used for associative searching”, commun. ACM, vol. 18, No. 9, pp. 509-517, 1975. |
Bhatia: “Adaptive K means clustering”, in FLAIRS conference, May 2004; pp. 1-5. |
Cheung et al: “Video epitomes”, international journal of computer vision, vol. 76-2, pp. 141-152, Feb. 2008. |
Jojic et al: “Epitomic analysis of appearance and shape”, in Proc. IEEE conf. comput. vis. (ICCV), 2003, pp. 34-41. |
Lucas et al: “An Iterative Image Registration Technique with an Application to Stereo Vision”, in proceedings of imaging understanding workshop, 1981; pp. 121-130. |
Shi et al: “Good features to track”, in IEEE conf. on computer vision and pattern recognition (CVPR), 1994; pp. 595-600. |
Simakov et al: “Summarizing visual data using bidirectional similarity”, in IEEE conf. on computer vision and pattern recognition (CVPR), 2008, pp. 1-8. |
Wang et al: “Improving intra coding in H264—AVC by image epitome”, in advanced in multimedia information processing (PCM), 2009, pp. 190-200. |
Wang et al: “Intra coding and refresh based on video epitomic analysis”, in IEEE international conference on multimedia and expo (ICME), 2010, pp. 452-455. |
Alain et al: “Clustering based methods for fast epitome generation” EUSIPCO 2014; pp. 1-5. |
Search Report Dated Apr. 15, 2014. |
Number | Date | Country |
---|---|---|
20150215629 A1 | Jul 2015 | US |