The present invention relates to a method for encoding a digital image, or a sequence of digital images, for the compression thereof.
For the sake of clarity, it is recalled that a raw image is usually represented digitally by data called “pixels” arranged in a matrix, that is, along the width and the height of the image, each pixel associating a point of the image with at least one value of gray level or color intensity (generically referred to as color intensity in the following description). This representation of a raw image is not economical in terms of quantity of data, and it is often desired to encode the raw image in order to represent it digitally in a more compact data format, for example for storage or transmission.
Known methods for encoding digital images are based on a combination of techniques that operate on an image (“intra-frame encoding”) and on the relationships that exist between several successive images (“inter-frame encoding”).
These techniques usually decompose a digital image into macroblocks of pixels and apply representation space transformations to these macroblocks, such as Fourier, wavelet or discrete cosine transformations, in order to retain only the perceptually significant coefficients.
The document “Representing Images in 200 Bytes: Compression via Triangulation” by D. Marwood et al., IEEE ICIP 2018 (arXiv:1809.02257), offers another encoding approach in which a digital image is broken down into triangles. Each vertex of a triangle corresponds to a pixel of the original raw image, from which it inherits the color intensity. By exploiting the properties of a standard tiling of the image, for example a Delaunay triangulation, the encoded image can be represented simply by the list of vertices. During decoding, the triangles are reconstituted from the list of vertices, and the image is then recomposed, pixel by pixel, by interpolation of the color intensity between the vertices of the triangles. The problem in this approach is to choose the number of triangles and to position their vertices so as to minimize, at a given compression rate, the degradation of the reconstituted image with respect to the original raw image.
It is noted that encoding an image by triangulation is particularly advantageous when the image is intended to be decoded by a modern computer. Such a computer typically has a graphics card or chip that is designed to process triangles very efficiently in a pipeline in order to reconstruct images. From this point of view, a triangulation decoding method can be implemented very easily and efficiently.
Intuitively, it is understood that according to this encoding approach, we seek to place a greater density of triangles in the richest areas of the image, in order to be able to recompose the image in these areas with greater finesse. Pragmatically, the aforementioned article proposes several methods, systematic or stochastic, for encoding a digital image by triangulation for the compression thereof.
The present invention proposes an alternative method of encoding by triangulation of a digital image, which is based on the principles of algorithmic topology. This scientific and technical branch has been the subject of numerous publications, and reference can for example be made to the work “Computational Topology: An Introduction” by H. Edelsbrunner and J. Harer, AMS Press, 2009.
In order to achieve one of these aims, the object of the invention proposes a method for encoding a digital image for the compression thereof, the digital image being defined as a point cloud associating a set of N pixels, designated as vertices, with a scalar intensity value, the method aiming to establish triangulation vertices of the digital image and comprising the following steps:
According to other advantageous and non-limiting features of the invention, taken individually or in any technically feasible combination:
According to another aspect, the invention proposes a computer program comprising instructions suitable for implementing each of the steps of the encoding method that has just been presented, when the program is executed on a computer.
According to yet another aspect, the invention proposes an encoder configured to implement the encoding method.
Other features and advantages of the invention will become apparent from the detailed description of the invention that follows with reference to the appended figures, in which:
The present invention proposes a method for encoding a digital image for the compression thereof. More specifically, the encoding aims to establish triangulation data of the image and to this end implements the principles of algorithmic topology, and in particular the technique of persistent homology.
Before going into the detail of this method, it is specified that it is intended to be implemented by an encoder, which can be hardware or software.
As shown in
The encoder ENC outputs triangulation data D, represented here in the form of a computer file. It can be a list of triangles respectively defined by the coordinates of the 3 vertices in the original raw image I (for example in pixel index i,j), each vertex being associated with the digital color intensity data (or with the digital gray level data). Advantageously, however, the encoder relies on a standard tiling, for example a Delaunay triangulation, and the triangulation data can then simply be formed from the list S of vertices {s1, s2, . . . , sL} as shown in
The triangulation data D may themselves undergo digital data compression encoding, for example lossless encoding of the ANS (Asymmetric Numeral System) type. This operation (not shown in
To exploit the encoded image, the triangulation data is supplied to a decoder DEC, which, like the encoder ENC, can take a hardware or software form. If necessary, this decoder decompresses the digital data received in order to restore the triangulation data D. When these data D consist of a simple list S of vertices {s1, s2, . . . , sL}, the decoder DEC reconstructs the triangles according to the selected tiling method, a Delaunay triangulation in the example taken above.
To reconstitute an image I′, the decoder DEC recalculates each of its pixels by interpolation from the triangles reconstituted from the list S of vertices {s1, s2, . . . , sL} and the digital color intensity data associated with each of these vertices.
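The interpolation performed by the decoder can be illustrated by a minimal sketch, assuming a linear (barycentric) interpolation inside each reconstituted triangle. The description does not impose a particular interpolation scheme, and the function names below are hypothetical.

```python
def barycentric_weights(p, a, b, c):
    """Return the barycentric weights of point p in triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def interpolate_pixel(p, triangle, intensities):
    """Interpolate the intensity at p from the three vertex intensities.

    Assumption: linear interpolation. Returns None when p lies outside
    the triangle, so that the caller can try another triangle.
    """
    weights = barycentric_weights(p, *triangle)
    if min(weights) < -1e-9:
        return None
    return sum(w * f for w, f in zip(weights, intensities))
```

For a color image, the same weights would be applied to each color component in turn.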
General Principle of the Encoding Method
As specified in the introduction of this application, the present invention proposes an alternative method of encoding by triangulation of a digital image I, which is based on the principles of algorithmic topology. The steps of this method are shown in
This intensity can be a gray level or a luminance level. When the original raw image I is in color, for example with three color levels defined by RGB components for each pixel pij, this image can be processed, during a preliminary processing step PRE, in order to combine the values according to the RGB components and to associate, with the pixel pij, a scalar value, called scalar intensity value fij in this application. Preferably, this combination ensures that two close colors in the original raw image I are transformed into equally close scalar intensity values. A detailed example of such a combination will be given later in the application.
In the remainder of this description, the points of the image will be referred to as vertices, in order to conform to the terminology generally employed in the field of algorithmic topology.
In a first step 1 of an encoding method according to the invention, a filtration of the point cloud is formed. This filtration is composed of a series of simplicial complexes Ki, each simplicial complex corresponding to a data structure associating a plurality of vertices vk with one another.
By definition, in a filtration, the simplicial complexes are ordered relative to one another, such that a complex of rank i is included in the complexes of ranks greater than i. In other words, the vertices vk associated with each other in a simplicial complex Ki having a determined rank i are also associated with each other in the simplicial complexes respectively having ranks higher than the determined rank i.
Using a filtration, the digital image I to be encoded (the point cloud) can be decomposed into nested subsets of the cloud. This decomposition can be done in many ways. It may be, for example, a so-called “Lower Star” filtration, a particularly easy-to-implement example of which will be given in a second part of this description. However, this decomposition is not arbitrary: it must ensure that each subset has the properties of a simplicial complex, so that the tools available in the field of algorithmic topology can be used. It may also be a Čech filtration or a Vietoris-Rips filtration.
In a second step 2 of an encoding method, the aim is to identify and characterize topological structures within the filtration. These topological structures, in the case of an image and therefore in the 3-dimensional universe of the point cloud, can for example correspond to a connected component, i.e. a cluster of points in the cloud, or to a hole, i.e. an absence of points in a particular area of the cloud.
It is thus sought to determine, in the filtration, the rank from which such topological structures appear and the rank from which these structures disappear. It is also sought to determine the appearance and disappearance vertices of these structures in the filtration. The idea underlying this analysis is that a topological structure that exhibits a relatively long lifetime in the filtration (i.e. the difference between its appearance and disappearance ranks in the filtration is relatively large) is a remarkable topological structure, which “structures” the image. This topological structure can be approximated efficiently during the decoding phase, by interpolation between the vertices transmitted. Conversely, a topological structure having a relatively short lifetime in the filtration is a topological structure of less significance, a detail of the image.
One of the principles underlying the encoding method of the invention is to retain only the remarkable topological structures so as to provide a compressed version of the digital image I. Topological structures of less significance can be omitted without excessively degrading the perceived quality of the image.
To carry out this analysis, during a second step 2, a method according to the invention processes at least part of the simplicial complexes K in order to identify persistence pairs (vic, vid), and to form a list L of persistence pairs {(vic, vid), . . . }. A persistence pair corresponds to a pair of vertices composed of a first and a second vertex vic, vid, the first vertex vic corresponding to the vertex at which a structure appears in the filtration and the second vertex vid of the pair corresponding to the vertex at which the topological structure disappears in the filtration.
The lifetime of this persistence pair (vic, vid) corresponds to the difference between the rank ic of the simplicial complex Kic in which the topological structure appears and the rank id of the simplicial complex Kid in which the topological structure disappears. The method according to the invention uses the filtration persistence pairs identified during the second step 2, during a so-called decimation step 3.
During this step, for each persistence pair (vic, vid) identified in the filtration during the second step 2, the lifetime of the topological structure is calculated, that is to say, the difference between the rank id of the second vertex vid and the rank ic of the first vertex vic. Only a part of the persistence pairs from the list L created during the second step 2 is retained in a restricted list L′, this part being composed of the pairs having the longest lifetimes.
The selection of persistence pairs according to this criterion can be done in multiple ways. It is for example possible to choose to retain a predetermined number of pairs, the “P” pairs with the longest lifetimes. Alternatively, it is possible to choose to retain a predetermined percentage of these pairs, the “P %” of pairs having the longest lifetime. It is also possible to choose the pairs having a lifetime greater than a predetermined threshold.
According to the invention, the persistence pairs retained in the restricted list L′, that is to say, the appearance vertices vic and the disappearance vertices vid of the persistence pairs that constitute the restricted list L′, form the triangulation vertices S. The more severe the decimation step, i.e. the smaller the number of persistence pairs retained in the restricted list L′, the greater the image compression rate, to the detriment of the perceived quality of the reconstructed image. In other words, the invention proposes to retain as triangulation vertices {s1, s2, . . . , sL} of an image I, the appearance vic and disappearance vid vertices of the topological structures having the longest lifetimes in the filtration.
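The three selection criteria mentioned above, and the formation of the triangulation vertices from the restricted list, can be sketched as follows. This is an illustrative sketch only: persistence pairs are assumed to be given as tuples of ranks (ic, id), and the function and parameter names are hypothetical.

```python
def decimate(pairs, count=None, fraction=None, threshold=None):
    """Keep the persistence pairs with the longest lifetimes.

    pairs: list of (ic, id) ranks, where id - ic is the lifetime.
    Exactly one of count, fraction or threshold must be given.
    """
    if sum(x is not None for x in (count, fraction, threshold)) != 1:
        raise ValueError("give exactly one selection criterion")
    # Longest lifetimes first.
    ranked = sorted(pairs, key=lambda pr: pr[1] - pr[0], reverse=True)
    if threshold is not None:
        return [pr for pr in ranked if pr[1] - pr[0] > threshold]
    if fraction is not None:
        count = round(len(ranked) * fraction)
    return ranked[:count]


def triangulation_vertices(restricted):
    """Collect the appearance and disappearance ranks of the retained pairs."""
    return sorted({rank for pair in restricted for rank in pair})
```

For example, `decimate(pairs, fraction=0.11)` would retain roughly the 11% of pairs with the longest lifetimes.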
As has already been said, these topological structures are those which best structurally define the image, and the invention therefore proposes to retain the appearance and disappearance vertices of these structures as triangulation vertices.
To encode the image I for the compression thereof, a computer file D therefore contains only the pixels pij of the original raw image I corresponding to the triangulation vertices S retained, that is to say, the indices i, j and the intensity level pij of these pixels. This file D can be recorded on a medium or transmitted directly.
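By way of illustration, the content of such a file D could be serialized as sketched below. The binary layout (a 32-bit vertex count followed by fixed-size records holding the indices i, j and three 8-bit color components) is purely an assumption made for this example; the description does not specify a file format, and the names `pack_vertices` and `unpack_vertices` are hypothetical.

```python
import struct

# Hypothetical record layout: row index, column index (unsigned 16-bit each)
# followed by three 8-bit color components, little-endian, no padding.
RECORD = struct.Struct("<HHBBB")


def pack_vertices(vertices):
    """Serialize [(i, j, (r, g, b)), ...] with a 32-bit vertex count header."""
    body = b"".join(RECORD.pack(i, j, *rgb) for i, j, rgb in vertices)
    return struct.pack("<I", len(vertices)) + body


def unpack_vertices(data):
    """Inverse of pack_vertices."""
    (count,) = struct.unpack_from("<I", data, 0)
    out = []
    for k in range(count):
        i, j, r, g, b = RECORD.unpack_from(data, 4 + k * RECORD.size)
        out.append((i, j, (r, g, b)))
    return out
```

With this assumed layout each retained vertex costs 7 bytes before any further lossless compression of the file.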
As mentioned previously, this computer file D may be sufficient on its own, and the decoder DEC will construct the triangles from the list S of vertices provided, according to a pre-established or arbitrarily chosen protocol. It can thus be a Delaunay triangulation. Alternatively, the file D may comprise a section designating the triangulation method to be employed.
One can also choose to add pixels arbitrarily in the file D to favor a homogeneous or regular triangulation of the image: one can thus choose to add the pixels forming the 4 corners of the image I, or pixels distributed over the perimeter of the image I or even as a grid on the image.
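Adding the four corner pixels can be sketched as follows, assuming that vertices are handled as (i, j) index pairs with i the row and j the column. The function name is hypothetical.

```python
def add_corners(vertices, height, width):
    """Return the vertex list extended with the four image corners.

    vertices: list of (i, j) pixel indices; corners already present are
    not duplicated.
    """
    corners = [(0, 0), (0, width - 1), (height - 1, 0), (height - 1, width - 1)]
    seen = set(vertices)
    return list(vertices) + [c for c in corners if c not in seen]
```

Including the corners ensures that the triangles cover the whole image rectangle, so that every pixel can be interpolated at decoding.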
Alternatively again, the method may comprise an additional step 4 of searching for a triangulation from the selected vertices S. Indeed, there are very many ways to form triangles from a simple list S of vertices. Some of these solutions do not necessarily lead to a high-quality reconstituted image I′, or do not necessarily lead to a reasonable decoding processing time. It is therefore possible in some cases to provide a triangulation step 4 on the encoder side seeking to establish a list of triangles T or information allowing such a list to be established, leading to a satisfactory image quality or decoding speed.
In the most complete case, this triangulation step 4 of a method according to the invention provides a list T of triangles with a favorable triangulation (in processing time, in image quality) and the compressed file D of the image is then made up of triangles defined by their vertex pixels (i.e. the vertex coordinates and color information).
Alternatively, the triangulation step 4 establishes a parameter of a triangulation method known to the decoder, and the computer file D of the compressed image then contains the value of this parameter so that it can be used by the decoder DEC.
In another variant, the triangulation step 4 provides a summary indication that can be used on the decoder DEC side to guide this triangle reconstruction work. It may for example be a matter of recommending to the decoder DEC to form a predetermined number of triangles, independently of the triangulation method implemented.
We will now give a detailed example of the method of encoding an image I that has just been presented in a general manner. This example also forms a preferred embodiment of this method.
An RGB color image shown in
Scalar Intensity Value
To define a scalar value fij at each pixel pij of the image, a first Sobel filter Gx is applied to the image, the filter being defined by the matrix of
The filter is applied to the color vectors of the image pixels, and therefore provides a new vector for each pixel pij, computed from the color vector of this pixel and those of the adjacent pixels pi−1j and pi+1j.
A second Sobel filter Gy is similarly applied, which is defined by the transposed matrix of Gx that is applied to the color vectors of the pixel pij and of its adjacent pixels according to the index j.
The scalar value fij associated with a pixel pij is defined by the sum of the square norm of the vector Gx·(pi−1j, pij, pi+1j) and the square norm of the vector Gy·(pij−1, pij, pij+1) associated with the pixel pij:
$$f_{ij} = \left\lVert G_x \cdot (p_{i-1,j},\, p_{i,j},\, p_{i+1,j}) \right\rVert^2 + \left\lVert G_y \cdot (p_{i,j-1},\, p_{i,j},\, p_{i,j+1}) \right\rVert^2$$
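The computation of the scalar intensity values can be sketched as follows. The coefficients of the filter are given in a figure that is not reproduced here, so the 3-tap kernel (−1, 0, 1) used below is an assumption, as is the handling of image borders by clamping the indices. The function name is hypothetical.

```python
KERNEL = (-1.0, 0.0, 1.0)  # assumed 3-tap kernel; the source gives it in a figure


def scalar_intensity(image):
    """Return f[i][j] = |Gx.(p[i-1][j], p[i][j], p[i+1][j])|^2 + |Gy.(...)|^2.

    image: list of rows, each pixel being a tuple of color components.
    Borders are handled by clamping indices (an assumption).
    """
    h, w = len(image), len(image[0])

    def sq_norm_of_filtered(pixels):
        # Weighted sum of the three color vectors, then squared Euclidean norm.
        vec = [sum(k * p[ch] for k, p in zip(KERNEL, pixels))
               for ch in range(len(pixels[0]))]
        return sum(x * x for x in vec)

    f = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            up, down = image[max(i - 1, 0)][j], image[min(i + 1, h - 1)][j]
            left, right = image[i][max(j - 1, 0)], image[i][min(j + 1, w - 1)]
            f[i][j] = (sq_norm_of_filtered((up, image[i][j], down))
                       + sq_norm_of_filtered((left, image[i][j], right)))
    return f
```

With such a kernel, uniform areas of the image receive a value of zero and color edges receive large values, so that two close colors lead to close scalar values.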
Order of Vertices in a Vertex Table
The term vertex v now denotes each of the 48,400 pixels pij of the image. Each vertex v can therefore be associated with its scalar intensity value fij, to form a point cloud. A table of the 48,400 vertices is formed by placing these vertices v in ascending order of their scalar intensity values fij.
It is of course possible for two vertices vk, vk′ to have identical intensity values. In this case, the following rule can be applied: the vertex vk associated with a pixel pij of indices i and j is positioned in the vertex table upstream of a vertex vk′ associated with a pixel pi′j′ of indices i′ and j′, if i<i′ or, in the case where i=i′, if j<j′. Otherwise, the vertex vk′ is positioned upstream of the vertex vk in the vertex table. Any other ordering rule may of course be suitable.
Consequently, in the vertex table, a vertex of rank k, denoted vk, is associated with a scalar intensity value fk less than or equal to the scalar intensity value fk′ of a vertex vk′ of rank k′ greater than rank k.
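The construction of the vertex table, including the rule applied to vertices of identical intensity, can be sketched as follows. The scalar values are assumed to be stored in a two-dimensional list f indexed as f[i][j]; the function name is hypothetical.

```python
def build_vertex_table(f):
    """Return [(i, j), ...] sorted by ascending f[i][j].

    Ties are broken by the index i, then by the index j, following the
    ordering rule of the description.
    """
    coords = [(i, j) for i in range(len(f)) for j in range(len(f[0]))]
    return sorted(coords, key=lambda ij: (f[ij[0]][ij[1]], ij[0], ij[1]))
```

The rank k of a vertex is then simply its position in the returned table.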
Neighborhood of a Vertex
The filtration algorithm of the present example uses a neighborhood relation between vertices that can be defined freely, provided it is compatible with the objects used in algorithmic topology, and in particular the simplicial complexes. This neighborhood forms an equivalence relation on the set formed from the vertex table.
In the case of this preferred embodiment, a first vertex v1, corresponding to a pixel of indices i1, j1 of the digital image, is in the neighborhood of a second vertex v2, corresponding to a pixel of indices i2, j2 of the digital image, if i1=i2+1 and/or if j1=j2+1. This neighborhood relation is illustrated in
Filtration
It will be recalled that filtration aims to form a series of simplicial complexes K, each simplicial complex K corresponding to a data structure associating a plurality of vertices v with one another. In the preferred embodiment, the filtration is a “Lower Star” filtration and as many simplicial complexes K are constructed as there are vertices in the vertex table (i.e. as many as the number of pixels N in the image). This filtration has the advantage of being linear in computational complexity with the number of points in the cloud. It avoids building the simplicial complex of the point cloud itself, and establishes the simplicial complexes of the filtration directly. It naturally brings out the topological invariants associated with the simplicial complexes of the filtration. Each simplicial complex K is composed of at least one class C, each class C grouping together vertices linked together by the neighborhood relation. They are therefore equivalence classes.
In an initialization phase of this first filtration step of the method, an iteration index i and a class index c are initialized to 0, and a starting simplicial complex K0 is initialized to the empty set.
Then, the sequence of the following operations is repeated until the iteration index i reaches the number of vertices, here 48,400:
These vertices of the simplicial complex Ki−1 are identified in the neighborhood of the vertex vi, and depending on the case, the simplicial complex K of rank i is defined as follows:
Note that each simplicial complex is composed of at least one class, and generally of a plurality of classes. These classes group together the vertices linked by the neighborhood relation. Two classes of vertices group together vertices that are distinct from each other, each of these classes forming a kind of topological structure of connected component type. The number of classes in a simplicial complex gives, in terms of algorithmic topology, the Betti number of order 0, that is to say, a topological invariant.
Second Step of Establishing Persistence Pairs
From the filtration just constructed, persistence pairs can easily be determined as follows.
We consider case c mentioned above, that is to say, on the occasion of an iteration of index i, the vertices forming a plurality of classes have been grouped together in the class of the lowest rank of this plurality.
A persistence pair can then be established as the pair formed by:
A persistence pair therefore corresponds to the pair of vertices comprising the first vertex vic at which a topological structure (a class) is created in the filtration, and comprising the second vertex vid at which this topological structure (a class) disappears in the filtration.
The persistence pairs can be established at the end of the first step leading to the creation of the filtration, but more simply they can be established at each iteration of the sequence making up this first step, insofar as case c occurs.
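The sweep over the vertex table and the establishment of the persistence pairs at each merge can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure of the description itself: it assumes an 8-connected pixel neighborhood (the description defines its neighborhood with reference to a figure), it uses a union-find forest to keep track of the classes, and it pairs the vertex that created a class with the vertex at which that class is absorbed into an older class. The function name and the input grid f are hypothetical.

```python
def persistence_pairs(f):
    """Sweep the vertices by ascending value and record class merges.

    Returns (pairs, n_classes): pairs is a list of (ic, id) ranks, 1-based,
    where ic is the rank of the vertex that created a class and id the rank
    of the vertex at which that class was merged into an older class.
    """
    h, w = len(f), len(f[0])
    order = sorted(((i, j) for i in range(h) for j in range(w)),
                   key=lambda ij: (f[ij[0]][ij[1]], ij))
    rank = {v: k for k, v in enumerate(order, start=1)}
    parent = {}   # union-find forest over the vertices already inserted
    birth = {}    # class root -> rank of the vertex that created the class

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    pairs = []
    for v in order:
        i, j = v
        # Classes of the already inserted neighbors (assumed 8-connectivity).
        roots = {find((i + di, j + dj))
                 for di in (-1, 0, 1) for dj in (-1, 0, 1)
                 if (di or dj) and (i + di, j + dj) in parent}
        parent[v] = v
        if not roots:                       # no neighbor yet: new class
            birth[v] = rank[v]
            continue
        oldest = min(roots, key=birth.get)  # surviving class
        parent[v] = oldest                  # vertex joins the surviving class
        for r in roots - {oldest}:          # several classes: merge them
            pairs.append((birth.pop(r), rank[v]))
            parent[r] = oldest
    return pairs, len(birth)
```

The second returned value is the number of classes remaining at the end of the sweep. The class created first never disappears in this sketch and therefore yields no pair.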
By way of illustration of these mechanisms,
In the list under the simplified image of this figure, the algorithm presented has been run from the initialization step for which i=0, up to a terminal step for which i=10.
At each iteration i, the vertex according to one of cases a, b or c described above is added to the simplicial complex Ki.
We observe that up to iteration 8, we create new classes c1, c2, c3 using the vertices vi, or we add this vertex to a pre-existing class according to one of case a or case b.
At iteration 9, vertex v9, in the neighborhood of vertices v5 and v6, contained in classes c2 and c3, respectively, leads to the execution of case c of the first step. Classes c2 and c3 merge within class c2, in which vertex v9 is also added. The disappearance of a class leads to the execution of the second step of the method, and to the creation of a persistence pair whose first vertex is v2 (creation of the class with the lowest rank 2) and whose second vertex is v9 (disappearance of class c3). The lifetime of the associated topological structure is therefore 7.
The same phenomenon is repeated during the last iteration.
Returning to the general description of the processing carried out on the digital image of
Decimation Step
This step requires determining the lifetime of a topological structure. This lifetime is associated with a persistence pair (vic, vid) and is calculated as the difference between the disappearance rank id of the persistence pair and the appearance rank ic of the persistence pair.
As indicated in the general description of the invention, this lifetime is determined for each persistence pair. Then, some of the persistence pairs exhibiting the longest lifetimes are retained in the restricted list. In this example embodiment, about 11% of the pairs whose lifetime is the longest have been retained.
Triangulation vertices are defined as all the vertices that make up the persistence pairs in the restricted list.
An advantageous variant of this preferred embodiment is now presented. It will first be noted that the embodiment described so far allows identification of topological structures of the connected component type, but not of those of the hole type. To compensate for this, the method provides for repeating the first step of filtration and the second step of determining the persistence pairs, using the negated scalar intensity values −fij associated with each vertex. It can be shown that in this way, it is possible to identify topological structures of the hole type.
More precisely, it suffices to repeat the steps presented above, going through the vertex table in a descending manner, and to arrive at the following sequence:
In an initialization phase, the iteration index i is initialized to N+1 and the class index c to 0. A starting simplicial complex KN is initialized to the empty set.
Then, the sequence of the following operations is repeated until the iteration index i reaches 1:
These vertices of the simplicial complex Ki+1 are identified in the neighborhood of the vertex vi, and depending on the case, the simplicial complex Ki of rank i is defined as follows:
As explained previously, this last case c′ can be followed by the second step aiming at establishing the persistence pairs. The persistence pairs determined during this “inverse” calculation sequence are grouped together with the pairs identified during the “direct” calculation sequence presented previously, to form the table of persistence pairs subjected to the decimation step. These two calculation sequences need not be run in any particular order, and the “inverse” calculation sequence could naturally precede the “direct” calculation sequence.
To complete the description of this example,
An encoding method according to the invention can be implemented by a hardware device (an encoder) or by software. In the case of software, the method is implemented by a computer, via a computer program consisting of instructions adapted to implement at least each of the steps of this method.
When the encoding method is implemented by computer, said method organizes the data manipulated by the program instructions in the form of computer data structures, which have been designated in the present application by the expressions “table” or “list.” Of course, these designations are not intended to limit how the data is actually organized by the computer program. “Table” or “list” therefore means any data structure allowing access to data recorded in a storage space of the computer. The person skilled in the art can choose the data structure according to the need or the computing environment available to them. It may thus consist of graphs, matrices, lists and/or tables allowing direct access to a data item or access to a pointer to this data item.
As will be readily understood, the invention is not limited to the described embodiment, and it is possible to add variants thereto without departing from the scope of the invention as defined by the claims.
In particular, it will be possible to choose to apply an encoding method according to the invention on a complete image as was presented in the previous detailed example, but alternatively this encoding may be carried out on macroblocks of the image, for example 32 pixels by 32 pixels. This approach has the advantage of allowing the parallelization of the encoding processing of the macroblocks and therefore of reducing the total processing time.
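The division of the image into macroblocks can be sketched as follows. The generator name and its return convention are hypothetical, and the treatment of edge blocks smaller than the nominal size is an assumption.

```python
def macroblocks(height, width, size=32):
    """Yield (top, left, bottom, right) bounds of tiles covering the image.

    bottom and right are exclusive; edge tiles may be smaller than size.
    """
    for top in range(0, height, size):
        for left in range(0, width, size):
            yield top, left, min(top + size, height), min(left + size, width)
```

Each tile could then be encoded independently, for example by a separate worker process, the vertex indices being offset by (top, left) when the results are gathered.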
Number | Date | Country | Kind |
---|---|---|---|
1914826 | Dec 2019 | FR | national |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/FR2020/052464 | 12/16/2020 | WO |