Information
-
Patent Application
-
20030210824
-
Publication Number
20030210824
-
Date Filed
May 05, 2003
-
Date Published
November 13, 2003
Abstract
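The invention concerns a method for compressing data, in particular images, by transform, in which method this data is projected onto a base of localized orthogonal or biorthogonal functions, such as wavelets. To quantize each of the localized functions with a quantization step that enables an overall set rate Rc to be satisfied, the method includes the following steps:
a) a probability density model of coefficients in the form of a generalized Gaussian is associated with each subband,
b) the parameters α and β of this density model are estimated while minimizing the relative entropy, or Kullback-Leibler distance, between this model and the empirical distribution of coefficients of each subband, and
c) from this model, for each subband, an optimum quantization step is determined such that the rate allocated is distributed in the various subbands and such that the total distortion is minimal.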
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is based on French Patent Application No. 02 05 724 filed May 7, 2002, the disclosure of which is hereby incorporated by reference thereto in its entirety, and the priority of which is hereby claimed under 35 U.S.C. §119.
BACKGROUND OF THE INVENTION
[0002] 1. Field of the Invention
[0003] The invention relates to a method for optimal allocation of the bit rate of a data compressor by transform.
[0004] It concerns more specifically such an allocation process for a compressor by orthogonal or biorthogonal transformation, in particular a wavelet transformation, used in combination with a scalar quantizer and a lossless entropy coder.
[0005] Hereinafter, reference will mainly be made to a wavelet transformation leading to a Multiresolution Analysis (MRA) implemented using digital filters intended to perform a decomposition into subbands. However, the invention is not limited to a wavelet transformation.
[0006] It is recalled that an MRA consists in starting from an image in the space domain with a set of image elements, or pixels, and in decomposing this image into subbands in which the vertical, horizontal and diagonal details are represented. Thus there are three subbands per resolution level as indicated in FIG. 1.
[0007] FIG. 1 illustrates an MRA on three resolution levels. In this representation, a subband is represented by a block. Thus, the image is first divided into four blocks with three subbands W1,1, W1,2 and W1,3 and a low-frequency representation W1,0 of the initial image. Subband W1,1 contains the horizontal wavelet coefficients; subband W1,2 contains the vertical wavelet coefficients; subband W1,3 contains the diagonal wavelet coefficients and the block W1,0 is called “summary” or low frequencies.
[0008] At the next resolution level, the block W1,0 is itself divided into four blocks (one summary and three subbands) W2,0, W2,1, W2,2 and W2,3 and, finally, the block W2,0 is divided into four blocks W3,0, W3,1, W3,2 and W3,3 for the third resolution level. Naturally, a finer division (by increasing the number of resolution levels) or a coarser division (by reducing the number of resolution levels) can be carried out.
[0009] 2. Description of the Prior Art
[0010] It is known that the wavelet transform is well suited to image compression since it provides strong coefficients when the image exhibits strong local variations in contrast, and weak coefficients in the areas in which the contrast varies slightly or slowly.
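[0011] It is also known that the probability distribution of a subband can be modeled by a two-parameter unimodal function, centered at the origin, of the generalized Gaussian type:
$$G_{\alpha\beta}(x) = A_{\alpha\beta}\, e^{-|x/\alpha|^{\beta}}$$
[0012] where
$$A_{\alpha\beta} = \frac{\beta}{2\alpha\,\Gamma(1/\beta)}$$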
[0013] For certain applications, particularly when the compression data must be transmitted over transmission channels imposing a bit rate, it is necessary to quantize the subband coefficients in an optimal manner by minimizing the total distortion while satisfying a set bit rate.
[0014] The known optimum rate allocation methods propose, in general, performing a digital optimization process based on the minimization of a functional linking rate and distortion and controlled by a Lagrange parameter. In this case, an iterative optimization algorithm is employed which is generally very costly in calculation time and therefore cannot be used for real-time applications or for applications with limited calculation resources. Only simplification of these methods enables higher speed, but the price is a degradation in performance.
SUMMARY OF THE INVENTION
[0015] The invention enables an effective and precise optimization of rate allocation, without making use of a conventional Lagrangian optimization scheme.
[0016] To implement an optimum rate allocation, the method in accordance with the invention includes the following steps:
[0017] a) the image (or data) to be compressed is projected onto a base of localized orthogonal or biorthogonal functions, such as wavelets,
[0018] b) a set rate Rc is chosen, representing the total number of bits that can be used to code all coefficients of the transformed image,
[0019] c) to allocate this number of bits to the coefficients which are going to be quantized:
[0020] c.1 a probability density model in the form of a generalized Gaussian is associated with each of the subbands,
[0021] c.2 the parameters of this generalized Gaussian are estimated while minimizing the relative entropy, or Kullback-Leibler distance, between this generalized Gaussian and the empirical distribution of coefficients of each subband, and an optimum quantization step is then deduced therefrom, which is such that the total allocated rate Rc is distributed in the various subbands while minimizing the total distortion.
[0022] Generalized Gaussians are described in the article by S. MALLAT: “A theory for multiresolution signal decomposition: the wavelet representation”, published in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11 (1989) No. 7, pages 674-693.
[0023] As a preference, to determine the optimum quantization step for each subband, from the parameters of the generalized Gaussian, the graph of rate as a function of the quantization step and the graph of distortion as a function of the quantization step are determined, and the graphs of rate and distortion are tabulated and, from these tabulated graphs, the optimum quantization step is deduced.
[0024] Minimization of the Kullback-Leibler distance for estimating the parameters of the generalized Gaussian model ensures a minimization of the cost of coding in accordance with information theory.
[0025] Thus, the invention concerns a method for compressing data, in particular images, in which method this data is projected onto a base of localized orthogonal or biorthogonal functions, and which method, to quantize each of the localized functions with a quantization step that enables an overall set rate Rc to be satisfied, includes the following steps:
[0026] a) a probability density model of coefficients in the form of a generalized Gaussian is associated with each subband,
[0027] b) the parameters α and β of this density model are estimated while minimizing the relative entropy, or Kullback-Leibler distance, between this model and the empirical distribution of coefficients of each subband, and
[0028] c) from this model, for each subband, an optimum quantization step is determined such that the rate allocated is distributed in the various subbands and such that the total distortion is minimal.
[0029] As a preference, for each subband, the graphs of rate R and distortion D are deduced, from the parameters α and β, as a function of the quantization step and these graphs are tabulated to determine said optimum quantization step.
[0030] The transformation is, for example, of the wavelet type.
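[0031] In this case, according to an embodiment, to determine the parameter β of the generalized Gaussian associated with each subband, the following expression is minimized:
$$H(p_1 \,\|\, G_{\alpha\beta}) = \log\frac{2\,\Gamma(1/\beta)}{\beta} + \frac{1}{\beta}\left(1 + \log\left(\beta\, W_{jk}^{\beta}\right)\right)$$
[0032] in which formula the first term represents said relative entropy, and in which
$$W_{jk}^{\beta}$$
[0033] has the value:
$$W_{jk}^{\beta} = \frac{1}{n_{jk}} \sum_{n,m} \left|W_{jk}[n,m]\right|^{\beta}$$
[0034] Wjk[n,m] being a coefficient of a subband, and njk being the number of coefficients in the subband of index j,k.
[0035] As a preference, the parameter α of the generalized Gaussian for the corresponding subband j,k is determined by the following formula:
$$\alpha = \left(\beta\, W_{jk}^{\beta}\right)^{1/\beta}$$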
[0036] The tabulation of the distortion values is advantageously carried out for a sequence of values of the parameter β.
[0037] Similarly, the tabulation of the rates R is advantageously carried out for a sequence of values of the parameter β.
[0038] The sequence of values of the parameter β is for example:
$$\left\{ \tfrac{1}{2},\ \tfrac{1}{\sqrt{2}},\ 1,\ \sqrt{2},\ 2 \right\}$$
[0039] As a preference, the bit budget is distributed to each of the subbands according to their ability to reduce the distortion of the compressed image.
[0040] In one embodiment, a bit budget corresponding to the highest tabulated quantization step is assigned to each subband and the remaining bit budget is then cut into individual parts that are gradually allocated to the localized functions having the greatest ability to make the total distortion decrease, this operation being repeated until the bit budget is exhausted.
[0041] According to one embodiment, the localized functions and the quantization step for each subband and the parameters of the density model are coded using a lossless entropy coder.
[0042] Other features and advantages of the invention will become apparent with the description of some of its embodiments, this description being provided with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0043] FIG. 1, already described, shows the decomposition into subbands of an image on three resolution levels.
[0044] FIG. 2 is a diagram showing graphs of rate for an example application of the method according to the invention.
[0045] FIG. 3 is a diagram similar to that of FIG. 2 for graphs of distortion.
[0046] FIG. 4 is a schematic diagram of a device implementing the method according to the invention to which a control means has been added, which is intended to manage the filling of a buffer memory the role of which is to deliver a constant number of bits to a transmission channel or to any other device requiring a fixed rate.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0047] The example of the invention that will be described below concerns an allocation of rate by subbands. It is carried out in three steps:
[0048] First Step: Wavelet Transformation of the Image
[0049] This transformation provides a family of wavelet coefficients Wjk[n,m] distributed in various subbands, where j denotes the scale of the subband and k its orientation (FIG. 1). The number of indexes n,m for which wavelet coefficients are thus defined depends on the scale of the subband. It is therefore noted njk.
[0050] Consider, for example, an image of size 512×512 pixels of a terrestrial landscape obtained by an observation satellite. The size of a subband j,k is in this case $n_{jk} = (512/2^{j})^{2}$.
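The first step can be sketched as follows. This is only an illustration: the description does not name the wavelet filter, so an orthonormal Haar filter is assumed here, the function names `haar_level` and `mra` are hypothetical, and which detail block is labelled horizontal or vertical is a convention chosen for the sketch. A random 512×512 array stands in for the satellite image.

```python
import numpy as np

def haar_level(img):
    """One level of an orthonormal 2D Haar transform (assumed filter).

    Returns (summary, horizontal, vertical, diagonal), each half the size.
    """
    a = img[0::2, 0::2]
    b = img[0::2, 1::2]
    c = img[1::2, 0::2]
    d = img[1::2, 1::2]
    summary = (a + b + c + d) / 2.0
    horiz = (a - b + c - d) / 2.0
    vert = (a + b - c - d) / 2.0
    diag = (a - b - c + d) / 2.0
    return summary, horiz, vert, diag

def mra(img, levels=3):
    """Return {(j, k): subband}; k = 1..3 are details, k = 0 the last summary."""
    bands = {}
    summary = np.asarray(img, dtype=float)
    for j in range(1, levels + 1):
        summary, h, v, d = haar_level(summary)
        bands[(j, 1)], bands[(j, 2)], bands[(j, 3)] = h, v, d
    bands[(levels, 0)] = summary
    return bands

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(512, 512)).astype(float)  # stand-in image
bands = mra(image, levels=3)
sizes = {jk: w.size for jk, w in bands.items()}  # n_jk for each subband
```

With three levels the sketch yields ten blocks (nine detail subbands and one summary), and each `sizes[(j, k)]` equals (512/2^j)^2.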
[0051] Second Step: A total rate setting is allocated for all the subbands and this total rate setting is distributed among the various subbands.
[0052] To assign the rate in each subband, the procedure is as follows:
[0053] a) It is considered that the coefficients in each subband j,k are distributed according to a statistical distribution corresponding to a generalized Gaussian of parameters α and β. These parameters α and β are estimated, in a manner that will be described later, while minimizing the Kullback-Leibler distance between this statistical model and the empirical distribution of coefficients of each subband.
[0054] b) The knowledge of these parameters α and β can be used to predict, for each subband j,k, the distortion Djk which will be associated with a rate Rjk which will be allocated to it, and the relationship between these two parameters α and β and the quantization step Δjk chosen for this subband. Rate scheduling is carried out by distributing the rate setting R among the subbands j,k, in such a way that the complete rate is allocated to them:
$$R = \sum_{j,k} R_{jk}$$
[0055] and that the resulting distortion
$$D = \sum_{j,k} D_{jk}$$
[0056] is minimal.
[0057] At the end of the scheduling phase, an optimum quantization step Δjk has therefore been determined for each subband, enabling the rate setting R to be reached while minimizing the total distortion D.
[0058] Third step: The subbands are quantized with quantization steps Δjk which can be different from one subband to another and the quantized coefficients are sent to an entropy coder.
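A minimal sketch of the third step for one subband is given below, assuming a plain uniform quantizer that rounds each coefficient to the nearest multiple of the step. The names `quantize` and `dequantize` and the sample values are hypothetical, and the entropy coder is left out.

```python
import numpy as np

def quantize(coeffs, step):
    """Uniform scalar quantization: index of the nearest multiple of `step`."""
    return np.rint(np.asarray(coeffs, dtype=float) / step).astype(np.int64)

def dequantize(indices, step):
    """Reconstruction at the centre of each quantization interval."""
    return indices * float(step)

subband = np.array([-7.3, -0.4, 0.0, 2.6, 12.9])  # made-up coefficients
step = 5.0                                        # made-up quantization step
indices = quantize(subband, step)                 # integers sent to the entropy coder
restored = dequantize(indices, step)
max_error = float(np.max(np.abs(subband - restored)))
```

With this quantizer the reconstruction error of any coefficient never exceeds half the quantization step.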
[0059] Determination of Parameters α and β of the Generalized Gaussians for each Subband:
[0060] For each subband j,k, the statistical distribution of the subband tends toward a generalized Gaussian Gαβ:
$$G_{\alpha\beta}(x) = A_{\alpha\beta}\, e^{-|x/\alpha|^{\beta}}$$
[0061] where
$$A_{\alpha\beta} = \frac{\beta}{2\alpha\,\Gamma(1/\beta)}$$
[0062] To determine the parameters α and β, as indicated above, the relative entropy between this subband density model and the empirical density of this subband is minimized.
[0063] It is recalled here that the relative entropy, or Kullback-Leibler distance, between two probability densities p1 and p2 is given by:
$$D(p_1 \,\|\, p_2) = \int p_1(x) \log\frac{p_1(x)}{p_2(x)}\, dx$$
[0064] In the sense of this Kullback-Leibler distance, the distribution p2 which best tends toward the distribution p1 is that which minimizes D(p1∥p2).
[0065] Note furthermore that to determine p2 which minimizes D(p1∥p2), for a fixed p1, the following is minimized:
$$D(p_1 \,\|\, p_2) = \int p_1(x) \log p_1(x)\, dx - \int p_1(x) \log p_2(x)\, dx.$$
[0066] As the first term of this difference does not depend on p2, minimizing the difference in p2 amounts to minimizing its second term, sign included, and therefore amounts to minimizing:
$$H(p_1 \,\|\, p_2) = -\int p_1(x) \log p_2(x)\, dx \qquad (1)$$
[0067] If p1 and p2 are discrete distributions, the term above is then the average rate of coding of a source of symbols of probability distribution p1, coded with optimum entropy symbols for a distribution p2.
[0068] Minimizing this term amounts therefore to choosing a model distribution p2 which will produce the most efficient symbols for coding a distribution source p1.
[0069] Therefore D(p1∥p2) will be minimized and not the reverse, D(p2∥p1), since the Kullback-Leibler distance is not symmetric.
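For discrete distributions, the two remarks above (minimizing D(p1∥p2) is the same as minimizing H(p1∥p2), and the distance is not symmetric) can be checked with a short sketch. The three distributions are made up for illustration and the function names are hypothetical.

```python
import numpy as np

def cross_entropy(p1, p2):
    """H(p1||p2) = -sum p1 log p2 (natural logarithm), expression (1) in discrete form."""
    p1, p2 = np.asarray(p1, dtype=float), np.asarray(p2, dtype=float)
    mask = p1 > 0
    return float(-(p1[mask] * np.log(p2[mask])).sum())

def kullback_leibler(p1, p2):
    """D(p1||p2) = H(p1||p2) - H(p1||p1); the second term does not depend on p2."""
    return cross_entropy(p1, p2) - cross_entropy(p1, p1)

p1 = np.array([0.7, 0.2, 0.1])        # source distribution (made up)
model_a = np.array([0.6, 0.3, 0.1])   # candidate model close to p1
model_b = np.array([0.2, 0.3, 0.5])   # candidate model far from p1

# Ranking candidate models by cross entropy or by Kullback-Leibler distance
# gives the same winner, since the two differ by a constant in p2.
best_by_h = min((model_a, model_b), key=lambda m: cross_entropy(p1, m))
best_by_d = min((model_a, model_b), key=lambda m: kullback_leibler(p1, m))
```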
[0070] In the present case, the distribution p1 is the empirical distribution of a subband (j,k):
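$$p_1(x) = \frac{1}{n_{jk}} \sum_{n,m} \delta\left(x - W_{jk}[n,m]\right) \qquad (2)$$
[0071] and the distribution p2 is the generalized Gaussian indicated above:
$$p_2(x) = G_{\alpha\beta}(x) = A_{\alpha\beta}\, e^{-|x/\alpha|^{\beta}} \qquad (3)$$
[0072] The expression (1) calculated with (2) and (3) gives:
$$H(p_1 \,\|\, G_{\alpha\beta}) = -\log A_{\alpha\beta} + \frac{1}{n_{jk}\,\alpha^{\beta}} \sum_{n,m} \left|W_{jk}[n,m]\right|^{\beta}$$
[0073] Using the value of Aαβ gives:
$$H(p_1 \,\|\, G_{\alpha\beta}) = -\log\frac{\beta}{2\,\Gamma(1/\beta)} + \log\alpha + \frac{1}{n_{jk}\,\alpha^{\beta}} \sum_{n,m} \left|W_{jk}[n,m]\right|^{\beta} \qquad (4)$$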
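[0074] Minimization in α,β is carried out in two steps: during the first step, expression (4) is minimized with respect to α for a fixed β; this minimization is performed by a simple explicit calculation. During the second step, a search is carried out for an optimum β using tabulated values to avoid calculations that are too complex.
[0075] Minimization in α for a Fixed β
[0076] To minimize expression (4) in α, with a fixed β, the sum of the terms that depend on α must be minimized, the other terms being constants, that is:
$$\log\alpha + \frac{1}{n_{jk}\,\alpha^{\beta}} \sum_{n,m} \left|W_{jk}[n,m]\right|^{\beta}$$
[0077] The derivative of this expression with respect to α is:
$$\frac{1}{\alpha} - \frac{\beta}{n_{jk}\,\alpha^{\beta+1}} \sum_{n,m} \left|W_{jk}[n,m]\right|^{\beta}$$
[0078] and this derivative vanishes for
$$\alpha = \left(\frac{\beta}{n_{jk}} \sum_{n,m} \left|W_{jk}[n,m]\right|^{\beta}\right)^{1/\beta}$$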
[0079] To lighten the notations, the moment of order β of the subband j,k is defined:
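$$W_{jk}^{\beta} = \frac{1}{n_{jk}} \sum_{n,m} \left|W_{jk}[n,m]\right|^{\beta}$$
[0080] which is therefore the average of the absolute values of coefficients of the subband, raised to the power β.
[0081] The optimum α is then written simply:
$$\alpha = \left(\beta\, W_{jk}^{\beta}\right)^{1/\beta} \qquad (5)$$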
[0082] Calculation of Optimum β
[0083] It is therefore known how to determine the optimum α once β is known. To determine the optimum β, in equation (4) α is replaced by the value given by equation (5). The following is obtained:
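$$H(p_1 \,\|\, G_{\alpha\beta}) = -\log\frac{\beta}{2\,\Gamma(1/\beta)} + \frac{1}{\beta}\log\left(\beta\, W_{jk}^{\beta}\right) + \frac{1}{\beta}$$
[0084] The optimum value of β will therefore be obtained by minimizing the expression above which can again be rewritten in the following form:
$$\log\frac{2\,\Gamma(1/\beta)}{\beta} + \frac{1}{\beta}\left(1 + \log\left(\beta\, W_{jk}^{\beta}\right)\right) \qquad (6)$$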
[0085] Thus, the calculation of the optimum α,β pair is carried out in two stages:
[0086] First, β is calculated by minimizing expression (6).
[0087] Secondly, α is calculated using expression (5).
[0088] β is chosen from a finite number of candidate values, for example
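$$\beta \in \left\{ \tfrac{1}{2},\ \tfrac{1}{\sqrt{2}},\ 1,\ \sqrt{2},\ 2 \right\}$$
[0089] and most of the calculations can be tabulated. The only component which actually depends on the subband is
$$W_{jk}^{\beta} = \frac{1}{n_{jk}} \sum_{n,m} \left|W_{jk}[n,m]\right|^{\beta}$$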
[0090] The calculation of α is hence explicit.
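The two-stage estimation can be sketched as follows, assuming natural logarithms in expression (6) and the five candidate values of β listed above. The function names are hypothetical, and synthetic Laplacian samples (a generalized Gaussian with β = 1) stand in for a real subband.

```python
import math
import numpy as np

CANDIDATE_BETAS = (0.5, 1 / math.sqrt(2), 1.0, math.sqrt(2), 2.0)

def moment(coeffs, beta):
    """Moment of order beta: mean of |coefficient|**beta over the subband."""
    return float(np.mean(np.abs(coeffs) ** beta))

def expression_6(w_beta, beta):
    """Cost to minimize in beta, with alpha already replaced by its optimum (5)."""
    return math.log(2 * math.gamma(1 / beta) / beta) + (1 + math.log(beta * w_beta)) / beta

def fit_generalized_gaussian(coeffs, betas=CANDIDATE_BETAS):
    """First pick beta among the candidates with (6), then alpha from (5)."""
    beta = min(betas, key=lambda b: expression_6(moment(coeffs, b), b))
    alpha = (beta * moment(coeffs, beta)) ** (1 / beta)
    return alpha, beta

# Synthetic subband: Laplacian density exp(-|x|/2)/4, i.e. alpha = 2, beta = 1.
rng = np.random.default_rng(1)
samples = rng.laplace(loc=0.0, scale=2.0, size=200_000)
alpha, beta = fit_generalized_gaussian(samples)
```

On these samples the search selects β = 1 and returns α close to 2. Only the five moments depend on the data; the Γ terms could be tabulated once for all subbands.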
[0091] To avoid any confusion, it is stated here that, strictly, these coefficients should be called αjk and βjk since they are, a priori, different for each subband.
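[0092] In the example above, for the subband W1,1 the calculation of $W_{1,1}^{\beta}$ and of the expression (6) for values of β in the set
$$\left\{ \tfrac{1}{2},\ \tfrac{1}{\sqrt{2}},\ 1,\ \sqrt{2},\ 2 \right\}$$
[0093] gives the values set out in table 1 below:
TABLE 1
| β | 1/2 | 1/√2 | 1 | √2 | 2 |
|---|---|---|---|---|---|
| $W_{1,1}^{\beta}$ | 1.42 | 1.88 | 3.17 | 7.96 | 38.6 |
| Expression (6) | 2.70 | 2.74 | 2.85 | 3.02 | 3.25 |
[0094] The minimum is therefore reached for β=1/2. The corresponding value of α is then
$$\alpha_{1,1} = \left(\tfrac{1}{2}\, W_{1,1}^{1/2}\right)^{2}$$
[0095] The following is therefore obtained: α1,1=0.502 and β1,1=½.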
[0096] The parameters α and β for the other subbands are calculated in the same way.
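The figures of table 1 can be cross-checked with a few lines, assuming that the logarithm in expression (6) is the natural logarithm and taking the rounded moments of the second row as input. The variable names are illustrative only.

```python
import math

betas = [0.5, 1 / math.sqrt(2), 1.0, math.sqrt(2), 2.0]
w_beta = [1.42, 1.88, 3.17, 7.96, 38.6]      # second row of table 1 (rounded)
tabulated = [2.70, 2.74, 2.85, 3.02, 3.25]   # third row of table 1

def expression_6(w, b):
    """Expression (6), natural logarithm assumed."""
    return math.log(2 * math.gamma(1 / b) / b) + (1 + math.log(b * w)) / b

costs = [expression_6(w, b) for w, b in zip(w_beta, betas)]
best_beta = betas[costs.index(min(costs))]
alpha = (best_beta * w_beta[0]) ** (1 / best_beta)   # expression (5)
```

The recomputed costs agree with the third row to within the rounding of the inputs, the minimum falls at β = 1/2, and α comes out close to the value 0.502 quoted above.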
[0097] After having determined α and β, the relationship between rate R and quantization step Δ and between the distortion D and the quantization step are determined. The graphs R(Δ) and D(Δ) are tabulated and used in such a way that, for each subband, an optimum quantization step is obtained, that is, in such a way that the allocated rate R is distributed in the various subbands and that the total distortion is minimal.
[0098] Prediction of the Relationship between Rate and Quantization Step
[0099] If a subband has a statistical distribution described by a generalized Gaussian Gα,β, the associated rate (in bits per coefficient) is noted r(α,β,Δ), for a quantization step Δ.
[0100] This rate is the entropy of the quantized subband with a quantization step Δ, which is, for example in the case of a uniform quantization:
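$$r(\alpha,\beta,\Delta) = -\sum_{i=-\infty}^{+\infty} p_i \log_2 p_i$$
[0101] if
$$p_i = \int_{(i-1/2)\Delta}^{(i+1/2)\Delta} G_{\alpha\beta}(x)\, dx$$
[0102] The rate allocation means must know the relationship between r and (α,β,Δ). For this purpose, it is sufficient to tabulate the function r. Since it can be verified that:
$$r(\alpha,\beta,\Delta) = r(1,\beta,\Delta/\alpha)$$
[0103] it is sufficient to tabulate the functions $x \mapsto r(1,\beta,x)$ for all the candidate values of β (there are five in the above example). The x values are also considered in a range [xmin,xmax].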
[0104] Note that the same type of calculations and tabulations can be performed if a quantizer is used, having a quantization interval, centered at 0, of different size, as is often the case in the coding of images by wavelets.
[0105] For a subband of size njk, the total rate will then be
$$R_{jk} = n_{jk}\, r(\alpha,\beta,\Delta).$$
[0106] FIG. 2 indicates tabulated values of r(1,β,x), for values of β in
$$\left\{ \tfrac{1}{2},\ \tfrac{1}{\sqrt{2}},\ 1,\ \sqrt{2},\ 2 \right\}$$
[0107] In FIG. 2, the values of x are plotted as abscissae and the values r of rate are plotted as ordinates.
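A rough numerical sketch of the rate function for a uniform quantizer is shown below. It is an assumption-laden illustration rather than the tabulation used for FIG. 2: the integrals are replaced by a Riemann sum on a fine grid, the tails are truncated where the density falls below e^-30 of its peak, and the name `rate` and its parameters `m` and `tail` are hypothetical.

```python
import numpy as np

def rate(alpha, beta, delta, m=200_001, tail=30.0):
    """Entropy, in bits per coefficient, of a generalized Gaussian source
    after uniform quantization with step `delta` (numerical approximation)."""
    reach = alpha * tail ** (1.0 / beta)        # truncation point of the tails
    t = np.linspace(-reach, reach, m)           # integration grid
    mass = np.exp(-np.abs(t / alpha) ** beta)   # unnormalized density samples
    mass /= mass.sum()                          # normalization replaces A_alpha,beta
    k = np.rint(t / delta).astype(np.int64)     # quantization bin of each grid point
    p = np.bincount(k - k.min(), weights=mass)  # probability of each bin
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

r_scaled = rate(2.0, 1.0, 1.0)     # alpha = 2, step 1
r_unit = rate(1.0, 1.0, 0.5)       # alpha = 1, step 1/2: same ratio delta/alpha
r_fine = rate(1.0, 1.0, 0.1)
```

The first two calls illustrate the scaling property r(α,β,Δ) = r(1,β,Δ/α), which is why one table per candidate β suffices. For a small step the result approaches the usual high-resolution estimate, differential entropy minus log2 Δ.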
[0108] Relationship between Distortion and Quantization Step
[0109] In the same way, the relationship between quantization step and distortion can be modeled in a subband having a statistical distribution tending toward a generalized Gaussian. The average distortion per coefficient is noted d(α,β,Δ), for a subband of generalized Gaussian statistical distribution Gαβ.
[0110] This distortion is written:
$$d(\alpha,\beta,\Delta) = \int_{-\infty}^{+\infty} G_{\alpha\beta}(x) \left(x - \Delta\left[\frac{x}{\Delta}\right]\right)^{2} dx$$
[0111] where [x] denotes the integer that is closest to x, for the case in which the quantization operator is a uniform quantization.
[0112] Here also, the values of d(α,β,Δ) can be tabulated economically, by making use of the following homogeneity equation:
$$d(\alpha,\beta,\Delta) = \alpha^{2}\, d(1,\beta,\Delta/\alpha)$$
[0113] and it will therefore be sufficient to tabulate the function
$$x \mapsto d(1,\beta,x)$$
[0114] for values of β in the range of chosen candidate values, and values of x selected in [xmin,xmax].
[0115] Here again, the total distortion for a subband is the sum of distortions per coefficients, and it is therefore written:
$$D_{jk} = n_{jk}\, d(\alpha,\beta,\Delta)$$
[0116] The tabulated graphs are shown in FIG. 3.
[0117] In said FIG. 3, the x values are plotted as abscissa and the distortion as ordinate.
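The distortion function can be approximated numerically in the same spirit. This sketch assumes a uniform quantizer with reconstruction at the nearest multiple of the step, replaces the integral by a Riemann sum, and truncates the tails; the name `distortion` and the parameters `m` and `tail` are hypothetical.

```python
import numpy as np

def distortion(alpha, beta, delta, m=200_001, tail=30.0):
    """Mean squared quantization error per coefficient for a generalized
    Gaussian source and a uniform quantizer of step `delta` (approximation)."""
    reach = alpha * tail ** (1.0 / beta)        # truncation point of the tails
    t = np.linspace(-reach, reach, m)           # integration grid
    mass = np.exp(-np.abs(t / alpha) ** beta)   # unnormalized density samples
    mass /= mass.sum()
    error = t - delta * np.rint(t / delta)      # x - delta * [x / delta]
    return float((mass * error ** 2).sum())

d_scaled = distortion(3.0, 1.0, 1.5)      # alpha = 3, step 1.5
d_unit = distortion(1.0, 1.0, 0.5)        # alpha = 1, same ratio delta/alpha
d_fine = distortion(1.0, 2.0, 0.05)       # very fine step
d_coarse = distortion(1.0, 1.0, 1000.0)   # every coefficient quantized to zero
```

The first pair illustrates the homogeneity equation d(α,β,Δ) = α² d(1,β,Δ/α). The last two are limiting cases: for a very fine step the distortion approaches Δ²/12, and for a very coarse step it approaches the variance of the source (2 for α = 1, β = 1).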
[0118] For a fixed α and β, the relationship between r and Δ is invertible. This inversion is performed either by tabulation in advance, or by interpolation of the tabulated values of r(1,β,x). Similarly, the relationship between d and Δ can be inverted.
[0119] The function which associates a quantization step Δ with a distortion d is noted $d^{-1}$ and the function which associates the quantization step Δ with a rate r is noted $r^{-1}$, for fixed α and β:
$$\Delta = d^{-1}(\alpha,\beta,d)$$
$$\Delta = r^{-1}(\alpha,\beta,r)$$
[0120] The rate scheduling consists in cutting the set rate R into fragments which will be allocated one after another. This assignment is carried out iteratively.
[0121] For the initialization, the starting point, for each subband, is a maximum quantization step specified by the tables of r(α,β,Δ) and d(α,β,Δ), which will be:
$$\Delta_{jk} = \alpha_{jk}\, x_{\max}$$
[0122] Rates Rjk and distortions Djk are associated with these quantization steps by the formulae:
$$R_{jk} = n_{jk}\, r(\alpha_{jk},\beta_{jk},\Delta_{jk})$$
$$D_{jk} = n_{jk}\, d(\alpha_{jk},\beta_{jk},\Delta_{jk})$$
[0123] The rate remaining to be allocated is therefore
$$R - \sum_{j,k} R_{jk}$$
[0124] This rate is cut into N fragments Fn which can be of the same size or of different sizes, with
$$\sum_{n=1}^{N} F_{n} = R - \sum_{j,k} R_{jk}$$
[0125] Above all, the maximum size of a fragment must remain small in view of the total rate setting. Each of these fragments is then allocated iteratively, as follows:
[0126] For each subband, the potential rate is calculated which would be the rate associated with the subband if this new rate fragment Fn happened to be allocated to it:
$$R^{*}_{jk} = R_{jk} + F_{n}$$
[0127] The associated potential quantization steps are then calculated:
$$\Delta^{*}_{jk} = r^{-1}\left(\alpha_{jk},\beta_{jk},R^{*}_{jk}/n_{jk}\right)$$
[0128] then the new associated distortions:
$$D^{*}_{jk} = n_{jk}\, d\left(\alpha_{jk},\beta_{jk},\Delta^{*}_{jk}\right)$$
[0129] For each subband j,k it can be estimated what the reduction in distortion would be if the rate fragment Fn were to be allocated to it. This reduction would be:
$$D_{jk} - D^{*}_{jk}$$
[0130] The “best” subband j,k, that is the one for which the reduction in distortion is strongest, is allocated the fragment:
$$R_{jk} \leftarrow R^{*}_{jk}, \qquad \Delta_{jk} \leftarrow \Delta^{*}_{jk}, \qquad D_{jk} \leftarrow D^{*}_{jk}$$
[0131] The same rate allocation is reiterated for the next fragments Fn+1, Fn+2, until the rate to be allocated is exhausted.
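The scheduling loop described above can be sketched end to end as follows. Everything specific in it is an assumption made for illustration: the tabulation grid `X_GRID`, the numerical approximation of r(1,β,x) and d(1,β,x) by a Riemann sum, equal-size fragments, and the three made-up subbands. The inversion of r is done by interpolation of the tabulated values, which requires the rate table to be strictly decreasing over the grid.

```python
import numpy as np

X_GRID = np.geomspace(0.5, 100.0, 30)   # assumed range [x_min, x_max] of tabulated steps

def tabulate(beta, xs=X_GRID, m=400_001, tail=30.0):
    """Approximate tables of r(1, beta, x) in bits/coefficient and d(1, beta, x)."""
    reach = tail ** (1.0 / beta)
    t = np.linspace(-reach, reach, m)
    mass = np.exp(-np.abs(t) ** beta)
    mass /= mass.sum()
    rates, dists = [], []
    for x in xs:
        k = np.rint(t / x).astype(np.int64)
        p = np.bincount(k - k.min(), weights=mass)
        p = p[p > 0]
        rates.append(-(p * np.log2(p)).sum())
        dists.append((mass * (t - k * x) ** 2).sum())
    rates, dists = np.array(rates), np.array(dists)
    if not np.all(np.diff(rates) < 0):
        raise ValueError("rate table must be strictly decreasing to be inverted")
    return rates, dists

def allocate(subbands, total_bits, fragment):
    """Greedy scheduling; `subbands` is a list of dicts with keys n, alpha, beta."""
    tables = {sb["beta"]: tabulate(sb["beta"]) for sb in subbands}
    state = []
    for sb in subbands:                       # initialization at the largest step
        r_tab, d_tab = tables[sb["beta"]]
        state.append({"x": X_GRID[-1], "R": sb["n"] * r_tab[-1],
                      "D": sb["n"] * sb["alpha"] ** 2 * d_tab[-1]})
    remaining = total_bits - sum(s["R"] for s in state)
    while remaining > 1e-9:
        f = min(fragment, remaining)
        best = None
        for i, sb in enumerate(subbands):
            r_tab, d_tab = tables[sb["beta"]]
            r_star = (state[i]["R"] + f) / sb["n"]       # potential rate per coefficient
            if r_star > r_tab[0]:
                continue                                 # beyond the tabulated range
            x_star = np.interp(r_star, r_tab[::-1], X_GRID[::-1])   # r^-1 by interpolation
            d_star = sb["n"] * sb["alpha"] ** 2 * np.interp(x_star, X_GRID, d_tab)
            gain = state[i]["D"] - d_star                # reduction in distortion
            if best is None or gain > best[0]:
                best = (gain, i, x_star, d_star)
        if best is None:
            break                                        # no subband can take more bits
        _, i, x_star, d_star = best
        state[i] = {"x": x_star, "R": state[i]["R"] + f, "D": d_star}
        remaining -= f
    return [{"delta": sb["alpha"] * s["x"], "R": s["R"], "D": s["D"]}
            for sb, s in zip(subbands, state)]

subbands = [{"n": 1024, "alpha": 0.5, "beta": 0.5},      # made-up subbands
            {"n": 256, "alpha": 1.5, "beta": 0.5},
            {"n": 256, "alpha": 0.56, "beta": 2 ** -0.5}]
initial_distortion = sum(sb["n"] * sb["alpha"] ** 2 * tabulate(sb["beta"])[1][-1]
                         for sb in subbands)
result = allocate(subbands, total_bits=2048, fragment=32)
```

Each pass costs one table lookup per subband, so the whole schedule needs no Lagrangian iteration; the price is that the result depends on the fragment size, which is why the fragments should stay small relative to the budget.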
[0132] Table 2 below indicates, for the abovementioned example, the values of α and β retained. The coding of the low-pass subband W3,0 is performed in DPCM (Differential Pulse Code Modulation), and the statistics indicated in the table are therefore statistics of $W_{3,0}[n_k,m_k] - W_{3,0}[n_{k-1},m_{k-1}]$, where the numbering of pairs
$$k \mapsto (n_k, m_k)$$
[0133] indicates the direction chosen for the low-pass coefficients during the coding.
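TABLE 2
Laplacian parameters for a test image
| j | k | αjk | βjk | Δjk | Rjk |
|---|---|---|---|---|---|
| 1 | 1 | 0.50 | 1/2 | 50 | 444 |
| 1 | 2 | 0.35 | 1/2 | 35 | 444 |
| 1 | 3 | 0.56 | 1/√2 | 56 | 0 |
| 2 | 1 | 1.50 | 1/2 | 150 | 111 |
| 2 | 2 | 1.20 | 1/2 | 120 | 111 |
| 2 | 3 | 0.75 | 1/2 | 75 | 111 |
| 3 | 1 | 3.60 | 1/2 | 360 | 27.8 |
| 3 | 2 | 3.40 | 1/2 | 340 | 27.8 |
| 3 | 3 | 2.10 | 1/2 | 210 | 27.8 |
| 3 | 0 | 11 | 1/2 | 1100 | 27.8 |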
[0134] The total rate allocated is therefore at least the sum of the rates Rjk above, that is 1332 bits, therefore 0.005 bits per pixel. The setting is of 1 bit per pixel, that is a total budget of R=262144 bits, the image being of size 512×512.
[0135] The remaining rate to be allocated of 260812 bits is divided into parts that are not necessarily equal, but of sizes that are all less than a fixed limit. The first rate fragment to be allocated is for example of 384 bits.
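The budget arithmetic of this example can be reproduced as follows. The initial rates are those of table 2; cutting the remainder into equal fragments of 384 bits plus one smaller last fragment is only one possible choice, assumed here for illustration.

```python
initial_rates = {  # (j, k): R_jk in bits, from table 2
    (1, 1): 444, (1, 2): 444, (1, 3): 0,
    (2, 1): 111, (2, 2): 111, (2, 3): 111,
    (3, 1): 27.8, (3, 2): 27.8, (3, 3): 27.8, (3, 0): 27.8,
}
pixels = 512 * 512
budget = 1 * pixels                              # setting of 1 bit per pixel
allocated = round(sum(initial_rates.values()))   # bits used by the initialization
remaining = budget - allocated                   # bits left for the fragments

limit = 384                                      # assumed maximum fragment size
fragments = [limit] * (remaining // limit)
if remaining % limit:
    fragments.append(remaining % limit)          # smaller last fragment
```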
[0136] The tables are used to calculate, for each subband, what the distortion associated with each subband will be if a budget of 384 additional bits were to be attributed to it. The result of these calculations is set out in table 3.
[0137] For example, for the subband W1,2, the total rate R1,2 would change from 444 bits to 828 bits, and the distortion D1,2 would change from 0.93×10^6 to 0.89×10^6 and would therefore be reduced by 0.04×10^6.
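TABLE 3
Simulations for allocating the first fragment. The distortions are to be multiplied by 10^6.
| j | k | Rjk | Δjk | Djk | R*jk | Δ*jk | D*jk | Djk−D*jk |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 444 | 50 | 1.86 | 828 | 43 | 1.79 | 0.07 |
| 1 | 2 | 444 | 35 | 0.93 | 828 | 30 | 0.89 | 0.04 |
| 1 | 3 | 0 | 35 | 0.19 | 384 | 12 | 0.18 | 0.01 |
| 2 | 1 | 111 | 150 | 4.01 | 495 | 98 | 3.56 | 0.45 |
| 2 | 2 | 111 | 120 | 2.71 | 495 | 80 | 2.40 | 0.31 |
| 2 | 3 | 111 | 75 | 1.03 | 495 | 49 | 0.91 | 0.12 |
| 3 | 1 | 27.8 | 360 | 5.85 | 411 | 154 | 4.11 | 1.74 |
| 3 | 2 | 27.8 | 340 | 5.38 | 411 | 148 | 3.78 | 1.60 |
| 3 | 3 | 27.8 | 210 | 1.96 | 411 | 89 | 1.37 | 0.59 |
| 3 | 0 | 27.8 | 1100 | 5.24 | 411 | 461 | 3.68 | 1.56 |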
[0138] In table 3, it is seen that the strongest reduction is obtained by allocating the 384 bits to the subband W3,1. Therefore Δ3,1, R3,1 and D3,1 are updated and the process is continued by the allocation of the next fragment. When all the fragments have been allocated, a quantization setting Δjk and predictions of the associated rate Rjk and distortion Djk are obtained for each subband.
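The selection made for this first fragment can be replayed from the figures of table 3. The dictionary layout and variable names are illustrative only.

```python
# (j, k): (D_jk, D*_jk), both in units of 10^6, from table 3
table3 = {
    (1, 1): (1.86, 1.79), (1, 2): (0.93, 0.89), (1, 3): (0.19, 0.18),
    (2, 1): (4.01, 3.56), (2, 2): (2.71, 2.40), (2, 3): (1.03, 0.91),
    (3, 1): (5.85, 4.11), (3, 2): (5.38, 3.78), (3, 3): (1.96, 1.37),
    (3, 0): (5.24, 3.68),
}

# Reduction in distortion if the 384-bit fragment went to each subband.
reductions = {jk: round(d - d_star, 2) for jk, (d, d_star) in table3.items()}

# The fragment goes to the subband with the strongest reduction.
best = max(reductions, key=reductions.get)
```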
[0139] The table finally obtained for the image is given in table 4 below:
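TABLE 4
Example of settings of quantization and associated rates by the scheduler. The rates are given in bits per coefficient.
| j | k | Δjk | Rjk |
|---|---|---|---|
| 1 | 1 | 6.1 | 0.92 |
| 1 | 2 | 6.5 | 0.53 |
| 1 | 3 | 8.4 | 0.03 |
| 2 | 1 | 5.7 | 2.5 |
| 2 | 2 | 5.7 | 2.1 |
| 2 | 3 | 5.8 | 1.4 |
| 3 | 0 | 6.3 | 3.8 |
| 3 | 1 | 5.8 | 3.7 |
| 3 | 2 | 6 | 2.9 |
| 3 | 3 | 5.9 | 5.5 |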
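As a consistency check, the rates of table 4 can be converted back to a total number of bits using the subband sizes (512/2^j)^2 of the example. The helper name `n_coeffs` is hypothetical.

```python
# (j, k): predicted rate in bits per coefficient, from table 4
table4 = {
    (1, 1): 0.92, (1, 2): 0.53, (1, 3): 0.03,
    (2, 1): 2.5, (2, 2): 2.1, (2, 3): 1.4,
    (3, 0): 3.8, (3, 1): 3.7, (3, 2): 2.9, (3, 3): 5.5,
}

def n_coeffs(j, size=512):
    """Number of coefficients in a subband of scale j for a size x size image."""
    return (size // 2 ** j) ** 2

total_bits = sum(n_coeffs(j) * r for (j, k), r in table4.items())
bits_per_pixel = total_bits / 512 ** 2
```

With the rounded rates of the table, the predicted total comes to roughly 260 000 bits, about 0.99 bit per pixel, just under the set budget of 262144 bits.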
[0140] The allocation of bits can be performed in open-loop mode or, preferably, in closed-loop mode with a device of the type illustrated in FIG. 4.
[0141] The rate regulation obtained using the device illustrated in FIG. 4 is to the nearest bit. This device is based on the one described in the French patent filed on Mar. 18, 1999 under the number 99/03371.
[0142] In this patent, a technique is described for the acquisition of images of the Earth by a moving observation satellite (“Push Broom” mode) in which the images are compressed and transmitted to the ground via a transmission channel which imposes a constant rate. To achieve this objective, an optimum allocation of bits is performed, in real time or in slightly deferred time, in order to code the subband coefficients, and then the allocation errors and variations in rate at the output of the entropy coder are corrected.
[0143] This device includes a compression means 20, and the image data is applied to the input 22 of said compression means 20. The compression means 20 includes a wavelet transformation unit 24 supplying data to a subband quantizer 26 which delivers data to a coder 28 the output of which forms the output of the compression means 20. This output of the compression means 20 is linked to a regulation buffer memory 30 supplying, at its output, compressed data according to a rate Rc.
[0144] The coder 28 also supplies a number Np of bits produced which is applied to an input 33 of a control unit 32 having another input for the set rate Rc and an output applied to a first input of a bit allocation unit 34. This latter unit 34 has another input 36 to which the set rate Rc is applied and an input 38 to which the output data from the transformation unit 24 is applied.
[0145] The output of the unit 34 is applied to an input 40 of the subband quantizer 26.
[0146] The method and device in accordance with the invention enable the cost of coding the image to be reduced and ensure a rapid rate regulation since, unlike in the prior art, no iterative optimization (Lagrangian for example) is performed. The calculations are simple. Thus, the method in accordance with the invention is well suited for all the applications and, in particular, for space applications in which the resources are necessarily limited.
[0147] It is to be noted that, in the present description, the term “image” must be understood as being in the sense of an image of at least one dimension, that is to say that the invention extends to data in general.
Claims
- 1. A method for compressing data, in particular images, by transform, in which method this data is projected onto a base of localized orthogonal or biorthogonal functions, and which method, to quantize each of the localized functions with a quantization step that enables an overall set rate Rc to be satisfied, includes the following steps:
a) a probability density model of coefficients in the form of a generalized Gaussian is associated with each subband, b) the parameters α and β of this density model are estimated, while minimizing the relative entropy, or Kullback-Leibler distance, between this model and the empirical distribution of coefficients of each subband, and c) from this model, for each subband, an optimum quantization step is determined such that the rate allocated is distributed in the various subbands and such that the total distortion is minimal.
- 2. The method claimed in claim 1, wherein, for each subband, the graphs of rate R and distortion D are deduced, from the parameters α and β, as a function of the quantization step and said graphs are tabulated to determine said optimum quantization step.
- 3. The method claimed in claim 1, wherein said functions are wavelets.
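- 4. The method claimed in claim 3, wherein, to determine said parameter β of the generalized Gaussian associated with each subband, the following expression is minimized:
$$H(p_1 \,\|\, G_{\alpha\beta}) = \log\frac{2\,\Gamma(1/\beta)}{\beta} + \frac{1}{\beta}\left(1 + \log\left(\beta\, W_{jk}^{\beta}\right)\right)$$
in which formula the first term represents said relative entropy, and in which $W_{jk}^{\beta}$ has the value:
$$W_{jk}^{\beta} = \frac{1}{n_{jk}} \sum_{n,m} \left|W_{jk}[n,m]\right|^{\beta}$$
Wjk[n,m] being a coefficient of a subband, and njk being the number of coefficients in the subband of index j,k.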
- 5. The method claimed in claim 4, wherein said parameter α of the generalized Gaussian for the corresponding subband j,k is determined by the following formula:
$$\alpha = \left(\beta\, W_{jk}^{\beta}\right)^{1/\beta}$$
- 6. The method claimed in claim 3, wherein the tabulation of the distortion values is carried out for a sequence of values of the parameter β.
- 7. The method claimed in claim 3, wherein the tabulation of the rates R is carried out for a sequence of values of the parameter β.
- 8. The method claimed in claim 6, wherein the sequence of values of the parameter β is
$$\left\{ \tfrac{1}{2},\ \tfrac{1}{\sqrt{2}},\ 1,\ \sqrt{2},\ 2 \right\}$$
- 9. The method claimed in claim 1, wherein the bit budget is distributed to each of the subbands according to their ability to reduce the distortion of the compressed image.
- 10. The method claimed in claim 9, wherein a bit budget corresponding to the highest tabulated quantization step is assigned to each subband and wherein the remaining bit budget is then cut into individual parts that are gradually allocated to the localized functions having the greatest ability to make the total distortion decrease, this operation being repeated until the bit budget is exhausted.
- 11. The method claimed in claim 1, wherein the localized functions and the quantization step for each subband and the parameters of the density model are coded using a lossless entropy coder.
Priority Claims (1)
| Number | Date | Country | Kind |
|---|---|---|---|
| 02 05 724 | May 2002 | FR | |