Bijection mapping for compression/denoising of multi-frame images

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to algorithms and more particularly to algorithms useful in image compression.

2. Brief Description of Prior Developments

The number of systems generating multiple images of the same scene at different frequency bands and their applications have rapidly increased in the last several years. Remote sensing systems, thematic mapper sensors, and multispectral Forward Looking Infrared (FLIR) systems are only a few examples of such systems. The data product of a multispectral sensor is a stack of images for each scene referred to as a multispectral image or data cube. Each band or component of a multispectral image can be considered as a monochrome (scalar) image. Multispectral imagery provides signal diversity that can lead to more precise and accurate results than otherwise possible with traditional single spectral imagery. A major limitation of multispectral imaging systems in practice is the required manipulation and delivery of the associated massive data files. In order to store and transmit such files, very efficient compression tools are needed. Because of this need, this disclosure shall focus on a new method for lossy compression. True lossless compression is limited in performance; here, we allow the compression to be lossy but implicitly maintain control over the fidelity of the compressed imagery so we may trade off rate vs. distortion.

To date, many of the approaches for lossy multispectral image compression rely on two stages of data processing assuming that the various bands are perfectly registered. Typically, these methods first apply a 1-Dimensional Karhunen-Loève transform (KLT) along the spectral dimension leading to a set of spectrally uncorrelated eigen-images, and subsequently each of the resultant eigen-planes is compressed using a scalar image compression method such as JPEG or wavelet based compression as is disclosed in J. A. Saghri, A. G. Tescher & J. T. Reagan, “Practical transform codeing of multispectral imagery,” IEEE Signal Processing Magazine, pp. 32–43, January, 1995. The compression rate for each eigen-image is determined through a bit allocation strategy. Other 2-stage compression methods first decorrelate the data by applying a 2D frequency transform along the spatial dimensions and a 1-dimensional KLT is subsequently applied across the spectral dimension as is disclosed in D. Tretter, C. A. Bouman, “Optical transforms for multispectral and multilayer image coding,” IEEE TIP:4:296–308, March, 1995, J. N. Bradley & C. M. Brislawn; “Spectrum analysis of multispectral imagery in conjunction with wavelet/kit data compression,” Proc. ASIMOLAR Conf. 1:26–30, November, 1993; and J. Vaisey, M. Barlaud & M. Anonini, “Multispectral Image Coding Using Lattice VQ and the Wavelet Transform,” Proc. ICIP 98, Chicago, Ill., 1998. Quantization and coding is then used to encode the transform coefficients. As is also reported in “Spectrum analysis of multispectral imagery in conjunction with wavelet/klt data compression,” Proc. ASIMOLAR Conf. 1:26–30, November, 1993; and J. Vaisey, M. Barlaud & M. Anonini, “Multispectral Image Coding Using Lattice VQ and the Wavelet Transform,” Proc. ICIP 98, Chicago, Ill., 1998 at high compression rates (less than 0.2 bit/pixel/image), both methods seem to have the same performance. The two-stage approaches mentioned above, however, have three disadvantages:

- 1. Image decorrelation is performed in a separable fashion, first along the spectral dimension and subsequently along the spatial dimension or vice versa. The separable approach is intrinsically suboptimal as the spatial and spectral statistics are not exploited jointly;
- 2. The KLT is data dependent and consequently it must be computed for each image set resulting in an unacceptable amount of computation;
- 3. Transformed image planes after the initial decorrelation are coded independently of the others, despite of the spatial similarities across bands. The focus of this paper is to introduce a method which exploits spatial and frequency correlation jointly.

The following example illustrates these issues. Consider a vector-valued M₁×M₂multispectral image f(m))=[f₁(m), f₂(m), . . . , f_L(m)]^Twith L bands. m=(m₁,m₂) represents a pixel's spatial position with m₁=1, . . . , M₁and m₂=1, . . . , M₂, and 1≦k≦L is the spectral band index. The multispectral image is thus composed of M₁×M₂f-vectors of length L. The image is then processed by the two-stage compression method described above, where a 1-Dimension KLT is applied along the spectral dimension. Let R=E(f(m)f^T(m)) be the spectral correlation matrix and let q_iand λ_i, i=1, . . . , L, be the eigenvectors and eigenvalues of R respectively. The KLT of f(m) along the spectral dimension is g(m)=Q^Tf(m) where Q=[q₁, q₂, . . . , q_L] is the L×L matrix whose columns are the eigenvectors of R. The resultant KLT coefficients form a set of images, {g₁(m), g₂(m), . . . , g_L(M)} referred here to as eigen-images. FIG. 1 depicts the KLT of a 2-band image into a pair of decorrelated eigen-images. A three-level wavelet decomposition of each eigen-image is also shown. These images clearly show that although the eigen-images are decorrelated across bands at a particular pixel location, cross-band correlation among groups of pixels remains. The residual cross-band correlation results in a strong spatial similarity not exploited by the wavelet transforms applied to each eigen-image. In fact, FIG. 1 shows that spatial similarity remains after the wavelet transforms are applied.

An approach to overcome the shortcomings of the two-stage compression algorithm would be to design a coder that takes advantage of the intra-band and cross-band spatial similarities. A 3D (3D) zero-tree coder, for instance, can be designed to jointly exploit the intra-band and cross-band correlations inherent in a data cube as is disclosed in A. Bilgin, G. Zweig & M. W. Marcellin, “Three-dimensional Image Compression using Integer Wavelets,” Appl. Optics, 39#11:1799–1814, April, 2000. This, however, can become very complex since the number bands to be jointly encoded can be quite large. Also, the bands are typically separable in three-dimensions and the zero-tree coder spawns children at a cubic rate having implications for the production of zero-trees. Or three-dimensional data may be processed via some three-dimensional extension of a well-known transform such as the DCT G. Abousleman, M. W. Marcellin & B. Hunt, “Compression of Hyperspectral Imagery using 3-D DCT and Hybrid DPCM/DCT,” IEEE TGRS 33#1:26–34, January, 1995. Here, again, there is the question of complexity and the constraints imposed by block transformation. In reference 6, the authors back off from processing using a 3D DCT in favor of a compromise using 2D DCT and DPCM in the spectral domain.

SUMMARY OF INVENTION

The present invention is an approach to multispectral image compression where the intra- and cross-band correlations are jointly exploited in a surprisingly simple yet very effective manner. The method is based on the fact that it is more efficient to jointly encode all the components of a vector source than to encode each component separately as is disclosed in A. Segall, “Bit allocation and encoding for vector sources,” IEEE Trans. Inf. Thy. IT-22:162–169, March 1976. The key component of the algorithm is a simple transformation, a bijection mapping of the original multispectral image into a virtual two-dimensional (2D) scalar image. In this research, we combine the bijection mapping with a very efficient scalar image-coding algorithm, the SPIHT codec as is disclosed in A. Bilgin, G. Zweig & M. W. Marcellin, “Three-dimensional Image Compression using Integer Wavelets,” Appl. Optics, 39#11:1799–1814, April, 2000. The SPIHT codec need not be modified for 3D operation. The choice of the bijection mapping will determine the way in which spatial and spectral information are combined: there is no intrinsic limit in number of bands processed at a time nor in the way spectral and spatial pixels are combined. Combining the bijection mapping with the SPIHT codec gives a no frills, low complexity method for processing multispectral data. This may be improved by selecting the optimal bijection mapping which is equivalent to the problem of finding a permutation map which reduces activity across pixels to effect pixel prediction. This latter optimization problem has been investigated largely in the context of lossless compression, as is disclosed in S. Tate, “Band Ordering in Lossless Compression of Multispectral Images,” IEEE TCOM 46#4:477–483, April, 1997; J. Wang, K. Zhang & S. Tang, “Spectral and Saptial Decorrelation of Landsat-M Data for Lossless Compression,” IEEE GRS 33#5:1277–1285, September, 1995; N. Sidiropoulos, M. Pattichis, A. Bovik & J. Havlicek, “COPERM: Transform-Domain Energy Compaction by Optimal Permutation,” IEEE TSP 47#6:1679–1688, June, 1999; and Z. Arnavut, “Applications for inversions to lossless image compression,”, Opt. Eng. 36#4:1028–1034, April, 1997. Here we shall propose another optimization approach with the goal of optimizing lossy compression. Activity from pixel to pixel is reduced for the purpose of introducing zero-trees into the coded stream thereby increasing compression without increasing distortion. By optimally mapping the multispectral image set into a single 2D-array and by subsequently applying a 2D-decomposition, the spatial correlation as well as the spectral correlation are jointly exploited. In this manner, the self-similarities across the different scales of the virtual image are exploited. Based on the statistical characteristics of the multispectral data, the bijection mapping can be optimized to minimize the distortion introduced by the compression algorithm. The optimization proposed here reduces to the maximization of a function that depends on the second-order statistics of the multispectral data.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is further described with reference to the accompanying drawings wherein:

FIG. 1 is a schematic drawing showing a two-stage approach for compression of a multispectral, two-color FLIR image, wherein spectral transform is followed by 2-D DWT and the multispectral image is a two-color focal plane array LWIR/MWIR;

FIG. 2 is a schematic drawing showing a bijection mapping of a multispectral image into a virtual 2D image, wherein L₁L₂=L;

FIG. 3 is a schematic drawing showing two bijection mappings of a 1×2×4 (4-band) stack into (a) a 1×8 array, and (b) a 2×4 array;

FIGS. 4(
a) and 4(b) are flow charts, wherein FIG. 4(a) shows a compression algorithm: bijection mapping follow by SPIHT encoder, and FIG. 4(b) shows a decompression algorithm: SPIT decoder followed by inverse bijection mapping;

FIGS. 5(
a) and (b) are schematic drawings showing two different bijection mappings of the bispectral FLIR image;

FIGS. 6(
a)–6(d) are graphs of lines of images ouputted by bijection mapping in FIG. 5, wherein FIG. 6(a) depicts the marked line in FIG. 5(a) and FIG. 6(c) depicts the marked line in FIG. 5(b), and FIGS. 6(b) and 6(d) depict the same lines after images are compressed;

FIG. 7 is a graph of average PSNR of the four reconstructed bands achieved by the various compression algorithm;

FIGS. 8(
a)–8(d) show the second band of the reconstructed images at 0.08 bit/pixel/band wherein FIG. 8(a) is the original band, FIG. 8(b) shows a separable approach, FIG. 8(c) is from a traditional two-stage algorithm, and FIG. 8(d) is from bijection mapping onto zeros trees;

FIG. 9 is a graph showing performance of the different compression algorithms acting on unobserved data;

FIGS. 10(
a)–10(d) show the first band of the non-training image at 0.1 bit/pixel/band, wherein FIG. 10(a) is the original, FIG. 10(b) shows a separable approach, FIG. 10(c) is from a traditional two-stage approach, and FIG. 10(d) is from bijection mapping onto zero trees;

FIGS. 11(
a)–11(d) show band VH of the polarimetric SAR data at 0.4 bit/pixel/band, wherein FIG. 11(a) is the original image, FIG. 11(b) is from a separable approach, FIG. 11(c) is from a two-stage approach, and FIG. 11(d) is from bijection mapping onto zero-trees; and

FIG. 12 is a graph showing coding performance achieved by the various algorithms on Polarimetric SAR data.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
1. Multispectral Image Compression Using Bijection Mappings

1.1 Bijection Mappings

A bijection mapping is a one-to-one onto function that maps set A to set B. The set A is called the domain of φ(x) and the set of all elements φ(x)εB is called the range of φ(x). The one-to-one correspondence between an element in A and an element in B assures that the function is invertible. That is, if εB, φ⁻¹(y) is the set of all xεA such that φ(x)=y. φ⁻¹(x) is called the inverse of the function φ(x). In our application, a special type of bijection mapping is used. Let R^M¹^×M²be the set of M₁×M₂arrays defined over the set of real numbers R. Furthermore, let R^M¹^×M²^×Lbe the set of M₁×M₂arrays where each element in the array is a L-component vector defined over the set of real numbers. For instance, if fεR^M¹^×M²^×L, then f has the following structure

$\begin{matrix} f = [\begin{matrix} f (1, 1) & f (1, 2) & \dots & f (1, M_{2}) \\ f (2, 1) & f (2, 2) & \dots & f (2, M_{2}) \\ ⋰ \\ f (M_{1}, 1) & f (M_{1}, 2) & \dots & f (M_{1}, M_{2}) \end{matrix}], & (1) \end{matrix}$

where f(m₁,m₂)=[f₁(m₁,m₂),f₂(m₁,m₂) . . . f_L(m₁,m₂)]^Tis an L-component vector and where each element of f(m₁,m₂) is defined over the set of real numbers, i.e. f_k(m₁,m₂)εR. A M₁×M₂multispectral image with L bands is an example of an element in R^M¹^×M²^×L, whereas each M₁×M₂monochrome (scalar) image belongs to the set R^M¹^×M². As shown in FIG. 2, the bijection mappings utilized in this paper are defined as a one-to-one permutation of the elements in R^M¹^×M²^×Lonto the set R^L¹^M¹^×L²^M². That is,

B:R^M¹^×M²^×L→R^L¹^M¹^×L²^M²,B(f_k(m₁,m₂))=G(i,j) (2)

m₁=1, 2, . . . ,M₁; m₂=1, 2, . . . , M₂; k=1, 2, . . . , L; i=1, 2, . . . , L₁M₁, j=1, 2, . . . , L₂M₂

where GεR^L¹^M¹^×L²¹^M²is the range of the bijection mapping B and R^M×M²^×L¹is its domain. L₁and L₂are integers such that L=.L₁.L₂. The bijection mapping of a multispectral image is thus defined as a mapping of the 3D data stack into a 2D virtual image. This mapping includes normalization such that neighboring pixels in the virtual image have similar dynamic range. The mapping is invertible since the original data cube can be perfectly reconstructed from the virtual two-dimensional array. Each array defined in R^L¹^M¹^×L²^M²is related to an element in R^M¹^×M²^×Lthrough the inverse bijection mapping:

B⁻¹:R^L¹^M¹^×L²^M²→R^M¹^×M²^×L,B⁻¹(G(i,j))=f_k(m₁,m₂). (3)

In summary, R^M¹^×M²^×Lrepresents the set of M₁×M₂multispectral images with L bands, B(R^M¹^×M²^×L) represents the associated set of two-dimensional arrays.

1.2 Bijection Mappings onto Zero-Trees

For an array fεR^M¹^×M²^×L, there are (LM₁M₂)! possible bijection mappings B:R^M¹^×M²^×L→R^L¹^M¹^×L²^M²for a given L=L₁L₂. With most multispectral images having spatial dimensions larger than 512×512, direct bijection mappings of the multispectral images themselves prove to be computationally prohibitive. While the number of mathematically conceivable bijections is large, the number useful for the application at hand is much smaller. We expect only those bijections that map a small, spatially contiguous, vector-pixel set to be relevant for compression because the correlations among the associated pixels will tend to be large. Therefore, a simpler approach to handle large image files is to break the multispectral image stack into smaller non-overlapping sub-images of depth L. That is, the spatial domain is partitioned while the spectral dimension is not, as L is typically in the order of 15 or less for multispectral images. The bijection mapping of a multispectral image thus reduces to the bijection mapping of N₁×N₂×L image cells as depicted in FIG. 3. Assuming that the images are homogeneous, all cells are treated equally and the same bijection mapping is applied to all image cells. This concept could be generalized where various types of bijection mappings are applied to different image cells depending on their statistical characteristics. For the sake of simplicity, at this time, we treat all cells uniformly. The multispectral image is first divided into cells of size N₁×N₂×L. Each 3D cell is then subjected, independently of the others, to a bijection mapping. The resultant two-dimensional arrays are placed into a large single array according to the position of the original 3D cells. FIG. 3 shows two different bijection mappings of a four-band cube, where the size of each cell is 1×2×4. In FIG. 3(a), each cell is mapped to a 1×8 array, i.e., B:R^1×2×4→R^1×8. FIG. 3(b) shows a different bijection mapping where each cell is mapped to a 2×4 array. That is B:R^1×2×4→R^2×4. The shaded pixels are the bijection mapping of the first 3D cell, upper-left corner. The unshaded pixels show the bijection mapping of the neighboring block. Note that the locations of both two-dimensional arrays are in accordance with the position of the 3D cells. For simplicity, only a few pixels of the cube are depicted and no normalization of the data is depicted at this time. In a practical implementation the data has to be normalized such that the various components are given the same dynamic range. Having defined the image partition into smaller cells in R^N¹^×N²^×L, the proposed compression algorithm consists of:

- 1. Mapping each cell by B:R^N¹^×N²^×L→R^L¹^N¹^×L²^N²into the L₁N₁×L₂N₂array.
- 2. Tiling the obtained arrays into a virtual image I.
- 3. Encoding the virtual image I by a zero-tree coder.
- 4. Attaching the type of bijection mapping used as a header.
  
  FIG. 4 shows the proposed compression algorithm. Note that the compressed image is a binary bit string whose length, denoted by S in FIG. 4, is determined by the compression ratio. Note also that the last step of the proposed algorithm, not shown in FIG. 4, adds information needed by the decoder to reconstruct the multispectral data. This overhead, however, is negligible compared to the amount of compressed data. As it will be shown shortly, a permutation matrix that defines the re-ordering of the data pixels can compactly represent the bijection mapping. (Figure to be included)
  
  The multispectral image decoder reverses the encoding process by:
- 1. Extracting the type of bijection mapping from the file header.
- 2. Reconstructing the virtual image Î with the zero-tree decoder.
- 3. Partitioning the virtual image Î into L₁N₁×L₂N₂cells and mapping back each cell by B⁻¹:R^L¹^N¹^×L²^N²→R^N¹^×N²^×L, to form the reconstructed 3D cells.
- 4. Concatenating the obtained cells to form the multispectral image.
  
  Given the bijection mapping onto the zero-tree codec shown in FIG. 4, the following questions arise.
- 1. How are N₁and N₂chosen so as to maximize the attained energy compaction?
- 2. Given N₁and N₂there are (LN₁N₂)! possible bijection mapping. How does one select a bijection mapping so as to maximize the energy compaction?
  
  Energy compaction, E_c, is defined as where E_Tis the total energy of the multi spectral image

$\begin{matrix} E_{c} = \frac{E_{u}}{E_{T}} & (4) \end{matrix}$

and E_llis the energy compacted in the approximation of the wavelet decomposition. In compression, higher E_cgenerally translates into smaller distortion in the reconstructed data. The answer to the first question reduces to how much computational complexity can we tolerate? If N₁>N_i, i=1,2, and letting {B(N₁,N₂)} be the set of all possible bijection mappings of a N₁×N₂cell, then {B(N₁, N₂)}⊂{B(N′₁, N′₂). Hence, we will always do equal or better as N₁and N₂are incremented. The real issue is, how much gain does one attain as N_iis incremented to N_i+1, N_i+2, because the size of the search space increases as (LN₁N₂)!. It is conceivable that an optimum may be achieved for small values of N₁, N₂in which case picking a larger cell adds computation. As N₁is incremented to N₁+1, an additional (LN₂) possible bijection mapping are added to the search space. A related and perhaps more important question is: if only one parameter is incremented, either N₁or N₂, which one should be chosen. In summary, the performance gain as the size of N₁and N₂are monotonically increased, is somewhat reminiscent to the order of an AR model. In fact, some of the same issues that arise on selecting the order of an AR model also arise here such as overfitting, complexity, etc. The second question is perhaps the most important since it is directly related to energy compaction. Having fixed N₁and N₂which one of the (LN₁N₂)! bijection mappings will maximize the energy compaction? Answers to these questions will be given in Section 3. Table 1 shows through a simple numerical example the performance gain that is achieved using bijection mapping.

2. Optimal Bijection Mappings

The approach to the design of the image partition size N₁×N₂is through heuristics, although other more elaborated approaches can be used. In the heuristic approach, the type of the 2D transform used in the encoding stage determines the shape of the partition cell. To decorrelate the virtual image, SPIHT uses a two-dimensional discrete wavelet transform that exploits the correlation by filtering the image first along the rows and then along the columns. This fact provides a reliable guideline defining N₁and N₂as well as the shape of the two-dimensional array, i.e. L₁and L₂, inherent in the bijection mapping. We have found experimentally for the data to be considered that the correlation existing in a 1×N₂×L cell is better captured by the wavelet decomposition if each 1×N₂×L cell is mapped into a 1×N₂L array. Thus, N₂consecutive vector-pixels in the same row are mapped into a 1×N₂L array. This leads to a bijection mapping B:R^1×N²^L→R^1×LN²for each 3D cell, yielding a virtual image of size M₁×LM₂. For simplicity, we assume that M₂/N₂is an integer. Taking into account the two-dimensional transform used in the proposed compression algorithm, the bijection mapping of a

M₁×M₂multispectral image f(m) with L bands reduces to B:R^M¹^×M²^×L→R^M¹^×LM²,G=B(f),

where

$\begin{matrix} B (f) = [\begin{matrix} P [\begin{matrix} σ f (1, 1) + μ \\ ⋮ \\ σ f (1, N_{2}) + μ \end{matrix}] & \dots & P [\begin{matrix} σ f (1, M_{2} - N_{2} + 1)) + μ \\ ⋮ \\ σ f (1, M_{2})) + μ \end{matrix}] \\ ⋮ & ⋰ & ⋮ \\ P [\begin{matrix} σ f (M_{1}, 1) + μ \\ ⋮ \\ σ f (M_{1}, N_{2} + μ \end{matrix}] & \dots & P [\begin{matrix} σ f (M_{1}, M_{2} - N_{2} + 1) + μ \\ ⋮ \\ σ f (M_{1}, M_{2}) + μ \end{matrix}] \end{matrix}], & (5) \end{matrix}$

where f(m₁,m₂) is the vector-valued pixel at (m₁,m₂), μ=[μ₁, μ₂, . . . , μ_L]^Tand σ=diag(σ₁, σ₂, . . . , σ_L) normalize the data set, where σ_kand μ_kare chosen such that the resultant normalized spectral components have the same dynamic range. It is shown in Table 2 that σ_kand μ_kare given respectively by:

$\begin{matrix} σ_{k} = \frac{cov (f_{k}, \overline{f})}{var (f_{k})} μ_{k} = \overset{=}{f} - σ_{k} f_{k}, & (6) \end{matrix}$

where cov(x,y), var(x) and x denote the covariance, variance and mean respectively, and

$\overline{f} = \frac{\sum_{k = 1}^{L} f_{k}}{L}$

is the image obtained by averaging the different spectrum components. Note that in the above equations the spatial index m=(m₁,m₂) has been dropped for simplicity.

P in (5) is an LN₂×LN₂matrix that contains exactly one 1 in each row and column. P is referred here to as a permutation matrix since it defines the re-ordering of the pixels into the array G. FIG. 3(a) shows the case when N₂=2 and L=4. Note in FIG. 3(a) that, for the shaded block, the vector-valued pixels are f(1,1)=[a₁, b₁, c₁, d₁]^Tand f(1,2)=[a₂, b₂, c₂, d₂]^T, σ=I₂, μ=[0 0]^Tand the permutation matrix for this example is

$\begin{matrix} P = [\begin{matrix} J_{4} & O_{4} \\ O_{4} & I_{4} \end{matrix}], & (7) \end{matrix}$

where I_Nis an N×N identity matrix, 0_Nis an N×N all zero matrix and J_Nis an N×N right-to-left diagonal matrix. Note also that

$P [\begin{matrix} f & (1, 1) \\ f & (1, 2) \end{matrix}]$

gives re-ordering of the shaded samples into the two-dimensional array. Each permutation matrix defines a different re-ordering and therefore a different bijection mapping. The permutation matrix P defining the re-ordering of the samples thus characterizes a bijection mapping. A fundamental problem is thus the design of P that leads to the highest compression attainable. For a 1×N₂×L cell, there exists (LN₂)! different re-orderings of the samples inside a cell.

This, consequently, leads to (LN₂)! different M₁×LM₂virtual images. Some of them are more suitable for compression than others. We expect those mappings with the pixels arranged to maximize low frequency correlation to be the best for compression. The question that naturally arises is which one of the (LN₂)! bijection mappings generates virtual images that are most suitable for compression. Under the energy compaction viewpoint, the optimal bijection mapping is the one that compacts the most energy onto the low frequency components in the transform domain. Thus, the optimal bijection mapping, B(P) is selected such that for a fixed wavelet transform, where E_cis the energy compaction

B₀=max_pE_c(B(P)), (8)

defined in (4). Since the total energy of a multispectral image does not depend on the bijection mapping nor on the wavelet transform used, (8) reduces to maximizing the energy compacted in the low-frequency components of the wavelet domain. If the wavelet transform is applied to an image that is smooth, having most of its energy in the lowpass band of the spectrum, the resultant energy compaction is high. On the other hand, if the energy of an image is concentrated in the highpass band of the spectrum, the wavelet transform will have more of the energy distributed on the high-frequency components of the wavelet domain, resulting in poor energy compaction.

To illustrate this point better, FIG. 5 shows two different bijection mappings of a bispectral FLIR image. Although the images outputted by the two different bijection mappings look similar, they have markedly different characteristics. To observe their differences better, FIG. 6(a) and FIG. 6(c) depict horizontal scans marked as white lines in FIG. 5. Note that the line shown in FIG. 6(c) is smoother than the one plotted in FIG. 6(a). FIG. 6(b) and FIG. 6(d) show the same lines after the virtual images outputted by the bijection mappings have been compressed/decompressed. The reconstructed line of FIG. 6(d) is less distorted than the one depicted in FIG. 6(b). We use this observation in order to redefine the optimization problem as follows. The optimal bijection mapping B is the one that generates the smoothest virtual image which is the most suitable image for compression. As a measure of image smoothness, we use the correlation between consecutive samples in the virtual image. Finding the optimal bijection mapping reduces to finding the best re-ordering of the samples in the virtual image that maximizes the spatial correlation.

Let {tilde over (f)} be the vector formed by concatenating the normalized vector-valued pixels in a cell of size 1×N₂×L, i.e.

{tilde over (f)}=[σf(m₁,m₂)^T+μ^T,σf(m₁,m₂+1)^T+μ^T, . . . ,σf(m₁,m₂+N₂)^T+μ^T]^T.

Furthermore, let g=P{tilde over (f)} be a re-ordered vector of {tilde over (f)} produced by the permutation matrix P. We want to find the best re-ordering of the components of {tilde over (f)} such that the sum of correlations between consecutive components of the re-ordered vector is maximized. Let r_i,jbe the correlation between the i th and j th components of g_p. Since {tilde over (f)} is formed by concatenating N₂consecutive normalized vector-valued pixels, r_i,jrepresents the spectral correlation, or the spatial correlation, or the cross-correlation between pixels. The objecting function to be maximized is then given by

$\begin{matrix} \max_{all g_{p}} [\sum_{i = 1}^{L N_{2}} r_{i, i, + 1} + r_{L N_{12}, 1}] = \max_{all g_{p}} E (g_{p} {Ag}_{p}) & (9) \end{matrix}$

where r_LN₂_,1takes into account the fact that the bijection mapping of consecutive 3D-cells is concatenated according to (5). A is an LN₂×LN₂matrix operator that circularly shifts the components (rows) of a vector (matrix) one position up, i.e.,

$\begin{matrix} A = \langle \begin{matrix} 0 & 1 & 0 & \dots \\ 0 & 0 & 1 & \dots \\ ⋰ & ⋰ \\ 1 & 0 & 0 & \dots \end{matrix} \rangle & (10) \end{matrix}$

Since P{tilde over (f)}, the objective function (9) can be rewritten as a function of the permutation matrix P as follows:

max_PεΩ(E({tilde over (f)}^TP^TAP{tilde over (f)})), (11)

where Ω is the set of all permutation matrix of size LN₂. Thus, finding the best re-ordering of the components of {tilde over (f)} that produces a smooth virtual image reduces to finding the permutation matrix P that maximizes (11). The optimal permutation matrix P is determined using a heuristic search of the space Ω.

2.1 Sub-Optimal Bijection Mapping

Searching for the optimal solution of (11) can be time consuming since the search space is quite large. In fact, equation (11) has to be computed for each element in the search space Ω, and the permutation matrix that yields the maximum value is chosen as the optimal solution. This can become computational expensive since the search space increases as (LN₂)!. To overcome this limitation, we propose a sub-optimal bijection mapping that has been experimentally proved to yield good results. This sub-optimal bijection mapping is described next.

Suppose that we have a 1×N₂×L 3D-cell to be mapped into a 1×N₂L array. Without loss of generality, let us assume that L=4 and N₂is an even number. Denote the four normalized bands as A, B, C and D, and the pixels belonging to those bands as a_i, b_i, c_iand d_irespectively where i denotes the spatial position of the pixel in the cell. Let r_a, r_b, r_cand r_dbe the spatial correlations between two consecutive pixels in the bands A, B, C and D respectively. Furthermore, let r_X,Ybe the spectral correlation between the band X and Y where X □{A, B, C, D} and Y □{A, B, C, D}. The suboptimal bijection mapping follows the four-step procedure:

1. Select the pixels of the band that has the highest spatial correlation, say max(r_a, r_b, r_c, r_d)=r_a, hence

a₁
a₂
□
a_N₂

2. On each side of the pixel set attach the pixels corresponding to the band that has the highest spectral correlation with respect to the band found in the previous step, say max(r_A,B, r_A,C, r_A,D)=r_A,B, hence the re-ordering becomes

b₁
b₂
. . .
b_N₂_/2
a₁
a₂
. . .
a_N₂
b_N₂_{/2 + 1}
. . .
b_N₂

3. On each side of the resultant array insert the pixels corresponding to the band that has not been merged yet and that has the highest spectral correlation with respect to the samples added in step 2, i.e. max (r_B,C,r_B,D) defines the next band to be multiplexed. Following with the example and assuming that r_B,C>r_B,D, the re-ordering becomes

c₁
. . .
c_N₂_/2
b₁
. . .
b_N₂_/2
a₁
. . .
a_N₂
b_N₂_{/2 + 1}
. . .
b_N₂
c_N₂_{/2 + 1}
. . .
c_N₂

Repeat the previous step up to all the bands are multiplexed into a single band. Figure [REF:fig:bijection](a) shows, for this example, the sub-optimal bijection mapping for N₂=2. This four-step procedure is repeated for each 3D cell and the resultant 2D array is tiled in the way described above for the optimal solution. Note that the key in the sub-optimal bijection mapping is to keep, as close as possible, those pixels that have a high correlation. Moreover, it takes into account both the spectral and spatial correlations in defining the re-ordering of the samples. For both optimal and sub-optimal solutions, N₂has to be set to a small number (between 2 and 6). The reason for doing so lies in the fact that the spatial correlation decays rapidly as the distance between pixels increases.

3. Simulation Results

The proposed compression algorithm was applied to 4-band airborne multispectral data collected using the AISA Airborne Imaging Spectrometer. Each band is a monochrome 8 bit image of size 384×700. A 256×256 pixel area in the upper left part of each band is used as training data to compute the correlation matrix needed for the optimization algorithm. The optimal bijection mapping, B₀, is determined following the algorithm described in Section 3. The data is partitioned into 1×2×4 m cells. The bijection mapping B₀:R^1×2×4→R^1×8is then applied to each cell to form the virtual image that is compressed by the SPIHT coder at a compression rate from 0.05 bit/pixel/band to 1 bit/pixel/band. The sub-optimal bijection mapping is also determined and used here for comparison.

In order to illustrate the efficiency of the proposed compression algorithm, we include the comparison of this approach with two additional compression algorithms. The first algorithm compresses each spectral band separately using the SPIHT coder, and thus only takes advantage of the spatial redundancy in the image set. We use this compression algorithm, referred here to as the separable approach, as a benchmark to see the advantage of exploiting all the redundancy existing in a multispectral data. The second algorithm uses the two-stage approach described in the introduction. That is, the KLT is computed along the spectral dimension to obtain the 4 eigen-planes. The SPIHT coder is then used to compress each eigen-plane at a compression rate that is proportional to the logarithm of the energy of the eigen-plane. In order to compute the covariance matrix needed for the KLT, the same 256×256 area used in the proposed algorithm is used here as training data. We use the Peak Signal-to-Noise Ration (PSNR) as a distortion measure of the quality of the reconstructed image. For 8-bit images of size M₁×M₂, PSNR is defined as:

$\begin{matrix} PSNR (dB) = 10 * \log_{10} (\frac{255^{2}}{\frac{1}{M_{1} M_{2}} \sum_{i = 1}^{M_{1}} \sum_{j = 1}^{M_{2}} {(x_{i, j} - {\hat{x}}_{i, j})}^{2}}), & (8) \end{matrix}$

where x_i,jis the pixel at the i th row and j th column in the image and {circumflex over (x)}_i,jis its corresponding reconstructed pixel. PSNR is separately computed for each band. FIG. 7 shows the average PSNR obtained using the proposed algorithm (optimal and sub-optimal bijection mappings) and the algorithms described above for a compression rate between 0.05 bit/pixel/band to 1 bit/pixel/band.

Note that for high compression rates (less than 0.25 bit/pixel/band), the proposed compression algorithm outperforms the two-stage compression algorithm. Note also that the sub-optimal bijection mapping yields similar results than those found with the optimal bijection mapping. FIG. 8 shows the second reconstructed band yielded by the different compression algorithms at a compression rate of 1:100 or equivalently 0.08 bit/pixel/band. FIG. 8(a) shows the original second band, FIG. 8(b) shows the performance achieved if each band is separately compressed; i.e. not spectral correlation is exploited. FIG. 8(c) shows the reconstructed image yielded by the two-stage algorithm and FIG. 8(d) depicts the corresponding band when the proposed compression method is taken. The sub-optimal bijection mapping yields similar visual performance to that of the optimal bijection mapping; therefore, its reconstructed image is not shown. Note that in the reconstructed image obtained using the proposed algorithm, the details are preserved and the edges are well defined. By comparison, there is noticeable blurring for the other reconstructions. Note also the poor performance that is achieved when the spectral correlation is not exploited. As can be seen by comparing the images and the PSNR curve of FIG. 7 the proposed compression method is simple yet very efficient particularly at very high compression rates.

In order to see the performance of the various compression approaches acting over unobserved data, a section of the four-band multispectral data that was not used as training data is compressed using the different algorithms. The same optimal and sub-optimal bijection mappings determined using the training data are used in the compression of the unobserved data. Similarly, the same eigenvectors of the covariance matrix determined using the training data are used to compute the KLT of the unobserved data for the two-stage compression algorithm.

FIG. 9 shows the performance of the different algorithms acting on an unobserved data set. Observe that the two-stage approach is more data dependent than the proposed algorithm since the proposed approach outperforms the two-stage approach over a much wider range of compression rate. FIG. 10(a) shows the first band of the original unobserved data, and FIGS. 10 (b), (c) and (d) show the corresponding reconstructed images using the separable approach, the two-stage approach and the proposed algorithm respectively.

As a third example, we tested the algorithms on a four-component polarimetric Synthetic Aperture Radar (SAR) data which contains signal-dependent multiplicative noiseⁱ. The SAR data was collected by the EROS data center. 11(a) shows the Vertical-Horizontal (VH) polarization of the test data. FIG. 11(b) shows the performance of the separable approach. FIGS. 11(c) and (d) show the performance of the two-stage approach and the proposed approach, respectively. The reconstructed images shown in FIG. 11 are compressed 20:1. Note that at this compression the separable and two-stage algorithms exhibit ringing artifacts which are very disturbing. The PSNRs obtained using the different compression algorithms are displayed in FIG. 12. The bijection mapping onto zero-tree outperforms the other two methods by approximately 0.5 dB.

5. CONCLUSION

It will be understood by those skilled in the art that we have described a compression method for multispectral images that jointly exploits the spectral and spatial correlation inherent in a multispectral image in a simple yet efficient way. The proposed method uses a bijection mapping to project the 3D data cube into a two-dimensional array. A scalar image compression coder then compresses the obtained two-dimensional image. An optimization algorithm is developed to find the optimal parameters of the bijection mapping that leads to the most suitable virtual image for compression. The performance of the proposed compression algorithm was compared with those found using the separable approach and the traditional two-stage approach. In general, at very high compression rates, the bijection compression approach yields superior results, both perceptually and in the PSNR distortion measure.

While the present invention has been described in connection with the preferred embodiments of the various figures, it is to be understood that other similar embodiments may be used or modifications and additions may be made to the described embodiment for performing the same function of the present invention without deviating therefrom. Therefore, the present invention should not be limited to any single embodiment, but rather construed in breadth and scope in accordance with the recitation of the appended claims.

TABLE 1

Assuming a simple mathematical model for the spatial and spectral correlations, it is shown here that energy compaction can be increased by combining multispectral planes. While the model does not accurately model the correlations of multispectral imagery, it does indicate how the bijection map can improve compression. Consider a bispectral sensor measuring the reflectance of the same scene at two different frequencies. Assume that the received signals are stationary and have been sampled at the same rate. Let f(n)=[f₁(n), f₂(n)]^Tn=1, . . . N, be the bispectral image where the one-dimensional spatial indexing variable “n” is used to denote pixel location for notational simplicity. For a given spectral location, the signal forms the vector f_i=[f_i(1), f_i(2), . . . , f_i(N)]^Tfor i=1,2. The spatial correlation matrices for each spectral signal is given by

$\begin{matrix} R_{i} = E (f_{i} f_{i}^{T}) = [\begin{matrix} r_{o} + r & r & r & r \\ r & r_{o} + r & \dots & r \\ r & r & ⋰ & r \\ r & r & \dots & r_{o} + r \end{matrix}] & (13) \end{matrix}$

where a simple spatial correlation model is assumed for mathematical tractability. The correlation structure of the autocorrelation matrix may not be particularly accurate for multispectral imagery; however, the assumption of the more or less the same autocorrelation is a valid assumption as the multispectral images will be, more or less, similar. Since both f₁and f₂view the same scene, cross-spectral correlation between them exists. For mathematical tractability, assume that the cross-spectral correlation C=E(f₁f₂^T) is given by

$\begin{matrix} C = E (f_{i} f_{i}^{T}) = [\begin{matrix} ρ_{o} & ρ & \dots & ρ \\ ρ & \dots & ρ \\ ⋰ \\ ρ & ρ & \dots & ρ_{o} \end{matrix}] & (14) \end{matrix}$

where ρ₀>ρ is assumed. Again, the particular correlations may not be reasonable, but the cross-correlation structure is described exactly for this simple case i=1,2. Now, consider the two encoding strategies: (a) each signal is coded individually and (b) a virtual signal is formed by a bijection mapping followed by scalar encoding. For simplicity, the energy preserved in the reconstructed signals after encoding is used as a performance measure. Recall that a signal's energy is given by the sum of the eigenvalues of its correlation matrix. The eigenvalues of R_i, i=1,2, can be computed as:

$\begin{matrix} λ_{i} = {\begin{matrix} r_{o} & i = 1, 2, \dots, N - 1 \\ r_{o} + Nr & i = N . \end{matrix} & (15) \end{matrix}$

Note that if each signal is individually compressed at a rate 1:N, the maximum energy preserved in each signal is given by the highest eigenvalue, namely, λ₁=r₀+Nr. Now, consider the case where both signals are mapped into one larger virtual signal, g=[f₁^T, f₂^T]^T. The correlation matrix of g is given by

$\begin{matrix} R_{g} = E ({gg}^{T}) = [\begin{matrix} R_{1} & C \\ C & R_{1} \end{matrix}], & (5) \end{matrix}$

whose eigenvalues are

$\begin{matrix} λ_{i} = [\begin{matrix} r_{o} + ρ - ρ_{o} & for i = 1, \dots, N - 1 \\ r_{o} - ρ + ρ_{o} & for i = N, N_{1}, \dots, 2 N - 2 \\ - (N - 1) ρ + Nr + r_{o} - ρ_{o} & i = 2 N - 1 \\ + (N - 1) ρ + Nr + r_{o} + ρ_{o} & i = 2 N \end{matrix}] . \end{matrix}$

For a compression rate of 1:N, the maximum energy preserved is determined by the two highest eigenvalues, namely:

$\begin{matrix} Preserved Energy = {\begin{matrix} Nr + 2 r_{o} + (N - 2) ρ + 2 ρ_{o} & if (N - 2) ρ + 2 ρ_{o} > Nr \\ 2 Nr + 2 r_{o} & otherwise . \end{matrix} & (18) \end{matrix}$

From (4) and (7), the compression of the virtual image always provides higher energy preservation suggesting the potential the advantage of bijection mappings.

TABLE 2

The normalization factors given in Equation (6) are derived here. Let f_kbe k-th spectral component of a multispectral image. For notational simplicity, the pixel's spatial position index has been dropped. We want to normalize this component such that the resultant normalized component, denoted by {tilde over (f)}_k, has the same dynamic range than the other normalized components. Let f be the average of the different spectral components, that is

$\overline{f} = \frac{1}{L} \sum_{k}^{L} = {}_{1}f_{k},$

Note that the average is performed for each spatial position m=(m₁,m₂). We want to normalize f_kwith respect to f, for k=1,2 . . . L, thus all the components have the same dynamic range. Define {tilde over (f)}_k=σ_kf_k+μ_k, where σ_kand μ_kare normalization factors to be optimally chosen. Since we are also interested in increasing the similarities between normalized components, σ_kand μ_kare chosen to minimize the following cost function

minσ_k,μ_kE({tilde over (f)}_k− f)² (19)

It can be easily shown that the values of σ_kand μ_kthat minimize the MSE between the k th normalized component and the spectral average image are given by:

$σ_{k} = \frac{E (f_{k} \overline{f}) - {Ef}_{k} E \overline{f}}{{Ef}_{k}^{2} - {({Ef}_{k})}^{2}}$

$μ_{k} = E \overline{f} - σ_{k} {Ef}_{k}$

- The above equations can be rewritten as follows

$σ_{k} = \frac{cov (f_{k}, \overline{f})}{var (f_{k})}$

$μ_{k} = \overset{\underline{-}}{f} - \frac{cov (f_{k}, \overline{f})}{var (f_{k})} {\overline{f}}_{k}$

where cov(x,y), var(x) and x denote the covariance, variance and mean respectively.

Number	Name	Date	Kind
5737448	Gardos	Apr 1998	A
5963935	Ozbutun et al.	Oct 1999	A
6081800	Ozbutun et al.	Jun 2000	A
6879716	Ishibashi	Apr 2005	B1
7035457	Ishibashi	Apr 2006	B2

Bijection mapping for compression/denoising of multi-frame images

Information

Patent Number

Date Filed

Date Issued

Inventors

Original Assignees

Examiners

Agents

CPC

US Classifications

Field of Search

US

International Classifications

Term Extension

Abstract

Description

Claims

CROSS REFERENCE TO RELATED APPLICATIONS

STATEMENT OF GOVERNMENT INTEREST

US Referenced Citations (5)

Provisional Applications (1)