Bijection mapping for compression/denoising of multi-frame images

Information

  • Patent Grant
  • 7224845
  • Patent Number
    7,224,845
  • Date Filed
    Friday, February 28, 2003
    21 years ago
  • Date Issued
    Tuesday, May 29, 2007
    17 years ago
Abstract
A new approach to multispectral image compression where the intra- and cross-band correlations are jointly exploited in a surprisingly simple yet very effective manner. The key component of the algorithm is a bijection mapping of the original multispectral image into a virtual 2 dimensional scalar image. By optimally mapping the multispectral image set into a single 2 dimensional array and by subsequently applying a scalar image coding algorithm, the spatial correlation and the spectral correlation of the multispectral data set are jointly exploited. Based on the statistical characteristics of the multispectral data, the bijection mapping can be optimized to minimize the distortion introduced by the compression algorithm. The optimization reduces to the maximization of a function of the second-order statistics of the multispectral data. At high compression rates, the new algorithm outperforms traditional compression algorithms whenever the cross-band correlation is high and it yields comparable performance at low compression rates.
Description
BACKGROUND OF THE INVENTION

1. Field of the Invention


The present invention relates to algorithms and more particularly to algorithms useful in image compression.


2. Brief Description of Prior Developments


The number of systems generating multiple images of the same scene at different frequency bands and their applications have rapidly increased in the last several years. Remote sensing systems, thematic mapper sensors, and multispectral Forward Looking Infrared (FLIR) systems are only a few examples of such systems. The data product of a multispectral sensor is a stack of images for each scene referred to as a multispectral image or data cube. Each band or component of a multispectral image can be considered as a monochrome (scalar) image. Multispectral imagery provides signal diversity that can lead to more precise and accurate results than otherwise possible with traditional single spectral imagery. A major limitation of multispectral imaging systems in practice is the required manipulation and delivery of the associated massive data files. In order to store and transmit such files, very efficient compression tools are needed. Because of this need, this disclosure shall focus on a new method for lossy compression. True lossless compression is limited in performance; here, we allow the compression to be lossy but implicitly maintain control over the fidelity of the compressed imagery so we may trade off rate vs. distortion.


To date, many of the approaches for lossy multispectral image compression rely on two stages of data processing assuming that the various bands are perfectly registered. Typically, these methods first apply a 1-Dimensional Karhunen-Loève transform (KLT) along the spectral dimension leading to a set of spectrally uncorrelated eigen-images, and subsequently each of the resultant eigen-planes is compressed using a scalar image compression method such as JPEG or wavelet based compression as is disclosed in J. A. Saghri, A. G. Tescher & J. T. Reagan, “Practical transform codeing of multispectral imagery,” IEEE Signal Processing Magazine, pp. 32–43, January, 1995. The compression rate for each eigen-image is determined through a bit allocation strategy. Other 2-stage compression methods first decorrelate the data by applying a 2D frequency transform along the spatial dimensions and a 1-dimensional KLT is subsequently applied across the spectral dimension as is disclosed in D. Tretter, C. A. Bouman, “Optical transforms for multispectral and multilayer image coding,” IEEE TIP:4:296–308, March, 1995, J. N. Bradley & C. M. Brislawn; “Spectrum analysis of multispectral imagery in conjunction with wavelet/kit data compression,” Proc. ASIMOLAR Conf. 1:26–30, November, 1993; and J. Vaisey, M. Barlaud & M. Anonini, “Multispectral Image Coding Using Lattice VQ and the Wavelet Transform,” Proc. ICIP 98, Chicago, Ill., 1998. Quantization and coding is then used to encode the transform coefficients. As is also reported in “Spectrum analysis of multispectral imagery in conjunction with wavelet/klt data compression,” Proc. ASIMOLAR Conf. 1:26–30, November, 1993; and J. Vaisey, M. Barlaud & M. Anonini, “Multispectral Image Coding Using Lattice VQ and the Wavelet Transform,” Proc. ICIP 98, Chicago, Ill., 1998 at high compression rates (less than 0.2 bit/pixel/image), both methods seem to have the same performance. The two-stage approaches mentioned above, however, have three disadvantages:

    • 1. Image decorrelation is performed in a separable fashion, first along the spectral dimension and subsequently along the spatial dimension or vice versa. The separable approach is intrinsically suboptimal as the spatial and spectral statistics are not exploited jointly;
    • 2. The KLT is data dependent and consequently it must be computed for each image set resulting in an unacceptable amount of computation;
    • 3. Transformed image planes after the initial decorrelation are coded independently of the others, despite of the spatial similarities across bands. The focus of this paper is to introduce a method which exploits spatial and frequency correlation jointly.


The following example illustrates these issues. Consider a vector-valued M1×M2 multispectral image f(m))=[f1(m), f2(m), . . . , fL(m)]T with L bands. m=(m1,m2) represents a pixel's spatial position with m1=1, . . . , M1 and m2=1, . . . , M2, and 1≦k≦L is the spectral band index. The multispectral image is thus composed of M1×M2 f-vectors of length L. The image is then processed by the two-stage compression method described above, where a 1-Dimension KLT is applied along the spectral dimension. Let R=E(f(m)fT(m)) be the spectral correlation matrix and let qi and λi, i=1, . . . , L, be the eigenvectors and eigenvalues of R respectively. The KLT of f(m) along the spectral dimension is g(m)=QTf(m) where Q=[q1, q2, . . . , qL] is the L×L matrix whose columns are the eigenvectors of R. The resultant KLT coefficients form a set of images, {g1(m), g2(m), . . . , gL(M)} referred here to as eigen-images. FIG. 1 depicts the KLT of a 2-band image into a pair of decorrelated eigen-images. A three-level wavelet decomposition of each eigen-image is also shown. These images clearly show that although the eigen-images are decorrelated across bands at a particular pixel location, cross-band correlation among groups of pixels remains. The residual cross-band correlation results in a strong spatial similarity not exploited by the wavelet transforms applied to each eigen-image. In fact, FIG. 1 shows that spatial similarity remains after the wavelet transforms are applied.


An approach to overcome the shortcomings of the two-stage compression algorithm would be to design a coder that takes advantage of the intra-band and cross-band spatial similarities. A 3D (3D) zero-tree coder, for instance, can be designed to jointly exploit the intra-band and cross-band correlations inherent in a data cube as is disclosed in A. Bilgin, G. Zweig & M. W. Marcellin, “Three-dimensional Image Compression using Integer Wavelets,” Appl. Optics, 39#11:1799–1814, April, 2000. This, however, can become very complex since the number bands to be jointly encoded can be quite large. Also, the bands are typically separable in three-dimensions and the zero-tree coder spawns children at a cubic rate having implications for the production of zero-trees. Or three-dimensional data may be processed via some three-dimensional extension of a well-known transform such as the DCT G. Abousleman, M. W. Marcellin & B. Hunt, “Compression of Hyperspectral Imagery using 3-D DCT and Hybrid DPCM/DCT,” IEEE TGRS 33#1:26–34, January, 1995. Here, again, there is the question of complexity and the constraints imposed by block transformation. In reference 6, the authors back off from processing using a 3D DCT in favor of a compromise using 2D DCT and DPCM in the spectral domain.


SUMMARY OF INVENTION

The present invention is an approach to multispectral image compression where the intra- and cross-band correlations are jointly exploited in a surprisingly simple yet very effective manner. The method is based on the fact that it is more efficient to jointly encode all the components of a vector source than to encode each component separately as is disclosed in A. Segall, “Bit allocation and encoding for vector sources,” IEEE Trans. Inf. Thy. IT-22:162–169, March 1976. The key component of the algorithm is a simple transformation, a bijection mapping of the original multispectral image into a virtual two-dimensional (2D) scalar image. In this research, we combine the bijection mapping with a very efficient scalar image-coding algorithm, the SPIHT codec as is disclosed in A. Bilgin, G. Zweig & M. W. Marcellin, “Three-dimensional Image Compression using Integer Wavelets,” Appl. Optics, 39#11:1799–1814, April, 2000. The SPIHT codec need not be modified for 3D operation. The choice of the bijection mapping will determine the way in which spatial and spectral information are combined: there is no intrinsic limit in number of bands processed at a time nor in the way spectral and spatial pixels are combined. Combining the bijection mapping with the SPIHT codec gives a no frills, low complexity method for processing multispectral data. This may be improved by selecting the optimal bijection mapping which is equivalent to the problem of finding a permutation map which reduces activity across pixels to effect pixel prediction. This latter optimization problem has been investigated largely in the context of lossless compression, as is disclosed in S. Tate, “Band Ordering in Lossless Compression of Multispectral Images,” IEEE TCOM 46#4:477–483, April, 1997; J. Wang, K. Zhang & S. Tang, “Spectral and Saptial Decorrelation of Landsat-M Data for Lossless Compression,” IEEE GRS 33#5:1277–1285, September, 1995; N. Sidiropoulos, M. Pattichis, A. Bovik & J. Havlicek, “COPERM: Transform-Domain Energy Compaction by Optimal Permutation,” IEEE TSP 47#6:1679–1688, June, 1999; and Z. Arnavut, “Applications for inversions to lossless image compression,”, Opt. Eng. 36#4:1028–1034, April, 1997. Here we shall propose another optimization approach with the goal of optimizing lossy compression. Activity from pixel to pixel is reduced for the purpose of introducing zero-trees into the coded stream thereby increasing compression without increasing distortion. By optimally mapping the multispectral image set into a single 2D-array and by subsequently applying a 2D-decomposition, the spatial correlation as well as the spectral correlation are jointly exploited. In this manner, the self-similarities across the different scales of the virtual image are exploited. Based on the statistical characteristics of the multispectral data, the bijection mapping can be optimized to minimize the distortion introduced by the compression algorithm. The optimization proposed here reduces to the maximization of a function that depends on the second-order statistics of the multispectral data.





BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is further described with reference to the accompanying drawings wherein:



FIG. 1 is a schematic drawing showing a two-stage approach for compression of a multispectral, two-color FLIR image, wherein spectral transform is followed by 2-D DWT and the multispectral image is a two-color focal plane array LWIR/MWIR;



FIG. 2 is a schematic drawing showing a bijection mapping of a multispectral image into a virtual 2D image, wherein L1L2=L;



FIG. 3 is a schematic drawing showing two bijection mappings of a 1×2×4 (4-band) stack into (a) a 1×8 array, and (b) a 2×4 array;



FIGS. 4(
a) and 4(b) are flow charts, wherein FIG. 4(a) shows a compression algorithm: bijection mapping follow by SPIHT encoder, and FIG. 4(b) shows a decompression algorithm: SPIT decoder followed by inverse bijection mapping;



FIGS. 5(
a) and (b) are schematic drawings showing two different bijection mappings of the bispectral FLIR image;



FIGS. 6(
a)–6(d) are graphs of lines of images ouputted by bijection mapping in FIG. 5, wherein FIG. 6(a) depicts the marked line in FIG. 5(a) and FIG. 6(c) depicts the marked line in FIG. 5(b), and FIGS. 6(b) and 6(d) depict the same lines after images are compressed;



FIG. 7 is a graph of average PSNR of the four reconstructed bands achieved by the various compression algorithm;



FIGS. 8(
a)–8(d) show the second band of the reconstructed images at 0.08 bit/pixel/band wherein FIG. 8(a) is the original band, FIG. 8(b) shows a separable approach, FIG. 8(c) is from a traditional two-stage algorithm, and FIG. 8(d) is from bijection mapping onto zeros trees;



FIG. 9 is a graph showing performance of the different compression algorithms acting on unobserved data;



FIGS. 10(
a)–10(d) show the first band of the non-training image at 0.1 bit/pixel/band, wherein FIG. 10(a) is the original, FIG. 10(b) shows a separable approach, FIG. 10(c) is from a traditional two-stage approach, and FIG. 10(d) is from bijection mapping onto zero trees;



FIGS. 11(
a)–11(d) show band VH of the polarimetric SAR data at 0.4 bit/pixel/band, wherein FIG. 11(a) is the original image, FIG. 11(b) is from a separable approach, FIG. 11(c) is from a two-stage approach, and FIG. 11(d) is from bijection mapping onto zero-trees; and



FIG. 12 is a graph showing coding performance achieved by the various algorithms on Polarimetric SAR data.





DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
1. Multispectral Image Compression Using Bijection Mappings

1.1 Bijection Mappings


A bijection mapping is a one-to-one onto function that maps set A to set B. The set A is called the domain of φ(x) and the set of all elements φ(x)εB is called the range of φ(x). The one-to-one correspondence between an element in A and an element in B assures that the function is invertible. That is, if εB, φ−1(y) is the set of all xεA such that φ(x)=y. φ−1(x) is called the inverse of the function φ(x). In our application, a special type of bijection mapping is used. Let RM1×M2 be the set of M1×M2 arrays defined over the set of real numbers R. Furthermore, let RM1×M2×L be the set of M1×M2 arrays where each element in the array is a L-component vector defined over the set of real numbers. For instance, if fεRM1×M2×L, then f has the following structure










f
=

[




f


(

1
,
1

)





f


(

1
,
2

)








f


(

1
,

M
2


)







f


(

2
,
1

)





f


(

2
,
2

)








f


(

2
,

M
2


)



























f


(


M
1

,
1

)





f


(


M
1

,
2

)








f


(


M
1

,

M
2


)





]


,




(
1
)








where f(m1,m2)=[f1(m1,m2),f2(m1,m2) . . . fL(m1,m2)]T is an L-component vector and where each element of f(m1,m2) is defined over the set of real numbers, i.e. fk(m1,m2)εR. A M1×M2 multispectral image with L bands is an example of an element in RM1×M2×L, whereas each M1×M2 monochrome (scalar) image belongs to the set RM1×M2. As shown in FIG. 2, the bijection mappings utilized in this paper are defined as a one-to-one permutation of the elements in RM1×M2×L onto the set RL1M1×L2M2. That is,

B:RM1×M2×L→RL1M1×L2M2,B(fk(m1,m2))=G(i,j)  (2)

m1=1, 2, . . . ,M1; m2=1, 2, . . . , M2; k=1, 2, . . . , L; i=1, 2, . . . , L1M1, j=1, 2, . . . , L2M2

where GεRL1M1×L21M2 is the range of the bijection mapping B and RM×M2×L1 is its domain. L1 and L2 are integers such that L=.L1.L2. The bijection mapping of a multispectral image is thus defined as a mapping of the 3D data stack into a 2D virtual image. This mapping includes normalization such that neighboring pixels in the virtual image have similar dynamic range. The mapping is invertible since the original data cube can be perfectly reconstructed from the virtual two-dimensional array. Each array defined in RL1M1×L2M2 is related to an element in RM1×M2×L through the inverse bijection mapping:

B−1:RL1M1×L2M2→RM1×M2×L,B−1(G(i,j))=fk(m1,m2).  (3)

In summary, RM1×M2×L represents the set of M1×M2 multispectral images with L bands, B(RM1×M2×L) represents the associated set of two-dimensional arrays.


1.2 Bijection Mappings onto Zero-Trees


For an array fεRM1×M2×L, there are (LM1M2)! possible bijection mappings B:RM1×M2×L→RL1M1×L2M2 for a given L=L1L2. With most multispectral images having spatial dimensions larger than 512×512, direct bijection mappings of the multispectral images themselves prove to be computationally prohibitive. While the number of mathematically conceivable bijections is large, the number useful for the application at hand is much smaller. We expect only those bijections that map a small, spatially contiguous, vector-pixel set to be relevant for compression because the correlations among the associated pixels will tend to be large. Therefore, a simpler approach to handle large image files is to break the multispectral image stack into smaller non-overlapping sub-images of depth L. That is, the spatial domain is partitioned while the spectral dimension is not, as L is typically in the order of 15 or less for multispectral images. The bijection mapping of a multispectral image thus reduces to the bijection mapping of N1×N2×L image cells as depicted in FIG. 3. Assuming that the images are homogeneous, all cells are treated equally and the same bijection mapping is applied to all image cells. This concept could be generalized where various types of bijection mappings are applied to different image cells depending on their statistical characteristics. For the sake of simplicity, at this time, we treat all cells uniformly. The multispectral image is first divided into cells of size N1×N2×L. Each 3D cell is then subjected, independently of the others, to a bijection mapping. The resultant two-dimensional arrays are placed into a large single array according to the position of the original 3D cells. FIG. 3 shows two different bijection mappings of a four-band cube, where the size of each cell is 1×2×4. In FIG. 3(a), each cell is mapped to a 1×8 array, i.e., B:R1×2×4→R1×8. FIG. 3(b) shows a different bijection mapping where each cell is mapped to a 2×4 array. That is B:R1×2×4→R2×4. The shaded pixels are the bijection mapping of the first 3D cell, upper-left corner. The unshaded pixels show the bijection mapping of the neighboring block. Note that the locations of both two-dimensional arrays are in accordance with the position of the 3D cells. For simplicity, only a few pixels of the cube are depicted and no normalization of the data is depicted at this time. In a practical implementation the data has to be normalized such that the various components are given the same dynamic range. Having defined the image partition into smaller cells in RN1×N2×L, the proposed compression algorithm consists of:

    • 1. Mapping each cell by B:RN1×N2×L→RL1N1×L2N2 into the L1N1×L2N2 array.
    • 2. Tiling the obtained arrays into a virtual image I.
    • 3. Encoding the virtual image I by a zero-tree coder.
    • 4. Attaching the type of bijection mapping used as a header.

      FIG. 4 shows the proposed compression algorithm. Note that the compressed image is a binary bit string whose length, denoted by S in FIG. 4, is determined by the compression ratio. Note also that the last step of the proposed algorithm, not shown in FIG. 4, adds information needed by the decoder to reconstruct the multispectral data. This overhead, however, is negligible compared to the amount of compressed data. As it will be shown shortly, a permutation matrix that defines the re-ordering of the data pixels can compactly represent the bijection mapping. (Figure to be included)


      The multispectral image decoder reverses the encoding process by:
    • 1. Extracting the type of bijection mapping from the file header.
    • 2. Reconstructing the virtual image Î with the zero-tree decoder.
    • 3. Partitioning the virtual image Î into L1N1×L2N2 cells and mapping back each cell by B−1:RL1N1×L2N2→RN1×N2×L, to form the reconstructed 3D cells.
    • 4. Concatenating the obtained cells to form the multispectral image.


      Given the bijection mapping onto the zero-tree codec shown in FIG. 4, the following questions arise.
    • 1. How are N1 and N2 chosen so as to maximize the attained energy compaction?
    • 2. Given N1 and N2 there are (LN1N2)! possible bijection mapping. How does one select a bijection mapping so as to maximize the energy compaction?


      Energy compaction, Ec, is defined as where ET is the total energy of the multi spectral image










E
c

=


E
u


E
T






(
4
)








and Ell is the energy compacted in the approximation of the wavelet decomposition. In compression, higher Ec generally translates into smaller distortion in the reconstructed data. The answer to the first question reduces to how much computational complexity can we tolerate? If N1>Ni, i=1,2, and letting {B(N1,N2)} be the set of all possible bijection mappings of a N1×N2 cell, then {B(N1, N2)}⊂{B(N′1, N′2). Hence, we will always do equal or better as N1 and N2 are incremented. The real issue is, how much gain does one attain as Ni is incremented to Ni+1, Ni+2, because the size of the search space increases as (LN1N2)!. It is conceivable that an optimum may be achieved for small values of N1, N2 in which case picking a larger cell adds computation. As N1 is incremented to N1+1, an additional (LN2) possible bijection mapping are added to the search space. A related and perhaps more important question is: if only one parameter is incremented, either N1 or N2, which one should be chosen. In summary, the performance gain as the size of N1 and N2 are monotonically increased, is somewhat reminiscent to the order of an AR model. In fact, some of the same issues that arise on selecting the order of an AR model also arise here such as overfitting, complexity, etc. The second question is perhaps the most important since it is directly related to energy compaction. Having fixed N1 and N2 which one of the (LN1N2)! bijection mappings will maximize the energy compaction? Answers to these questions will be given in Section 3. Table 1 shows through a simple numerical example the performance gain that is achieved using bijection mapping.


2. Optimal Bijection Mappings

The approach to the design of the image partition size N1×N2 is through heuristics, although other more elaborated approaches can be used. In the heuristic approach, the type of the 2D transform used in the encoding stage determines the shape of the partition cell. To decorrelate the virtual image, SPIHT uses a two-dimensional discrete wavelet transform that exploits the correlation by filtering the image first along the rows and then along the columns. This fact provides a reliable guideline defining N1 and N2 as well as the shape of the two-dimensional array, i.e. L1 and L2, inherent in the bijection mapping. We have found experimentally for the data to be considered that the correlation existing in a 1×N2×L cell is better captured by the wavelet decomposition if each 1×N2×L cell is mapped into a 1×N2L array. Thus, N2 consecutive vector-pixels in the same row are mapped into a 1×N2L array. This leads to a bijection mapping B:R1×N2L→R1×LN2 for each 3D cell, yielding a virtual image of size M1×LM2. For simplicity, we assume that M2/N2 is an integer. Taking into account the two-dimensional transform used in the proposed compression algorithm, the bijection mapping of a


M1×M2 multispectral image f(m) with L bands reduces to B:RM1×M2×L→RM1×LM2,G=B(f),


where











B


(
f
)


=

[




P


[





σ






f


(

1
,
1

)



+
μ












σ






f


(

1
,

N
2


)



+
μ




]








P


[






σ






f


(

1
,


M
2

-

N
2

+
1


)



)

+
μ













σ






f


(

1
,

M
2


)



)

+
μ




]


















P


[





σ






f


(


M
1

,
1

)



+
μ











σ






f
(


M
1

,


N
2

+
μ







]








P


[





σ






f


(


M
1

,


M
2

-

N
2

+
1


)



+
μ












σ






f


(


M
1

,

M
2


)



+
μ




]





]


,




(
5
)








where f(m1,m2) is the vector-valued pixel at (m1,m2), μ=[μ1, μ2, . . . , μL]T and σ=diag(σ1, σ2, . . . , σL) normalize the data set, where σk and μk are chosen such that the resultant normalized spectral components have the same dynamic range. It is shown in Table 2 that σk and μk are given respectively by:











σ
k

=


cov


(


f
k

,

f
_


)



var


(

f
k

)












μ
k

=


f
=

-


σ
k



f
k




,





(
6
)








where cov(x,y), var(x) and x denote the covariance, variance and mean respectively, and







f
_

=





k
=
1

L



f
k


L






is the image obtained by averaging the different spectrum components. Note that in the above equations the spatial index m=(m1,m2) has been dropped for simplicity.


P in (5) is an LN2×LN2 matrix that contains exactly one 1 in each row and column. P is referred here to as a permutation matrix since it defines the re-ordering of the pixels into the array G. FIG. 3(a) shows the case when N2=2 and L=4. Note in FIG. 3(a) that, for the shaded block, the vector-valued pixels are f(1,1)=[a1, b1, c1, d1]T and f(1,2)=[a2, b2, c2, d2]T, σ=I2, μ=[0 0]T and the permutation matrix for this example is










P
=

[




J
4




O
4






O
4




I
4




]


,




(
7
)








where IN is an N×N identity matrix, 0N is an N×N all zero matrix and JN is an N×N right-to-left diagonal matrix. Note also that






P


[



f



(

1
,
1

)





f



(

1
,
2

)




]






gives re-ordering of the shaded samples into the two-dimensional array. Each permutation matrix defines a different re-ordering and therefore a different bijection mapping. The permutation matrix P defining the re-ordering of the samples thus characterizes a bijection mapping. A fundamental problem is thus the design of P that leads to the highest compression attainable. For a 1×N2×L cell, there exists (LN2)! different re-orderings of the samples inside a cell.


This, consequently, leads to (LN2)! different M1×LM2 virtual images. Some of them are more suitable for compression than others. We expect those mappings with the pixels arranged to maximize low frequency correlation to be the best for compression. The question that naturally arises is which one of the (LN2)! bijection mappings generates virtual images that are most suitable for compression. Under the energy compaction viewpoint, the optimal bijection mapping is the one that compacts the most energy onto the low frequency components in the transform domain. Thus, the optimal bijection mapping, B(P) is selected such that for a fixed wavelet transform, where Ec is the energy compaction

B0=maxpEc(B(P)),  (8)

defined in (4). Since the total energy of a multispectral image does not depend on the bijection mapping nor on the wavelet transform used, (8) reduces to maximizing the energy compacted in the low-frequency components of the wavelet domain. If the wavelet transform is applied to an image that is smooth, having most of its energy in the lowpass band of the spectrum, the resultant energy compaction is high. On the other hand, if the energy of an image is concentrated in the highpass band of the spectrum, the wavelet transform will have more of the energy distributed on the high-frequency components of the wavelet domain, resulting in poor energy compaction.


To illustrate this point better, FIG. 5 shows two different bijection mappings of a bispectral FLIR image. Although the images outputted by the two different bijection mappings look similar, they have markedly different characteristics. To observe their differences better, FIG. 6(a) and FIG. 6(c) depict horizontal scans marked as white lines in FIG. 5. Note that the line shown in FIG. 6(c) is smoother than the one plotted in FIG. 6(a). FIG. 6(b) and FIG. 6(d) show the same lines after the virtual images outputted by the bijection mappings have been compressed/decompressed. The reconstructed line of FIG. 6(d) is less distorted than the one depicted in FIG. 6(b). We use this observation in order to redefine the optimization problem as follows. The optimal bijection mapping B is the one that generates the smoothest virtual image which is the most suitable image for compression. As a measure of image smoothness, we use the correlation between consecutive samples in the virtual image. Finding the optimal bijection mapping reduces to finding the best re-ordering of the samples in the virtual image that maximizes the spatial correlation.


Let {tilde over (f)} be the vector formed by concatenating the normalized vector-valued pixels in a cell of size 1×N2×L, i.e.

{tilde over (f)}=[σf(m1,m2)TT,σf(m1,m2+1)TT, . . . ,σf(m1,m2+N2)TT]T.

Furthermore, let g=P{tilde over (f)} be a re-ordered vector of {tilde over (f)} produced by the permutation matrix P. We want to find the best re-ordering of the components of {tilde over (f)} such that the sum of correlations between consecutive components of the re-ordered vector is maximized. Let ri,j be the correlation between the i th and j th components of gp. Since {tilde over (f)} is formed by concatenating N2 consecutive normalized vector-valued pixels, ri,j represents the spectral correlation, or the spatial correlation, or the cross-correlation between pixels. The objecting function to be maximized is then given by











max

all






g
p





[





i
=
1


L






N
2





r

i
,
i
,

+
1




+

r


L






N
12


,
1



]


=


max

all






g
p









E


(


g
p



Ag
p


)







(
9
)








where rLN2,1 takes into account the fact that the bijection mapping of consecutive 3D-cells is concatenated according to (5). A is an LN2×LN2 matrix operator that circularly shifts the components (rows) of a vector (matrix) one position up, i.e.,









A
=





0


1


0







0


0


1

























1


0


0












(
10
)








Since P{tilde over (f)}, the objective function (9) can be rewritten as a function of the permutation matrix P as follows:

maxPεΩ(E({tilde over (f)}TPTAP{tilde over (f)})),  (11)

where Ω is the set of all permutation matrix of size LN2. Thus, finding the best re-ordering of the components of {tilde over (f)} that produces a smooth virtual image reduces to finding the permutation matrix P that maximizes (11). The optimal permutation matrix P is determined using a heuristic search of the space Ω.


2.1 Sub-Optimal Bijection Mapping


Searching for the optimal solution of (11) can be time consuming since the search space is quite large. In fact, equation (11) has to be computed for each element in the search space Ω, and the permutation matrix that yields the maximum value is chosen as the optimal solution. This can become computational expensive since the search space increases as (LN2)!. To overcome this limitation, we propose a sub-optimal bijection mapping that has been experimentally proved to yield good results. This sub-optimal bijection mapping is described next.


Suppose that we have a 1×N2×L 3D-cell to be mapped into a 1×N2L array. Without loss of generality, let us assume that L=4 and N2 is an even number. Denote the four normalized bands as A, B, C and D, and the pixels belonging to those bands as ai, bi, ci and di respectively where i denotes the spatial position of the pixel in the cell. Let ra, rb, rc and rd be the spatial correlations between two consecutive pixels in the bands A, B, C and D respectively. Furthermore, let rX,Y be the spectral correlation between the band X and Y where X □{A, B, C, D} and Y □{A, B, C, D}. The suboptimal bijection mapping follows the four-step procedure:

  • 1. Select the pixels of the band that has the highest spatial correlation, say max(ra, rb, rc, rd)=ra, hence




















a1
a2

aN2










  • 2. On each side of the pixel set attach the pixels corresponding to the band that has the highest spectral correlation with respect to the band found in the previous step, say max(rA,B, rA,C, rA,D)=rA,B, hence the re-ordering becomes

























b1
b2
. . .
bN2/2
a1
a2
. . .
aN2
bN2/2 + 1
. . .
bN2









  • 3. On each side of the resultant array insert the pixels corresponding to the band that has not been merged yet and that has the highest spectral correlation with respect to the samples added in step 2, i.e. max (rB,C,rB,D) defines the next band to be multiplexed. Following with the example and assuming that rB,C>rB,D, the re-ordering becomes





























c1
. . .
cN2/2
b1
. . .
bN2/2
a1
. . .
aN2
bN2/2 + 1
. . .
bN2
cN2/2 + 1
. . .
cN2










Repeat the previous step up to all the bands are multiplexed into a single band. Figure [REF:fig:bijection](a) shows, for this example, the sub-optimal bijection mapping for N2=2. This four-step procedure is repeated for each 3D cell and the resultant 2D array is tiled in the way described above for the optimal solution. Note that the key in the sub-optimal bijection mapping is to keep, as close as possible, those pixels that have a high correlation. Moreover, it takes into account both the spectral and spatial correlations in defining the re-ordering of the samples. For both optimal and sub-optimal solutions, N2 has to be set to a small number (between 2 and 6). The reason for doing so lies in the fact that the spatial correlation decays rapidly as the distance between pixels increases.


3. Simulation Results

The proposed compression algorithm was applied to 4-band airborne multispectral data collected using the AISA Airborne Imaging Spectrometer. Each band is a monochrome 8 bit image of size 384×700. A 256×256 pixel area in the upper left part of each band is used as training data to compute the correlation matrix needed for the optimization algorithm. The optimal bijection mapping, B0, is determined following the algorithm described in Section 3. The data is partitioned into 1×2×4 m cells. The bijection mapping B0:R1×2×4→R1×8 is then applied to each cell to form the virtual image that is compressed by the SPIHT coder at a compression rate from 0.05 bit/pixel/band to 1 bit/pixel/band. The sub-optimal bijection mapping is also determined and used here for comparison.


In order to illustrate the efficiency of the proposed compression algorithm, we include the comparison of this approach with two additional compression algorithms. The first algorithm compresses each spectral band separately using the SPIHT coder, and thus only takes advantage of the spatial redundancy in the image set. We use this compression algorithm, referred here to as the separable approach, as a benchmark to see the advantage of exploiting all the redundancy existing in a multispectral data. The second algorithm uses the two-stage approach described in the introduction. That is, the KLT is computed along the spectral dimension to obtain the 4 eigen-planes. The SPIHT coder is then used to compress each eigen-plane at a compression rate that is proportional to the logarithm of the energy of the eigen-plane. In order to compute the covariance matrix needed for the KLT, the same 256×256 area used in the proposed algorithm is used here as training data. We use the Peak Signal-to-Noise Ration (PSNR) as a distortion measure of the quality of the reconstructed image. For 8-bit images of size M1×M2, PSNR is defined as:











PSNR


(
dB
)


=

10
*


log
10

(


255
2



1


M
1



M
2








i
=
1


M
1







j
=
1


M
2





(


x

i
,
j


-


x
^


i
,
j



)

2





)



,




(
8
)








where xi,j is the pixel at the i th row and j th column in the image and {circumflex over (x)}i,j is its corresponding reconstructed pixel. PSNR is separately computed for each band. FIG. 7 shows the average PSNR obtained using the proposed algorithm (optimal and sub-optimal bijection mappings) and the algorithms described above for a compression rate between 0.05 bit/pixel/band to 1 bit/pixel/band.


Note that for high compression rates (less than 0.25 bit/pixel/band), the proposed compression algorithm outperforms the two-stage compression algorithm. Note also that the sub-optimal bijection mapping yields similar results than those found with the optimal bijection mapping. FIG. 8 shows the second reconstructed band yielded by the different compression algorithms at a compression rate of 1:100 or equivalently 0.08 bit/pixel/band. FIG. 8(a) shows the original second band, FIG. 8(b) shows the performance achieved if each band is separately compressed; i.e. not spectral correlation is exploited. FIG. 8(c) shows the reconstructed image yielded by the two-stage algorithm and FIG. 8(d) depicts the corresponding band when the proposed compression method is taken. The sub-optimal bijection mapping yields similar visual performance to that of the optimal bijection mapping; therefore, its reconstructed image is not shown. Note that in the reconstructed image obtained using the proposed algorithm, the details are preserved and the edges are well defined. By comparison, there is noticeable blurring for the other reconstructions. Note also the poor performance that is achieved when the spectral correlation is not exploited. As can be seen by comparing the images and the PSNR curve of FIG. 7 the proposed compression method is simple yet very efficient particularly at very high compression rates.


In order to see the performance of the various compression approaches acting over unobserved data, a section of the four-band multispectral data that was not used as training data is compressed using the different algorithms. The same optimal and sub-optimal bijection mappings determined using the training data are used in the compression of the unobserved data. Similarly, the same eigenvectors of the covariance matrix determined using the training data are used to compute the KLT of the unobserved data for the two-stage compression algorithm.



FIG. 9 shows the performance of the different algorithms acting on an unobserved data set. Observe that the two-stage approach is more data dependent than the proposed algorithm since the proposed approach outperforms the two-stage approach over a much wider range of compression rate. FIG. 10(a) shows the first band of the original unobserved data, and FIGS. 10 (b), (c) and (d) show the corresponding reconstructed images using the separable approach, the two-stage approach and the proposed algorithm respectively.


As a third example, we tested the algorithms on a four-component polarimetric Synthetic Aperture Radar (SAR) data which contains signal-dependent multiplicative noisei. The SAR data was collected by the EROS data center. 11(a) shows the Vertical-Horizontal (VH) polarization of the test data. FIG. 11(b) shows the performance of the separable approach. FIGS. 11(c) and (d) show the performance of the two-stage approach and the proposed approach, respectively. The reconstructed images shown in FIG. 11 are compressed 20:1. Note that at this compression the separable and two-stage algorithms exhibit ringing artifacts which are very disturbing. The PSNRs obtained using the different compression algorithms are displayed in FIG. 12. The bijection mapping onto zero-tree outperforms the other two methods by approximately 0.5 dB.


5. CONCLUSION

It will be understood by those skilled in the art that we have described a compression method for multispectral images that jointly exploits the spectral and spatial correlation inherent in a multispectral image in a simple yet efficient way. The proposed method uses a bijection mapping to project the 3D data cube into a two-dimensional array. A scalar image compression coder then compresses the obtained two-dimensional image. An optimization algorithm is developed to find the optimal parameters of the bijection mapping that leads to the most suitable virtual image for compression. The performance of the proposed compression algorithm was compared with those found using the separable approach and the traditional two-stage approach. In general, at very high compression rates, the bijection compression approach yields superior results, both perceptually and in the PSNR distortion measure.


While the present invention has been described in connection with the preferred embodiments of the various figures, it is to be understood that other similar embodiments may be used or modifications and additions may be made to the described embodiment for performing the same function of the present invention without deviating therefrom. Therefore, the present invention should not be limited to any single embodiment, but rather construed in breadth and scope in accordance with the recitation of the appended claims.


TABLE 1

Assuming a simple mathematical model for the spatial and spectral correlations, it is shown here that energy compaction can be increased by combining multispectral planes. While the model does not accurately model the correlations of multispectral imagery, it does indicate how the bijection map can improve compression. Consider a bispectral sensor measuring the reflectance of the same scene at two different frequencies. Assume that the received signals are stationary and have been sampled at the same rate. Let f(n)=[f1(n), f2(n)]T n=1, . . . N, be the bispectral image where the one-dimensional spatial indexing variable “n” is used to denote pixel location for notational simplicity. For a given spectral location, the signal forms the vector fi=[fi(1), fi(2), . . . , fi(N)]T for i=1,2. The spatial correlation matrices for each spectral signal is given by










R
i

=


E


(


f
i



f
i
T


)


=

[





r
o

+
r



r


r


r




r




r
o

+
r






r




r


r





r




r


r







r
o

+
r




]






(
13
)








where a simple spatial correlation model is assumed for mathematical tractability. The correlation structure of the autocorrelation matrix may not be particularly accurate for multispectral imagery; however, the assumption of the more or less the same autocorrelation is a valid assumption as the multispectral images will be, more or less, similar. Since both f1 and f2 view the same scene, cross-spectral correlation between them exists. For mathematical tractability, assume that the cross-spectral correlation C=E(f1f2T) is given by









C
=


E


(


f
i



f
i
T


)


=

[




ρ
o



ρ





ρ




ρ










ρ
























ρ


ρ






ρ
o




]






(
14
)








where ρ0>ρ is assumed. Again, the particular correlations may not be reasonable, but the cross-correlation structure is described exactly for this simple case i=1,2. Now, consider the two encoding strategies: (a) each signal is coded individually and (b) a virtual signal is formed by a bijection mapping followed by scalar encoding. For simplicity, the energy preserved in the reconstructed signals after encoding is used as a performance measure. Recall that a signal's energy is given by the sum of the eigenvalues of its correlation matrix. The eigenvalues of Ri, i=1,2, can be computed as:










λ
i

=

{




r
o





i
=
1

,
2
,





,

N
-
1








r
o

+
Nr




i
=

N
.










(
15
)








Note that if each signal is individually compressed at a rate 1:N, the maximum energy preserved in each signal is given by the highest eigenvalue, namely, λ1=r0+Nr. Now, consider the case where both signals are mapped into one larger virtual signal, g=[f1T, f2T]T. The correlation matrix of g is given by











R
g

=


E


(

gg
T

)


=

[




R
1



C




C



R
1




]



,




(
5
)








whose eigenvalues are










λ
i

=


[





r
o

+
ρ
-

ρ
o







for





i

=
1

,





,

N
-
1








r
o

-
ρ
+

ρ
o







for





i

=
N

,

N
1

,





,






2





N

-
2









-

(

N
-
1

)



ρ

+
Nr
+

r
o

-

ρ
o





i
=


2





N

-
1









+

(

N
-
1

)



ρ

+
Nr
+

r
o

+

ρ
o





i
=

2





N





]

.














For a compression rate of 1:N, the maximum energy preserved is determined by the two highest eigenvalues, namely:










Preserved





Energy

=

{




Nr
+

2






r
o


+


(

N
-
2

)


ρ

+

2






ρ
o








if






(

N
-
2

)


ρ

+

2






ρ
o



>
Nr







2





Nr

+

2






r
o






otherwise
.









(
18
)








From (4) and (7), the compression of the virtual image always provides higher energy preservation suggesting the potential the advantage of bijection mappings.


TABLE 2

The normalization factors given in Equation (6) are derived here. Let fk be k-th spectral component of a multispectral image. For notational simplicity, the pixel's spatial position index has been dropped. We want to normalize this component such that the resultant normalized component, denoted by {tilde over (f)}k, has the same dynamic range than the other normalized components. Let f be the average of the different spectral components, that is








f
_

=



1
L




k
L


=

f
k


1




,





Note that the average is performed for each spatial position m=(m1,m2). We want to normalize fk with respect to f, for k=1,2 . . . L, thus all the components have the same dynamic range. Define {tilde over (f)}kkfkk, where σk and μk are normalization factors to be optimally chosen. Since we are also interested in increasing the similarities between normalized components, σk and μk are chosen to minimize the following cost function

minσkkE({tilde over (f)}kf)2  (19)

It can be easily shown that the values of σk and μk that minimize the MSE between the k th normalized component and the spectral average image are given by:







σ
k

=



E


(


f
k



f
_


)


-


Ef
k


E


f
_





Ef
k
2

-


(

Ef
k

)

2










μ
k

=


E


f
_


-


σ
k



Ef
k









    • The above equations can be rewritten as follows










σ
k

=


cov


(


f
k

,

f
_


)



var


(

f
k

)










μ
k

=


f

-
-


-



cov


(


f
k

,

f
_


)



var


(

f
k

)






f
_

k








where cov(x,y), var(x) and x denote the covariance, variance and mean respectively.

Claims
  • 1. A method for multispectral image compression comprising the steps of: sub-optimally mapping the multispectral image into a 2-dimensional array such that the most highly correlated points are near to one another, said sub-optimal mapping step including the steps of (1) selecting the pixels of the band that have the highest spatial correlation; (2) on each side of the pixel set, attaching the pixels corresponding to the band and that has the highest spectral correlation with respect to the band found in the previous step; (3) on each side of the resultant array, inserting the pixels corresponding to the band that has not been merged yet that has the highest spectral correlation with respect to the samples added in Step (2); and, (4) repeating the previous step up to all the bands that are multiplexed into a single band, such that the sub-optimal bijection mapping is to keep, as close as possible, those pixels that have a high correlation; and,then applying an image coding algorithm to said 2-dimensional array, whereby enhanced performance is achieved at high compression rates wherever cross-band correlation is high.
  • 2. The method of multispectral image compression of claim 1 wherein the mapping of the multispectral image is a bijection mapping of the multispectral image.
  • 3. The method of claim 2 wherein the original multispectral image is mapped onto a virtual 2-dimensional scalar image.
  • 4. The method of claim 3 wherein the multispectral image is comprised of multispectral data including spatial correlation and spectral correlation and said spatial correlation and said spectral correlation are jointly exploited.
  • 5. The method of claim 4 wherein the sub-optimal bijection mapping is optimized.
  • 6. The method of claim 5 wherein distortion is minimized.
  • 7. The method of claim 6 wherein there is a maximization of a function of second order statistics of the multispectral data.
  • 8. The method of claim 1 wherein the image coding algorithm is a scalar image coding algorithm.
  • 9. The method of claim 1 wherein there is no degradation of performance at low compression rates.
  • 10. The method of claim 1 wherein there is a single 2-dimensional array.
  • 11. A method of multispectral image compression comprising the steps of sub-optimally mapping by bijection mapping the multispectral image into a 2-dimensional array such that the most highly correlated points are near to one another.
  • 12. The method of claim 11 wherein the original multispectral image is mapped into a virtual 2-dimensional scalar image.
  • 13. The method of claim 12 wherein the multispectral image is comprised of multispectral data including spatial correlation and spectral correlation of said spatial correlation and spectral correlation are jointly exploited.
  • 14. The method of claim 13 wherein the sub-optimal bijection mapping is optimized.
  • 15. The method of claim 14 wherein distortion is minimized.
  • 16. The method of claim 15 wherein there is a maximization of a function of second order statistics of the multispectral data.
  • 17. The method of claim 11 wherein the image coding algorithm is a scalar image coding algorithm.
  • 18. The method of claim 11 wherein there is no degradation of performance at low compression rates.
  • 19. The method of claim 11 wherein there is a single 2-dimensional array.
  • 20. A method for the compression of a multispectral image comprised of multispectral data, said method comprising the steps of: sub-optimally mapping by bijection mapping the multispectral image into a 2-dimensional array such that the most highly correlated points are near to one another and such that there is a maximization of a function of second order statistics of the multispectral data; andthen applying an image coding algorithm to said 2-dimensional array, whereby enhanced performance is achieved at high compression rates whenever cross band correlation is high, but there is no degradation of performance at low compression rates.
CROSS REFERENCE TO RELATED APPLICATIONS

This application claims rights under U.S. Patent Application Ser. No. 60/360,457, filed Feb. 28, 2002 under 35 U.S.C. § 19(e).

STATEMENT OF GOVERNMENT INTEREST

The U.S. Government may have rights in this application as a result of activities conducted under Government Contract No. DAAL 01-96-2-0002 awarded by the Department of the Army.

US Referenced Citations (5)
Number Name Date Kind
5737448 Gardos Apr 1998 A
5963935 Ozbutun et al. Oct 1999 A
6081800 Ozbutun et al. Jun 2000 A
6879716 Ishibashi Apr 2005 B1
7035457 Ishibashi Apr 2006 B2
Provisional Applications (1)
Number Date Country
60360457 Feb 2002 US