The present invention relates generally to data compression, and more particularly, but not exclusively, to the use of predictive-transform (PT) source coders for data compression.
Recently it was shown that wavelet-based JPEG2000 [1] can yield remarkably poor results when applied to synthetic aperture radar (SAR) images used in knowledge-aided airborne moving target indicator (AMTI) radar applications [5]. To demonstrate these surprising results, a very simple strip-processor minimum mean squared error (MMSE) predictive-transform (PT) source coder was used [2]. The reason for JPEG2000's poor performance, more than 5 dB worse for the SAR image under test [5], may be traced to the significant difference in correlation between adjacent horizontal and adjacent vertical pixels found in typical SAR images. Fortunately, PT source coding offers a very simple solution to this problem, since its optimum design of prediction and transformation matrices in a flexible pixel geometry processing environment explicitly takes into consideration the vastly different horizontal and vertical pixel correlations. In addition, fast on-line PT implementation algorithms are now available that are based on even/odd eigenvector decompositions [4] and/or Hadamard structures [6]. However, for standard images such as those in the JPEG suitcase, as well as the Lena image, JPEG2000 has been found to perform satisfactorily. This is due to the use of subband coding, which produces an exceptionally appealing objective and subjective visual performance when the correlation between adjacent horizontal and adjacent vertical pixels does not vary significantly, as is the case for this type of image. On the other hand, the current predictive-transform strategy still needs to be refined to yield results that are significantly superior to those of JPEG2000 when compressing images such as those found in the JPEG suitcase.
Non-limiting and non-exhaustive embodiments of the present invention are described with reference to the following drawings. In the drawings, like reference numerals refer to like parts throughout the various figures unless otherwise specified.
For a better understanding of the present invention, reference will be made to the following Detailed Description of the Invention, which is to be read in association with the accompanying drawings, wherein:
The invention now will be described more fully hereinafter with reference to the accompanying drawings, which form a part hereof, and which show, by way of illustration, specific exemplary embodiments by which the invention may be practiced. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. Among other things, the invention may be embodied as methods or devices. Accordingly, the invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. The following detailed description is, therefore, not to be taken in a limiting sense.
Since JPEG2000 does not use prediction from subband to subband it stands to reason that the structural flexibility of MMSE PT source coding may be transported to subband coding to achieve even better results. This is one of the problems to which this invention is directed.
Briefly, the present invention is directed to minimum mean squared error (MMSE) predictive-transform (PT) source coding integrated with subband compression to further improve the performance of low bit rate MMSE PT source coders. A desirable byproduct of the advanced scheme is that the incorporation of joint optimum prediction and transformation from subband to subband is ideally suited to integration with JPEG2000, yielding even higher compression levels while producing an outstanding objective as well as subjective visual performance.
In
The digital structure includes multiplying δck by a real and scalar compression factor ‘g’ and then finding the closest integer representation for this real valued product, i.e.,
$$\delta\hat{c}_k = \left\lfloor g\,\delta c_k + \tfrac{1}{2} \right\rfloor.$$ (2.2)
The quantizer output $\delta\hat{c}_k$ is then added to the prediction coefficient $\hat{c}_{k/k-1}$ to yield a coefficient estimate $\hat{c}_{k/k}$. Although other types of digital quantizers exist [3], the quantizer used here (2.2) is the simplest one to implement and yields outstanding results as seen in our simulations [2]. The coefficient estimate $\hat{c}_{k/k}$ is then multiplied by the transformation matrix T to yield the pixel vector estimate $\hat{x}_{k/k}$. This estimate is then stored in a memory which contains the last available estimate $\hat{y}_{k-1}$ of the pixel matrix y. Note that the initial value for $\hat{y}_{k-1}$, i.e., $\hat{y}_0$, can be any reasonable estimate for each pixel. For instance, since the processing of the image is done in a sequential manner using prediction from pixel block to pixel block, the initial $\hat{y}_0$ can be constructed by assuming for each of its pixel estimates the average value of the pixel block $x_1$.
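The quantizer of (2.2) can be illustrated with a short sketch. The function name `quantize` and the sample values are hypothetical and chosen only for illustration; the code applies the rounding rule element by element and assumes nothing about how the decoder later rescales the result.

```python
import math

def quantize(delta_c, g):
    """Apply (2.2) element-wise: floor(g * delta_c + 1/2).

    delta_c is a list of real-valued coefficient errors and g is the real,
    scalar compression factor. Both names are illustrative assumptions.
    """
    return [math.floor(g * d + 0.5) for d in delta_c]

# With g = 1 the rule reduces to rounding to the nearest integer.
print(quantize([2.5, -2.5], 1.0))
# A smaller g maps more coefficient errors to zero (coarser quantization).
print(quantize([3.7, -1.2, 0.4, -0.6], 0.5))
```

Lowering g coarsens the quantization, which is how a single scalar controls the compression level in this sketch.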
The design equations for the T and P matrices are derived by minimizing the mean squared error expression
$$E\left[(x_k - \hat{x}_{k/k})^t (x_k - \hat{x}_{k/k})\right]$$ (2.3)
with respect to T and P and subject to three constraints. They are:
1) The elements of δck are uncorrelated from each other.
2) The elements of δck are zero mean.
3) The analog quantizer of (2.1) is assumed.
where these expressions are a function of the first and second order statistics of Xk and Zk−1 including their cross correlation. To find these statistics the following isotropic model for the pixels of y can be used [4]:
$$E[y_{ij}] = K,$$ (2.7)
$$E[(y_{ij} - K)(y_{i+v,j+h} - K)] = (P_{avg} - K^2)\,\rho^{D},$$ (2.8)
$$D = \sqrt{(rv)^2 + h^2},$$ (2.9)
$$\rho = E[(y_{ij} - K)(y_{i,j+1} - K)] / (P_{avg} - K^2),$$ (2.10)
where v and h are integers, K is the average value of any pixel, $P_{avg}$ is the average power associated with each pixel, and r is a constant that reflects the relative distance between two adjacent vertical and two adjacent horizontal pixels (r=1 when the vertical and horizontal distances are the same).
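As a minimal sketch of the isotropic model (2.7)-(2.10), the covariance between two pixels separated by v rows and h columns can be computed as below. The function name `pixel_covariance`, its argument names, and the default values are assumptions made for illustration only.

```python
import math

def pixel_covariance(v, h, rho, r=1.0, p_avg=1.0, k_mean=0.0):
    """Covariance of two pixels offset by v rows and h columns.

    Implements (P_avg - K^2) * rho**D with D = sqrt((r*v)**2 + h**2).
    All parameter names and defaults are illustrative assumptions.
    """
    d = math.sqrt((r * v) ** 2 + h ** 2)
    return (p_avg - k_mean ** 2) * rho ** d

# Horizontal neighbours (v=0, h=1) give D = 1, so the covariance is rho
# times the pixel variance, in agreement with the definition of rho.
print(pixel_covariance(0, 1, rho=0.9))
# With r = 2 a vertical neighbour sits at distance 2, so it is less
# correlated than a horizontal neighbour.
print(pixel_covariance(1, 0, rho=0.9, r=2.0))
```

Values of r different from 1 are what allow the model to express different horizontal and vertical pixel correlations.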
In
The subband predictive-transform source coding will now be described in more detail. The scheme is developed through a simple example that integrates the PT source coding scheme with the wavelet-based JPEG2000 subband approach. More specifically, we consider the compression of the 4×4 dimensional image depicted in
The subband PT (SPT) algorithm begins with the evaluation of the average value x0 of the given image {y(i,j)}, i.e.,
This average value can be encoded with 8 bits.
Next the first subband is encoded as shown in
These four values are in turn collected into the 4 dimensional column vector $x_1$, i.e.,
$$x_1 = \left[\,x_1^{1,1}\;\; x_1^{2,1}\;\; x_1^{1,2}\;\; x_1^{2,2}\,\right]^t.$$ (3.3)
This vector is then multiplied by a 4×4 unitary transform matrix T to generate the coefficient vector c1, i.e.,
$$c_1 = T^t x_1.$$ (3.4)
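The transformation in (3.4) can be sketched for one possible choice of T. The code below assumes T is the normalized 4×4 Hadamard matrix; the row and column ordering, the helper names `forward` and `inverse`, and the sample vector are assumptions made for illustration and are not taken from the original matrices.

```python
# Normalized 4x4 Hadamard matrix (assumed ordering). It is symmetric and
# unitary, so its transpose is also its inverse.
T = [
    [0.5,  0.5,  0.5,  0.5],
    [0.5, -0.5,  0.5, -0.5],
    [0.5,  0.5, -0.5, -0.5],
    [0.5, -0.5, -0.5,  0.5],
]

def forward(x):
    """Coefficient vector c = T^t x."""
    return [sum(T[i][j] * x[i] for i in range(4)) for j in range(4)]

def inverse(c):
    """Pixel vector x = T c."""
    return [sum(T[i][j] * c[j] for j in range(4)) for i in range(4)]

x1 = [10.0, 12.0, 14.0, 16.0]   # hypothetical averaged pixel values
c1 = forward(x1)
print(c1)            # the first coefficient carries the scaled sum of the inputs
print(inverse(c1))   # recovers x1 because T is unitary
```

Because T is unitary, the energy of the coefficient vector equals that of the pixel vector, and the inverse transform needs no extra scaling.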
Clearly, when this transformation matrix is the Hadamard transform we have the standard wavelet-based JPEG2000 approach [1]. The second large square to investigate is placed on the upper left hand side of the image. It displays the predicted values for the four pixel averages (3.3). These predicted values are denoted by the set of four scalar elements $\{\hat{x}_{1/0}^{k,l} : k=1,2 \;\&\; l=1,2\}$, where all of these elements are given the same value $x_0$ which is, as mentioned earlier, the average value of the entire image (3.1). It then follows that our prediction vector for the transform coefficients is defined by the expression
The prediction vector $z_0$ is then multiplied by a 4×4 prediction matrix P resulting in the prediction coefficient vector $\hat{c}_{1/0}$, i.e.,
$$\hat{c}_{1/0} = P^t z_0.$$ (3.6)
Next the design of T and P is addressed by using the isotropic image correlation model (2.7)-(2.10) with the real constant value of 's' added to $(rv)^2+h^2$. This is done to reflect the fact that the prediction (3.5) and predicted (3.3) averaged pixels are derived from the same pixel space but are extracted from different subband passes. Furthermore, assigning 0.99999 to both ρ and r, and using any value for K, the following T and P realizations are obtained when s=4:
Notice that the transform matrix (3.7) is the Hadamard transform [1]. However, this will not be the case in general when using a different averaged pixel block size. The difference between the coefficient vector $c_1$ and its predicted value $\hat{c}_{1/0}$ then results in the 4 dimensional coefficient error or innovation $\delta c_1$, i.e.,
$$\delta c_1 = c_1 - \hat{c}_{1/0}.$$ (3.9)
The four elements of δc1 are depicted on the third large square located on the right hand side of
Next, the coefficient error is quantized [1] yielding the quantized coefficient error $\delta\hat{c}_1$, i.e.,
$$\delta\hat{c}_1 = Q(\delta c_1).$$ (3.10)
The prediction coefficient vector $\hat{c}_{1/0}$ is then added to the quantized coefficient error $\delta\hat{c}_1$ to yield the estimated coefficient vector $\hat{c}_{1/1}$, i.e.,
$$\hat{c}_{1/1} = \hat{c}_{1/0} + \delta\hat{c}_1.$$ (3.11)
The estimated coefficient vector is then multiplied by the Hadamard transform (3.7) to yield an estimate $\hat{x}_{1/1}$, i.e.,
$$\hat{x}_{1/1} = T\,\hat{c}_{1/1}$$ (3.12)
of the ‘first’ subband average pixel values X1 (3.3). This completes the first subband pass of the 4×4 pixel image of
The description of the second and last subband pass of the proposed algorithm begins with an explanation of
As was the case for the first subband pass, the discussion begins with the large square located on the lower left hand side of the figure. Its four sub-squares, as is also the case for the other two large squares, are differentiated from each other by the ordered pair set (k,l), i.e.,
$$(k,l) \in \{(1,1),(2,1),(1,2),(2,2)\},$$ (3.13)
as seen from the picture. For each (k,l) sub-square case the following 4 dimensional column vector X2(k,l) is then defined
This vector is then multiplied by the Hadamard transform matrix T to generate the coefficient vector c2(k,l), i.e.,
$$c_2(k,l) = T^t x_2(k,l).$$ (3.16)
The second large square on the upper left hand side of the image displays the prediction vector
for the four pixels in (3.14). Note that all the elements in the prediction sub-square (k,l) are predicted with the same scalar value $\hat{x}_{1/1}^{k,l}$ that is available from the first subband pass. It now follows that our prediction vector for the transform coefficient vector $c_2(k,l)$ is defined by the expression
$$z_1(k,l) = \hat{x}_{2/1}(k,l).$$ (3.18)
The prediction vector $z_1(k,l)$ is then multiplied by the 4×4 prediction matrix P resulting in the prediction coefficient vector $\hat{c}_{2/1}(k,l)$, i.e.,
$$\hat{c}_{2/1}(k,l) = P^t z_1(k,l).$$ (3.19)
The difference between the coefficient vector $c_2(k,l)$ and its predicted value $\hat{c}_{2/1}(k,l)$ then results in the 4 dimensional coefficient error or innovation $\delta c_2(k,l)$, i.e.,
$$\delta c_2(k,l) = c_2(k,l) - \hat{c}_{2/1}(k,l)$$ (3.20)
that is plotted in the third large square of
$$\delta\hat{c}_2(k,l) = Q(\delta c_2(k,l)).$$ (3.21)
A coefficient estimate ĉ2/2(k,l) of the coefficient vector c2(k,l) is then obtained by adding the predicted coefficient vector to the quantized coefficient error to yield
$$\hat{c}_{2/2}(k,l) = \hat{c}_{2/1}(k,l) + \delta\hat{c}_2(k,l).$$ (3.22)
Finally, an estimate $\hat{x}_{2/2}(k,l)$ of the pixels (3.14)-(3.15) is derived by multiplying the coefficient estimate (3.22) by the Hadamard transform to yield
$$\hat{x}_{2/2}(k,l) = T\,\hat{c}_{2/2}(k,l).$$ (3.23)
This concludes the subband PT source coding methodology that can be readily extended to arbitrary size images.
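The two subband passes described above can be sketched end to end for a 4×4 image. This is an illustration under stated assumptions, not the design of the specification: the first subband values are taken to be averages of 2×2 pixel blocks, the prediction matrix P is replaced by the Hadamard matrix itself (the optimized P is not reproduced here), the quantizer Q is (2.2) with g = 1, the image average is rounded to an integer, and k is taken to index rows. All function and variable names are hypothetical.

```python
import math

# Normalized 4x4 Hadamard matrix (assumed ordering); symmetric and unitary.
H = [
    [0.5,  0.5,  0.5,  0.5],
    [0.5, -0.5,  0.5, -0.5],
    [0.5,  0.5, -0.5, -0.5],
    [0.5, -0.5, -0.5,  0.5],
]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def pt_step(x, z):
    """Predict, transform, quantize and reconstruct one 4-vector."""
    coeff = matvec(H, x)                 # c = T^t x, as in (3.4) and (3.16)
    coeff_pred = matvec(H, z)            # ASSUMPTION: P = T, so c_pred = T^t z
    innov_q = [math.floor(c - p + 0.5)   # Q of (2.2) with g = 1 (assumed)
               for c, p in zip(coeff, coeff_pred)]
    coeff_est = [p + q for p, q in zip(coeff_pred, innov_q)]  # (3.11), (3.22)
    return matvec(H, coeff_est)          # pixel estimate, as in (3.23)

def spt_reconstruct(y):
    """Run both subband passes on a 4x4 image and return (x0, estimate)."""
    x0 = round(sum(map(sum, y)) / 16)    # image average, rounded (assumed)
    # Sub-square origins in the order (1,1),(2,1),(1,2),(2,2); k = row (assumed).
    origins = [(0, 0), (2, 0), (0, 2), (2, 2)]
    offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]
    # First pass: 2x2 block averages predicted from the image average.
    x1 = [sum(y[r + dr][c + dc] for dr, dc in offsets) / 4 for r, c in origins]
    x1_est = pt_step(x1, [x0] * 4)
    # Second pass: each block's pixels predicted from its estimated average.
    y_est = [[0.0] * 4 for _ in range(4)]
    for (r, c), avg_est in zip(origins, x1_est):
        x2 = [y[r + dr][c + dc] for dr, dc in offsets]
        x2_est = pt_step(x2, [avg_est] * 4)
        for (dr, dc), value in zip(offsets, x2_est):
            y_est[r + dr][c + dc] = value
    return x0, y_est

image = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
x0, estimate = spt_reconstruct(image)
print(x0)
print(estimate)
```

With g = 1 each quantized coefficient is off by at most one half, so in this sketch no reconstructed pixel differs from the original by more than 1. A practical coder would use a smaller g and entropy-code the quantized innovations, which this sketch omits.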
In
Next in
The PSNR of the MMSE PT compressed Lena image of
The best value of the isotropic image model parameter 's' to use for each possible level of compression and/or subband remains to be investigated. In addition, the proposed methodology can be readily applied to any averaged pixel block size processing structure, which naturally includes that of a strip processor [2]. This problem is being investigated and further results will be forthcoming in the near future.
The above specification, examples, and data provide a complete description of the manufacture and use of the composition of the invention. Since many embodiments of the invention can be made without departing from the spirit and scope of the invention, the invention resides in the claims hereinafter appended.
This application is the U.S. National Phase of International Patent Application No. PCT/US07/79469, filed on Sep. 25, 2007, which claims the benefit of U.S. Provisional Patent Application No. 60/847,126, filed on Sep. 25, 2006, both of which are hereby incorporated by reference in their entirety.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2007/079469 | 9/25/2007 | WO | 00 | 9/22/2009 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2008/042659 | 4/10/2008 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
5272529 | Frederiksen | Dec 1993 | A |
20030037082 | Daniell | Feb 2003 | A1 |
20030113024 | Feria et al. | Jun 2003 | A1 |
20030133500 | Auwera et al. | Jul 2003 | A1 |
Number | Date | Country | |
---|---|---|---|
20100027901 A1 | Feb 2010 | US |
Number | Date | Country | |
---|---|---|---|
60847126 | Sep 2006 | US |