This invention relates to image compression, and more particularly, to image and video compression methods and devices.
Recently, Digital Still Cameras (DSCs) have become a very popular consumer appliance appealing to a wide variety of users ranging from photo hobbyists, web developers, real estate agents, insurance adjusters, photo-journalists to everyday photography enthusiasts. Recent advances in large resolution CCD arrays coupled with the availability of low-power digital signal processors (DSPs) have led to the development of DSCs that have the resolution and quality offered by traditional film cameras. These DSCs offer several additional advantages compared to traditional film cameras in terms of data storage, manipulation, and transmission. The digital representation of captured images enables the user to easily incorporate the images into any type of electronic media and transmit them over any type of network; see
Further, DSCs can be extended to capture video clips (short video sequences) and to compress (sequences of) images with methods such as JPEG or JPEG2000.
In contrast to JPEG, JPEG2000 uses wavelet decomposition with both lossy and lossless compression, enables progressive transmission by resolution (which can generate a small image from the code for the full size image), and facilitates scalable video with respect to resolution, bit-rate, color component, or position with transcoding by using Motion JPEG2000. Indeed,
However, the real wavelet transforms used in JPEG2000 suffer from three shortcomings: (i) lack of shift invariance, (ii) lack of directionality, and (iii) lack of explicit phase information. Complex wavelet transforms, in which the real and imaginary parts of the transform coefficients are an approximate Hilbert-transform pair, offer solutions to these three shortcomings. This enables efficient statistical models for the coefficients that are also geometrically meaningful. Indeed, there are distinct relationships between complex coefficient magnitudes and phases, and edge orientations and positions, respectively. These relationships allow the development of an effective hidden Markov tree model for the complex wavelet coefficients; for example see Choi et al, Hidden Markov Tree Modeling of Complex Wavelet Transforms, 2000 IEEE ICASSP 133. Unfortunately, the success of geometric modeling in complex wavelet coefficients has been limited to the class of redundant, or over-complete, complex transforms. This redundancy complicates any application to problems such as image/video compression for DSCs and for wireless-linked Internet transmission where parsimonious signal representations are critical.
To address the redundancy problem, Fernandes et al, A New Directional, Low-Redundancy, Complex-Wavelet Transform, 2001 IEEE ICASSP 3653 provided a low-redundancy transform by projection and negative-frequency discard. Subsequently, Fernandes introduced the Non-Redundant Complex Wavelet Transforms (NCWT); see for example, Fernandes et al, A New Framework for Complex Wavelet Transforms, 51 IEEE Trans. Signal Proc. 1825 (2003). But this implementation can be viewed as a combination of a downsampled positive-frequency projection filter with a traditional dual-band real wavelet transform. Therefore, at the finest scale, the complex wavelet transform has resolution 4x lower than the real input signal. These NCWTs do enjoy directionality and explicit phase information because of the approximate Hilbert-transform relationship between real and imaginary parts of their transform coefficients. To date, however, they have been significantly less amenable to geometric modeling than their redundant counterparts.
T. D. Tran et al, Linear-Phase Perfect Reconstruction Filter Bank: Lattice Structure, Design, and Application in Image Coding, 48 IEEE Trans. Signal Proc. 133 (2000) discloses general methods of filter bank design.
The invention provides a separable two-dimensional non-redundant complex wavelet image transform using one-dimensional triband transforms. Preferred embodiments include constituent filter designs.
This has advantages including reduction in encoding memory use while maintaining complex wavelet properties.
FIGS. 2a-2c show constituent filter magnitudes.
FIGS. 3a-3d are one-dimensional filter structures.
FIGS. 4a-4b illustrate the preferred embodiment two-dimensional transform in the frequency domain.
FIGS. 5a-5b are filter structures.
FIGS. 6a-6c show preferred embodiment scaling function and complex wavelet and corresponding filter frequency dependencies.
FIGS. 8a-8c show an input image and corresponding vertical subband complex wavelet coefficients (magnitude and phase).
FIGS. 9a-9b show digital camera functions and blocks.
FIGS. 11a-11b illustrate JPEG2000.
1. Overview
The preferred embodiment image compression methods efficiently encode (sequences of) images by successive applications of a one-dimensional, non-redundant, triband (three output subbands) complex wavelet decomposition with some negative frequency discards as illustrated in
FIGS. 4a-4b show the ideal consequent two-dimensional frequencies involved in the applications of the one-dimensional transforms of
Preferred embodiment implementations of the subband filters include a parameterization approach and a lifting approach. In particular,
FIGS. 9a-9b illustrate a digital camera which has the DSP and/or IMX (image coprocessor) to compute the complex wavelet transform coefficients. The input and encoded output images are stored in the memory.
The preferred embodiment linear-phase, semi-orthogonal, directional NCWT design uses a triband (downsample by 3) filter bank which permits a natural, direct NCWT implementation using complex wavelet filters and a real scaling filter. At the finest scale, the resulting complex wavelet transform has resolution 3x lower than the real input signal. The preferred embodiment design has properties (directionality, magnitude coherence, and phase coherence) that may make the two-dimensional non-redundant coefficients amenable to geometric modeling.
2. Two-Dimensional Non-Redundant Complex Wavelet Transform
In more detail, presume an input N×M image, x(n,m), with 0≦n<N and 0≦m<M and two-dimensional z-transform X(z1, z2). The first preferred embodiment method (
Consider
Complete the
Finally, apply the
The first level of preferred embodiment two-dimensional NCWT is now complete. The five output subbands ideally partition the (prior-to-downsampling) frequency domain as shown in
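A quick coefficient count suggests why this five-subband output can be non-redundant. The sketch below assumes, reading Sections 3 and 4 together, that the lowpass subband X00 is real-valued, that the other four subbands are complex-valued, and that every subband is decimated by 3 in both directions. The function name and the requirement that both dimensions be multiples of 3 are illustrative assumptions, not part of the described method.

```python
def real_values_after_one_level(n_rows, n_cols):
    """Count the real numbers stored after one 2-D triband NCWT level.

    Assumption: one real lowpass subband plus four complex subbands,
    each of size (n_rows / 3) x (n_cols / 3).
    """
    if n_rows % 3 or n_cols % 3:
        raise ValueError("this sketch assumes dimensions divisible by 3")
    per_subband = (n_rows // 3) * (n_cols // 3)
    real_lowpass = per_subband              # one real value per coefficient
    complex_subbands = 4 * 2 * per_subband  # real and imaginary parts
    return real_lowpass + complex_subbands


# The count equals the number of real input pixels.
print(real_values_after_one_level(480, 639) == 480 * 639)
```

Under these assumptions the output holds (1 + 4·2)/9 of N·M real values, that is, exactly as many real numbers as the input image.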
3. One-Dimensional Filter Banks for Complex Inputs
b shows the three-band analysis filter bank that performs the first level of a non-redundant complex wavelet decomposition of X(z), the z-transform of x(n), a complex-valued input signal. H0(z) has real filter coefficients while H+(z) and H−(z) have complex filter coefficients such that H+(z)*=H−(z), which signifies that the H+(z) filter coefficients are complex conjugates of the H−(z) filter coefficients. The idealized magnitude responses of these filters are shown in
1. The frequency responses of the analysis filters must approximate the idealized magnitude responses in
2. To ensure that the one-dimensional NCWT for real inputs is nonredundant, require H+(z)*=H−(z).
3. To obtain smooth wavelet basis functions, H0(z) must satisfy existence and vanishing-moment conditions.
4. For image/video compression applications, the H0(z), H+(z), and H−(z) filter bank should be linear-phase and orthogonal.
5. A synthesis filter bank that reconstructs X(z) from the subband signals X0(z), X+(z), and X−(z) must exist.
Multi-band filter bank design is a difficult problem, and no direct design method satisfies all the above criteria simultaneously. Therefore, some preferred embodiments adopt a parameterization approach and other preferred embodiments use a lifting approach to design the analysis filter bank. Preliminarily, note that practical filter bank implementation typically uses the polyphase filter approach illustrated in
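The polyphase idea can be illustrated with a small sketch: filtering followed by downsampling by 3 gives the same samples as filtering three decimated phases of the input with three decimated sub-filters and summing. The function names, the zero-padding at the signal ends, and the example filter taps are assumptions made for this illustration only.

```python
def _sample(x, n):
    """Return x[n], treating samples outside the signal as zero."""
    return x[n] if 0 <= n < len(x) else 0.0


def filter_then_downsample(x, h, factor=3):
    """Direct form: y[k] = sum_n h[n] * x[factor*k - n]."""
    n_out = (len(x) + len(h) - 1 + factor - 1) // factor
    return [
        sum(h[n] * _sample(x, factor * k - n) for n in range(len(h)))
        for k in range(n_out)
    ]


def polyphase_downsample(x, h, factor=3):
    """Polyphase form: each sub-filter h[r::factor] sees only one input phase."""
    n_out = (len(x) + len(h) - 1 + factor - 1) // factor
    out = [0.0] * n_out
    for r in range(factor):
        h_r = h[r::factor]  # taps h[r], h[r + factor], ...
        for k in range(n_out):
            out[k] += sum(
                h_r[m] * _sample(x, factor * (k - m) - r)
                for m in range(len(h_r))
            )
    return out


signal = [1.0, 4.0, -2.0, 0.5, 3.0, 7.0, -1.0, 2.0, 6.0, 0.0]
taps = [0.1, -0.2, 0.3, 0.4, 0.5, 0.4, 0.3, -0.2, 0.1]  # arbitrary length-9 example
direct = filter_then_downsample(signal, taps)
poly = polyphase_downsample(signal, taps)
print(all(abs(a - b) < 1e-12 for a, b in zip(direct, poly)))
```

The polyphase form never computes output samples that the downsampler would discard, which is why it is the usual choice in practice.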
Parameterization Approach
First, follow Tran et al. (see Background cite) to specify a length-9, 3-band, orthogonal, linear-phase, real-coefficient filter bank. Then exploit the free parameters to impose two vanishing moments on the scaling filter. Let Ê(z) denote the polyphase matrix of the analysis filter bank of real-valued filters in the resulting system.
Next, define
Now, the first, second, and third rows of the polyphase matrix CÊ(z) are to contain the preferred embodiment polyphase components for H0(z), H+(z), and H−(z), respectively. Note that C essentially combines the two real-valued filters, H1(z), and H2(z), from the Tran et al construction into the complex conjugate pair H+(z) and H−(z). These analysis filters satisfy all preceding constraints except for Constraint 1, which is violated because the magnitude responses |H+(ω)| and |H−(ω)| differ from the idealized responses in
and generate a new analysis filter bank with polyphase matrix E(z) defined by E(z)=CV(z)U(z)S Ê(z). Observe that the entries in the first rows of C, S, U(z), V(z) guarantee that the scaling filter specified by E(z) is the same (modulo shifts) as the scaling filter specified by Ê(z). Hence, Constraint 3 is still satisfied by the E(z) system. Now, S introduces the free parameter θ into E(z) without affecting Constraint 4 because S is orthogonal and also preserves linear phase. Next, consider the matrices U(z) and V(z). These are left-extension matrices that lengthen the wavelet filters by introducing free parameters u and v into the analysis filter bank while preserving linear phase. However, orthogonality of the wavelet filters is not preserved by these matrices. The zeros in the first rows and first columns of the left-extension matrices ensure that the scaling filter specified by V(z)U(z)SÊ(z) is orthogonal to its own shifts as well as to shifts of the wavelet filters, although the wavelet filters are not orthogonal to their own shifts. Thus, in addition to semi-orthogonality, the basis associated with the V(z)U(z)SÊ(z) system also has orthogonal scaling functions. Therefore, the V(z)U(z)SÊ(z) filter bank satisfies a weakened form of Constraint 4 in which “orthogonal” is replaced by “semi-orthogonal and H0(z) should be shift-orthogonal.” Note that the scaling filter and two wavelet filters associated with the V(z)U(z)SÊ(z) system have lengths 9, 15, and 21, respectively, because U(z) and V(z) lengthen the original length-9 Ê(z) system. Finally, the matrix C is introduced to transform the real-coefficient polyphase matrix V(z)U(z)SÊ(z) into E(z), the second and third rows of which specify complex-coefficient filters H+(z) and H−(z) that satisfy Constraint 2. Optimizing over the real-valued, free parameters θ, u, v, yields E(z) with wavelet-filter magnitude responses that have minimum mean-squared error with respect to the idealized responses in
Section 4 describes the modifications of the filter bank for the real-valued inputs as in
Lifting Approach
d shows an alternative filter bank construction using the lifting approach. In particular, the preferred embodiment approach first generates a real-valued filter bank of filters H0(z), H1(z), and H2(z) (analogous to the foregoing Ê(z)) by the lifting method and then forms H+(z) and H−(z) as the complex conjugate pair [H1(z)±jH2(z)]/2. The lifting approach automatically provides a perfect-reconstruction synthesis filter bank, which satisfies Constraint 5 (existence of a synthesis filter bank). The filters Tij(z) in
H0(z) = 1 + z^{-1} T01(z^3) + z^{-2} T02(z^3)
H1(z) = z^{-1} + T10(z^3) H0(z) + z^{-2} T12(z^3)
H2(z) = z^{-2} + T20(z^3) H0(z) + T21(z^3) H1(z)
As previously noted, the butterfly of H1(z) and H2(z) at the right edge of
a shows the synthesis filter bank corresponding to the analysis filter bank of
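The reason lifting guarantees a synthesis filter bank can be seen in a simplified sketch: each lifting step adds a filtered copy of already-available channels, so the synthesis side can subtract the same quantities in reverse order. The sketch below assumes constant (single-tap) lifting filters Tij and a convenient phase convention (phase r holds samples x[3k + r]); both are simplifications chosen for illustration, and the numeric lifting values are arbitrary.

```python
def lifting_analysis(x, t01, t02, t10, t12, t20, t21):
    """Split x into three phases and apply three lifting steps in order."""
    if len(x) % 3:
        raise ValueError("this sketch assumes len(x) is a multiple of 3")
    p0, p1, p2 = x[0::3], x[1::3], x[2::3]
    # Step 1: lowpass channel from all three phases.
    y0 = [a + t01 * b + t02 * c for a, b, c in zip(p0, p1, p2)]
    # Step 2: first wavelet channel uses the finished y0.
    y1 = [b + t10 * u + t12 * c for b, u, c in zip(p1, y0, p2)]
    # Step 3: second wavelet channel uses the finished y0 and y1.
    y2 = [c + t20 * u + t21 * v for c, u, v in zip(p2, y0, y1)]
    return y0, y1, y2


def lifting_synthesis(y0, y1, y2, t01, t02, t10, t12, t20, t21):
    """Undo the lifting steps in reverse order and interleave the phases."""
    p2 = [w - t20 * u - t21 * v for w, u, v in zip(y2, y0, y1)]
    p1 = [v - t10 * u - t12 * c for v, u, c in zip(y1, y0, p2)]
    p0 = [u - t01 * b - t02 * c for u, b, c in zip(y0, p1, p2)]
    x = []
    for a, b, c in zip(p0, p1, p2):
        x.extend([a, b, c])
    return x


lifts = (0.5, 0.25, -0.5, 0.125, -0.25, 0.5)  # arbitrary illustrative values
signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0]
subbands = lifting_analysis(signal, *lifts)
restored = lifting_synthesis(*subbands, *lifts)
print(all(abs(a - b) < 1e-12 for a, b in zip(signal, restored)))
```

Reconstruction holds for any choice of the lifting values, which is what leaves them free for the numerical optimization of the frequency responses.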
To find suitable lifting filters, impose the existence and vanishing wavelet-moment criteria on the foregoing analysis filters and perform numerical optimization to obtain analysis filters with frequency responses approximating those shown in
H0(z) = 0.5774 + 0.5774 z^{-1} + 0.5774 z^{-2}
H1(z) = −0.4349 + 0.5651 z^{-1} − 0.1303 z^{-2}
H2(z) = −0.3333 − 0.3333 z^{-1} + 0.6667 z^{-2}
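As a rough numerical check, the sketch below evaluates these three example filters on the unit circle and forms the complex pair as [H1(z)±jH2(z)]/2, as described above. The helper name `freq_response` and the choice of test frequency 2π/3 are assumptions made for this illustration.

```python
import cmath
import math

H0 = [0.5774, 0.5774, 0.5774]
H1 = [-0.4349, 0.5651, -0.1303]
H2 = [-0.3333, -0.3333, 0.6667]

# Complex conjugate pair formed as (H1 +/- j*H2) / 2.
H_PLUS = [(a + 1j * b) / 2 for a, b in zip(H1, H2)]
H_MINUS = [c.conjugate() for c in H_PLUS]


def freq_response(h, omega):
    """Evaluate H(e^{j*omega}) = sum_n h[n] * exp(-j*omega*n)."""
    return sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(h))


omega = 2 * math.pi / 3
print("H0 at DC:", abs(freq_response(H0, 0.0)))
print("H1 at DC:", abs(freq_response(H1, 0.0)))
print("H2 at DC:", abs(freq_response(H2, 0.0)))
print("|H+| at +2pi/3:", abs(freq_response(H_PLUS, omega)))
print("|H-| at +2pi/3:", abs(freq_response(H_MINUS, omega)))
```

With the rounded coefficients, the wavelet filters H1 and H2 have a DC response near zero, the scaling filter has a DC gain near √3, and at the positive test frequency the H+ response is noticeably larger than the H− response, in line with H+ passing positive frequencies and H− passing negative ones.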
Besides ensuring the existence of a synthesis filter bank, the lifting approach has two other practical advantages that are exploited in the design. First, the lifting approach corresponds to a lattice decomposition that enables a very efficient implementation. Second, the filter bank enjoys the cardinal interpolation property; this guarantees that no initialization error is incurred while using the filter bank.
4. One-Dimensional Non-Redundant Complex Wavelet Transform for Real Input
Creating the one-dimensional NCWT for real-valued inputs necessitates slight modifications to the filter banks of the previous section. Consider
To obtain a non-redundant transform for real-valued input, observe that x−(n) is the complex conjugate of x+(n), because x(n) is real-valued and H+(z)*=H−(z). Therefore x−(n) contains the same information as x+(n), and may be discarded because it is redundant.
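This conjugate relationship is easy to confirm numerically. The sketch below filters a real-valued signal with a complex filter and with its coefficient-wise conjugate, keeps every third output sample, and compares the two outputs. The filter taps reuse the example lifting coefficients from Section 3; the function name `analyze` and the zero-padding at the signal ends are assumptions made for illustration.

```python
H1 = [-0.4349, 0.5651, -0.1303]
H2 = [-0.3333, -0.3333, 0.6667]
H_PLUS = [(a + 1j * b) / 2 for a, b in zip(H1, H2)]
H_MINUS = [c.conjugate() for c in H_PLUS]


def analyze(x, h, factor=3):
    """Filter x with h (zeros assumed outside x) and keep every factor-th sample."""
    full_length = len(x) + len(h) - 1
    out = []
    for k in range(0, full_length, factor):
        acc = 0j
        for n, c in enumerate(h):
            if 0 <= k - n < len(x):
                acc += c * x[k - n]
        out.append(acc)
    return out


real_signal = [2.0, -1.0, 0.5, 3.0, 4.0, -2.5, 1.0, 0.0, 6.0]
x_plus = analyze(real_signal, H_PLUS)
x_minus = analyze(real_signal, H_MINUS)

# For real input, the H- output is the conjugate of the H+ output.
print(all(abs(p.conjugate() - m) < 1e-12 for p, m in zip(x_plus, x_minus)))
```

Because x−(n) can be regenerated by conjugating x+(n), storing only x0(n) and x+(n) loses no information.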
For the lifting approach, the analysis filter bank for real input simply discards the X−(z) output (retaining only half of the butterfly), and the synthesis filter bank also separates the real and imaginary parts of X+(z) as inputs as illustrated in
5. Properties
The preferred embodiment two-dimensional NCWT has properties that may be useful for image/video processing.
Directionality:
Magnitude coherence:
Phase coherence: Also shown in
The above properties suggest that the preferred embodiment two-dimensional NCWT may be well-suited to image processing and geometric modeling. For example, a zerotree compression algorithm could be developed based on coefficient magnitudes. Such techniques will require a nona-tree structure due to the triband filter bank: each complex coefficient will have 9 children (instead of 4, as with dual-band real wavelet transforms). In addition, the higher decimation provides greater frequency separation between wavelet scales (more than one octave), and so less depth will be needed in the tree.
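The nona-tree structure and the reduced depth can be sketched as follows. The parent-to-child index mapping (a coefficient at (row, col) having children at rows 3·row to 3·row+2 and columns 3·col to 3·col+2 of the next finer scale) is an assumption by analogy with dyadic zerotrees, and both function names are hypothetical.

```python
def children(row, col):
    """Return the 9 assumed child positions of the coefficient at (row, col)."""
    return [(3 * row + dr, 3 * col + dc) for dr in range(3) for dc in range(3)]


def max_levels(size, factor):
    """Count how many times `size` can be divided exactly by `factor`."""
    levels = 0
    while size >= factor and size % factor == 0:
        size //= factor
        levels += 1
    return levels


print(children(1, 2))
print(max_levels(729, 3), "triband levels for a side length of 729")
print(max_levels(512, 2), "dual-band levels for a side length of 512")
```

For images of comparable size, decimation by 3 exhausts the image in fewer levels than decimation by 2, which is the sense in which less tree depth is needed.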
6. Multiple Decomposition Levels
If more than one level of decomposition is required, such as to create a coefficient tree, store the lowpass subband, X00, coefficients until the image (tile component) has been filtered, and then repeat the two-dimensional transform on the lowpass subband. That is, the center square in
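The iteration over levels can be sketched as a small driver loop. Here `transform_2d` is a hypothetical callable standing in for one level of the two-dimensional transform; the `toy_transform` used to exercise the loop is not a wavelet transform and only mimics the 3x decimation of the subband dimensions.

```python
def decompose(image, transform_2d, levels):
    """Apply `transform_2d` repeatedly to the lowpass subband.

    `transform_2d` (hypothetical) takes a 2-D list and returns
    (lowpass, details) for a single decomposition level.
    """
    details_per_level = []
    lowpass = image
    for _ in range(levels):
        lowpass, details = transform_2d(lowpass)
        details_per_level.append(details)
    return lowpass, details_per_level


def toy_transform(image):
    """Stand-in for one level: keep every third row and column as 'lowpass'."""
    lowpass = [row[::3] for row in image[::3]]
    return lowpass, {"shape": (len(lowpass), len(lowpass[0]))}


image = [[float(r * 9 + c) for c in range(9)] for r in range(27)]
coarse, details = decompose(image, toy_transform, levels=2)
print(len(coarse), len(coarse[0]))
print([d["shape"] for d in details])
```

Only the lowpass subband of each level needs to be held for the next pass; the detail subbands of a level can be passed on to encoding as soon as that level is complete.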
7. Systems
The preferred embodiment methods are well-suited for environments which require continuous compression and storage of video or sequences of images which contain only partial spatial updates.
8. Modifications
The preferred embodiments can be varied while maintaining the features of image wavelet transform based on a separable, three-band, non-redundant wavelet transform.
For example, the three filters of the one-dimensional filter bank could have various characteristics provided the passbands and stopbands approximate those of
This application claims priority from the following provisional patent application: application Ser. No.: 60/428,422, filed Nov. 22, 2002.
Number | Name | Date | Kind |
---|---|---|---|
5748786 | Zandi et al. | May 1998 | A |
7046854 | Daniell | May 2006 | B2 |
Number | Date | Country |
---|---|---|
20040120592 A1 | Jun 2004 | US |
Number | Date | Country |
---|---|---|
60428422 | Nov 2002 | US |