Filter coefficients can be optimized according to many different criteria. For example, stop-band attenuation, coding gain, or degree of smoothness can be used as the criterion for optimizing a filter. However, finding the optimal filter coefficients requires solving a time-domain problem that is a non-linear constrained optimization problem. The non-linear constrained optimization equations must be solved iteratively to arrive at the optimum solution, and this iterative process is computationally expensive because the equations are difficult to solve. In addition, if the constraints are not satisfied, the solution is not invertible.
Therefore, it would be advantageous to provide a method of solving for the optimal filter parameters that is unconstrained and computationally more efficient than solving the time-domain problem.
A method for finding optimal filter coefficients for a filter, given an input data sequence and an objective function, is disclosed. The method includes selecting a wavelet basis having k parameters and minimizing the predetermined objective function with respect to the k parameters. The wavelet basis is reparameterized into k/2 rotation parameters and factorized into a product of rotation and delay matrices. The k/2 rotation parameters are provided for the rotation matrices, and a data transform matrix is computed based on the product of the rotation and delay matrices. The input data sequence is converted into transformed data by applying the data transform matrix to the input data. The Jacobian of the data transform matrix and the input data sequence is determined and multiplied by the gradient vector of the objective function with respect to the transformed data. This product is compared to a predetermined criterion, and if the predetermined criterion is not satisfied, a new set of k/2 parameter values is provided and the gradient descent is continued until the optimal k/2 parameters are found. The optimal filter coefficients are then calculated based on the optimal k/2 parameters. The wavelet basis may be selected from a wavelet packet library containing orthonormal wavelet packet bases, with the selected wavelet packet basis being the one that minimizes a given cost function, which can be an entropy function.
A method for determining filter coefficients to form a filter to filter data of length N is disclosed, wherein the filter coefficients are optimized to minimize an objective function that measures a predetermined quality of the signal data. The method includes the steps of providing the number of coefficients, k, in the filter and selecting a wavelet packet basis. An objective function is provided, and a first set of k/2 parameter values is also provided. A data transform matrix is formed as a function of the selected wavelet packet basis and the k/2 parameter values. The data is transformed by multiplying the data transform matrix with the data, and the value of the objective function is calculated using the transformed data. The optimal set of values for the k/2 parameters is then found based on the value of the objective function.
In one embodiment, finding the optimal set of values for the k/2 parameters includes first determining whether the objective function satisfies a first criterion. If the first criterion is satisfied, the optimal filter coefficients are calculated using the current set of k/2 parameters. If the objective function does not satisfy the first criterion, a steepest-descent gradient method is used to move the k/2 parameters toward a local minimum of the objective function. In this method, the Jacobian of the data transform matrix is calculated and multiplied by the gradient of the objective function with respect to the transformed data. If the product of the Jacobian and the gradient satisfies a predetermined criterion, the iterative process is stopped and the optimal filter coefficients are calculated based on the current set of k/2 parameters. If the product of the Jacobian and the gradient does not satisfy the predetermined criterion, the values of the k/2 parameters are updated based on the value of the gradient, and the process is continued in an iterative manner until the criterion is satisfied and the optimal parameters are found.
Other forms, features and aspects of the above-described methods and system are described in the detailed description that follows.
The invention will be more fully understood by reference to the following Detailed Description of the Invention in conjunction with the Drawing of which:
A method for determining the optimal filter coefficients to form a filter of length M for filtering data of length N is disclosed. The filter is optimized to achieve an optimal value for an objective function that measures a predetermined quality of the signal data.
In step 118, the Jacobian of the factorized-reparameterized second filter basis is calculated. The gradient of the objective function of the transformed data with respect to the transformed data is calculated, as depicted in step 120. The Jacobian and the gradient are then multiplied together to form the gradient of the objective function with respect to the set of k/2 parameters, as depicted in step 122. The gradient of the objective function with respect to the set of k/2 parameters is compared to a second criterion, as depicted in step 124. In the event that the gradient satisfies the second criterion, control passes to step 128 and the optimal filter coefficients are calculated from the first set of filter parameters. In the event that the gradient does not satisfy the second criterion, control passes to step 126. In step 126, the values of the k/2 parameters are updated as a function of the gradient, and control passes to step 111, wherein the intervening steps are repeated with the new set of filter parameters. This process is continued until either the objective function satisfies the first criterion or the gradient satisfies the second criterion.
Alternatively, the initial values of the k/2 parameters can be determined other than by the time domain methods described above. For example, after executing steps 102, 103, and 104, the values of the k/2 parameters can be determined using other techniques. In one embodiment, the values of the k/2 parameters can be determined randomly and used in the optimization process. Alternatively, a user having a priori information or experience can select the values of the k/2 parameters accordingly. Alternatively, step 106 can also be executed and a mother wavelet selected and a set of a priori values for the k/2 parameters can be supplied. In all of these alternative embodiments the optimization of the parameters can continue as described above with respect to steps 111-126. Step 128 is executed in the case where a digital filter architecture is used. In the case where the optimization and filtering is to be accomplished using lattice filters, there is no need to execute step 128 to find the coefficients of the digital filters.
Filters can be optimized according to different criteria. The different criteria can be described by a predetermined objective function. For example, stop-band attenuation, coding gain, and degree of smoothness are three possible criteria. Alternatively, the predetermined objective function can be based on morphological aspects of the signal to be filtered. For instance, in medical monitoring applications, an objective function may be used to differentiate between normal breathing patterns and labored or wheezing breathing patterns, or between a normal cardiac rhythm and an abnormal cardiac rhythm. The objective function is application specific and is developed according to a set of promulgated system requirements.
In general, a discrete wavelet transform includes low-pass and high-pass filters applied to a signal and the output is down-sampled by a factor of two. The high-pass coefficients are retained and the high-pass and low-pass filters are applied to the low-pass coefficients until the length of the residual signal is smaller than the length of the filter. Accordingly, in general the maximum number of times a filter of length M+1, followed or preceded by a decimation by a factor of 2, can be applied to a data signal of length N+1, where N+1 is a power of two, is given by:
It should be noted that the length of the data signal does not have to be a power of two for the optimal filter coefficients to be found. Although the algorithm operates more efficiently for signal lengths that are a power of two, this is not required. If the signal length is not a power of two, any of the well-known techniques for extending the data set may be used. For example, zero padding, in which zeros are added to the end of the data signal to increase the signal length without interfering with the spectral analysis, is an option, as is the use of boundary wavelets to analyze the edges of the data signal. Other techniques that are known in signal processing may be used depending on the system requirements, the length of the data stream to be analyzed, and other design requirements.
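The stopping rule and the zero-padding option described above can be illustrated with a minimal Python sketch. The function names `pad_to_power_of_two` and `count_levels` are hypothetical and do not appear in the original text; the level count is obtained by applying the stated rule directly (filter and decimate until the residual signal is shorter than the filter), not by evaluating equation (1).

```python
def pad_to_power_of_two(signal):
    """Zero-pad a sequence at the end so that its length is a power of two."""
    target = 1
    while target < len(signal):
        target *= 2
    return list(signal) + [0.0] * (target - len(signal))


def count_levels(signal_length, filter_length):
    """Count filter-and-decimate passes until the residual is shorter than the filter."""
    levels = 0
    residual = signal_length
    while residual >= filter_length:
        levels += 1
        residual //= 2  # decimation by a factor of two
    return levels


padded = pad_to_power_of_two([1.0, 2.0, 3.0, 4.0, 5.0])
print(len(padded), count_levels(len(padded), 4))
```

Under this rule a filter with four coefficients can be applied twice to a signal of length 8, and a filter with six coefficients can be applied three times to a signal of length 32, which agrees with the illustrative examples in this description.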
Consider a discrete wavelet transform operating on a signal whose length is a power of 2. As an illustration, a filter of length 4 operates on a signal of length 8. Let $\vec{x}=[x(0),x(1),\ldots,x(7)]^T$ be the original signal, and let $c(0), c(1), c(2), c(3)$ and $d(0), d(1), d(2), d(3)$ be the low-pass and high-pass filter coefficients, respectively. After applying the two filters to the original data, the result is down-sampled by two:
$$[\sigma(0),\sigma(1),\sigma(2),\sigma(3),\delta(0),\delta(1),\delta(2),\delta(3)]^T=C_1\vec{x}, \tag{2}$$
where
The process is repeated on just the σ's, the low-pass coefficients:
$$C_2[\sigma(0),\sigma(1),\sigma(2),\sigma(3),\delta(0),\delta(1),\delta(2),\delta(3)]^T, \tag{4}$$
where
These two steps can be combined and the transformed signal {right arrow over (y)} is given by:
$$\vec{y}=C\vec{x}=C_2C_1\vec{x}. \tag{6}$$
If C1 is orthonormal, then C2 and C are also orthonormal. The conditions for orthonormality are given by:
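$$\begin{aligned} c(0)^2+c(1)^2+c(2)^2+c(3)^2&=1\\ c(0)c(2)+c(1)c(3)&=0 \end{aligned} \tag{7}$$
$$\begin{aligned} d(0)^2+d(1)^2+d(2)^2+d(3)^2&=1\\ d(0)d(2)+d(1)d(3)&=0 \end{aligned} \tag{8}$$
$$\begin{aligned} c(0)d(0)+c(1)d(1)+c(2)d(2)+c(3)d(3)&=0\\ c(0)d(2)+c(1)d(3)&=0\\ c(2)d(0)+c(3)d(1)&=0 \end{aligned} \tag{9}$$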
If the c's satisfy (7), then equations (8) and (9) can be solved using:
$$d(k)=(-1)^k\,c(3-k),\quad k=0,\ldots,3. \tag{10}$$
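As a numerical illustration of the orthonormality conditions (7)-(9) and of relation (10), the following Python sketch builds the two stages of the length-8 example and checks that each stage, and their product, is orthonormal. The matrices C1 and C2 are not reproduced in this text, so the periodic (wrap-around) row layout used here is an assumption, as are all function names. The Daubechies four-tap low-pass coefficients are used only as a known set of values satisfying (7).

```python
import math


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def analysis_matrix(c, n):
    """One-level n x n transform: n/2 low-pass rows, then n/2 high-pass rows.

    Assumed layout: each row is the filter shifted by two samples, wrapping
    around at the end of the signal.
    """
    m = len(c)
    d = [(-1) ** k * c[m - 1 - k] for k in range(m)]  # relation (10)
    rows = []
    for filt in (c, d):
        for shift in range(0, n, 2):
            row = [0.0] * n
            for k, coeff in enumerate(filt):
                row[(shift + k) % n] += coeff
            rows.append(row)
    return rows


def is_orthonormal(a, tol=1e-12):
    """True when a times its transpose equals the identity within tol."""
    p = matmul(a, [list(r) for r in zip(*a)])
    n = len(a)
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))


s3 = math.sqrt(3.0)
c = [v / (4.0 * math.sqrt(2.0)) for v in (1 + s3, 3 + s3, 3 - s3, 1 - s3)]

C1 = analysis_matrix(c, 8)

# Second stage: transform the four low-pass outputs, pass the high-pass ones through.
inner = analysis_matrix(c, 4)
C2 = [[0.0] * 8 for _ in range(8)]
for i in range(4):
    for j in range(4):
        C2[i][j] = inner[i][j]
    C2[i + 4][i + 4] = 1.0

C = matmul(C2, C1)
print(is_orthonormal(C1), is_orthonormal(C2), is_orthonormal(C))
```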
As discussed above, the number of times filters of length M+1 can be applied to a signal of length N+1, where N+1 is a power of two, is given by equation (1). Accordingly, the matrix C can be the product of Q orthonormal matrices:
$$C=C_Q C_{Q-1}\cdots C_2 C_1, \tag{11}$$
where $Q\le Q_{\max}$ and $Q_{\max}$ is given in equation (1).
The above example is intended for illustrative purposes only. In general, the constraints on the c's are given by:
the constraints on the d's are given by
and the relations between the c's and the d's are given by:
and where δ(k) is the Kronecker delta function. As above, when equation (12) is satisfied, equations (13) and (14) may be solved using:
$$d(k)=(-1)^k\,c(N-k),\quad k=0,\ldots,N. \tag{15}$$
This leads to a constrained optimization problem that is difficult to solve for some applications. The coefficients c(0), c(1), . . . c(N) are reparameterized so that the constraints in equation (12) are automatically satisfied. For example, equation (7) implies that:
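$$[c(0)+c(2)]^2+[c(1)+c(3)]^2=1, \tag{16}$$
which is automatically satisfied by setting
$$\begin{aligned} c(0)+c(2)&=\cos(\theta_1+\theta_2)\\ c(1)+c(3)&=\sin(\theta_1+\theta_2), \end{aligned} \tag{17}$$
where the number of parameters has been reduced by one-half, from four to two in the illustrative example. Using the trigonometric formulas
$$\begin{aligned} \cos(\alpha+\beta)&=\cos(\alpha)\cos(\beta)-\sin(\alpha)\sin(\beta)\\ \sin(\alpha+\beta)&=\sin(\alpha)\cos(\beta)+\cos(\alpha)\sin(\beta), \end{aligned} \tag{18}$$
leads to solving for the c's using:
$$\begin{aligned} c(0)&=\cos(\theta_1)\cos(\theta_2)\\ c(1)&=\cos(\theta_1)\sin(\theta_2)\\ c(2)&=-\sin(\theta_1)\sin(\theta_2)\\ c(3)&=\sin(\theta_1)\cos(\theta_2) \end{aligned} \tag{19}$$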
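The claim that the angle parameterization satisfies the constraints automatically can be checked numerically. The following Python sketch evaluates equation (19) for an arbitrary pair of angles and reports the residuals of the two conditions in equation (7); the function names are hypothetical and chosen for illustration only.

```python
import math


def coefficients_from_angles(theta1, theta2):
    """Low-pass coefficients c(0)..c(3) as given in equation (19)."""
    return [
        math.cos(theta1) * math.cos(theta2),
        math.cos(theta1) * math.sin(theta2),
        -math.sin(theta1) * math.sin(theta2),
        math.sin(theta1) * math.cos(theta2),
    ]


def constraint_residuals(c):
    """Residuals of the two conditions in equation (7); both vanish when satisfied."""
    unit_norm = sum(v * v for v in c) - 1.0
    shift_two = c[0] * c[2] + c[1] * c[3]
    return unit_norm, shift_two


c = coefficients_from_angles(0.7, -1.9)
print(constraint_residuals(c))
```

Because the residuals vanish (to rounding error) for any choice of the two angles, an optimizer can vary the angles freely without leaving the set of orthonormal filters.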
For a length N filter, the orthonormality conditions imply that:
so that the c's may be reparameterized in terms of trigonometric functions given by:
As can be seen, the right-hand side contains (N−1)/2 terms, but the expansions of the left-hand sides using generalized trigonometric addition formulas contain $2^{(N-1)/2}$ terms. Although the reparameterized form of the filter uses fewer terms, distributing the trigonometric monomials among the various filter coefficients is difficult and complex.
Distributing the trigonometric monomials to the various filter coefficients is accomplished using lattice factorization of filter banks. A polyphase matrix of any two channel orthogonal filter bank can be factorized as:
$$H_p^{(K)}=\rho(\theta_1)\Lambda(z)\rho(\theta_2)\cdots\Lambda(z)\rho(\theta_K) \tag{22}$$
where $\rho(\theta)\in O(2)$ and
This procedure leads to a factorization of a wavelet transform. This factorization process is illustrated below for a filter having six (6) coefficients operating on a signal of length eight (8). Although illustrated for this example, the process can be extended for filters having a larger number of coefficients operating on signals having a longer length.
Equation (22) can be rewritten as:
$$H_p^{(K)}=H_p^{(K-1)}\Lambda(z)\rho(\theta_K). \tag{24}$$
As an illustrative example, a six coefficient filter using three angles will be provided. In particular, the polyphase matrices are given by:
where the orthonormality is assured for Hp1, and the remaining polyphase matrices derived from Hp1 when
Thus, from equation (24), $H_p^{(2)}$ is given by:
$$H_p^{(2)}=H_p^{(1)}\Lambda(z)\rho(\theta_2), \tag{27}$$
which can be written as:
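Multiplying out the right-hand side and equating like powers of z gives the coefficients on the left-hand side directly in terms of the reparameterized coefficients. This leads to:
$$\begin{aligned} c_0^{(2)}&=\cos(\theta_1)\cos(\theta_2)=c_0^{(1)}\cos(\theta_2)\\ c_1^{(2)}&=\cos(\theta_1)\sin(\theta_2)=c_0^{(1)}\sin(\theta_2)\\ c_2^{(2)}&=-\sin(\theta_1)\sin(\theta_2)=-c_1^{(1)}\sin(\theta_2)\\ c_3^{(2)}&=\sin(\theta_1)\cos(\theta_2)=c_1^{(1)}\cos(\theta_2) \end{aligned} \tag{29}$$
and similarly for the d's:
$$\begin{aligned} d_0^{(2)}&=-\sin(\theta_1)\cos(\theta_2)=d_0^{(1)}\cos(\theta_2)\\ d_1^{(2)}&=-\sin(\theta_1)\sin(\theta_2)=d_0^{(1)}\sin(\theta_2)\\ d_2^{(2)}&=-\cos(\theta_1)\sin(\theta_2)=-d_1^{(1)}\sin(\theta_2)\\ d_3^{(2)}&=\cos(\theta_1)\cos(\theta_2)=d_1^{(1)}\cos(\theta_2) \end{aligned} \tag{30}$$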
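The recursion of equation (24) can be sketched in Python for an arbitrary number of angles. The explicit forms of ρ(θ) and Λ(z) are not reproduced in this text, so the sketch assumes ρ(θ) = [[cos θ, sin θ], [−sin θ, cos θ]] and Λ(z) = diag(1, z⁻¹), a choice that reproduces equations (29) and (30) exactly for two angles. The function names `extend` and `lattice_filters` are hypothetical.

```python
import math


def extend(h, theta):
    """Apply one lattice stage, Lambda(z) then rho(theta), to a filter.

    The even-indexed taps form E(z) and the odd-indexed taps form O(z);
    the odd branch is delayed by one polyphase sample before the rotation.
    """
    cs, sn = math.cos(theta), math.sin(theta)
    even = h[0::2] + [0.0]   # E(z)
    odd = [0.0] + h[1::2]    # z^-1 O(z)
    out = []
    for e, o in zip(even, odd):
        out.append(e * cs - o * sn)  # new even-indexed coefficient
        out.append(e * sn + o * cs)  # new odd-indexed coefficient
    return out


def lattice_filters(angles):
    """Return (c, d), each with 2K coefficients, for angles theta_1..theta_K."""
    first = angles[0]
    c = [math.cos(first), math.sin(first)]
    d = [-math.sin(first), math.cos(first)]
    for theta in angles[1:]:
        c = extend(c, theta)
        d = extend(d, theta)
    return c, d


c, d = lattice_filters([0.3, 1.1, -0.8])  # six coefficients from three angles
print(len(c), sum(v * v for v in c))
```

Each additional angle lengthens both filters by two coefficients while preserving orthonormality, so the six-coefficient filters of the illustrative example are obtained from three angles.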
These relationships can be re-written as
for the relationship in equation (29) and
for the relationships in equation (30). As noted above in equation (24), the next polyphase matrix Hp(3) can be found as a function of Hp(2). Multiplying the matrices and equating like powers of z as above yields:
Substituting equation (31) into equation (33) yields
If $B=A_3A_2A_1$, then the transpose of B is $B^T=A_1^TA_2^TA_3^T$. Taking the transpose of equation (34) and combining it with a similar matrix equation for the d's yields:
Thus, each coefficient can be found as a function of the reparameterized angle coefficients. Performing the matrix multiplication in equation (35) defines each coefficient as a linear combination of two or more trigonometric functions. Reformulating equation (4) from above such that the c's and d's alternate rows, the matrix C1 can be written as the product of three matrices:
Where Sk=sin(θk), Kk=cos (θk) and all coefficients are assumed to be for the third polyphase matrix so that the superscripts have been dropped without any loss of generality. The matrix equation given in equation (36) can be expressed in the form of:
$$C_1=R(\theta_1)\,S\,R(\theta_2)\,S\,R(\theta_3) \tag{37}$$
where R(θ) are wavelet transform matrices and S are shifting matrices. A matrix E is then applied to C1 to separate the high-pass coefficients from the low-pass ones.
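The structure of equation (37) can be illustrated with a short Python sketch that forms the product of rotation and shifting matrices for a signal of length 8. The explicit matrices of equation (36) are not reproduced in this text, so the forms used here are assumptions: R(θ) is taken to be block diagonal with 2×2 rotations, and S is taken to be a circular shift by one sample. Under these assumptions the individual coefficients may differ in sign or ordering from equation (29), but the product is orthonormal and its rows repeat with a shift of two columns, as expected of a filter bank matrix. The separating matrix E is omitted.

```python
import math


def rotation_matrix(theta, n):
    """Block-diagonal n x n matrix of 2x2 rotations [[cos, sin], [-sin, cos]]."""
    cs, sn = math.cos(theta), math.sin(theta)
    r = [[0.0] * n for _ in range(n)]
    for i in range(0, n, 2):
        r[i][i], r[i][i + 1] = cs, sn
        r[i + 1][i], r[i + 1][i + 1] = -sn, cs
    return r


def shift_matrix(n):
    """Circular shift by one sample: (S x)[i] = x[(i - 1) % n]."""
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        s[i][(i - 1) % n] = 1.0
    return s


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transform_matrix(angles, n):
    """Form R(theta_1) S R(theta_2) S ... S R(theta_K) as in equation (37)."""
    c = rotation_matrix(angles[0], n)
    s = shift_matrix(n)
    for theta in angles[1:]:
        c = matmul(matmul(c, s), rotation_matrix(theta, n))
    return c


C1 = transform_matrix([0.4, -1.2, 0.9], 8)
for row in C1[:2]:
    print([round(v, 3) for v in row])
```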
The above is for illustrative purposes only, and signals with lengths of 16, 32, 64, or greater powers of two may be filtered. As an example, the left-hand side of equation (36) can be extended to filter a signal of length thirty-two (32). In this example, the left-hand side of equation (36), which is C1 in equation (11), becomes a 32×32 matrix in which the coefficients are shifted and wrapped around across all thirty-two columns. C2 is formed in a block matrix form as follows:
where C1 is the left-hand side 8×8 matrix of equation (36) extended to a 16×16 matrix by shifting and wrapping the filter coefficients as described above. In the illustrative example, Qmax is given by equation (1) and is equal to 3. C3 is then given in block matrix form as:
where C1 is the left-hand side 8×8 matrix in equation (36). Thus, a wavelet transform of data of length 32 can be given by:
This can also be expressed as:
$$C=E_Q R_Q(\theta_1)S_Q R_Q(\theta_2)\cdots S_Q R_Q(\theta_K)\;\cdots\;E_1 R_1(\theta_1)S_1 R_1(\theta_2)\cdots S_1 R_1(\theta_K). \tag{41}$$
As can be observed, the filter is a traditional straight wavelet transform. As is known, the straight wavelet transform continuously applies a low and a high pass filter to the data, down-samples the results by two (2), retaining the high pass results, then applies the high and low pass filters to the low pass results only. This is repeated until the filter size is too small to filter the remaining data. Alternatively, a subset of wavelet transforms that are less than the maximum number of transforms calculated in equation (1) may also be used to form the basis.
Other wavelet bases may be used, however, and any of the possible combinations of the placement of the identity matrix and the C matrix within each sub-block can be used.
A particularly efficient method of selecting the best basis is an entropy-based selection process. In this method, the entropy of a family of wavelet packet bases operating on a signal vector is calculated, and the best basis is selected as the basis providing the minimum entropy. A suitable entropy function can be developed in which the sequences of data operated on by each basis in the wavelet packet library are compared by their respective rates of decay, i.e., the rate at which the elements within the sequence become negligible when arranged in decreasing order. This allows the dimension of the signal to be computed as:
where $p_n=|x_n|^2/\|x\|^2$ and
represents the entropy, or information cost, of the processed signal. This leads to the following proposition: if two sequences $\{x_n\}$ and $\{x_n'\}$ are compared such that the corresponding sequences $\{p_n\}$ and $\{p_n'\}$ are monotonically decreasing, and if $\sum_{0<n<m}p_n\ge\sum_{0<n<m}p_n'$, then $d\le d'$. Although entropy is one measure of the concentration or efficiency of an expression, other cost functions are also possible depending upon the application and the need to discriminate and choose between special functions.
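An entropy-based cost of this kind can be sketched in Python as follows. The defining equations are not reproduced in this text, so the sketch assumes the Shannon form H = −Σ pₙ log pₙ for the entropy and d = exp(H) for the dimension; the function names are hypothetical, and the input sequence is assumed to contain at least one nonzero element.

```python
import math


def normalized_energies(x):
    """p_n = |x_n|^2 / ||x||^2 for a sequence with nonzero energy."""
    total = sum(v * v for v in x)
    return [v * v / total for v in x]


def entropy_cost(x):
    """Assumed Shannon entropy of the normalized energies (zero terms skipped)."""
    return -sum(p * math.log(p) for p in normalized_energies(x) if p > 0.0)


def theoretical_dimension(x):
    """Assumed dimension d = exp(H)."""
    return math.exp(entropy_cost(x))


concentrated = [4.0, 0.1, 0.0, 0.0]
spread = [1.0, 1.0, 1.0, 1.0]
print(theoretical_dimension(concentrated), theoretical_dimension(spread))
```

A basis that concentrates the signal energy in a few coefficients yields a lower cost than one that spreads the energy evenly, which is the sense in which the minimum-entropy basis is preferred.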
The wavelet packet basis selected, which may or may not be the optimal basis for the particular signal, is applied to the sequence of data forming a transformed sequence of data. The particular parameters are optimized to select the optimal set of parameters and hence, the optimal set of filter coefficients. In the preferred embodiment, a gradient descent can be used to optimize the parameters. In particular, the gradient of the objective function with respect to the parameter set can be calculated as:
$$\nabla_\theta\varphi=J_\theta^T\,\nabla_y\varphi \tag{43}$$
where J is the Jacobian matrix for Cx, φ is the objective function, θ is the set of parameters, and y is the input data sequence transformed by the selected basis. The Jacobian matrix is of the form:
$$J_\theta=\left[(\partial_{\theta_1}C)\vec{x},\ (\partial_{\theta_2}C)\vec{x},\ \ldots,\ (\partial_{\theta_k}C)\vec{x}\right]. \tag{44}$$
An explicit form of the derivatives with respect to the new parameters can be obtained using equation (41):
where
j=1, 2, . . . , K and sin′(θ)=cos(θ)=sin(θ+π/2), and cos′(θ)=−sin(θ)=cos(θ+π/2). It is possible to express the above derivatives in terms of rotations of the original angles, thus making the storage and computation of the derivatives more efficient. The derivative of a rotation block is:
so that
$$\partial_j R_l(\theta_j)=D_l R_l(\theta_j) \tag{47}$$
where $D_l$ is a block diagonal matrix. Thus, equation (45) becomes:
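$$\begin{aligned} \partial_j C(\theta_1,\ldots,\theta_K)={}&E R_Q(\theta_1)S_Q\cdots D_Q R_Q(\theta_j)\cdots S_1 R_1(\theta_K)\\ &+E R_Q(\theta_1)S_Q\cdots D_{Q-1}R_{Q-1}(\theta_j)\cdots S_1 R_1(\theta_K)+\cdots\\ &+E R_Q(\theta_1)S_Q\cdots D_1 R_1(\theta_j)\cdots S_1 R_1(\theta_K) \end{aligned} \tag{48}$$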
Accordingly, with the appropriate choice of the matrices C1, C2, . . . , CQ the above expression allows for the computation of the gradient with respect to the angles for any basis in the wavelet packet library.
Thus, the Jacobian can be computed and stored efficiently and separately from the objective function gradient that is calculated. Accordingly, the Jacobian is efficiently computed for each set of parameters that are used in the gradient descent method. For each of the sets of variables that are used, the Jacobian can be calculated and stored according to equation (48).
The gradient of the objective function φ is calculated with respect to the transformed data sequence, i.e., the data sequence that results from applying the selected wavelet basis to the input data sequence. This gradient vector is multiplied by the Jacobian matrix, and the product forms the gradient vector of the objective function with respect to the angle parameters. The negative of this gradient vector points in the direction of steepest descent from the current point toward a local minimum. If the gradient vector is equal to zero, or its magnitude is less than a gradient threshold level, a local minimum has been reached, or nearly reached, and the objective function is calculated for the current angle parameters. If the objective function is less than a predetermined objective function criterion, the current angle parameters are the optimal angle parameters and the filter coefficients can be calculated as described above. If the gradient vector of the objective function with respect to the angle parameters does not indicate that it is at or near a local minimum, or if the objective function exceeds the predetermined objective function threshold, then a predetermined adjustment is made to the current set of angle parameters. This process is repeated until an optimal set of angle parameters is found. The actual values of the gradient threshold level, the predetermined objective function threshold, and the predetermined adjustment to the angle parameters depend on the particular system requirements.
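The iteration described above can be sketched in Python for a single transform level. This is a minimal sketch under stated assumptions, not the disclosed implementation: the rotation and shift conventions, the objective function (energy in the odd-indexed outputs), the fixed step size, and all function names are hypothetical. The Jacobian columns are obtained by replacing θⱼ with θⱼ+π/2, which follows from the identities sin′(θ)=sin(θ+π/2) and cos′(θ)=cos(θ+π/2), and the gradient with respect to the angles is formed according to equation (43). Only the gradient threshold is tested here; the comparison of the objective function with its own threshold is omitted.

```python
import math


def rotate_pairs(x, theta):
    """Apply the rotation [[cos, sin], [-sin, cos]] to each consecutive pair."""
    cs, sn = math.cos(theta), math.sin(theta)
    out = []
    for a, b in zip(x[0::2], x[1::2]):
        out += [cs * a + sn * b, -sn * a + cs * b]
    return out


def transform(x, angles):
    """y = R(theta_1) S R(theta_2) ... S R(theta_K) x, applied right to left."""
    y = rotate_pairs(x, angles[-1])
    for theta in reversed(angles[:-1]):
        y = rotate_pairs(y[-1:] + y[:-1], theta)  # circular shift, then rotate
    return y


def jacobian_columns(x, angles):
    """Column j is (dC/dtheta_j) x, found by replacing theta_j with theta_j + pi/2."""
    cols = []
    for j in range(len(angles)):
        shifted = list(angles)
        shifted[j] += math.pi / 2
        cols.append(transform(x, shifted))
    return cols


def objective(y):
    """Hypothetical objective: energy in the odd-indexed outputs."""
    return sum(v * v for v in y[1::2])


def grad_wrt_angles(x, angles):
    """Equation (43): the transposed Jacobian times the gradient with respect to y."""
    y = transform(x, angles)
    g_y = [2.0 * v if i % 2 else 0.0 for i, v in enumerate(y)]
    return [sum(a * b for a, b in zip(col, g_y)) for col in jacobian_columns(x, angles)]


def descend(x, angles, step=0.05, tol=1e-8, max_iter=5000):
    """Fixed-step descent until the gradient norm falls below tol."""
    angles = list(angles)
    for _ in range(max_iter):
        g = grad_wrt_angles(x, angles)
        if math.sqrt(sum(v * v for v in g)) < tol:
            break
        angles = [a - step * v for a, v in zip(angles, g)]
    return angles


raw = [0.1, 0.3, -0.2, 0.5, 0.4, -0.1, 0.2, 0.6]
norm = math.sqrt(sum(v * v for v in raw))
x = [v / norm for v in raw]  # unit energy keeps the fixed step size stable
start = [0.3, -0.7]
best = descend(x, start)
print(objective(transform(x, start)), objective(transform(x, best)))
```

The input is scaled to unit energy because the fixed step size assumed here is only appropriate when the gradient magnitude is bounded accordingly; a practical system would choose the step and thresholds according to its own requirements.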
where Kθ=cos(θ) and Sθ=sin(θ). The rotation block depicted in FIG. is considered to have a rotation angle of θ such that the input signals will be rotated through an angle of θ. The rotation block 300 includes first and second inputs α1 and α2. Input α1 is multiplied by the cos(θ1) and sin(θ1), and input α2 is multiplied by the cos(θ1) and −sin(θ1). The products α1cos(θ1) and −α2sin(θ1) are added together in summation module 302 to provide output −α1R. Similarly, the products α2cos(θ1) and α1sin(θ1) are added together in summation module 304 to provide output α2R. The rotation block depicted in
In particular,
The Jacobian can also be computed simultaneously with the transformed data vector y(0), y(1), and y(2). In particular, the data on line 413 is negated by module 414 and provided as a first input to module 412. The data on line 411 is provided as the second input to module 412. Module 412, in combination with the portion 402A of filter module 402, forms a lattice filter module having first and second rotation blocks with rotation angles of θ1 and θ2, respectively. Module 412 provides a first data output 415 and a second data output 417, wherein the second data output 417 is subsequently filtered by lattice filter module 410, which provides a third and a fourth data output 419 and 421.
The high pass filter data output 403 is provided to lattice filter module 408 that has two rotation modules that have rotation angles of θ1 and θ2 respectively. Filter module 408 filters the high pass data output 403 and provides a fifth data output 423 and a sixth data output 425. The data on line 427 is provided to module 406, which in combination with the first portion 404A of lattice filter module 404 also forms a lattice filter having two rotation blocks that have rotation angles of θ1 and θ2 respectively. Module 406 provides a seventh data output 429 and an eighth data output 431.
The Jacobian values are formed as linear combinations of the various output data. In particular, J(0) is formed as the sum of the low pass filter data output 405, negated by module 416, and the transformed output Y(0), combined in summation module 420. J(1) is formed as the sum of the fifth data output line 423 and the transformed output value Y(2), negated by module 418, combined in summation module 422. J(2) is formed as the sum of the sixth data output line 425 and the transformed output value Y(1), combined in summation module 424. J(3) is formed as the sum of the first data output line 415 and the transformed output value Y(0), combined in summation module 426. J(4) is formed as the sum of the third data output line 419 and the seventh data output on line 429, combined in summation module 428. J(5) is formed as the sum of the fourth data output line 421 and the eighth data output line 431, combined in summation module 430. The data that is provided on the outputs J(0), J(1), . . . , J(5) corresponds to the type of decomposition matrix that is used. For example, in the straight wavelet transform depicted in equations (38)-(40), the various Y output values correspond to the various blocks of the block diagonal matrix. In this example using 64 data input signals, J(0) provides 32 values, J(1) provides 16 values, J(2) provides 8 values, J(3) provides 4 values, J(4) provides 2 values, and J(5) provides 1 value. As can be seen, the values provided by each output correspond to the position of the identity matrix in the straight wavelet transform block matrices.
It is convenient to require that the mean of the high pass filter used herein be zero to ensure that no DC component passes through the filter without significant attenuation. In the time domain formulation, a zero mean high pass filter satisfies the zero mean filter criterion:
where c is the coefficient of the high pass filter as discussed above. This condition when translated into the lattice angle formulation and combined with the inequalities in equation (21) becomes:
The above is true if:
where k is equal to N/2, and N is the number of time domain filter coefficients. Thus, the search for the optimal filter coefficients becomes a constrained optimization problem that is expressed as:
The gradient of the objective function can easily be expressed on this plane, where the objective function is given by:
then the gradient of the objective function provided in equation (58) is given by:
As can be seen, constraining the high pass filter to be a zero mean filter has the advantage of reducing the complexity of the optimization problem by one from k to k−1 degrees of freedom.
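The reduction in degrees of freedom can be illustrated with a Python sketch. The constraint equations themselves are not reproduced in this text, so the sketch relies on an assumption that follows from the lattice convention of equations (29) and (30): the sum of the high-pass coefficients equals cos(Θ) − sin(Θ), where Θ is the sum of the angles, so the zero-mean condition holds when Θ = π/4. Fixing the last angle from the others leaves k−1 free angles. The function names are hypothetical.

```python
import math


def extend(h, theta):
    """One lattice stage under the assumed convention matching equations (29)-(30)."""
    cs, sn = math.cos(theta), math.sin(theta)
    even = h[0::2] + [0.0]
    odd = [0.0] + h[1::2]
    out = []
    for e, o in zip(even, odd):
        out += [e * cs - o * sn, e * sn + o * cs]
    return out


def highpass_from_angles(angles):
    """High-pass coefficients d for angles theta_1..theta_K."""
    d = [-math.sin(angles[0]), math.cos(angles[0])]
    for theta in angles[1:]:
        d = extend(d, theta)
    return d


def constrained_angles(free_angles, total=math.pi / 4):
    """Append the one dependent angle so that all angles sum to `total`."""
    return list(free_angles) + [total - sum(free_angles)]


angles = constrained_angles([0.3, -1.1])  # two free angles, one dependent angle
d = highpass_from_angles(angles)
print(sum(d))
```

With this substitution an optimizer searches only over the free angles, and every candidate it visits already corresponds to a zero-mean high-pass filter.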
Those of ordinary skill in the art should further appreciate that variations to and modifications of the above-described methods for providing optimal filter coefficients may be made without departing from the inventive concepts disclosed herein. Accordingly, the invention should be viewed as limited solely by the scope and spirit of the appended claims.
This application claims priority under 35 U.S.C. §119(e) to U.S. Provisional Patent Application Ser. No. 60/269,678, filed Feb. 20, 2001, the disclosure of which is incorporated by reference.
Number | Date | Country | |
---|---|---|---|
20030005007 A1 | Jan 2003 | US |
Number | Date | Country | |
---|---|---|---|
60269678 | Feb 2001 | US |