1. Field of the Invention
The present invention relates to digital signal processing and signal filtering, and particularly to a system and method for least mean fourth adaptive filtering that utilizes a normalized least mean fourth algorithm.
2. Description of the Related Art
The least mean fourth (LMF) algorithm has uses in a wide variety of applications. The algorithm outperforms the well-known least mean square (LMS) algorithm in cases with non-Gaussian noise. Even in Gaussian environments, the LMF algorithm can outperform the LMS algorithm when initialized far from the Wiener solution. The true usefulness of the LMF algorithm lies in its faster initial convergence and lower steady-state error relative to the LMS algorithm. More importantly, its mean-fourth error cost function yields better performance than that of the LMS for noise of sub-Gaussian nature, i.e., noise with a light-tailed probability density function. However, this higher-order algorithm requires a much smaller step-size to ensure stable adaptation, since the cubed error in the LMF gradient vector can cause devastating initial instability; the small step-size, in turn, slows adaptation and needlessly degrades performance.
One approach to solving the above degradation problem is to normalize the weight update term of the algorithm. In conventional techniques, the LMF algorithm is normalized by dividing the weight vector update term by the squared norm of the regressor. In some prior art techniques, this normalization is performed by dividing the weight vector update term by a weighted sum of the squared norm of the regressor and the squared norm of the estimation error vector. Thus, the LMF algorithm is normalized by both the signal power and the error power. Combining the two has the advantage that the signal power normalizes the input signal, while the error power damps outlier estimation errors, thus improving stability while still retaining fast convergence.
The above has been modified by using an adaptive, rather than fixed, mixing parameter in the weighted sum of the squared norm of the regressor and the squared norm of the estimation error vector. The adaptation of the mixing parameter improves the tracking properties of the algorithm. However, unlike the normalization of the LMS algorithm, the above normalization techniques of the LMF algorithm do not protect the algorithm from divergence when the input power of the adaptive filter increases. In fact, as will be shown below, the above prior art normalized LMF (NLMF) algorithms diverge when the input power of the adaptive filter exceeds a threshold that depends on the step-size of the algorithm. The reason for this drawback is that in all of the above techniques, the weight vector update term, which is a fourth order polynomial of the regressor, is normalized by a second order polynomial of the regressor.
The prior art normalization techniques do not ensure a normalization of the input signal. Thus, the algorithm stability remains dependent on the input power of the adaptive filter.
For the particular application of adaptive plant identification, as diagrammatically illustrated in the drawings, the plant output ak at time k is given by:
ak=gTxk+bk (1)
where
g≡(g1,g2, . . . ,gN)T (2)
is a vector composed of the plant parameters gi (from an unknown finite impulse response (FIR) filter 102), and
xk=(xk,xk−1,xk−2, . . . ,xk−N+1)T (3)
is the regressor vector at time k. N is the number of plant parameters, xk is the plant input, bk is the plant noise, and the notation (.)T represents the transpose of (.). The identification of the plant is made by an adaptive finite impulse response (FIR) filter 104 whose length is assumed equal to that of the plant. The weight vector hk of the adaptive filter is adapted on the basis of the error ek, which is given by
ek=ak−hkTxk, (4)
where hk=(h1,k, h2,k, . . . , hN,k)T. The adaptation algorithm of interest is the LMF algorithm, which is described by
hk+1=hk+μek3xk, (5)
where μ>0 is the algorithm step-size. The error signal ek can be decomposed into two terms as follows:
ek≡bk+εk. (6)
The first term on the right hand side of equation (6), bk, is the plant noise. The second term, εk, is the excess estimation error. The weight deviation vector is defined by
vk≡hk−g. (7)
From equations (1), (4), (6) and (7),
εk=−vkTxk. (8)
Inserting equations (1), (4) and (7) into equation (5) yields
vk+1=vk+μ(bk−vkTxk)3xk. (9)
In the above, the following assumptions are used: The first assumption (assumption A1) is that the sequences {xk} and {bk} are mutually independent. The second assumption (assumption A2) is that {xk} is a stationary sequence of zero mean random variables with a finite variance σx2. The third assumption (assumption A3) is that {bk} is a stationary sequence of independent zero mean random variables with a finite variance σb2. Such assumptions are typical in the context of adaptive filtering.
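To make the plant identification setup of equations (1)-(9) concrete, the following Python sketch simulates the plain LMF recursion of equation (5) under assumptions A1-A3. The filter length, step-size, signal levels, and random plant vector are illustrative choices for this sketch, not values prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8                                   # number of plant parameters (illustrative)
g = rng.standard_normal(N)
g /= np.linalg.norm(g)                  # unknown plant vector g, scaled so ||g|| = 1
h = np.zeros(N)                         # adaptive filter weights, h_1 = 0
mu = 1e-3                               # LMF step-size; must be small for stability
sigma_b = 0.01                          # plant noise standard deviation

x_line = np.zeros(N)                    # tap-delay line holding the regressor x_k
for k in range(20000):
    x_line = np.roll(x_line, 1)
    x_line[0] = rng.standard_normal()                # plant input x_k (unit power)
    a = g @ x_line + sigma_b * rng.standard_normal() # plant output a_k, eq. (1)
    e = a - h @ x_line                               # estimation error e_k, eq. (4)
    h = h + mu * e**3 * x_line                       # LMF update, eq. (5)

print("final weight deviation ||v_k||:", np.linalg.norm(h - g))
```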
In order to emphasize the need for normalization in the least mean fourth algorithm, it is instructive to first examine the normalization of the LMS algorithm. The stability of the LMS algorithm depends upon the input power of the adaptive filter. This makes it very hard, if not impossible, to choose a step-size that guarantees stability of the algorithm when knowledge of the input power is lacking. This is solved by normalizing the weight update term by ∥xk∥2, where ∥xk∥ is the Euclidean norm of the vector xk, defined as ∥xk∥=√(xkTxk). The resulting algorithm is referred to as the normalized LMS (NLMS) algorithm. This algorithm is stable for all values of the filter input power, so long as the step-size is between 0 and 2.
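For reference, a minimal sketch of one NLMS iteration as just described; the small constant eps is a common regularization safeguard assumed here to avoid division by zero:

```python
import numpy as np

def nlms_update(h, x, d, mu=0.5, eps=1e-8):
    """One NLMS iteration; stable for 0 < mu < 2 regardless of the input power."""
    e = d - h @ x                             # a-priori estimation error
    h_next = h + mu * e * x / (eps + x @ x)   # update normalized by ||x_k||^2
    return h_next, e
```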
It is desirable to develop a version of the LMF algorithm that has a feature similar to that of the NLMS algorithm; i.e., stability for all values of the filter input power for an appropriate fixed range of the step-size. One prior art normalization technique is given as
hk+1=hk+μek3xk/∥xk∥2, (10)
and a second version is given by
hk+1=hk+μek3xk/(δ+λ∥xk∥2+(1−λ)∥ek∥2), (11)
where δ is a small positive number, 0<δ<1, and ek=(ek, ek−1, ek−2, . . . , ek−N+1)T is the error vector. The parameter λ is referred to as the mixing power parameter. The choice of λ is a compromise between fast convergence and low steady-state error. However, the stability of the above algorithms depends on the mean square input of the adaptive filter. To show this undesired feature for the algorithm of equation (10), we may consider the scalar case, N=1, with zero noise, bk=0, and binary input, xk ε{−1,1}, with μ=0.5. In this case, equation (10) implies that:
vk+1=vk−0.5vk3. (12)
If v1=1, then equation (12) implies that v2=0.5, v3=0.4375, v4=0.3956, v5=0.3647, etc. Thus, vk is decaying in this case. Repeating this example with xk ε{−4,4}, while keeping all other conditions unchanged, equation (10) implies that:
vk+1=vk−8vk3. (13)
Again, if v1=1, then equation (13) yields v2=−7, v3=2737, v4≈−1.6×10^11, v5≈3.5×10^34, etc. Thus, the algorithm of equation (10) diverges in this case. This shows that the stability of the normalized LMF algorithm of equation (10) depends on the input power of the adaptive filter.
It can also be shown that the stability of the algorithm of equation (11) depends on the input power. We may consider again the scalar case with μ=0.5, δ=0, λ=0.5, xk ε{−1,1}, and bk=0. In this case, equations (6) and (8) imply that ek2=vk2xk2, and equation (11) implies that:
vk+1=vk−vk3/(1+vk2). (14)
If v1=1, then equation (14) produces v2=0.5, v3=0.4, v4=0.3448, v5=0.3082, etc. Thus, vk is decaying in this case. Repeating this example with xk ε{−4,4}, while keeping all other conditions unchanged, equation (11) implies that:
vk+1=vk−16vk3/(1+vk2). (15)
Again, if v1=1, then equation (15) produces v2=−7, v3=102.76, v4=−1541.2, v5=23119, etc. Thus, the algorithm of equation (11) diverges in this case. This shows that the stability of the normalized LMF algorithm of equation (11) depends on the input power of the adaptive filter.
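The scalar examples above are easy to reproduce. The following sketch iterates the recursions of equations (12)-(15) as given above and prints the successive deviations, showing decay for unit input power and divergence for xk ε{−4,4}:

```python
def iterate(update, v1=1.0, steps=4):
    """Return [v_1, ..., v_{steps+1}] for a scalar recursion v_{k+1} = update(v_k)."""
    vs = [v1]
    for _ in range(steps):
        vs.append(update(vs[-1]))
    return vs

# Equation (10), scalar case, zero noise, mu = 0.5:
print(iterate(lambda v: v - 0.5 * v**3))             # x_k in {-1, 1}: eq. (12), decays
print(iterate(lambda v: v - 8.0 * v**3))             # x_k in {-4, 4}: eq. (13), diverges

# Equation (11), scalar case, mu = 0.5, delta = 0, lambda = 0.5:
print(iterate(lambda v: v - v**3 / (1 + v**2)))      # x_k in {-1, 1}: eq. (14), decays
print(iterate(lambda v: v - 16 * v**3 / (1 + v**2))) # x_k in {-4, 4}: eq. (15), diverges
```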
The above results regarding the dependence of the stability of the prior art NLMF algorithms on the input power of the adaptive filter suggest the need for an NLMF algorithm whose stability does not depend on the input power. Thus, a system and method for least mean fourth adaptive filtering solving the aforementioned problems is desired.
The system and method for least mean fourth adaptive filtering is a system that uses a general purpose computer or a digital circuit (such as an ASIC, a field-programmable gate array, or a digital signal processor) that is programmed to utilize a normalized least mean fourth algorithm. The normalization is performed by dividing a weight vector update term by the fourth power of the norm of the regressor. The normalized least mean fourth algorithm remains stable as the filter input power increases.
The least mean fourth adaptive filter includes a finite impulse response filter having a desired output ak at a time k. An impulse response of the finite impulse response filter is defined by a set of weighting filter coefficients hk, and an input signal of the finite impulse response filter is defined by a regressor input signal vector, xk, at the time k. An error signal ek at the time k is calculated as a difference between the desired output ak at the time k and an estimated signal given by hkTxk, such that ek=ak−hkTxk. The set of weighting filter coefficients is iteratively updated as
hk+1=hk+αek3xk/∥xk∥4,
where α is a fixed positive step-size.
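A minimal Python sketch of one iteration of this update follows; the small constant eps is an assumption of the sketch, added to guard against division by zero for a vanishing regressor:

```python
import numpy as np

def nlmf_update(h, x, d, alpha=0.5, eps=1e-12):
    """One normalized LMF iteration: h <- h + alpha * e^3 * x / ||x||^4."""
    e = d - h @ x                                      # error e_k = a_k - h_k^T x_k
    norm2 = x @ x                                      # ||x_k||^2
    h_next = h + alpha * e**3 * x / (eps + norm2**2)   # normalization by ||x_k||^4
    return h_next, e
```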
These and other features of the present invention will become readily apparent upon further review of the following specification and drawings.
Similar reference characters denote corresponding features consistently throughout the attached drawings.
The system and method for least mean fourth adaptive filtering is a system that uses a general purpose computer or a digital circuit (such as an ASIC, a field-programmable gate array, or a digital signal processor) that is programmed to utilize a normalized least mean fourth algorithm. The normalization is performed by dividing a weight vector update term by the fourth power of the norm of the regressor. The normalized least mean fourth algorithm remains stable as the filter input power increases.
In order to examine the present least mean fourth adaptive filter, it is useful to first consider the scalar case, in which N=1 (where N is the number of plant parameters), with zero noise. In such a case, equation (9) simplifies to:
vk+1=(1−μxk4vk2)vk. (16)
The stability of the algorithm represented by equation (16) depends on the statistics of xk (i.e., the plant input). This dependence is removed by using a variable step-size μ that satisfies the following:
μ=α/xk4, (17)
assuming that xk≠0 for all k and that α is a fixed positive number. In such a case, equation (16) becomes
vk+1=(1−αvk2)vk. (18)
Since α does not depend on xk, the sequence {vk} generated by equation (18) also does not depend on xk. Thus, the stability of equation (18) does not depend on the statistics of xk. A sufficient condition for the decay of |vk| for all k is that:
0<α<2/vk2. (19)
When |vk| is decreasing, |v1| will be the largest value of |vk|. Then, a sufficient condition for equation (19) is:
0<α<2/v12. (20)
Equation (20) is the step-size range that is sufficient for the convergence of the algorithm represented by equation (16).
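This scalar analysis is easy to check numerically. The sketch below iterates equation (18) with a step-size inside the range of equation (20); the particular values of α and v1 are arbitrary illustrative choices:

```python
v, alpha = 1.0, 1.5              # v_1 = 1, so eq. (20) requires 0 < alpha < 2
for k in range(6):
    print(k + 1, abs(v))         # |v_k| decreases monotonically
    v = (1 - alpha * v**2) * v   # eq. (18); independent of x_k
```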
The above can now be extended to the N-dimensional LMF algorithm described by equation (9). A factor that determines the convergence of the algorithm is its behavior at the start of adaptation. At the start of adaptation, the magnitude of the deviation vector vk is usually so large that the excess estimation error vkTxk dominates the plant noise bk. In such a case, equation (9) can be approximated as:
vk+1≈vk−μ(vkTxk)3xk=Tkvk, (21)
where the transformation matrix Tk is given by:
T
k
=I−μ
x
k
x
k
T
v
k
v
k
T
x
k
x
k
T. (22)
The transformation matrix of equation (22) depends on ∥xk∥. This dependence can be removed by using a variable step-size μ that satisfies:
μ=α/∥xk∥4, (23)
assuming that xk≠0 for all k and that α is a fixed positive number. In such a case, equation (22) becomes:
Tk=I−αxkxkTvkvkTxkxkT/∥xk∥4. (24)
The matrix Tk given by equation (24) depends on the direction of the vector xk, but it does not depend on its norm, ∥xk∥. Inserting equation (23) into equation (5) yields:
hk+1=hk+αek3xk/∥xk∥4. (25)
Equation (25) represents the present NLMF algorithm. Due to equation (4), the weight vector update term on the right-hand side of equation (25) is proportional to the negative gradient of (ek/∥xk∥)4 with respect to hk. Thus, the algorithm represented by equation (25) may be viewed as a gradient algorithm based on the minimization of the mean fourth normalized estimation error E(ek/∥xk∥)4. Similar to the NLMS algorithm, the NLMF algorithm represented by equation (25) may be regularized by adding a small positive number to ∥xk∥4 in order to avoid division by zero.
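The gradient interpretation can be checked numerically: a finite-difference gradient of (ek/∥xk∥)4 with respect to hk should be anti-parallel to the update term ek3xk/∥xk∥4 of equation (25). A brief sketch of this check, using arbitrary test data:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5
h, x, a = rng.standard_normal(N), rng.standard_normal(N), 0.7

def cost(h_vec):
    """Normalized fourth-power error (e_k / ||x_k||)^4 for fixed a_k and x_k."""
    e = a - h_vec @ x
    return (e / np.linalg.norm(x)) ** 4

# Central finite-difference gradient of the cost with respect to h.
grad = np.zeros(N)
delta = 1e-6
for i in range(N):
    hp, hm = h.copy(), h.copy()
    hp[i] += delta
    hm[i] -= delta
    grad[i] = (cost(hp) - cost(hm)) / (2 * delta)

e = a - h @ x
update = e**3 * x / np.linalg.norm(x) ** 4   # update direction of eq. (25)
print(grad / update)                         # approx. constant -4 in every entry
```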
In the following, a condition on the step-size α is derived that is sufficient for the convergence of the algorithm of equation (25). First, a step-size range is derived that is sufficient for stability of the algorithm in the initial phase of adaptation. Then, a step-size range that is sufficient for steady-state stability is derived. The final step-size range is obtained from the intersection of these two ranges.
One is typically interested in mean-square stability, which is based on the evolution of the mean-square deviation E(∥vk∥2) with time, where E denotes the mathematical expectation operator. In the initial phase, the magnitude of E(vk) is usually large in comparison with the fluctuation of vk around E(vk). Thus, E(∥vk∥2)≈∥E(vk)∥2 in the initial phase of adaptation. Consequently, the derivation of the stability step-size range in the initial adaptation phase can be simplified by calculating it on the basis of the mean of vk, rather than the mean square of vk. This is not the case with the steady-state condition, which should be derived on the basis of the mean square of vk.
With respect to the initial phase condition, at the start of adaptation, the magnitude of the deviation vector vk is usually so large that the excess estimation error vkTxk dominates the plant noise bk. In such a case, equation (25) can be approximated by equation (21), with Tk being given by equation (24). Letting E(x|y) denote the conditional expectation of x given y, a condition on α is derived that implies that:
∥E(vk+1|vk)∥<∥vk∥. (26)
This condition means that the magnitude of the weight deviation vector is decreasing at each iteration step, in a mean sense. Due to equation (21), E(vk+1|vk)=E(Tk|vk)vk. Then, a sufficient condition for equation (26) is that the magnitudes of all the eigenvalues of the matrix E(Tk|vk) are less than 1. Due to equation (24), the eigenvalues of E(Tk|vk) are equal to 1−αλi(k), where λi(k), i=1, 2, . . . , N, are the eigenvalues of the matrix P(k), defined by:
P(k)≡E(xkxkTvkvkTxkxkT/∥xk∥4|vk). (27)
Thus, a sufficient condition for equation (26) is that:
|1−αλi(k)|<1, i=1,2, . . . ,N. (28)
It is assumed that the matrix P(k) is positive definite. This assumption is a sort of persistent excitation assumption. A geometrical validation of this assumption is given below. From equation (27), the trace of the matrix P(k) is less than or equal to ∥vk∥2. Thus,
0<λi(k)<∥vk∥2,i=1,2, . . . ,N. (29)
Equation (29) implies that a sufficient condition for equation (28) is that:
0<α<2/∥vk∥2. (30)
Physically speaking, in order to satisfy equation (30) for all k, it is sufficient to satisfy it at k=1, since the maximum weight deviation takes place at the start of adaptation. This leads to the following condition:
0<α<2/∥v1∥2. (31)
Equation (31) is the step-size range that is sufficient for stability of the NLMF algorithm of equation (25) in the initial phase of adaptation. This step-size range does not depend on the input power of the adaptive filter. The range given by equation (31) reflects the dependence of the stability of the LMF algorithm on the weight initialization of the adaptive filter.
The following provides a geometrical validation of the above assumption that the matrix P(k) is positive definite. For any vector u in the N-dimensional space, equation (27) implies that:
uTP(k)u=E(ux2(k)vx2(k)|vk), (32)
where
ux(k)≡uTxk/∥xk∥ (33)
is the projection of the vector u on the vector xk, and
vx(k)≡vkTxk/∥xk∥ (34)
is the projection of the vector vk on the vector xk. When xk is persistently exciting, it spans the whole N-dimensional space. Then, with non-zero probability, the projections of given vectors u and vk on xk will be different from zero, which implies that ux2(k)vx2(k) will be positive. This implies that the right-hand side of equation (32) will be positive, which implies that the matrix P(k) is positive definite.
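This argument can also be checked numerically. The following sketch forms a Monte Carlo estimate of the matrix P(k) of equation (27) for a fixed vk and IID Gaussian regressors (an illustrative choice of persistently exciting input) and prints its eigenvalues, which are positive and below ∥vk∥2, in agreement with equation (29):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4
v = rng.standard_normal(N)            # a fixed weight deviation vector v_k

P = np.zeros((N, N))
trials = 100000
for _ in range(trials):
    x = rng.standard_normal(N)        # persistently exciting regressor
    xxT = np.outer(x, x)
    P += xxT @ np.outer(v, v) @ xxT / np.dot(x, x) ** 2
P /= trials                           # Monte Carlo estimate of eq. (27)

print(np.linalg.eigvalsh(P))          # all eigenvalues positive ...
print(np.dot(v, v))                   # ... and less than ||v_k||^2, cf. eq. (29)
```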
With regard to the steady-state condition, a sufficient condition for the steady-state stability of the mean square deviation of the regular LMF algorithm of equation (5) is given by:
0<μ<1/(10σb2E(∥xk∥2)). (35)
For long adaptive filters, ∥xk∥4 can be approximated by E(∥xk∥4). Then, equation (23) implies that the condition of equation (35) on μ can be mapped to the following condition on α:
0<α<E(∥xk∥4)/(10σb2E(∥xk∥2)). (36)
Equation (36) is the step-size range that is sufficient for stability of the present NLMF algorithm of equation (25) in the steady-state.
With regard to the final step-size condition, equation (31) is sufficient for convergence of the NLMF algorithm of equation (25) in the initial phase of adaptation, while equation (36) is sufficient for stability around the Wiener solution. Both properties can be achieved by choosing the step-size to satisfy the following condition:
0<α<min{2/∥v1∥2, E(∥xk∥4)/(10σb2E(∥xk∥2))}. (37)
For a long adaptive filter, E(∥xk∥4)/E(∥xk∥2)≈Nσx2, so that the condition of equation (37) becomes:
0<α<min{2/∥v1∥2, Nσx2/(10σb2)}. (38)
The second term in the argument of the min{.} function on the right-hand side of equation (38) is increasing in N and in the signal-to-noise ratio σx2/σb2. Thus, for a long adaptive filter, non-small signal-to-noise ratio, and non-small initial weight deviation ∥v1∥,
2/∥v1∥2≤Nσx2/(10σb2). (39)
For illustration of equation (39), an example with N=32, binary input xk ε{−σx,σx}, binary noise bk ε{−σb,σb}, and ∥v1∥=1 is considered. In this example, equation (39) holds as long as the signal-to-noise ratio σx2/σb2 is greater than ⅝. Equations (38) and (39) imply that:
0<α<2/∥v1∥2. (40)
Thus, the final step-size condition of equation (37) is identical with the initial phase condition of equation (31). The condition of equation (31) is therefore sufficient for the stability of the NLMF algorithm of equation (25) in applications with a long adaptive filter and non-small signal-to-noise ratio. This condition is validated by the simulation results given below.
The following simulations were performed for the case of adaptive plant identification, as illustrated in the drawings. The regressor is given by:
xk=(xk,xk−1, . . . ,xk−N+1)T, (41)
where xk is the plant input. The sequence {xk} is a zero mean, independent and identically distributed (IID) Gaussian sequence with variance σx2. The plant noise is a zero mean IID Gaussian sequence with variance σb2. The plant parameters are given by:
The value of ρ is chosen such that ∥g∥=1. In the exemplary case of the simulation, N=32 and ρ=0.0183. The initial weight vector of the adaptive filter is h1=0; thus, ∥v1∥=1.
To study the dependence of the stability of the conventional NLMF algorithms of equations (10) and (11) and the present NLMF algorithm of equation (25) on the input power of the filter, the algorithms are simulated over a wide range of σx, ranging from 0.2 to 1000. The considered value of σb is 0.01. For the algorithm represented by equation (10), μ=1. For the algorithm represented by equation (11), μ=1, δ=0, and λ=0.5. For the algorithm represented by equation (25), α=1. The simulations have shown that the conventional NLMF algorithms of equations (10) and (11) diverge for σx=1 and above, whereas the present NLMF algorithm of equation (25) is non-divergent for all σx.
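The input-power experiment is straightforward to replicate. The sketch below follows the stated parameters, except that a unit-norm random plant vector stands in for the ρ-parameterized plant of equation (42); that substitution, and the light regularization of the denominators, are assumptions of the sketch:

```python
import numpy as np

np.seterr(all="ignore")                  # divergent runs overflow by design

def run(algo, sigma_x, N=32, iters=20000, sigma_b=0.01, seed=0):
    """Simulate plant identification; return final ||v_k|| (np.inf on divergence)."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal(N); g /= np.linalg.norm(g)   # stand-in plant, ||g|| = 1
    h = np.zeros(N)                      # h_1 = 0, so ||v_1|| = 1
    x = np.zeros(N)                      # regressor delay line
    e_vec = np.zeros(N)                  # error vector for eq. (11)
    for _ in range(iters):
        x = np.roll(x, 1); x[0] = sigma_x * rng.standard_normal()
        e = (g @ x + sigma_b * rng.standard_normal()) - h @ x
        e_vec = np.roll(e_vec, 1); e_vec[0] = e
        n2 = x @ x + 1e-12               # ||x_k||^2, lightly regularized
        if algo == "eq10":               # prior art NLMF, mu = 1
            h = h + e**3 * x / n2
        elif algo == "eq11":             # prior art NLMF, mu = 1, delta = 0, lambda = 0.5
            h = h + e**3 * x / (0.5 * n2 + 0.5 * (e_vec @ e_vec) + 1e-12)
        else:                            # present NLMF, eq. (25), alpha = 1
            h = h + e**3 * x / n2**2
        if not np.isfinite(h).all() or np.linalg.norm(h) > 1e6:
            return np.inf                # diverged
    return np.linalg.norm(h - g)

for sigma_x in (0.2, 1.0, 10.0):
    print(sigma_x, {a: run(a, sigma_x) for a in ("eq10", "eq11", "present")})
```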
In order to validate the step-size condition given by equation (31) for the present NLMF algorithm, the maximum step-size for which the algorithm is convergent is determined by simulations and compared with the step-size bound provided by equation (31). This is performed at several values of σx. The considered noise variance, plant parameter vector, and initial weight vector of the adaptive filter are as given above. The results are shown in the drawings.
To validate the step-size condition of equation (31) for non-Gaussian plant input and plant noise, the above simulations were repeated with non-Gaussian distributions for the plant input and the plant noise; the results are shown in the drawings.
To validate the step-size condition given by equation (31) for non-white input of the plant, the above simulations were repeated with a correlated plant input generated by the first-order autoregressive model:
xk=βxk−1+σx√(1−β2)wk, 0≤β<1, (43)
where wk is a zero mean, unity variance IID Gaussian sequence. The parameter β controls the degree of correlation of the sequence {xk}: the greater β is, the stronger the correlation. In the simulations, the considered value of β was 0.95, which corresponds to a strong correlation of the sequence {xk}. This implies both a strong correlation among the components of the regressor and a strong correlation between successive regressors. The noise is white Gaussian. The considered noise variance, plant parameter vector, and initial weight vector of the adaptive filter are the same as those considered above.
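A correlated input satisfying equation (43) is simple to generate; a brief sketch with β=0.95, as in the text:

```python
import numpy as np

def ar1_input(n, beta=0.95, sigma_x=1.0, seed=3):
    """Generate n samples of the AR(1) process of eq. (43); variance sigma_x^2."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n)           # zero mean, unit variance IID Gaussian w_k
    x = np.zeros(n)
    for k in range(1, n):
        x[k] = beta * x[k - 1] + sigma_x * np.sqrt(1 - beta**2) * w[k]
    return x

x = ar1_input(100000)
print(np.var(x))                         # approx. sigma_x^2 = 1
print(np.corrcoef(x[:-1], x[1:])[0, 1])  # lag-1 correlation approx. beta = 0.95
```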
Finally, the present NLMF algorithm may be applied to echo cancellation, as described below.
Echoes with long delay are observed only on long-distance connections. To clearly understand the echo phenomenon, consider a long-distance telephone connection in which a hybrid couples the two-wire subscriber loop to the four-wire long-distance trunk; an impedance mismatch at the hybrid reflects part of the received signal back toward the talker, producing the echo.
In order to counteract the echo phenomenon, schemes must be developed to either completely eliminate it (i.e., the ideal requirement), or to at least substantially reduce its adverse effect so as to achieve a transmission of good quality. Echo cancellation is a suitable area for the application of adaptive filtering. An adaptive echo canceller 208, 210 estimates the responses of an underlying echo-generating system in real time in the face of unknown and time-varying echo path characteristics, generates a synthesized echo based on the estimate, and cancels the echo by subtracting the synthesized echo from the received signal.
In the echo canceller, the synthesized echo yk′ is given by:
yk′=h1,kx1+h2,kx2+ . . . +hN,kxN, (44)
where {hi,k} is the estimated echo-path impulse response sample, xi is the input sample to the ith-tap delay, and N is the number of tap coefficients. While passing through the hybrid 212, the speech from customer C results in the echo signal yk. This echo, together with the speech from office O, rk, constitutes the desired response for the adaptive filter. The canceller error signal is obtained as follows:
ζk=yk−yk′+rk=ek+rk. (45)
The error signal ek of equation (45), ek=yk−yk′, drives the adaptation of the tap weights of the canceller.
Ideally, the system eventually converges to the condition ζk=rk. The effect of this ideal condition on the echo cancellation is naturally of some concern. Convergence of the echo to zero, however, is not an adequate criterion of performance for a system of this type, since this is possible only if yk is exactly representable as the output of a fixed-tap filter. A better performance criterion is the convergence of the filter's impulse response to the response of the echo path.
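A compact sketch of the echo-cancellation configuration described above, with the canceller tap weights adapted by the present NLMF update of equation (25); the echo-path model, signal levels, and step-size are illustrative assumptions of this sketch:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 32
echo_path = rng.standard_normal(N) * np.exp(-0.2 * np.arange(N))
echo_path /= np.linalg.norm(echo_path)   # unknown hybrid echo path (illustrative)

h = np.zeros(N)                          # canceller tap weights h_k
x = np.zeros(N)                          # delay line for far-end speech
alpha = 0.5                              # within the bound of eq. (31), ||v_1|| = 1

for k in range(50000):
    x = np.roll(x, 1)
    x[0] = rng.standard_normal()                 # far-end (customer C) signal
    y = echo_path @ x                            # echo y_k through the hybrid
    r = 0.1 * rng.standard_normal()              # near-end (office O) speech r_k
    d = y + r                                    # desired response y_k + r_k
    y_hat = h @ x                                # synthesized echo y_k'
    zeta = d - y_hat                             # canceller error, eq. (45)
    h = h + alpha * zeta**3 * x / (1e-12 + (x @ x) ** 2)   # NLMF update, eq. (25)

print("echo-path misalignment:", np.linalg.norm(h - echo_path))
```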
It is to be understood that the present invention is not limited to the embodiments described above, but encompasses any and all embodiments within the scope of the following claims.