The present subject matter relates generally to audio processing devices and hearing assistance devices, and in particular to efficient convex optimization for real-time robust beamforming with microphone arrays.
For hearing aid users, speech-in-noise is one of the most difficult situations to deal with, because the noise deteriorates speech intelligibility. Several methods have been proposed to resolve this issue, but they are complicated when the direction of the desired speech is not known, because efforts to reduce the noise can inadvertently reduce the speech as well. This inadvertent reduction of the desired speech is called target cancellation, and the direction of the desired speech is described by a vector called the steering vector.
Previous methods to resolve the speech-in-noise problem included estimating the steering vector or constraining the adaptation range to avoid target cancellation. The first class of methods, which tries to estimate the steering vector, has significant shortcomings, because the steering vector can differ significantly between subjects and the steering vector of a single subject is different every time the subject puts on the hearing aid. The second class of methods, which limits the adaptation range, also has shortcomings, because limiting the adaptation reduces target cancellation but also reduces the noise-reduction benefit.
A third class of methods does not use a single steering vector (indicating a specific target direction), but a range of steering vectors (indicating a target region) from which the speech target can come. This third class of methods uses fixed (static) or adaptive beamforming algorithms to improve speech intelligibility in noise. Adaptive beamforming algorithms reduce the noise as much as possible under the constraint that sound coming from the target region is not attenuated, and they have the highest potential to improve speech intelligibility. Methods of this third class, which protect the target region, work well, but they have been designed for applications that include multiple sensors and that have the capacity for much more computational complexity than is available in a hearing aid.
What is needed is an algorithm that performs adaptive beamforming, is robust against steering vector mismatches, and is computationally feasible for a hearing aid.
Disclosed herein, among other things, are methods and apparatus for improving speech intelligibility for speech-in-noise in audio processing and hearing assistance devices. The present subject matter includes a hearing assistance device having a microphone array configured to receive an audio signal, the audio signal including speech and noise. The hearing assistance device also includes a processor configured to process the received signal to improve speech intelligibility in noise. The processor is configured to use a barrier-type beamforming process to improve signal-to-noise ratio at the output of the microphone array. The beamforming process includes convex optimization using a logarithmic barrier function, according to various embodiments.
One aspect of the present subject matter includes a method for improving speech intelligibility for speech-in-noise for audio processing and hearing assistance devices. The method includes receiving an audio signal using a microphone array and processing the received signal to improve speech intelligibility in noise. A barrier-type beamforming process is used to improve signal-to-noise ratio at the output of the microphone array. The beamforming process includes convex optimization using a logarithmic barrier function, according to various embodiments.
This Summary is an overview of some of the teachings of the present application and not intended to be an exclusive or exhaustive treatment of the present subject matter. Further details about the present subject matter are found in the detailed description and appended claims. The scope of the present invention is defined by the appended claims and their legal equivalents.
The following detailed description of the present subject matter refers to subject matter in the accompanying drawings which show, by way of illustration, specific aspects and embodiments in which the present subject matter may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the present subject matter. References to “an”, “one”, or “various” embodiments in this disclosure are not necessarily to the same embodiment, and such references contemplate more than one embodiment. The following detailed description is demonstrative and not to be taken in a limiting sense. The scope of the present subject matter is defined by the appended claims, along with the full scope of legal equivalents to which such claims are entitled.
The present subject matter presents an efficient implementation of a robust adaptive beamforming algorithm based on convex optimization for applications in the processing-constrained environment of a digital hearing aid. Several modifications of the standard interior point barrier method are introduced for use where the array correlation is changing rapidly relative to the algorithm's convergence rate. These efficiency improvements significantly simplify the computation without affecting the algorithm's fast convergence, and are useful for real-time adaptive beamforming regardless of the rate of array correlation change. Simulation results show that this implementation is numerically stable and succeeds where many minimum-variance distortionless response (MVDR) solutions fail.
Although adaptive beamforming algorithms can improve the signal-to-noise ratio at the output of a microphone array [Cox et al., IEEE Trans. Acoust., Speech, Signal Processing, 35:1365 (1987)], they are not robust against any mismatch in the steering vector [Greenberg and Zurek, J. Acoust. Soc. Am., 91:1662 (1992)]. Several methods have been proposed in the literature to resolve the steering mismatch issue [Hoshuyama et al., IEEE Trans. Signal Processing, 47:2677 (1999); Stoica et al., IEEE Signal Processing Letters, 10:172 (2003); Vorobyov et al., IEEE Trans. Signal Processing, 51:313 (2003)]. The first two papers estimate the steering vector in real-time as part of the adaptive beamforming algorithm and the third paper establishes a protected region around the steering vector where it allows no reduction.
For the hearing aid application, the estimation of the steering vector would be difficult, because the steering vector changes every time the wearer puts on the hearing aid and the steering vector can change when the wearer touches the hearing aid. Hence the method in [Vorobyov et al., 2003] is the most promising solution to solve the robustness problem of adaptive beamformers. It minimizes the output of the microphone array while maintaining a distortionless response for the worst case (mismatched) steering vector. Furthermore it derives a convex formulation for such a robust adaptive beamforming problem using second-order cone programming (SOCP) [Vorobyov et al., 2003]. The paper has, however, not been written with a hearing-aid application in mind: it neither takes into account the hearing aid's constraints on the computational complexity nor the ever-changing sound fields in which hearing aids are typically used, which results in time-varying data statistics and steering vectors. The present subject matter proposes efficient real-time convex optimization algorithms to solve the robust adaptive beamforming problem in a rapidly changing environment. It uses the barrier method with a logarithmic barrier function to solve the SOCP problem. The focus is on the balance among robustness, real-time adaptivity, and computational efficiency.
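Consider an MVDR beamformer that is robust against an arbitrary signal steering vector mismatch. The beamformer can be obtained by solving the following optimization problem [Vorobyov et al., 2003]:
$$\min_{w}\; w^{H} R\, w \quad \text{subject to} \quad |a^{H} w| \ge 1 \;\; \text{for all } a \in A(\varepsilon) \tag{1}$$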
where w is the beamformer, R is the data covariance matrix, a is the steering vector, and A(ε) is the uncertainty set of the steering vector.
Assume that the mismatch between the actual steering vector and the nominal one can be bounded by some known constant ε. The uncertainty set can then be expressed as:
$$A(\varepsilon) = \{\, a \mid a = a_{0} + \Delta,\ \|\Delta\| \le \varepsilon \,\}.$$
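By way of illustration only, the following minimal Python sketch shows why a constraint over every vector in this uncertainty set can be collapsed into a single condition. By the triangle and Cauchy-Schwarz inequalities, the smallest response magnitude over the ball of radius ε around the nominal steering vector is the nominal response magnitude minus ε times the norm of w (when that difference is positive). The helper names and the numerical values of `a0`, `w`, and `eps` are assumptions chosen for the example and do not come from the description.

```python
import cmath
import math
import random


def inner(a, w):
    """Return a^H w for complex vectors stored as lists."""
    return sum(ai.conjugate() * wi for ai, wi in zip(a, w))


def norm(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(vi) ** 2 for vi in v))


def worst_case_response(a0, w, eps):
    """Smallest |a^H w| over all a = a0 + delta with ||delta|| <= eps."""
    return max(abs(inner(a0, w)) - eps * norm(w), 0.0)


def worst_case_delta(a0, w, eps):
    """Mismatch of norm eps that attains the bound when it is positive."""
    theta = cmath.phase(inner(a0, w))
    scale = -eps * cmath.exp(-1j * theta) / norm(w)
    return [scale * wi for wi in w]


# Illustrative values only (assumed, not taken from the description).
a0 = [1 + 0j, 1 + 0j, 1 + 0j]
w = [0.4 + 0.1j, 0.3 - 0.2j, 0.5 + 0j]
eps = 0.3

bound = worst_case_response(a0, w, eps)
delta_star = worst_case_delta(a0, w, eps)
attained = abs(inner([a + d for a, d in zip(a0, delta_star)], w))

# Random mismatches drawn inside the ball never fall below the bound.
rng = random.Random(0)
sampled_min = float("inf")
for _ in range(2000):
    d = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in w]
    radius = eps * rng.random() / norm(d)
    d = [radius * di for di in d]
    response = abs(inner([a + x for a, x in zip(a0, d)], w))
    sampled_min = min(sampled_min, response)

print(bound, attained, sampled_min)
```

In this sketch, `attained` coincides with `bound`, and every sampled mismatch gives a response at or above it, so requiring the bound to be at least 1 protects the whole set at once.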
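The problem in (1) is a nonconvex quadratic program with infinitely many constraints and is thus computationally intractable. However, it has been shown in [Vorobyov et al., 2003] that (1) can be rewritten in the following equivalent convex form:
$$\min_{w}\; w^{H} R\, w \quad \text{subject to} \quad \varepsilon \|w\| \le a^{H} w - 1, \quad \operatorname{Im}\{a^{H} w\} = 0 \tag{2}$$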
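In (2), the objective is a quadratic form and $a$ is the nominal steering vector. One can apply the Cholesky factorization $R = U^{H} U$ to obtain $w^{H} R\, w = \|U w\|^{2}$. Thus minimizing the output power $w^{H} R\, w$ is equivalent to minimizing $\|U w\|$. One can further introduce an additional variable, $\tau$, as an upper bound on $\|U w\|$ and obtain:
$$\min_{\tau,\, w}\; \tau \quad \text{subject to} \quad \|U w\| \le \tau, \quad \varepsilon \|w\| \le a^{H} w - 1, \quad \operatorname{Im}\{a^{H} w\} = 0 \tag{3}$$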
The problem in (3) has the standard form of an SOCP, and can be solved using a standard convex optimization solver such as SeDuMi [Sturm, “Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones,” http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.49.6954, 1998].
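The Cholesky identity used to reach the SOCP form can be checked numerically. The following is a minimal sketch, not part of the described implementation: it factors a small Hermitian positive definite matrix in plain Python and compares the quadratic form with the squared norm of Uw. The function names and the matrix and vector entries are assumed values for illustration.

```python
import math


def cholesky_upper(R):
    """Return upper-triangular U with R = U^H U for Hermitian positive definite R."""
    n = len(R)
    L = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = R[i][j] - sum(L[i][k] * L[j][k].conjugate() for k in range(j))
            if i == j:
                L[i][i] = complex(math.sqrt(s.real))
            else:
                L[i][j] = s / L[j][j]
    # U is the conjugate transpose of the lower-triangular factor L.
    return [[L[j][i].conjugate() for j in range(n)] for i in range(n)]


def quadratic_form(R, w):
    """Return the real number w^H R w."""
    n = len(w)
    total = sum(w[i].conjugate() * R[i][j] * w[j]
                for i in range(n) for j in range(n))
    return total.real


def squared_norm_of_product(U, w):
    """Return ||U w||^2."""
    Uw = [sum(row[j] * w[j] for j in range(len(w))) for row in U]
    return sum(abs(v) ** 2 for v in Uw)


# Assumed example: Hermitian and strictly diagonally dominant, hence positive definite.
R = [[2 + 0j, 0.5 + 0.5j, 0j],
     [0.5 - 0.5j, 2 + 0j, 0.3j],
     [0j, -0.3j, 1 + 0j]]
w = [0.4 + 0.1j, 0.3 - 0.2j, 0.5 + 0j]

U = cholesky_upper(R)
power = quadratic_form(R, w)
norm_sq = squared_norm_of_product(U, w)
print(power, norm_sq)
```

The two printed values agree to rounding error, which is the property that lets the quadratic objective be replaced by a norm bound.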
In many real applications, the data covariance matrix R and steering vector a are time-varying. In such a case, an SOCP needs to be solved for each new pair of R and a. Solving each SOCP independently is very inefficient and not feasible, especially in embedded applications such as hearing aids where computational resources are strictly limited. The next section presents an efficient real-time implementation of solving (2) for varying R and a using an improved logarithmic barrier method [Boyd and Vandenberghe, Convex Optimization, Cambridge University Press, 7th ed., (2004), Chapter 11].
2.1. Logarithmic Barrier
The logarithmic barrier method is used to solve the problem in (2). The barrier function that corresponds to the second-order cone constraint in (2) is:
$$\phi(w) = -\log\!\left( (a^{H} w - 1)^{2} - \varepsilon^{2} \|w\|^{2} \right) \tag{4}$$
The idea of the logarithmic barrier method is to solve the following minimization problem with equality constraints only:
$$\min_{w}\; t\, w^{H} R\, w + \phi(w) \quad \text{subject to} \quad \operatorname{Im}\{a^{H} w\} = 0 \tag{5}$$
where t is a parameter that sets the accuracy of the approximation of the inequality constraints by the barrier function φ(w). For fixed R and a, the optimal beamformer w can be obtained by choosing t sufficiently large.
For each fixed t, the barrier method uses Newton's method to solve (5). This requires both the gradient and the Hessian of the barrier function φ(w), which can be derived from the following corollary.
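Corollary 1: Assume a logarithmic function of the form
$$\phi(w) = -\log\!\left( (c^{T} w + d)^{2} - \|A w + b\|^{2} \right) \tag{6}$$
Then its gradient, given in [Boyd and Vandenberghe, 2004, Chapter 11], and the derivative of the gradient, the Hessian, can be expressed as:
$$\nabla \phi(w) = -2\, g^{-1}(w)\, f(w) \tag{7}$$
$$\nabla^{2} \phi(w) = -2\, g^{-2}(w) \left[ g(w) \left( c c^{T} - A^{T} A \right) - 2 f(w) f^{T}(w) \right] \tag{8}$$
where
$$f(w) = (c^{T} w + d)\, c - A^{T} (A w + b)$$
$$g(w) = (c^{T} w + d)^{2} - \|A w + b\|^{2}$$
Here $\nabla g(w) = 2 f(w)$, and $c c^{T} - A^{T} A$ is the Jacobian of $f(w)$.
For example, for the second-order cone constraint (SOCC) in (3), $A = \varepsilon I$, $b = 0$, $c = a$, and $d = -1$, with the real and imaginary components separated as in [Vorobyov et al., 2003].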
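The gradient and Hessian of this barrier can be checked against finite differences. The sketch below is illustrative only and works with real-valued vectors, as obtained after separating real and imaginary components. It uses f(w) and g(w) as defined above, the gradient −2f(w)/g(w), and the Hessian −2g⁻²(w)[g(w)(ccᵀ − AᵀA) − 2f(w)fᵀ(w)] that results from differentiating that gradient. The function names and the test point are assumptions chosen so that the point lies strictly inside the cone.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def f_and_g(w, A, b, c, d):
    """Return f(w) and g(w) for the barrier -log((c^T w + d)^2 - ||A w + b||^2)."""
    s = dot(c, w) + d                                  # c^T w + d
    r = [dot(row, w) + bi for row, bi in zip(A, b)]    # A w + b
    At_r = [sum(A[k][i] * r[k] for k in range(len(A))) for i in range(len(w))]
    f = [s * ci - ai for ci, ai in zip(c, At_r)]
    g = s * s - dot(r, r)
    return f, g


def barrier(w, A, b, c, d):
    _, g = f_and_g(w, A, b, c, d)
    return -math.log(g)


def gradient(w, A, b, c, d):
    f, g = f_and_g(w, A, b, c, d)
    return [-2.0 * fi / g for fi in f]


def hessian(w, A, b, c, d):
    f, g = f_and_g(w, A, b, c, d)
    n = len(w)
    AtA = [[sum(A[k][i] * A[k][j] for k in range(len(A))) for j in range(n)]
           for i in range(n)]
    return [[-2.0 / g ** 2 * (g * (c[i] * c[j] - AtA[i][j]) - 2.0 * f[i] * f[j])
             for j in range(n)] for i in range(n)]


def shifted(w, i, h):
    out = list(w)
    out[i] += h
    return out


# Assumed example: A = eps * I, b = 0, d = -1, and a strictly feasible w.
eps = 0.3
A = [[eps if i == j else 0.0 for j in range(3)] for i in range(3)]
b = [0.0, 0.0, 0.0]
c = [1.0, 0.8, 0.9]
d = -1.0
w = [0.6, 0.5, 0.7]
h = 1e-5

grad = gradient(w, A, b, c, d)
hess = hessian(w, A, b, c, d)

# Central differences of the barrier value approximate the gradient.
grad_err = max(
    abs(grad[i] - (barrier(shifted(w, i, h), A, b, c, d)
                   - barrier(shifted(w, i, -h), A, b, c, d)) / (2 * h))
    for i in range(3))

# Central differences of the gradient approximate the Hessian columns.
hess_err = 0.0
for j in range(3):
    up = gradient(shifted(w, j, h), A, b, c, d)
    dn = gradient(shifted(w, j, -h), A, b, c, d)
    for i in range(3):
        hess_err = max(hess_err, abs(hess[i][j] - (up[i] - dn[i]) / (2 * h)))

print(grad_err, hess_err)
```

Both errors come out at the level expected from finite-difference truncation, which supports the minus sign on the AᵀA term.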
2.2. Real-Time Implementation
This section presents an efficient real-time implementation for solving (2) in the scenario when both R and a are time-varying. Initialization consists of
At each iteration, which might be much less often than the sampling period, the following steps, which are an extension of the barrier method of [Boyd and Vandenberghe, 2004, Chapter 11], are taken:
1. Track environment change
2. Update t. If the root mean square change in x on the last iteration was less than a specified threshold, increase t by a fixed percentage (next outer iteration of the barrier method), unless the desired solution precision has already been reached. In practice, given slight restrictions on the desired precision and on the rate of change of R, it turns out that it is never necessary to decrease t to maintain stability.
3. Take the next step toward the optimum solution
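The bookkeeping around these steps can be sketched as follows. This is a minimal illustration under stated assumptions rather than the described implementation: the function names are hypothetical, the covariance is tracked with a single-pole average of outer products, and the parameters `threshold`, `growth`, and `t_max` (a cap standing in for "desired precision reached") are assumed tuning values. The Newton step itself is omitted.

```python
import math


def smooth_covariance(R, x, alpha):
    """Single-pole average: R <- (1 - alpha) * R + alpha * x x^H."""
    n = len(x)
    return [[(1.0 - alpha) * R[i][j] + alpha * x[i] * x[j].conjugate()
             for j in range(n)] for i in range(n)]


def rms_change(x_new, x_old):
    """Root mean square difference between two iterates."""
    total = sum(abs(a - b) ** 2 for a, b in zip(x_new, x_old))
    return math.sqrt(total / len(x_new))


def update_t(t, change, threshold, growth, t_max):
    """Raise t by a fixed percentage once the iterate has settled; never lower it."""
    if change < threshold and t < t_max:
        return min(t * (1.0 + growth), t_max)
    return t


# Assumed values for illustration.
R0 = [[0j, 0j], [0j, 0j]]
R1 = smooth_covariance(R0, [1 + 0j, 1j], alpha=0.1)

t_settled = update_t(10.0, change=1e-4, threshold=1e-3, growth=0.5, t_max=1000.0)
t_moving = update_t(10.0, change=1e-2, threshold=1e-3, growth=0.5, t_max=1000.0)
t_capped = update_t(1000.0, change=1e-4, threshold=1e-3, growth=0.5, t_max=1000.0)
print(t_settled, t_moving, t_capped)
```

Keeping t unchanged while the iterate is still moving lets the solver follow a changing R at the current barrier accuracy instead of restarting the outer loop.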
A few efficiency improvements are obtained in the proposed algorithm when compared to the standard SOCP solver:
Three simulations illustrate the performance of the algorithm. For all simulations, three microphones in a uniform linear array measuring 1.5 cm from end-to-end with its axis in the 0° direction were used. The 2 kHz frequency band was simulated. A 10 dB target signal and a 10 dB interfering signal with 5° elevation and variable azimuth were used along with −40 dB of white noise in each mic. 20 iterations per second were performed and the averaging filter for R had a time constant of 0.5 s.
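For orientation, a nominal steering vector for such an array can be sketched with a free-field, far-field plane-wave model. This model is an assumption made for the example only: the description does not state how steering vectors were generated, and a head-worn device would also be affected by head shadow. The speed of sound (343 m/s), the function name, and the sign convention of the phase are likewise assumed.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def steering_vector(azimuth_deg, elevation_deg, freq_hz=2000.0,
                    num_mics=3, aperture_m=0.015):
    """Far-field steering vector of a uniform linear array whose axis points to 0 degrees."""
    spacing = aperture_m / (num_mics - 1)
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Cosine of the angle between the arrival direction and the array axis.
    cos_psi = math.cos(az) * math.cos(el)
    wavenumber = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    return [cmath.exp(-1j * wavenumber * m * spacing * cos_psi)
            for m in range(num_mics)]


a_endfire = steering_vector(0.0, 5.0)     # along the array axis, 5 degree elevation
a_broadside = steering_vector(90.0, 0.0)  # perpendicular to the array axis
print(a_endfire)
print(a_broadside)
```

At 2 kHz the 1.5 cm aperture is a small fraction of the wavelength, so the phase differences between microphones are small and modest steering errors are plausible.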
Once the interferer moves sufficiently far from the protected region, the null begins tracking the interferer. Note that the successful illustrated null tracking occurs even though the source moves 1 degree per observation. Also, the algorithm only sees the source through the delay imposed by a single-pole time averaging filter that mixes in 10% of the current observation to estimate the true R.
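The 10% mixing figure is consistent with the stated update rate and time constant. As a short worked check, assuming the common discretization α = 1 − exp(−T/τ) of a single-pole filter (the description does not state which discretization was used), 20 iterations per second and a 0.5 s time constant give a coefficient of about 0.095, and the first-order approximation T/τ gives 0.1.

```python
import math

iterations_per_second = 20.0  # from the simulation setup
time_constant_s = 0.5         # from the simulation setup


def single_pole_coefficient(rate_hz, tau_s):
    """Weight given to the newest observation (assumed discretization)."""
    return 1.0 - math.exp(-1.0 / (rate_hz * tau_s))


alpha = single_pole_coefficient(iterations_per_second, time_constant_s)
first_order = 1.0 / (iterations_per_second * time_constant_s)
print(round(alpha, 4), first_order)
```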
For the early iterations, the maximum gain is at 180°, reaching a maximum of 15.5 dB at iteration 40 and surpassing 5.0 dB only between iterations 27 and 78. Per the constraint, the gain at 5° never goes below 0 dB; it reaches a maximum of 1.2 dB at iteration 38.
Taking advantage of the most obvious sparsity of the system, the Hessian can be calculated for three microphones with 230 multiplies, 148 adds, and 2 divisions. Solving the system for three microphones using the truncated CG method takes about 188 multiplies and adds and exactly 5 divisions. These are the most expensive operations and drive the cost of the algorithm. Using historical algorithm overhead estimates, 91% of the processor time would be required to run the given method in 16 bands on a currently shipping digital hearing aid. Given everything else the hearing aid must process, this is not yet feasible, but it should soon become feasible as processor speeds increase.
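A truncated conjugate gradient (CG) solve of the kind referred to above can be sketched as follows. This is a generic textbook CG iteration stopped after a fixed number of steps, offered as an illustration only: the function name, the iteration cap, and the small symmetric positive definite test system are assumptions, and the operation counts of this sketch are not claimed to match the figures given above.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(H, v):
    return [dot(row, v) for row in H]


def truncated_cg(H, rhs, max_iters):
    """Approximately solve H x = rhs for symmetric positive definite H."""
    x = [0.0] * len(rhs)
    r = list(rhs)          # residual rhs - H x for the zero starting point
    p = list(r)            # first search direction
    rr = dot(r, r)
    for _ in range(max_iters):
        if rr == 0.0:
            break
        Hp = matvec(H, p)
        step = rr / dot(p, Hp)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * hi for ri, hi in zip(r, Hp)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def residual_norm_sq(H, x, rhs):
    diff = [bi - hi for bi, hi in zip(rhs, matvec(H, x))]
    return dot(diff, diff)


# Assumed test system standing in for a Newton system (Hessian, negative gradient).
H = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
rhs = [1.0, 2.0, 3.0]

x_one = truncated_cg(H, rhs, max_iters=1)
x_full = truncated_cg(H, rhs, max_iters=3)
print(residual_norm_sq(H, x_one, rhs), residual_norm_sq(H, x_full, rhs))
```

Stopping early trades accuracy of each Newton step for a fixed, small cost per iteration, which suits a problem whose data will have changed again by the next iteration.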
The present subject matter illustrates that the barrier method of solving an SOCP problem is well suited to adaptive acoustic beamforming with robustness to steering vector uncertainty. The method can be implemented with low computational complexity approaching the available processing power in current hearing aids. Furthermore, the barrier method has been adapted to solve a continually changing problem to sufficient precision instead of solving a static problem to great precision as is the common case. Several other techniques to minimize the computational complexity have been applied. Simulations show that the method can adapt quickly even when the interferer moves rapidly. Also, the results are robust to a user-specified level of steering vector mismatch.
The present subject matter is demonstrated for hearing aids. It is understood however, that the disclosure is not limited to hearing aids and that the teachings provided herein can be applied to a variety of audio processing and hearing assistance devices, including but not limited to, behind-the-ear (BTE), in-the-ear (ITE), in-the-canal (ITC), receiver-in-canal (RIC), or completely-in-the-canal (CIC) type hearing aids. It is understood that behind-the-ear type hearing aids may include devices that reside substantially behind the ear or over the ear. Such devices may include hearing aids with receivers associated with the electronics portion of the behind-the-ear device, or hearing aids of the type having receivers in the ear canal of the user, including but not limited to receiver-in-canal (RIC) or receiver-in-the-ear (RITE) designs. The present subject matter can also be used in hearing assistance devices generally, such as cochlear implant type hearing devices and such as deep insertion devices having a transducer, such as a receiver or microphone, whether custom fitted, standard, open fitted or occlusive fitted. It is understood that other hearing assistance devices not expressly stated herein may be used in conjunction with the present subject matter.
This application is intended to cover adaptations or variations of the present subject matter. It is to be understood that the above description is intended to be illustrative, and not restrictive. The scope of the present subject matter should be determined with reference to the appended claims, along with the full scope of legal equivalents to which such claims are entitled.
The present application claims the benefit under 35 U.S.C. 119(e) of U.S. Provisional Patent Application Ser. No. 61/394,872, filed Oct. 20, 2010, and of U.S. Provisional Patent Application Ser. No. 61/412,610, filed Nov. 11, 2010, which are incorporated herein by reference in their entirety.
Entry |
---|
Boyd, S., et al., Convex Optimization, Cambridge University Press, 7th ed., (2004), Chapter 11. |
Cox, H., et al., “Robust adaptive beamforming”, IEEE Transactions on Acoustics, Speech and Signal Processing, 35(10), (Oct. 1987), 1365-1376. |
Greenberg, J. E, et al., “Evaluation of an adaptive beamforming method for hearing aids”, J Acoust Soc Am., 91(3), (Mar. 1992), 1662-76. |
Hoshuyama, O., et al., “A robust adaptive beamformer for microphone arrays with a blocking matrix using constrained adaptive filters”, IEEE Transactions on Signal Processing, 47(10), (Oct. 1999), 2677-2684. |
Shewchuk, Jonathan Richard, “An Introduction to the Conjugate Gradient Method Without the Agonizing Pain”, [Online]. Retrieved from the Internet: <URL: http://math.nyu.edu/faculty/greengar/painless-conjugate-gradient.pdf>, (Aug. 4, 1994), 64 pgs. |
Stoica, P., et al., “Robust Capon beamforming”, IEEE Signal Processing Letters, 10(6), (Jun. 2003), 172-175. |
Vorobyov, S. A, et al., “Robust adaptive beamforming using worst-case performance optimization: a solution to the signal mismatch problem”, IEEE Transactions on Signal Processing, 51(2), (Feb. 2003), 313-324. |
Number | Date | Country | |
---|---|---|---|
61394872 | Oct 2010 | US | |
61412610 | Nov 2010 | US |