1. Technical Field
The present disclosure relates to acoustic echo cancellation, and in particular, to an acoustic echo cancellation method based on a prior-knowledge matrix and a system using the same.
2. Description of Related Art
Thanks to technology, users can communicate with other people at any time and from anywhere. However, it is still inconvenient to communicate in some situations by using handheld communication apparatuses.
For example, using a handheld communication apparatus while driving is prohibited, and when a presenter communicates with an audience, a handheld communication apparatus is a hindrance to the presenter. Thus, hands-free communication systems are widely used in daily life. Two-party or multi-party communication, such as multimedia conferences, remote teaching, satellite communication, video phones, and network phones, all require a hands-free communication system. A hands-free communication system means that the speaker and the microphone are not disposed in the same apparatus and are located at different fixed positions in the same room.
However, there exists a problem with the hands-free communication system. In two-party or multi-party communication, the user in the far-end room may hear his/her own voice from a previous time point, which constitutes an acoustic echo. The voice of the user in the far-end room is outputted from the speaker in the near-end room and is reflected by the walls or other objects in the near-end room. Therefore, a part of the reflected voice is received by the microphone simultaneously with the voice of the user in the near-end room. In other words, the acoustic echo exhibits multipath interference.
The acoustic echo in the voice transmission degrades the quality of the communication; furthermore, it may prevent the user in the far-end room from distinguishing the message spoken by the user in the near-end room. Thus, there remain many issues to be improved regarding the quality of the communication.
An exemplary embodiment of the present disclosure provides an acoustic echo cancellation method. The method comprises the following steps. Firstly, a prior-knowledge matrix comprising a plurality of space vectors is built. Then, an initial filter vector is generated from the prior-knowledge matrix and an initial weighting vector. The weighting vector is updated based on the difference between an echo signal and an estimated signal in an iteration algorithm. The coefficients of the filter vector are updated according to the updated weighting vector. Then, the estimated signal is obtained from the updated filter vector and an original signal. Finally, the next echo signal is cancelled by the estimated signal.
An exemplary embodiment of the present disclosure provides an acoustic echo cancellation system. The system comprises a prior-knowledge generator and an adaptive filter. The adaptive filter is coupled to the prior-knowledge generator. The prior-knowledge generator is configured for building a prior-knowledge matrix comprising a plurality of space vectors. The adaptive filter is configured for generating a filter vector from the prior-knowledge matrix and a weighting vector. The weighting vector is calculated based on the difference between the echo signal and the estimated signal in an iteration algorithm. The echo signal is generated by the convolution of the original signal with a room impulse response of a near-end room. The adaptive filter generates a near-end estimated signal, which is used to cancel the next echo signal.
To sum up, the acoustic echo cancellation method and system utilize prepared space vectors as the basis for updating the coefficients of the adaptive filter, thereby increasing the robustness and the efficiency of the estimation process. In other words, the convergence rate of the algorithm is raised by incorporating previous room impulse responses. Thus, the present disclosure achieves a better convergence rate and a lower error rate compared to traditional acoustic echo cancellation methods, accordingly improving the quality of the communication.
In order to further understand the techniques, means and effects of the present disclosure, the following detailed descriptions and appended drawings are hereby referred to, such that, and through which, the purposes, features and aspects of the present disclosure can be thoroughly and concretely appreciated; however, the appended drawings are merely provided for reference and illustration, without any intention that they be used for limiting the present disclosure.
The accompanying drawings are included to provide a further understanding of the present disclosure, and are incorporated in and constitute a part of this specification. The drawings illustrate exemplary embodiments of the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
Reference will now be made in detail to the exemplary embodiments of the present disclosure, examples of which are illustrated in the accompanying drawings.
Please refer to
Please refer to
Original signal x(n) is transmitted from the far-end room (e.g., the voice signal of the user 120 is transmitted by the microphone 121 in
The prior-knowledge generator 22 is configured for building a prior-knowledge matrix H. Specifically, prior-knowledge means utilizing previously collected data to calculate the impulse response accurately. In the embodiment of the present disclosure, the prior-knowledge matrix H is formed by a plurality of space vectors, and each vector represents a particular channel characteristic. In other words, the vectors represent rooms of different sizes or forms, such as 1 m*m, 2 m*m, or 3 m*m; however, the present disclosure is not limited thereto.
The proposed algorithm consists of two stages. The first stage is building a set of prior-knowledge. In other words, numerous training data, such as man-made data, natural data, or simulation data, are collected to prepare the prior-knowledge. The second stage is on-line operating or calculating based on the prepared prior-knowledge. Thus, the more carefully the first stage is prepared, the better the performance the second stage can provide.
The adaptive filter 23 generates an initial filter vector ĥ(n) from the prior-knowledge matrix H and an initial weighting vector, and accordingly generates the initial estimated signal ŷ(n). The difference between the estimated signal ŷ(n) and the echo signal y(n) is used to update the weighting vector in an iterative manner. The updated weighting vector then updates the filter vector ĥ(n) and accordingly the estimated signal ŷ(n), so as to minimize the difference between ŷ(n) and y(n). Then, the adaptive filter 23 generates a filter vector ĥ(n) based on the updated weighting vector. The computed ĥ(n) is used to compute ŷ(n) and cancel the next echo signal y(n). The error signal e(n) may approach zero after the estimated signal ŷ(n) is calculated through many iterations.
It is worth noting that, in the embodiment of the present disclosure, the weighting vector is formed by weighting coefficients and a bias vector. The iteration algorithm can be a single-dimensional input algorithm, such as Least Mean Square (LMS) and Normalized LMS (NLMS), or a multi-dimensional input algorithm, such as the Affine Projection Algorithm (APA), Proportionate Affine Projection Algorithm (PAPA), Improved Proportionate Affine Projection Algorithm (IPAPA), and Levenberg-Marquardt Regularized APA (LMR-APA), etc. However, the present disclosure is not limited thereto. The above algorithms are further illustrated as follows.
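By way of a non-limiting illustration, the following sketch (in Python with NumPy) shows one iteration of the standard NLMS update, i.e., one of the single-dimensional input algorithms listed above. The variable names and parameter values are illustrative assumptions and are not taken from the present disclosure.

```python
import numpy as np

def nlms_update(h_hat, x_buf, y_n, mu=0.5, delta=1e-6):
    """One standard NLMS iteration (illustrative, not the disclosed VAPA/VPAPA).

    h_hat : current filter estimate of length L
    x_buf : the L most recent far-end samples x(n), newest first
    y_n   : the microphone (echo) sample y(n) at time n
    """
    y_est = np.dot(h_hat, x_buf)            # estimated echo sample
    e_n = y_n - y_est                       # error signal e(n)
    norm = np.dot(x_buf, x_buf) + delta     # input energy plus a small constant
    return h_hat + mu * e_n * x_buf / norm, e_n
```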
First Stage (Off-Line)
In the first stage, a set of prior-knowledge, namely the prior-knowledge matrix H of equation (1), is built. The prior-knowledge matrix H comprises K vectors h^1 = [h_0^1 h_1^1 . . . h_L−1^1]^T, h^2 = [h_0^2 h_1^2 . . . h_L−1^2]^T, . . . , h^K = [h_0^K h_1^K . . . h_L−1^K]^T, each generated by the adaptive filter 23 adapted to a reference space. Those K vectors are collected to generate the vector matrix H = [h^1 . . . h^K], and they serve as the space vectors of the prior-knowledge vector space.
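As a minimal sketch of this off-line stage, and assuming the K reference impulse responses have already been measured or simulated, the prior-knowledge matrix H can be assembled by stacking them column by column; the function name below is hypothetical.

```python
import numpy as np

def build_prior_knowledge_matrix(impulse_responses):
    """Stack K reference room impulse responses (each of length L) into H.

    impulse_responses : iterable of K length-L arrays, one per reference space,
                        collected off-line (man-made, natural, or simulated data).
    Returns H with shape (L, K); each column is one space vector h^k.
    """
    return np.column_stack([np.asarray(h, dtype=float) for h in impulse_responses])

# Example usage (illustrative):
# H = build_prior_knowledge_matrix([h_room1, h_room2, h_room3])
```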
Second Stage (On-Line)
A new matrix S = [H | I_L×L] is generated by combining an L×L identity matrix I_L×L with the prior-knowledge matrix H, as shown in equation (2).
The filter vector ĥ(n) is calculated from the weighting vector ŵ(n) and the matrix S. More specifically, the weighting vector ŵ(n) = [a(n)^T b(n)^T]^T is an (L+K)-dimensional vector, wherein a(n) = [a_1(n) . . . a_K(n)]^T is a K-dimensional weighting-coefficient vector and b(n) = [b_1(n) . . . b_L(n)]^T is an L-dimensional bias vector. Thus, the filter vector can be expressed as ĥ(n) = Sŵ(n) = Ha(n) + b(n), as shown in equation (3).
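A minimal sketch of equations (2) and (3) under the notation above is shown below; forming S explicitly is optional, since Sŵ(n) can also be computed directly as Ha(n) + b(n). The function names are illustrative.

```python
import numpy as np

def make_S(H):
    """Form S = [H | I_LxL] by appending an L x L identity matrix to H."""
    L = H.shape[0]
    return np.hstack([H, np.eye(L)])

def compose_filter_vector(H, w):
    """Compute the filter vector h_hat(n) = S w(n) = H a(n) + b(n).

    H : (L, K) prior-knowledge matrix
    w : weighting vector of length L + K, w = [a; b], a of length K, b of length L
    """
    L, K = H.shape
    a, b = w[:K], w[K:]
    return H @ a + b        # identical to make_S(H) @ w, without building S
```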
The affine projection algorithm and the proportionate affine projection algorithm are illustrated as follows.
Vector-Space Affine Projection Algorithm (VAPA)
To calculate ŵ(n), equations (4) and (5) are solved by the Lagrange multiplier method to formulate equation (6) as follows.
L(ŵ(n), Λ) = ∥Sŵ(n) − Sŵ(n−1)∥^2 + [y(n) − X^T(n)Sŵ(n)]^T Λ  (6)
Wherein Λ = [Λ_0 Λ_1 . . . Λ_P−1] is a vector of Lagrange multipliers, and equation (7) is derived from equation (6) as follows.
2S^T(Sŵ(n−1) − Sŵ(n)) + S^T X(n)Λ = 0  (7)
Then, assuming J = S^T S S^T and simplifying equation (7), equation (5) is merged to obtain equation (8).
Assuming U = SJ^−1, we can obtain equation (9) as follows.
Λ = 2(X^T(n)UX(n))^−1 e(n)  (9)
Therefore, by substituting equation (9) back into equation (7), ŵ(n) can be updated. In other words, ŵ(n) is updated by equation (10) as follows.
ŵ(n) = ŵ(n−1) + μ′J^−1 X(n)(X^T(n)UX(n) + δ′I_P×P)^−1 e(n)  (10)
Wherein μ′ is a step size, and δ′ is a small regularization constant configured to prevent the matrix being inverted from becoming singular.
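A hedged sketch of one iteration in the form of equation (10) is given below. Because the definitions of J and U printed above are ambiguous, the sketch assumes a regularized J = S^T S and U = SJ^−1 S^T, and inserts S^T before X(n) so that all matrix dimensions are consistent; these choices are assumptions for illustration and not a verbatim restatement of equations (8) to (10).

```python
import numpy as np

def vapa_update(w, S, X, y, mu=0.5, delta=1e-6):
    """One vector-space APA-style iteration in the form of equation (10).

    w : previous weighting vector w(n-1), length L + K
    S : (L, L + K) matrix [H | I]
    X : (L, P) matrix whose columns are the P most recent far-end input vectors
    y : length-P vector of the most recent echo (microphone) samples
    Assumed definitions (not verbatim from the disclosure):
        J = S^T S + delta * I    and    U = S J^{-1} S^T
    """
    L, LK = S.shape
    P = X.shape[1]
    e = y - X.T @ (S @ w)                          # error vector e(n)
    J = S.T @ S + delta * np.eye(LK)               # assumed, regularized J
    J_inv = np.linalg.inv(J)
    U = S @ J_inv @ S.T                            # assumed U
    gain = J_inv @ S.T @ X @ np.linalg.inv(X.T @ U @ X + delta * np.eye(P))
    return w + mu * gain @ e, e
```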
Vector-Space Proportionate Affine Projection Algorithm (VPAPA)
The derivation of the vector-space proportionate affine projection algorithm is similar to that of the vector-space affine projection algorithm, except that equation (6) is changed into equation (11) as follows.
L(ŵ(n), Λ′) = ∥Sŵ(n) − Sŵ(n−1)∥^2 + [y(n) − X^T(n)G(n)Sŵ(n)]^T Λ′  (11)
Wherein Λ′ is a new vector of Lagrange multipliers, and G(n) is an L×L diagonal matrix whose diagonal elements are g_l(n), l = 0, . . . , L−1. Equation (12) is derived from equation (11) as follows.
2S^T(Sŵ(n−1) − Sŵ(n)) + S^T G(n)X(n)Λ′ = 0  (12)
Then, assuming J = S^T S S^T and simplifying equation (12), equation (5) is merged to obtain equation (13).
Assuming U = SJ^−1, we can obtain equation (14) as follows.
Λ′ = 2(X^T(n)UG(n)X(n))^−1 e(n)  (14)
After the new Lagrange multiplier vector is solved, ŵ(n) can be updated by substituting equation (14) back into equation (12). In other words, ŵ(n) is updated by the iteration of the vector-space proportionate affine projection algorithm in equation (15) as follows.
ŵ(n) = ŵ(n−1) + μ′J^−1 G(n)X(n)(X^T(n)UX(n) + δ′I_P×P)^−1 e(n)  (15)
Wherein μ′ is a step size, and δ′ is a small regularization constant configured to prevent the matrix being inverted from becoming singular.
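For the proportionate variant, the sketch below additionally builds the diagonal matrix G(n). The proportionality rule used here (diagonal gains roughly proportional to the magnitudes of the current filter coefficients) is a common PAPA-style choice assumed for illustration, not one specified by the present disclosure, and the same assumptions on J and U as in the previous sketch apply.

```python
import numpy as np

def proportionate_matrix(h_hat, rho=0.01, eps=1e-6):
    """Build the L x L diagonal matrix G(n) from the current filter estimate.

    Assumed PAPA-style rule: larger coefficients receive proportionally larger gains.
    """
    g = np.maximum(rho * np.max(np.abs(h_hat)) + eps, np.abs(h_hat))
    g = g / np.sum(g)                              # normalize the gains
    return np.diag(len(h_hat) * g)

def vpapa_update(w, S, X, y, mu=0.5, delta=1e-6):
    """One vector-space PAPA-style iteration in the form of equation (15)."""
    L, LK = S.shape
    P = X.shape[1]
    h_hat = S @ w                                  # current filter vector h_hat(n)
    G = proportionate_matrix(h_hat)                # L x L diagonal gain matrix G(n)
    e = y - X.T @ h_hat                            # error vector e(n)
    J = S.T @ S + delta * np.eye(LK)               # assumed, regularized J
    J_inv = np.linalg.inv(J)
    U = S @ J_inv @ S.T                            # assumed U
    gain = J_inv @ S.T @ G @ X @ np.linalg.inv(X.T @ U @ X + delta * np.eye(P))
    return w + mu * gain @ e, e
```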
Please refer to
Next, in the step S103, the adaptive filter 23 generates an initial filter vector ĥ(n) from the prior-knowledge matrix H and the initial weighting vector. More specifically, the room impulse response h(n) of the near-end room 21 can be simulated by the weighting vector and the prior-knowledge matrix H. In the embodiment of the present disclosure, the adaptive filter 23 combines the vectors in H using the weighting coefficients and the bias coefficients in the weighting vector.
In the step S105, the adaptive filter 23 calculates the estimated signal ŷ(n) to match the echo signal y(n) in an iteration algorithm through the adder 24, where y(n) is generated from an original signal x(n) via a room impulse response h(n) of the near-end room 21. The adaptive filter 23 adjusts the weighting vector, and thus the filter vector ĥ(n), according to the difference between y(n) and ŷ(n). It is worth noting that the update of the estimated signal ŷ(n) is not limited to vector (multi-dimensional) inputs; single values, such as those used in LMS and NLMS, can also be adopted. The embodiment of the present disclosure uses the multi-dimensional input to optimize the convergence rate in both the active region and the non-active region, but is not limited thereto.
In the step S107, the adaptive filter 23 generates a near-end estimated signal ŷ(n) based on the prior-knowledge matrix H, the updated weighting vector, and the original signal. The weighting vector is first updated, the coefficients of the filter vector ĥ(n) are then updated according to the updated weighting vector, and finally a new near-end estimated signal ŷ(n) is generated. The next echo signal y(n) is cancelled by the new near-end estimated signal ŷ(n), and another near-end estimated signal ŷ(n) is then generated; this constitutes the iteration algorithm. More specifically, after the coefficients of the weighting vector are updated in the step S105, the adaptive filter calculates a new near-end estimated signal ŷ(n) and computes an error signal e(n), which is used to update the coefficients of the weighting vector. After the coefficients of the weighting vector are updated, the estimated signal ŷ(n) becomes closer to the echo signal y(n), and thus the error signal e(n) approaches zero. In other words, the estimated signal ŷ(n) is generated to cancel the echo signal y(n) generated from the original signal x(n) via the room impulse response h(n) of the near-end room 21.
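Putting steps S103 to S107 together, a hedged end-to-end sketch of the on-line processing loop is given below. It reuses the vapa_update sketch shown earlier; the projection order P, the step size, and the zero initialization of the weighting vector are illustrative assumptions.

```python
import numpy as np

def echo_cancellation_loop(x, y, H, P=4, mu=0.5, delta=1e-6):
    """Cancel the echo in the microphone signal y caused by the far-end signal x.

    x : far-end (original) signal samples, assumed to have the same length as y
    y : near-end microphone samples containing the echo
    H : (L, K) prior-knowledge matrix built in the off-line stage
    Returns the error signal e(n), i.e. the echo-cancelled output.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    L, K = H.shape
    S = np.hstack([H, np.eye(L)])          # S = [H | I]
    w = np.zeros(L + K)                    # initial weighting vector (assumed zero)
    e_out = np.zeros(len(y))
    for n in range(L + P, len(y)):
        # The P most recent input vectors (newest first) and echo samples
        X = np.column_stack([x[n - p - L + 1:n - p + 1][::-1] for p in range(P)])
        y_blk = np.array([y[n - p] for p in range(P)])
        w, e_blk = vapa_update(w, S, X, y_blk, mu, delta)   # one adaptive update
        e_out[n] = e_blk[0]                # echo-cancelled sample at time n
    return e_out
```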
To sum up, the acoustic echo cancellation method and system utilize a set of space vectors to update the coefficients of the adaptive filter, so as to increase the robustness and the efficiency of the estimation process. In other words, the convergence rate of the algorithm is raised by incorporating previous room impulse responses. Thus, the present disclosure achieves a better convergence rate and a lower error rate compared to traditional acoustic echo cancellation methods, and raises the quality of the communication. In addition, environments that use a microphone and a speaker (e.g., SKYPE, in-car hands-free systems, etc.) provide better voice quality for the listener, because the high convergence rate suppresses the acoustic echo quickly.
The above-mentioned descriptions represent merely the exemplary embodiments of the present disclosure, without any intention to limit the scope of the present disclosure thereto. Various equivalent changes, alterations or modifications based on the claims of the present disclosure are all consequently viewed as being embraced by the scope of the present disclosure.