The present invention relates to a technique for detecting an echo of a received signal mixed into a send signal and for canceling the detected echo from the send signal.
An echo canceler is generally configured as a combination of a linear echo processing unit for canceling a linear echo component and a residual echo suppression processing unit for suppressing the linear echo component that cannot be canceled out by the linear echo processing and a nonlinear echo component. Such a configuration is disclosed in Non-Patent Document 1, for example. The echo canceler disclosed in the Non-Patent Document 1, however, has a problem in that when an echo and send voice overlap each other, that is, when double talk occurs, the residual echo suppression processing unit suppresses both the residual echo and send voice.
To solve the foregoing problem, Patent Document 1, for example, discloses a method of flexibly controlling suppression coefficients used for the amplitude suppression of a voice signal in the residual echo suppression processing unit. The Patent Document 1 discloses an echo suppression method for obtaining an echo suppression quantity using the power of an estimated echo signal obtained by multiplying the received signal power by the amount of acoustic coupling of an echo path, which is estimated from the power of a microphone input signal on which the send voice and echo signal are superposed.
However, the foregoing conventional method, which determines the echo suppression quantity from the power of the estimated echo signal calculated from the amount of acoustic coupling, has two problems. First, when background noise is large, the echo is buried in the background noise, so that the amount of acoustic coupling cannot be observed correctly, which reduces the estimation accuracy of the echo signal power. Second, when the amount of acoustic coupling itself varies because of a change of the echo path, the amount of acoustic coupling must be observed again, and the echo cannot be suppressed appropriately until an accurate observed value, and hence an appropriate echo suppression quantity, is obtained. As a result, under the foregoing conditions, the echo suppression quantity is not calculated correctly, which causes a residual echo or hinders telephone conversation because of excessive suppression of the send voice.
Thus, the foregoing conventional technique has the problems of bringing about the residual echo because of being unable to compute the appropriate echo suppression quantity or hindering the telephone conversation because of the excessive suppression of the send voice when carrying out the telephone conversation under the large noise or in an environment where the echo path varies.
The present invention is implemented to solve the foregoing problems. Therefore it is an object of the present invention to provide an echo canceler capable of calculating an appropriate echo suppression quantity even under the large noise or in the environment where the echo path varies.
An echo canceler in accordance with the present invention comprises: a signal-to-echo ratio calculating unit for computing a signal-to-echo ratio indicating a ratio of the echo component to the received signal from a first residual signal and a second residual signal, the first residual signal being obtained by using a filter coefficient sequence of an update filter which is obtained up to the previous operation of the echo canceler, and the second residual signal being obtained by using an updated filter coefficient sequence that undergoes a coefficient update which is performed, using an arbitrary update step size, on the filter coefficient sequence of the update filter which is obtained up to the previous operation of the echo canceler; and an echo suppressing unit for suppressing the echo component contained in the microphone input signal in accordance with the signal-to-echo ratio the signal-to-echo ratio calculating unit computes.
The present invention can provide a suitable telephone conversation environment by suppressing the residual echo without hindering the telephone conversation through excessive suppression of the send voice, even when the telephone conversation is carried out under conditions where large noise occurs or the echo path changes.
The best mode for carrying out the invention will now be described with reference to the accompanying drawings to explain the present invention in more detail.
In addition, as shown in
y′(n)=y(n)+s(n) (1)
Receiving the received signal x(n) and the microphone input signal y′(n), the echo canceler 100 cancels the echo signal y(n) contained in the microphone input signal y′(n), and outputs an output signal s′(n) which is an estimated signal of the near-end signal s(n).
Next the operation of the echo canceler 100 of the embodiment 1 will be described.
Assume that the subtraction filter 101 has a filter coefficient sequence ĥ(n)=[ĥ0(n), ĥ1(n), . . . , ĥN-1(n)]T, where N is a filter length. It generates a first estimated echo signal ŷ1(n) by filter processing using the received signal x(n), and obtains a first residual signal d1(n) by subtracting the signal ŷ1(n) from the microphone signal y′(n). Assuming that a residual echo signal, which is obtained by subtracting the estimated echo signal ŷ1(n) from the echo signal y(n), is e(n) (that is, e(n)=y(n)−ŷ1(n)), the first residual signal d1(n) is given by the following Expression (2).
d1(n)=y′(n)−ŷ1(n)=y(n)+s(n)−ŷ1(n)=e(n)+s(n) (2)
It is assumed here that at the initial point where n=0, the filter coefficient sequence ĥ(0) is set at a particular initial value. Incidentally, as for the filter coefficient sequence ĥ(n), it is stored in a memory unit (not shown) and is acquired from the memory unit.
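By way of illustration only, the processing of the subtraction filter 101 for one sample can be sketched as follows. The function name first_residual and its argument names are assumptions introduced for this sketch and do not appear in the original description.

```python
def first_residual(h_hat, x_hist, y_mic):
    """Return (y1_hat, d1) for one sample time n.

    h_hat  : filter coefficient sequence [h0(n), ..., hN-1(n)]
    x_hist : received samples [x(n), x(n-1), ..., x(n-N+1)]
    y_mic  : microphone input sample y'(n) = y(n) + s(n)
    """
    if len(h_hat) != len(x_hist):
        raise ValueError("h_hat and x_hist must both have filter length N")
    # First estimated echo signal: inner product of coefficients and received samples.
    y1_hat = sum(h * x for h, x in zip(h_hat, x_hist))
    # First residual signal, Expression (2): d1(n) = y'(n) - y1_hat(n).
    d1 = y_mic - y1_hat
    return y1_hat, d1
```

For example, first_residual([0.5, 0.25], [2.0, 4.0], 3.0) gives an estimated echo of 2.0 and a first residual of 1.0.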
In addition, according to a prescribed adaptive algorithm, the update filter 102 executes coefficient update processing using the received signal x(n), the first residual signal d1(n) and an update step size μ(n) determined arbitrarily, and obtains the updated filter coefficient sequence ĥ(n+1) by the following Expression (3)
ĥ(n+1)=ĥ(n)+μ(n)ε(n) (3)
where ε(n) is a coefficient adjusting value obtained by the coefficient update processing, which is determined by the adaptive algorithm. Incidentally, the update step size μ(n) can be a fixed value or a value varied each time by a prescribed means.
In addition, the update filter 102 generates a second estimated echo signal ŷ2(n) by the filter processing of the received signal x(n) using the updated filter coefficient sequence ĥ(n+1) obtained by the coefficient update processing of the foregoing Expression (3). In addition, it obtains a second residual signal d2(n) by subtracting the second estimated echo signal ŷ2(n) from the microphone input signal y′(n). The signal-to-echo ratio calculating unit 103 outputs a signal-to-echo ratio SE(n) from the received signal x(n), the first residual signal d1(n), the second residual signal d2(n) and the update step size μ(n) used in the update filter 102. Incidentally, a calculating method of the signal-to-echo ratio SE(n) will be described in detail later.
The residual echo suppression quantity calculating unit 104 computes a residual echo suppression quantity γ(n) by using the signal-to-echo ratio SE(n) generated by the signal-to-echo ratio calculating unit 103. For example, it can compute it by the following Expression (4).
As for another method for calculating the residual echo suppression quantity γ(n) using the signal-to-echo ratio SE(n), the following reference describes it, for example.
As shown in the following Expression (5), the residual echo suppressing unit 105 carries out suppression processing by multiplying the first residual signal by the residual echo suppression quantity γ(n) generated by the residual echo suppression quantity calculating unit 104, thereby obtaining the estimated signal s′(n) of the near-end signal s(n).
s′(n)=γ(n)*d1(n) (5)
The above is the description of the operation of the echo canceler 100 of the embodiment 1.
Next, a computing method of the signal-to-echo ratio SE(n) by the signal-to-echo ratio calculating unit 103 will be described in detail. Incidentally, assuming that the variance of the residual signal d(n) and the variance of the residual echo signal e(n) are denoted by σd² and σe², respectively, the signal-to-echo ratio SE(n) is defined by the following Expression (6).
Here, since the residual echo signal e(n) cannot be directly observed in general, the signal-to-echo ratio SE(n) cannot be calculated by the foregoing Expression (6). However, using the echo canceler 100 of the embodiment 1 makes it possible to obtain the signal-to-echo ratio SE(n) accurately using the observable residual signal and the update step size μ used by the update filter 102. To clarify the basis of the computing method of the signal-to-echo ratio SE(n), a deriving process of the computing method will be described in detail.
Assume that the transfer function of the echo path 900 is given by h=[h0, h1, . . . , hN-1]T, and consider a case of estimating the transfer function by the update filter 102. Here, an identification error δ(n) between the transfer function of the echo path 900 and the estimated transfer function by the update filter 102 is defined by the following Expression (7).
δ(n)=ĥ(n)−h (7)
Supposing that the coefficient update processing is advanced one step according to the foregoing Expression (3), then the identification error δ(n+1) is given by the following Expression (8)
When expressing the magnitude of the identification error by the sum of squared errors of the individual taps of the filter coefficients, the difference between the identification errors in one step of the coefficient update processing is given by the following Expression (9).
The following description will be made by way of example that uses the NLMS (Normalized Least Mean Square) algorithm. However, the present invention is not limited to the NLMS but can use other adaptive algorithms, and it is assumed that cases using other adaptive algorithms are contained in the present invention.
In the NLMS algorithm, the coefficient update processing corresponding to the foregoing Expression (3) is given by the following Expression (10).
Incidentally, in the foregoing Expression (10), x(n)=[x(n), x(n−1), . . . , x(n−N+1)]T is the received signal sequence, σx² is the variance of the received signal and d(n) is the residual signal. In actual calculation, however, the approximation Nσx²≈xT(n)x(n), which is the signal power of the received signal sequence, is often used.
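As a minimal sketch, and not necessarily the exact form of Expression (10), one coefficient update of the standard NLMS algorithm with the normalization Nσx²≈xT(n)x(n), followed by the calculation of the second residual signal with the updated coefficients, might look as follows. The function name nlms_step and the small constant eps that guards against division by zero are assumptions introduced for this sketch.

```python
def nlms_step(h_hat, x_hist, y_mic, mu, eps=1e-12):
    """One standard NLMS update; returns (h_new, d1, d2).

    h_hat  : filter coefficients before the update
    x_hist : received samples [x(n), x(n-1), ..., x(n-N+1)]
    y_mic  : microphone input sample y'(n)
    mu     : update step size
    eps    : assumed regularizer (not part of the original description)
    """
    def dot(a, b):
        return sum(p * q for p, q in zip(a, b))

    # Power of the received signal sequence, used in place of N * sigma_x^2.
    power = dot(x_hist, x_hist) + eps
    # First residual signal with the coefficients before the update.
    d1 = y_mic - dot(h_hat, x_hist)
    # Coefficient update in the form of Expression (3), with the NLMS adjusting value.
    h_new = [h + mu * d1 * x / power for h, x in zip(h_hat, x_hist)]
    # Second residual signal with the updated coefficients.
    d2 = y_mic - dot(h_new, x_hist)
    return h_new, d1, d2
```

Note that when the normalization uses the instantaneous power xT(n)x(n) and the second residual is computed on the same sample, d2 equals (1−μ)·d1 up to the effect of eps, so the variances used for the signal-to-echo ratio would in practice be taken over many samples or over data supplied after the update.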
In addition, when using the NLMS algorithm, the foregoing Expression (9) is rewritten into the following Expression (11).
In the foregoing Expression (11), the fact is applied that the residual echo signal e(n) is given from the identification error δ(n) and the received signal x(n) by the following Expression (12).
e(n)=−δT(n)x(n) (12)
Furthermore, considering the fact that the received signal x(n) and the near-end signal s(n) are generally uncorrelated and independent, the expected value of Expression (11) can be approximated as the following Expression (13).
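As a reconstruction sketch only (Expressions (9) to (13) are not reproduced in this text, so their exact original forms are not confirmed), the approximation can be traced as follows for the standard NLMS update with step size μ. Here Δ(n+1) denotes the one-step change in the sum of squared identification errors, the bold x(n) is the received signal sequence, and the independence approximations are assumptions of this sketch.

```latex
\begin{align*}
\delta(n+1) &= \delta(n) + \mu\,\frac{d(n)\,\mathbf{x}(n)}{N\sigma_x^2} \\
\Delta(n+1) &= \|\delta(n+1)\|^2 - \|\delta(n)\|^2
  = \frac{2\mu\,d(n)\,\delta^T(n)\mathbf{x}(n)}{N\sigma_x^2}
  + \frac{\mu^2 d^2(n)\,\mathbf{x}^T(n)\mathbf{x}(n)}{(N\sigma_x^2)^2} \\
 &\approx \frac{\mu^2 d^2(n) - 2\mu\,d(n)\,e(n)}{N\sigma_x^2}
  \qquad \bigl(e(n) = -\delta^T(n)\mathbf{x}(n),\ \mathbf{x}^T(n)\mathbf{x}(n)\approx N\sigma_x^2\bigr) \\
E[\Delta(n+1)] &\approx \frac{\mu^2\sigma_d^2 - 2\mu\,\sigma_e^2}{N\sigma_x^2}
  \qquad \bigl(E[d\,e] = E[(e+s)\,e] = \sigma_e^2\bigr)
\end{align*}
```

The last step uses d(n)=e(n)+s(n) from Expression (2) together with the assumption that s(n) is uncorrelated with x(n), and therefore with e(n).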
Assume that one-time coefficient update processing is executed using an arbitrary update step size μ. The difference between the variances of the residual signal due to the filter coefficients before and after the coefficient update processing can be approximated by the product of the difference Δ(n+1) of the sum of squared identification errors and the power Nσx² of the received signal sequence x(n), and is given by the following Expression (14).
where σd′²(μ) is the variance of the residual signal due to the filter coefficients after the coefficient update processing using the update step size μ. Assume that the coefficient update processing is executed separately using different update step sizes μ1 and μ2, and that the variances of the residual signals are σd′²(μ1) and σd′²(μ2), respectively, then the difference between the two is given by the following Expression (15).
σd′²(μ1)−σd′²(μ2)=(μ1²−μ2²)σd²(μ)−2(μ1−μ2)σe² (15)
Transforming Expression (15) further gives the following Expression (16).
In Expression (16), the left side is the reciprocal of the signal-to-echo ratio. Furthermore, Expression (16) can be further simplified by placing μ1 or μ2 at zero, which is given by the following Expression (17).
Expression (17) shows that the reciprocal of the signal-to-echo ratio can be calculated by observing the variance σd²(μ) of the residual signal due to the filter coefficient sequence at the previous step of executing the coefficient update processing and the variance σd′²(μ) of the residual signal due to the filter coefficients obtained by the coefficient update processing based on the arbitrary update step size μ.
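For reference, the step from Expression (15) to a single-step-size form can be worked out as follows. This is a rearrangement of Expression (15) under the assumption that an update with step size zero leaves the residual variance unchanged; Expressions (16) and (17) are not reproduced in this text, so the result may differ in form from the original. Here σd² denotes the variance of the residual signal before the update, written σd²(μ) in Expression (15).

```latex
\begin{align*}
\sigma_{d'}^2(\mu) - \sigma_d^2 &= \mu^2\sigma_d^2 - 2\mu\,\sigma_e^2
  && (\mu_1=\mu,\ \mu_2=0,\ \sigma_{d'}^2(0)=\sigma_d^2) \\
\frac{\sigma_e^2}{\sigma_d^2} &= \frac{1}{2\mu}\left(1 + \mu^2 - \frac{\sigma_{d'}^2(\mu)}{\sigma_d^2}\right) \\
\mu = 1:\quad \frac{\sigma_e^2}{\sigma_d^2} &= 1 - \frac{\sigma_{d'}^2(1)}{2\,\sigma_d^2}
\end{align*}
```

The left side is the reciprocal of the signal-to-echo ratio, and the right side contains only the two residual variances and the update step size.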
Accordingly, the signal-to-echo ratio calculating unit 103 of the present invention calculates the signal-to-echo ratio given by the following Expression (18) from the first residual signal d1(n), second residual signal d2(n) and the update step size.
where σd
where α is a forgetting factor which satisfies 0<α<1.
Incidentally, assuming that the update step size μ(n) is one for all the time n, Expression (18) reduces to the following Expression (20).
These are the details of the computing method of the signal-to-echo ratio SE(n).
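The computation can be sketched as follows. Because Expressions (18) and (19) are not reproduced in this text, this sketch rests on two assumptions: the reciprocal of the signal-to-echo ratio is taken from Expression (15) with one step size set to zero, and the variances are tracked by first-order recursive averaging with the forgetting factor α. The names se_from_variances and SignalToEchoEstimator, the parameter floor, and the clamping of the reciprocal to at most one are likewise introduced only for this sketch.

```python
def se_from_variances(var_d1, var_d2, mu, floor=1e-12):
    """Signal-to-echo ratio from the residual variances before (var_d1)
    and after (var_d2) one coefficient update with step size mu."""
    if mu == 0:
        raise ValueError("mu must be nonzero")
    # Reciprocal of SE, rearranged from Expression (15) with mu2 = 0.
    inv_se = (1.0 + mu * mu - var_d2 / max(var_d1, floor)) / (2.0 * mu)
    # Assumed safeguard: echo power cannot exceed residual power,
    # and the reciprocal must stay positive before inversion.
    inv_se = min(max(inv_se, floor), 1.0)
    return 1.0 / inv_se


class SignalToEchoEstimator:
    """Tracks both residual variances with a forgetting factor alpha."""

    def __init__(self, alpha=0.99, floor=1e-12):
        if not 0.0 < alpha < 1.0:
            raise ValueError("alpha must satisfy 0 < alpha < 1")
        self.alpha = alpha
        self.floor = floor
        self.var_d1 = 0.0
        self.var_d2 = 0.0

    def update(self, d1, d2, mu):
        a = self.alpha
        # Assumed recursive averaging of the squared residual signals.
        self.var_d1 = a * self.var_d1 + (1.0 - a) * d1 * d1
        self.var_d2 = a * self.var_d2 + (1.0 - a) * d2 * d2
        return se_from_variances(self.var_d1, self.var_d2, mu, self.floor)
```

As a check of the arithmetic, with σd²=1, σe²=0.25 and μ=0.5, Expression (15) gives σd′²=1+0.25−0.25=1, and se_from_variances(1.0, 1.0, 0.5) recovers the ratio σd²/σe²=4.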
As described above, the echo canceler 100 of the present invention is characterized by making it possible to calculate the signal-to-echo ratio SE(n) only from directly observable statistics without using the acoustic coupling quantity or other estimated values. More specifically, it can observe the variance σd
This means that applying the present invention always enables computing the signal-to-echo ratio SE(n) accurately without depending on the noise conditions or the degree of the stability of the echo, and the configuration that controls the suppression quantity of the residual echo in accordance with the signal-to-echo ratio SE(n) thus computed can achieve a residual echo suppression effect more stably than the conventional technique. As a result, the echo canceler can calculate an accurate signal-to-echo ratio without interference from external factors even under conditions where large noise or echo path variations can occur, and can thereby determine the appropriate echo suppression quantity and always suppress the residual echo stably without deteriorating the send voice.
Next, another configuration of the echo canceler 100 will be shown.
The first update filter 102′ comprises a filter processing unit 102a′ and a subtractor 102b′, and the second update filter 106 comprises a filter processing unit 106a and a subtractor 106b. The first update filter 102′ and the second update filter 106 operate in the same manner as the foregoing update filter 102. It is assumed here that the first update filter 102′ uses the first update step size μ1(n) and the second update filter 106 uses the second update step size μ2(n).
By adding the second update filter 106, the signal-to-echo ratio calculating unit 103 is supplied with three residual signals, that is, the first residual signal d1(n) from the subtraction filter 101, the second residual signal d2(n) from the first update filter 102′ and a third residual signal d3(n) from the second update filter 106.
The signal-to-echo ratio calculating unit 103 obtains the signal-to-echo ratio SE(n) from two residual signals dA(n) and dB(n) among the three residual signals of the first residual signal d1(n), second residual signal d2(n) and third residual signal d3(n), and from two parameters μA and μB among the three different parameters of the first update step size μ1(n), second update step size μ2(n) and “zero” according to the following Expression (21).
The following are concrete combinations of the two residual signals dA(n) and dB(n) and two parameters μA and μB in Expression (21).
Pattern 1:
Pattern 2:
Pattern 3:
The signal-to-echo ratio calculating unit 103 can select one of the signal-to-echo ratio SE1(n) computed using the pattern 1, the signal-to-echo ratio SE2(n) computed using the pattern 2 and the signal-to-echo ratio SE3(n) computed using the pattern 3 as the final signal-to-echo ratio SE(n). Alternatively, it can compute the average value of the three signal-to-echo ratios computed and make it a final signal-to-echo ratio SE(n). The residual echo suppression quantity calculating unit 104, using the foregoing Expression (4), for example, computes the residual echo suppression quantity γ(n) from the signal-to-echo ratio SE(n) obtained by the signal-to-echo ratio calculating unit 103. The residual echo suppressing unit 105 executes the suppression processing by multiplying the first residual signal d1(n) by the residual echo suppression quantity γ(n) according to the foregoing Expression (5), thereby obtaining the estimated signal s′(n) of the near-end signal s(n).
Thus, the present embodiment can obtain a highly accurate signal-to-echo ratio SE(n) by comprising the signal-to-echo ratio calculating unit 103 for computing the signal-to-echo ratio SE(n) using the plurality of residual signals supplied from the plurality of update filters.
Incidentally, although
Although the description thus far is made by way of example of the NLMS algorithm, deriving examples of the signal-to-echo ratio SE(n) using the LMS algorithm and affine projection algorithm as the adaptive algorithm in the echo canceler in accordance with the present invention will be described below.
ĥ(n+1)=ĥ(n)+μ(n)d(n)x(n) (22)
ĥ(n+1)=ĥ(n)+μ(n)Xp(n)[XpT(n)Xp(n)]⁻¹dp(n) (24)
where p is the order of projection, and
Xp(n)=[x(n),x(n−1), . . . ,x(n−p+1)]T
dp(n)=[d1(n),d1(n−1), . . . ,d1(n−p+1)]T (26)
The examples of the foregoing Expressions (22) and (24) can be derived in the same process as the example of the NLMS. As is clear from Expressions (22) and (24), to use the LMS algorithm or affine projection algorithm, the signal-to-echo ratio calculating unit 103 also requires the received signal x(n) as information for calculating the signal-to-echo ratio SE(n).
In addition, an echo canceler in accordance with the present invention can be configured by using a block adaptive filter algorithm such as the BLMS (Block LMS) or BOP (Block Orthogonal Projection Algorithm), which executes the LMS algorithm or affine projection algorithm in prescribed signal block units. Using these block adaptive filter algorithms, it becomes possible to observe a prescribed block length signal as to the first residual signal d1(n) and second residual signal d2(n). This will improve the accuracy of observations of their variances and power, thereby being able to obtain a more accurate signal-to-echo ratio.
As described above, according to the present embodiment 1, it is configured in such a manner as to comprise the signal-to-echo ratio calculating unit 103 for calculating the signal-to-echo ratio SE(n) from the received signal x(n), the first residual signal d1(n) obtained by the subtraction filter 101, and the second residual signal d2(n) obtained by the update filter 102, in accordance with the filter update step size μ(n) used by the update filter 102. Accordingly, it can calculate the signal-to-echo ratio only from directly observable statistics without using the amount of acoustic coupling or other estimates, thereby being able to calculate an accurate signal-to-echo ratio even in the conditions where large noise or echo path variation occurs. This makes it possible to suppress the residual echo always stably without deteriorating the send voice.
In addition, according to the present embodiment 1, it is configured in such a manner as to comprise the signal-to-echo ratio calculating unit 103 for calculating the signal-to-echo ratio SE(n) using the received signal x(n), the first residual signal d1(n) obtained by the subtraction filter 101, the second residual signal d2(n) and the first update step size μ1(n) supplied from the first update filter 102′, and the third residual signal d3(n) and the second update step size μ2(n) supplied from the second update filter 106. Accordingly, it can obtain the highly accurate signal-to-echo ratio SE(n).
An echo detector comprising the subtraction filter 101, update filter 102 and signal-to-echo ratio calculating unit 103 shown in the foregoing embodiment 1 will be described.
In
The signal-to-echo ratio calculating unit 103 calculates, according to the foregoing Expression (18), the signal-to-echo ratio SE(n) using the received signal x(n), the first residual signal d1(n) obtained by the subtraction filter 101, the second residual signal d2(n) obtained by the update filter 102 and the filter update step size μ(n) used by the update filter 102.
The echo detecting unit 201 decides on whether the residual echo quantity is not less than a threshold with respect to the send voice signal or not using the signal-to-echo ratio SE(n) the signal-to-echo ratio calculating unit 103 computes, and outputs the decision result as an echo detection result. As the output of the echo detection result, it uses a flag flg_ec(n). When it decides that the residual echo quantity is less than the threshold with respect to the send voice signal, it decides that the echo is undetected and outputs the flag flg_ec(n)=0. In contrast, when it decides that the residual echo quantity is not less than the threshold with respect to the send voice signal, it decides that the echo is detected and outputs the flag flg_ec(n)=1.
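The decision of the echo detecting unit 201 can be sketched as follows. The exact decision quantity is not specified in the description, so this sketch assumes that the reciprocal of the signal-to-echo ratio, 1/SE(n), is compared with the threshold; the function name detect_echo and the parameter name threshold are introduced only for illustration.

```python
def detect_echo(se, threshold):
    """Return the flag flg_ec(n): 1 when an echo is detected, otherwise 0.

    se        : signal-to-echo ratio SE(n), assumed to be positive
    threshold : assumed limit on the echo-to-signal ratio 1/SE(n)
    """
    if se <= 0:
        raise ValueError("se must be positive")
    # Echo is detected when the residual echo is not less than the threshold.
    return 1 if (1.0 / se) >= threshold else 0
```

With a threshold of 0.1, for example, detect_echo(4.0, 0.1) returns 1 and detect_echo(100.0, 0.1) returns 0.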
As a concrete example using the echo detector 200, voice code transmission control of a mobile phone can be mentioned. For example, when double talk in which echo is superposed on a send voice occurs during a telephone conversation of a mobile phone, the signal-to-echo ratio calculating unit 103 computes the signal-to-echo ratio SE(n) of the residual signal after subtracting the estimated echo through the adaptive filter according to the foregoing Expression (18), and the echo detecting unit 201 detects an echo from the signal-to-echo ratio SE(n) computed. When the echo detecting unit 201 detects the echo, the voice code transmission control (not shown) thereafter transmits a code indicating a pause without transmitting a code of voice.
As described above, according to the present embodiment 2, it is configured in such a manner as to comprise the signal-to-echo ratio calculating unit 103 for computing the signal-to-echo ratio SE(n) of the residual signal after subtracting the estimated echo through the adaptive filter, and the echo detecting unit 201 for deciding on whether the residual echo quantity is not less than the threshold with respect to the send voice signal in accordance with the signal-to-echo ratio SE(n) computed. Accordingly, it can detect the presence or absence of the echo accurately even if the echo path varies.
In addition, according to the present embodiment 2, it is configured in such a manner as to execute, in the voice code transmission control using the echo detector 200, the voice code transmission control in accordance with the detection result of the echo detecting unit 201. Accordingly, when an echo is detected, it transmits a code indicating a pause without sending a code of voice, thereby being able to provide a leeway for a circuit capacity and to prevent a far-end user of the telephone conversation from hearing the echo and from feeling displeasure.
The foregoing embodiment 1 shows a configuration which causes the update filter 102 to update the updated filter coefficient sequence using the received signal x(n) and the microphone input signal y′(n), and to obtain the second residual signal d2(n) soon after that using the same received signal x(n) and microphone input signal y′(n), and which uses them for computing the signal-to-echo ratio SE(n).
However, although the received signal x(n) and the near-end signal s(n) are uncorrelated in general, correlation sometimes appears between the received signal x(n) and the near-end signal s(n), and hence there are some cases where the updated filter coefficient sequence ĥ(n+1) yields a second estimated echo signal ŷ2(n) that eliminates part of the near-end signal s(n) in the microphone input signal y′(n) supplied. In this case, there are some cases where the variance σd
Furthermore, the first subtraction filter 101′ comprises an estimated response signal generating unit 101a′ and a subtractor 101b′, and the second subtraction filter 301 comprises an estimated response signal generating unit 301a and a subtractor 301b. Incidentally, in
Next, the operation of the echo canceler 300 of the present embodiment 3 will be described.
The first subtraction filter 101′ generates the first estimated echo signal ŷ1(n) through the filter processing using the filter coefficient sequence ĥ(n−1) obtained as a result up to the operation before the last operation, and obtains the first residual signal d1(n) by subtracting the signal ŷ1(n) from the microphone input signal y′(n). Likewise, the second subtraction filter 301 generates the second estimated echo signal ŷ2(n) through the filter processing using the updated filter coefficient sequence ĥ(n) obtained in the previous operation, and obtains the second residual signal d2(n) by subtracting the signal ŷ2(n) from the microphone input signal y′(n).
The signal-to-echo ratio calculating unit 302 calculates the signal-to-echo ratio SE(n) using the first residual signal d1(n), second residual signal d2(n) and the update step size μ(n−1) which is used previously and is input from the delay processing unit 303. The computing method is the same as that of the embodiment 1. In addition, when using the LMS algorithm, affine projection algorithm or the like as the adaptive algorithm as described in the embodiment 1, the received signal x(n) is also used. The delay processing unit 303 temporarily stores the update step size μ(n) the update filter 304 uses for the coefficient update, and supplies it to the signal-to-echo ratio calculating unit 302 at the point of time when computing the next signal-to-echo ratio SE(n).
Using the signal-to-echo ratio SE(n) generated by the signal-to-echo ratio calculating unit 302, the residual echo suppression quantity calculating unit 104 computes the residual echo suppression quantity γ(n) using the foregoing Expression (4), for example. Furthermore, the residual echo suppressing unit 105 produces the output signal s′(n) by multiplying the first residual signal d1(n) by the residual echo suppression quantity γ(n) generated by the residual echo suppression quantity calculating unit 104 using the foregoing Expression (5).
The update filter 304 updates the filter coefficient sequence ĥ(n) using the received signal x(n), the second residual signal d2(n) and the update step size μ(n) determined arbitrarily, and obtains the filter coefficient sequence ĥ(n+1) updated according to the following Expression (27).
{circumflex over (h)}(n+1)={circumflex over (h)}(n)+μ(n)ε(n) (27)
It is assumed here that the update is not carried out at the initial point where n=0 because of the lack of the coefficient update quantity, and that the filter coefficient sequence ĥ(1) maintains the initial value ĥ(0). The filter coefficient ĥ(n+1) updated by the update filter 304 is used as the filter coefficient of the second subtraction filter 301 at the next time sequence n+1.
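The order of operations in the echo canceler 300 can be sketched as follows, assuming the standard NLMS algorithm as the adaptive algorithm and a fixed update step size. The class name DelayedUpdateCanceller, its attribute names and the constant eps are assumptions introduced for this sketch; the calculation of SE(n) and the suppression itself are omitted.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


class DelayedUpdateCanceller:
    """Residuals at time n use coefficients fixed before x(n), y'(n) arrive."""

    def __init__(self, n_taps, mu=0.5, eps=1e-12):
        self.h_prev = [0.0] * n_taps  # coefficients from the operation before the last
        self.h_curr = [0.0] * n_taps  # coefficients from the previous operation
        self.mu = mu                  # step size used for the current update
        self.mu_prev = mu             # step size held by the delay processing
        self.eps = eps                # assumed regularizer
        self.n = 0

    def step(self, x_hist, y_mic):
        # First and second residual signals from the two subtraction filters.
        d1 = y_mic - dot(self.h_prev, x_hist)
        d2 = y_mic - dot(self.h_curr, x_hist)
        mu_for_se = self.mu_prev  # delayed step size supplied for SE(n)
        if self.n == 0:
            # No update at the initial point: the initial value is maintained.
            h_next = list(self.h_curr)
        else:
            # Update in the form of Expression (27), driven by the second residual.
            power = dot(x_hist, x_hist) + self.eps
            h_next = [h + self.mu * d2 * x / power
                      for h, x in zip(self.h_curr, x_hist)]
        self.h_prev, self.h_curr = self.h_curr, h_next
        self.mu_prev = self.mu
        self.n += 1
        return d1, d2, mu_for_se
```

Because both residual signals at time n are computed with coefficients that were fixed before the samples at time n were supplied, the updated coefficients are never evaluated on the same samples that produced them.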
As described above, according to the present embodiment 3, it is configured in such a manner as to comprise the update filter 304 for updating the filter coefficient sequence, the second subtraction filter 301 for obtaining the second residual signal d2(n) from the received signal x(n) and the microphone input signal y′(n) which are supplied after the update of the filter coefficient sequence, and the signal-to-echo ratio calculating unit 302 for calculating the signal-to-echo ratio SE(n) using the second residual signal d2(n). Accordingly, it can prevent the variance σd
Incidentally, although the foregoing embodiment 3 shows the configuration that uses the update step size μ(n−1) obtained in the previous operation, it is not limited to the coefficient update quantity obtained in the previous operation, but can be altered appropriately. For example, a configuration is also possible which uses the update step size μ(n−2) obtained in the operation before the last operation.
When a near-end signal generation condition or a convergence state of the update filter differs from frequency band to frequency band, more efficient echo canceling can be expected by applying the present invention to an adaptive algorithm using time-frequency conversion, such as a high speed LMS, so as to calculate the signal-to-echo ratio and execute the residual echo suppression processing for each frequency band. Thus, the present embodiment 4 describes a configuration that employs the adaptive algorithm using the time-frequency conversion.
In
Furthermore, the subtraction filter 101 comprises an estimated response signal generating unit 101a and a subtractor 101b, and the update filter 403 comprises a filter processing unit 403a.
Next, the operation of the echo canceler 400 of the embodiment 4 in accordance with the present invention will be described.
The echo canceler 400 comprises the first to fifth time-frequency conversion units 401, 402, 404, 406 and 410 for carrying out the time-frequency conversion. Accordingly, it executes its processing after dividing the signal into blocks with a prescribed block length L. In the following description, it is assumed that the block number from the start point of the processing is denoted by k.
The subtraction filter 101 generates, when acquiring the received signal x(n), the first estimated echo signal ŷ1(n) through the filter processing using the filter coefficient sequence ĥ(k) obtained up to the previous operation, and obtains the first residual signal d1(n) by subtracting the signal ŷ1(n) from the microphone input signal y′(n).
The first time-frequency conversion unit 401 obtains the frequency elements X(ω,k) of the received signal x(n) by carrying out the time-frequency conversion of the received signal x(n) for each block length L. Here ω is an index denoting frequency. In addition, as the time-frequency conversion here, the DFT (discrete Fourier transform) can be used, for example. Likewise, the second time-frequency conversion unit 402 carries out the time-frequency conversion of the first residual signal d1(n) to obtain the frequency elements D1(ω,k) of the first residual signal.
Using the frequency elements X(ω,k) of the received signal x(n), the frequency elements D1(ω,k) of the first residual signal d1(n) and the arbitrary update step sizes μ(ω,k) determined for the individual frequency elements, the update filter 403 updates the frequency elements Ĥ(k) of the filter coefficient sequence, which are obtained as a result of the operation up to the previous operation, according to the prescribed adaptive algorithm, thereby obtaining the frequency elements Ĥ(k+1) of the updated filter coefficient sequence.
Incidentally, as an example of the update filter in such a frequency domain, there is a high speed LMS algorithm described in the following Reference 2. As an alternative method, it is also possible to use an MDF (Multi Delay Filter) based on the high speed LMS algorithm.
The update filter 403 carries out filter processing of the frequency elements X(ω,k) of the received signal x(n) using the frequency elements Ĥ(k+1) of the updated filter coefficient sequence, and obtains the frequency elements Ŷ2(ω,k) of the second estimated echo signal. The third time-frequency conversion unit 404 carries out reverse conversion of the frequency elements Ŷ2(ω,k) of the second estimated echo signal from the frequency elements to the time signal, and obtains the second estimated echo signal ŷ2(n). A first subtractor 405 obtains the second residual signal d2(n) by subtracting the second estimated echo signal ŷ2(n) from the microphone input signal y′(n), and the fourth time-frequency conversion unit 406 carries out the time-frequency conversion of the second residual signal d2(n) and obtains the frequency elements D2(ω,k) of the second residual signal.
Using the frequency elements D1(ω,k) of the first residual signal d1(n), the frequency elements D2(ω,k) of the second residual signal d2(n) and the update step sizes μ(ω,k), the signal-to-echo ratio calculating unit 407 computes the signal-to-echo ratios SE(ω,k) for the individual frequency elements. Incidentally, depending on the adaptive algorithm used, the frequency elements X(ω,k) of the received signal x(n) are also used for the calculation. In the case of the foregoing high speed LMS algorithm, the signal-to-echo ratio SE(ω,k) is determined by the following Expression (28), for example.
The residual echo suppression quantity calculating unit 408 computes the residual echo suppression quantity γ(ω,k) for each frequency using the signal-to-echo ratio SE(ω,k) obtained by the signal-to-echo ratio calculating unit 407. Furthermore, the residual echo suppressing unit 409 suppresses the residual echo by multiplying the frequency elements D1(ω,k) of the first residual signal d1(n) by the residual echo suppression quantities γ(ω,k) for the individual frequencies computed by the residual echo suppression quantity calculating unit 408. The fifth time-frequency conversion unit 410 carries out the reverse conversion of the output signal S′(ω,k) obtained in this way from the frequency elements to the time signal, thereby outputting the output signal s′(n).
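The multiplication by the suppression quantities and the reverse conversion can be sketched as follows. The suppression quantities are taken here as given inputs with purely illustrative values; how they are derived from the signal-to-echo ratio is not reproduced. The helper names are hypothetical, and the gains are assumed to be symmetric in the frequency index so that the output stays real.

```python
import cmath

def dft(block):
    """Plain DFT of one block."""
    L = len(block)
    return [sum(block[n] * cmath.exp(-2j * cmath.pi * w * n / L) for n in range(L))
            for w in range(L)]

def idft(spec):
    """Inverse DFT; the real part is returned because the signals are real."""
    L = len(spec)
    return [(sum(spec[w] * cmath.exp(2j * cmath.pi * w * n / L) for w in range(L)) / L).real
            for n in range(L)]

def suppress_and_resynthesize(d1_block, gamma):
    """Compute S'(w,k) = gamma(w,k) * D1(w,k) and convert back to s'(n).

    gamma must satisfy gamma[w] == gamma[L - w] for a real output signal.
    """
    D1 = dft(d1_block)
    S = [g * d for g, d in zip(gamma, D1)]
    return idft(S)

# Illustrative block: a constant component plus an alternating component.
d1 = [2.0, 1.0, 0.0, 1.0]
gamma = [1.0, 0.0, 1.0, 0.0]   # suppress elements 1 and 3 completely
s_out = suppress_and_resynthesize(d1, gamma)
```

In this example the alternating component, which occupies the elements with indices 1 and 3, is removed and only the constant component remains in the output.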
By thus suppressing the residual echoes for the individual frequencies, the residual echo suppression which corresponds to the estimated echo signal, the near-end signal generation condition and the convergence state of the update filter becomes possible. For example, when the near-end signal condition varies depending on the frequency band such as when the near-end signal is large in a particular frequency band and is small in another frequency band, sufficient residual echo suppression is carried out in the frequency band where the near-end signal is large because the power of the residual echo signal is high, but weaker residual echo suppression is carried out in the frequency band where the near-end signal is small because the power of the residual echo signal is low. Thus, the deterioration of the near-end signal can be reduced.
As described above, according to the present embodiment 4, it is configured in such a manner as to comprise the first to fifth time-frequency conversion units 401, 402, 404, 406 and 410 for carrying out the time-frequency conversion, the signal-to-echo ratio calculating unit 407 for computing the signal-to-echo ratio corresponding to the noise condition and the convergence state of the update filter for each frequency element, and the residual echo suppression quantity calculating unit 408 and the residual echo suppressing unit 409 for carrying out the residual echo suppression using the calculated signal-to-echo ratio. Accordingly, it can perform the residual echo suppression corresponding to the magnitude of the near-end signal for each frequency band. More specifically, it reduces the perceived residual echo by increasing the residual echo suppression quantity in a frequency band where the near-end signal is large and by reducing the residual echo suppression quantity in a frequency band where the near-end signal is small, thereby being able to reduce the deterioration in the send voice signal contained in the near-end signal.
Although the foregoing embodiment 4 shows a configuration for calculating the signal-to-echo ratio for each frequency band using the time-frequency conversion, the present embodiment 5 shows a configuration which carries out, when high frequency resolution is not required, the residual echo suppression processing corresponding to the near-end signal condition and the convergence state of the update filter for each subband using a subband filter.
Next, the operation of the echo canceler 500 of the embodiment 5 will be described.
The first subband disassembling unit 501 divides the received signal x(n) supplied to the echo path 900 into M bands, where M is a prescribed division number, and obtains the received signals x(1)(n), x(2)(n), . . . , x(M)(n) divided into subbands. Likewise, the second subband disassembling unit 502 divides the microphone input signal y′(n) into M frequency bands, and obtains the microphone input signals y′(1)(n), y′(2)(n), . . . , y′(M)(n) divided into subbands.
Corresponding to the M subbands, the echo canceling unit group 503 consists of a set of the echo canceling units 503(1), 503(2), . . . , 503(M) (referred to as the echo canceling unit group 503 generically from now on). The echo canceler described in the embodiment 1 or 3 is applicable to each unit in the echo canceling unit group 503, which corresponds to each subband. Accordingly, in the echo canceling unit group 503, for the M pairs of the received signals x(n) and the microphone input signals y′(n) divided into the subbands, the individual subband echo canceling units 503 carry out the processing to obtain M output signals s′(1)(n), s′(2)(n), . . . , s′(M)(n) undergoing the subband disassembly. The subband assembling unit 504 carries out subband assembly of the subband disassembled output signals s′(1)(n), s′(2)(n), . . . , s′(M)(n) and obtains the output signal s′(n).
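The disassembly into M subbands, the per-subband processing and the subband assembly can be sketched as follows. This is a simplified illustration only: instead of a subband filter it groups DFT elements into M contiguous bands without decimation, so that the band signals sum back to the original signal. The function names and the `process` callback, which stands in for one echo canceling unit, are hypothetical.

```python
import cmath

def dft(block):
    """Plain DFT of one block."""
    L = len(block)
    return [sum(block[n] * cmath.exp(-2j * cmath.pi * w * n / L) for n in range(L))
            for w in range(L)]

def idft(spec):
    """Inverse DFT; the real part is returned because the signals are real."""
    L = len(spec)
    return [(sum(spec[w] * cmath.exp(2j * cmath.pi * w * n / L) for w in range(L)) / L).real
            for n in range(L)]

def split_subbands(x, M):
    """Split x(n) into M real band signals whose sum reconstructs x(n)."""
    L = len(x)
    X = dft(x)
    half = L // 2
    bands = []
    for m in range(M):
        lo = m * (half + 1) // M
        hi = (m + 1) * (half + 1) // M
        keep = [False] * L
        for w in range(lo, hi):
            keep[w] = True
            keep[(L - w) % L] = True   # mirror element keeps the band signal real
        bands.append(idft([X[w] if keep[w] else 0j for w in range(L)]))
    return bands

def assemble_subbands(bands):
    """Subband assembly: sum the band signals sample by sample."""
    return [sum(samples) for samples in zip(*bands)]

def subband_pipeline(x, M, process):
    """Disassemble, apply `process` to every band, then assemble."""
    return assemble_subbands([process(b) for b in split_subbands(x, M)])

x = [0.3, -1.2, 0.8, 2.0, -0.5, 0.1, 1.4, -0.7]
out = subband_pipeline(x, M=3, process=lambda band: band)  # identity processing
```

With identity processing the assembled output reproduces the input, which confirms that the disassembly and assembly steps of this sketch are consistent with each other.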
As described above, according to the present embodiment 5, it is configured in such a manner as to comprise the first subband disassembling unit 501 for dividing the received signal x(n) into the frequency bands; the second subband disassembling unit 502 for dividing the microphone input signal y′(n) into the frequency bands; the echo canceling unit group 503 for carrying out the residual echo suppression processing of the individual signals passing through the division into the frequency bands; and the subband assembling unit 504 for carrying out subband assembly of the output signals of the subband disassembly. Accordingly, even if the near-end signal condition and the convergence state of the update filter differ from frequency band to frequency band, it can reduce the residual echo more quickly and to a sufficiently smaller value.
Incidentally, for the echo cancelers of the foregoing embodiments 3 to 5, an echo detector for detecting an echo can be configured in the same manner as the echo detector shown in the embodiment 2: using the signal-to-echo ratio SE(n) of the residual signal obtained after subtracting the estimated echo computed through the adaptive filter, it makes a decision as to whether the residual echo quantity with respect to the send voice signal is not less than a threshold. When configuring the echo detector, the fifth time-frequency conversion unit 410 shown in the embodiment 4 and the subband assembling unit 504 shown in the embodiment 5 can be removed. Applying the configurations of the echo cancelers of the embodiments 3 to 5 to the echo detectors enables detecting the residual echo accurately even if the echo path varies.
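The threshold decision of such an echo detector can be sketched as follows. The sketch assumes that SE(n) is a linear power ratio of signal to echo, so that the residual echo quantity relative to the send voice signal is its reciprocal; the function name `detect_echo` and the threshold value are hypothetical.

```python
def detect_echo(se_values, threshold):
    """Return one flag per sample: True when echo/signal = 1 / SE(n) >= threshold."""
    flags = []
    for se in se_values:
        # A non-positive ratio is treated as "echo only" (assumption).
        echo_to_signal = float("inf") if se <= 0.0 else 1.0 / se
        flags.append(echo_to_signal >= threshold)
    return flags

flags = detect_echo([10.0, 0.5, 2.0], threshold=0.5)
```

For these illustrative values, the first sample is not flagged because the echo is small relative to the signal, whereas the second and third samples reach the threshold and are flagged.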
Incidentally, it is to be understood that a free combination of the individual embodiments, variations of any components of the individual embodiments or removal of any components of the individual embodiments are possible within the scope of the present invention.
As described above, an echo canceler and an echo detector in accordance with the present invention are capable of suppressing the occurrence of the residual echo even if the telephone conversation is carried out under large noise or the echo path varies, and are capable of providing a good telephone conversation environment without hindering the telephone conversation by excessively suppressing the send voice. Accordingly, they are applicable to a technique for detecting an echo of the received signal mixed into the send signal and for canceling out the detected echo from the send signal.
100, 110, 300, 400, 500 echo canceler; 101 subtraction filter; 101′ first subtraction filter; 101a, 101a′, 301a estimated response signal generating unit; 101b, 101b′, 301b subtractor; 102, 403 update filter; 102′ first update filter; 102a, 102a′, 106a, 403a filter processing unit; 102b, 102b′, 106b subtractor; 103, 302, 407 signal-to-echo ratio calculating unit; 104, 408 residual echo suppression quantity calculating unit; 105, 409 residual echo suppressing unit; 106 second update filter; 200 echo detector; 201 echo detecting unit; 301 second subtraction filter; 303 delay processing unit; 401 first time-frequency conversion unit; 402 second time-frequency conversion unit; 404 third time-frequency conversion unit; 406 fourth time-frequency conversion unit; 410 fifth time-frequency conversion unit; 501 first subband disassembling unit; 502 second subband disassembling unit; 503 echo canceling unit group; 504 subband assembling unit; 900 echo path; 901 speaker; 902 microphone.
Number | Date | Country | Kind |
---|---|---|---|
2011-105350 | May 2011 | JP | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2012/002194 | 3/29/2012 | WO | 00 | 5/31/2013 |