The present invention relates generally to techniques for echo control in acoustic systems and, more particularly, to methods and systems for improved echo suppression in an echo controller.
In many acoustic systems, such as traditional telephone systems and evolving voice-over-IP (VoIP) systems, it is desirable to minimize acoustic and electrical echoes. Typically, acoustic signals are first processed using echo cancellation techniques and then any residual echoes are processed using echo suppression techniques. For a detailed discussion of conventional echo control techniques, see, for example, M. M. Sondhi and D. A. Berkley, “Silencing Echoes On The Telephone Network,” Proceedings of the IEEE, Vol. 68, No. 8, pages 948-963, August 1980.
For example, a media processor in a voice-over-IP network typically includes an echo controller, such as a software-based echo canceler-suppressor. The echo controller eliminates acoustic and electrical echo originating, for example, from endpoints and line trunks that communicate via time-division multiplex (TDM) connections. The echo controller is engaged in a variety of call topologies in which there is a TDM-to-IP speech-signal conversion boundary. For example, in a call from an IP terminal to a TDM terminal within the same port network, a media processor engages an echo controller to control acoustic echo originating at the TDM terminal that would otherwise be perceived by the IP phone user. When present, this acoustic echo is caused by the loudspeaker-to-microphone coupling in the TDM phone's handset, headset or speakerphone.
While existing echo suppression techniques provide adequate performance for most applications, they suffer from a number of limitations which, if overcome, could further improve the reduction of echoes in acoustic systems. A need therefore exists for improved echo suppression techniques for use in the echo suppressor component of an echo controller. Another need exists for an echo controller that demonstrates an improved ability to respond to acoustic echo originating from, for example, TDM speakerphones not equipped with an acoustic echo canceler. A further need exists for an echo controller that improves acoustic-echo control performance in essentially all call scenarios, such as speakerphones, handsets, and headsets, and improves electrical-echo control performance in call connections involving a TDM trunk.
Generally, methods and apparatus are provided for reducing echo from a received signal. A suppression gain is applied to an output of an echo canceler that has processed the received signal. The suppression gain includes a region of sloping attenuation about a decision point. The echo canceler optionally estimates an echo path and subtracts the estimate from the received signal. The suppression gain includes a non-zero lower bound, gmin, on a maximum attenuation applied by a suppressor, that is based on operating conditions of the echo canceler.
The region of sloping attenuation applies a variable amount of attenuation that depends on the size of an output error ē(n) of the echo canceler relative to the received signal, x(n). The decision point, T, is established to ensure that a residual echo is sufficiently attenuated without significant attenuation of speech from a near-side talker and is based, for example, on an estimate of the echo return loss (ERL) associated with the echo path and an echo return loss enhancement (ERLE).
A more complete understanding of the present invention, as well as further features and advantages of the present invention, will be obtained by reference to the following detailed description and drawings.
The present invention provides improved echo suppression techniques for use in the echo suppressor component of an echo controller.
For a more detailed discussion of conventional echo cancellation techniques, see, for example, M. M. Sondhi and D. A. Berkley, “Silencing Echoes On The Telephone Network,” Proceedings of the IEEE, Vol. 68, No. 8, pages 948-963, August 1980. Generally, as shown in
For a variety of practical reasons, the echo canceler 140 provides less than ideal modeling and cancellation of the echo, and so an echo suppressor 150 is required to reduce the magnitude of the echo to a level that is not noticeable to the far-side talker. The suppressor 150 implements a dynamic attenuator, or gain control, in the form e′(n)=g(n)e(n), where g(n) is a time-varying gain function satisfying 0≦g(n)≦1. Ideally, attenuation is applied only when it is certain that y(n) contains no speech from the near-side talker. Otherwise, near-side speech will be attenuated or clipped.
Most of the control paths in
where 0<α<1 is a smoothing constant providing a mechanism of averaging. In one implementation, α can be chosen to provide a time constant of about 15 milliseconds. Similar to Eq. (1), the envelopes
ERL=
Generally, ERL is a measure of the bulk level, or loudness, of the echo. If there is loss in the echo path (the echo is weak), ERL is positive; and if there is gain in the echo path (loud echo), ERL is negative. The ERL of Eq. (2) must be measured when only far-side speech is present; if near-side speech or near-side noise is present, the measure will be disturbed. Alternatively, and more accurately, the ERL can be computed as the sum of the squares of ĥ(n) after the adaptive filter in the echo canceler 140 has converged sufficiently to the echo path. Finally, assuming only far-side speech is present, the echo return loss enhancement (ERLE) is defined by
ERLE=
The ERLE is a measure of the performance of the echo canceler and is more positive as the canceler converges to the true echo path. The ERL and ERLE are used by the suppressor 150 to compute decision thresholds and gain quantities.
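As a rough illustration of the quantities above, the following Python sketch tracks a smoothed magnitude envelope and derives ERL and ERLE in decibels. The exact forms of Eqs. (1)-(3) are not reproduced in this text, so the one-pole recursion, the mapping from a time constant to α, the 8 kHz sampling rate used in the note below, and the definitions ERL = x̄ − ȳ and ERLE = ȳ − ē (all in dB, with ȳ the envelope of the echo-bearing signal y(n)) are assumptions chosen only to match the sign conventions described here. All function names are hypothetical.

```python
import math


def smoothing_constant(time_constant_s, sample_rate_hz):
    """Map a time constant to a one-pole coefficient alpha in (0, 1).

    Assumption: the standard relation alpha = exp(-1 / (tau * fs)).
    """
    return math.exp(-1.0 / (time_constant_s * sample_rate_hz))


def update_envelope(prev_env, sample, alpha):
    """One-pole average of the sample magnitude (assumed form of Eq. (1))."""
    return alpha * prev_env + (1.0 - alpha) * abs(sample)


def to_db(level, floor=1e-10):
    """Convert a linear envelope to decibels, guarding against log(0)."""
    return 20.0 * math.log10(max(level, floor))


def erl_db(x_env, y_env):
    """Assumed ERL: far-side envelope minus echo envelope, in dB.

    Positive when the echo path has loss, negative when it has gain.
    """
    return to_db(x_env) - to_db(y_env)


def erle_db(y_env, e_env):
    """Assumed ERLE: echo envelope minus canceler-output envelope, in dB.

    Grows as the canceler converges and the residual shrinks.
    """
    return to_db(y_env) - to_db(e_env)
```

Under these assumptions, a 15 millisecond time constant at an 8 kHz sampling rate gives α of roughly 0.992, and an echo envelope one tenth the size of the far-side envelope corresponds to an ERL of 20 dB.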
Center Clipping Suppression Techniques
An echo suppressor 150 implements a multiplicative gain function.
A disadvantage of the center clipper is its binary-like behavior about the threshold point. The gain switches between unity and infinite attenuation for residual signals ē(n) that are hovering around the point
Soft-Response Echo Suppression Algorithm
According to one aspect of the present invention, the shortcomings of the center clipper are addressed by modifying the gain function to include a region of variable, sloping attenuation about the decision point.
The value of ē(n) relative to
1) Update the following envelopes and performance measures during step 410, as described in equations (1)-(3):
2) Calculate the suppression threshold below which attenuation is applied during step 420:
T=ERL+ERLE−Guard, where Guard>0 in decibels, e.g., Guard=10 dB.
It is noted that the threshold guard, Guard, can be chosen to provide a more conservative, or liberal, threshold point. Because ERL and ERLE are estimated quantities, the threshold guard provides a means of compensating for the variances of these estimates. In one exemplary implementation, Guard is made positive to achieve more conservative control of residual echo. With Guard positive, T is smaller for a given ERL and ERLE, and therefore
3) Determine the maximum necessary attenuation during step 430. Compute the required attenuation to meet a total echo reduction of 55 dB (or another amount):
gmin,dB(n)=−55 dB+ERL+ERLE.
It is noted that the lower bound is computed assuming a 55 dB level of total echo control (which has been found to be sufficient in most applications).
4) Compute the suppression gain during step 440:
gdB(n)=max[gmin,dB,min[0,ē(n)−
(To simplify the immediate discussion, it is assumed that the slope of the soft region in
5) Compute the corresponding linear value of gain during step 450: g(n)=10^(gdB(n)/20).
6) Apply the suppression gain to the output of echo canceler 140 during step 460: e′(n)=g(n)e(n).
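Steps 420 through 460 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation. The step 440 expression is incomplete in this text, so the sketch assumes that the residual envelope ē(n) is compared against the far-side envelope x̄(n), both in dB, with unit slope: gdB = max[gmin,dB, min[0, ē − x̄ + T]]. The amplitude conversion 10^(gdB/20) in step 450 and the clamp that keeps the lower bound at or below 0 dB are likewise assumptions, and all function and parameter names are hypothetical.

```python
def suppression_gain_db(e_env_db, x_env_db, erl, erle,
                        guard_db=10.0, total_reduction_db=55.0):
    """Soft-response suppression gain in dB (steps 420-440, unit slope)."""
    # Step 420: threshold below which attenuation is applied.
    threshold_db = erl + erle - guard_db

    # Step 430: lower bound on the gain, so that canceler plus suppressor
    # reach the target total echo reduction. The min(0, ...) clamp is an
    # added safeguard (not in the source) that keeps g(n) <= 1.
    g_min_db = min(0.0, -total_reduction_db + erl + erle)

    # Step 440 (assumed form): 1 dB of attenuation for every 1 dB that the
    # residual envelope falls below x_env_db - threshold_db.
    return max(g_min_db, min(0.0, e_env_db - x_env_db + threshold_db))


def db_to_linear(g_db):
    """Step 450: amplitude gain from a dB value (assumed 20*log10 scaling)."""
    return 10.0 ** (g_db / 20.0)


def suppress(e_sample, g_linear):
    """Step 460: apply the gain to one canceler output sample."""
    return g_linear * e_sample
```

With ERL=15 dB, ERLE=12 dB and the default 10 dB guard, a residual envelope 30 dB below the far-side envelope receives 13 dB of attenuation under these assumptions, a residual 60 dB below is limited to the 28 dB floor, and a residual only 10 dB below passes unattenuated.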
Suppose that for some processing time index n, ERL=15 dB, ERLE=12 dB, and let Guard=10 dB. Then, steps 420 through 440 of the soft-response suppression algorithm 400 yield:
Step 420. T=ERL+ERLE−Guard=15+12−10=17 dB.
Step 430. gmin,dB(n)=−55 dB+ERL+ERLE=−55+15+12=−28 dB.
Step 440. gdB(n)=max[−28, min[0, ē(n)−
So, attenuation is applied if ē(n) is 17 dB or more below
Extensions
The threshold T should be bounded to reflect the maximum performance expected in any real situation. For example, if the sum ERL+ERLE is never expected to exceed 40 dB or be less than 0 dB, the threshold should be computed as T=max[0, min(40, ERL+ERLE)]−Guard.
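The bounded threshold can be sketched in Python as shown here. The 0 dB and 40 dB limits and the 10 dB guard are the example values from the text; the function and parameter names are hypothetical.

```python
def bounded_threshold_db(erl, erle, guard_db=10.0,
                         min_sum_db=0.0, max_sum_db=40.0):
    """T = max[min_sum, min(max_sum, ERL + ERLE)] - Guard, all in dB."""
    # Clamp the estimated total echo loss to the range expected in practice
    # before subtracting the guard.
    clamped_sum_db = max(min_sum_db, min(max_sum_db, erl + erle))
    return clamped_sum_db - guard_db
```

For ERL=15 dB and ERLE=12 dB the sum lies inside the limits and the result is 17 dB. An implausibly large estimate such as ERL+ERLE=55 dB is capped at 40 dB and yields T=30 dB, while a negative sum is raised to 0 dB and yields T=−10 dB.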
In step 440 of the soft-response suppression algorithm 400, 1 dB of attenuation is added for every 1 dB that ē(n) is below the quantity
gdB(n)=max{gmin,dB,min[0,(
where b>0, integer, specifies the slope of the line in
It is to be understood that the embodiments and variations shown and described herein are merely illustrative of the principles of this invention and that various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the invention.
Number | Date | Country |
---|---|---|
20060193464 A1 | Aug 2006 | US |