Aspects of the present disclosure are best understood from the following detailed description when read with the accompanying figures, in which:
It is to be understood that the following disclosure provides many different embodiments, or examples, for implementing different features of various embodiments. Specific examples of components and arrangements are described below to simplify the present disclosure. These are, of course, merely examples and are not intended to be limiting. In addition, the present disclosure may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.
Radio access network (RAN) 110 may include a base station controller (BSC) 112 that is coupled with one or more base transceiver stations (BTSs) 114-115 that provide an over-the-air interface with one or more wireless mobile terminals 116-117. BTSs 114-115 include equipment for transmitting and receiving radio signals with mobile terminals 116-117. BTSs 114-115 may be adapted to encrypt and decrypt communications made with BSC 112, which may provide control to a plurality of BTSs. In the illustrative example, BSC 112 interfaces with a switching wireless media gateway (WMG) 150 that provides for communications between devices in RAN 110 and other access networks, such as PSTN access networks 120a-120b and packet network 130. RAN 110 and WMG 150 may comprise part of a cellular or mobile telecommunications network, such as a network compliant with the Global System for Mobile (GSM) Communications standards, Universal Mobile Telecommunications System (UMTS), or another mobile or cellular radio system.
PSTN access network 120a may include various devices, such as residential telephones 122a-122m, and/or one or more private branch exchanges (PBXs) 124. PBX 124 may connect with switch 151 via one or more trunk lines 128. Various devices 126a-126n, such as telephones, communication terminals, facsimile machines, and the like, may be connected with PBX 124. Residential telephones 122a-122m may be coupled by local loops 125a with switch 151, such as a class 5 switch, that may be deployed as a central office. In other implementations, telephones 122a-122m may be coupled with switch 151 by digital loop carriers, PBXs, digital concentrators, and/or other aggregators, or may otherwise be configured to communicate with switch 151 through PSTN access network 120a. Loops 125a may include digital loops and/or analog loops, and may be configured to transmit time-division multiplexed (TDM) and other PSTN data, among others. Loops 125a may comprise, for example, a respective twisted copper pair terminating telephones 122a-122m. A hybrid transformer 161 may couple two-wire local customer loops 125a to four-wire long-distance trunks.
In a similar manner, PSTN access network 120b may include a variety of communication devices 123a-123p that may be interconnected with a switching media gateway (MGW) 152 via local loops 125b or other suitable couplings. In this implementation, media gateway 152 may provide switching services and media handling across various platforms. Accordingly, MGW 152 may interface with one or more network types, such as PSTN access network 120b and packet network 130. In other implementations, MGW 152 may interface with a RAN and thus may include one or more wireless network interfaces. Moreover, MGW 152 may provide both Class 4 and Class 5 switching services and thus may aggregate traffic from other network entities, such as switch 151 interconnected therewith, and may provide switching services to termination points in networks 120b and 130. A hybrid transformer 162 may couple two-wire local customer loops 125b to four-wire long-distance trunks.
Packet network 130, such as the Internet or another packet switching network, may include interconnected computer networks, data processing systems, communication devices, packet switching infrastructure, and the like. Packet network 130 may interface with one or more switches, such as switching MGW 152. In the illustrative example, MGW 152 interfaces with PSTN access network 120b and packet network 130, and thus may include both TDM switching and packet switching capabilities.
Switch 153 may aggregate traffic from any number of telecommunication nodes, such as MGWs 150 and 152 connected therewith via respective trunks 160 and 161, other network switches, and the like, and thus may be implemented as a Class 4 switch. Accordingly, any device in RAN 110, PSTN access networks 120a and 120b, and packet network 130 may communicate with any other device in RAN 110, PSTN access networks 120a and 120b, and packet network 130. In accordance with embodiments disclosed herein, mechanisms for calculating a minimum echo path delay are provided for utilization by an AEC deployed in network 100.
Echo canceller 200 may cancel the reflection from near-end send path signal 252 that is applied to a subtractor 220. An estimated echo is generated by an adaptive filter 240 which adapts to model the tail circuit (near-end echo path) response thus producing a replica of the echo returning from the near-end. The output of adaptive filter 240 is a replica of the echo signal and is subtracted from the hybrid device output, i.e., the near-end send path signal 252. The echoes may be substantially eliminated from this signal and it is then used again to adjust the filter weights, e.g., by supply of subtractor 220 output as an input to adaptive filter 240. Any remaining residual echo may be further attenuated by a nonlinear processor 230.
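The adapt-and-subtract loop described above can be illustrated with a short sketch. The disclosure does not name an adaptation rule, so the normalized least-mean-squares (NLMS) update used here is an assumption, as are the function name, the tap count, the step size, and the synthetic tail circuit response. The nonlinear processor is omitted.

```python
import random
from typing import List, Sequence, Tuple


def nlms_echo_cancel(
    receive: Sequence[float],
    send: Sequence[float],
    taps: int = 8,
    step: float = 0.5,
    eps: float = 1e-8,
) -> Tuple[List[float], List[float]]:
    """Return (residual, weights) after NLMS adaptation (assumed rule)."""
    weights = [0.0] * taps   # adaptive filter model of the tail circuit
    history = [0.0] * taps   # most recent receive path samples, newest first
    residual = []
    for x, d in zip(receive, send):
        history = [x] + history[:-1]
        # Replica of the echo produced by the adaptive filter.
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        # Subtractor: send path signal minus the echo replica.
        error = d - echo_estimate
        # The subtractor output is fed back to adjust the filter weights.
        power = sum(h * h for h in history) + eps
        weights = [w + step * error * h / power for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


# Synthetic example: the send path carries only an echo of the receive path.
random.seed(0)
tail_circuit = [0.0, 0.0, 0.5, 0.25]  # hypothetical echo path response
receive_signal = [random.uniform(-1.0, 1.0) for _ in range(2000)]
send_signal = [
    sum(h * receive_signal[n - k] for k, h in enumerate(tail_circuit) if n - k >= 0)
    for n in range(len(receive_signal))
]
residual, weights = nlms_echo_cancel(receive_signal, send_signal)
```

In this noise-free setting the learned weights approach the synthetic tail circuit response and the residual decays toward zero, which is the behavior the paragraph attributes to the adaptive filter and subtractor.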
An acoustic echo controller (AEC) suppresses echo generated by an acoustic coupling at a terminal device. The performance of such acoustic echo control depends significantly on the AEC's knowledge of the echo path delay. The mechanisms described herein for accurately calculating a minimum echo path delay provide for improved determination of the echo path delay, and the resulting improvement in AEC performance may enhance call quality.
A common deficiency in the performance of AEC 340 results from errors in the echo path delay estimation. An echo path delay estimate that diverges from the true echo path delay may result in additional degradation of the receive path signal. Conventional echo path delay estimates comprise an estimate that is coded into or otherwise interfaced with AEC 340 and often exhibits large errors with respect to the true echo path delay. For example, an echo path delay estimate may be parametrically estimated. However, such echo path delay estimates are error prone and are not dynamically calculated based on a particular call configuration or the particular transmission characteristics of the echo path. Problematically, an echo path delay estimate that is smaller than the actual echo path delay may result in signal clipping or distortion, and an echo path delay estimate that is larger than the actual echo path delay may result in passing of acoustic echo on the receive path. Prior art echo path delay estimation mechanisms do not provide for transmission delay characteristics that are measured from the transmission infrastructure or transmission characteristics of the network. In accordance with embodiments disclosed herein, mechanisms for calculating a minimum echo path delay that account for transmission characteristics are provided in a manner that provides for determination of a more accurate echo path delay than is available in conventional systems.
Media gateway 510 is an example of a media gateway, such as media gateway 152 depicted in
After establishing a media path between terminals 521 and 531, terminal 521 may supply a voice signal to system 500 that is transmitted to media gateway 510 and is received thereby as a receive path input signal, Rin, at a TDM switching matrix 512, and a receive path output signal, Rout, may then be processed for transmission to terminal 531. In the depicted example, the receive path output signal, Rout, is processed by an encoder 550, e.g., that may packetize and encode the receive path output signal, Rout, for transmission on packet network 530 via a packet switching matrix 514. The encoded receive path output signal is then transmitted to terminal 531 via packet network 530 and wireless network 535. TDM interface cards and packet interface cards may be respectively coupled with TDM switching matrix 512 and packet switching matrix 514 for providing respective TDM and packet input/output interfaces to system 500.
Terminal 531 may supply a send path signal that is received by media gateway 510 and that is processed thereby for delivery to terminal 521. In the present example, processing of a near end send path signal received by media gateway 510 includes processing by a jitter buffer 560 that provides a buffering delay of the near end send path signal. The buffered send path signal output from jitter buffer 560 may then be supplied to a decoder 570, e.g., that de-packetizes and decodes the send path signal, Sin. AEC 540 may then transmit the de-packetized send path signal, Sin, to terminal 521.
Media gateway 510 may include a processor 516 that may interface with memory device 518 for fetching executable instructions therefrom. Processor 516 may interface with various functions or modules, such as switch matrices 512 and 514, AEC 540, encoder 550, jitter buffer 560, and decoder 570 for managing the operation of various media gateway subsystems and functions.
As discussed above, an acoustic echo 580 may be introduced into the transmission path due to an acoustic coupling between the loudspeaker and microphone of terminal 531. Accordingly, an echo path may be defined that includes a transmission path, or path delay, of a transmission signal that may be returned, at least in part, to the originator as an echo, and a processing delay, e.g., processing latencies introduced by encoder 550, jitter buffer 560, and decoder 570. In the present example, an echo path 590 is defined that includes the receive transmission path from AEC 540 to terminal 531 and the send transmission path from terminal 531 to AEC 540. An echo path delay of echo path 590 comprises the transmission duration, or path delay, of a signal transmission from AEC 540 to terminal 531 and from terminal 531 back to AEC 540, and the processing delay introduced by processing entities deployed in the transmission path.
In accordance with an embodiment, AEC 540 may calculate an echo path 590 delay estimate based on transmission and processing characteristics of echo path 590. In the example depicted in
In the send path, a jitter buffer delay D2 specifies the buffer duration of jitter buffer 560 that results in a processing latency in the send path, and a decoder delay D3 specifies a processing latency of decoder 570. The sum of the encoder delay D1, the jitter buffer delay D2, and the decoder delay D3 comprises a minimum duration for which any acoustic reflection of a receive path signal output at port 541 of AEC 540 may be returned as an echo signal at port 542 of AEC 540. Thus, in accordance with an embodiment, a minimum echo path 590 delay (EPD_Min) is calculated according to the following:
EPD_Min = D1 + D2 + D3 (eq. 1)
The minimum echo path delay calculation provides for improved AEC operation.
In accordance with another embodiment, the minimum echo path 590 delay may be calculated based on a round trip delay. For example, a round trip delay measurement may be implemented by real time control protocol (RTCP) functions. RTCP functions may be implemented in software stored in memory 518 or may be coded into AEC 540. In any manner, a round trip delay between media gateway 510 and terminal 531 may be calculated by transmission metrics transmitted in RTCP packets. In this implementation, an IP network round trip delay D4 is calculated as an estimate of the minimum echo path delay from media gateway 510 to terminal 531 and back to media gateway 510 as depicted in
In one embodiment, the minimum echo path delay may be calculated using RTCP according to the following:
EPD_Min = D4 = tr − LSR − DLSR, (eq. 2)
where tr is the receiver report (RR) reception time as indicated in an RTCP receiver report packet, LSR is the last sender report (SR) timestamp, and DLSR is the delay since the last sender report. As is known, a sender report and receiver report respectively comprise RTCP data structures that provide reception quality feedback using RTCP report packets. For a detailed description of the RR and SR, see, for example, RFC 3550, “RTP: A Transport Protocol for Real-Time Applications” by Schulzrinne, et al., the description of which is incorporated herein by reference.
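As a worked illustration of the round trip calculation, the sketch below applies the subtraction to values expressed in the 32-bit fixed-point format that RFC 3550 uses for LSR and DLSR (units of 1/65536 second). It assumes that tr has been converted to the same format; the function name is hypothetical. The sample values are those of the worked example in RFC 3550.

```python
from typing import Optional

UNITS_PER_SECOND = 65536  # LSR and DLSR resolution defined by RFC 3550


def rtcp_round_trip_seconds(tr: int, lsr: int, dlsr: int) -> Optional[float]:
    """Return tr - LSR - DLSR in seconds, or None if no SR was received yet.

    All three inputs are 32-bit values in units of 1/65536 second.
    """
    if lsr == 0:
        # RFC 3550: the LSR field is zero when no sender report has been received.
        return None
    # Mask to 32 bits so that the result survives timestamp wraparound.
    rtt_units = (tr - lsr - dlsr) & 0xFFFFFFFF
    return rtt_units / UNITS_PER_SECOND


# tr = 46864.500 s, LSR = 46853.125 s, DLSR = 5.250 s
epd_min_seconds = rtcp_round_trip_seconds(0xB7108000, 0xB7052000, 0x00054000)
print(epd_min_seconds)  # 6.125
```

The 32-bit mask is a design choice of this sketch: the middle 32 bits of an NTP timestamp wrap roughly every 18 hours, and modular subtraction keeps the difference correct across a wrap.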
The IP network round trip delay D4 may be periodically calculated, and upon detection of a voice signal supplied on the receive path, AEC 540 may allow the minimum echo path delay EPD_Min to elapse prior to applying acoustic echo suppression or cancellation on the send path from terminal 531 to terminal 521.
In accordance with another embodiment, an echo path model may be calculated that is based on the call configuration for calculating the minimum echo path delay. In a particular implementation, a pre-defined test signal used for echo path evaluation purposes is encoded with an encoder used in the echo path. The test signal may comprise an audio signal that may be supplied in-band on the echo path. In
In this example, the decoded signal that represents the decoded output of decoder 570 may be analyzed, e.g., by a frequency domain analysis, to obtain a numeric characterization, e.g., a spectrum signature, of the simulated echo signal. The frequency domain analysis may be made on a particular frame size, e.g., 40 ms of the signal. With the frequency domain analysis of the test signal, AEC 540 may initiate an echo path timer and apply an instance of the test signal in the voice path at port 541. AEC 540 may then begin monitoring send path input signals, Sin, supplied at port 542 of AEC 540. Analysis of send path input signals, Sin, may be sequentially made on pre-defined frame sizes. On detection of a send path input signal, Sin, having a spectrum signature that matches the simulated echo signal spectrum signature, the echo path timer value may be read. The value of the echo path timer upon detection of an input signal, Sin, having a spectrum signature that matches the simulated echo signal spectrum signature calculated for the test signal may then be assigned as the value of the minimum echo path delay. Upon detection of a voice signal supplied on the receive path, AEC 540 may allow the minimum echo path delay EPD_Min to elapse prior to applying acoustic echo suppression or cancellation on the send path from terminal 531 to 521.
The minimum echo path delay calculation subroutine is invoked (step 602), and a variable, i, may be initialized (step 604). A first processing entity(i) in the echo path may then be identified (step 606), and a processing delay, Delay(i), associated therewith may then be read or otherwise obtained by AEC 540 (step 608). For example, for a particular call, an encoder, such as encoder 550 depicted in
After obtaining the processing delay, Delay(i), associated with the processing entity(i), the variable i may then be incremented (step 610), and the echo path delay calculation subroutine may then proceed to evaluate whether an additional processing entity(i) is included in the echo path (step 612). In the event that another processing entity(i) is included in the echo path, the echo path delay calculation subroutine may return to step 608 to obtain the processing delay, Delay(i), associated with the processing entity(i). When the processing delays have been read for each respective processing entity in the echo path, the minimum echo path delay calculation subroutine may then proceed to calculate the minimum echo path delay, EPD_Min, by summing each processing delay for the processing entities included in the echo path (step 614). The echo path delay calculation subroutine cycle may then end (step 616).
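The loop of steps 604 through 614 can be sketched as follows. This is a minimal illustration only: the `ProcessingEntity` type, the entity names, and the delay values are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ProcessingEntity:
    """Hypothetical record for one processing entity in the echo path."""
    name: str
    delay_ms: float


def minimum_echo_path_delay(entities: Iterable[ProcessingEntity]) -> float:
    """Sum Delay(i) over every processing entity(i) in the echo path."""
    epd_min_ms = 0.0
    for entity in entities:  # iterate entity(i); the index i is implicit
        if entity.delay_ms < 0:
            raise ValueError(f"negative delay for {entity.name}")
        epd_min_ms += entity.delay_ms  # accumulate Delay(i)
    return epd_min_ms


# Illustrative values only (D1, D2, D3 for a hypothetical call).
echo_path = [
    ProcessingEntity("encoder", 20.0),
    ProcessingEntity("jitter_buffer", 60.0),
    ProcessingEntity("decoder", 5.0),
]
epd_min_ms = minimum_echo_path_delay(echo_path)
print(epd_min_ms)  # 85.0
```

Summing over a list rather than three fixed terms lets the same routine cover call configurations in which the echo path includes more or fewer processing entities.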
The echo path delay calculation subroutine is invoked (step 702), and AEC 540, or another processing entity or service of media gateway 510, receives an RTCP receiver report from terminal 531 (step 704). The time, tr, at which the receiver report is received is recorded (step 706), and the subroutine then subtracts the last sender report timestamp LSR from the receiver report receipt time tr (step 708). The total round-trip time between media gateway 510 and terminal 531 may then be calculated by subtracting the delay since last sender report (DLSR) from the difference between the receiver report receipt time, tr, and the LSR (step 710). The minimum echo path delay may then be set to the round-trip time (step 712). The echo path delay calculation subroutine cycle may then end (step 714).
The minimum echo path delay calculation subroutine of
The minimum echo path delay calculation subroutine is invoked (step 802), and a spectrum signature of a test signal is obtained (step 804). The spectrum signature may be obtained from AEC 540, memory 518, another storage device or may be calculated by AEC 540 as described more fully hereinbelow with reference to
When a signature match is identified at step 814 thus indicating an acoustic echo of the test signal has been received at media gateway 510, the minimum echo path delay EPD_Min may be set to the difference between the time, Time2, at which the block having the spectrum signature matching the test signal spectrum signature was received and the time, Time1, at which the test signal was supplied to the echo path (step 816). The echo path delay calculation subroutine cycle may then end (step 818).
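The timer-based measurement can be sketched as below. The disclosure does not specify how two spectrum signatures are compared, so the unit-normalized magnitude spectrum, the cosine-similarity test, and the 0.95 threshold are assumptions, as are all function names and the synthetic frames.

```python
import cmath
import math
from typing import Iterable, List, Optional, Sequence, Tuple


def spectrum_signature(frame: Sequence[float]) -> List[float]:
    """Unit-normalized DFT magnitude spectrum of one frame (assumed signature)."""
    n = len(frame)
    mags = [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]
    norm = math.sqrt(sum(m * m for m in mags))
    return [m / norm for m in mags] if norm > 0 else mags


def signatures_match(a: Sequence[float], b: Sequence[float], threshold: float) -> bool:
    """Cosine similarity of two unit-normalized signatures against a threshold."""
    return sum(x * y for x, y in zip(a, b)) >= threshold


def measure_epd_min(
    reference: Sequence[float],
    time1: float,
    send_frames: Iterable[Tuple[float, Sequence[float]]],
    threshold: float = 0.95,
) -> Optional[float]:
    """Return Time2 - Time1 for the first matching send path frame, else None."""
    for time2, frame in send_frames:
        if signatures_match(spectrum_signature(frame), reference, threshold):
            return time2 - time1
    return None


# Synthetic frames: silence, an unrelated tone, then an attenuated echo.
n = 32
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
other = [math.sin(2 * math.pi * 9 * t / n) for t in range(n)]
echo = [0.05 * s for s in tone]
frames = [(0.04, [0.0] * n), (0.08, other), (0.12, echo)]
epd_min = measure_epd_min(spectrum_signature(tone), 0.0, frames)
```

Normalizing each signature makes the comparison insensitive to the level of the returned echo, which matters because the acoustic reflection arrives attenuated.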
The test signal spectrum signature calculation subroutine is invoked (step 902), and a test signal is obtained by the spectrum signature calculation subroutine (step 904). The test signal may be maintained by AEC 540, by memory 518, or another suitable storage device with which AEC 540 is interfaced. The codec used in the echo path is identified (step 906), and the test signal is encoded therewith (step 908). The test signal is then decoded with the codec (step 910), and the decoded test signal may then be attenuated (step 912), e.g., by 25 dB, to simulate an acoustic reflection of the test signal that may be coupled between the loudspeaker and microphone of a terminal deployed in system 500. The attenuated test signal may again be encoded with the codec (step 914) to simulate encoding of the acoustic echo that may occur at a terminal, such as terminal 531, in the echo path. The encoded and attenuated test signal may then be decoded to simulate decoding of the echo by the media gateway (step 916). A frequency domain analysis may then be performed on the decoded signal obtained at step 916 to obtain a spectrum signature of a simulated echo of the test signal (step 918). The spectrum signature of the test signal may then be stored, e.g., in AEC 540, memory 518, or another suitable storage device (step 920). The test signal spectrum signature may be stored in association with an identifier of the codec used for coding and decoding at steps 908, 910, 914, and 916. The spectrum signature calculation subroutine cycle may then end (step 922).
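The sequence of steps 908 through 920 can be sketched as a small pipeline. The 8-bit uniform quantizer below is a toy stand-in for the codec used in the echo path, not a real codec, and the plain DFT magnitude spectrum is an assumed form of the frequency domain analysis. All names, including the `toy-8bit` identifier, are hypothetical.

```python
import cmath
import math
from typing import Callable, Dict, List, Sequence


def toy_encode(samples: Sequence[float]) -> List[int]:
    """Toy codec (assumption): clamp and quantize to signed 8-bit levels."""
    return [max(-127, min(127, round(s * 127))) for s in samples]


def toy_decode(codes: Sequence[int]) -> List[float]:
    return [c / 127 for c in codes]


def attenuate(samples: Sequence[float], db: float) -> List[float]:
    """Scale amplitude down by `db` decibels."""
    gain = 10 ** (-db / 20)
    return [s * gain for s in samples]


def magnitude_spectrum(frame: Sequence[float]) -> List[float]:
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def simulated_echo_signature(
    test_signal: Sequence[float],
    encode: Callable[[Sequence[float]], List[int]],
    decode: Callable[[Sequence[int]], List[float]],
    attenuation_db: float = 25.0,
) -> List[float]:
    decoded = decode(encode(test_signal))            # steps 908 and 910
    reflected = attenuate(decoded, attenuation_db)   # step 912
    echo = decode(encode(reflected))                 # steps 914 and 916
    return magnitude_spectrum(echo)                  # step 918


# Step 920: store the signature keyed by an identifier of the codec.
signature_store: Dict[str, List[float]] = {}
test_signal = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
signature_store["toy-8bit"] = simulated_echo_signature(test_signal, toy_encode, toy_decode)
```

Passing the encoder and decoder as callables keeps the pipeline independent of any particular codec, which mirrors the storing of each signature in association with a codec identifier.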
The processing steps of
The minimum echo path delay calculation subroutine described with reference to
An AEC may use the minimum echo path delay calculations and double talk detection mechanisms described herein in various ways. For example, in response to detection of a receive path voice signal, an AEC may delay acoustic echo suppression until expiration of the echo path timer. Additionally, an interval that defines an estimated range of the echo path delay, and that includes the calculated minimum echo path delay, may be determined. In accordance with an embodiment, the estimated range of echo path delay may be specified according to the following:
EROEPD = (EPD_Min, EPD_Max) (eq. 3)
where EPD_Min is calculated by any of the techniques described above that account for network transmission characteristics. The maximum echo path delay (EPD_Max) value may be assigned a value that specifies an interval of the echo path delay that may be allowed to expire in addition to the minimum echo path delay prior to acoustic echo suppression being applied in a send path after detection of a voice signal in the receive path. For instance, the EPD_Max value may be pre-defined and dependent on the particular mechanism used for calculating the minimum echo path delay, particular network configuration, or based on other network metrics or latency estimates. Those skilled in the art will recognize numerous other applications for improving echo control in a network system by exploiting the minimum echo path delay calculation techniques described herein.
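One way to represent the estimated range in code is sketched below. It assumes, as one reading of the paragraph above, that the upper bound is formed by adding a pre-defined interval to EPD_Min, and that suppression may begin once EPD_Min has elapsed after detection of receive path voice. The type and function names and the millisecond values are hypothetical.

```python
from typing import NamedTuple


class EchoDelayRange(NamedTuple):
    """Estimated range of echo path delay, (EPD_Min, EPD_Max), in milliseconds."""
    epd_min_ms: float
    epd_max_ms: float


def estimated_range(epd_min_ms: float, extra_interval_ms: float) -> EchoDelayRange:
    """Build the range, assuming EPD_Max = EPD_Min + a pre-defined interval."""
    if epd_min_ms < 0 or extra_interval_ms < 0:
        raise ValueError("delays must be non-negative")
    return EchoDelayRange(epd_min_ms, epd_min_ms + extra_interval_ms)


def suppression_permitted(elapsed_ms: float, delay_range: EchoDelayRange) -> bool:
    """True once EPD_Min has elapsed since voice was detected on the receive path."""
    return elapsed_ms >= delay_range.epd_min_ms


# Illustrative values only.
delay_range = estimated_range(85.0, 40.0)
print(delay_range)                               # EchoDelayRange(epd_min_ms=85.0, epd_max_ms=125.0)
print(suppression_permitted(50.0, delay_range))  # False
print(suppression_permitted(85.0, delay_range))  # True
```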
As described, various mechanisms are provided for calculating a minimum echo path delay. The mechanisms described herein provide for an accurate calculation of a minimum echo path delay based on transmission characteristics of the echo path. Moreover, the minimum echo path delay calculations may be periodically repeated, thus providing for dynamic calculation of the minimum echo path delay in a manner that accounts for variations in echo path characteristics. Minimum echo path delay calculations described herein may be utilized by network-based echo controllers to provide enhanced echo suppression or cancellation. Voice signal distortion and echo pass-through, commonly encountered due to errors in double-talk detection resulting from erroneous echo path delay estimates, may be significantly reduced or eliminated by exploiting the accurate calculation of the minimum echo path delay, and thus call quality may be significantly enhanced.
The network and device examples depicted in
The flowcharts of
Aspects of the present invention may be implemented in software, hardware, firmware, or a combination thereof. The various elements of the system, either individually or in combination, may be implemented as a computer program product tangibly embodied in a machine-readable storage device for execution by a processing unit. Various steps of embodiments of the invention may be performed by a computer processor executing a program tangibly embodied on a computer-readable medium to perform functions by operating on input and generating output. The computer-readable medium may be, for example, a memory, a transportable medium such as a compact disk, a floppy disk, or a diskette, such that a computer program embodying the aspects of the present invention can be loaded onto a computer. The computer program is not limited to any particular embodiment, and may, for example, be implemented in an operating system, application program, foreground or background process, driver, network stack, or any combination thereof, executing on a single computer processor or multiple computer processors or another instruction execution apparatus. Additionally, various steps of embodiments of the invention may provide one or more data structures generated, produced, received, or otherwise implemented on a computer-readable medium, such as a memory.
Although embodiments of the present disclosure have been described in detail, those skilled in the art should understand that they may make various changes, substitutions and alterations herein without departing from the spirit and scope of the present disclosure. Accordingly, all such changes, substitutions and alterations are intended to be included within the scope of the present disclosure as defined in the following claims.