Method and System For Echo Estimation and Cancellation

FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

[Not Applicable]

MICROFICHE/COPYRIGHT REFERENCE

[Not Applicable]

FIELD OF THE INVENTION

Certain embodiments of the invention relate to processing of audio signals. More specifically, certain embodiments of the invention relate to a method and system for echo estimation and cancellation.

BACKGROUND OF THE INVENTION

In audio applications, systems that provide audio interface and processing capabilities may be required to support duplex operations, which may comprise the ability to collect audio information through a sensor, microphone, or other type of input device while at the same time being able to drive a speaker, earpiece, or other type of output device with a processed audio signal. In order to carry out these operations, these systems may utilize audio coding and decoding (codec) devices that provide appropriate gain, filtering, and/or analog-to-digital conversion in the uplink direction to circuitry and/or software that provides audio processing and may also provide appropriate gain, filtering, and/or digital-to-analog conversion in the downlink direction to the output devices.

As audio applications expand, such as new voice and/or audio compression techniques and formats, for example, and as they become embedded into wireless systems, such as mobile phones, for example, novel codec devices may be needed that may provide appropriate processing capabilities to handle the wide range of audio signals and audio signal sources. In this regard, added functionalities and/or capabilities may also be needed to provide users with the flexibilities that new communication and multimedia technologies provide. Moreover, these added functionalities and/or capabilities may need to be implemented in an efficient and flexible manner given the complexity in operational requirements, communication technologies, and the wide range of audio signal sources that may be supported by mobile phones.

The audio inputs to mobile phones may come from a variety of sources, at a number of different sampling rates and audio quality levels. Polyphonic ringers, voice, and high quality audio, such as music, are sources that are typically processed in a mobile phone system. The different quality of each audio source places different requirements on the processing circuitry, thus dictating flexibility in the audio processing systems.

Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of such systems with the present invention as set forth in the remainder of the present application with reference to the drawings.

BRIEF SUMMARY OF THE INVENTION

A system and/or method for echo estimation and cancellation, substantially as shown in and/or described in connection with at least one of the figures, as set forth more completely in the claims.

Various advantages, aspects and novel features of the present invention, as well as details of an illustrated embodiment thereof, will be more fully understood from the following description and drawings.

BRIEF DESCRIPTION OF SEVERAL VIEWS OF THE DRAWINGS

FIG. 1 is a module diagram of an exemplary wireless system, which may be utilized in accordance with an embodiment of the invention.

FIG. 2 is a module diagram illustrating an exemplary audio CODEC interconnection, in accordance with an embodiment of the invention.

FIG. 3 is a module diagram of an exemplary audio system, in accordance with an embodiment of the invention.

FIG. 4 is a flow diagram illustrating exemplary steps in echo estimation and cancellation, in accordance with an embodiment of the invention.

FIG. 5 is a diagram illustrating exemplary echo and uplink signal estimation utilizing subband non-linear processing, in accordance with an embodiment of the invention.

FIG. 6 is a diagram illustrating exemplary non-linear processing gain calculation, in accordance with an embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION

Certain aspects of the invention may be found in a method and system for echo estimation and cancellation. Exemplary aspects of the invention may comprise one or more circuits and/or processors in a wireless device that is operable to communicate downlink (DL) and uplink (UL) signals. The one or more circuits and/or processors may be operable to estimate combined echo return loss and echo return loss enhancement (ERL+ERLE). Non-linear processing, subband analysis, and the estimated ERL+ERLE values may be utilized to calculate a subband gain vector for mitigating residual echo. The ERL+ERLE may be estimated by averaging a difference in the DL and UL signals. A maximum value of the difference in one or more subbands may be determined over a period of time. A non-linear distortion adjustment factor may be estimated for the combined echo return loss and echo return loss enhancement (ERL+ERLE). An estimation error for the combined echo return loss and echo return loss enhancement (ERL+ERLE) may be calibrated specific to the wireless device. The estimating of the combined echo return loss and echo return loss enhancement (ERL+ERLE) may be suspended for a period of time after a transition in the DL or UL signals. Comfort noise may be added to the UL signal to mask the residual echo. The residual echo may be mitigated following a dual echo canceller. The estimating of the combined echo return loss and echo return loss enhancement (ERL+ERLE) may be suspended when the DL signal is not present. A noise level may be included in the gain value calculation.
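As a non-limiting illustration of the estimation summarized above, the following sketch keeps a running average of the DL-minus-UL level difference in each subband and suspends the update while no DL signal is present. The function names, the exponential form of the average, the smoothing factor alpha, and the DL activity threshold dl_active_db are assumptions made for this example only and are not taken from the figures.

```python
import math


def level_db(power, floor=1e-12):
    """Convert a linear power value to decibels, guarding against log(0)."""
    return 10.0 * math.log10(max(power, floor))


def update_erl_erle(estimate_db, dl_power, ul_power,
                    dl_active_db=-50.0, alpha=0.1):
    """Return an updated per-subband ERL+ERLE estimate in dB.

    estimate_db : current estimates, one per subband
    dl_power    : downlink power per subband (linear)
    ul_power    : uplink power per subband after echo cancellation (linear)
    dl_active_db: hypothetical threshold below which DL is treated as absent
    alpha       : hypothetical smoothing factor for the running average
    """
    updated = []
    for est, dl, ul in zip(estimate_db, dl_power, ul_power):
        dl_db = level_db(dl)
        if dl_db < dl_active_db:
            # No downlink signal: suspend estimation and keep the old value.
            updated.append(est)
            continue
        diff_db = dl_db - level_db(ul)
        # Running average of the DL-minus-UL level difference.
        updated.append((1.0 - alpha) * est + alpha * diff_db)
    return updated
```

In this sketch, a subband whose DL level falls below the threshold simply retains its previous estimate, which corresponds to suspending the estimation when the DL signal is not present.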

FIG. 1 is a module diagram of an exemplary wireless system, which may be utilized in accordance with an embodiment of the invention. Referring to FIG. 1, the wireless device 150 may comprise an antenna 151, a transceiver 152, a baseband processor 154, a processor 156, a system memory 158, a logic module 160, a Bluetooth radio/processor 162, a CODEC 164, an external headset port 166, an analog microphone 168, stereo speakers 170, a Bluetooth headset 172, a hearing aid compatible (HAC) coil 174, a dual digital microphone 176, and a vibration transducer 178. The antenna 151 may be used for reception and/or transmission of RF signals.

The transceiver 152 may comprise suitable logic, circuitry, interfaces, and/or code that may be enabled to modulate and upconvert baseband signals to RF signals for transmission by one or more antennas, which may be represented generically by the antenna 151. The transceiver 152 may also be enabled to downconvert and demodulate received RF signals to baseband signals. The RF signals may be received by one or more antennas, which may be represented generically by the antenna 151. Different wireless systems may use different antennas for transmission and reception. The transceiver 152 may be enabled to execute other functions, for example, filtering the baseband and/or RF signals, and/or amplifying the baseband and/or RF signals. Although a single transceiver 152 is shown, the invention is not so limited. Accordingly, the transceiver 152 may be implemented as a separate transmitter and a separate receiver. In addition, there may be a plurality of transceivers, transmitters and/or receivers. In this regard, the plurality of transceivers, transmitters and/or receivers may enable the wireless device 150 to handle a plurality of wireless protocols and/or standards including cellular, WLAN and PAN.

The baseband processor 154 may comprise suitable logic, circuitry, interfaces, and/or code that may be enabled to process baseband signals for transmission via the transceiver 152 and/or the baseband signals received from the transceiver 152. The processor 156 may be any suitable processor or controller such as a CPU, DSP, ARM, or any type of integrated circuit processor. The processor 156 may comprise suitable logic, circuitry, and/or code that may be enabled to control the operations of the transceiver 152 and/or the baseband processor 154. For example, the processor 156 may be utilized to update and/or modify programmable parameters and/or values in a plurality of components, devices, and/or processing elements in the transceiver 152 and/or the baseband processor 154. At least a portion of the programmable parameters may be stored in the system memory 158.

Control and/or data information, which may comprise the programmable parameters, may be transferred from other portions of the wireless device 150, not shown in FIG. 1, to the processor 156. Similarly, the processor 156 may be enabled to transfer control and/or data information, which may include the programmable parameters, to other portions of the wireless device 150, not shown in FIG. 1, which may be part of the wireless device 150.

The processor 156 may utilize the received control and/or data information, which may comprise the programmable parameters, to determine an operating mode of the transceiver 152. For example, the processor 156 may be utilized to select a specific frequency for a local oscillator, a specific gain for a variable gain amplifier, configure the local oscillator and/or configure the variable gain amplifier for operation in accordance with various embodiments of the invention. Moreover, the specific frequency selected and/or parameters needed to calculate the specific frequency, and/or the specific gain value and/or the parameters, which may be utilized to calculate the specific gain, may be stored in the system memory 158 via the processor 156, for example. The information stored in system memory 158 may be transferred to the transceiver 152 from the system memory 158 via the processor 156.

The system memory 158 may comprise suitable logic, circuitry, and/or code that may be enabled to store a plurality of control and/or data information, including parameters needed to calculate frequencies and/or gain, and/or the frequency value and/or gain value. The system memory 158 may store at least a portion of the programmable parameters that may be manipulated by the processor 156.

The logic module 160 may comprise suitable logic, circuitry, and/or code that may enable controlling of various functionalities of the wireless device 150. For example, the logic module 160 may comprise one or more state machines that may generate signals to control the transceiver 152 and/or the baseband processor 154. The logic module 160 may also comprise registers that may hold data for controlling, for example, the transceiver 152 and/or the baseband processor 154. The logic module 160 may also generate and/or store status information that may be read by, for example, the processor 156. Amplifier gains and/or filtering characteristics, for example, may be controlled by the logic module 160.

The BT radio/processor 162 may comprise suitable circuitry, logic, and/or code that may enable transmission and reception of Bluetooth signals. The BT radio/processor 162 may enable processing and/or handling of BT baseband signals. In this regard, the BT radio/processor 162 may process or handle BT signals received and/or BT signals transmitted via a wireless communication medium. The BT radio/processor 162 may also provide control and/or feedback information to/from the baseband processor 154 and/or the processor 156, based on information from the processed BT signals. The BT radio/processor 162 may communicate information and/or data from the processed BT signals to the processor 156 and/or to the system memory 158. Moreover, BT radio/processor 162 may receive information from the processor 156 and/or the system memory 158, which may be processed and transmitted via the wireless communication medium.

The CODEC 164 may comprise suitable circuitry, logic, interfaces, and/or code that may process audio signals received from and/or communicated to input/output devices. The input/output devices may be within or communicatively coupled to the wireless device 150, and may comprise the analog microphone 168, the stereo speakers 170, the Bluetooth headset 172, the hearing aid compatible (HAC) coil 174, the dual digital microphone 176, and the vibration transducer 178, for example. The CODEC 164 may be operable to up-convert and/or down-convert signal frequencies to desired frequencies for processing and/or transmission via an output device. The CODEC 164 may enable utilizing a plurality of digital audio inputs, such as 16 or 18-bit inputs, for example. The CODEC 164 may also enable utilizing a plurality of data sampling rate inputs. For example, the CODEC 164 may accept digital audio signals at sampling rates such as 8 kHz, 11.025 kHz, 12 kHz, 16 kHz, 22.05 kHz, 24 kHz, 32 kHz, 44.1 kHz, and/or 48 kHz. The CODEC 164 may also support mixing of a plurality of audio sources. For example, the CODEC 164 may support audio sources such as general audio, polyphonic ringer, I2S FM audio, vibration driving signals, and voice. In this regard, the general audio and polyphonic ringer sources may support the plurality of sampling rates that the audio CODEC 164 is enabled to accept, while the voice source may support a portion of the plurality of sampling rates, such as 8 kHz and 16 kHz, for example.

The audio CODEC 164 may utilize a programmable infinite impulse response (IIR) filter and/or a programmable finite impulse response (FIR) filter for at least a portion of the audio sources to compensate for passband amplitude and phase fluctuation for different output devices. In this regard, filter coefficients may be configured or programmed dynamically based on current operations. Moreover, filter coefficients may be switched in one-shot or may be switched sequentially, for example. The CODEC 164 may also utilize a modulator, such as a Delta-Sigma (Δ-Σ) modulator, for example, to code digital output signals for analog processing.

The external headset port 166 may comprise a physical connection for an external headset to be communicatively coupled to the wireless device 150. The analog microphone 168 may comprise suitable circuitry, logic, and/or code that may detect sound waves and convert them to electrical signals via a piezoelectric effect, for example. The electrical signals generated by the analog microphone 168 may comprise analog signals that may require analog to digital conversion before processing.

The stereo speakers 170 may comprise a pair of speakers that may be operable to generate audio signals from electrical signals received from the CODEC 164. The Bluetooth headset 172 may comprise a wireless headset that may be communicatively coupled to the wireless device 150 via the Bluetooth radio/processor 162. In this manner, the wireless device 150 may be operated in a hands-free mode, for example.

The HAC coil 174 may comprise suitable circuitry, logic, interfaces, and/or code that may enable communication between the wireless device 150 and a T-coil in a hearing aid, for example. In this manner, electrical audio signals may be communicated to a user that utilizes a hearing aid, without the need for generating sound signals via a speaker, such as the stereo speakers 170, and converting the generated sound signals back to electrical signals in a hearing aid, and subsequently back into amplified sound signals in the user's ear, for example.

The dual digital microphone 176 may comprise suitable circuitry, interfaces, logic, and/or code that may be operable to detect sound waves and convert them to electrical signals. The electrical signals generated by the dual digital microphone 176 may comprise digital signals, and thus may not require analog to digital conversion prior to digital processing in the CODEC 164. The dual digital microphone 176 may enable beamforming capabilities, for example.

The vibration transducer 178 may comprise suitable circuitry, logic, interfaces, and/or code that may enable notification of an incoming call, alerts and/or message to the wireless device 150 without the use of sound. The vibration transducer may generate vibrations that may be in synch with, for example, audio signals such as speech or music.

In operation, control and/or data information, which may comprise the programmable parameters, may be transferred from other portions of the wireless device 150, not shown in FIG. 1, to the processor 156. Similarly, the processor 156 may be enabled to transfer control and/or data information, which may include the programmable parameters, to other portions of the wireless device 150, not shown in FIG. 1, which may be part of the wireless device 150.

The CODEC 164 in the wireless device 150 may communicate with the processor 156 in order to transfer audio data and control signals. Control registers for the CODEC 164 may reside within the processor 156. The processor 156 may exchange audio signals and control information via the system memory 158. The CODEC 164 may up-convert and/or down-convert the frequencies of multiple audio sources for processing at a desired sampling rate.

The wireless device 150 may comprise echo estimation and cancellation capability, and may utilize a dual echo cancellation (ECAN) module within the CODEC 164. Residual echo may be generated by audio signals from a wireless device speaker looping back to the source via the wireless device microphone. Residual echo at the outputs of the ECAN may be suppressed utilizing subband non-linear processing (NLP) in the CODEC 164.

Residual echo may be mitigated by calculating a subband gain vector utilizing non-linear processing, subband analysis, and measured echo return loss and echo return loss enhancement values. By estimating and cancelling echo utilizing the CODEC 164, the audio performance of the wireless device 150 may be improved. The subband gain vector may comprise gain values for each of a plurality of subbands, such that gain levels may be low in instances where echo is estimated to be present, and may be higher in instances where no echo is estimated to be present.

FIG. 2 is a module diagram illustrating an exemplary audio CODEC interconnection, in accordance with an embodiment of the invention. Referring to FIG. 2, there is shown a CODEC 201, a digital signal processor (DSP) 203, a memory 205, a processor 207, and an audio I/O devices module 209. There are also shown input and output signals for the digital audio processing module 211 comprising an I²S FM audio signal, control signals 219, voice/audio signal 221, a multi-band SSI signal 223, a mixed audio signal 225, a vibration driving signal 227, and a voice/music/ringtone data signal 229. The memory 205 may be substantially similar to the system memory 158. In another embodiment of the invention, the memory 205 may comprise a separate memory from the system memory 158.

The CODEC 201 may be substantially similar to the CODEC 164 described with respect to FIG. 1, and may comprise a digital audio processing module 211, an analog audio processing module 213, and a clock 215. The digital audio processing module 211 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to process received digital audio signals for subsequent storage and/or communication to an output device. The digital audio processing module 211 may comprise digital filters, such as decimation and infinite impulse response (IIR) filters, for example. The analog audio processing module 213 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to process received analog signals for communication to the audio I/O devices module 209 and/or the digital audio processing module 211. The analog audio processing module 213 may enable conversion of analog signals to digital signals and may filter received signals before processing, for example. In addition, the analog audio processing module 213 may provide amplification of received audio signals.

The clock 215 may comprise suitable circuitry, logic, interfaces, and/or code that may generate a common clock signal that may be utilized by the DSP 203, the processor 207, the digital audio processing module 211, and the analog audio processing module 213. In this manner, the synchronization of multiple audio signals during processing, transmission, and/or playback may be enabled.

The DSP 203 may comprise suitable circuitry, logic, interfaces, and/or code that may process signals received from the digital audio processing module 211 and/or retrieved from the memory 205. The DSP 203 may also store processed data in the memory 205 or communicate processed data to the digital audio processing module 211. In an embodiment of the invention, the DSP 203 may be integrated on-chip with the CODEC 201. The dual EC 110 may be as described with respect to FIG. 1, and may be implemented in the DSP 203 and/or the processor 207.

The processor 207 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to perform routine processor functions with, for example, minimal power requirements. In one embodiment of the invention, the processor 207 may comprise an advanced RISC machine processor. Notwithstanding, the invention is not so limited, and other types of processors may be utilized. The processor 207 may be communicatively coupled with the memory 205, and may be operable to store data on and/or retrieve data from the memory 205. The processor 207 may also be operable to communicate data and/or control information between the DSP 203 and/or memory 205 to enable additional signal processing tasks by the DSP 203. For example, the processor 207 may communicate with the DSP 203 to enable signal processing of audio signals.

In operation, the CODEC 201 may communicate with the DSP 203 in order to transfer audio data and control signals, with the exception of FM radio listening and recording, where digital FM samples may be read from an I2S directly off a Bluetooth FM receiver, such as the Bluetooth radio/processor 162 described with respect to FIG. 1. Control registers for the CODEC 201 may, for example, reside in the DSP 203. For voice data, audio samples may not be buffered between the DSP 203 and the CODEC 201. For music and ring-tone, audio data from the DSP 203 may be written into a FIFO, for example, within the CODEC 201 which may then fetch the data samples. A similar method may be utilized for the high quality audio 221, which may sample at 48 kHz, for example. Audio data passing between the DSP 203 and the CODEC 201 may be accomplished via interrupts. These interrupts may comprise interrupts for voice/music/ring-tone data 229, the mixed audio signal 225 at 44.1 kHz/48 kHz for Bluetooth/USB, high quality audio 221 at 48 kHz, and for the vibration driving signal 227. Interrupts may be shared between different inputs and outputs.

The audio sample data for the voice/music/ringtone data 229 in the audio receive path and the high quality audio 221 in the audio transmit path may comprise 18-bit width per sample, for example. In instances where 16-bit audio data may be present, the same 18-bit format may be used, with the two least significant bits (LSBs) zeroed, for example.
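By way of example only, placing a 16-bit sample in the 18-bit format with the two LSBs zeroed may be sketched as a left shift by two bits. The function names pcm16_to_18bit and to_18bit_word are hypothetical, and the assumption that the 16-bit sample is left-aligned within the 18-bit word follows from the zeroed LSBs rather than from any register description.

```python
def pcm16_to_18bit(sample):
    """Place a signed 16-bit PCM sample in an 18-bit field, two LSBs zeroed."""
    if not -32768 <= sample <= 32767:
        raise ValueError("sample is outside the signed 16-bit range")
    # Shifting left by two multiplies by 4 and leaves bits 0 and 1 clear.
    return sample << 2


def to_18bit_word(value):
    """Return the 18-bit two's-complement bit pattern of a signed value."""
    return value & 0x3FFFF
```

The shift preserves the sign and relative amplitude of the sample, so the same downstream processing may handle both 16-bit and 18-bit sources.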

In an embodiment of the invention, the DSP 203 and the processor 207 may exchange audio data and control information via a shared memory, for example, memory 205. The processor 207 may write pulse-code modulated (PCM) audio directly into the memory 205, and may also pass coded audio data to the DSP 203 for computationally intensive processing. In this instance, the DSP 203 may decode the data and write the PCM audio back into the memory 205 for the processor 207 to access or to be delivered to the CODEC 201. The processor 207 may communicate with the CODEC 201 via the DSP 203.

In an exemplary embodiment of the invention, the CODEC 201 may be operable to estimate and cancel echo in audio signals. Subband nonlinear processing may be utilized to estimate residual echo at the outputs of an echo canceller in the CODEC 201. Although downlink (DL) and uplink (UL) signals may overlap in the time domain during double talk, where both wireless device users in a conversation are speaking at the same time, it is not as likely that the signals overlap completely in the frequency domain. In this manner, subbands within the frequency domain may be assessed for echo and noise, and attenuated appropriately.
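The observation that DL and UL signals may overlap in time without fully overlapping in frequency may be illustrated with a simple per-subband power measurement. The following sketch assumes equal-width subbands and uses a direct discrete Fourier transform for clarity; the function name subband_powers and the choice of transform are assumptions for illustration, and an actual embodiment may use a filter bank or other subband analysis.

```python
import cmath
import math


def subband_powers(frame, num_bands):
    """Split one frame into equal-width subbands and return power per band.

    A plain DFT is used for readability. Bins left over when the number of
    bins is not a multiple of num_bands are ignored in this sketch.
    """
    n = len(frame)
    half = n // 2  # bins 0 .. half-1 span 0 Hz up to just below Nyquist
    bin_power = []
    for k in range(half):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        bin_power.append(abs(acc) ** 2 / n)
    width = half // num_bands
    return [sum(bin_power[b * width:(b + 1) * width])
            for b in range(num_bands)]
```

For two frames captured over the same time interval, one dominated by low-frequency content and the other by higher-frequency content, the largest entries of the two returned lists fall in different subbands, so each subband may be assessed separately.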

Residual echo may be mitigated by calculating a subband gain vector utilizing non-linear processing, subband analysis, and measured echo return loss and echo return loss enhancement values. By estimating and cancelling echo utilizing the CODEC 164, the audio performance of the wireless device 150 may be improved. The subband gain vector may comprise gain values for each of a plurality of subbands, such that gain levels may be low in instances where echo is estimated to be present in a particular subband, and may be higher in instances where no echo is estimated to be present in a subband.
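One possible form of such a subband gain vector is sketched below, under stated assumptions. The expected residual echo level in each subband is taken as the DL level minus the estimated ERL+ERLE, and a subband receives a low gain when the UL level does not exceed that expectation by more than a margin. The two-level gain decision and the values margin_db, min_gain, and max_gain are hypothetical simplifications and do not represent the gain calculation of FIG. 6.

```python
def subband_gains(dl_db, ul_db, erl_erle_db,
                  margin_db=6.0, min_gain=0.1, max_gain=1.0):
    """Return one gain per subband: low where residual echo is likely.

    dl_db, ul_db : per-subband downlink and uplink levels in dB
    erl_erle_db  : per-subband combined ERL+ERLE estimate in dB
    margin_db, min_gain, max_gain : hypothetical tuning values
    """
    gains = []
    for dl, ul, loss in zip(dl_db, ul_db, erl_erle_db):
        # Level at which the DL signal is expected to reappear in the UL.
        expected_echo_db = dl - loss
        if ul <= expected_echo_db + margin_db:
            # UL energy is explained by echo alone: attenuate this subband.
            gains.append(min_gain)
        else:
            # UL energy exceeds the expected echo: likely near-end speech.
            gains.append(max_gain)
    return gains
```

A practical implementation may replace the two-level decision with a smoothly varying gain and may include a noise level in the calculation.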

FIG. 3 is a module diagram of an exemplary audio system architecture in accordance with an embodiment of the invention. Referring to FIG. 3, there is shown an audio system architecture 300 comprising a speech decoder 301, a DC remover module 303, a downlink dynamic range controller (DL DRC) 305, a speech encoder 307, a mute control 309, an uplink dynamic range controller (UL DRC) 311, and a synthesis/filter module 313. FIG. 3 also shows a subband non-linear processor (NLP) 315, a noise suppressor/comfort noise generator (NS/CNG) 317, a DL subband analysis module 319, an UL subband analysis module 321, a dual echo canceller (EC) 323, a summer 325, a side tone expander 327, a side tone filter/gain module 329, a DC remover 331, and switches 333A and 333B. Additionally, FIG. 3 shows Bluetooth (BT) filters 335A and 335B, an Rx CODEC 337, an Rx filter 339, an Rx PGA/processing module 341, a Tx CODEC 343, a Tx filter 345, a Tx PGA/processing module 347, a BT Tx 349, a BT Rx 351, a speaker 353, and a microphone 355. There is also shown a noise level signal N(n), a DL level signal R(n), and UL signals S(n) and S_o(n).

The speech decoder 301 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to decode a received speech signal and generate an output signal that may be further processed and played back by an output device, such as the speaker 353, for example.

The DC remover 303 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to remove the DC portion of a received signal from the speech decoder 301. The DL DRC 305 may comprise suitable circuitry, interfaces, logic, and/or code that may be operable to control the dynamic range of a received audio signal. In this manner, distortion may be reduced at high volume situations, such as when a cell phone user may utilize a speaker phone mode with a high volume setting, for example.
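As a non-limiting example of DC removal, a first-order DC-blocking filter may be sketched as follows. The filter structure, the function name remove_dc, and the pole value are assumptions chosen for illustration; the DC remover 303 is not limited to this form.

```python
def remove_dc(samples, pole=0.995):
    """First-order DC blocker: y[n] = x[n] - x[n-1] + pole * y[n-1].

    pole is a hypothetical value; values closer to 1.0 give a narrower
    notch at 0 Hz but a longer settling time.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + pole * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out
```

With a constant input, the output of this sketch decays toward zero, which shows that the DC portion of the signal is removed while changes in the signal are passed.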

The speech encoder 307 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to encode a received speech signal for subsequent processing and transmission, for example. The received signal may be generated by an input device, such as the microphone 355, for example.

The mute control module 309 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to mute a received audio signal. In this manner, a wireless device, such as a mobile phone, may play back a received audio signal via a speaker, but not transmit another received signal, such as from a microphone.

The UL DRC 311 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to control the dynamic range of a received audio signal. In this manner, distortion may be reduced at high volume situations, such as when a cell phone user may utilize a speaker phone mode with a high volume setting, or be in a high noise environment, for example.
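Dynamic range control may, for example, be realized as a static compression curve applied to the signal level. The following sketch computes the gain for a hard-knee compressor; the compressor form, the function name drc_gain_db, and the threshold and ratio values are assumptions for illustration and are not specified for the DL DRC 305 or the UL DRC 311.

```python
def drc_gain_db(level_db, threshold_db=-12.0, ratio=4.0):
    """Gain in dB applied by a hard-knee compressor at a given input level.

    Below the threshold the signal passes unchanged. Above it, each dB of
    input produces only 1/ratio dB of output. Both parameters are
    hypothetical tuning values.
    """
    if level_db <= threshold_db:
        return 0.0
    compressed_db = threshold_db + (level_db - threshold_db) / ratio
    return compressed_db - level_db
```

Reducing the gain only for levels above the threshold limits peaks that could otherwise cause distortion at high volume settings, while leaving quieter signals unaffected.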

The synthesizer/filter module 313 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to generate noise cancellation signals and filter unwanted signals. The filtering capability in the synthesizer/filter module 313 may comprise a high pass filter, for example.

The subband NLP 315 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to suppress residual echo. The subband NLP may receive as inputs, the noise level signal N(n), the UL signal S(n), and the DL signal R(n), generated by the DL subband analysis module 319 and the UL subband analysis module 321. The subband NLP output may be communicatively coupled to the NS/CNG module 317.

The NS/CNG module 317 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to suppress noise and/or generate a comfort noise signal, which may be used to mask residual echo.

The DL subband analysis module 319 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to suppress residual echo. The input of the DL subband analysis module 319 may be communicatively coupled to the output of the DL DRC module 305, and may analyze the non-linear characteristics of the received signal, which may be received by the wireless device 150, described with respect to FIG. 1.

The UL subband analysis module 321 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to suppress residual echo in an uplink signal, such as one generated by the microphone 355. The input of the UL subband analysis module 321 may be communicatively coupled to the output of the dual EC 323. The output of the UL subband analysis module 321 may be communicatively coupled to the NS/CNG module 317 and the subband NLP module 315.

The dual EC 323 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to cancel echoes in audio signals. The inputs of the dual EC 323 may be communicatively coupled to the DC remover 331 and the output of the DL DRC 305. The output of the dual EC 323 may be communicatively coupled to the UL subband analysis module 321.

The summer 325 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to receive a plurality of input signals and generate an output signal that may be the sum of the input signals. The inputs of the summer 325 may be communicatively coupled to the DL DRC module 305 and the side tone expander module 327. The output of the summer 325 may be communicatively coupled to the switch 333A.

The side tone expander module 327 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to amplify audio signals in a desired frequency range and attenuate signals in another frequency band. In this manner, the amplitude of desired signals may be selectively amplified while decreasing the magnitude of other signals.

The side tone filter/gain module 329 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to shape the side tone frequency that may be generated by the UL signal at the output of the DC remover 331. The output of the side tone filter/gain module 329 may be communicatively coupled to the side tone expander module 327.

The DC remover 331 may be substantially similar to the DC remover 303, but may be operable to remove DC signals from a Tx signal generated by the microphone 355 and/or the BT Rx 351, for example.

The switch 333A may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to switch between a DL signal generated by the summer 325 for communication to the Rx CODEC 337 or the BT filter 335A. Similarly, the switch 333B may comprise suitable circuitry, logic, and/or code that may be operable to switch between the Tx CODEC 343 and the BT filter 335B, and communicate the desired signal to the DC remover 331.

The BT filters 335A and 335B may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to filter out undesired signals and allow desired BT signals to pass. The BT filter 335A may be communicatively coupled to the summer 325, in instances where the switch 333A is switched to the BT filter 335A. The output of the BT filter 335A may be communicatively coupled to the BT Tx 349. The input of the BT filter 335B may be communicatively coupled to the BT Rx 351, and the output may be communicatively coupled to the switch 333B.

The Rx CODEC 337 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to process received audio signals for communication to an output device, such as the speaker 353. The Rx CODEC 337 may comprise the Rx filter 339 and the PGA/processing module 341. The Rx filter 339 may comprise suitable circuitry, logic, and/or code that may be operable to filter out undesired signals while allowing a desired audio signal to be communicated to the PGA/processing module 341. The Rx filter 339 may comprise digital infinite impulse response (IIR) filters, such as biquads, for example. The PGA/processing module 341 may comprise suitable circuitry, logic, and/or code that may be operable to amplify a received audio signal as well as perform other audio processing tasks for enhancing the desired audio signal quality.
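The biquad IIR filtering mentioned above may be illustrated with a minimal sketch. It assumes a Direct Form I second-order section with the leading denominator coefficient normalized to one; the class name Biquad, the function run_cascade, and all coefficient values are hypothetical and are not taken from the described embodiment.

```python
from dataclasses import dataclass


@dataclass
class Biquad:
    """Second-order IIR section (Direct Form I), a0 assumed normalized to 1."""
    b0: float
    b1: float
    b2: float
    a1: float
    a2: float
    x1: float = 0.0  # x[n-1]
    x2: float = 0.0  # x[n-2]
    y1: float = 0.0  # y[n-1]
    y2: float = 0.0  # y[n-2]

    def process(self, x: float) -> float:
        # y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
        y = (self.b0 * x + self.b1 * self.x1 + self.b2 * self.x2
             - self.a1 * self.y1 - self.a2 * self.y2)
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y


def run_cascade(sections, samples):
    """Pass each sample through every biquad section in series."""
    out = []
    for x in samples:
        for section in sections:
            x = section.process(x)
        out.append(x)
    return out
```

Several such sections may be placed in series, for example to approximate a compensation response for a speaker or microphone, with the coefficients chosen per device.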

The Tx CODEC 343 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to process audio signals received from an input device, such as the microphone 355. The Tx CODEC 343 may comprise the Tx filter 345, which may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to filter undesired signals while allowing desired signals received from the PGA/processing module 347 to pass. The Tx filter 345 may comprise digital infinite impulse response (IIR) filters, such as biquads, for example. In an embodiment of the invention, the Rx CODEC 337 and the Tx CODEC 343 may be integrated in a hardware block, such as the digital audio processing module 211, described with respect to FIG. 2.

The PGA/processing module 347 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to amplify a signal received from the microphone 355 as well as to perform other audio processing tasks for the desired audio signal quality.

The BT Tx 349 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to wirelessly transmit a BT signal to a BT device, such as a BT headset, for example. The input of the BT Tx 349 may be communicatively coupled to the output of the BT filter 335A. The BT Rx 351 may comprise suitable circuitry, logic, and/or code that may be operable to receive a BT signal from a BT device, such as a BT headset, for example.

The speaker 353 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to generate and output an audio signal from an electrical signal received from the Rx CODEC 337. The microphone 355 may comprise suitable circuitry, logic, and/or code that may be operable to generate an electrical signal from a received audio signal, and communicate the generated electrical signal to the Tx CODEC 343 for processing, for example.

In operation, in the DL path, a speech signal from the speech decoder 301 may pass through the DC remover 303 followed by the DL DRC 305. The DL DRC 305 may perform pre-emphasis, gain control, expansion, and compression to increase subjective loudness, to reduce background noise, and to prevent speaker overload. The output of the DL DRC 305 may be communicated to the Rx CODEC 337 via the switch 333A. The Rx CODEC 337 may comprise a digital IIR filter to compensate for the response of the speaker 353. The Rx CODEC 337 may also comprise digital and analog gain stages, a delta-sigma DAC, a power amplifier, and analog filters, for example.

For the UL path, the Tx CODEC 343 may also comprise digital IIR filters, such as biquads, for example, to compensate for the microphone 355 response. The Tx CODEC 343 may also comprise gain stages, a sigma-delta ADC, and a power amplifier, for example. A speech signal from the Tx CODEC 343 may be communicated to the DC remover 331, a high pass filter that removes DC. The output from the DC remover 331 may be utilized by the side tone filter/gain module 329 to generate a side tone. In this manner, the side tone frequency may be shaped or otherwise processed using side tone filtering and gain. The signal generated by the Tx CODEC 343 may contain acoustically coupled echo, local UL speech signal, and noise. The dual EC 323 may then be utilized to reduce acoustic echo.
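The high pass filtering used to remove DC in the UL path may be sketched as follows. This is a minimal illustration that assumes a common first-order DC-blocking recurrence; the function name remove_dc and the pole value 0.995 are hypothetical choices and do not describe the actual DC remover 331.

```python
def remove_dc(samples, pole=0.995):
    """First-order DC blocker: y[n] = x[n] - x[n-1] + pole * y[n-1].

    The pole value is an assumed tuning parameter; values closer to 1.0
    give a lower cutoff frequency and a slower settling time.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + pole * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out
```

With a constant input, the output of this sketch decays geometrically toward zero, which is the intended behavior for a DC component.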

Due to nonlinearity, residual echo usually may still be present after the dual EC 323. The following modules, such as the UL subband analysis module 321, the DL subband analysis module 319, the subband NLP 315, the NS/CNG module 317, and the synthesis/filter module 313, may suppress residual echo using subband non-linear processing. Subband nonlinear processing may be utilized to estimate residual echo at the outputs of the dual EC 323. Received signals may be divided into a plurality of subbands, where the energy level in the individual subbands may be utilized to estimate the residual echo. Although downlink (DL) and uplink (UL) signals may overlap in the time domain during double talk, where both wireless device users in a conversation are speaking at the same time, it is not as likely that the signals overlap completely in the frequency domain. Thus, one or more selected subbands within the frequency domain may be assessed for echo and noise, and attenuated appropriately.
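The division of a received frame into subbands and the measurement of energy in each subband may be sketched as follows. This is a minimal illustration that assumes a direct discrete Fourier transform over one frame and equal-width subbands; the function name subband_levels_db, the frame length, and the subband count are hypothetical, and a practical analysis module would more likely use a filter bank or a fast transform.

```python
import cmath
import math


def subband_levels_db(frame, num_subbands):
    """Return the energy of each equal-width subband of one frame, in dB."""
    n = len(frame)
    half = n // 2  # only the non-negative-frequency bins are needed
    power = []
    for k in range(half):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        power.append(abs(acc) ** 2)
    width = half // num_subbands
    levels = []
    for b in range(num_subbands):
        energy = sum(power[b * width:(b + 1) * width])
        # A small floor avoids taking the logarithm of zero.
        levels.append(10.0 * math.log10(energy + 1e-12))
    return levels


# Two tones that overlap in time but fall in different subbands.
dl_frame = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
ul_frame = [math.sin(2 * math.pi * 20 * t / 64) for t in range(64)]
dl_levels = subband_levels_db(dl_frame, 4)
ul_levels = subband_levels_db(ul_frame, 4)
```

In this example the DL tone concentrates its energy in the first subband and the UL tone in the third, which illustrates why signals that overlap in time during double talk may still be separable per subband.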

Echo return loss (ERL) comprises the ratio between the signal to be transmitted, such as S_o(n), and the echo level, typically expressed in dB. Echo return loss enhancement (ERLE) comprises the improvement, or reduction, in echo level introduced by an echo canceller, such as the dual EC 323. The difference between the DL signal R(n) and the UL signal S(n) in DL single talk mode may comprise the combined echo return loss and echo return loss enhancement (ERL+ERLE). This difference, ERL+ERLE, may comprise the total attenuation from the input of the Rx CODEC 337 to the output of the dual EC 323, and may be measured for an extended period of time, since the echo delay time may not be known. In addition, since the estimation may be unreliable in a transition period, the measurement may ignore sudden increases in magnitude, such as when a user starts to speak.
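The combined ERL+ERLE, taken as the difference between the DL and UL levels in DL single talk, may be illustrated numerically. This sketch assumes that levels are mean-square frame powers expressed in dB; the function names level_db and combined_erl_erle_db are hypothetical.

```python
import math


def level_db(samples):
    """Mean-square level of one frame in dB (with a small floor)."""
    power = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(power + 1e-12)


def combined_erl_erle_db(dl_frame, ul_frame):
    """Difference between DL level R and UL level S, valid in DL single talk."""
    return level_db(dl_frame) - level_db(ul_frame)


# An echo at one tenth of the DL amplitude corresponds to about 20 dB.
attenuation = combined_erl_erle_db([1.0] * 100, [0.1] * 100)
```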

Residual echo may be mitigated by calculating a subband gain vector utilizing non-linear processing, subband analysis, and measured echo return loss and echo return loss enhancement values. By estimating and cancelling echo utilizing the CODEC 164, the audio performance of the wireless device 150 may be improved. The subband gain vector, g_nlp, may comprise gain values for each of a plurality of subbands, such that gain levels may be low in instances where echo is estimated to be present in a particular subband, and may be higher in instances where no echo is estimated to be present in a subband.

The ERL+ERLE may be monitored in time frames, such as frames 10 ms wide, for example, until a time limit is met. The maximum value of ERL+ERLE measured may then be averaged with the current measured difference, the average representing the estimated residual echo. The output of the subband NLP 315, g_nlp, may be communicated to the NS/CNG module 317 for noise suppression and comfort noise generation, for example.

The suppressed echo may be further masked by comfort noise generated by the NS/CNG module 317. The background noise may also be suppressed using the subband noise suppressor in the NS/CNG module 317. The signal may then be communicated to the synthesis/filter module 313 followed by the UL DRC 311 for dynamic range control. The signal may then be processed by the mute control module 309 which may mute the signal when selected by the user, for example, followed by the speech encoder 307 which may encode the speech signal before subsequent processing and transmission, for example.

FIG. 4 is a flow diagram illustrating exemplary steps in echo estimation and cancellation, in accordance with an embodiment of the invention. In step 403, following start step 401, the DL signal at n−2, R(n−2), may be compared to a threshold value, VAD_TH. In instances where R(n−2) is greater than VAD_TH, the exemplary steps may proceed to step 405, where it may be determined if the DL signal has peaked by comparing R(n−2) to R(n−3) and R(n−1). In instances where R(n−2) is a peak value, the exemplary steps may proceed to step 407 where a difference value between R(n−2) and the maximum of the UL signal at (n−2), (n−1), and (n) may be calculated. This difference value, Diff, may be utilized to determine the new ERL+ERLE value, ERL_ERLE_New, which may be equal to the maximum of the previous value of ERL_ERLE_New or the calculated difference value. The exemplary steps may then proceed to step 409 where a counter that may represent the time in the present echo estimation cycle, may be incremented. If, in step 405, the DL signal R(n−2) is not a peak value, the exemplary steps may proceed directly to the counter increment step 409. If in step 411, the counter exceeds a predetermined time period, one second, for example, the exemplary steps may proceed to step 413, where the counter may be reset to zero and the ERL_ERLE may be calculated as the average of the current value of ERL_ERLE and ERL_ERLE_New, followed by end step 415.
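The exemplary steps of FIG. 4 may be sketched as follows. This is a minimal illustration under several assumptions that are not stated in the description: levels are supplied in dB once per frame, a frame below VAD_TH still increments the counter, a peak is detected with strict comparisons, and ERL_ERLE_New is cleared at the end of each cycle. The class name ErlErleEstimator and its parameters are hypothetical.

```python
from collections import deque


class ErlErleEstimator:
    """Sketch of the FIG. 4 flow for one subband (levels in dB)."""

    def __init__(self, vad_th, cycle_frames, initial=0.0):
        self.vad_th = vad_th              # VAD_TH
        self.cycle_frames = cycle_frames  # e.g. 100 frames of 10 ms for one second
        self.erl_erle = initial           # running ERL_ERLE estimate
        self.erl_erle_new = None          # ERL_ERLE_New for the current cycle
        self.counter = 0
        self.r = deque(maxlen=4)          # R(n-3), R(n-2), R(n-1), R(n)
        self.s = deque(maxlen=3)          # S(n-2), S(n-1), S(n)

    def update(self, r_n, s_n):
        self.r.append(r_n)
        self.s.append(s_n)
        if len(self.r) == 4:
            r3, r2, r1, _ = self.r
            # Step 403: is the DL level above the activity threshold?
            if r2 > self.vad_th:
                # Step 405: is R(n-2) a local peak?
                if r2 > r3 and r2 > r1:
                    # Step 407: difference to the largest recent UL level.
                    diff = r2 - max(self.s)
                    if self.erl_erle_new is None or diff > self.erl_erle_new:
                        self.erl_erle_new = diff
        # Step 409: advance the cycle counter.
        self.counter += 1
        # Steps 411 and 413: at the end of a cycle, average and reset.
        if self.counter > self.cycle_frames:
            self.counter = 0
            if self.erl_erle_new is not None:
                self.erl_erle = 0.5 * (self.erl_erle + self.erl_erle_new)
                self.erl_erle_new = None  # assumed reset, not stated in FIG. 4
        return self.erl_erle


est = ErlErleEstimator(vad_th=-40.0, cycle_frames=4)
dl = [-50.0, -10.0, -50.0, -50.0, -50.0]
ul = [-80.0, -40.0, -80.0, -80.0, -80.0]
history = [est.update(r, s) for r, s in zip(dl, ul)]
```

In this short run, a single DL peak of −10 dB with an echo at −40 dB yields a difference of 30 dB, which is averaged with the initial estimate of 0 dB when the cycle ends.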

FIG. 5 is a diagram illustrating exemplary echo and uplink signal estimation utilizing subband non-linear processing, in accordance with an embodiment of the invention. Referring to FIG. 5, there is shown an echo estimator 500 comprising a DL subband analysis module 501, a UL subband analysis module 503, a noise suppression and comfort noise module 505, an ERL+ERLE estimator module 507, adders 509A-509D, a non-linear distortion estimator module 511, and a gain calculation/post-processing module 513.

The DL subband analysis module 501 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to perform subband analysis on a received signal. This analysis may comprise dividing the received signal into small frequency segments, or subbands, and measuring the magnitude of the signal in each subband. Due to differences in voices, speech patterns, and different sounds spoken by users at a given time, individual subbands may be useful to differentiate between UL and DL signals. Similarly, the UL subband analysis module 503 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to perform subband analysis on a signal to be transmitted by the wireless device 150.

The noise suppression and comfort noise module 505 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to determine an optimal noise suppression level and/or comfort noise level to be incorporated in a signal to be transmitted. The noise suppression and comfort noise module 505 may be communicatively coupled to the gain calculation/post-processing module 513.

The ERL+ERLE estimator module 507 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to determine the value of combined echo return loss and echo return loss enhancement (ERL+ERLE) by comparing UL and DL signals, as described with respect to FIG. 4. The output of the ERL+ERLE estimator module 507 may be communicatively coupled to the summer 509A.

The adders 509A-509D may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to generate an output signal that may comprise the sum of the received input signals.

The non-linear distortion estimator module 511 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to estimate the amount of non-linear distortion in the non-linear processing. Non-linear distortion may decrease the ability to remove echo in a signal. Thus, adding a non-linear distortion adjustment may improve the echo estimation and cancellation. The output of the non-linear distortion estimator module 511 may be communicatively coupled to the summer 509C.

The gain calculation/post-processing module 513 may comprise suitable circuitry, logic, interfaces, and/or code that may be operable to determine a desired gain level to be applied to a signal to be transmitted by the wireless device 150. In addition, the gain calculation/post-processing module 513 may be operable to perform post-processing on the signal, such as smoothing, for example. The gain calculation/post-processing module 513 may receive as inputs, signals generated by subband analysis of DL and UL signals, and with echo levels estimated and cancelled, and adjusted for non-linear distortion. In this manner, the signal transmitted by the wireless device 150 may have enhanced audio quality, even in a double-talk situation, for example.

In operation, the DL subband analysis module 501 and the UL subband analysis module 503 may perform subband analysis of DL and UL signals, respectively, the outputs of which, R(n) and S(n), may be utilized by the ERL+ERLE estimator module 507 to determine combined echo return loss and echo return loss enhancement for echo estimation. This may be utilized to estimate the echo level in the DL signal by subtracting the output of the ERL+ERLE estimator module 507 from R(n) by the summer 509A. In addition, the ERL_ERLE_ADJ signal may be utilized to fine tune the echo estimation level, and may be an adjustment specific to the wireless device 150 to be determined at the time of manufacture, for example.

The distortion adjustment level, ERL_ERLE_ADJ_Dist, may then be added to the adjusted signal by the summer 509C. The resulting output may comprise the echo_lev signal, which, along with the noise suppression and comfort noise module 505 output, N(n), may be subtracted from the UL signal S(n) by the summer 509D. The resulting output signal, UL_lev, may then be communicated to the gain calculation/post-processing module 513. The gain calculation/post-processing module 513 may also receive the noise suppression and comfort noise signal, N(n), to calculate a desired gain level, g_nlp, which may correspond to the g_nlp shown in FIG. 3. An optimum gain level g_nlp may optimize the signal quality in the wireless device 150.
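The echo level estimation through the summer 509C may be sketched per subband as follows. This illustration covers only the chain that produces echo_lev and assumes that all quantities are levels in dB, that the device-specific adjustment ERL_ERLE_ADJ is additive, and that the distortion adjustment is the value named ERL_ERLE_ADJ_Dist in the discussion of FIG. 6. The function name estimate_echo_levels and the numeric values are hypothetical.

```python
def estimate_echo_levels(dl_levels_db, erl_erle_db, adj_db=0.0, adj_dist_db=0.0):
    """Estimate echo_lev per subband from DL levels R(n), all in dB.

    echo_lev = R - ERL_ERLE + ERL_ERLE_ADJ + ERL_ERLE_ADJ_Dist
    The additive sign of the two adjustments is an assumption.
    """
    return [r - erl_erle_db + adj_db + adj_dist_db for r in dl_levels_db]


echo_lev = estimate_echo_levels([-10.0, -20.0], 30.0, adj_db=2.0, adj_dist_db=3.0)
```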

FIG. 6 is a diagram illustrating exemplary non-linear processing gain calculation, in accordance with an embodiment of the invention. Referring to FIG. 6, there is shown DL, UL, and noise levels, and associated adjustment and margin levels. The echo may be masked using noise or the UL signal. If a significant UL signal is present as shown, the echo may be lower than the UL level and echo suppression may be less important. However, if the UL signal is low, such as when the user of the wireless device 150 is not speaking, the echo may need to be suppressed for suitable signal quality. The echo may be suppressed by the value g_nlp to below the noise level, thereby reducing/eliminating echo perceived from the signal transmitted by the wireless device 150. The variable ERL_ERLE_ADJ_Dist may be the distortion adjustment value described with respect to FIG. 5 that may be dependent on the distortion of components in the wireless device 150, such as the stereo speakers 170, for example. Similarly, the variable ERL_ERLE_ADJ may comprise an estimation error that may be specific to the wireless device 150, and may be determined at manufacture, for example.
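The masking behavior described with respect to FIG. 6 may be sketched per subband as follows. This is a minimal illustration and not the gain rule of the described embodiment: it assumes levels in dB, assumes the echo is pushed a fixed margin below whichever of the UL level or noise level is higher, and never applies positive gain. The function name nlp_gains_db and the 3 dB margin are hypothetical.

```python
def nlp_gains_db(echo_db, ul_db, noise_db, margin_db=3.0):
    """Return a gain per subband (dB, at most 0) that hides the echo.

    The echo is considered masked when it lies margin_db below the larger
    of the UL level and the noise level in that subband (assumed rule).
    """
    gains = []
    for echo, ul, noise in zip(echo_db, ul_db, noise_db):
        mask = max(ul, noise)
        target = mask - margin_db
        gains.append(min(0.0, target - echo))
    return gains


# Subband 0: the user is silent, so the echo must drop below the noise.
# Subband 1: strong UL speech already masks the echo, so no attenuation.
g_nlp = nlp_gains_db(echo_db=[-35.0, -35.0],
                     ul_db=[-70.0, -20.0],
                     noise_db=[-60.0, -60.0])
```

In this example the first subband receives 28 dB of attenuation while the second is left unchanged.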

In an exemplary embodiment of the invention, a method and system are disclosed for echo estimation and cancellation. In various embodiments of the invention, one or more circuits and/or processors in a wireless device that is operable to communicate downlink (DL) and uplink (UL) signals, may be operable to estimate combined echo return loss and echo return loss enhancement (ERL+ERLE). Non-linear processing, subband analysis, and the measured combined echo return loss and echo return loss enhancement values may be utilized to calculate a subband gain vector, g_nlp, for mitigating residual echo. The combined echo return loss and echo return loss enhancement may be estimated by averaging a difference in the DL and UL signals. A maximum value of the difference in one or more subbands may be determined over a period of time. A non-linear distortion adjustment factor ERL_ERLE_ADJ_Dist may be estimated for the combined echo return loss and echo return loss enhancement. A combined echo return loss and echo return loss enhancement estimation error may be calibrated specific to the wireless device 150. The estimating of the combined echo return loss and echo return loss enhancement may be suspended for a period of time after a transition in the DL or UL signals. Comfort noise may be added to the UL signal to mask the residual echo. The residual echo may be mitigated following a dual echo canceller 323. The estimating of the combined echo return loss and echo return loss enhancement may be suspended when the DL signal is not present. A noise level N(n) may be included in the subband gain vector, g_nlp, calculation.

Another embodiment of the invention may provide a machine and/or computer readable storage and/or medium, having stored thereon, a machine code and/or a computer program having at least one code section executable by a machine and/or a computer, thereby causing the machine and/or computer to perform the steps as described herein for echo estimation and cancellation.

Accordingly, aspects of the invention may be realized in hardware, software, firmware or a combination thereof. The invention may be realized in a centralized fashion in at least one computer system or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited. A typical combination of hardware, software and firmware may be a general-purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.

One embodiment of the present invention may be implemented as a board level product, as a single chip, application specific integrated circuit (ASIC), or with varying levels integrated on a single chip with other portions of the system as separate components. The degree of integration of the system will primarily be determined by speed and cost considerations. Because of the sophisticated nature of modern processors, it is possible to utilize a commercially available processor, which may be implemented external to an ASIC implementation of the present system. Alternatively, if the processor is available as an ASIC core or logic module, then the commercially available processor may be implemented as part of an ASIC device with various functions implemented as firmware.

The present invention may also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods. Computer program in the present context may mean, for example, any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form. However, other meanings of computer program within the understanding of those skilled in the art are also contemplated by the present invention.

While the invention has been described with reference to certain embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the scope of the present invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the present invention without departing from its scope. Therefore, it is intended that the present invention not be limited to the particular embodiments disclosed, but that the present invention will include all embodiments falling within the scope of the appended claims.

CROSS-REFERENCE TO RELATED APPLICATIONS/INCORPORATION BY REFERENCE