This invention relates in general to methods and systems that transmit and receive audio communication, and more particularly, to speakerphone systems.
In recent years, portable electronic devices, such as cellular telephones and mobile communication devices, have become commonplace. Many of these devices include a high-audio transducer for providing speakerphone operation. The single transducer is usually small to allow for compact placement within the mobile device. Due to its small size, the transducer is generally limited in its ability to provide wide band audio fidelity. For example, low frequency sounds require sufficient speaker cone displacement to produce the low frequency acoustic pressure wave. In addition, high frequencies are often phase modulated and compressed by the high energy low frequency signals. These are both non-linear effects. For example, an audio signal with significant low frequency content can cause large cone excursions which cause the diaphragm stiffness to increase. High frequency signals within the audio signal can be compressed as the cone is pushed out to maximum excursion. Additionally, high frequency signals generated when the cone is at maximum excursion take less time to travel to a listener than high frequency signals generated when the cone is at a maximum negative displacement. This effect produces a phase modulation of high frequencies by low frequencies. Consequently, moving coil speakers in small size transducers have mechanical and acoustic non-linearities that cause them to produce distorted output and reduce the overall audio quality.
Mobile communication devices that support speakerphone operation generally include an echo suppressor for suppressing an echo signal. For example, during speakerphone mode, a microphone of the speakerphone may unintentionally capture the acoustic output of the speakerphone. This can be the case when the speakerphone output is of sufficient volume to be fed back into the phone through the microphone and sent over the communication network to the talker. The talker can potentially hear their own voice, which can be distracting. To mitigate such problems, an echo suppressor attempts to predict an echo signal from the talker signal and suppress the echo signal captured on the microphone signal. For example, the talker signal is generally considered the audio input signal to the transducer, which is the signal the echo suppressor generally uses for predicting the echo. The audio input signal can be fed to the transducer to produce an acoustic output signal. The acoustic output signal generally undergoes a linear transformation as a result of the acoustic environment as the sound pressure wave propagates from the transducer to the microphone. The echo suppressor generally employs an adaptive linear filter for estimating the environment that can generally be represented as a linear transformation of the acoustic output signal. Because the echo is generally a time shifted and scaled version of the acoustic output signal, the echo suppressor is generally able to determine a linear transformation of the echo environment.
Echo suppressor performance degrades when the adaptive linear filter attempts to model a non-linear transformation. The non-linear transformation can come from the environment, or from the source that generated the acoustic output signal. For example, a small speaker introduces distortions due to mechanical non-linearities, such as those common with large cone excursions including stiffness and inductance effects, and acoustic non-linearities, such as those due to speaker porting arrangements. A speaker port can be a vent or opening which allows for the movement of air from the speaker cone for producing an acoustic pressure wave. Small speakers which are embedded within a communication device can require side ports or front ports for releasing the acoustic pressure. As the pressure wave passes through the port, the pressure wave can undergo compression which introduces non-linear deviations in the pressure wave at the port boundaries. The port placement, size, and arrangement can introduce acoustic non-linearities onto the resulting acoustic output signal. An echo suppressor based on modeling a linear transformation will degrade in performance due to these non-linearities, and may be unable to adequately suppress the echo signal. Non-linear mechanisms can occur in the path from the source (transducer) input to the sensor (microphone), and non-linear estimators can be used to estimate the non-linear parts of the path. For example, a neural net algorithm can be trained to learn non-linearities within the path. Non-linear estimators generally form models directly from the path data, and not generally from the mechanics of a transducer or from the acoustic porting arrangement.
The present embodiments of the invention concern a method and system for modeling transducer non-linearities. The method can include converting a transducer signal to a displacement signal, and applying at least one correction to the displacement signal. The displacement signal can be proportional to a transducer cone displacement. The correction can include applying at least one distortion to the displacement signal which can be a memory-less and nonlinear operation. The distortion can also be applied as a fixed or adaptive process using a convergence error of an adaptation process. For example, the adaptation process can be the Least Mean Squares (LMS) algorithm in an echo suppressor.
In one aspect, the transducer signal can be an input signal to the transducer, or an acoustic output signal of the transducer. In another aspect, a correction can include accounting for at least one mechanical transducer non-linearity, which can produce a distorted displacement signal. The mechanical transducer non-linearity can be a transducer cone excursion, a diaphragm stiffness, or a diaphragm displacement. The method can further include applying a time derivative operator to the displacement signal for producing a velocity signal, accounting for at least one acoustic transducer non-linearity to produce a distorted velocity signal, and converting the distorted velocity signal into an acceleration signal. An acoustic transducer non-linearity can include non-linear acoustic jetting through at least one transducer port. For example, the acceleration signal can be an estimate of the sound pressure level produced by the transducer. In one arrangement, the acceleration signal can be fed to the echo suppressor for removing a transducer signal from a microphone signal.
An embodiment for modeling transducer non-linearities can concern a method for echo suppression. The method can include converting a transducer signal to a displacement signal, applying at least one correction to the displacement signal to produce a distorted signal, and using the distorted signal as an input for echo cancellation for suppressing an echo from a microphone input signal. The displacement signal can be proportional to a transducer cone displacement. The correction can produce the distorted signal which suppresses at least one non-linear component of the transducer signal. In one aspect, the distorted signal facilitates a convergence of the echo cancellation by suppressing non-linear components. A convergence error of the adaptation process employed by the echo cancellation can be used during the correction.
The present embodiments also concern a system for modeling transducer non-linearities for suppressing an echo. The system can include a displacement unit for converting an input signal to a displacement signal, and a first non-linear estimator for modeling at least one transducer non-linearity. The input signal can be a digital voltage or an analog voltage applied to the input of the transducer. The displacement signal can be proportional to a transducer cone displacement. The non-linear estimator can also apply at least one correction to the displacement signal. For example, the non-linear estimator can apply a memory-less non-linear distortion to the displacement signal which takes transducer non-linearities into account. In one arrangement, the non-linear estimator can receive the displacement signal from the displacement unit to produce a distorted displacement signal. The system can further include a transducer for producing an acoustic signal in response to the input signal, a microphone for converting the acoustic signal into an audio signal, and an echo suppressor, responsive to said distorted signal, for suppressing a linear component of the audio signal. For example, the audio signal can include a linear component that is a linear function of the acoustic signal and a non-linear component which is a non-linear function of the transducer. For instance, the transducer can impart at least one non-linear component onto said acoustic signal related to at least one transducer non-linearity. The distorted signal compensates for at least one transducer non-linearity, thereby facilitating a convergence of the echo suppressor.
In one arrangement, the non-linear estimator can account for at least one mechanical transducer non-linearity, which can be a transducer cone excursion, a diaphragm stiffness, or a diaphragm displacement. The system can further include a differential operator for converting the distorted displacement signal into a velocity signal, and a second non-linear estimator for applying a second distortion to said velocity signal, and for converting said distorted velocity signal into an acceleration signal. The second distortion can model at least one acoustic transducer non-linearity for producing a distorted velocity signal. For example, an acoustic transducer non-linearity can be a non-linear acoustic jetting through at least one transducer port and which is proportional to an instantaneous acoustic velocity.
In yet another arrangement, the system can further include a spectral whitener for flattening a spectrum of said distorted signal. For example, the first and second non-linear estimators can provide significant spectral shaping which can affect the convergence of an adaptive process within the echo suppressor. The spectral whitener can receive the distorted signal from one of the non-linear estimators and provide a whitened signal to an input of said echo suppressor. Accordingly, the first non-linear estimator and said second non-linear estimator can receive a convergence error from the echo suppressor and adapt using a gradient search algorithm.
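As a non-limiting illustration of adapting a memory-less distortion with a gradient search driven by a convergence error, the following Python sketch adjusts a single coefficient of a cubic distortion. The cubic form, the names `cubic` and `adapt_coefficient`, the step size `mu`, and the simplification that the error is measured directly against the distorted signal (rather than through an echo suppressor) are all assumptions made for illustration and are not taken from the embodiments.

```python
import random


def cubic(x, a):
    """Hypothetical memory-less distortion: output depends only on the current x."""
    return x + a * x ** 3


def adapt_coefficient(inputs, observed, mu=0.5, a=0.0):
    """Gradient search on the coefficient a, one update per sample."""
    for x, d in zip(inputs, observed):
        e = d - cubic(x, a)      # error between observation and current model
        a += mu * e * x ** 3     # step along the negative gradient of 0.5 * e**2
    return a


# Usage: recover a coefficient of -0.3 from synthetic data.
rng = random.Random(0)
samples = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
target = [cubic(x, -0.3) for x in samples]
estimated = adapt_coefficient(samples, target)
```

Because the distortion keeps no history, each update needs only the current sample and the current error, which keeps the per-sample cost small.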
The system can also include a sensor coupled to said transducer for physically measuring a cone displacement. For example, the displacement unit can convert an input signal to a displacement signal using the physically measured cone displacement.
The terms “a” or “an,” as used herein, are defined as one or more than one. The term “plurality,” as used herein, is defined as two or more than two. The term “another,” as used herein, is defined as at least a second or more. The terms “including” and/or “having,” as used herein, are defined as comprising (i.e., open language). The term “coupled,” as used herein, is defined as connected, although not necessarily directly, and not necessarily mechanically. The term “suppressing” can be defined as reducing or removing, either partially or completely.
The terms “program,” “software application,” and the like as used herein, are defined as a sequence of instructions designed for execution on a computer system. A program, computer program, or software application may include a subroutine, a function, a procedure, an object method, an object implementation, an executable application, an applet, a servlet, a source code, an object code, a shared library/dynamic load library and/or other sequence of instructions designed for execution on a computer system.
The present embodiments concern a method and system for correcting transducer non-linearities for use with an echo suppressor. The method can include converting a transducer signal to a displacement signal that is proportional to a transducer cone displacement. Memory-less nonlinear distortions which take transducer nonlinearities into account can be applied to the displacement signal. The distorted displacement signal can be converted to a velocity signal, and fed to a second memory-less nonlinear distortion section that takes nonlinear acoustic jetting through ports into account. The distorted velocity signal can be converted to an acceleration signal for providing a good estimate of the sound pressure level (SPL) produced by the transducer. The acceleration signal can be fed into a LMS based echo suppressor to remove the transducer signal from a microphone signal.
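The signal chain described above can be sketched, under stated assumptions, in Python. The one-pole low-pass used as the displacement conversion, the `tanh` and `v / (1 + c|v|)` distortion shapes, and every function and parameter name below are hypothetical stand-ins chosen only to show the order of operations; the embodiments do not specify these forms. Derivatives are taken as per-sample differences, so a constant sample-rate scale factor is omitted.

```python
import math


def to_displacement(voltage, alpha=0.2):
    """Assumed displacement conversion: a one-pole low-pass of the input voltage."""
    state, out = 0.0, []
    for v in voltage:
        state += alpha * (v - state)
        out.append(state)
    return out


def mechanical_distortion(x, limit=1.0):
    """Assumed memory-less compression of large cone displacements."""
    return limit * math.tanh(x / limit)


def acoustic_distortion(v, c=0.5):
    """Assumed memory-less compression of large port velocities."""
    return v / (1.0 + c * abs(v))


def differentiate(signal):
    """First difference as a discrete time derivative (sample-rate scale omitted)."""
    prev, out = 0.0, []
    for s in signal:
        out.append(s - prev)
        prev = s
    return out


def acceleration_estimate(voltage):
    """Voltage -> displacement -> distortion -> velocity -> distortion -> acceleration."""
    displacement = to_displacement(voltage)
    distorted_x = [mechanical_distortion(x) for x in displacement]
    velocity = differentiate(distorted_x)
    distorted_v = [acoustic_distortion(v) for v in velocity]
    return differentiate(distorted_v)


# Usage: a quiet tone passes through almost linearly; a loud one would not.
quiet = [1e-3 * math.sin(2 * math.pi * 0.05 * n) for n in range(200)]
reference_for_echo_suppressor = acceleration_estimate(quiet)
```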
In one arrangement, the present embodiments of the invention provide for the modeling of transducer non-linearities. The method can include converting a transducer signal to a displacement signal, and applying at least one correction to the displacement signal. The displacement signal can be proportional to a transducer cone displacement. The correction can include applying at least one distortion to the displacement signal which can be a memory-less and nonlinear operation. The distortion can also be applied as a fixed or adaptive process using a convergence error of an adaptation process.
In one aspect, a correction can include accounting for at least one mechanical transducer non-linearity. The mechanical transducer non-linearity can be a transducer cone excursion, a diaphragm stiffness, or a diaphragm displacement. In another aspect, a correction can include accounting for at least one acoustic transducer non-linearity to produce a distorted velocity signal, and converting the distorted velocity signal into an acceleration signal. An acoustic transducer non-linearity can include non-linear acoustic jetting through at least one transducer port. Non-linear jetting can be the acceleration of air through a port which disrupts a continuous movement of air through the port. In one arrangement, the acceleration signal can be fed to the echo suppressor for removing a transducer signal from a microphone signal.
In
The system 100 can include an echo suppressor 170 for suppressing the direct path signal 106 and the echo signal 107. The echo suppressor 170 can suppress echo to produce an echo suppressed signal 124 such that the caller 121 does not hear an echo of their voice when they are speaking. The echo suppressor 170 can employ a Least Mean Squares (LMS) algorithm for modeling a linear transformation of the user environment. The echo suppressor 170 can adequately suppress an echo when the signal is sufficiently representative of a linear transformation of the original acoustic signal 103. The echo suppressor 170 can also produce a convergence error 171 for providing a performance measure.
Briefly, the echo suppressor 170 attempts to model a linear transformation between the signal received at the microphone 104 and the audio signal 122 provided as input to the transducer 102. The audio line 122 fed to the transducer 102 can also be considered the input signal. The convergence error 171 reveals how well the echo suppressor 170 is capable of modeling the environment, and accordingly, how well the echo suppressor 170 can suppress the echo. A low convergence error can generally imply good modeling performance whereas a high convergence error can generally imply poor modeling performance. A low convergence error can also be the result of minimal echo in the environment, or of a minimal amplitude direct path signal. A minimal amplitude direct path signal can exist when the transducer 102 is properly insulated from the microphone 104 to avoid any high audio leakage.
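As a non-limiting illustration of how an LMS adaptation can model a linear path and expose a convergence error, the following Python sketch adapts a short FIR filter against a synthetic, purely linear echo path. The filter length, the step size `mu`, the echo path coefficients, and the names `lms_echo_cancel` and `simulate` are assumptions made for illustration; they are not parameters of the echo suppressor 170.

```python
import random


def lms_echo_cancel(reference, microphone, taps=8, mu=0.05):
    """Plain LMS: returns the per-sample error and the final filter weights."""
    weights = [0.0] * taps
    history = [0.0] * taps                      # newest reference sample first
    errors = []
    for x, d in zip(reference, microphone):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))   # predicted echo
        e = d - estimate                        # convergence error for this sample
        weights = [w + mu * e * h for w, h in zip(weights, history)]
        errors.append(e)
    return errors, weights


def simulate(count=4000, seed=0):
    """Drive the filter with noise through a hypothetical linear echo path."""
    rng = random.Random(seed)
    reference = [rng.uniform(-1.0, 1.0) for _ in range(count)]
    echo_path = [0.0, 0.6, 0.3, -0.1]           # assumed room response
    microphone = [
        sum(h * reference[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(count)
    ]
    errors, weights = lms_echo_cancel(reference, microphone)

    def power(segment):
        return sum(v * v for v in segment) / len(segment)

    return power(errors[:200]), power(errors[-200:]), weights


early_error, late_error, learned = simulate()
```

In this sketch the error power falls as the weights approach the echo path, which corresponds to the low convergence error described above for a well-modeled linear environment.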
Accordingly, significant direct path signal contributions can exist when the transducer 102 is not adequately sealed off from the microphone input 104. For example, a transducer 102 that is not properly sealed can leak sound pressure waves from the transducer housing arrangement to the microphone path.
The majority of the nonlinear mechanisms produced in transducers are related to the speaker's diaphragm displacement, not the voltage into the speaker. This makes predicting a nonlinear correction term significantly more difficult for nonlinear estimators that learn directly from the path data. Accordingly, taking into account transducer attributes can lessen the difficulty of modeling a non-linear transformation during learning or adaptation.
Accordingly, the system 100 can include a non-linear corrector 110 for modeling non-linearities within the system 100. For example, speaker distortion (particularly for dispatch radio/speakerphone applications) can be dominated by nonlinearities produced by and directly dependent upon large cone excursions. The nonlinearities in the transducer 102 output can reduce echo cancellation performance which can limit dispatch radio operation to single-duplex. The non-linear corrector 110 can provide a means for effectively dealing with the transducer nonlinearities to improve echo suppression performance. The non-linear corrector 110 can also apply at least one correction that is a memory-less and nonlinear operation. The transducer non-linearities can be both mechanical and acoustic.
Briefly, the non-linear corrector 110 can improve the ability of non-linear adaptive algorithms to model non-linear behavior. The echo suppressor 170 can more accurately model a linear transformation of the acoustic signal 103 when the non-linear corrector 110 removes (or suppresses) non-linearities on the acoustic signal 103. The non-linear corrector 110 can incorporate a convergence error 171 of an adaptation process within the echo suppressor 170 that can be a fixed or adaptive process. For instance, the transducer 102 can impart non-linear mechanical effects onto the acoustic signal 103 due to speaker cone displacement, stiffness, and inductance. In another example, air ports or leaks within the mobile communication device can induce acoustic non-linearities such as those due to changes in sound pressure or velocity. In one arrangement, the non-linear corrector 110 can incorporate these mechanical and acoustic non-linear attributes to improve linear modeling behavior within the echo suppressor 170.
Referring to
The first non-linear estimator 214 can use the displacement estimate to predict a non-linear transfer function HNL1 that describes non-linear distortions as a result of diaphragm displacement. Consequently, the first non-linear estimator 214 accounts for the mechanical transducer 102 non-linearities and produces a distorted displacement signal. In one arrangement, the differential operator 216 can incorporate acoustic non-linearities and convert the distorted displacement signal into an acoustic velocity signal. A second non-linear estimator 218 can use the acoustic velocity estimate to predict a second non-linear transfer function HNL2 that describes non-linear distortions as a result of acoustic jetting through ports. Consequently, the second non-linear estimator 218 accounts for acoustic transducer 102 non-linearities and produces a distorted velocity signal.
The velocity signal can be fed to a whitener 220 that can apply a compensatory equalization to account for spectral shaping at the displacement unit 212 and the differential operator 216. Recall, the displacement unit 212 applies a displacement distortion to prepare the displacement signal for the first non-linear estimator 214. The differential operator 216 applies a velocity distortion to prepare the velocity signal for the second non-linear estimator 218. Briefly, the first non-linear estimator 214 and second non-linear estimator 218 are employed to predict mechanical and acoustic non-linear distortions generated by the transducer, respectively. Consequently, the velocity signal can be whitened to restore the audio signal from these distortions applied at the displacement unit 212 and the differential operator 216. The whitener 220 can produce an acceleration signal that can be input to the echo suppressor 170, which can facilitate a convergence of the echo suppressor. The acceleration signal can provide an estimate of the sound pressure level produced by the transducer. The first and second non-linear estimators can account for transducer non-linearities and produce a whitened acceleration signal substantially devoid of non-linearities. Accordingly, the echo suppressor 170 can be capable of modeling the remaining linear portion of the echo signal using standard LMS based techniques.
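As a non-limiting illustration of spectral whitening, the following Python sketch flattens a strongly colored signal with a first-order linear-prediction error filter. The embodiments do not state how the whitener 220 is realized; the first-order predictor, the synthetic test signal, and the names `autocorr`, `whiten_first_order`, and `colored_noise` are assumptions made for illustration only.

```python
import random


def autocorr(signal, lag):
    """Unnormalized autocorrelation of the signal at the given lag."""
    return sum(signal[n] * signal[n - lag] for n in range(lag, len(signal)))


def whiten_first_order(signal):
    """Subtract the best one-sample linear prediction from each sample."""
    r0 = autocorr(signal, 0)
    a = autocorr(signal, 1) / r0 if r0 > 0.0 else 0.0   # prediction coefficient
    out = [signal[0]] + [signal[n] - a * signal[n - 1] for n in range(1, len(signal))]
    return out, a


def colored_noise(count, pole=0.9, seed=1):
    """Synthetic low-pass-shaped noise used only to exercise the whitener."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(count):
        x = pole * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


shaped = colored_noise(5000)
flattened, coefficient = whiten_first_order(shaped)
```

A flatter input spectrum generally helps an LMS adaptation converge more evenly across frequency, which is the motivation given above for placing a whitener ahead of the echo suppressor.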
Referring to
First, a large cone excursion can cause the voice coil 306 to leave the area of maximum magnetic flux density in the magnetic gap 308. For example, a large voltage swing can produce a large cone excursion causing the force per unit current to decrease, and thus causing the output to no longer be a linear function of the input. Mathematically, the magnet BL motor factor can become a function of displacement x, or BL=BL(x).
Second, a large cone excursion can cause the diaphragm's stiffness to increase. The increase can be due to the diaphragm's rolls 311 becoming “unrolled” for large excursions. Accordingly, more force can be required to move the diaphragm 307 a given distance. Again, this stiffness can cause the transducer output to no longer be linearly related to the input. Mathematically, the transducer's stiffness k becomes a function of displacement x, or k=k(x).
Third, a large cone excursion can cause the voice coil 306 to move out of the magnetic circuit created by the metal magnetic structure 304. Since the voice coil's inductance is a function of the metal 304 surrounding the voice coil 306, the inductance can become a function of the diaphragm's displacement x, which can produce non-linear distortion. Mathematically, the inductance L is a function of displacement x, or L=L(x).
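For context, the three displacement-dependent quantities BL(x), k(x), and L(x) can be placed in a commonly used lumped-parameter description of a moving coil transducer. The equations below are a standard textbook form and are not taken from the embodiments; the symbols u (drive voltage), i (coil current), R_e (coil resistance), m (moving mass), and R_m (mechanical resistance) are introduced here as assumptions, and the reluctance force term that depends on dL/dx is omitted for brevity.

```latex
% Electrical side: drive voltage across the voice coil
u(t) = R_e\, i(t) + \frac{d}{dt}\bigl[ L(x)\, i(t) \bigr] + BL(x)\, \frac{dx}{dt}

% Mechanical side: force balance on the diaphragm
BL(x)\, i(t) = m\, \frac{d^2 x}{dt^2} + R_m\, \frac{dx}{dt} + k(x)\, x
```

When BL, k, and L are constants, both equations are linear in x and i; the dependence of each on x is what makes the output a non-linear function of the input at large excursions.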
The three mentioned transducer mechanical non-linearities can each be represented using a general compression curve. For example, referring to plot 350, a compression curve for cone excursion 352, diaphragm stiffness 354, and magnetic induction 356 is shown. The x-axis represents the diaphragm displacement and the y-axis represents one of the three mechanical non-linearities. Referring to
A non-linear estimator can use one of the compression curves 350 to determine a level of applied correction, or distortion. For example, the first non-linear estimator 214 evaluates a cone excursion factor, a diaphragm stiffness factor, and a magnetic inductance factor using compression curves 352, 354, and 356 for each factor, respectively. The first non-linear estimator 214 looks along the x-axis of a compression curve 350 for the mapped diaphragm displacement, and identifies the associated y-axis point on the compression curve. The y-axis on the compression curve 350 describes the correction factor, or distortion level, to apply to the audio input signal 122 to account for the non-linear behavior of the transducer at the corresponding audio input voltage level. Similarly, the second non-linear estimator 218 evaluates an acoustic factor using its compression curve. For example, the compression curves 350 can relate a mapping between acoustic port sizes and effects on acoustic velocity. For instance, the acoustic jetting through ports can be represented as a compression curve, or function, of the acoustic velocity. The second non-linear estimator 218 can look up an acoustic velocity on the x-axis and identify a corresponding acoustic factor on the y-axis of the compression curve.
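As a non-limiting illustration of the look-up described above, the following Python sketch interpolates a correction factor from a tabulated compression curve and applies it as a gain. The curve values, the choice of linear interpolation with clamping at the ends, the use of the factor as a multiplicative gain on the displacement, and the names `lookup`, `EXCURSION_CURVE`, and `distort` are assumptions made for illustration; the embodiments do not give numeric curves.

```python
from bisect import bisect_right


def lookup(curve, x):
    """Linearly interpolate y at x on sorted (x, y) points, clamping at the ends."""
    xs = [point[0] for point in curve]
    ys = [point[1] for point in curve]
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)                 # index of the first point right of x
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical curve: correction factor versus normalized diaphragm displacement.
EXCURSION_CURVE = [(-1.0, 0.6), (-0.5, 0.9), (0.0, 1.0), (0.5, 0.9), (1.0, 0.6)]


def distort(displacement, curve=EXCURSION_CURVE):
    """Memory-less distortion: scale the sample by the factor read off the curve."""
    return displacement * lookup(curve, displacement)


small = distort(0.1)    # near the center of the curve, factor close to 1
large = distort(1.0)    # at the edge of the curve, factor 0.6
```

The same `lookup` helper could be reused with a separate curve indexed by acoustic velocity for the second non-linear estimator.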
Referring to
At step 401, the method 400 can start. At step 402, a transducer signal can be converted to a displacement signal that is proportional to a transducer cone displacement. The transducer signal can be an audio input voltage to a transducer. For example, referring to
At step 404, a correction can be applied to the displacement signal. For example, referring to
The first non-linear estimator 214 converts the speaker voltage to a displacement signal using a memory-less nonlinear corrector to predict the distortions generated by the speaker. A memory based system keeps a history whereas a memory-less system does not keep a history. For example, referring to
The memory-less based approach results in less computation time, with a faster convergence using an adaptive approach. Up until step 404, which includes the processing by the displacement unit 212 and first non-linear estimator 214 of
Accordingly, at step 406, a time derivative operator can be applied to the distorted displacement signal for producing a velocity signal. Referring to
At step 408, a second correction can be applied to the distorted displacement signal in view of the velocity estimate to account for at least one acoustic transducer non-linearity. For example, the non-linear acoustic jetting through at least one transducer port is an acoustic transducer non-linearity that is proportional to an instantaneous acoustic velocity. For example, the second non-linear estimator 218 calculates an instantaneous acoustic velocity from the velocity estimates provided by the differential operator 216, thereby operating on the acoustic velocities for modeling the acoustic nonlinearities. The second non-linear estimator 218 applies a second distortion to the velocity signal to model at least one acoustic transducer non-linearity; for example, a measure of the acoustic jetting in view of the instantaneous velocity.
At step 410, the distorted velocity signal is converted into an acceleration signal. For example, referring to
At step 412, the acceleration signal can be provided as input to an echo suppressor for suppressing an echo from a microphone input signal. For example, referring to
In summary, speaker distortions are dominated by non-linearities produced by and directly dependent upon large cone excursions. By converting the speaker voltage to a displacement signal, the memory-less non-linear corrections applied by the first 214 and second 218 non-linear estimators can predict distortions made by the speaker and correct for the non-linearities. Consequently, the non-linearities within the resulting whitened acceleration signal will be minimal, which improves the ability of the echo suppressor to model a linear transformation of the user environment. This can increase echo suppressor performance on the far end. The method 400, described by the processing blocks 212 to 220, provides a means for removing transducer non-linearities when a speaker is driven to produce non-linear distortions due to high volume levels. In the preferred embodiment, the method 400 and system 100 are used within the context of an echo suppressor to improve the convergence of the adaptation system within the echo suppressor. For example, referring to
While the preferred embodiments of the invention have been illustrated and described, it will be clear that the invention is not so limited. Numerous modifications, changes, variations, substitutions and equivalents will occur to those skilled in the art without departing from the spirit and scope of the present invention as defined by the appended claims.