The present disclosure is related to methods and apparatuses for Automatic Gain Control (AGC) of audio signals and various devices, such as communications devices, that incorporate such AGC methods and apparatuses for controlling the gain applied to the audio signals.
Usually, audio signal inputs including speech signal inputs originate from different sources and therefore may have varying levels of amplitude. This is problematic for telephony as well as other applications because a listener may experience an undesirable increase in volume level corresponding to an input signal having a high level of amplitude. Sudden decreases in volume levels are likewise undesirable.
An Automatic Gain Control (AGC) system is used to apply and maintain the needed gain dynamically to the speech/audio signals according to the specifications of the chosen audio playing or rendering equipment, for example a video telephone, speaker phone, mobile telephone, etc. An AGC system should be designed to pre-determine the specified level of amplitude dynamically without any sacrifice in audio/speech quality.
Thus for AGC systems in general, the dynamic range of the gain applied must be within the given specification of the audio rendering or playing equipment to be fed. The gain must be either incremented or decremented according to specific laws (a “gain law”) that prevent possible distortions in the quality of the speech due to, for example, amplitude variation. Further, an AGC system should prevent an excessive gain increment during prolonged intervals of silence as may exist in any speech/audio signal.
Existing AGC systems utilize gain characteristics that are based primarily on a linear characteristic. For such linear based AGC systems, the gain steps applied are either constant or increased or decreased based on empirical settings, that is, settings that have been developed over time via experimentation. None of the known linear based AGC systems appear to consider the various design requirements/considerations together, for example the dynamic range, distortion prevention and silence/noise intervals, to achieve smooth variations in the gain characteristics.
During decay period 115, the input signal 101 is at a low level as shown by its level with respect to the full scale (FS) vertical amplitude axis 111. The vertical amplitude along the vertical amplitude axis 111 is expressed in decibels full scale (dBFS) but is shown in
During the attack period 117, which represents a sudden increase in the input signal 101 amplitude, the gain characteristic 103 must decrement linearly towards the minimum preset value gmin 105. The slope of the gain characteristic 103 during the attack period 117 depends on the initial time t2 and final time t4 of the attack period 117.
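By way of illustration only, the linear attack behavior described above may be sketched as a straight-line interpolation between the gain at the start of the attack period and the minimum preset value gmin. The function name, the clamping outside the period and the numeric values are assumptions made for this sketch and are not taken from any existing AGC system.

```python
def linear_attack_gain(t, t_start, t_end, g_start, g_min):
    """Gain on a straight line from g_start at t_start to g_min at t_end."""
    if t_end <= t_start:
        raise ValueError("t_end must be later than t_start")
    # Clamp so the gain stays at the endpoint values outside the attack period.
    frac = min(max((t - t_start) / (t_end - t_start), 0.0), 1.0)
    return g_start + (g_min - g_start) * frac


# Hypothetical values: attack from t2 = 0.10 s to t4 = 0.20 s, gain 4.0 down to 1.0.
for t in (0.10, 0.15, 0.20):
    print(t, linear_attack_gain(t, 0.10, 0.20, 4.0, 1.0))
```

In this sketch the slope is (g_min - g_start) / (t_end - t_start), so it is fixed entirely by the gain endpoints and by the initial and final times of the attack period.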
During hold period 119 and hold period 121, which follow the ends of the attack period 117 and attack period 129, respectively, the gain characteristic 103 is held constant. For example, after the brief attack period 129, the gain is held constant during hold period 121 and does not increase again until the decay period 123.
A method and apparatus for Automatic Gain Control (AGC) that provides an output signal having a smooth gain variation for any input signal is disclosed herein. The various embodiments assure that the output signal will reach a required output level in a specified attack or decay period. A curvilinear reference gain characteristic is applied to the gain variations. One example uses an exponential curve and an initial and final value of the gain law for both attack and decay periods as presets. The various embodiments further compute intermediate gain values of the curvilinear paths to achieve the smooth variations in output signal level. Other possible curvilinear characteristics may be used such as, but not limited to, logarithmic, hyperbolic, sinusoidal, etc. that are expressible as exponentials. Intermediate gain values may be determined either by table lookup or by on the fly computation of the curvilinear characteristic.
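As a minimal sketch of the exponential example mentioned above, intermediate gain values between a preset initial gain and a preset final gain may be computed on a normalized exponential curve. The normalization, the shape parameter `k` and all names below are assumptions made for illustration; they are not the equations of the embodiments.

```python
import math


def curvilinear_gain(i, n_frames, g_initial, g_final, k=4.0):
    """Gain at frame i on an exponential path from g_initial to g_final.

    The path starts at g_initial (i = 0), ends at g_final (i = n_frames),
    and changes fastest at the beginning. k is a hypothetical shape parameter.
    """
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    i = min(max(i, 0), n_frames)
    floor = math.exp(-k)
    # Normalized reference: 1.0 at i = 0, falling to 0.0 at i = n_frames.
    ref = (math.exp(-k * i / n_frames) - floor) / (1.0 - floor)
    return g_final + (g_initial - g_final) * ref


# Hypothetical attack: gain falls from 4.0 to 1.0 over 64 frames.
for i in (0, 16, 32, 64):
    print(i, round(curvilinear_gain(i, 64, 4.0, 1.0), 4))
```

The same function covers a decay period by choosing a final gain larger than the initial gain.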
To facilitate the description of the various embodiments, the following abbreviations and symbolic parameters are used throughout the description and are defined herein as follows:
Thus the video telephony device 201 as illustrated in
The far end input 203 is processed by a decoder 207, which may be, for example, an adaptive multi-rate (AMR) decoder. However, any suitable decoder may be employed. The decoder output is coupled to a far end automatic gain control (AGC) module 209 of an overall AGC module 202. The AGC module 202 therefore may, in some embodiments, consist of the far end AGC module 209 and a near end AGC module 213. However, any suitable configuration may be employed; for example, the far end AGC module 209 may be physically separate from the near end AGC module 213 in some embodiments.
The far end AGC module 209 is further coupled to amplifiers 219 for the purpose of providing an output signal to, for example, speaker 221. Likewise the near end AGC module 213 is coupled to a noise reduction module 215 which is further coupled to an acoustic echo canceller 217. The acoustic echo canceller 217 receives a near end input signal from microphone 223 and amplifiers 219. The input signal to the amplifiers 219 may be produced by other suitable devices alternatively, or in addition to the microphone 223.
As shown in
In accordance with the embodiments, the far end AGC module 209 and the near end AGC module 213 forming the overall AGC module 202 both have attack and decay gain characteristics based upon a curvilinear function. Therefore, in some embodiments, far end AGC module 209 may further include a curvilinear gain module 227, a speech sample averager 229, a threshold module 231, a decay period gain module 233 and an attack period gain module 235. Likewise the near end AGC module 213 may further include a curvilinear gain module 237, a speech sample averager 239, a threshold module 241, a decay period gain module 243 and an attack period gain module 245.
The curvilinear gain modules 227, 237 compute initialization values and perform determination of attack and decay gain characteristics based on a number of audio attack and decay frames, an applicable sample rate and based upon a non-linear curvilinear function. The gain determinations may be made based upon computations or may be made using a lookup table 249 stored in memory 247 where the memory is coupled to, and accessible by, the far end AGC module 209 and the near end AGC module 213.
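The two options described above, table lookup and direct computation, may be sketched as follows. The frame-count formula, the 8 kHz sample rate, the exponential shape and every name in the block are assumptions made for illustration; the actual contents of lookup table 249 are not specified here.

```python
import math


def frames_in_period(period_ms, sample_rate_hz, frame_size):
    """Assumed formula: whole frames of frame_size samples spanning period_ms."""
    return max(1, round(period_ms * sample_rate_hz / (1000.0 * frame_size)))


def build_table(n_frames, curve):
    """Tabulate curve(u) for u = i / n_frames, with i = 0 .. n_frames."""
    return [curve(i / n_frames) for i in range(n_frames + 1)]


def exp_curve(u, k=4.0):
    """Hypothetical exponential shape: 1.0 at u = 0, 0.0 at u = 1."""
    floor = math.exp(-k)
    return (math.exp(-k * u) - floor) / (1.0 - floor)


# Assumed 8 kHz sample rate with a 16-sample frame.
n_attack = frames_in_period(128, 8000, 16)   # 128 ms attack time
n_decay = frames_in_period(4000, 8000, 16)   # 4000 ms decay time
attack_table = build_table(n_attack, exp_curve)

# Table lookup and on-the-fly computation yield the same reference value.
i = 10
print(attack_table[i], exp_curve(i / n_attack))
```

A table trades memory for per-frame arithmetic; because `build_table` accepts any callable, another curvilinear shape can be tabulated without changing the lookup.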
The memory 247 may also store empirically determined parameters 251 for use in initialization. The memory 247 may also store the code 253 for implementing the AGC processes on the multimedia or other processor 200 and by the AGC module 202. It is to be understood that the memory 247 may be integrated with the processor 200, or with the AGC module 202 or may be a separate component coupled to the processor 200 or to an AGC module 202. Therefore, any configuration or arrangement of components that provides the AGC system as herein described remains in accordance with the embodiments.
It will also be appreciated that AGC calculations and otherwise any processing of received signals may be performed in a dedicated device such as a receiver having a dedicated processor, a processor coupled to an analog processing circuit or receiver analog “front-end” with appropriate software for performing a receiver function, an application specific integrated circuit (ASIC), a digital signal processor (DSP), or the like, or various combinations thereof, as would be appreciated by one of ordinary skill. Therefore, any appropriate logic may be employed for the various embodiments wherein the logic may include one or more of hard wired circuits, processors, an ASIC, a DSP, etc. Memory devices such as memory 247 and/or processors such as processor 200 may further be provisioned with routines and algorithms for operating on input data and providing output such as operating parameters to improve the performance of the amplifiers and/or other processing blocks associated with, for example, reducing noise and echo, and otherwise appropriately handling the input signals.
The inventive functionality and inventive principles herein disclosed may be implemented with or in software or firmware programs or instructions and integrated circuits (ICs) such as digital signal processors (DSPs) or application specific ICs (ASICs) as is well known by those of ordinary skill in the art. Therefore, further discussion of such software, firmware and ICs, if any, will be limited to the essentials with respect to the principles and concepts used by the various embodiments.
Additionally, any of the various modules herein described may also be implemented via software or firmware programs or instructions and integrated circuits (ICs) such as digital signal processors (DSPs) or application specific ICs (ASICs) or by any appropriate logic wherein the logic may include one or more of hard wired circuits, processors, an ASIC, a DSP, etc. and/or software and firmware programs or instructions that may be run on such modules.
It will further be appreciated that devices such as video telephony device 201 are exemplary only and may refer to various other devices such as cellular or mobile phones, two-way radios, messaging devices, personal digital assistants, personal assignment pads, personal computers equipped for wireless operation, a cellular handset or device, or the like, or equivalents thereof provided such units are arranged and constructed for operation in accordance with the various inventive concepts and principles embodied in exemplary AGC systems herein described, and methods for, among other things, utilizing a curvilinear function to determine an attack period gain and a decay period gain as discussed and described herein.
The speech sample averagers 229, 239 compute an average amplitude of a set of speech samples to be used for comparison with threshold values by threshold modules 231, 241 to determine whether the input signal represented by the set of speech samples is in an attack period or a decay period. The attack or decay determination is then used to route the process to either the decay period gain module 233, 243 or the attack period gain module 235, 245, as appropriate, to determine the gain to apply.
Turning now to
Thus the high level operation of 408 may be described as follows. In 409 the average amplitude is compared with an upper and lower threshold. The upper and lower threshold parameters are determined for example, as shown in block 403. In 411 the AGC system uses the threshold values to determine if the speech sample is in an attack period or in a decay period. In 413, the AGC module 202 computes an appropriate gain to achieve a desired output level in light of whether the signal is in an attack or decay period. In 415 the gain is applied to the speech samples to produce the output signal.
As previously mentioned
If the second threshold is not exceeded then the gain remains at a constant level as shown in block 515. However, if the average of the amplitude of the speech samples is greater than the second threshold, a temporary gain value is computed as shown in block 517. Note that the gain computation accomplished in block 517 is independent of whether the input signal is in an attack period or in a decay period. Therefore, subsequently, as shown in 519, a determination is made as to whether the signal is indeed in an attack period or in a decay period.
If in 519 it is determined that the input signal is in a decay period, then an appropriate gain is computed to achieve the desired output level for that decay period as shown in 521. However, if the signal is found to be in an attack period, then as shown in block 523 an appropriate gain is computed to achieve the desired output level given that the input signal is in an attack period. In block 525, the appropriate gain as computed is applied to the samples of the input signal.
In 527 a fail safe level protection may be also applied to the input signal to, for example, accommodate a saturation level of the amplifiers. Therefore by fail safe level protection 527, the electronic equipment is protected from possible damage. In 529, the output signal is applied to the speaker or alternatively, may be buffered in memory. At that point the process returns to block 405 where a new set of speech samples is read.
For the AGC system of the embodiments, various user-selectable, that is, programmable, parameters are employed, including an attack threshold (θa, in dBFS), a clipping threshold (θc, in dBFS), a decay time and an attack time (both in ms).
In contrast to existing AGC systems as described with respect to
The reference gain characteristics applied by the various embodiments are based on a curvilinear law that not only generalizes the required gain decrements and increment during attack and decay periods respectively, but also may incorporate a linear type of characteristic, as exemplified by
For illustration and with respect to
The gain characteristics for various attack, decay and hold periods, in accordance with the embodiments are illustrated in
Therefore, an example of operation of the embodiments is illustrated beginning with
Variables to be used as indices may be initialized, for example, as i=0, Na=0 and Nd=0. Further, various parameters may be chosen such as an attack threshold “θa” (in decibels full scale, dBFS), a clipping threshold “θc” (in dBFS), a hold time “Th” (in ms), a decay time “Td” (in ms) and an attack time “Ta” (in ms). Exemplary values for these parameters are as follows: N=16, θa=0.2819=−11 dBFS, θc=0.3981=−8 dBFS, Th=50 ms, Ta=128 ms, Td=4000 ms.
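The exemplary thresholds above are given both as linear amplitudes relative to full scale and in dBFS. The relationship between the two forms may be sketched as follows, assuming the usual amplitude convention of 20·log10; the function names are illustrative only.

```python
import math


def dbfs_to_linear(level_dbfs):
    """Convert a level in dBFS to a linear amplitude (full scale = 1.0)."""
    return 10.0 ** (level_dbfs / 20.0)


def linear_to_dbfs(amplitude):
    """Convert a positive linear amplitude (full scale = 1.0) to dBFS."""
    if amplitude <= 0.0:
        raise ValueError("amplitude must be positive")
    return 20.0 * math.log10(amplitude)


print(round(dbfs_to_linear(-8.0), 4))    # clipping threshold, about 0.398
print(round(dbfs_to_linear(-11.0), 4))   # attack threshold, about 0.282
```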
In 403, initial pre-calculations are performed as follows:
Compute Na:
Considering the reference gain characteristic of the AGC system as exponential in nature, the computation of the exponential gain characteristic and its instantaneous values during the attack period Ta can be viewed as:
On using Taylor's series expansion given in equation (4), the reference exponential gain may be expressed in equation (3) and may be computed dynamically during the remaining steps of the algorithm. Typically, 10 coefficients are sufficient to compute fa. Mainly, equation (2) gives the procedure to calculate the reference gain at any instant of time 0 through Na. An example of a gain characteristic for attack mode in accordance with the embodiments is illustrated in
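A truncated Taylor series of the kind referred to above may be sketched as follows. The sketch evaluates e^x with a fixed number of terms and compares it with the library exponential; the argument range is an assumption, since the exponent used in the equations is not reproduced here.

```python
import math


def exp_taylor(x, terms=10):
    """Approximate e**x by the first `terms` terms of its Taylor series."""
    total = 0.0
    term = 1.0  # current term x**k / k!, starting with k = 0
    for k in range(terms):
        total += term
        term *= x / (k + 1)  # next term: x**(k+1) / (k+1)!
    return total


# Assumed argument range [-1, 0]; ten terms are then accurate to better than 1e-6.
for x in (0.0, -0.25, -0.5, -1.0):
    print(x, exp_taylor(x), math.exp(x))
```

Accuracy falls off as the magnitude of the argument grows, so a fixed count of ten terms is adequate only if the exponent stays small.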
The computations of 403 in
On substituting equation (9) in equation (8), the reference exponential gain may be computed dynamically during the remaining steps of the algorithm. Typically, 10 coefficients are sufficient to compute fd. Mainly, equation (8) gives the procedure to calculate the reference gain at any instant of time index i=0 through Nd. An example of a gain characteristic for decay mode in accordance with the embodiments is illustrated in
In 405, new speech samples x(n), n=0, 1, . . . , (N−1), are input and form the current frame. For the embodiments, the frame size N is chosen based on the required smoothness in accordance with the gain characteristics shown in
In 407, an average value
where "| |" denotes the absolute value of x(n). In 509 and 511, the computed average value is compared with thresholds.
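A minimal sketch of the averaging step follows, assuming the average is the arithmetic mean of the absolute sample values over the N samples of the current frame. The function name and the sample values are illustrative only.

```python
def frame_average(samples):
    """Mean of |x(n)| over one frame of samples (assumed form of the average)."""
    if not samples:
        raise ValueError("frame must contain at least one sample")
    return sum(abs(x) for x in samples) / len(samples)


# Hypothetical frame of four samples, full scale = 1.0.
print(frame_average([0.5, -0.5, 0.25, -0.25]))
```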
If
Returning to
For the various embodiments, the parameter θth1 is selected based on the perceivable human hearing level and may therefore be, for example, around θth1=−60 dBFS. The parameter θth2 is selected mainly to control the perceivable distortion during the transition from a long tail of low level speech in a signal to a sharp rise at a leading edge of a speech signal, so that excessive gain is not applied during these instances. An exemplary value for θth2 is about −30 dBFS.
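The comparison of a frame average against the two thresholds may be sketched as follows. The sketch only labels the band in which the average falls. The label names are hypothetical, and no action is asserted for frames at or below θth1.

```python
def dbfs_to_linear(level_dbfs):
    """Convert a level in dBFS to a linear amplitude (full scale = 1.0)."""
    return 10.0 ** (level_dbfs / 20.0)


THETA_TH1 = dbfs_to_linear(-60.0)  # exemplary first threshold
THETA_TH2 = dbfs_to_linear(-30.0)  # exemplary second threshold


def classify_frame(avg_amplitude, th1=THETA_TH1, th2=THETA_TH2):
    """Label the band in which the frame average lies (labels are hypothetical)."""
    if avg_amplitude <= th1:
        return "below_th1"
    if avg_amplitude <= th2:
        return "hold_gain"      # second threshold not exceeded: gain kept constant
    return "compute_gain"       # second threshold exceeded: temporary gain computed


for avg in (0.0005, 0.01, 0.1):
    print(avg, classify_frame(avg))
```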
From block 517 the AGC process continues via connection “C” and onto block 701 of
The following description provides further details of the processing of block 523 illustrated in
In block 703, if the previous frame status does not equal “decay,” that is, if the result of block 703 is false, then the process proceeds to block 709. If the result of block 709 is false, that is, if na<Na, then the process checks if δg<ga(i) in block 711. If na<Na and δg<ga(i), that is, if block 709 is false and block 711 is true, then in block 713, the process sets na=0, and further proceeds to block 717. However, if the result of block 709 is true, that is, if na>Na, then the process also proceeds to block 713, and sets na=0.
If na<Na and δg>ga(i), that is, if block 709 is false and block 711 is also false, then in block 715, the process sets na=na+1 and computes G(i)=δg+(gs×ga(i)) in block 719. In 721, if G(i)×
Returning to block 701, where the determination is made of whether an input signal is in an attack or decay period, if the result of block 701 is false, that is, if
If previous frame status in block 705 is “attack,” such that the result of block 705 is true, then the process moves to block 707 and sets nh=Nh. After block 707, or if the result of block 705 is false, then the process moves to block 725. In block 725, if the result is false, that is, if nd≦Nd the process moves to block 727 and checks δg. If the result of block 727 is also false, that is, if nd≦Nd in block 725 and if δg≧gr(i) in block 727, then in block 731 the process sets gd(i)=1.0.
In block 725, if the result is true, that is, if nd>Nd, the process moves to block 729, sets gr(i)=δg, computes gs=G(i−1)−gr(i) and sets nd=0. If the result of block 725 is false and the result of block 727 is true, that is, if nd≦Nd and δg<gr(i), the process also moves to block 729, sets gr(i)=δg, computes gs=G(i−1)−gr(i) and sets nd=0. After block 729 the process moves to block 731 and sets gd(i)=1.0.
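The effect of the restart step may be illustrated with a short sketch. It applies the update form G(i)=δg+(gs×g(i)) stated for block 719, using the restart quantities gr(i)=δg and gs=G(i−1)−gr(i). That the reference value subsequently falls from 1.0 toward 0.0 is an assumption of the sketch, as are the function names and numbers.

```python
def restart_ramp(prev_gain, target_gain):
    """Restart a ramp: return (g_r, g_s) from G(i-1) and the temporary gain."""
    g_r = target_gain        # gr(i) = δg
    g_s = prev_gain - g_r    # gs = G(i-1) - gr(i)
    return g_r, g_s


def ramp_gain(target_gain, g_s, ref):
    """Gain for one frame: G(i) = δg + gs * ref, with ref the reference value."""
    return target_gain + g_s * ref


# Hypothetical decay restart: previous gain 2.0, temporary gain δg = 3.5.
g_r, g_s = restart_ramp(2.0, 3.5)
for ref in (1.0, 0.5, 0.0):
    print(ref, ramp_gain(g_r, g_s, ref))
```

With a reference value of 1.0 the update returns the previous gain G(i−1), so the ramp starts without a jump; as the reference value approaches 0.0 the gain approaches δg.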
If nh>0 in block 733, then in block 739 the process sets nh=Nh−1 and proceeds to block 741. However, if the result of block 733 is false, that is, if nh≦0, then in block 735 the process sets nd=nd+1 and computes the gain G(i)=δg+(gs×gd(i)). After block 737 or 739 the process continues in block 741 and checks if (G(i)×
If in block 527, y(n)>θc, that is, if the clipping threshold is exceeded, then the process sets y(n)=θc, where n=0, 1, 2, . . . , (N−1). In 529 the output may be provided to a speaker, using appropriate digital-to-analog conversion processing or circuitry, or may be stored in memory, for example memory 247 as y(n), where n=0, 1, . . . (N−1).
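The fail safe level protection may be sketched as a limiter applied after the gain. The text above states only the upper limit, y(n)=θc when y(n)>θc; the symmetric limit for negative samples in the sketch is an assumption, as are the function name and sample values.

```python
def apply_gain_with_clip(samples, gain, clip_level):
    """Scale each sample by gain, then limit the result to +/- clip_level."""
    out = []
    for x in samples:
        y = gain * x
        if y > clip_level:
            y = clip_level       # stated rule: y(n) = θc when y(n) > θc
        elif y < -clip_level:
            y = -clip_level      # assumption: mirror the limit for negative samples
        out.append(y)
    return out


# Hypothetical frame with gain 2.0 and θc = 0.3981 (-8 dBFS).
print(apply_gain_with_clip([0.1, 0.3, -0.3], 2.0, 0.3981))
```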
At that point, the process checks whether the end of the speech/audio frame has been reached in 801 and if so, the AGC processing ends in 802. However, if the frame continues, then the process continues via connection “E”, which returns to block 405 of
Additionally, the above described exemplary embodiment is able to improve the crest factor with respect to an input speech/audio signal. The crest factor may be defined as the ratio of the peak value of an input signal to its root mean square (RMS) value.
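The crest factor as defined above may be computed as in the following sketch. The function name and the test signals are illustrative only.

```python
import math


def crest_factor(samples):
    """Ratio of the peak absolute value of a signal to its RMS value."""
    if not samples:
        raise ValueError("signal must contain at least one sample")
    peak = max(abs(x) for x in samples)
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    if rms == 0.0:
        raise ValueError("crest factor is undefined for an all-zero signal")
    return peak / rms


square = [1.0, -1.0, 1.0, -1.0]
sine = [math.sin(2.0 * math.pi * k / 1000) for k in range(1000)]
print(crest_factor(square))   # peak equals RMS for a square wave
print(crest_factor(sine))     # close to sqrt(2) for one full sine cycle
```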
Therefore, methods and apparatuses for automatic gain control have been disclosed with characteristics that yield good speech/audio quality. In the embodiments disclosed, the gain is adjusted based on curvilinear gain characteristics that assure a higher slope in the beginning and that increment or decrement the slope smoothly according to whether the signal is in an attack period or in a decay period, as applicable.
Number | Date | Country | Kind |
---|---|---|---|
867/CHE/2008 | Apr 2008 | IN | national |