1. Field of the Invention
The present invention relates to signal processing, and, more specifically but not exclusively, to techniques for attenuating relatively high-power signals in, for example, telephone communication networks.
2. Description of the Related Art
As used herein, the term “acoustic signal” refers to audible sound, while the term “audio signal” refers to electronic signals, such as the electronic signals generated by a microphone receiving an acoustic signal and the electronic signals converted by a loudspeaker into an acoustic signal. If the term “signal” is used without a qualifying adjective, it should be assumed to refer to an audio signal, not an acoustic signal.
In a telephone network, relatively high-power audio signals may be generated having signal values that are outside the range of values that may be represented digitally by digital processing in the telephone network. Relatively high-power audio signals commonly occur in the use of mobile phones. For example, a relatively high-power audio signal may be generated when a mobile phone user attempts to overcome a relatively high amount of background noise by moving the mobile phone's microphone closer to his or her mouth or speaking louder into the microphone.
When high-power audio signal values outside the range of digital representation are processed by digital processing in the network, the signal values are clipped such that (i) all signal values above the largest positive value that can be represented digitally are truncated to the largest positive value and (ii) all values below the smallest negative value that can be represented digitally are truncated to the smallest negative value. Clipping, which distorts the audio signal and results in decreased voice quality, may occur in various processing modules of the telephone network. For example, clipping may occur in an analog-to-digital converter, a codec, or a voice quality enhancement (VQE) module such as an acoustic echo control module or line echo canceller.
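The clipping behavior described above can be illustrated with a minimal Python sketch. The 16-bit signed range and the name `clip_sample` are assumptions made for illustration only; the text does not specify a word length.

```python
# Illustrative sketch only: saturating clip to an ASSUMED 16-bit signed range.
INT16_MIN = -32768  # assumed smallest representable value
INT16_MAX = 32767   # assumed largest representable value


def clip_sample(value: int, lo: int = INT16_MIN, hi: int = INT16_MAX) -> int:
    """Truncate a sample to the nearest representable value."""
    if value > hi:
        return hi  # case (i): above the largest positive value
    if value < lo:
        return lo  # case (ii): below the smallest negative value
    return value   # in range: passed through unchanged


if __name__ == "__main__":
    loud = [1000, 40000, -50000, -200]
    print([clip_sample(v) for v in loud])
```

Samples inside the range are unchanged, so distortion is introduced only for the out-of-range portion of the waveform.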
To reduce the effects of clipping in digital processing, high-power audio signals may be attenuated upstream of the digital processing using a high-level compensation (HLC) module. Generally, there are two types of prior-art HLC modules: analog modules implemented in hardware and digital modules implemented in hardware and/or software. There are a relatively large number of prior-art analog HLC modules; however, the number of prior-art digital HLC modules is relatively small. Analog HLC modules are discussed in U.S. Pat. No. 5,128,566 and U.S. Pat. No. 7,110,557, the teachings of both of which are incorporated herein by reference in their entirety. A discussion of prior-art digital HLC modules is found in U.S. Pat. No. 7,110,557.
Typical prior-art digital HLC modules attenuate high-power signals by converting signal levels from a linear domain representation into a logarithmic domain representation, applying threshold logic to the logarithmic domain representation as discussed below in relation to
where xlog is the logarithmic-domain representation of an input signal level magnitude value, x is the linear-domain representation of the input signal level magnitude value, and parameter c equals 20×log10(xmax). Note that, in the present example, where xmax is equal to one, parameter c is equal to zero. In other embodiments, where xmax is not equal to one, c may have a value other than zero.
The signal output by the HLC module has level magnitude values, represented in the same logarithmic domain (i.e., dB), that may range from −90 to a first magnitude threshold Tr1log, where magnitude threshold Tr1log corresponds to the maximum level magnitude that may be represented digitally by digital processing downstream of the HLC module. This range of output level magnitude values is represented on the y-axis of coordinate plane 100.
Attenuation of the input signal levels may be characterized by two linear transfer functions. The first linear transfer function y=m1x+b1, where m1=1 and b1=0, is plotted as first line segment 104 on coordinate plane 100 between point 102 having coordinates (−90, −90) and point 106 having coordinates (Tr2log, Tr2log), where Tr2log is a second magnitude threshold. The second linear transfer function y=m2x+b2, where slope m2<1 and y-intercept b2<0, is plotted as second line segment 108 on coordinate plane 100 between point 106 and point 110 having coordinates (|xmax,log|, Tr1log).
Input signal level magnitude values that are less than or equal to second level magnitude threshold Tr2log (i.e., relatively low magnitude values) are not attenuated as represented by the first linear function, which corresponds to first line segment 104. As shown, relatively low input signal level magnitude values are not attenuated since slope m1 of the first linear function equals 1 and y-intercept b1 equals 0. In other words, relatively low input signal level magnitude values are output from the HLC module unchanged.
Input signal level magnitude values that are greater than the second magnitude threshold Tr2log (i.e., relatively high magnitude values) are attenuated according to the second linear function, which corresponds to second line segment 108. Although the first and second line segments share a common endpoint (i.e., point 106), because the slopes of the two line segments are different, the change in slope at point 106 results in a sharp transition from the absence of attenuation in line segment 104 to the presence of attenuation in line segment 108. Such a sharp transition may degrade the quality of the input signal to an unacceptable level that is unpleasant to the listener.
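The two-segment prior-art characteristic described above may be sketched as follows. This is a minimal illustration, not the implementation of any cited patent: the function name, the threshold values used in the example, and the default `x_max_db = 0.0` (which corresponds to xmax equal to one) are assumptions.

```python
def prior_art_hlc_db(x_db: float, tr1_db: float, tr2_db: float,
                     x_max_db: float = 0.0) -> float:
    """Two-segment linear transfer function on levels expressed in dB.

    Levels at or below tr2_db pass unchanged (slope 1, intercept 0).
    Levels above tr2_db follow the straight line joining
    (tr2_db, tr2_db) and (x_max_db, tr1_db).
    """
    if not tr2_db < tr1_db < x_max_db:
        raise ValueError("expected tr2_db < tr1_db < x_max_db")
    if x_db <= tr2_db:
        return x_db                                # first segment: no attenuation
    m2 = (tr1_db - tr2_db) / (x_max_db - tr2_db)   # slope of second segment, 0 < m2 < 1
    b2 = tr2_db - m2 * tr2_db                      # y-intercept of second segment
    return m2 * x_db + b2


if __name__ == "__main__":
    # Hypothetical thresholds: Tr1log = -3 dB, Tr2log = -12 dB.
    for level in (-30.0, -12.0, -6.0, 0.0):
        print(level, prior_art_hlc_db(level, tr1_db=-3.0, tr2_db=-12.0))
```

The slope changes abruptly from 1 to m2 at the threshold, which is the sharp transition noted above.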
In one embodiment, the present invention is a machine-implemented method for processing a digital input audio signal. The method comprises (a) receiving the digital input audio signal and (b) applying a transfer function to the digital input audio signal to generate a digital output audio signal. The transfer function comprises a non-linear, attenuating portion, such that, when the non-linear, attenuating portion is applied to the digital input audio signal, the digital output audio signal is an attenuated version of the digital input audio signal.
In another embodiment, the present invention is a machine that processes a digital input audio signal. The machine is adapted to (a) receive the digital input audio signal and (b) apply a transfer function to the digital input audio signal to generate a digital output audio signal. The transfer function comprises a non-linear, attenuating portion, such that, when the non-linear, attenuating portion is applied to the digital input audio signal, the digital output audio signal is an attenuated version of the digital input audio signal.
In yet another embodiment, the present invention is a non-transitory machine-readable storage medium, having encoded thereon program code, wherein, when the program code is executed by a machine, the machine implements a method for processing a digital input audio signal, wherein the method comprises (a) receiving the digital input audio signal, and (b) applying a transfer function to the digital input audio signal to generate a digital output audio signal. The transfer function comprises a non-linear, attenuating portion, such that, when the non-linear, attenuating portion is applied to the digital input audio signal, the digital output audio signal is an attenuated version of the digital input audio signal.
Other aspects, features, and advantages of the present invention will become more fully apparent from the following detailed description, the appended claims, and the accompanying drawings in which like reference numerals identify similar or identical elements.
Reference herein to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments necessarily mutually exclusive of other embodiments. The same applies to the term “implementation.”
When clipping occurs in a line echo canceller, another problem, other than signal distortion, arises: the difference between the level of speech and the level of the corresponding hybrid echo decreases. To further understand this problem, consider
As depicted in
Incoming audio signal Rin is processed by high-level compensation (HLC) module 202, which attenuates incoming audio signal Rin as discussed in further detail below such that all samples of incoming signal Rin are within the range of values that may be represented digitally by line echo canceller 204. The processed incoming audio signal Rout is provided to hybrid 206, which may be implemented as a two-wire-to-four-wire converter that separates the upper and lower channels. Hybrid 206 routes (i) the processed incoming audio signal Rout to back end 208 for further processing (including rendering by the near end's loudspeaker (not shown)) and (ii) outgoing audio signal Sgen received from back end 208 (e.g., corresponding to audio signals generated by the near end's microphone (not shown)) toward the far end. Back end 208, which is part of the near-end user equipment, may include, among other things, the loudspeaker and the microphone of the user equipment.
In routing outgoing audio signal Sgen through hybrid 206, unwanted hybrid echo may be combined with outgoing audio signal Sgen to generate outgoing audio signal Sin having diminished quality. Line echo canceller 204 estimates the hybrid echo in signal Sin based on the incoming signal and cancels the hybrid echo when doubletalk is not occurring (i.e., when the near-end user and the far-end user are not talking at the same time, as determined by line echo canceller 204). When doubletalk is occurring, line echo canceller 204 does not cancel hybrid echo because doing so may distort the sounds generated at back end 208 that are represented in outgoing audio signal Sin.
Typically, line echo canceller 204 detects the occurrence of doubletalk by considering the level of the incoming audio signal, the level of outgoing audio signal Sin, and the difference in signal levels between incoming audio signal Rout and outgoing audio signal Sin (i.e., the echo return loss (ERL)). When the level of the incoming audio signal is above a first specified level threshold, the level of outgoing audio signal Sin is above a second specified level threshold, and the difference in signal levels is greater than a specified difference threshold (i.e., the level of incoming audio signal Rout is much greater than the level of outgoing audio signal Sin), line echo canceller 204 determines that doubletalk is not occurring.
For a moment, suppose that HLC module 202 is not implemented in near end 200. Further, suppose that (i) the incoming audio signal received by line echo canceller 204 and hybrid 206 is a relatively high-power audio signal (i.e., has signal values that are outside the range of values that may be represented digitally by line echo canceller 204) and (ii) doubletalk is not occurring (i.e., audio signals are not being generated at back end 208). In this situation, the incoming audio signal is passed through hybrid 206 without clipping, and hybrid echo represents most, if not all, of outgoing audio signal Sin. Outgoing audio signal Sin may have a level that is above the second specified level but within the range that may be represented digitally by line echo canceller 204. Therefore, outgoing signal Sin is not clipped by line echo canceller 204.
Now suppose that the incoming audio signal, which has relatively high power, is clipped by line echo canceller 204. Clipping the incoming audio signal but not outgoing audio signal Sin results in a decrease in the difference in signal levels between the incoming audio signal and outgoing audio signal Sin. If this decrease is significant enough, such that the difference between the incoming and outgoing audio signals is less than the specified difference threshold, then line echo canceller 204 may detect that doubletalk is occurring, even when it is not, and stop cancelling hybrid echo.
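The false-detection scenario can be made concrete with a small sketch of the three-condition test described above. All names, the threshold values, the signal levels, and the simplification that clipping limits the measured incoming level to 0 dB are hypothetical and chosen only for illustration.

```python
def doubletalk_absent(incoming_db: float, outgoing_db: float,
                      incoming_min_db: float, outgoing_min_db: float,
                      diff_min_db: float) -> bool:
    """Return True when the three conditions for 'no doubletalk' all hold."""
    return (incoming_db > incoming_min_db
            and outgoing_db > outgoing_min_db
            and (incoming_db - outgoing_db) > diff_min_db)


# Hypothetical levels, in dB relative to the largest representable magnitude.
FULL_SCALE_DB = 0.0
true_incoming_db = 6.0   # incoming signal exceeds the representable range
echo_db = -2.0           # hybrid echo level, within range
thresholds = dict(incoming_min_db=-30.0, outgoing_min_db=-40.0, diff_min_db=7.0)

# Without clipping, the level difference is 8 dB, above the 7 dB threshold.
unclipped = doubletalk_absent(true_incoming_db, echo_db, **thresholds)

# With the incoming level limited to full scale, the difference drops to 2 dB.
clipped = doubletalk_absent(min(true_incoming_db, FULL_SCALE_DB), echo_db, **thresholds)

if __name__ == "__main__":
    print(unclipped, clipped)
```

In this example the detector correctly reports no doubletalk for the unclipped level, but the clipped level fails the difference test, so a canceller using this logic might stop cancelling echo.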
To prevent this adverse effect of clipping (i.e., false detection of doubletalk), HLC module 202 attenuates incoming signal Rin such that all samples of incoming signal Rin are within the range of values that may be represented digitally by line echo canceller 204. Rather than attenuating samples of incoming signal Rin having relatively high level magnitudes (i.e., magnitudes greater than or equal to magnitude threshold Tr2log) according to a linear transfer function similar to that used to generate line segment 108 of
HLC module 202 does not attenuate relatively low input signal magnitude values (i.e., magnitude values less than or equal to second magnitude threshold Tr2log) as represented by the linear transfer function y=m3x+b3, where m3=1 and b3=0. The linear function is plotted as first line segment 304 on coordinate plane 300 between point 302 having coordinates (−90, −90) and point 306 having coordinates (Tr2log, Tr2log), where parameter Tr2log is a second magnitude threshold. However, unlike the prior-art HLC module discussed above in relation to
Curve 308 has six characteristics that are of particular interest.
In selecting a suitable non-linear function for implementing the “soft” attenuation of HLC module 202, the curve defined by the selected transfer function should satisfy characteristics #1, #2, and #3 and at least one of characteristics #4 to #6 in Table I (or Table II discussed below). As used herein, the term “soft attenuation” refers to attenuation that is based on a curve that satisfies characteristics #1, #2, and #3 and at least one of characteristics #4 to #6 in Table I (or Table II discussed below). Further, the terms “soft transfer function” and “soft non-linear transfer function” refer to a function that is characterized by a curve that satisfies characteristics #1, #2, and #3 and at least one of characteristics #4 to #6 in Table I (or Table II discussed below). Selection of a suitable non-linear transfer function is discussed in further detail below.
As shown on the x-axis of coordinate plane 500, the input signal level magnitude values, represented in the linear domain, range from zero to a maximum possible magnitude value |xmax| equal to one. Further, as shown on the y-axis of coordinate plane 500, the output signal level magnitude values, represented in the linear domain, range from zero to a first level magnitude threshold Tr1, which is a linear-domain representation of first magnitude threshold Tr1log of
A linear transfer function y=m5x+b5, where m5=1 and b5=0, characterizes the non-attenuation of relatively low input signal level magnitude values. The linear function is plotted as line segment 502 on coordinate plane 500 between the origin of coordinate plane 500 (i.e., (0, 0)) and point 504 having coordinates (Tr2, Tr2), where Tr2 is a linear-domain representation of second magnitude threshold Tr2log of
Note that, if a monotonically increasing, non-linear transfer function ƒ(x) is selected such that F(xlog)=g(ƒ(g^−1(xlog))), where g(x) is a function of the transition from the linear domain to the logarithmic domain (e.g., g(x)=log(x)), g^−1(xlog) is a function of the transition from the logarithmic domain to the linear domain (e.g., g^−1(xlog)=exp(xlog)), and non-linear transfer function F(xlog) is a direct image of non-linear transfer function ƒ(x), then transfer function F(xlog) will also increase monotonically. Additionally, if non-linear function ƒ(x) forms a convex upwards curve and has a derivative at Tr2log equal to 1, then non-linear transfer function F(xlog) will also form a convex upwards curve (i.e., d^2F/dx^2<0 for all x in interval [Tr2, xmax)) and have a derivative at point 504 equal to one (i.e., dF/dx|x=Tr2=1). Therefore, it is possible to perform “soft” attenuation in the linear domain and still achieve the same voice quality as when “soft” attenuation is performed in the logarithmic domain.
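The monotonicity and unit-slope statements can be checked with the chain rule. The following is a sketch under stated assumptions: g(x)=ln(x), ƒ is positive on the interval of interest, ƒ(Tr2)=Tr2, and the unit slope of ƒ is taken at the linear-domain threshold Tr2. The symbol u for the logarithmic-domain variable is introduced here for illustration only.

```latex
\begin{aligned}
F(u) &= \ln f\!\left(e^{u}\right), \\
\frac{dF}{du} &= \frac{e^{u}\, f'\!\left(e^{u}\right)}{f\!\left(e^{u}\right)}, \\
\left.\frac{dF}{du}\right|_{u=\ln \mathrm{Tr}_2}
  &= \frac{\mathrm{Tr}_2 \, f'(\mathrm{Tr}_2)}{f(\mathrm{Tr}_2)}
   = \frac{\mathrm{Tr}_2 \cdot 1}{\mathrm{Tr}_2} = 1 .
\end{aligned}
```

Because e^u and ƒ(e^u) are both positive, dF/du has the same sign as ƒ′, so F increases wherever ƒ does. The same result holds for g(x)=20×log10(x), since the constant factors introduced by g and its inverse cancel.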
$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0 \qquad (2)$$
where n is the degree of the polynomial and the values of coefficients a_n, a_{n-1}, ..., a_0 are unknown. When selecting a polynomial, a relatively low-degree polynomial is preferred because it may be implemented in HLC module 202 with lower complexity than a high-degree polynomial. For ease of explanation, suppose that a quadratic polynomial (i.e., n=2) is selected.
After selecting the type of non-linear function, steps 704 to 708 are performed to determine values for the coefficients (e.g., a_n, a_{n-1}, ..., a_0) that yield a non-linear function in which the two endpoint conditions of Table II and at least two of the four remaining conditions of Table II are satisfied. Steps 704 to 708 may be implemented by a computer executing a suitable computer program.
In step 704, the first and second magnitude thresholds (e.g., Tr1 and Tr2) are selected. These thresholds may be selected, for example, randomly or from a set of specified threshold values. In step 706, the coefficients (e.g., a_n, a_{n-1}, ..., a_0) are calculated based on the first and second magnitude thresholds. The coefficients are calculated based on equations for the coefficients that are determined by solving a system of equations that correspond to three or more of the characteristics in Table II. The system of equations may be selected by the designer based on the desired characteristics of the non-linear transfer function ƒ. For example, suppose that the designer selects a system of equations comprising the equations of the first, second, and fifth characteristics in Table II, where ƒ(x) is a quadratic polynomial as described above. Substituting magnitude thresholds Tr1 and Tr2 into those equations yields Equations (3), (4), and (5) as follows:
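$$a_2\,\mathrm{Tr}_2^{2} + a_1\,\mathrm{Tr}_2 + a_0 = \mathrm{Tr}_2 \qquad (3)$$
$$a_2 + a_1 + a_0 = \mathrm{Tr}_1 \qquad (4)$$
$$2a_2\,\mathrm{Tr}_2 + a_1 = 1 \qquad (5)$$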
Solving Equations (3) to (5) for coefficients a2, a1, and a0 yields Equations (6), (7), and (8) below:
Note that derivation of Equations (6) to (8) may be determined by the designer prior to implementing flow diagram 700 in software, and Equations (6) to (8) may be implemented in software to calculate values for the coefficients in step 706.
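Equations (6) to (8) are not reproduced in this text. As a hedged illustration, the sketch below uses closed forms obtained here by solving Equations (3) to (5) by hand; they are a derivation made for this example and are not quoted from the original equations. The function name and the sample thresholds are likewise assumptions.

```python
from typing import Tuple


def quadratic_coeffs_unit_slope(tr1: float, tr2: float) -> Tuple[float, float, float]:
    """Return (a2, a1, a0) satisfying Equations (3), (4), and (5).

    Closed forms derived for this sketch (assumption, not the source's text):
        a2 = -(1 - Tr1) / (1 - Tr2)^2
        a1 = 1 + 2*Tr2*(1 - Tr1) / (1 - Tr2)^2
        a0 = -(1 - Tr1) * Tr2^2 / (1 - Tr2)^2
    """
    if not 0.0 < tr2 < tr1 < 1.0:
        raise ValueError("expected 0 < tr2 < tr1 < 1")
    d = (1.0 - tr2) ** 2
    a2 = -(1.0 - tr1) / d
    a1 = 1.0 + 2.0 * tr2 * (1.0 - tr1) / d
    a0 = -(1.0 - tr1) * tr2 ** 2 / d
    return a2, a1, a0


if __name__ == "__main__":
    tr1, tr2 = 0.8, 0.5  # hypothetical thresholds
    a2, a1, a0 = quadratic_coeffs_unit_slope(tr1, tr2)
    print(a2, a1, a0)
    # Residuals of Equations (3), (4), and (5); each should be close to zero.
    print(a2 * tr2 ** 2 + a1 * tr2 + a0 - tr2)
    print(a2 + a1 + a0 - tr1)
    print(2 * a2 * tr2 + a1 - 1.0)
```

Substituting the returned values back into Equations (3) to (5) is a quick way to confirm that the derived forms are consistent with the system they came from.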
Once values for the coefficients have been determined, step 708 is performed to check whether or not one or more of the remaining characteristics in Table II are satisfied. The characteristics that are checked may be selected by the designer. For example, suppose that the designer selects characteristics #3 and #4 to check. The equations corresponding to characteristics #3 and #4 may be represented as shown in Equations (9) and (10), respectively, below:
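$$2a_2 x + a_1 \geq 0 \quad \text{for all } x \text{ in interval } [\mathrm{Tr}_2, 1] \qquad (9)$$
$$2a_2 \leq 0 \quad \text{for all } x \text{ in interval } [\mathrm{Tr}_2, 1] \qquad (10)$$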
In this case, it has been determined that characteristics #1 to #5 are not satisfied for all possible pairs of magnitude thresholds (Tr1, Tr2). However, characteristics #1 to #5 are satisfied for a subset of possible pairs (Tr1, Tr2), where 0<Tr2<Tr1<1. From Equation (9), it can be determined that characteristic #3 is satisfied when Tr2 ≤ 2Tr1 − 1. If the selected characteristics are satisfied, then processing is stopped. If the selected characteristics are not satisfied, then step 704 selects a new pair of magnitude thresholds, and steps 706 and 708 are repeated.
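The select, calculate, and check loop of steps 704 to 708 might be sketched as follows. This is an illustration under assumptions: the candidate threshold pairs are hypothetical, the function names are invented for this example, and the coefficient formulas are closed forms derived here from Equations (3) to (5) rather than taken from the text.

```python
from typing import Iterable, Optional, Tuple

Coeffs = Tuple[float, float, float]


def coeffs_from_thresholds(tr1: float, tr2: float) -> Coeffs:
    """Step 706: (a2, a1, a0) solving Equations (3) to (5) (derived here, assumed)."""
    d = (1.0 - tr2) ** 2
    a2 = -(1.0 - tr1) / d
    return a2, 1.0 - 2.0 * a2 * tr2, a2 * tr2 ** 2


def passes_checks(a2: float, a1: float, tr2: float) -> bool:
    """Step 708: Equations (9) and (10).

    The derivative 2*a2*x + a1 is linear in x, so testing both ends of
    the interval [tr2, 1] covers every x in between.
    """
    non_decreasing = (2.0 * a2 * tr2 + a1 >= 0.0) and (2.0 * a2 + a1 >= 0.0)
    curves_down = 2.0 * a2 <= 0.0
    return non_decreasing and curves_down


def select_thresholds(
    candidates: Iterable[Tuple[float, float]]
) -> Optional[Tuple[Tuple[float, float], Coeffs]]:
    """Try candidate (Tr1, Tr2) pairs in order until one passes the checks."""
    for tr1, tr2 in candidates:                    # step 704
        a2, a1, a0 = coeffs_from_thresholds(tr1, tr2)
        if passes_checks(a2, a1, tr2):
            return (tr1, tr2), (a2, a1, a0)
    return None                                    # no candidate was acceptable


if __name__ == "__main__":
    # (0.7, 0.6) violates Tr2 <= 2*Tr1 - 1; (0.8, 0.5) satisfies it.
    print(select_thresholds([(0.7, 0.6), (0.8, 0.5)]))
```

Drawing candidates from a fixed list is only one of the selection options mentioned above; a random draw could be substituted without changing the loop.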
As another example, suppose that, instead of selecting a system of equations comprising the equations corresponding to the first, second, and fifth characteristics of Table II as described above, the designer selects a system of equations comprising the equations corresponding to the first, second, and sixth characteristics. The equations corresponding to the first and second characteristics may be represented as shown in Equations (3) and (4) above, and an equation corresponding to the sixth characteristic may be represented as shown in Equation (11) below:
$$2a_2 + a_1 = 0 \qquad (11)$$
Solving Equations (3), (4), and (11) for coefficients a2, a1, and a0 yields Equations (12), (13), and (14) below:
Similar to Equations (6) to (8), the derivation of Equations (12) to (14) may be determined by the designer prior to implementing flow diagram 700 in software, and Equations (12) to (14) may be implemented in software to calculate values for the coefficients in step 706.
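Equations (12) to (14) are likewise not reproduced in this text. The sketch below uses closed forms obtained here by solving Equations (3), (4), and (11) by hand; they are an assumption of this example rather than a quotation, and the function name and sample thresholds are hypothetical.

```python
from typing import Tuple


def quadratic_coeffs_flat_end(tr1: float, tr2: float) -> Tuple[float, float, float]:
    """Return (a2, a1, a0) satisfying Equations (3), (4), and (11).

    Closed forms derived for this sketch (assumption, not the source's text):
        a2 = -(Tr1 - Tr2) / (1 - Tr2)^2
        a1 = 2*(Tr1 - Tr2) / (1 - Tr2)^2
        a0 = Tr1 - (Tr1 - Tr2) / (1 - Tr2)^2
    """
    if not 0.0 < tr2 < tr1 < 1.0:
        raise ValueError("expected 0 < tr2 < tr1 < 1")
    k = (tr1 - tr2) / (1.0 - tr2) ** 2
    return -k, 2.0 * k, tr1 - k


if __name__ == "__main__":
    tr1, tr2 = 0.8, 0.5  # hypothetical thresholds
    a2, a1, a0 = quadratic_coeffs_flat_end(tr1, tr2)
    print(a2, a1, a0)
    # Residuals of Equations (3), (4), and (11); each should be close to zero.
    print(a2 * tr2 ** 2 + a1 * tr2 + a0 - tr2)
    print(a2 + a1 + a0 - tr1)
    print(2 * a2 + a1)
```

With these forms the derivative 2·a2·(x − 1) is non-negative for every x up to 1 whenever Tr2 < Tr1, which agrees with the statement that the checks hold for all admissible threshold pairs.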
Once values for the coefficients have been calculated using Equations (12) to (14), one or more of the remaining characteristics in Table II are checked in step 708. For example, suppose that the designer selects characteristics #3 and #4 to check, which are represented in Equations (9) and (10) above, respectively. It has been determined that characteristics #1 to #4 and #6 are satisfied for all possible pairs of magnitude thresholds (Tr1, Tr2), where 0<Tr2<Tr1<1.
Initially, magnitude thresholds Tr1 and Tr2 are provided to coefficient determination block 808, which determines the values of coefficients a2, a1, and a0 based on magnitude thresholds Tr1 and Tr2. Coefficient determination block 808 may calculate coefficients a2, a1, and a0 using Equations (6), (7), and (8), respectively, or Equations (12), (13), and (14), respectively, upon receiving threshold values Tr1 and Tr2. Alternatively, coefficients a2, a1, and a0 may be calculated prior to operating HLC module 202 using Equations (6), (7), and (8), respectively, or Equations (12), (13), and (14), respectively, and may be retrieved from memory based on the values of magnitude thresholds Tr1 and Tr2.
Incoming signal Rin is provided to sign determination block 802 and magnitude determination block 804, one sample Rin(i) at a time. Sign determination block 802 determines the sign of each sample Rin(i) as shown in Equation (15) below:
$$z(i) = \mathrm{sign}\big(R_{\mathrm{in}}(i)\big) \qquad (15)$$
where sign( ) is a function that extracts the sign of Rin(i). Magnitude determination block 804 determines the magnitude of each sample Rin(i) as shown in Equation (16) below:
$$r(i) = \left|R_{\mathrm{in}}(i)\right| \qquad (16)$$
Block 806 compares each magnitude value r(i) to magnitude threshold Tr2. If a magnitude value r(i) is less than or equal to magnitude threshold Tr2 (i.e., r(i) is a relatively low magnitude), then block 812 calculates a non-attenuated magnitude value r0(i) according to the linear transfer function corresponding to line segment 502 of
$$r_0(i) = r(i) \qquad (17)$$
If a magnitude value r(i) is greater than magnitude threshold Tr2 (i.e., r(i) is a relatively high magnitude), then block 810 calculates an attenuated magnitude value r0(i) as shown in Equation (18) as follows:
$$r_0(i) = a_2\,r(i)^2 + a_1\,r(i) + a_0 \qquad (18)$$
Each magnitude value r0(i) calculated by block 812 and block 810 is provided to block 814, which applies the appropriate sign to the signal Rout as shown in Equation (19) below:
$$R_{\mathrm{out}}(i) = z(i) \times r_0(i) \qquad (19)$$
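The per-sample processing of Equations (15) to (19) can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the function name is hypothetical, samples are assumed to be normalized so that the maximum magnitude is one, a zero sample is assumed to take a positive sign, and the example coefficients are illustrative values that satisfy Equations (3) to (5) for Tr1=0.8 and Tr2=0.5.

```python
from typing import Iterable, List, Tuple


def soft_hlc(samples: Iterable[float], tr2: float,
             coeffs: Tuple[float, float, float]) -> List[float]:
    """Apply the piecewise transfer function to each sample in turn."""
    a2, a1, a0 = coeffs
    out = []
    for s in samples:
        z = -1.0 if s < 0 else 1.0           # Eq. (15): sign (zero treated as +1, assumed)
        r = abs(s)                           # Eq. (16): magnitude
        if r <= tr2:
            r0 = r                           # Eq. (17): low magnitude, unchanged
        else:
            r0 = a2 * r * r + a1 * r + a0    # Eq. (18): quadratic attenuation
        out.append(z * r0)                   # Eq. (19): restore the sign
    return out


if __name__ == "__main__":
    example_coeffs = (-0.8, 1.8, -0.2)       # illustrative values for Tr1=0.8, Tr2=0.5
    print(soft_hlc([0.25, -0.5, 0.75, 1.0, -1.0], 0.5, example_coeffs))
```

Only multiplications, additions, and one comparison are needed per sample, and no conversion to or from the logarithmic domain takes place.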
Compared to comparable prior-art digital HLC modules that perform attenuation of magnitude values represented in the logarithmic domain, the HLC module implemented by data flow diagram 800 of
Further, compared to comparable prior-art digital HLC modules that apply linear attenuation to relatively high magnitude values, HLC modules of the present invention may reduce the amount of distortion that results from attenuating relatively high magnitude values.
Although HLC module 202 of
Although
Further, although the present invention was described as applying a first linear transfer function (e.g., 304, 502) having a slope equal to one to input signal level magnitude values less than a specified threshold (e.g., Tr2log, Tr2) and a non-linear transfer function (e.g., 308, 506) to input signal level magnitude values greater than the specified threshold, the present invention is not so limited. Various embodiments of the present invention may be envisioned in which the linear transfer function has a slope other than one. Further, various embodiments of the present invention may be envisioned that do not apply a linear transfer function to input signal level magnitude values less than a specified threshold, but rather, apply a non-linear transfer function to all input signal level magnitude values. For example, in
The present invention may be implemented as circuit-based processes, including possible implementation as a single integrated circuit (such as an ASIC, an FPGA, or a digital signal processor), a multi-chip module, a single card, or a multi-card circuit pack. As would be apparent to one skilled in the art, various functions of circuit elements may also be implemented as processing blocks in a software program. Such software may be employed in, for example, a digital signal processor, micro-controller, general-purpose computer, or other processor.
The present invention can be embodied in the form of methods and apparatuses for practicing those methods. The present invention can also be embodied in the form of program code embodied in tangible media, such as magnetic recording media, optical recording media, solid state memory, floppy diskettes, CD-ROMs, hard drives, or any other non-transitory machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. The present invention can also be embodied in the form of program code, for example, stored in a non-transitory machine-readable storage medium including being loaded into and/or executed by a machine, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. When implemented on a general-purpose processor or other processor, the program code segments combine with the processor to provide a unique device that operates analogously to specific logic circuits.
Unless explicitly stated otherwise, each numerical value and range should be interpreted as being approximate as if the word “about” or “approximately” preceded the value or range.
It will be further understood that various changes in the details, materials, and arrangements of the parts which have been described and illustrated in order to explain the nature of this invention may be made by those skilled in the art without departing from the scope of the invention as expressed in the following claims.
The use of figure numbers and/or figure reference labels in the claims is intended to identify one or more possible embodiments of the claimed subject matter in order to facilitate the interpretation of the claims. Such use is not to be construed as necessarily limiting the scope of those claims to the embodiments shown in the corresponding figures.
It should be understood that the steps of the exemplary methods set forth herein are not necessarily required to be performed in the order described, and the order of the steps of such methods should be understood to be merely exemplary.
Although the elements in the following method claims, if any, are recited in a particular sequence with corresponding labeling, unless the claim recitations otherwise imply a particular sequence for implementing some or all of those elements, those elements are not necessarily intended to be limited to being implemented in that particular sequence.
The embodiments covered by the claims in this application are limited to embodiments that (1) are enabled by this specification and (2) correspond to statutory subject matter. Non-enabled embodiments and embodiments that correspond to non-statutory subject matter are explicitly disclaimed even if they fall within the scope of the claims.
Number | Date | Country | Kind |
---|---|---|---|
2011107922 | Mar 2011 | RU | national |