An RMS detector uses the concept of the k-NN (k-nearest neighbors) algorithm in order to obtain RMS values. An rms detector using a first-order regressor with a variable smoothing factor is modified to penalize samples far from the center of the data in order to obtain RMS values. Samples which vary greatly from the background noise levels, such as speech, scratch, wind, and other noise spikes, are dampened in the RMS calculation. When background noise changes, the system tracks the changes in background noise and includes them in the calculation of the corrected RMS value. A minimum tracker, which runs more often (e.g., two or three times as often) than in prior art detectors and methods, tracks the minimum rms value, which is used to compute a normalized distance value, which in turn is used to normalize the smoothing factor. From this data, a corrected or revised RMS value is determined as the previous RMS value multiplied by one minus the smoothing factor, plus the smoothing factor times the minimum rms value, to output the corrected RMS for the present invention. The rms value is also used to generate a reset signal for the minimum tracker, which avoids deadlock in the tracker, for example, when the background signal increases or decreases over time.

The present invention relates to an ambient noise Root Mean Square (RMS) level detector. In particular, the present invention is directed toward an improved noise RMS detector that is robust to speech presence, wind noise, and other sudden variations in noise levels.


A personal audio device, such as a wireless telephone, includes an adaptive noise canceling (ANC) circuit that adaptively generates an anti-noise signal from a reference microphone signal and injects the anti-noise signal into the speaker or other transducer output to cause cancellation of ambient audio sounds. An error microphone is also provided proximate the speaker to measure the ambient sounds and transducer output near the transducer, thus providing an indication of the effectiveness of the noise canceling. A processing circuit uses the reference and/or error microphone, optionally along with a microphone provided for capturing near-end speech, to determine whether the ANC circuit is incorrectly adapting or may incorrectly adapt to the instant acoustic environment and/or whether the anti-noise signal may be incorrect and/or disruptive and then takes actions in the processing circuit to prevent or remedy such conditions.

Examples of such adaptive noise cancellation systems are disclosed in published U.S. Patent Application 2012/0140943, published on Jun. 7, 2012, and in published U.S. Patent Application 2012/0207317, published on Aug. 16, 2012, both of which are incorporated herein by reference. Both of these references are assigned to the same assignee as the present application and name at least one inventor in common and thus are not prior art to the present application, but are provided to facilitate the understanding of ANC circuits as applied in the field of use.

Referring now to FIG. 1, a wireless telephone 10 in accordance with an embodiment of the present invention is shown in proximity to a human ear 5. Wireless telephone 10 includes a transducer, such as speaker SPKR, that reproduces distant speech received by wireless telephone 10, along with other local audio events such as ring tones, stored audio program material, injection of near-end speech (i.e., the speech of the user of wireless telephone 10) to provide a balanced conversational perception, and other audio that requires reproduction by wireless telephone 10, such as sources from web-pages or other network communications received by wireless telephone 10 and audio indications such as battery low and other system event notifications. A near-speech microphone NS is provided to capture near-end speech, which is transmitted from wireless telephone 10 to the other conversation participant(s).

Wireless telephone 10 includes adaptive noise canceling (ANC) circuits and features that inject an anti-noise signal into speaker SPKR to improve intelligibility of the distant speech and other audio reproduced by speaker SPKR. A reference microphone R is provided for measuring the ambient acoustic environment and is positioned away from the typical position of a user's/talker's mouth, so that the near-end speech is minimized in the signal produced by reference microphone R. A third microphone, error microphone E, is provided in order to further improve the ANC operation by providing a measure of the ambient audio combined with the audio reproduced by speaker SPKR close to ear 5, when wireless telephone 10 is in close proximity to ear 5. Exemplary circuit 14 within wireless telephone 10 includes an audio CODEC integrated circuit 20 that receives the signals from reference microphone R, near speech microphone NS, and error microphone E and interfaces with other integrated circuits such as an RF integrated circuit 12 containing the wireless telephone transceiver.

In general, the ANC techniques measure ambient acoustic events (as opposed to the output of speaker SPKR and/or the near-end speech) impinging on reference microphone R, and by also measuring the same ambient acoustic events impinging on error microphone E, the ANC processing circuits of illustrated wireless telephone 10 adapt an anti-noise signal generated from the output of reference microphone R to have a characteristic that minimizes the amplitude of the ambient acoustic events at error microphone E. Since acoustic path P(z) (also referred to as the Passive Forward Path) extends from reference microphone R to error microphone E, the ANC circuits are essentially estimating acoustic path P(z) combined with removing effects of an electro-acoustic path S(z) (also referred to as the Secondary Path) that represents the response of the audio output circuits of CODEC IC 20 and the acoustic/electric transfer function of speaker SPKR including the coupling between speaker SPKR and error microphone E in the particular acoustic environment, which is affected by the proximity and structure of ear 5 and other physical objects and human head structures that may be in proximity to wireless telephone 10, when wireless telephone 10 is not firmly pressed to ear 5.

Such adaptive noise cancellation (ANC) systems may employ a Root Mean Square (rms) detector to detect average background noise levels. Such an RMS detector needs to track background noise levels slowly but not so slowly as to become insensitive to environmental variations. An ideal RMS detector should be robust to speech presence, robust to scratching (contact) on the microphone, robust to wind noise, and have a low computational complexity. For the purposes of describing the present ambient noise RMS detector, the lower case rms variable is utilized to refer to the prior art techniques and the upper case RMS to represent the corrected signal of the present ambient noise RMS detector, as set forth below. The present ambient noise RMS detector may utilize the prior art rms value in generating the RMS signal.

Perhaps the most well-known background noise estimation method, based on minimum statistics, is the rms detector introduced by Rainer Martin. See Martin, Rainer, Noise Power Spectral Density Estimation Based on Optimal Smoothing and Minimum Statistics, IEEE Transactions on Speech and Audio Processing, Vol. 9, No. 5, July 2001, incorporated herein by reference, as well as Martin, Rainer, Spectral Subtraction Based on Minimum Statistics, in Proc. 7th EUSIPCO '94, Edinburgh, U.K., Sep. 13-16, 1994, pp. 1182-1195, also incorporated herein by reference. Israel Cohen has developed another rms detector based on the Martin design. See Cohen, Israel, Noise Spectrum Estimation in Adverse Environments: Improved Minima Controlled Recursive Averaging, IEEE Transactions on Speech and Audio Processing, Vol. 11, Issue 5, September 2003, incorporated herein by reference, as well as Cohen, Israel, Noise Estimation by Minima Controlled Recursive Averaging for Robust Speech Enhancement, IEEE Signal Processing Letters, Vol. 9, No. 1, January 2002, also incorporated herein by reference. Both the Martin and Cohen methods and designs employ a method to track the minimum rms value. Both methods also use a first-order regressor with a variable smoothing factor.

The Cohen design may be less complex than the Martin design and may provide better performance. The Cohen design depends on a couple of thresholds and parameters that should be adjusted for different applications. The Cohen design also uses less memory than the Martin design, in which previous values of rms are kept to find the minimum value. The problem with the Cohen design is that it is susceptible to non-stationary noise such as spike noise. For example, when used in an adaptive noise cancellation (ANC) system on a cellular phone or the like, spike noise such as wind noise or scratching (a user's/talker's hand scratching or rubbing the case) may create spikes to which the Cohen design would over-react. As a result, the performance of an ANC system, for example, in a cellular telephone or the like, may be degraded, as the rms detector over-reacts to these spike noises.

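A simple rms detector based on a first-order regression may produce an output illustrated in FIG. 2. This first-order regression may be calculated as shown in equation (1):

$$\mathrm{rms}(n) = (1-\alpha)\cdot \mathrm{rms}(n-1) + \alpha\cdot\lvert \mathrm{input}(n)\rvert, \qquad \alpha = \begin{cases}\alpha_{att}, & \lvert \mathrm{input}(n)\rvert > \mathrm{rms}(n-1)\\ \alpha_{dec}, & \text{otherwise}\end{cases} \tag{1}$$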

where α represents a smoothing factor, rms(n) represents the rms value for sample n, input(n) represents the input signal for sample n, and n is an integer sample index. Thus, the rms value in equation (1) is calculated by multiplying one minus the smoothing factor by the previous rms value and then adding the absolute value of the input value multiplied by the same smoothing factor. The smoothing factor α may be selected from one of two values, αatt or αdec, depending on whether the absolute value of the input signal is greater or less than the previous rms value.
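
By way of non-limiting illustration, the first-order regression of equation (1) may be sketched in Python as follows. The function name `simple_rms`, the default values of `alpha_att` and `alpha_dec`, and the initial value `rms0` are assumptions made for this example only and are not taken from the prior art references.

```python
def simple_rms(samples, alpha_att=0.01, alpha_dec=0.1, rms0=0.0):
    """Sketch of the first-order regressor of equation (1).

    alpha_att is used when |input(n)| exceeds the previous rms value,
    alpha_dec otherwise. The default values are illustrative assumptions.
    """
    rms = rms0
    out = []
    for x in samples:
        mag = abs(x)
        # Select the attack or decay smoothing factor for this sample.
        alpha = alpha_att if mag > rms else alpha_dec
        # rms(n) = (1 - alpha) * rms(n-1) + alpha * |input(n)|
        rms = (1.0 - alpha) * rms + alpha * mag
        out.append(rms)
    return out


if __name__ == "__main__":
    # A steady low-level input followed by a short spike.
    signal = [0.1] * 50 + [1.0] * 5 + [0.1] * 50
    print(simple_rms(signal)[-1])
```

Because every sample is fed directly into the average, a spike in the input raises the output of such a detector, which is the behavior discussed next.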

The problem with such a simple rms detector is that it not only tracks background noise, but also speech, scratch, and wind noise. As illustrated in FIG. 2, the outer darker line 210 represents a speech signal, with occasional spike noise 220 as shown. The lighter line 230 represents the rms signal, calculated with a slow attack and fast decay, as shown in Equation (1). As can be seen in FIG. 2, the rms value 230 calculated using Equation (1) ends up tracking these spike signals 220, which may be undesirable for an adaptive noise cancellation (ANC) circuit. By tracking the spike signals 220, the ANC circuit may end up generating inappropriate anti-noise, and as a result, create artifacts in the reproduced audio signal for the user.


The present ambient noise RMS detector represents an improvement over the prior art rms detector from an adaptive or machine learning perspective. The present ambient noise RMS detector uses the concept of a k-NN (classifying using nearest neighbors) algorithm in order to obtain RMS values. The k-nearest neighbor algorithm (k-NN) is a method for classifying objects based on closest training examples in the feature space. k-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common amongst its k nearest neighbors (k is a positive integer, typically small). If k=1, then the object is simply assigned to the class of its nearest neighbor.

The same method can be used for regression, by simply assigning the property value for the object to be the average of the values of its k nearest neighbors. It can be useful to weight the contributions of the neighbors, so that the nearer neighbors contribute more to the average than the more distant ones. (A common weighting scheme is to give each neighbor a weight of 1/d, where d is the distance to the neighbor. This scheme is a generalization of linear interpolation.)
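
By way of non-limiting illustration, distance-weighted k-NN regression with the 1/d weighting described above may be sketched in Python as follows. The function name `knn_regress`, the use of scalar feature values, and the small constant `eps` (which guards against division by zero when the query coincides with a neighbor) are assumptions made for this example only.

```python
def knn_regress(points, query, k=3, eps=1e-12):
    """Predict a value at `query` as the 1/d-weighted average of the
    values of its k nearest neighbors.

    points: list of (x, value) pairs with scalar x (an assumption made
    to keep the sketch short).
    """
    # Keep the k points closest to the query.
    nearest = sorted(points, key=lambda p: abs(p[0] - query))[:k]
    # Nearer neighbors receive larger weights (weight = 1 / distance).
    weights = [1.0 / (abs(x - query) + eps) for x, _ in nearest]
    total = sum(weights)
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / total


if __name__ == "__main__":
    data = [(0.0, 0.0), (2.0, 2.0), (10.0, 50.0)]
    # With k=2 the distant point at x=10 is ignored.
    print(knn_regress(data, 0.5, k=2))
```

With two neighbors on either side of the query, the 1/d weighting reduces to linear interpolation between them, which is the sense in which the scheme generalizes linear interpolation.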

The present invention incorporates a prior art rms detector using a first-order regressor with a variable smoothing factor but adds additional features to penalize samples far from the center of the data in order to obtain RMS values. Thus, samples which vary greatly from the background noise levels, such as speech, scratch, and other noise spikes, are dampened in the RMS calculation. However, when background noise increases or decreases (changes in general), the system will track this change in background noise and include it in the calculation of the corrected RMS value.

Output from a prior art rms detector using a first-order regressor with a variable smoothing factor is fed to a minimum tracker, which is also known in the art. The minimum tracker tracks the minimum rms value, Rmin, over time. This revised minimum value is used to compute a normalized distance value d, which is the absolute value of the difference between the previously calculated rms value and the RMS value calculated in the present ambient noise RMS detector, divided by the RMS value calculated by the present ambient noise RMS detector. This value d in turn is used to normalize the smoothing factor α by dividing the smoothing factor by the maximum of d and 1.

Once these values are calculated, a corrected or revised RMS value can be determined as the previous RMS value multiplied by one minus the smoothing factor, plus the smoothing factor times the minimum rms value, to output the corrected RMS for the present ambient noise RMS detector. The rms value may be used to generate a reset signal for the minimum tracker. This reset signal may be issued at intervals on the order of 0.1 to 1 seconds and is used to avoid deadlock in the tracker, for example, when the background signal increases over time.

The effect of the present ambient noise RMS detector, as demonstrated in the Figures attached herewith, is to provide a background RMS value which is largely immune from sudden spikes in value, such as due to speech, “scratching” (when a person physically touches the microphone, for example), or wind noise, particularly when compared to the prior art techniques.

While discussed herein in the context of cellular telephones and adaptive noise cancellation circuits used therein, the present ambient noise RMS detector has applications for a number of audio devices and the like. For example, the RMS detector of the present invention may be applied to audio and audio-visual recording equipment, computing devices equipped with microphones, speech recognition systems, speech activated systems (e.g., in automobiles), and even event detectors, such as alarm systems, where it may be desirable to filter background sounds from sudden noises, such as glass-break or speech by intruders. While disclosed in the context of cellular phones and adaptive noise cancellation circuits, the present ambient noise RMS detector should in no way be construed as being limited to that particular application.


The present ambient noise RMS detector improves upon the techniques of prior art rms detectors such as taught by Martin and Cohen by using an improved algorithm in the RMS detector. FIG. 3 is a block diagram of the present ambient noise RMS detector. Referring to FIG. 3, a raw rms value is calculated from the input signal using known prior art techniques. Blocks 110, 120, and 130 are elements of a first-order regressor with a variable smoothing factor. The input signal, which in this instance may be a background noise signal with speech, is fed to block 110 where the absolute value of the signal is taken. This absolute value signal in turn is fed to low-pass-filter 120 and then to downsampler 130. The net effect is to output a raw rms value such as described above in connection with Equation (1). As these first three elements of the block diagram are known in the art, they will not be described in further detail.
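
By way of non-limiting illustration, the chain of blocks 110, 120, and 130 may be sketched in Python as follows. The specification does not give the filter type or the downsampling ratio, so the one-pole low-pass filter, its coefficient `alpha`, the `decimation` factor, and the function name `raw_rms_frontend` are all assumptions made for this example only.

```python
def raw_rms_frontend(samples, alpha=0.05, decimation=4):
    """Sketch of blocks 110 (absolute value), 120 (low-pass filter),
    and 130 (downsampler).

    A one-pole low-pass filter and a fixed decimation factor are
    assumed here; neither is specified in the description.
    """
    state = 0.0
    out = []
    for n, x in enumerate(samples):
        # Block 110 takes the magnitude; block 120 smooths it.
        state = (1.0 - alpha) * state + alpha * abs(x)
        # Block 130 keeps one filtered value per `decimation` samples.
        if (n + 1) % decimation == 0:
            out.append(state)
    return out


if __name__ == "__main__":
    print(raw_rms_frontend([0.2, -0.2] * 8, alpha=0.5, decimation=4))
```

Each value emitted by such a front end would correspond to one raw rms value supplied to the blocks that follow.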

Both the Martin and Cohen methods and designs discussed above also employ a method to track the minimum rms value, Rmin, and tracking the minimum rms value is one function of the present ambient noise RMS detector. Speech, scratching (physical contact) on the microphone, wind noise, and any spike noise are all unlike background noise in that they are not always present but appear as noise spikes in the ambient noise signal. This fact can be leveraged by comparing a short-term minimum RMS value with a long-term one to determine whether such a spike has occurred. FIG. 4 is a graph illustrating how the minimum RMS value is tracked. For every instantaneous transition, short-term rms values Rmin and Rtmp may be calculated as:

$$R_{min}(l) = \min\{R_{min}(l-1),\ \mathrm{rms}(l)\}, \qquad R_{tmp}(l) = \min\{R_{tmp}(l-1),\ \mathrm{rms}(l)\} \tag{2}$$

where Rmin is the minimum rms value over time, and Rtmp is a temporary minimum rms value to track background noise changes.

The reset mechanism for the ambient noise detector is then calculated simultaneously with equation (2). This reset mechanism calculates a long-term rms value every 0.1 to 1 seconds for values Rmin and Rtmp as:

$$R_{min}(l) = \min\{R_{tmp}(l-1),\ \mathrm{rms}(l)\}, \qquad R_{tmp}(l) = \mathrm{rms}(l) \tag{3}$$

As illustrated in FIG. 4, this approach has the effect of delaying the change in minimum RMS value Rmin in response to changes in the base rms calculation of background noise rms value BK rms. As the background rms signal increases from level A to level B, the temporary minimum value Rtmp, calculated according to Equations (2) and (3) above, rises from level A to level B, delayed over time, as illustrated in FIG. 4. The value of minimum RMS value Rmin rises from level A to level B delayed even further (the same is true for decreasing from level B to level A), as illustrated in FIG. 4. Although FIG. 4 only shows the case where level A is less than level B, the same effect occurs when level A is greater than level B as well.
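
By way of non-limiting illustration, one plausible reading of Equations (2) and (3), following the minima-controlled scheme of the cited Cohen references, may be sketched in Python as follows. The class name `MinimumTracker`, the `reset_period` parameter (expressed in rms frames rather than seconds), and the choice to initialize both minima from the first rms value are assumptions made for this example only.

```python
class MinimumTracker:
    """Sketch of a minimum tracker with periodic reset.

    Between resets, Rmin and Rtmp each follow the running minimum of
    the raw rms value. On a reset, Rmin takes the smaller of Rtmp and
    the current rms, and Rtmp restarts from the current rms.
    """

    def __init__(self, reset_period):
        self.reset_period = reset_period  # in rms frames (assumed unit)
        self.r_min = None
        self.r_tmp = None
        self.count = 0

    def update(self, rms):
        if self.r_min is None:
            # First frame: start both minima from the current rms.
            self.r_min = self.r_tmp = rms
            return self.r_min
        self.count += 1
        if self.count >= self.reset_period:
            # Reset step: hand the temporary minimum over to Rmin.
            self.r_min = min(self.r_tmp, rms)
            self.r_tmp = rms
            self.count = 0
        else:
            # Short-term step: running minima.
            self.r_min = min(self.r_min, rms)
            self.r_tmp = min(self.r_tmp, rms)
        return self.r_min


if __name__ == "__main__":
    tracker = MinimumTracker(reset_period=3)
    levels = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]  # level A, then level B
    print([tracker.update(v) for v in levels])
```

In this sketch, a step from level A to level B reaches Rtmp at the first reset after the step and reaches Rmin only at the following reset, which is consistent with the two successive delays described in connection with FIG. 4.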

In Cohen's method, it may be possible to calculate RMS from this minimum RMS value Rmin using a first approach based on the probability of the presence of a disturbance in the background noise signal:

$$p(l) \to 1 \;\Rightarrow\; \alpha_d(l) \to 1 \tag{4}$$

Here, p(l) is the probability of the presence of any disturbance (e.g., speech presence), and as this probability approaches one, the smoothing factor value approaches one. This probability value may be calculated as follows:

$$p(l) = \alpha_p\cdot p(l-1) + (1-\alpha_p)\cdot I(l), \qquad I(l) = \begin{cases}1, & \mathrm{rms}(l)/R_{min}(l) > \delta\\ 0, & \text{otherwise}\end{cases} \tag{5}$$

where αp represents a smoothing factor, and δ is the threshold which determines the level of any disturbance compared to Rmin(l).
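
By way of non-limiting illustration, a disturbance probability of this kind may be sketched in Python as follows, assuming the recursive averaging of a threshold indicator used in the cited Cohen references. The function name `disturbance_probability`, the default values of `alpha_p` and `delta`, and the zero initial probability are assumptions made for this example only.

```python
def disturbance_probability(rms_values, r_min_values, alpha_p=0.2, delta=5.0):
    """Sketch of a smoothed disturbance-presence probability p(l).

    An indicator is set when the raw rms exceeds delta times the
    tracked minimum, and the indicator is smoothed recursively with
    factor alpha_p. Default values are illustrative assumptions.
    """
    p = 0.0
    out = []
    for rms, r_min in zip(rms_values, r_min_values):
        # Compare the raw rms with the minimum, scaled by the threshold.
        indicator = 1.0 if rms > delta * r_min else 0.0
        p = alpha_p * p + (1.0 - alpha_p) * indicator
        out.append(p)
    return out


if __name__ == "__main__":
    rms = [1.0, 10.0, 10.0, 1.0]
    r_min = [1.0, 1.0, 1.0, 1.0]
    print(disturbance_probability(rms, r_min, alpha_p=0.5))
```

The sketch exposes two tunable quantities, `alpha_p` and `delta`, in addition to the smoothing factor of the detector itself.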

One problem with this RMS tracking technique is that there are too many parameters to adjust. In addition, its reaction time is slow and it is not robust: speech rms can leak into the background RMS value. While the prior art Cohen design has additional components to make the system more robust, the system still suffers from these same operational problems. Thus, the present ambient noise RMS detector improves on the algorithms of equations (4) and (5) to provide an improved minimum RMS value Rmin tracking technique and RMS calculation.

Referring back to FIG. 3, in the present ambient noise RMS detector, the output raw rms value is then fed to a minimum tracker 140. In block 150, the normalized distance d between the current RMS and the instantaneous rms value is computed as:

$$d = \frac{\lvert \mathrm{rms}(l) - \mathrm{RMS}(l-1)\rvert}{\mathrm{RMS}(l-1)} \tag{6}$$

where rms(l) is the raw rms value for sample l and RMS(l−1) is the most recent corrected RMS value.

In block 160, the smoothing factor is normalized with this distance d:

$$\alpha_d(l) = \frac{\alpha_0}{\max(d,\ 1)} \tag{7}$$

where αd(l) represents the normalized smoothing factor for sample l, α0 represents a standard smoothing factor, and max(d, 1) is the maximum of the normalized distance and 1. The normalized smoothing factor is then fed to block 170:

$$\mathrm{RMS}(l) = (1-\alpha_d(l))\cdot \mathrm{RMS}(l-1) + \alpha_d(l)\cdot R_{min}(l) \tag{8}$$

where RMS(l) is the corrected RMS value, and RMS(l−1) is a previous corrected RMS value, αd(l) represents the normalized smoothing factor for sample l as calculated in equation (7) and minimum RMS value Rmin is the minimum rms value calculated in equation (3).
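
By way of non-limiting illustration, one update of blocks 150, 160, and 170 may be sketched in Python as follows. The sketch assumes that the distance of equation (6) is measured against the previous corrected RMS value, so that each update depends only on values already available. The function name `corrected_rms_step`, the default value of `alpha0`, and the small `floor` that guards against division by zero are assumptions made for this example only.

```python
def corrected_rms_step(rms, prev_rms_corrected, r_min, alpha0=0.1, floor=1e-12):
    """Sketch of one corrected-RMS update per equations (6) to (8).

    Returns the new corrected RMS, the normalized distance d, and the
    normalized smoothing factor alpha_d.
    """
    # Equation (6): distance of the raw rms from the corrected RMS,
    # normalized by the corrected RMS (previous value assumed).
    d = abs(rms - prev_rms_corrected) / max(prev_rms_corrected, floor)
    # Equation (7): distant samples shrink the smoothing factor.
    alpha_d = alpha0 / max(d, 1.0)
    # Equation (8): move the corrected RMS toward the tracked minimum.
    new_rms = (1.0 - alpha_d) * prev_rms_corrected + alpha_d * r_min
    return new_rms, d, alpha_d


if __name__ == "__main__":
    # A raw rms four times the corrected RMS gives d = 3, so the
    # smoothing factor is reduced to one third of alpha0.
    print(corrected_rms_step(rms=4.0, prev_rms_corrected=1.0, r_min=2.0, alpha0=0.5))
```

When the raw rms is within a factor of two of the corrected RMS, d does not exceed 1 and the standard smoothing factor is used unchanged; larger excursions, such as speech or scratch spikes, reduce the smoothing factor in proportion to their distance.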

The raw rms value is also fed to block 190, which then generates a reset signal Reset. The reset signal Reset is triggered in order to reset the system to avoid any deadlock, for example, when the background noise signal rises gradually. The reset mechanism is shown in equation (3) as discussed previously.

FIGS. 4-6 are graphs illustrating the operation of the present ambient noise RMS detector. In FIG. 5A, the instantaneous rms and ambient RMS are shown for a sample input signal comprising background noise with speech. In FIG. 5A, the background noise appears as the baseline signal 510 and the speech portion appears in the center as the elevated portion 520. The instantaneous rms appears as the thick line (510, 520), while the final calculated ambient RMS appears as the thin line 530 below the thick line. In FIG. 5B, the value α is shown calculated from the instantaneous rms according to equation (7) above and block 160 in FIG. 3. FIG. 5C shows the calculation of d according to equation (6) above and block 150 of FIG. 3. FIG. 5D shows the resulting minimum RMS value Rmin as determined from equation (8) above and block 170 of FIG. 3.

FIG. 6 is a graph comparing a signal containing background noise with speech, showing a comparison between the old method of the prior art and the technique and apparatus of the present invention. The rms(l) signal is shown as the wide dark signal 610 in FIG. 6 with the speech disturbance 620 in the central portion. The rms calculation using the prior art method is shown as the wavy light line 630 in the center of that signal. As shown in FIG. 6, spikes occur in this signal in relationship to the source signal. As illustrated in FIG. 6, the prior art technique is sensitive to speech in the background noise signal. The bottom line 640 represents the RMS value calculated using the technique of the present ambient noise RMS detector. As illustrated in FIG. 6, the technique of the present ambient noise RMS detector is far less responsive to transient spikes than the prior art technique.

FIG. 7 is a graph comparing a signal containing background noise 710 with a scratch signal 720 in the background noise, and showing a comparison between the old method of the prior art and the technique and apparatus of the present ambient noise RMS detector. The scratch signals 720 are more pronounced than the speech signals 620 of FIG. 6. The rms(l) signal is shown as the wide dark signal 710 in FIG. 7. The rms calculation using the prior art method is shown as the wavy light line 730 in the center of that signal. As shown in FIG. 7, spikes 720 occur in this signal in relationship to the source signal 710. The bottom line 740 represents the RMS value calculated using the technique for the present ambient noise RMS detector. As illustrated in FIG. 7, the technique for the present ambient noise RMS detector is far less responsive to transient spikes than the prior art technique.

The present ambient noise RMS detector has thus been shown, as illustrated in FIGS. 6 and 7, to calculate RMS values from an input signal more accurately than the prior art technique, while being relatively immune to speech, wind noise, scratch, and other signal spikes. This improved RMS value calculation provides a better input value for an adaptive noise cancellation (ANC) circuit for use, for example, in a cellular telephone or the like. This improved value in turn allows for better operation of the ANC circuit, creating fewer artifacts or less dropped-out audio (e.g., due to the ANC circuit overcompensating and muting desired audio signals) in the audio output to the user.

While embodiments of the present ambient noise RMS detector have been disclosed and described in detail herein, it may be apparent to those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope thereof.

