The present invention relates generally to a pressurized air delivery system coupled to a communication system and more specifically to removing periodic noise from an audio signal generated therein.
Good, reliable communications among personnel engaged in hazardous environmental activities, such as fire fighting, are essential for accomplishing their missions while maintaining their own health and safety. Working conditions may require the use of a pressurized air delivery system such as, for instance, a Self Contained Breathing Apparatus (SCBA) mask and air delivery system. However, even while personnel are using such pressurized air delivery systems, it is desirable that good, reliable communications be maintained and personnel health and safety be effectively monitored.
Depending upon the type of air delivery system 110 being used, the system 110 may provide protection to a user by, for example: providing the user with clean breathing air; keeping harmful toxins from reaching the user's lungs; protecting the user's lungs from being burned by superheated air inside of a burning structure; and providing protection to the user from facial and respiratory burns. Moreover, in general the mask is considered a pressure demand breathing system because air is typically only supplied when the mask wearer inhales.
Communication system 130 typically includes a conventional microphone 132 that is designed to record the speech of the mask wearer and that may be mounted inside the mask, outside and attached to the mask, or held in the hand over a voicemitter port (a thin metal plate designed to pass speech sounds from inside the mask to the outside with minimal attenuation) on the mask 112. Communication system 130 further includes a communication unit 134 such as a two-way radio that the mask wearer can use to communicate his speech, for example, to other communication units. The mask microphone device 132 may be connected directly to the radio 134 or through an intermediary electronic processing device 138. This connection may be through a conventional wire cable (e.g., 136), or could be done wirelessly using a conventional RF, infrared, or ultrasonic short-range transmitter/receiver system. The intermediary electronic processing device 138 may be implemented, for instance, as a digital signal processor and may contain interface electronics, audio amplifiers, and battery power for the device and for the mask microphone.
There are some shortcomings associated with the use of systems such as system 100. These limitations will be described, for ease of illustration, by reference to the block diagram of
Returning to the shortcomings of systems such as system 100, an example of such a shortcoming relates to the generation by these systems of loud acoustic noises as part of their operation. More specifically, these noises can significantly degrade the quality of communications, especially when used with electronic systems such as radios. One such noise that is a prominent audio artifact introduced by a pressurized air delivery system, like a SCBA system, is the low-air alarm noise, which is illustrated in
The low-air alarm (LAA) noise occurs as a low frequency, periodic, pulsatile harmonic-rich broadband noise generated by an alarm device coupled to the pressurized air delivery system (
Thus, there exists a need for methods and apparatus for effectively detecting and attenuating low-air alarm noise that corrupts audio communication in a system that includes a pressurized air delivery system coupled to a communication system.
The accompanying figures, where like reference numerals refer to identical or functionally similar elements throughout the separate views and which together with the detailed description below are incorporated in and form part of the specification, serve to further illustrate various embodiments and to explain various principles and advantages all in accordance with the present invention.
Before describing in detail embodiments that are in accordance with the present invention, it should be observed that the embodiments reside primarily in combinations of method steps and apparatus components related to a method and apparatus for removing periodic noise pulses in an audio signal. Accordingly, the apparatus components and method steps have been represented where appropriate by conventional symbols in the drawings, showing only those specific details that are pertinent to understanding the embodiments of the present invention so as not to obscure the disclosure with details that will be readily apparent to those of ordinary skill in the art having the benefit of the description herein. Thus, it will be appreciated that for simplicity and clarity of illustration, common and well-understood elements that are useful or necessary in a commercially feasible embodiment may not be depicted in order to facilitate a less obstructed view of these various embodiments.
It will be appreciated that embodiments of the invention described herein may be comprised of one or more generic or specialized processors (or “processing devices”) such as microprocessors, digital signal processors, customized processors and field programmable gate arrays (FPGAs) and unique stored program instructions (including both software and firmware) that control the one or more processors to implement, in conjunction with certain non-processor circuits, some, most, or all of the functions of the method and apparatus for removing periodic noise pulses in an audio signal. The non-processor circuits may include, but are not limited to, transmitter apparatus, receiver apparatus, and user input devices. As such, these functions may be interpreted as steps of a method for removing periodic noise pulses in an audio signal described herein. Alternatively, some or all functions could be implemented by a state machine that has no stored program instructions, or in one or more application specific integrated circuits (ASICs), in which each function or some combinations of certain of the functions are implemented as custom logic. Of course, a combination of the two approaches could be used. Both the state machine and ASIC are considered herein as a “processing device” for purposes of the foregoing discussion and claim language.
Moreover, an embodiment of the present invention can be implemented as a computer-readable storage element having computer readable code stored thereon for programming a computer (e.g., comprising a processing device) to perform a method as described and claimed herein. Examples of such computer-readable storage elements include, but are not limited to, a hard disk, a CD-ROM, an optical storage device and a magnetic storage device. Further, it is expected that one of ordinary skill, notwithstanding possibly significant effort and many design choices motivated by, for example, available time, current technology, and economic considerations, when guided by the concepts and principles disclosed herein will be readily capable of generating such software instructions and programs and ICs with minimal experimentation.
Generally speaking, pursuant to the various embodiments, a method, an apparatus and a computer-readable storage element for removing periodic noise pulses from a continuous audio signal generated in a pressurized air delivery system are disclosed. In general, the method comprises the steps of: detecting, in a time-windowed segment of the continuous audio signal, a plurality of the periodic noise pulses having a pulse period and being representable in the form of a plurality of signal components combined by convolution; deconvolving the plurality of signal components to generate a plurality of deconvolved signal components; and removing at least a portion of the periodic noise pulses from the time-windowed segment of the continuous audio signal using the deconvolved signal components. The method further beneficially includes an add/overlap process in accordance with the teachings herein to attenuate most, if not substantially all, of the periodic noise pulses in the audio signal.
In an embodiment, the audio signal is output from a microphone (e.g., a microphone signal) that is part of a communication system coupled to the pressurized air delivery system. The microphone signal includes speech and may also include low-air alarm (LAA) noise pulses. The microphone signal is digitized, and the samples assembled into frames of specified length (e.g., the time-windowed segment of the microphone signal). Each frame of digitized data is then processed for detecting presence of some of the LAA noise pulses, and if present, a pulse period of the LAA noise pulses within the frame is determined (which corresponds to a fundamental period of a pulse train characterizing the noise pulses). The digitized data frame is then processed to transform the sample data signal from the time domain into the cepstral domain, wherein the signal includes a primary noise pulse component and an impulse train component. In the cepstral domain the signal component due to the impulse train is separated and removed from the composite cepstral signal using a process of cepstral filtering. With the impulse train component removed, the remaining cepstral signal containing the primary pulse and any speech is converted back into the time domain. The processed frames of the now time domain signal are then re-synthesized into a continuous signal. The Add/Overlap process further removes the primary pulses from each frame of data during the re-synthesis process to generate an audio signal that is virtually free of the LAA (or other periodic) noise. Those skilled in the art will realize that the above recognized advantages and other advantages described herein are merely exemplary and are not meant to be a complete rendering of all of the advantages of the various embodiments of the present invention.
Before describing in detail the various aspects of the present invention, it would be useful in the understanding of the invention to provide a more detailed description of the low-air alarm noise that was briefly described above. As can be seen in
Referring now to
Method 500 can be implemented in various device locations in system 100 using a processing device such as, for instance, a digital signal processor (DSP). This DSP could be included, for example, in the radio communication device 134, the microphone 132 or another device, e.g., 138 external to the radio and the microphone or a combination of the three. The device further includes a suitable interface for receiving the audio signal, which could be wired (e.g., a cable connection) or wireless. Moreover, method 500 could be implemented as a computer-readable storage element having computer readable code stored thereon for programming a computer (e.g., comprising a processing device) to perform the method.
In one embodiment, the noise being detected and eliminated is LAA noise generated by device 122, which corrupts the speech. However, the teachings herein are not limited to that particular noise, but are also applicable to other periodic noise having characteristics that are similar to that of the LAA noise, wherein the noise can be modeled as signal components that are combined by convolution in the time domain. As such, other alternative implementations of processing different types of periodic noises are contemplated and are within the scope of the various teachings herein.
The method, in accordance with this embodiment of the present invention, when implemented to eliminate LAA noise is also referred to herein as the CANA (Cepstral Alarm Noise Attenuator) method. The basis of the CANA method for eliminating air regulator low-air alarm noise is that the continuous alarm noise can be thought of as the convolution of a single alarm pulse waveform of arbitrary shape with an impulse train having a given periodicity. Through the use of spectral filtering and deconvolution (e.g., cepstral) methods, the periodic pulse component can be separated from the basic pulse waveform and removed leaving only the initial attenuated basic pulse waveform and any concurrent speech signal. An additional aspect of the CANA method is the employment of a unique pulse-period-synchronous add-overlap method to eliminate the remnant pulse waveform and re-synthesize a continuous output waveform.
A more detailed block diagram of an exemplary implementation of a CANA method 600 is shown in
The basic methodology of the CANA method 600 can be summarized as follows. In an embodiment, block 610, A/D Conversion and Input Data Buffering, samples a continuous analog audio signal from a pressurized air breathing apparatus microphone (e.g., as illustrated in
The Data Frame Assembler 730 extracts data from the circular buffer of up to 1024 samples, for instance, and constructs and outputs an analysis frame 740 from the buffer data for further processing. The signal analysis frame size is based on the LAA pulse period by making each analysis frame length equal to, for example, at least twice a calculated pulse period 850 (as determined by module 630 of
Block 630 of the CANA method (600) is a detector to detect the presence of the noise and determine the alarm pulse period. An exemplary implementation of this block is detailed in
Referring back to the structure of a low-air alarm noise in
The low-air alarm noise detector 630 operates by trying to find and verify the low frequency harmonics of the signal that are below typical speech pitch frequencies. It accomplishes this using both frequency and time-signal energy analyses. The current analysis data frame 740, which has a duration that is slightly greater than twice the LAA pulse period, ensures that at least two alarm pulses are present for processing by noise detector 630. Detector 630 comprises a low-pass filter 810, an FFT (Fast Fourier Transform) block 820, a harmonic peak search block 830 and a pulse period determination block 840 that outputs the estimated LAA noise pulse period 850.
The detector 630 operates by first filtering the analysis frame signal 740 with a 100 Hz low-pass filter (810) and down-sampling the result within a predefined limit, for example, from about 8 kHz to about 250 Hz. Since we are generally only interested in detecting periodicities of the LAA signal, and since the LAA fundamental pulse frequency and first few harmonics are less than 100 Hz, we only examine frequencies in this range. Thus we save computation by down-sampling the signal to 250 Hz, which is greater than twice the 100 Hz bandwidth and allows using a 256-point FFT. The energy of the time domain down-sampled signal is then determined by squaring each sample, and the average energy of the frame is determined. A 256-point FFT of the energy signal is taken (820) to determine a power spectrum, $|S(i,k)|^2$. This FFT size gives about 1 Hz (250 Hz/256 pts = 0.977 Hz/pt) of frequency resolution. Note that k is a frequency index covering the range of 0 to 125 Hz. The harmonic peak search block 830 searches the power spectrum to locate at least two maximum spectral energy peaks satisfying one or more predefined parameters, wherein the located energy peaks correspond to two detected LAA noise pulses. In an embodiment, the one or more parameters include a maximum periodicity threshold and a minimum energy threshold.
For example, a search is done (830) through each frequency bin of the sampled power spectrum $|S(i,k)|^2$ for a maximum spectral energy peak in a range from 20 Hz to 50 Hz (19<k<51). The bin energy $I_{pk}(0,k)$ and corresponding frequency $f_0$ (0.9765k) are stored. Next, a maximum energy peak in a range from 40 Hz to 100 Hz, $I_{pk}(1,k)$, and its corresponding frequency $f_1$ are found. If the frequency of the second peak satisfies a maximum periodicity threshold, e.g., is within +/−5 Hz of twice the frequency of the first peak, and if both peaks exceed a minimum energy threshold, $E_t$, determined as a percentage of the average spectral energy $|S(i,k)|^2_{avg}$, presence of the alarm noise is assumed and an “alarm present” detection flag AF can be set to 1 (840) to indicate such a detection. The alarm pulse period 850 is determined (840) from the frequency of the fundamental spectral energy peak as $T(i) = 1/f_0$. This pulse period information (850) is used by blocks 620 and 640 as shown in
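The peak search and period estimate described above can be sketched as follows. This is a minimal illustration rather than the implementation of blocks 830 and 840: it assumes the power spectrum is already available as a list of 129 bins (0 to 125 Hz), and the names `detect_alarm`, `peak_in_band` and the `energy_factor` value are hypothetical, since the text gives the energy threshold only as a percentage of the average spectral energy.

```python
BIN_HZ = 250.0 / 256.0  # bin spacing of a 256-point FFT at a 250 Hz sample rate


def peak_in_band(power, k_lo, k_hi):
    """Return (bin index, energy) of the largest bin with k_lo <= k <= k_hi."""
    k = max(range(k_lo, k_hi + 1), key=lambda i: power[i])
    return k, power[k]


def detect_alarm(power, tol_hz=5.0, energy_factor=2.0):
    """Return (alarm flag AF, pulse period in seconds or None).

    energy_factor is an assumed value; the threshold E_t is modelled here as
    energy_factor times the average spectral energy.
    """
    threshold = energy_factor * sum(power) / len(power)
    k0, e0 = peak_in_band(power, 20, 50)    # fundamental, about 20 to 50 Hz
    k1, e1 = peak_in_band(power, 40, 100)   # second peak, about 40 to 100 Hz
    f0, f1 = k0 * BIN_HZ, k1 * BIN_HZ
    harmonic_ok = abs(f1 - 2.0 * f0) <= tol_hz
    energy_ok = e0 > threshold and e1 > threshold
    if harmonic_ok and energy_ok:
        return 1, 1.0 / f0
    return 0, None


# Synthetic spectrum: flat floor with peaks at bins 31 and 62 (about 30 and 60 Hz).
spectrum = [1.0] * 129
spectrum[31] = 100.0
spectrum[62] = 60.0
alarm_flag, pulse_period = detect_alarm(spectrum)
```

With the synthetic spectrum above, the second peak sits at exactly twice the first, so the flag is set and the period comes out near 33 ms. A flat spectrum fails the energy test and leaves the flag at 0.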
Examples of the alarm detector 630 signals and outputs are shown in
In accordance with the teachings herein, embodiments of the invention can use a signal processing deconvolution technique (such as cepstral deconvolution, for instance) to process periodic noise signals included in an audio signal generated in a pressurized air delivery system, where the periodic noise pulses can be represented as two or more signals or signal components combined by convolution in the time domain. The LAA noise signal is an example of such a signal having periodic noise pulses that can be viewed in this manner. Thus, in an embodiment, the CANA method uses cepstral deconvolution to deconvolve a primary pulse shape from a periodic impulse train of subsequent pulses in the LAA signal and remove (or substantially attenuate) the impulse train component, leaving only the primary pulse shape. The primary pulse shape can itself also be removed (or substantially attenuated) by further processing (in block 650 described below in further detail). The fundamental mathematics behind this procedure will now be presented. The discussion below is limited to cepstral deconvolution signal processing for illustrative purposes only and is not meant to limit the scope of the teachings herein. Other deconvolution techniques such as, for example, spectral root homomorphic deconvolution are included within the scope of these teachings.
Consider a suitable length frame of a sampled microphone output of a pressurized air delivery system as depicted in
where s(n) is the composite alarm signal, x(n) is the impulse response of an arbitrary digital filter having a magnitude and phase response that describes the shape of the primary pulse, and $x(n - n_k)$ are the subsequent pulses, copies of the primary pulse, delayed in time by $n_k$ samples and having amplitudes of $\alpha_k$. Thus, the low-air alarm signal can be viewed as the convolution of the primary pulse shape (also referred to herein as a primary noise pulse component) with an impulse train p(n) (also referred to herein as a noise impulse train component):
where δ(n) is an impulse occurring at time n. Since the primary alarm pulse waveform is related to subsequent pulses by convolution, they may be separated, in theory, using a deconvolution process such as cepstral deconvolution to generate a deconvolved primary noise pulse cepstrum component and a deconvolved noise impulse train cepstrum component.
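Based only on the definitions given above, the general model can be written out as the following sketch. The symbol K, the number of subsequent pulses in the frame, is a label introduced here for illustration and does not appear in the text.

```latex
\begin{align}
s(n) &= x(n) + \sum_{k=1}^{K} \alpha_k \, x(n - n_k), \\
p(n) &= \delta(n) + \sum_{k=1}^{K} \alpha_k \, \delta(n - n_k), \\
s(n) &= x(n) * p(n).
\end{align}
```

Substituting the second line into the third and using $x(n) * \delta(n - n_k) = x(n - n_k)$ recovers the first line, which is why the alarm noise can be treated as a convolution of a single pulse shape with an impulse train.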
For the case of a windowed segment of a continuous signal containing only two low-air alarm pulses, and ignoring the effect of the window for the moment, the mathematical representation can be written as,
$s(n) = x(n) * p(n),$
$p(n) = \delta(n) + \alpha_1 \delta(n - n_1).$ Eq. 4
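As a small numerical illustration of the two-pulse case, the sketch below convolves a short pulse with a two-impulse train and shows that the result is the pulse followed by a scaled, delayed copy. The pulse values, the amplitude 0.5 and the delay of 5 samples are arbitrary choices made for this example.

```python
def convolve(a, b):
    """Direct linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out


x = [1.0, 0.6, 0.2]          # primary pulse shape x(n), arbitrary example values
alpha1, n1 = 0.5, 5          # amplitude and delay of the second pulse (assumed)

p = [0.0] * (n1 + 1)         # impulse train p(n) = delta(n) + alpha1 * delta(n - n1)
p[0] = 1.0
p[n1] = alpha1

s = convolve(x, p)           # composite signal s(n) = x(n) * p(n)
```

The first three samples of `s` reproduce `x`, and the samples starting at index `n1` hold `alpha1` times `x`, matching s(n) = x(n) + α1 x(n − n1).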
Taking the Fourier transform of Equation 4 we get the frequency domain representation:
$S(e^{j\omega}) = X(e^{j\omega}) P(e^{j\omega}),$
$S(e^{j\omega}) = X(e^{j\omega}) (1 + \alpha_1 e^{-j\omega n_1}).$ Eq. 5
To compute the cepstrum of this signal we first calculate the complex logarithm of Equation 5:
$\log[S(e^{j\omega})] = \log[X(e^{j\omega})] + \log[P(e^{j\omega})],$
$\log[S(e^{j\omega})] = \log[X(e^{j\omega})] + \log[1 + \alpha_1 e^{-j\omega n_1}].$ Eq. 6
Thus, the convolution of the primary pulse and the impulse train has been transformed into a multiplication by the Fourier transform and further into an addition by the complex logarithm operation. Calculation of the complex logarithm requires a continuous phase signal. Since the FFT operation produces a discontinuous phase component (modulo 2π radians), a process of “phase unwrapping” is applied to the phase. This procedure is well known in the art and amounts to adding appropriate multiples of 2π radians to the disjointed phase segments. By applying the inverse Fourier transform to Equation 6 we transform the signal into the so-called “cepstral” domain and get,
$c(n) = \hat{x}(n) + \hat{p}(n),$
where c indicates the complex “cepstrum” of the composite signal and the ^ superscript has been added to the variables to indicate the domain change, wherein $\hat{x}(n)$ is the deconvolved primary noise pulse cepstrum component, and $\hat{p}(n)$ is the deconvolved noise impulse train cepstrum component.
In the cepstral domain the abscissa unit is time, and the convolved time domain signal components are additive. Note that the time windowing multiplication of the original signal is a convolution in the frequency domain and appears as frequency smearing of the components but does not affect their additive nature in the cepstral domain. The cepstral domain component due to the impulse train, $\hat{p}(n)$, appears as an alternating sign sequence of impulses spaced $n_k$ samples apart, falling off in amplitude as 1/n. The first impulse occurs at the pulse train period time. The separation in the cepstral domain between the primary pulse signal and the impulse train is inversely related to their periodicities in the frequency domain (i.e., directly proportional to the pulse periods). Thus, if the periodicity of the impulse train is much longer than the periodicities of the primary signal pulse $X(e^{j\omega})$, they will be well separated in time in the cepstral domain. If this is the case, the impulse sequence (and associated low-air alarm pulses) can be easily removed in the cepstral domain by filtering performed as simple editing (“liftering”) of the cepstrum at the impulse locations, in essence removing $\hat{p}(n)$ from the cepstral representation. For two pulses this amounts to substantially zeroing the cepstrum at all multiples of the pulse repetition period.
Transformation of the “liftered” cepstral signal back to the time domain is then performed by reversing the cepstral transformation process. The result is the primary pulse minus any secondary noise pulses in the processing window. Note that in this embodiment the Fourier transform approach has been used to calculate the cepstrum of the signal. However, there are other methods of doing this that are known in the art such as a recursive method, and this representation does not preclude the use of these other methodologies.
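The forward transform, liftering and inverse transform described above can be illustrated with the following sketch. It is a simplified demonstration under stated assumptions, not the CANA implementation: the test signal is a synthetic minimum-phase pulse plus one delayed copy, chosen so that the spectral phase stays inside (−π, π) and the principal-value logarithm can stand in for phase unwrapping. The function names, the frame size of 256 and the lifter half-width of 2 samples are all choices made for this example.

```python
import cmath


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    sign = 1.0 if inverse else -1.0
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(a):
    return [v / len(a) for v in fft(a, inverse=True)]


def complex_cepstrum(s):
    # Principal-value log only: adequate for this demo signal, whereas real
    # data needs the phase unwrapping step described in the text.
    log_spec = [cmath.log(v) for v in fft([complex(v) for v in s])]
    return [v.real for v in ifft(log_spec)]


def inverse_cepstrum(c):
    spec = [cmath.exp(v) for v in fft([complex(v) for v in c])]
    return [v.real for v in ifft(spec)]


def lifter(c, period, half_width=2):
    """Zero the cepstrum around every multiple of the pulse period."""
    out = list(c)
    for centre in range(period, len(c), period):
        lo, hi = max(0, centre - half_width), min(len(c), centre + half_width + 1)
        for i in range(lo, hi):
            out[i] = 0.0
    return out


N, N1, ALPHA = 256, 50, 0.5                 # frame size, pulse delay, amplitude
x = [1.0, 0.6, 0.2] + [0.0] * (N - 3)       # primary pulse, zero padded
s = list(x)
for i in range(3):
    s[N1 + i] += ALPHA * x[i]               # add the delayed, scaled second pulse

c = complex_cepstrum(s)
y = inverse_cepstrum(lifter(c, N1))         # liftered signal back in time domain
max_error = max(abs(a - b) for a, b in zip(x, y))
```

In this example the cepstrum `c` shows impulses of alternating sign at multiples of the 50-sample delay (about 0.5 at index 50 and −0.125 at index 100), and the liftered output `y` is close to the primary pulse alone. The small residual comes from higher-order cepstral impulses that wrap around the finite frame, which is one reason zero padding to a longer frame helps.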
With actual data the above analysis can be more complicated. For instance, sequential LAA pulses, produced by a mechanical device, are not necessarily identical. In this case, the impulses representing the time locations of the secondary pulse(s) are not delta functions but instead are the impulse response(s) of the transfer function(s) defining the primary pulse from the differing secondary pulse shapes. If the pulse shape transforming transfer function is low-pass the impulses will appear somewhat smeared out instead of impulsive. If additive noise or other signals are present (e.g. speech), the cepstrum of the impulse train is typically more complicated and distributed. An advantage in applying this deconvolution technique to the low-air alarm noise problem is that the alarm pulse periodicity is much longer (20-40 msec) than the average voiced speech pitch period (8-10 msec), making the two periodic components well separated in the cepstral domain and thus easier to separate. Thus removing the periodic component of the low-air alarm noise usually does not affect the periodic component of the speech.
The details of the Cepstral Deconvolver and Filter process 640, the theory of which was described above, will now be described. Filter 640 comprises a windowing function 1004, an FFT block 1010, a log/phase unwrap block 1020, an inverse FFT block 1030, a liftering block 1040, an adaptive lifter generator 1050, an FFT block 1060, a complex exponentiation block 1070, an inverse FFT block 1080 and an un-windowing function 1090. In operation, a frame of data s(i,n) (740) is passed to processing block 1004 of block 640 shown in
In addition, another window known as an exponential window to those skilled in the art may be applied to the data in the analysis frame. This window may be defined as:
$\beta(n) = a^n, \quad 0 \le n \le l,$ Eq. 8
where l is the length of the analysis frame data sequence. The base a, in one embodiment, is equal to 0.997, although other values may be used for improved results depending on the data. The purpose of this window is to make the process of calculating the complex cepstrum of the analysis data frame s(i,n) more stable. It accomplishes this by moving poles and zeros of s(i,n) away from the z-plane unit circle, making the signal more minimum phase; minimum or maximum phase signals are more stable in terms of calculation of the cepstrum. In addition to the windowing, the data frame is padded with zeros to a length of N=1024 sample points. This makes use of an FFT algorithm possible and makes the job of phase unwrapping, described previously in the theory, easier by over-sampling the phase spectrum. Note that N can be a power of two greater than 1024 so that finer frequency resolution may be obtained, though at the cost of more computation.
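The exponential weighting and zero padding can be sketched as below. Weighting s(n) by a^n replaces S(z) with S(z/a), so a zero at z0 moves to a·z0, which is how the window pulls zeros on the unit circle inside it. The helper names are hypothetical, and modelling the later un-windowing step as division by a^n is an assumption made for this example.

```python
def exponential_window(frame, a=0.997):
    """Weight sample n by a**n (the window of Eq. 8)."""
    return [v * a ** n for n, v in enumerate(frame)]


def zero_pad(frame, n_fft=1024):
    """Pad the frame with zeros up to the FFT length."""
    if len(frame) > n_fft:
        raise ValueError("frame is longer than the FFT length")
    return list(frame) + [0.0] * (n_fft - len(frame))


def remove_exponential_window(frame, a=0.997):
    """Undo the weighting by dividing sample n by a**n (assumed un-windowing)."""
    return [v / a ** n for n, v in enumerate(frame)]


# Two-tap example: 1 + z^-1 has a zero at z = -1, exactly on the unit circle.
frame = [1.0, 1.0]
weighted = exponential_window(frame)     # 1 + 0.997 z^-1, zero now at z = -0.997
padded = zero_pad(weighted)
restored = remove_exponential_window(padded[:len(frame)])
```

The round trip through `exponential_window` and `remove_exponential_window` returns the original samples, so the weighting only has to be undone once on the processed frame.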
The windowed analysis frame data is then Fourier Transformed into the frequency domain using an FFT algorithm known to those skilled in the art. This is illustrated by block 1010 in
Block 1050 in
Note that the complex cepstrum is two-sided and symmetrical about the origin at index N/2. The positive part or minimum phase component is defined over the interval in Eq. 10 and the maximum phase component over the interval defined by Eq. 11. The lifter index is measured from the start of each interval. srate is the sampling rate, which in one embodiment is 8000.0 samples/sec. The variable np is a defined number of samples, usually between 2 and 4, that widens the lifter around the locations of the cepstral impulse components. This is done to account for the fact that the calculated pulse period T(i) may not be exact, and the cepstral impulses due to the pulse train may be smeared due to the inexactness of sequential basic pulse waveforms.
The calculated lifter function is used by processing block 1040 to multiply the analysis frame cepstrum, c(i), thereby eliminating (or at least substantially eliminating) the cepstral component of the pulse train of the low-air alarm noise. The liftered cepstrum, designated by c(i), is then put through the reverse transformation processes designated by blocks 1060, 1070, 1080, and 1090 in
Examples of the waveforms produced by processor block 640 are shown in
The last processor of the CANA method is block 650 of
Depending on the duration of each low-air alarm pulse and based on the fact that each analysis frame contains at least two noise pulses, valid data (the portion of the liftered signal, e.g., 1312 where the pulse waveform has been eliminated), e.g., 1314, can be assumed to exist from the end of each data frame to the middle of the frame. Based on analysis of various low-air alarm noises, the pulse duration is known to be less than half the pulse repetition period. Assuming the frame length to be $L_n$ samples, valid output data exists in the segment $L_n - m \ldots L_n$, where m is half the number of samples in an analysis data frame, e.g., 1316. Based on empirical knowledge of the pulse waveform duration, the valid output data section can conservatively be extended by an extra 100 samples. Thus, the valid output data section of each analysis data frame can be defined by the samples with indices $L_n - m - 100 \ldots L_n$. The extra samples allow for frame overlap so that a complete half frame of data can be output for each pair of overlapping frames, e.g., 1318.
To allow a smooth overlap of the first and last 100 samples of each valid output data section, the first 100 samples and the last 100 samples are windowed (tapered) using an appropriate half Hamming window function, for example, as illustrated in
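The extraction of the valid output section and the tapered overlap can be sketched as follows. This is an illustrative reading of the description rather than the implementation of block 650: it assumes frames of equal length that advance by half a frame, and it scales the half Hamming tapers by 1/1.08 so that the rising and falling halves sum to one in the overlap region, a normalisation the text does not specify. All function names are hypothetical.

```python
import math


def half_hamming_tapers(k=100):
    """Rising and falling half Hamming tapers, scaled so that they sum to 1."""
    rise = [(0.54 - 0.46 * math.cos(math.pi * i / (k - 1))) / 1.08
            for i in range(k)]
    return rise, rise[::-1]


def valid_section(frame, overlap=100):
    """Samples Ln - m - overlap ... Ln, where m is half the frame length."""
    ln = len(frame)
    m = ln // 2
    return frame[ln - m - overlap:]


def overlap_add(frames, overlap=100):
    """Taper each valid section and add it into the output at a hop of m samples."""
    rise, fall = half_hamming_tapers(overlap)
    m = len(frames[0]) // 2
    out = [0.0] * (m * len(frames) + overlap)
    for i, frame in enumerate(frames):
        seg = list(valid_section(frame, overlap))
        for j in range(overlap):
            seg[j] *= rise[j]                        # fade in the first samples
            seg[len(seg) - overlap + j] *= fall[j]   # fade out the last samples
        for j, v in enumerate(seg):
            out[i * m + j] += v
    return out


# Three constant frames of 400 samples: the output should stay at 1.0 wherever
# two consecutive sections overlap.
frames = [[1.0] * 400 for _ in range(3)]
output = overlap_add(frames)
```

With constant input frames the output is flat at 1.0 between the initial fade-in and the final fade-out, which shows that the complementary tapers do not introduce amplitude ripple at the frame joins under the assumed normalisation.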
In the foregoing specification, specific embodiments of the present invention have been described. However, one of ordinary skill in the art appreciates that various modifications and changes can be made without departing from the scope of the present invention as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the present invention. The benefits, advantages, solutions to problems, and any element(s) that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as critical, required, or essential features or elements of any or all the claims. The invention is defined solely by the appended claims including any amendments made during the pendency of this application and all equivalents of those claims as issued.
Moreover, in this document, relational terms such as first and second, top and bottom, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. The terms “comprises,” “comprising,” “has”, “having,” “includes”, “including,” “contains”, “containing” or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises, has, includes, contains a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. An element preceded by “comprises . . . a”, “has . . . a”, “includes . . . a”, “contains . . . a” does not, without more constraints, preclude the existence of additional identical elements in the process, method, article, or apparatus that comprises, has, includes, contains the element. The terms “a” and “an” are defined as one or more unless explicitly stated otherwise herein. The terms “substantially”, “essentially”, “approximately”, “about” or any other version thereof, are defined as being close to as understood by one of ordinary skill in the art, and in one non-limiting embodiment the term is defined to be within 10%, in another embodiment within 5%, in another embodiment within 1% and in another embodiment within 0.5%. The term “coupled” as used herein is defined as connected, although not necessarily directly and not necessarily mechanically. A device or structure that is “configured” in a certain way is configured in at least that way, but may also be configured in ways that are not listed.
Number | Date | Country | |
---|---|---|---|
20080019538 A1 | Jan 2008 | US |