The invention relates generally to the field of acoustics, and in particular to sound pick-up and reproduction. More specifically, the invention relates to a sound discrimination method and apparatus.
In a typical live music concert, multiple microphones (acoustic pick-up devices) are positioned close to each of the instruments and vocalists. The electrical signals from the microphones are mixed, amplified, and reproduced by loudspeakers so that the musicians can clearly be heard by the audience in a large performance space.
A problem with conventional microphones is that they respond not only to the desired instrument or voice, but also to other nearby instruments and/or voices. If, for example, the sound of the drum kit bleeds into the microphone of the lead singer, the reproduced sound is adversely affected. This problem also occurs when musicians are in a studio recording their music.
Conventional microphones also respond to the monitor loudspeakers used by the musicians onstage, and to the house loudspeakers that distribute the amplified sound to the audience. As a result, gains must be carefully monitored to avoid feedback, in which the music amplifying system breaks out in howling that spoils a performance. This is especially problematic in live amplified performances, since the amount of signal from the loudspeaker picked up by the microphone can vary wildly, depending on how musicians move about on stage, or how they move the microphones as they perform. An amplification system that has been carefully adjusted to be free from feedback during rehearsal may suddenly break out in howling during the performance simply because a musician has moved on stage.
One type of acoustic pick-up device is an omni-directional microphone. An omni-directional microphone is rarely used for live music because it tends to be more prone to feedback. More typically, conventional microphones having a directional acceptance pattern (e.g., a cardioid microphone) are used to reject off-axis sounds output from other instruments or voices, or from speakers, thus reducing the tendency for the system to howl. However, these microphones have insufficient rejection to fully solve the problem.
Directional microphones generally have a frequency response that varies with the distance from the source. This is typical of pressure gradient responding microphones. This effect is called the “proximity effect”, and it results in a bass boost when the microphone is close to the source and a loss of bass when the microphone is far from the source. Performers who like proximity effect often vary the distance between the microphone and the instrument (or voice) during a performance to create effects and to change the level of the amplified sound. This process is called “working the mike”.
While some performers like proximity effect, others prefer that the frequency response of the sound reproducing system remain as uniform as possible over the range of angles and distances at which the microphone accepts sounds. For these performers the timbre of the instrument should not change as the musician moves closer to or further from the microphone.
Cell phones, regular phones and speaker phones can have performance problems when there is a lot of background noise. In this situation the clarity of the desired speaker's voice is degraded or overwhelmed by this noise. It would be desirable for these phones to be able to discriminate between the desired speaker and the background noise. The phone would then provide a relative emphasis of the speaker's voice over the noise.
The present invention is directed to overcoming one or more of the problems set forth above. Briefly summarized, according to one aspect of the present invention, a method of distinguishing sound sources includes transforming data, collected by at least two transducers which each react to a characteristic of an acoustic wave, into signals for each transducer location. The transducers are separated by a distance of less than about 70 mm or greater than about 90 mm. The signals are separated into a plurality of frequency bands for each transducer location. For each band a relationship of the magnitudes of the signals for the transducer locations is compared with a first threshold value. A relative gain change is caused between those frequency bands whose magnitude relationship falls on one side of the threshold value and those frequency bands whose magnitude relationship falls on the other side of the threshold value. As such, sound sources are discriminated from each other based on their distance from the transducers.
Further features of the invention include (a) using a fast Fourier transform to convert the signals from a time domain to a frequency domain, (b) comparing a magnitude of a ratio of the signals, (c) causing those frequency bands whose magnitude comparison falls on one side of the threshold value to receive a gain of about 1, (d) causing those frequency bands whose magnitude comparison falls on the other side of the threshold value to receive a gain of about 0, (e) that each transducer is an omni-directional microphone, (f) converting the frequency bands into output signals, (g) using the output signals to drive one or more acoustic drivers to produce sound, (h) providing a user-variable threshold value such that a user can adjust a distance sensitivity from the transducers, or (i) that the characteristic is a local sound pressure, its first-order gradient, higher-order gradients, and/or combinations thereof.
Another feature involves providing a second threshold value different from the first threshold value. The causing step causes a relative gain change between those frequency bands whose magnitude comparison falls in a first range between the threshold values and those frequency bands whose magnitude comparison falls outside the threshold values.
A still further feature involves providing third and fourth threshold values that define a second range that is different from and does not overlap the first range. The causing step causes a relative gain change between those frequency bands whose magnitude comparison falls in the first or second ranges and those frequency bands whose magnitude comparison falls outside the first and second ranges.
Additional features call for (a) the transducers to be separated by a distance of no less than about 250 microns, (b) the transducers to be separated by a distance of between about 20 mm to about 50 mm, (c) the transducers to be separated by a distance of between about 25 mm to about 45 mm, (d) the transducers to be separated by a distance of about 35 mm, and/or (e) the distance between the transducers to be measured from a center of a diaphragm for each transducer.
Other features include that (a) the causing step fades the relative gain change between a low gain and a high gain, (b) the fade of the relative gain change is done across the first threshold value, (c) the fade of the relative gain change is done across a certain magnitude level for an output signal of one or more of the transducers, and/or (d) the causing of a relative gain change is effected by (1) a gain term based on the magnitude relationship and (2) a gain term based on a magnitude of an output signal from one or more of the transducers.
Still further features include that (a) a group of gain terms derived for a first group of frequency bands is also applied to a second group of frequency bands, (b) the frequency bands of the first group are lower than the frequency bands of the second group, (c) the group of gain terms derived for the first group of frequency bands is also applied to a third group of frequency bands, and/or (d) the frequency bands of the first group are lower than the frequency bands of the third group.
Additional features call for (a) the acoustic wave to be traveling in a compressible fluid, (b) the compressible fluid to be air, (c) the acoustic wave to be traveling in a substantially incompressible fluid, (d) the substantially incompressible fluid to be water, (e) the causing step to cause a relative gain change to the signals from only one of the two transducers, (f) a particular frequency band to have a limit in how quickly a gain for that frequency band can change, and/or (g) there to be a first limit for how quickly the gain can increase and a second limit for how quickly the gain can decrease, the first limit and second limit being different.
According to another aspect, a method of discriminating between sound sources includes transforming data, collected by transducers which react to a characteristic of an acoustic wave, into signals for each transducer location. The signals are separated into a plurality of frequency bands for each location. For each band a relationship of the magnitudes of the signals for the locations is determined. For each band a time delay is determined from the signals between when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer. A relative gain change is caused between those frequency bands whose magnitude relationship and time delay fall on one side of respective threshold values for magnitude relationship and time delay, and those frequency bands whose (a) magnitude relationship falls on the other side of its threshold value, (b) time delay falls on the other side of its threshold value, or (c) magnitude relationship and time delay both fall on the other side of their respective threshold values.
Further features include (a) providing an adjustable threshold value for the magnitude relationship, (b) providing an adjustable threshold value for the time delay, (c) fading the relative gain change across the magnitude relationship threshold, (d) fading the relative gain change across the time delay threshold, (e) that causing of a relative gain change is effected by (1) a gain term based on the magnitude relationship and (2) a gain term based on the time delay, (f) that the causing of a relative gain change is further effected by a gain term based on a magnitude of an output signal from one or more of the transducers, and/or (g) that for each frequency band there is an assigned threshold value for magnitude relationship and an assigned threshold value for time delay.
A still further aspect involves a method of distinguishing sound sources. Data collected by at least three omni-directional microphones which each react to a characteristic of an acoustic wave is captured. The data is processed to determine (1) which data represents one or more sound sources located less than a certain distance from the microphones, and (2) which data represents one or more sound sources located more than the certain distance from the microphones. The results of the processing step are utilized to provide a greater emphasis of data representing the sound source(s) in one of (1) or (2) above over data representing the sound source(s) in the other of (1) or (2) above. As such, sound sources are discriminated from each other based on their distance from the microphones.
Additional features include that (a) the utilizing step provides a greater emphasis of data representing the sound source(s) in (1) over data representing the sound source(s) in (2), (b) after the utilizing step the data is converted into output signals, (c) a first microphone is a first distance from a second microphone and a second distance from a third microphone, the first distance being less than the second distance, (d) the processing step selects high frequencies from the second microphone and low frequencies from the third microphone which are lower than the high frequencies, (e) the low frequencies and high frequencies are combined in the processing step, and/or (f) the processing step determines (1) a phase relationship from the data from microphones one and two, and (2) determines a magnitude relationship from the data from microphones one and three.
According to another aspect, a personal communication device includes two transducers which react to a characteristic of an acoustic wave to capture data representative of the characteristic. The transducers are separated by a distance of about 70 mm or less. A signal processor for processing the data determines (1) which data represents one or more sound sources located less than a certain distance from the transducers, and (2) which data represents one or more sound sources located more than the certain distance from the transducers. The signal processor provides a greater emphasis of data representing the sound source(s) in one of (1) or (2) above over data representing the sound source(s) in the other of (1) or (2) above. As such, sound sources are discriminated from each other based on their distance from the transducers.
Further features call for (a) the signal processor to convert the data into output signals, (b) the output signals to be used to drive a second acoustic driver remote from the device to produce sound remote from the device, (c) the transducers to be separated by a distance of no less than about 250 microns, (d) the device to be a cell phone, and/or (e) the device to be a speaker phone.
A still further aspect calls for a microphone system having a silicon chip and two transducers secured to the chip which react to a characteristic of an acoustic wave to capture data representative of the characteristic. The transducers are separated by a distance of about 70 mm or less. A signal processor is secured to the chip for processing the data to determine (1) which data represents one or more sound sources located less than a certain distance from the transducers, and (2) which data represents one or more sound sources located more than the certain distance from the transducers. The signal processor provides a greater emphasis of data representing the sound source(s) in one of (1) or (2) above over data representing the sound source(s) in the other of (1) or (2) above, such that sound sources are discriminated from each other based on their distance from the transducers.
Another aspect calls for a method of discriminating between sound sources. Data collected by transducers which react to a characteristic of an acoustic wave is transformed into signals for each transducer location. The signals are separated into a plurality of frequency bands for each location. A relationship of the magnitudes of the signals is determined for each band for the locations. For each band a phase shift is determined from the signals which is indicative of when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer. A relative gain change is caused between those frequency bands whose magnitude relationship and phase shift fall on one side of respective threshold values for magnitude relationship and phase shift, and those frequency bands whose (1) magnitude relationship falls on the other side of its threshold value, (2) phase shift falls on the other side of its threshold value, or (3) magnitude relationship and phase shift both fall on the other side of their respective threshold values.
An additional feature calls for providing an adjustable threshold value for the phase shift.
According to a further aspect, a method of discriminating between sound sources includes transforming data, collected by transducers which react to a characteristic of an acoustic wave, into signals for each transducer location. The signals are separated into a plurality of frequency bands for each location. For each band a relationship of the magnitudes of the signals is determined for the locations. A relative gain change is caused between those frequency bands whose magnitude relationship falls on one side of a threshold value, and those frequency bands whose magnitude relationship falls on the other side of the threshold value. The gain change is faded across the threshold value to avoid abrupt gain changes at or near the threshold.
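The fade across the threshold can be illustrated with a minimal sketch. It assumes a linear crossfade over a fixed width in dB; the function name `faded_gain`, the 1 dB fade width, and the gain endpoints of 0 and 1 are illustrative assumptions rather than values taken from the text.

```python
def faded_gain(mag_diff_db, threshold_db, fade_width_db=1.0,
               low_gain=0.0, high_gain=1.0):
    """Return a gain that ramps linearly from low_gain to high_gain.

    The ramp is centred on threshold_db and spans fade_width_db
    (an assumed, hypothetical parameter that must be positive), so a band
    whose magnitude relationship sits exactly on the threshold receives
    the midpoint gain instead of an abrupt jump.
    """
    lo = threshold_db - fade_width_db / 2.0
    hi = threshold_db + fade_width_db / 2.0
    if mag_diff_db <= lo:
        return low_gain
    if mag_diff_db >= hi:
        return high_gain
    frac = (mag_diff_db - lo) / (hi - lo)
    return low_gain + frac * (high_gain - low_gain)
```

Other fade shapes (for example a raised cosine) would serve the same purpose; the linear ramp is chosen here only for brevity.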
Another feature calls for determining from the signals a time delay for each band between when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer. A relative gain change is caused between those frequency bands whose magnitude relationship and time delay fall on one side of respective threshold values for magnitude relationship and time delay, and those frequency bands whose (1) magnitude relationship falls on the other side of its threshold value, (2) time delay falls on the other side of its threshold value, or (3) magnitude relationship and time delay both fall on the other side of their respective threshold values. The gain change is faded across the threshold value to avoid abrupt gain changes at or near the threshold.
Other features include that (a) a group of gain terms derived for a first octave is also applied to a second octave, (b) the first octave is lower than the second octave, (c) the group of gain terms derived for the first octave is also applied to a third octave, (d) the first octave is lower than the third octave, and/or (e) the frequency bands of the first group are lower than the frequency bands of the second group.
Another aspect involves a method of discriminating between sound sources. Data, collected by transducers which react to a characteristic of an acoustic wave, is transformed into signals for each transducer location. The signals are separated into a plurality of frequency bands for each location. Characteristics of the signals are determined for each band which are indicative of a distance and angle to the transducers of a sound source providing energy to a particular band. A relative gain change is caused between those frequency bands whose signal characteristics indicate that a sound source providing energy to a particular band meets distance and angle requirements, and those frequency bands whose signal characteristics indicate that a sound source providing energy to a particular band (a) does not meet a distance requirement, (b) does not meet an angle requirement, or (c) does not meet distance and angle requirements.
Further features include that the characteristics include (a) a phase shift which is indicative of when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer, and/or (b) a time delay between when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer, whereby an angle to the transducers of a sound source providing energy to a particular band is indicated.
An additional feature calls for the output signals to be (a) recorded on a storage medium, (b) communicated by a transmitter, and/or (c) further processed and used to present information on location of sound sources.
A further aspect of the invention calls for a method of distinguishing sound sources. Data collected by four transducers which each react to a characteristic of an acoustic wave is transformed into signals for each transducer location. The signals are separated into a plurality of frequency bands for each transducer location. For each band a relationship of the magnitudes of the signals for at least two different pairs of the transducers is compared with a threshold value. A determination is made for each transducer pair whether the magnitude relationship falls on one side or the other side of the threshold value. The results of each determination are utilized to decide whether an overall magnitude relationship falls on one side or the other side of the threshold value. A relative gain change is caused between those frequency bands whose overall magnitude relationship falls on one side of the threshold value and those frequency bands whose overall magnitude relationship falls on the other side of the threshold value, such that sound sources are discriminated from each other based on their distance from the transducers.
Other features call for (a) the four transducers to be arranged in a linear array, (b) a distance between each adjacent pair of transducers to be substantially the same, (c) each of the four transducers to be located at respective vertices of an imaginary polygon, and/or (d) giving a weight to results of the determination for each transducer pair.
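One way the per-pair determinations might be combined into an overall decision is a weighted vote. The sketch below is only an assumption about how such a combination could look; the text does not specify the rule, and the function name `overall_accept` and the majority criterion are hypothetical.

```python
def overall_accept(pair_mag_diffs_db, threshold_db, weights=None):
    """Combine per-pair threshold decisions for one frequency band.

    pair_mag_diffs_db: magnitude relationship (in dB) for each transducer pair.
    weights: optional per-pair weights (assumed equal when omitted).
    Returns True when the weighted share of pairs at or above the
    threshold exceeds one half (an assumed majority rule).
    """
    if weights is None:
        weights = [1.0] * len(pair_mag_diffs_db)
    votes_for = sum(w for d, w in zip(pair_mag_diffs_db, weights)
                    if d >= threshold_db)
    return votes_for > 0.5 * sum(weights)
```

With equal weights this reduces to a simple majority; unequal weights let a more trusted pair dominate the outcome.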
Another aspect calls for a method of distinguishing sound sources. A sound distinguishing system is switched to a training mode. A sound source is moved to a plurality of locations within a sound source accept region such that the sound distinguishing system can determine a plurality of thresholds for a plurality of frequency bins. The sound distinguishing system is switched to an operating mode. The sound distinguishing system uses the thresholds to provide a relative emphasis to sound sources located in the sound source accept region over sound sources located outside the sound source accept region.
Another feature requires that two of the microphones be connected by an imaginary straight line that extends in either direction to infinity. The third microphone is located away from this line.
One more feature calls for comparing a relationship of the magnitudes of the signals for six unique pairs of the transducers with a threshold value.
These and other aspects, objects, features and advantages of the present invention will be more clearly understood and appreciated from a review of the following detailed description and appended claims, and by reference to the accompanying drawings.
FIGS. 10a and 10b are schematic drawings of transducers being exposed to acoustic waves from different directions;
a and b are plots of gain versus frequency;
For some sound applications (e.g. the amplification of live music, sound recording, cell phones and speaker phones), a microphone system with an unusual set of directional properties is desired. A new microphone system having these properties is disclosed that avoids many of the typical problems of directional microphones while offering improved performance. This new microphone system uses the pressures measured by two or more spaced microphone elements (transducers) to cause a relative positive gain for the signals from sound sources that fall within a certain acceptance window of distance and angle relative to the microphone system compared to the gain for the signals from all other sound sources.
These goals are achieved with a microphone system having a very different directional pattern than conventional microphones. A new microphone system with this pattern accepts sounds only within an “acceptance window”. Sounds originating within a certain distance and angle from the microphone system are accepted. Sounds originating outside this distance and/or angle are rejected.
In one application of the new microphone system (a live music performance), sources we'd like to reject, such as the drum kit at the singer's microphone, or the loudspeakers at any microphone, are likely to be too far away and/or at the wrong angle to be accepted by the new microphone system. Accordingly, the problems described above are avoided.
Beginning with
Consider the ideal situation of a point source of sound 15 in free space, shown as a speaker in
Also, we can measure the sound pressure magnitudes M1 and M2 at the respective locations of transducers 12 and 14, and we know rt. As such, we can set up a second equation including unknown R as:
Thus, we have two equations and two unknowns R and θ (given rt, τ, c and M1/M2). The two equations are numerically solved simultaneously using a computer.
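The relationship between (R, θ) and the measured quantities (τ, M1/M2) can be sketched under simplifying assumptions that are not taken from the text: an ideal point source whose pressure magnitude falls off as 1/distance, R measured from the front transducer 12, and θ measured at transducer 12 from the axis pointing away from the rear transducer 14. The function names `forward` and `solve` are hypothetical. Under these assumptions the pair of equations happens to have a closed-form solution, so no iterative solver is needed in this sketch.

```python
import math

C_AIR = 343.0  # assumed speed of sound in air, m/s


def forward(R, theta_deg, rt, c=C_AIR):
    """Return (tau in seconds, M1/M2) for a source at distance R and angle theta.

    Assumed geometry: front transducer at the origin, rear transducer a
    distance rt behind it on the axis, source at (R cos(theta), R sin(theta)).
    """
    theta = math.radians(theta_deg)
    # Law of cosines gives the distance from the source to the rear transducer.
    R2 = math.sqrt(R * R + rt * rt + 2.0 * R * rt * math.cos(theta))
    tau = (R2 - R) / c   # extra travel time to the rear transducer
    ratio = R2 / R       # 1/distance pressure law: M1/M2 = R2/R
    return tau, ratio


def solve(tau, ratio, rt, c=C_AIR):
    """Recover (R, theta in degrees) from tau and M1/M2 (requires ratio > 1)."""
    R = c * tau / (ratio - 1.0)   # from R2 = R + c*tau and R2 = ratio*R
    R2 = ratio * R
    cos_theta = (R2 * R2 - R * R - rt * rt) / (2.0 * R * rt)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return R, math.degrees(math.acos(cos_theta))
```

As a usage example, `forward(0.13, 25.0, 0.035)` yields a delay of roughly 94 microseconds and a magnitude ratio of roughly 1.9 dB, and `solve` applied to those two values returns the original 0.13 m and 25 degrees.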
An example is provided in
The distance rt is preferably measured from the center of a diaphragm for each of transducers 12 and 14. Distance rt is preferably smaller than a wavelength for the highest frequency of interest. However, rt should not be too small as the magnitude ratios as a function of distance will be small and thus more difficult to measure. Where the acoustic waves are traveling in a gas where c ≈ 343 m/s (e.g. air), distance rt in one example is preferably about 70 millimeters (mm) or less. At about 70 mm the system is best suited for acoustic environments consisting primarily of human speech and similar signals. Preferably distance rt is between about 20 mm to about 50 mm. More preferably distance rt is between about 25 mm to about 45 mm. Most preferably distance rt is about 35 mm.
To this point the description has been inherently done in an environment of a compressible fluid (e.g. air). It should be noted that this invention will also be effective in an environment of an incompressible fluid (e.g. water or salt water). In the case of water the transducer spacing can be about 90 mm or greater. If it is only desired to measure low or extremely low frequencies, the transducer spacing can get quite large. For example, assuming the speed of sound in water is 1500 meters/second and the highest frequency of interest is 100 Hz, then the transducers can be spaced 15 meters apart.
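The spacing figures above follow from the wavelength relation, wavelength = speed of sound / frequency. The short sketch below reproduces that arithmetic; the function name `wavelength_m` is illustrative, and treating one wavelength at the highest frequency of interest as the upper bound on spacing is a reading of the guideline above, not a stated design rule.

```python
def wavelength_m(speed_of_sound_m_s, freq_hz):
    """Wavelength in meters of a tone at freq_hz in a medium with the given sound speed."""
    return speed_of_sound_m_s / freq_hz


# Water example from the text: 1500 m/s and 100 Hz give a 15 m wavelength.
water_spacing = wavelength_m(1500.0, 100.0)

# Air (assumed 343 m/s): a 70 mm spacing equals one wavelength at 4900 Hz.
air_freq_for_70mm = 343.0 / 0.070
```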
Turning to
Referring to
With reference to
(http://www.akustica.com/documents/AKU2001ProductBrief.pdf
Turning to
If, for example, it is desired to only accept sound sources located less than 0.13 meters from transducer 12 and at an angle θ of less than 25 degrees, we find the intersection of these values at a point 23. At point 23 we see that the magnitude difference must be greater than 2 dB and time delay must be greater than 100 microseconds. A hatched area 27 indicates the acceptance window for this setting. If the sound source causes a magnitude difference of greater than or equal to 2 dB and a time delay of greater than or equal to 100 microseconds, then we accept that sound source. If the sound source causes a magnitude difference of less than 2 dB and/or a time delay of less than 100 microseconds, then we reject that sound source.
The above type of processing and resulting accepting or rejecting a sound source based on its distance and angle from the transducers, is done on a frequency band by frequency band basis. Relatively narrow frequency bands are desirable to avoid blocking desired sounds or passing non desired sounds. It is preferable to use narrow frequency bands and short time blocks, although those two characteristics conflict with each other. Narrower frequency bands enhance the rejection of unwanted acoustic sources but require longer time blocks. However, longer time blocks create system latency that can be unacceptable to a microphone user. Once a maximum acceptable system latency is determined, the frequency band width can be chosen. Then the block time is selected. Further details are provided below.
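The accept/reject rule for this setting can be written as a small predicate. This is a minimal sketch; the function name `accept_source` and its parameter names are hypothetical, and the default thresholds are simply the 2 dB and 100 microsecond values from the example.

```python
def accept_source(mag_diff_db, delay_us, min_mag_db=2.0, min_delay_us=100.0):
    """Return True when both parameters fall inside the acceptance window.

    A source is accepted only if its magnitude difference AND its time
    delay are at or above their thresholds; failing either one rejects it.
    """
    return mag_diff_db >= min_mag_db and delay_us >= min_delay_us
```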
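The conflict between narrow bands and short blocks can be made concrete: for a block of N samples at sample rate fs, linearly spaced bins are fs/N wide and the block spans N/fs seconds. The sketch below computes both; the function name `block_tradeoff` is hypothetical, and the block duration is only one contribution to total system latency, which also depends on factors the sketch ignores.

```python
def block_tradeoff(sample_rate_hz, block_len):
    """Return (bin width in Hz, block duration in ms) for a given block length."""
    bin_width_hz = sample_rate_hz / block_len
    block_time_ms = 1000.0 * block_len / sample_rate_hz
    return bin_width_hz, block_time_ms


# Illustrative values: halving the bin width doubles the block duration.
coarse = block_tradeoff(32000, 512)    # (62.5 Hz, 16.0 ms)
fine = block_tradeoff(32000, 1024)     # (31.25 Hz, 32.0 ms)
```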
Because the system works independently over many frequency bands, a desired singer, located on-axis 0.13 meters from the microphone singing a C is accepted, while a guitar located off-axis 0.25 meters from the microphone playing an E is rejected. Thus, if a desired singer less than 0.13 meters and on axis from the microphone is singing a C, but a guitar is playing an E 0.25 meters from the microphone at any angle, the microphone system passes the vocalist's C and its harmonics, while simultaneously rejecting the instrumentalist's E and its harmonics.
Turning now to
Using block processing techniques which are well known to those skilled in the art, blocks of overlapping data are windowed at a block 22 (a separate windowing is done on the signal for each transducer). The windowed data are transformed from the time domain into the frequency domain using a fast Fourier transform (FFT) at a block 24 (a separate FFT is done on the signal for each transducer). This separates the signals into a plurality of linear spaced frequency bands (i.e. bins) for each transducer location. Other types of transforms can be used to transform the windowed data from the time domain to the frequency domain. For example, a wavelet transform may be used instead of an FFT to obtain log spaced frequency bins. In this embodiment a sampling frequency of 32000 samples/sec is used with each block containing 512 samples.
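The windowing and transform steps can be sketched with the standard library alone. This example assumes a Hann window (the text does not name the window) and uses a direct DFT with a 64-sample block in place of an FFT with 512 samples, purely to keep the sketch short and dependency-free; all function names are illustrative.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n (an assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def dft(x):
    """Direct O(N^2) discrete Fourier transform; an FFT gives the same result faster."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * m * k / n) for m in range(n))
            for k in range(n)]


def analyze_block(block):
    """Window one block of samples and return its complex frequency bins."""
    w = hann(len(block))
    return dft([s * wi for s, wi in zip(block, w)])


FS = 32000   # samples per second, as in the embodiment
N = 64       # shortened block length for this sketch (the embodiment uses 512)

# A 2000 Hz tone falls in bin 2000 / (FS / N) = 4.
tone = [math.sin(2.0 * math.pi * 2000.0 * i / FS) for i in range(N)]
spectrum = analyze_block(tone)
peak_bin = max(range(N // 2 + 1), key=lambda k: abs(spectrum[k]))
```

In a two-transducer system the same analysis would be run separately on the block from each transducer.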
The definition of the discrete Fourier transform (DFT) and its inverse is as follows:
The functions X=fft(x) and x=ifft(X) implement the transform and inverse transform pair given for vectors of length N by:
$$X(k) = \sum_{j=1}^{N} x(j)\,\omega_N^{(j-1)(k-1)}$$
$$x(j) = \frac{1}{N}\sum_{k=1}^{N} X(k)\,\omega_N^{-(j-1)(k-1)}$$
where
$$\omega_N = e^{-2\pi i/N}$$
is an Nth root of unity.
The FFT is an algorithm for implementing the DFT that speeds the computation. The Fourier transform of a real signal (such as audio) yields a complex result. The magnitude of a complex number X is defined as:
sqrt(real(X).^2+imag(X).^2)
The angle of a complex number X is defined as:
$$\mathrm{angle}(X) = \arctan\!\left(\frac{\mathrm{imag}(X)}{\mathrm{real}(X)}\right)$$
where the sign of the real and imaginary parts is observed to place the angle in the proper quadrant of the unit circle, allowing a result in the range:
−π ≤ angle(X) < π
The equivalent time delay is defined as:
$$\tau = \frac{\mathrm{angle}(X)}{\omega}$$
where ω is the angular frequency.
The magnitude ratio of two complex values, X1 and X2, can be calculated in any of a number of ways. One can take the ratio of X1 and X2, and then find the magnitude of the result. Or, one can find the magnitude of X1 and X2 separately, and take their ratio. Alternatively, one can work in log space, and take the log of the magnitude of the ratio, or alternatively, the difference (subtraction) of the log magnitudes of X1 and X2.
Similarly, the time delay between two complex values can be calculated in a number of ways. One can take the ratio of X1 and X2, find the angle of the result and divide by the angular frequency. One can find the angle of X1 and X2 separately, subtract them, and divide the result by the angular frequency.
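The calculations described above can be sketched for a single frequency bin. The function names are illustrative, and the sign convention (ratio of front to rear, so that a wave reaching the rear transducer later gives a positive delay) is an assumption of this sketch. The delay is unambiguous only while the phase difference stays within half a cycle.

```python
import cmath
import math


def mag_ratio_db(x1, x2):
    """Magnitude of the ratio X1/X2, expressed in dB."""
    return 20.0 * math.log10(abs(x1 / x2))


def mag_ratio_db_log(x1, x2):
    """Same quantity computed in log space as a difference of log magnitudes."""
    return 20.0 * math.log10(abs(x1)) - 20.0 * math.log10(abs(x2))


def time_delay_s(x1, x2, freq_hz):
    """Angle of X1/X2 divided by the angular frequency 2*pi*f."""
    return cmath.phase(x1 / x2) / (2.0 * math.pi * freq_hz)


# Synthetic bin at 1 kHz: rear signal is 2 dB weaker and 100 microseconds later.
f = 1000.0
x1 = 1.0 + 0.0j
x2 = x1 / (10.0 ** (2.0 / 20.0)) * cmath.exp(-2j * math.pi * f * 100e-6)
```

Here `mag_ratio_db(x1, x2)` and `mag_ratio_db_log(x1, x2)` both recover the 2 dB difference, and `time_delay_s(x1, x2, f)` recovers the 100 microsecond delay.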
As described above, a relationship of the signals is established. In some embodiments the relationship is the ratio of the signal from front transducer 12 to the signal from rear transducer 14, which is calculated for each frequency bin on a block-by-block basis at a divider block 26. The magnitude of this ratio (relationship) in dB is calculated at a block 28. A time difference (delay) τ (tau) is calculated for each frequency bin on a block-by-block basis by first computing the phase at a block 30 and then dividing the phase by the angular center frequency of each frequency bin at a divider 32. The time delay represents the lapsed time between when an acoustic wave is detected by transducer 12 and when this wave is detected by transducer 14.
Other well known digital signal processing (DSP) techniques for estimating magnitude and time delay differences between the two transducer signals may be used. For example, an alternate approach to calculating time delay differences is to use cross correlation in each frequency band between the two signals X1 and X2.
The calculated magnitude relationship and time differences (delay) for each frequency bin (band) are compared with threshold values at a block 34. For example, as described above in
A user input 36 may be manipulated to vary the acceptance angle threshold(s) and a user input 38 may be manipulated to vary the distance threshold(s) as required by the user. In one embodiment a small number of user presets are provided for different acceptance patterns which the user can select as needed. For example, the user would select between general categories such as narrow or wide for the angle setting and near or far for the distance setting.
A visual or other indication is given to the user to let her know the threshold settings for angle and distance. Accordingly, user-variable threshold values can be provided such that a user can adjust a distance selectivity and/or an angle selectivity from the transducers. The user interface may represent this as changing the distance and/or angle thresholds, but in effect the user is adjusting the magnitude difference and/or the time difference thresholds.
When the magnitude difference and time delay both fall within the acceptance window for a particular frequency band, a relatively high gain is calculated at a block 40, and when one or both of the parameters is outside the window, a relatively low gain is calculated. The high gain is set at about 1 while the low gain is at about 0. Alternatively, the high gain might be above 1 while the low gain is below the high gain. In general, a relative gain change is caused between those frequency bands whose parameter (magnitude and time delay) comparisons both fall on one side of their respective threshold values and those frequency bands where one or both parameter comparisons fall on the other side of their respective threshold values.
The gains are calculated for each frequency bin in each data block. The calculated gain may be further manipulated in other ways known to those skilled in the art to minimize the artifacts generated by such gain change. For example, the minimum gain can be limited to some low value, rather than zero. Additionally, the gain in any frequency bin can be allowed to rise quickly but fall more slowly using a fast attack slow decay filter. In another approach, a limit is set on how much the gain is allowed to vary from one frequency bin to the next at any given time.
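One possible form of the gain manipulations mentioned above is sketched below. The function names, the one-pole smoothing form, and the coefficient values are assumptions chosen for illustration rather than values taken from this description.

```python
def smooth_gain(previous, target, attack=0.9, decay=0.1, floor=0.05):
    """Fast attack, slow decay smoothing of one bin's gain across blocks.

    floor limits the minimum gain to a small non-zero value.
    All coefficient values here are illustrative assumptions.
    """
    target = max(target, floor)
    coeff = attack if target > previous else decay
    return previous + coeff * (target - previous)


def limit_across_bins(gains, max_step):
    """Limit how much the gain may change from one bin to the next.

    Assumption: a single pass from the lowest bin upward is sufficient.
    """
    if not gains:
        return []
    out = [gains[0]]
    for g in gains[1:]:
        lo, hi = out[-1] - max_step, out[-1] + max_step
        out.append(min(max(g, lo), hi))
    return out
```

In such a sketch, `smooth_gain` would be called once per bin per data block, with `previous` being the gain applied to that bin in the preceding block.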
On a frequency bin by frequency bin basis, the calculated gain is applied to the frequency domain signal from a single transducer, for example transducer 12 (although transducer 14 could also be used), at a multiplier 42. Thus, sound sources in the acceptance window are emphasized relative to sources outside the window.
Using conventional block processing techniques, the modified signal undergoes an inverse FFT at a block 44 to transform the signal from the frequency domain back into the time domain. The signal is then windowed, overlapped and summed with the previous blocks at a block 46. At a block 48 the signal is converted from a digital signal back to an analog (output) signal. The output of block 48 is then sent to a conventional amplifier (not shown) and acoustic driver (i.e. speaker) (not shown) of a sound reinforcement system to produce sound. Alternatively, an input signal (digital) to block 48 or an output signal (analog) from block 48 can be (a) recorded on a storage medium (e.g. electronic or magnetic), (b) communicated by a transmitter (by wire or wirelessly), or (c) further processed and used to present information on the location of sound sources.
Some benefits of this microphone system will be described with respect to
The microphone system shown in
Turning to
A directional pattern for the microphone system of
The magnitude difference is a function of both distance and angle. The maximum change in magnitude with distance occurs in line with the transducers. The minimum change in magnitude with distance occurs in a line perpendicular to the axis of the transducers. For sources 90 deg off axis, there is no magnitude difference, regardless of the source distance. Angle, however, is a function of the time difference alone. For applications where distance selectivity is important, the transducer array should be oriented pointing towards the location of a sound source or sources we wish to select.
A microphone having this sort of extreme directionality will be much less susceptible to feedback than a conventional microphone for two reasons. First, in a live performance application, the new microphone largely rejects the sound of main or monitor loudspeakers that may be present, because they are too distant and outside the acceptance window. The reduced sensitivity lowers the loop gain of the system, reducing the likelihood of feedback. Additionally, in a conventional system, feedback is exacerbated by having several “open” microphones and speakers on stage. Whereas any one microphone and speaker might be stable and not create feedback, the combination of multiple cross coupled systems can more easily be unstable, causing feedback. The new microphone system described herein is “open” only for a sound source within the acceptance window, making it less likely to contribute to feedback by coupling to another microphone and sound amplification system on stage, even if those other microphones and systems are completely conventional.
The new microphone system also greatly reduces the bleed through of sound from other performers or other instruments in a performing or recording application. The acceptance window (both distance and angle) can be tailored by the performer or sound crew on the fly to meet the needs of the performance.
The new microphone system can simulate the sound of many different styles of microphones for performers who want that effect as part of their sound. For example, in one embodiment of the invention this system can simulate the proximity effect of conventional microphones by boosting the gain more at low frequencies than high frequencies for magnitude differences indicating small R values. In the embodiment of
Proximity effect can also be caused by combining transducers 12, 14 into a single uni-directional or bi-directional microphone, thereby creating a fixed directional array. In this case the calculated gain is applied to the combined signal from transducers 12, 14, providing pressure gradient type directional behavior (not adjustable by the user), in addition to the enhanced selectivity of the processing of
The new microphone can create new microphone effects. One example is a microphone having the same output for all sound source distances within the acceptance window. Using the magnitude difference and time delay between the transducers 12 and 14, the gain is adjusted to compensate for the 1/R falloff from transducer 12. Such a microphone might be attractive to musicians who do not “work the mike”. A sound source of constant level would cause the same output magnitude for any distance from the transducers within the acceptance window. This feature can be useful in a public address (PA) system. Inexperienced presenters generally are not careful about maintaining a constant distance from the microphone. With a conventional PA system, their reproduced voice can vary between being too loud and too soft. The improved microphone described herein keeps the voice level constant, independent of the distance between the speaker and the microphone. As a result, variations in the reproduced voice level for an inexperienced speaker are reduced.
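A minimal sketch of such level compensation follows. It assumes a free field, a source on the axis of the transducers, and 1/R falloff, so that the magnitude ratio between the nearer and farther transducer is (R + d)/R for a spacing d. The function names, the reference distance, and the gain cap are hypothetical.

```python
def distance_on_axis(mag_diff_db, spacing_m):
    """Estimate source distance R from the nearer transducer.

    Assumption: on-axis source in a free field, so the linear magnitude
    ratio equals (R + spacing) / R and R = spacing / (ratio - 1).
    """
    ratio = 10 ** (mag_diff_db / 20.0)
    if ratio <= 1.0:
        return float("inf")  # no level difference: source is effectively far
    return spacing_m / (ratio - 1.0)


def leveling_gain(mag_diff_db, spacing_m, reference_m, max_gain=4.0):
    """Gain that offsets 1/R falloff relative to a reference distance.

    max_gain is an illustrative cap for very distant estimates.
    """
    r = distance_on_axis(mag_diff_db, spacing_m)
    return min(r / reference_m, max_gain)
```

In this sketch a source at twice the reference distance receives twice the gain, so that its output level matches that of a source at the reference distance.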
The new microphone can be used to replace microphones for communications purposes, such as a microphone for a cell phone for consumers (in a headset or otherwise), or a boom microphone for pilots. These personal communication devices typically have a microphone which is intended to be located about 1 foot or less from a user's lips. Rather than using a boom to place a conventional noise canceling microphone close to the user's lips, a pair of small microphones mounted on the headset could use the angle and/or distance thresholds to accept only those sounds having the correct distance and/or angle (e.g. the user's lips). Other sounds would be rejected. The acceptance window is centered around the anticipated location of the user's mouth.
This microphone can also be used for other voice input systems where the location of the talker is known (e.g. in a car). Some examples include hands free telephony applications, such as hands free operation in a vehicle, and hands free voice command, such as with vehicle systems employing speech recognition capabilities to accept voice input from a user to control vehicle functions. Another example is using the microphone in a speakerphone which can be used, for example, in tele-conferencing. These types of personal communication devices typically have a microphone which is intended to be located more than 1 foot from a user's lips. The new microphone technology of this application can also be used in combination with speech recognition software. The signals from the microphone are passed to the speech recognition algorithm in the frequency domain. Frequency bins that are outside the accept region for sound sources are given a lower weighting than frequency bins that are in the accept region. Such an arrangement can help the speech recognition software to process a desired speaker's voice in a noisy environment.
Turning now to
However, when the wavelength of sound approaches the distance between the microphones, this simple approach breaks down. The phase measurement produces results in the range between −π and π. However, there is an uncertainty in the measurement having a value that is an integral multiple of 2π. A measurement of 0 radians of phase difference could just as easily represent a phase difference of 2π or −2π.
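The ambiguity can be made concrete with a short sketch that lists every time delay consistent with a measured phase. The function name `candidate_delays`, the speed of sound value of 343 m/s, and the transducer spacing passed in by the caller are assumptions for illustration.

```python
import math


def candidate_delays(wrapped_phase, freq_hz, spacing_m, speed_of_sound=343.0):
    """List the time delays consistent with a phase measured in (-pi, pi].

    Only delays no larger than the acoustic travel time between the
    transducers (spacing / speed of sound) are physically possible.
    """
    max_delay = spacing_m / speed_of_sound
    k_max = int(freq_hz * max_delay) + 1
    candidates = []
    for k in range(-k_max, k_max + 1):
        # Add an integral multiple of 2*pi before converting phase to time.
        delay = (wrapped_phase + 2.0 * math.pi * k) / (2.0 * math.pi * freq_hz)
        if abs(delay) <= max_delay:
            candidates.append(delay)
    return candidates
```

For an assumed spacing of 35 mm, a phase of 0 radians at 1 kHz yields a single candidate, whereas the same phase at 20 kHz yields several, none of which can be distinguished from the phase measurement alone.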
This uncertainty is illustrated graphically in
This issue can be avoided by reducing the distance between transducers 12, 14 such that their spacing is less than a wavelength even for the highest frequency (shortest wavelength) we wish to sense. This approach eliminates the 2π uncertainty. However, a narrower spacing between the transducers decreases the magnitude difference between transducers 12, 14, making it harder to measure the magnitude difference (and thus provide distance selectivity).
This problem can be avoided by using two pairs of transducer elements: a widely spaced pair for low frequency estimates of source distance and angle, and a narrowly spaced pair for high frequency estimates of distance and angle. In one embodiment only three transducer elements are used: widely spaced T1 and T2 for low frequencies and narrowly spaced T1 and T3 for high frequencies.
We will now turn to
Each of the three signal streams receives standard block processing windowing at block 78 and is converted from the time domain to the frequency domain at FFT block 80. High frequency bins above a pre-defined frequency from the signal of transducer 66 are selected out at block 82. In this embodiment the pre-defined frequency is 4 kHz. Low frequency bins at or below 4 kHz from the signal of transducer 68 are selected out at block 84. The high frequency bins from block 82 are combined with the low frequency bins from block 84 at a block 86 in order to create a full complement of frequency bins. It should be noted that this band splitting can alternatively be done in the analog domain rather than the digital domain.
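The selection and recombination of frequency bins might be sketched as follows. The function name `combine_bands` and the assumption that bin k is centered at k times a fixed bin width are illustrative only.

```python
def combine_bands(high_band_bins, low_band_bins, bin_width_hz,
                  crossover_hz=4000.0):
    """Build a full set of bins from two equally sized FFT outputs.

    high_band_bins: bins from the stream used above the crossover.
    low_band_bins:  bins from the stream used at or below the crossover.
    Assumption: bin k is centered at k * bin_width_hz.
    """
    combined = []
    for k, (high_src, low_src) in enumerate(zip(high_band_bins, low_band_bins)):
        freq = k * bin_width_hz
        combined.append(high_src if freq > crossover_hz else low_src)
    return combined
```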
The remainder of the signal processing is substantially the same as for the embodiment in
Turning to
In a still further embodiment based on
With reference to
We can take advantage of this fact by not bothering to estimate source position above 5 kHz. Instead, if acoustic energy is sensed below 5 kHz that is within the acceptance window of the microphone, then energy above 5 kHz is also allowed to pass, making the assumption that it is coming from the same source.
One method of achieving this goal is to use the instantaneous gains predicted for the frequency bins located in the octave between 2.5 and 5 kHz for example, and to apply those same gains to the frequency bins one and two octaves higher, that is, for the bins between 5 and 10 kHz, and the bins between 10 and 20 kHz. This approach preserves any harmonic structure that may exist in the audio signal. Other initial octaves, such as 2-4 kHz, can be used as long as they are commensurate with transducer spacing.
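A sketch of this gain remapping is given below. It assumes that bin k is centered at k times a fixed bin width and uses integer division to find the bin one or two octaves lower, which is an approximation for odd bin indices. The function name `extend_gains` is hypothetical.

```python
def extend_gains(gains, bin_width_hz, top_hz=5000.0):
    """Copy gains from the octave below top_hz into the two octaves above.

    Bins in (top_hz, 2*top_hz] take the gain of the bin at half their index;
    bins in (2*top_hz, 4*top_hz] take the gain of the bin at a quarter.
    Assumption: bin k is centered at k * bin_width_hz.
    """
    out = list(gains)
    for k in range(len(gains)):
        freq = k * bin_width_hz
        if top_hz < freq <= 2 * top_hz:
            out[k] = gains[k // 2]   # one octave lower
        elif 2 * top_hz < freq <= 4 * top_hz:
            out[k] = gains[k // 4]   # two octaves lower
    return out
```

Because each high frequency bin inherits the gain of the bin at half or a quarter of its frequency, harmonically related bins receive the same gain.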
As shown in
Implementation of this embodiment will be described with reference to
The two calculated gains out of blocks 108 and 110, based on magnitude and time delay, are summed at a summer 116. The reason for summing the gains will be described below. The summed gain for frequencies below 5 kHz is passed through at a block 118. The gain for frequency bins between 2.5 and 5 kHz is selected out at a block 120 and remapped (applied) into the frequency bins for 5 to 10 kHz at a block 122 and for 10 to 20 kHz at a block 124 (as discussed above with respect to
Turning now to
The two transducer level gain terms are summed with each other at a summer 134. The output of summer 134 is added at a summer 136 to the gain term “A” (from block 126 of
The gain term output by summer 136, which has been calculated in dB, is converted to a linear gain at a block 138, and applied to the signal from transducer 12, as shown in
Details of non-linear blocks 108, 110, 128 and 130 will now be discussed with reference to
The microphone systems described above can be used in a cell phone or speaker phone. Such a cell phone or speaker phone would also include an acoustic driver for transmitting sound to the user's ear. The output of the signal processor would be used to drive a second acoustic driver at a remote location to produce sound (e.g. the second acoustic driver could be located in another cell phone or speaker phone 500 miles away).
A still further embodiment of the invention will now be described. This embodiment relates to a prior art boom microphone that is used to pick up the human voice with a microphone located at the end of a boom worn on the user's head. Typical applications are communications microphones, such as those used by pilots, or sound reinforcement microphones used by some popular singers in concert. These microphones are normally used when one desires a hands-free microphone located close to the mouth in order to reduce the pickup of sounds from other sources. However, the boom across the face can be unsightly and awkward. Another application of a boom microphone is for a cell phone headset. These headsets have an earpiece worn on or in the user's ear, with a microphone boom suspended from the earpiece. This microphone may be located in front of a user's mouth or dangling from a cord, either of which can be annoying.
An earpiece using the new directional technology of this application is described with reference to
For previous embodiments described above, the general assumption was that of a substantially free field acoustic environment. However, near the head, the acoustic field from sources is modified by the head, and free-field conditions no longer hold. As a result, the acceptance thresholds are preferably changed from free field conditions.
At low frequencies, where the wavelength of sound is much larger than the head, the sound field is not greatly changed, and an acceptance threshold similar to free field may be used. At high frequencies, where the wavelength of sound is smaller than the head, the sound field is significantly changed by the head, and the acceptance thresholds must be changed accordingly.
In this kind of application, it is desirable for the thresholds to be a function of frequency. In one embodiment, a different threshold is used for every frequency bin for which the gain is calculated. In another embodiment, a small number of thresholds are applied to groups of frequency bins. These thresholds are determined empirically. During a calibration process, the magnitude and time delay differences in each frequency bin are continually recorded while a sound source radiating energy at all frequencies of interest is moved around the microphone. A high score is assigned to the magnitude and time difference pairs when the source is located in the desired acceptance zone and a low score when it is outside the acceptance zone. Alternatively, multiple sound sources at various locations can be turned on and off by the controller doing the scoring and tabulating.
Using well known statistical methods for minimizing error, the thresholds for each frequency bin are calculated using the dB difference and time (or phase) difference as the independent variables, and the score as the dependent variable. This approach compensates for any difference in frequency response that may exist between the two microphone elements that make up any given unit.
An issue to consider is that microphone elements and analog electronics have tolerances, so the magnitude and phase response of two microphones making up a pair may not be well matched. In addition, the acoustical environment in which the microphone is placed alters the magnitude and time delay relationships for sound sources in the desired acceptance window.
In order to address these issues, an embodiment is provided in which the microphone learns what the appropriate thresholds are, given the intended use of the microphone and the acoustical environment. In the intended acoustic environment with a relatively low level of background noise, a user switches the system to a learning mode and moves a small sound source around in the region in which the microphone should accept sound sources when operating. The microphone system calculates the magnitude and time delay differences in all frequency bands during the training. When the data gathering is complete, the system calculates the best fit of the data using well known statistical methods and calculates a set of thresholds for each frequency bin or group of frequency bins. This approach assists in attaining an increased number of correct decisions about sound source location made for sound sources located in a desired acceptance zone.
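As one hedged illustration of such a learning step, the thresholds for each frequency bin could be set to the mean of the training observations plus or minus a multiple of their standard deviation. This particular statistic is an assumption standing in for the well known statistical methods referred to above, and the names `learn_window` and `learn_thresholds` are hypothetical.

```python
import statistics


def learn_window(samples, width=2.0):
    """Return a (low, high) window of mean +/- width standard deviations.

    Assumption: this simple statistic stands in for the best fit step.
    """
    mean = statistics.fmean(samples)
    spread = statistics.pstdev(samples)
    return (mean - width * spread, mean + width * spread)


def learn_thresholds(training, width=2.0):
    """training maps bin index -> list of (mag_diff_db, delay_s) observations
    gathered while the source moved within the desired acceptance region."""
    thresholds = {}
    for k, observations in training.items():
        mags = [m for m, _ in observations]
        delays = [d for _, d in observations]
        thresholds[k] = (learn_window(mags, width), learn_window(delays, width))
    return thresholds
```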
A sound source used for training could be a small loudspeaker playing a test signal that contains energy in all frequency bands of interest during the training period, either simultaneously, or sequentially. If the microphone is part of a live music system, the sound source can be one of the speakers used as a part of the live music reinforcement system. The sound source could also be a mechanical device that creates noise.
Alternately, a musician can use their own voice or instrument as the training source. During a training period, the musician sings or plays their instrument, positioning the mouth or instrument in various locations within the acceptance zone. Again, the microphone system calculates magnitude and time delay differences in all frequency bands, but rejects any bands for which there is little energy. The thresholds are calculated using best fit approaches as before, and bands which have poor information are filled in by interpolation from nearby frequency bands.
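Filling in bands that carried little energy could be sketched with linear interpolation between the nearest bands that did yield a threshold. The function name `fill_by_interpolation`, the use of `None` to mark a rejected band, and the choice of linear interpolation are assumptions for illustration.

```python
def fill_by_interpolation(values):
    """Fill None entries from neighbouring bands.

    values: one threshold per band, with None where the band had too little
    energy during training. Bands beyond the outermost known band copy the
    nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        return list(values)  # nothing to interpolate from
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = values[left] + t * (values[right] - values[left])
    return filled
```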
Once the system has been trained, the user switches the microphone back to a normal operating mode, and it operates using the newly calculated thresholds. Further, once a microphone system is trained to be approximately correct, a check of the microphone training is done periodically throughout the course of a performance (or other use), using the music of the performance as a test signal.
Referring to
In another embodiment of the invention, slew rate limiting is used in the signal processing. This embodiment is similar to the embodiment of
Referring to
Thus between t=0.1 and 0.3 seconds, the applied gain (which has been slew rate limited) lags behind the calculated gain because the calculated gain is rising faster than the threshold. Between t=0.5 and 0.6, the calculated and applied gains are the same, since the calculated gain is falling at a rate less than the threshold. Beyond t=0.6, the calculated gain is falling faster than the threshold, and the applied gain lags once again until it can catch up.
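The behavior described above, in which the applied gain tracks the calculated gain exactly when it changes slowly and lags when it changes quickly, can be sketched as follows. The function name `slew_limit` and the expression of the rise and fall limits as a maximum change per processing step are assumptions for illustration.

```python
def slew_limit(calculated, max_rise, max_fall, initial=0.0):
    """Return the applied gain sequence for a calculated gain sequence.

    max_rise and max_fall are the largest allowed increase and decrease
    per step (hypothetical units: gain change per data block).
    """
    applied = []
    current = initial
    for target in calculated:
        step = target - current
        step = min(step, max_rise)    # cap how fast the gain may rise
        step = max(step, -max_fall)   # cap how fast the gain may fall
        current += step
        applied.append(current)
    return applied
```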
Another example of using more than two transducers is to create multiple transducer pairs whose sound source distance and angle estimates can be compared. In a reverberant sound field, the magnitude and phase relationships between the sound pressure measured at any two points due to a source can differ substantially from those same two points measured in a free field. As a result, for a source in one particular location in a room, and a pair of transducers in another particular location in the room, the magnitude and phase relationship at one frequency can fall within the acceptance window, even though the physical location of the sound source is outside the acceptance window. In this case, the distance and angle estimate is faulty. However, in a typical room, the distance and angle estimate for that same frequency made just a short distance away is likely to be correct. A microphone system using multiple pairs of microphone elements can make multiple simultaneous estimates of sound source distance and angle for each frequency bin, and reject those estimates that do not agree with the estimates from the majority of other pairs.
An example of the system described in the previous paragraph will be discussed with reference to
In another example described with reference to
In a further embodiment, one of the four transducers (e.g. omni-directional microphones) 202, 204, 206 and 208 is eliminated. For example, if transducer 202 is eliminated, we will have transducers 204 and 208 which can be connected by an imaginary straight line that extends to infinity in either direction, and transducer 206 which is located away from this line. Such an arrangement results in three pairs of transducers 204-208, 206-208 and 204-206 which can be used to determine sound source distance and angle.
The invention has been described with reference to the embodiments described above. However, it will be appreciated that variations and modifications can be effected by a person of ordinary skill in the art without departing from the scope of the invention.
The present application is a division of U.S. application Ser. No. 11/766,622, filed Jun. 21, 2007, the entire disclosure of which is hereby incorporated by reference herein.
Relation | Number | Date | Country |
---|---|---|---|
Parent | 11766622 | Jun 2007 | US |
Child | 14303682 | | US |