The present invention is generally directed to differential microphone arrays (DMAs), and, in particular, to DMAs that have low noise amplification.
Microphone arrays may include a number of geographically arranged microphone sensors for receiving sound signals (such as speech signals) and converting the sound signals to electrical signals. The electrical signals may be digitized by analog-to-digital converters (ADCs) into digital signals, which may be further processed by a processor (such as a digital signal processor). Compared with a single microphone, the sound signals received at microphone arrays may be further processed for noise reduction, speech enhancement, sound source separation, de-reverberation, spatial sound recording, and source localization and tracking. The processed digital signals may be packaged for transmission over communication channels or converted back to analog signals using a digital-to-analog converter (DAC). Microphone arrays have also been configured for beamforming, or directional sound signal reception. The processor may be programmed to receive sound signals from a specific sound source.
Additive microphone arrays may achieve signal enhancement and noise suppression based on synchronize-and-add principles. To achieve better noise suppression, additive microphone arrays may include a large inter-sensor distance. For example, the distance between microphone sensors in additive microphone arrays may range from a couple of centimeters to a couple of decimeters. Because of the large inter-sensor spacing, the bulk size of additive microphone arrays may be large. For this reason, additive microphone arrays may not be suitable for many applications. Additionally, additive microphone arrays may suffer from the following drawbacks. First, the beam patterns of additive microphone arrays are frequency-dependent, and the widths of the formed beams are inversely proportional to the frequency. Therefore, additive microphone arrays are not effective in dealing with low-frequency noise and interference. Second, the noise component from additive microphone arrays is generally attenuated in a non-uniform manner over the entire spectrum, resulting in undesirable artifacts in the output. Finally, when the incident angle of the target speech source is different from the array's facing direction (a situation which may often occur in practice), the speech signal may be low-pass filtered, resulting in speech distortion.
In contrast, differential microphone arrays (DMAs) allow for small inter-sensor distance, and may be made very compact. DMAs include an array of microphone sensors that are responsive to the spatial derivatives of the acoustic pressure field. For example, the outputs of a number of geographically arranged omni-directional sensors may be combined together to measure the differentials of the acoustic pressure fields among microphone sensors. Thus, different orders of DMAs may be constructed from omni-directional microphone sensors so that the DMAs may have certain directivity.
Compared to additive microphone arrays, DMAs have the following advantages. First, DMAs may form frequency-independent beam patterns so that they are effective for processing both high- and low-frequency signals. Second, DMAs have the potential to attain maximum directional gain with a given number of microphone sensors. Third, the gains of DMAs decrease with the distance between the sound source and the arrays, and therefore inherently suppress environmental noise and interference from far-away sources.
An Nth order DMA may be constructed from at least N+1 microphone sensors. As shown in
Another drawback is that DMAs may amplify sensor noise. Each microphone sensor may include membranes that may vibrate in response to sound waves to convert pressures applied by the sound waves into electrical signals. The generated electrical signals include sensor noise in addition to the measurements of the sound. Unlike environmental noise, the sensor noise is inherent to the microphone sensors and therefore is present even in a soundproof environment such as a sound booth. Typically, microphone array outputs may have 20-30 dB of white noise due to the sensors, depending on the quality of the microphone sensors. DMAs are known to amplify sensor noise, and the higher the order of the DMA, the larger the amplification. For example, a third-order DMA of current art may amplify the sensor noise to about 80 dB, rendering the DMA useless for practical purposes.
One way to reduce the sensor noise is to use larger membranes in the microphone sensors. However, both larger membranes and larger microphone sensors increase the bulk size of DMAs. Another way to reduce the sensor noise is to use materials that generate less noise. However, the lower the generated sensor noise, the more expensive the microphone sensors. For example, a 20 dB microphone sensor can be much more expensive than a 30 dB microphone sensor. Finally, no matter how microphone sensors are fabricated, the sensor noise inherently exists and is subject to amplification by DMAs. Thus, the presently available and/or known DMAs are limited to one or two orders of differentials. Accordingly, a need exists to improve over the present DMAs and provide an improved low noise differential microphone array.
There exists a need for differential microphone arrays that are easy to design and can reduce and/or eliminate amplification of sensor noise.
Embodiments of the present invention include a differential microphone array (DMA) that includes a number (M) of microphone sensors for converting a sound to a number of electrical signals and a processor that is configured to apply linearly-constrained minimum variance filters on the electrical signals over a time window to calculate frequency responses of the electrical signals over a plurality of subbands and sum the frequency responses of the electrical signals for each subband to calculate an estimated frequency spectrum of the sound.
In embodiments of the present invention, the number of microphone sensors is larger than the order of the DMA plus one, and the linearly-constrained minimum variance filters are minimum-norm filters. In other embodiments of the present invention, the number of microphone sensors is equal to the order of the DMA plus one.
Embodiments of the present invention include a method for operating a differential microphone array that includes a number (M) of microphone sensors for converting sound to electrical signals. The method includes applying linearly-constrained minimum variance filters on the electrical signals over a time window to calculate frequency responses of the electrical signals over a plurality of subbands and summing the frequency responses of the electrical signals for each subband to calculate an estimated frequency spectrum of the sound.
Embodiments of the present invention include a method for designing reconstruction filters for a differential microphone array including a number (M) of microphone sensors. The method includes specifying a target differential order (N) for the differential microphone array, specifying N+1 steering vectors d(ω,αN,n)=[1,e−jωτ
Embodiments of the present invention include a differential microphone array including a plurality of microphone sensors for receiving a speech signal and whose outputs are divided into frames. In an embodiment, the frames of the outputs are transformed into a frequency response by a frequency transform. In an embodiment, the frames are transformed using short-time Fourier transform (STFT). Other types of frequency transform that may be used to generate a frequency response include discrete cosine transform (DCT) and wavelet based transforms. The frequency responses can be divided into a plurality of subbands. In each subband, a differential beamformer is designed and applied to the frequency response coefficients to produce an estimate of clean signal in the subband. Finally, the clean speech signal is reconstructed by summing the inverse frequency transform of the frequency responses.
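The frame, transform, filter, and reconstruct chain described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `stft_frames` and `apply_subband_filters`, the periodic Hann window, the 50% overlap, and the frame length are all choices made here for the example. The per-subband weights `H` are taken as given, and the per-subband estimate is formed as the conjugate-weighted sum over sensors.

```python
import numpy as np

def stft_frames(x, frame_len=256, hop=128):
    """Split x into overlapping frames, window them, and return their FFTs."""
    n = np.arange(frame_len)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len)  # periodic Hann
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)          # shape (n_frames, n_bins)

def apply_subband_filters(mics, H, frame_len=256, hop=128):
    """mics: (M, T) real signals; H: (n_bins, M) complex weights per subband."""
    Y = np.stack([stft_frames(m, frame_len, hop) for m in mics])  # (M, F, K)
    # Z(frame, bin) = sum over sensors of conj(H[bin, m]) * Y[m, frame, bin]
    Z = np.einsum("km,mfk->fk", np.conj(H), Y)
    z_frames = np.fft.irfft(Z, n=frame_len, axis=1)
    out = np.zeros((Z.shape[0] - 1) * hop + frame_len)
    for i, frame in enumerate(z_frames):        # overlap-add synthesis
        out[i * hop:i * hop + frame_len] += frame
    return out

# Hypothetical test input: three sensors carrying random signals.
rng = np.random.default_rng(0)
mics = rng.standard_normal((3, 2048))
n_bins = 256 // 2 + 1
H = np.zeros((n_bins, 3), dtype=complex)
H[:, 0] = 1.0                                   # trivial filter: pass sensor 1
out = apply_subband_filters(mics, H)
```

With the trivial weights used here, the output reproduces the first sensor signal away from the first and last half frame, because the shifted Hann windows sum to one. A differential beamformer would replace `H` with weights designed per subband.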
$$Y_m(\omega) = e^{-j(m-1)\omega\tau_0\alpha}\,X(\omega) + V_m(\omega) \qquad (1)$$
where X(ω) and Vm(ω) are, respectively, the STFT of the source signal x(k) and the noise component vm(k), j=√(−1) is the imaginary unit, ω=2πf is the angular frequency, τ0=δ/c (where δ is the inter-sensor spacing and c is the speed of sound) is the delay between two successive microphone sensors at angle θ=0°, and α=cos(θ). Embodiments of the present invention may be similarly applicable to non-uniform arrays. For a non-uniform array of microphone sensors, for example, Equation (1) can be written as Y(ω)=e−jωτ
$$y(\omega) = d(\omega,\alpha)\,X(\omega) + v(\omega) \qquad (2)$$
where v(ω)=[V1(ω), V2(ω), . . . , VM(ω)]T, and d(ω,α)=[1, e−jωτ0α, . . . , e−j(M−1)ωτ0α]T is the steering vector (of length M) at the frequency ω, and the superscript T denotes the transpose operator.
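The steering vector and the vector form of the signal model in Equation (2) can be written out numerically. The following is a small sketch for a uniform linear array; the function name `steering_vector` and all numeric values (sensor count, spacing, frequency, incidence angle, noise level) are hypothetical and chosen only for illustration.

```python
import numpy as np

def steering_vector(omega, alpha, M, tau0):
    """d(omega, alpha) = [1, e^{-j omega tau0 alpha}, ..., e^{-j(M-1) omega tau0 alpha}]^T."""
    m = np.arange(M)
    return np.exp(-1j * m * omega * tau0 * alpha)

# Hypothetical example values (not taken from the text).
M = 4                               # number of sensors
delta = 0.01                        # inter-sensor spacing in meters
c = 340.0                           # speed of sound in m/s
tau0 = delta / c                    # delay between adjacent sensors at 0 degrees
omega = 2.0 * np.pi * 1000.0        # angular frequency of one subband
alpha = np.cos(np.deg2rad(60.0))    # alpha = cos(theta)

rng = np.random.default_rng(1)
X = 1.0 + 0.5j                      # source spectrum at this subband
v = 0.01 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
d = steering_vector(omega, alpha, M, tau0)
y = d * X + v                       # Equation (2): y = d X + v
```

Every entry of `d` has unit magnitude, so the steering vector only encodes the relative delays between sensors; at broadside (α=0) it reduces to a vector of ones.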
Embodiments of the present invention include the design of DMAs as beamformers that recover the spectrum of the desired signal X(ω) based on the observed y(ω). As shown in
Referring to
where h(ω)=[H1(ω), H2(ω), . . . , HM(ω)]T. As shown in
The design of the DMA is then to determine the weight vector h(ω) so that Z(ω) is an optimal estimate of X(ω). As indicated by Equation (2), y(ω) includes noise component v(ω) which may include both environmental noise and sensor noise. The weight vector h(ω) may be determined by adaptive beamforming to minimize the noise component. In adaptive beamforming, the noise component may be minimized for certain beam patterns, or
where the superscript H denotes the conjugate transpose. A linearly constrained minimum variance (LCMV) filter solution for Equation (4) is:
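$$h_{\mathrm{LCMV}}(\omega) = \Phi_v^{-1}(\omega)\,D^H(\omega,\alpha)\left[D(\omega,\alpha)\,\Phi_v^{-1}(\omega)\,D^H(\omega,\alpha)\right]^{-1}\beta, \qquad (5)$$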
in which α and β include vectors through which the certain beam patterns may be defined, and Φ(
In an embodiment, M=N+1. Thus, D is a full-rank square matrix, and
$$h_{\mathrm{LCMV}}(\omega) = D^{-1}(\omega,\alpha)\,\beta, \qquad (6)$$
which corresponds exactly to the filter of an Nth-order DMA. However, because hLCMV(ω) is derived directly from the steering vectors d and the beam pattern β, it is designed in the frequency domain. In this way, embodiments of the present invention do not need to calculate the equalization filters, which are hard to design, and therefore have the advantage of easier calculation.
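As a worked illustration of the M=N+1 case, consider M=2 with the first-order cardioid constraints α=[1, −1]T and β=[1, 0]T. The derivation below is a sketch that assumes each row of D is the conjugate transpose of a steering vector, dH(ω,αn), so that D h = β fixes the beam pattern dH h in each constrained direction; the shorthand x=ωτ0 is introduced here only for brevity.

```latex
\begin{aligned}
D(\omega,\alpha) &= \begin{bmatrix} 1 & e^{jx} \\ 1 & e^{-jx} \end{bmatrix},
\qquad x = \omega\tau_0, \\
h_{\mathrm{LCMV}}(\omega) &= D^{-1}(\omega,\alpha)\,\beta
 = \frac{1}{2j\sin x}\begin{bmatrix} -e^{-jx} \\ 1 \end{bmatrix}, \\
\left| d^H(\omega,\cos\theta)\, h_{\mathrm{LCMV}}(\omega) \right|
 &= \left| \frac{\sin\!\big(x(1+\cos\theta)/2\big)}{\sin x} \right|
 \;\approx\; \frac{1+\cos\theta}{2} \quad (x \ll 1), \\
\left\| h_{\mathrm{LCMV}}(\omega) \right\|^2 &= \frac{1}{2\sin^2 x}.
\end{aligned}
```

The pattern approaches the familiar cardioid shape for small x, while the squared norm of the filter grows without bound as x approaches zero. Since uncorrelated sensor noise is scaled by this squared norm, the example shows why a two-sensor design amplifies sensor noise at low frequencies and small spacings.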
Current art requires that M=N+1 so that the steering matrix D is always a square matrix that can be inverted. If M>N+1, the steering matrix D is not a square matrix. In an embodiment of the present invention, when M>N+1, the filter is designed to be a minimum-norm filter, or
$$h(\omega,\alpha,\beta) = D^H(\omega,\alpha)\left[D(\omega,\alpha)\,D^H(\omega,\alpha)\right]^{-1}\beta, \qquad (7)$$
where the selection of vectors α and β of length N+1 may determine the response and the order of the DMA. Since M may be much larger than N+1, the DMA designed according to the minimum-norm filter h(ω,α,β) is much more robust against noise, especially sensor noise. This is because the minimum-norm filter h(ω,α,β) can also be derived by maximizing the white noise gain subject to the Nth-order DMA fundamental constraints. Therefore, for a large number of microphone sensors, the white noise gain may approach M. If the value of M is much larger than N+1, the order of the DMA may no longer be equal to N. However, since the Nth-order DMA fundamental constraints are still fulfilled, the resulting directional pattern may differ only slightly from the one obtained when M=N+1. In this way, the DMA designed according to the minimum-norm filter h(ω,α,β) may achieve an effective trade-off between noise suppression and beamforming.
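The minimum-norm filter can be computed directly from its closed form. The sketch below assumes that the rows of D are the conjugate-transposed steering vectors dH(ω,αn), giving an (N+1)×M matrix; the helper names and the numeric values (1 cm spacing, 1 kHz, first-order cardioid constraints) are illustrative assumptions rather than values prescribed by the text.

```python
import numpy as np

def steering_matrix(omega, alphas, M, tau0):
    """Rows are d^H(omega, alpha_n); shape (N+1, M). The row layout is an assumption."""
    m = np.arange(M)
    return np.exp(1j * np.outer(alphas, m) * omega * tau0)

def min_norm_filter(D, beta):
    """h = D^H (D D^H)^{-1} beta, using a linear solve instead of an explicit inverse."""
    return D.conj().T @ np.linalg.solve(D @ D.conj().T, beta)

tau0 = 0.01 / 340.0                 # hypothetical: 1 cm spacing, c = 340 m/s
omega = 2.0 * np.pi * 1000.0        # hypothetical subband at 1 kHz
alphas = np.array([1.0, -1.0])      # first-order cardioid directions
beta = np.array([1.0, 0.0])         # desired responses in those directions

D2 = steering_matrix(omega, alphas, 2, tau0)
D8 = steering_matrix(omega, alphas, 8, tau0)
h2 = min_norm_filter(D2, beta)      # square case: coincides with D^{-1} beta
h8 = min_norm_filter(D8, beta)      # eight sensors, same two constraints

print(np.linalg.norm(h2), np.linalg.norm(h8))
```

Both filters satisfy the same constraints, but the eight-sensor filter has a smaller norm, which is the property that limits amplification of uncorrelated sensor noise.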
The beam pattern derived using the minimum-norm filter is
$$B[h(\omega,\alpha,\beta),\theta] = d^H(\omega,\cos\theta)\,D^H(\omega,\alpha)\left[D(\omega,\alpha)\,D^H(\omega,\alpha)\right]^{-1}\beta. \qquad (8)$$
The white noise gain, directivity factor, and the gain for a point noise source for the minimum-norm filters are, respectively,
where θn is the incident angle for a point noise source.
As discussed above, the trade-off is between Gdn[h(ω,α,β)]=GN and GWn[h(ω,α,β)]≧1, where GN is the directivity factor of the frequency-independent N-th order DMA.
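The trade-off can be explored numerically. The expressions for the gains are not reproduced in the text above, so the sketch below falls back on definitions that are common in the beamforming literature and should be read as assumptions: white noise gain |hH d(ω,1)|²/(hH h), and directivity factor |hH d(ω,1)|²/(hH Γ h) with Γ the coherence matrix of spherically isotropic noise, whose (i, j) entry is sin(ω(j−i)τ0)/(ω(j−i)τ0). Function names and numeric values are hypothetical.

```python
import numpy as np

def steering_vector(omega, alpha, M, tau0):
    return np.exp(-1j * np.arange(M) * omega * tau0 * alpha)

def design_filter(omega, alphas, beta, M, tau0):
    """Minimum-norm filter; rows of D are assumed to be d^H(omega, alpha_n)."""
    D = np.stack([steering_vector(omega, a, M, tau0).conj() for a in alphas])
    return D.conj().T @ np.linalg.solve(D @ D.conj().T, beta)

def white_noise_gain(h, omega, tau0):
    d0 = steering_vector(omega, 1.0, len(h), tau0)      # look direction, 0 degrees
    return np.abs(np.vdot(h, d0)) ** 2 / np.real(np.vdot(h, h))

def directivity_factor(h, omega, tau0):
    M = len(h)
    d0 = steering_vector(omega, 1.0, M, tau0)
    lag = np.subtract.outer(np.arange(M), np.arange(M))
    # np.sinc(t) = sin(pi t)/(pi t), so the argument is divided by pi.
    gamma = np.sinc(omega * tau0 * lag / np.pi)
    return np.abs(np.vdot(h, d0)) ** 2 / np.real(np.conj(h) @ gamma @ h)

tau0 = 0.01 / 340.0                 # hypothetical: 1 cm spacing, c = 340 m/s
omega = 2.0 * np.pi * 1000.0        # hypothetical subband at 1 kHz
alphas = [1.0, -1.0]                # first-order cardioid
beta = np.array([1.0, 0.0])

results = {}
for M in (2, 4, 8):
    h = design_filter(omega, alphas, beta, M, tau0)
    results[M] = (white_noise_gain(h, omega, tau0),
                  directivity_factor(h, omega, tau0))
    print(M, 10 * np.log10(results[M][0]), 10 * np.log10(results[M][1]))
```

For M=2 the directivity factor comes out close to 3, the value of an ideal first-order cardioid, while the white noise gain is far below 1. Adding sensors with the same constraints raises the white noise gain.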
Thus, embodiments of the present invention include a process for calculating a set of filters that can be used to reconstruct the sound signals. For example, the reconstruction filters specify coefficients at a number of subbands.
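$$d(\omega,\alpha_{N,n}) = \left[1,\; e^{-j\omega\tau_0\alpha_{N,n}},\; \ldots,\; e^{-j(M-1)\omega\tau_0\alpha_{N,n}}\right]^T$$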
where n=1, 2, . . . , N. At 306, the steering matrix D may be constructed from the steering vectors
which is an (N+1)×M matrix. Thus, if M=N+1, D is a square matrix. However, if M>N+1, D is a rectangular matrix. At 308, a set of linearly-constrained minimum variance filters may be calculated. If the number of microphone sensors M=N+1 (N is the order of the DMA), D is a square matrix and
$$h_{\mathrm{LCMV}}(\omega) = D^{-1}(\omega,\alpha)\,\beta.$$
However, if M>N+1, h(ω,α,β)=DH (ω,α)[D(ω,α)DH(ω,α)]−1β, which is a minimum-norm filter which suppresses noise amplification.
For example, the calculated linearly-constrained minimum variance filters or the minimum-norm filter may be used to reconstruct the sound source.
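The design steps described above (choose the order and constraint directions, build the steering matrix, then pick the square or minimum-norm solution) can be collected into one routine that returns a weight vector per subband. This is a sketch under stated assumptions: the function name `design_dma_filters`, its parameters, the frequency grid, and the row layout of D (rows equal to dH(ω,αn)) are all introduced here for illustration.

```python
import numpy as np

def design_dma_filters(M, alphas, beta, delta, freqs, c=340.0):
    """Return H with shape (len(freqs), M): one weight vector per subband."""
    alphas = np.asarray(alphas, dtype=float)
    beta = np.asarray(beta, dtype=complex)
    if M < len(alphas):
        raise ValueError("an Nth-order design needs at least N+1 sensors")
    tau0 = delta / c
    m = np.arange(M)
    H = np.empty((len(freqs), M), dtype=complex)
    for k, f in enumerate(freqs):
        omega = 2.0 * np.pi * f
        D = np.exp(1j * np.outer(alphas, m) * omega * tau0)   # (N+1, M)
        if M == len(alphas):
            H[k] = np.linalg.solve(D, beta)                   # square: D^{-1} beta
        else:
            # Rectangular: minimum-norm solution D^H (D D^H)^{-1} beta
            H[k] = D.conj().T @ np.linalg.solve(D @ D.conj().T, beta)
    return H

# Hypothetical usage: second-order cardioid constraints on six sensors.
freqs = np.linspace(500.0, 4000.0, 8)    # 0 Hz is skipped: D is singular there
H = design_dma_filters(M=6, alphas=[1.0, -1.0, 0.0], beta=[1.0, 0.0, 0.0],
                       delta=0.01, freqs=freqs)
```

Each row of `H` satisfies D h = β at its own frequency, so the constrained directions keep their prescribed responses in every subband.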
Embodiments of the present invention can be used to construct DMAs of different orders, including first-order cardioid (in which α=[1, −1]T, β=[1, 0]T), second-order cardioid (α=[1, −1, 0]T, β=[1, 0, 0]T), and third-order cardioid (α=[1, −1, 0, −√2/2]T, β=[1, 0, 0, −√2/8+¼]T). The number of microphone sensors used for the construction can equal the order plus one or be larger than the order plus one. Experimental results have demonstrated that DMAs designed using the minimum-norm filters exhibit superior robustness against noise.
Embodiments of the present invention can use different numbers of microphone sensors to construct a first-order cardioid DMA, in which α=[1, −1]T (namely, the two constrained directions are 0° and 180°), and β=[1, 0]T (the strengths at 0° and 180° are set to 1 and 0, respectively).
Embodiments of the present invention can use different numbers of microphone sensors to construct second-order cardioid DMAs, in which α=[1, −1, 0]T, β=[1, 0, 0]T.
Embodiments of the present invention can use different numbers of microphone sensors to construct a third-order cardioid DMA, in which α=[1, −1, 0, −√2/2]T, β=[1, 0, 0, −√2/8+¼]T.
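The three cardioid constraint sets can be checked against their ideal patterns. The sketch below assumes that the ideal Nth-order pattern is the degree-N polynomial in cos θ passing through the N+1 points (αn, βn), and that the rows of D are dH(ω,αn); the helper names, the 1 cm spacing, and the 500 Hz subband are hypothetical. It uses the M=N+1 design for each order and records the largest deviation in magnitude from the ideal pattern.

```python
import numpy as np

def beam_pattern(h, omega, tau0, theta):
    """B(theta) = d^H(omega, cos theta) h for an array of len(h) sensors."""
    m = np.arange(len(h))
    dH = np.exp(1j * np.outer(np.cos(theta), m) * omega * tau0)
    return dH @ h

def square_design(omega, tau0, alphas, beta):
    """M = N+1 case: solve D h = beta, rows of D assumed to be d^H(omega, alpha_n)."""
    m = np.arange(len(alphas))
    D = np.exp(1j * np.outer(alphas, m) * omega * tau0)
    return np.linalg.solve(D, np.asarray(beta, dtype=complex))

s2 = np.sqrt(2.0)
cardioids = {
    1: ([1.0, -1.0], [1.0, 0.0]),
    2: ([1.0, -1.0, 0.0], [1.0, 0.0, 0.0]),
    3: ([1.0, -1.0, 0.0, -s2 / 2.0], [1.0, 0.0, 0.0, -s2 / 8.0 + 0.25]),
}

tau0 = 0.01 / 340.0                 # hypothetical: 1 cm spacing, c = 340 m/s
omega = 2.0 * np.pi * 500.0         # hypothetical low-frequency subband
theta = np.linspace(0.0, np.pi, 181)

max_error = {}
poly_coeffs = {}
for order, (alphas, beta) in cardioids.items():
    h = square_design(omega, tau0, alphas, beta)
    # Degree-N polynomial in cos(theta) through the N+1 points (alpha_n, beta_n).
    poly_coeffs[order] = np.polyfit(alphas, beta, deg=order)
    ideal = np.polyval(poly_coeffs[order], np.cos(theta))
    actual = np.abs(beam_pattern(h, omega, tau0, theta))
    max_error[order] = np.max(np.abs(actual - np.abs(ideal)))
    print(order, max_error[order])
```

Under these assumptions the interpolating polynomials are (1+cos θ)/2, (cos θ+cos²θ)/2, and (cos²θ+cos³θ)/2 for orders one to three, and the designed patterns stay close to them at this frequency.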
Embodiments of the present invention provide a low noise differential microphone array that is an improvement over known DMAs. Embodiments of the present invention provide a differential microphone array, including a number (M) of microphone sensors for converting a sound to a number of electrical signals; and a processor which is configured to: apply linearly-constrained minimum variance filters on the electrical signals over a time window to calculate frequency responses of the electrical signals over a plurality of subbands; and sum the frequency responses of the electrical signals for each subband to calculate an estimated frequency spectrum of the sound. In embodiments, the processor is configured to, prior to applying the linearly-constrained minimum variance filters, calculate a short-time Fourier transform of the electrical signals; and calculate an inverse short-time Fourier transform of the estimated frequency spectrum of the electrical signals. In embodiments, the differential microphone array is one of a uniform linear microphone array and a non-uniform linear microphone array. In embodiments, a differential order of the differential microphone array is N, and the linearly-constrained minimum variance filters are determined by a beam pattern of the differential microphone array. In embodiments, the linearly-constrained minimum variance filter is calculated as a function of a steering matrix D, and the steering matrix D includes N+1 steering vectors d(ω,αN,n)=[1,e−jωτ
Embodiments of the present invention provide a method and system for operating a differential microphone array that includes a number (M) of microphone sensors for converting sound to electrical signals, including: applying, by a processor, linearly-constrained minimum variance filters on the electrical signals over a time window to calculate frequency responses of the electrical signals over a plurality of subbands; and summing, by the processor, the frequency responses of the electrical signals for each subband to calculate an estimated frequency spectrum of the sound. In embodiments, prior to applying the linearly-constrained minimum variance filters, calculating a short-time Fourier transform of the electrical signals; and calculating an inverse short-time Fourier transform of the estimated frequency spectrum of the electrical signals. In embodiments of the system and method, the differential microphone array is one of a uniform linear microphone array and a non-uniform linear array. In embodiments of the system and method, a differential order of the differential microphone array is N, and the linearly-constrained minimum variance filters are determined by a beam pattern of the differential microphone array. In embodiments of the system and method, the linearly-constrained minimum variance filter is calculated as a function of a steering matrix D, and the steering matrix includes N+1 steering vectors d(ω,αN,n)=[1,e−jωτ
Embodiments of the present invention provide a method and system for designing reconstruction filters for a differential microphone array including a number (M) of microphone sensors, including: specifying, by a processor, a target differential order (N) for the differential microphone array; specifying, by the processor, N+1 steering vectors d(ω,αN,n)=[1,e−jωτ
It will be appreciated that the disclosed methods, systems, and procedures described herein can be implemented using one or more processors executing instructions from one or more computer programs or components. These components may be provided as a series of computer instructions on a computer-readable medium, including, for example, RAM, ROM, flash memory, magnetic and/or optical disks, optical memory, and/or other storage media. The instructions may be configured to be executed by one or more processors which, when executing the series of computer instructions, perform or facilitate the performance of all or part of the disclosed methods and procedures.
Although the present disclosure has been described with reference to particular examples and embodiments, it is understood that the present disclosure is not limited to those examples and embodiments. Further, those embodiments may be used in various combinations with and without each other. The present disclosure as claimed therefore includes variations from the specific examples and embodiments described herein, as will be apparent to one of skill in the art.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2012/085830 | 12/4/2012 | WO | 00 |