The present invention is directed to the processing of signals, and more particularly, but not exclusively, relates to techniques to extract a signal from a selected source while suppressing interference from one or more other sources using two or more microphones.
The difficulty of extracting a desired signal in the presence of interfering signals is a long-standing problem confronted by engineers. This problem impacts the design and construction of many kinds of devices such as acoustic-based systems for interrogation, detection, speech recognition, hearing assistance or enhancement, and/or intelligence gathering. Generally, such devices do not permit the selective amplification of a desired sound when contaminated by noise from a nearby source. This problem is even more severe when the desired sound is a speech signal and the nearby noise is also a speech signal produced by other talkers. As used herein, “noise” refers not only to random or nondeterministic signals, but also to undesired signals and signals interfering with the perception of a desired signal.
One form of the present invention includes a unique signal processing technique using two or more detectors. Other forms include unique devices and methods for processing signals.
A further embodiment of the present invention includes a system with a number of directional sensors and a processor operable to execute a beamforming routine with signals received from the sensors. The processor is further operable to provide an output signal representative of a property of a selected source detected with the sensors. The beamforming routine may be of a fixed or adaptive type.
In another embodiment, an arrangement includes a number of sensors each responsive to detected sound to provide a corresponding number of representative signals. These sensors each have a directional reception pattern with a maximum response direction and a minimum response direction that differ in relative sound reception level by at least 3 decibels at a selected frequency. A first axis coincident with the maximum response direction of a first one of the sensors intersects a second axis coincident with the maximum response direction of a second one of those sensors at an angle in a range of about 10 degrees through about 180 degrees. A processor is also included that is operable to execute a beamforming routine with the sensor signals and generate an output signal representative of a selected sound source. An output device may be included that responds to this output signal to provide an output representative of sound from the selected source. In one form, the sensors, processor, and output device belong to a hearing system.
Still another embodiment includes: providing a number of directional sensors each operable to detect sound and provide a corresponding number of sensor signals. The sensors each have a directional response pattern oriented in a predefined positional relationship with respect to one another. The sensor signals are processed with a number of signal weights that are adaptively recalculated from time-to-time. An output is provided based on this processing that represents sound emanating from a selected source.
Yet another embodiment includes a number of sensors oriented in relation to a reference axis and operable to provide a number of sensor signals representative of sound. The sensors each have a directional response pattern with a maximum response direction, and are arranged in a predefined positional relationship relative to one another with a separation distance of less than two centimeters to reduce a difference in time of reception between the sensors for sound emanating from a source closer to one of the sensors than another of the sensors. A processor generates an output signal from the sensor signals as a function of a number of signal weights for each of a number of different frequencies. The signal weights are adaptively recalculated from time-to-time.
Still a further embodiment of the present invention includes: positioning a number of directional sensors in a predefined geometry relative to one another that each have a directional pattern with sound response being attenuated by at least 3 decibels from one direction relative to another direction at a selected frequency; detecting acoustic excitation with the sensors to provide a corresponding number of sensor signals; establishing a number of frequency domain components for each of the sensor signals; and determining an output signal representative of the acoustic excitation from a designated direction. This determination can include weighting the components for each of the sensor signals to reduce variance of the output signal and provide a predefined gain of the acoustic excitation from the designated direction.
Further embodiments, objects, features, aspects, benefits, forms, and advantages of the present invention shall become apparent from the detailed drawings and descriptions provided herein.
While the present invention can take many different forms, for the purpose of promoting an understanding of the principles of the invention, reference will now be made to the embodiments illustrated in the drawings and specific language will be used to describe the same. It will nevertheless be understood that no limitation of the scope of the invention is thereby intended. Any alterations and further modifications of the described embodiments, and any further applications of the principles of the invention as described herein are contemplated as would normally occur to one skilled in the art to which the invention relates.
Sensors 22, 24 are separated by separation distance SD as illustrated by the like labeled line segment along lateral axis T. Lateral axis T is perpendicular to azimuthal axis AZ. Midpoint M represents the halfway point along separation distance SD between sensor 22 and sensor 24. Axis AZ intersects midpoint M and acoustic source 12. Axis AZ is designated as a point of reference for sources 12, 14, 16 in the azimuthal plane and for sensors 22, 24. For the depicted embodiment, sources 14, 16 define azimuthal angles 14a, 16a relative to axis AZ of about +22° and −65°, respectively. Correspondingly, acoustic source 12 is at 0° relative to axis AZ. In one mode of operation of system 10, the “on axis” alignment of acoustic source 12 with axis AZ selects it as a desired or target source of acoustic excitation to be monitored with system 10. In contrast, the “off-axis” sources 14, 16 are treated as noise and suppressed by system 10, which is explained in more detail hereinafter. To adjust the direction being monitored, sensors 22, 24 can be steered to change the position of axis AZ. In an additional or alternative operating mode, the designated monitoring direction can be adjusted as more fully described below. For these operating modes, it should be understood that neither sensor 22 nor 24 needs to be moved to change the designated monitoring direction, and the designated monitoring direction need not be coincident with axis AZ.
Sensors 22, 24 are of a directional type and are illustrated in the form of microphones 23 each having a type of directional sound-sensing pattern with a maximum response direction. A few nonlimiting types of such directional patterns are illustrated in
Other types of directional patterns and/or acoustic/sound sensor types can be utilized in other embodiments. Alternatively or additionally, more or fewer acoustic sources at different azimuths may be present; where the illustrated number and arrangement of sources 12, 14, 16 is provided as merely one of many examples. In one such example, a room with several groups of individuals engaged in simultaneous conversation may provide a number of the sources.
Referring again to
Referring additionally to
Processor 42 can be a software or firmware programmable device, a state logic machine, or a combination of both programmable and dedicated hardware. Furthermore, processor 42 can be comprised of one or more components and can include one or more Central Processing Units (CPUs). In one embodiment, processor 42 is in the form of a digitally programmable, highly integrated semiconductor chip particularly suited for signal processing. In other embodiments, processor 42 may be of a general purpose type or other arrangement as would occur to those skilled in the art.
Likewise, memory 50 can be variously configured as would occur to those skilled in the art. Memory 50 can include one or more types of solid-state electronic memory, magnetic memory, or optical memory of the volatile and/or nonvolatile variety. Furthermore, memory 50 can be integral with one or more other components of processing subsystem 30 and/or comprised of one or more distinct components.
Processing subsystem 30 can include any oscillators, control clocks, interfaces, signal conditioners, additional filters, limiters, converters, power supplies, communication ports, or other types of components as would occur to those skilled in the art to implement the present invention. In one embodiment, some or all of the operational components of subsystem 30 are provided in the form of a single, integrated circuit device.
Referring also to the flow chart of
In stage 142, routine 140 begins with initiation of the A/D sampling and storage of the resulting discrete input samples xA(z) and xB(z) in buffer 52 as previously described. Sampling is performed in parallel with other stages of routine 140 as will become apparent from the following description. Routine 140 proceeds from stage 142 to conditional 144. Conditional 144 tests whether routine 140 is to continue. If not, routine 140 halts. Otherwise, routine 140 continues with stage 146. Conditional 144 can correspond to an operator switch, control signal, or power control associated with system 10 (not shown).
In stage 146, a fast discrete Fourier transform (FFT) algorithm is executed on a sequence of samples xA(z) and xB(z) for each channel A and B to provide corresponding frequency domain signals XA(k) and XB(k); where k is an index to the discrete frequencies of the FFTs (alternatively referred to as “frequency bins” herein). The set of samples xA(z) and xB(z) upon which an FFT is performed can be described in terms of a time duration of the sample data. Typically, for a given sampling rate fS, each FFT is based on more than 100 samples. Furthermore, for stage 146, FFT calculations include application of a windowing technique to the sample data. One embodiment utilizes a Hamming window. In other embodiments, data windowing can be absent or a different type utilized, the FFT can be based on a different sampling approach, and/or a different transform can be employed as would occur to those skilled in the art. After the transformation, the resulting spectra XA(k) and XB(k) are stored in FFT buffer 54 of memory 50. These spectra can be complex-valued.
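As a nonlimiting illustration of stage 146, the following sketch applies a Hamming window to one frame per channel and computes the corresponding frequency bins. It is a minimal example under stated assumptions rather than the implementation of routine 140: the names `hamming`, `windowed_dft`, `x_a`, and `x_b` are hypothetical, the 128-sample test frames are invented for the example, and a naive discrete Fourier transform stands in for an FFT so that only the Python standard library is needed.

```python
import cmath
import math


def hamming(n_samples):
    """Hamming window coefficients for a frame of n_samples points."""
    if n_samples == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n_samples - 1))
            for i in range(n_samples)]


def windowed_dft(frame):
    """Return the bins X(k) of a Hamming-windowed frame (naive O(N^2) DFT)."""
    n = len(frame)
    window = hamming(n)
    xw = [frame[i] * window[i] for i in range(n)]
    return [sum(xw[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]


# Hypothetical frames for channels A and B: 128 samples each (more than 100),
# holding a tone that completes 8 cycles per frame, at half amplitude in B.
N = 128
x_a = [math.sin(2.0 * math.pi * 8 * i / N) for i in range(N)]
x_b = [0.5 * v for v in x_a]

X_A = windowed_dft(x_a)  # complex-valued spectrum for channel A
X_B = windowed_dft(x_b)  # complex-valued spectrum for channel B
```

In this example the largest bin magnitude in the lower half of `X_A` falls at k = 8, and each bin of `X_B` is half the corresponding bin of `X_A`, reflecting the amplitude difference between the two hypothetical channels.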
It has been found that reception of acoustic excitation emanating from a desired direction can be improved by weighting and summing the input signals in a manner arranged to minimize the variance (or equivalently, the energy) of the resulting output signal while under the constraint that signals from the desired direction are output with a predetermined gain. The following relationship (1) expresses this linear combination of the frequency domain input signals:
$$Y(k) = W_A^{*}(k)\,X_A(k) + W_B^{*}(k)\,X_B(k) = W^{H}(k)\,X(k) \qquad (1)$$
where $W(k) = [W_A(k)\;\; W_B(k)]^{T}$ and $X(k) = [X_A(k)\;\; X_B(k)]^{T}$; Y(k) is the output signal in frequency domain form, WA(k) and WB(k) are complex valued multipliers (weights) for each frequency k corresponding to channels A and B, the superscript “*” denotes the complex conjugate operation, and the superscript “H” denotes taking the Hermitian transpose of a vector. For this approach, it is desired to determine an “optimal” set of weights WA(k) and WB(k) to minimize variance of Y(k). Minimizing the variance generally causes cancellation of sources not aligned with the desired direction. For the mode of operation where the desired direction is along axis AZ, frequency components which do not originate from directly ahead of the array are attenuated because they are not consistent in amplitude and possibly phase across channels A and B. Minimizing the variance in this case is equivalent to minimizing the output power of off-axis sources, as related by the optimization goal of relationship (2) that follows:
$$\min_{W(k)} E\{\,|Y(k)|^{2}\,\} \qquad (2)$$
where Y(k) is the output signal described in connection with relationship (1). In one form, the constraint requires that “on axis” acoustic signals from sources along the axis AZ be passed with unity gain as provided in relationship (3) that follows:
$$e^{H}\,W(k) = 1 \qquad (3)$$
Here e is a two element vector which corresponds to the desired direction. When this direction is coincident with axis AZ, sensors 22 and 24 generally receive the signal at the same time and possibly with an expected difference in amplitude, and thus, for source 12 of the illustrated embodiment, the vector e is real-valued with equal weighted elements—for instance eH=[1 1]. In contrast, if the selected acoustic source is not on axis AZ, then sensors 22, 24 can be steered to align axis AZ with it.
In an additional or alternative mode of operation, the elements of vector e can be selected to monitor along a desired direction that is not coincident with axis AZ. For such operating modes, vector e possibly becomes complex-valued to represent the appropriate time/amplitude/phase difference between sensors 22, 24 that correspond to acoustic excitation off axis AZ. Thus, vector e operates as the direction indicator previously described. Correspondingly, alternative embodiments can be arranged to select a desired acoustic excitation source by establishing a different geometric relationship relative to axis AZ. For instance, the direction for monitoring a desired source can be disposed at a nonzero azimuthal angle relative to axis AZ. Indeed, by changing vector e, the monitoring direction can be steered from one direction to another without moving either sensor 22, 24.
For the general case of a system with C sensors, the vector e is the steering vector describing the weights and delays associated with a desired monitoring direction and is of the form provided by relationship (4):
$$e(\phi) = \left[\,a_1(k)\,e^{+j\phi_1(k)} \;\;\; a_2(k)\,e^{+j\phi_2(k)} \;\;\; \cdots \;\;\; a_C(k)\,e^{+j\phi_C(k)}\,\right]^{T} \qquad (4)$$
where $a_n(k)$ is a real-valued constant representing the amplitude of the response from each channel n for the target direction, and $\phi_n(k)$ represents the relative phase delay of each channel n. For the specific case of a linearly spaced array in free space, $\phi_n(k)$ is defined by relationship (5):
where c is the speed of sound in meters per second, D is the spacing between array elements in meters, fs is the sampling frequency in Hertz, and θ is the desired “look direction.” If the array is not linearly spaced or if the sensors are not in free space, the expression for φn(k) may become more complex. Thus, vector e may be varied with frequency to change the desired monitoring direction or look-direction and correspondingly steer the response of the array of differently oriented directional sensors.
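By way of a hedged illustration, a steering vector of the form of relationship (4) can be sketched for a uniformly spaced linear array under a far-field, free-space assumption. The sketch below is not a statement of relationship (5); it assumes that frequency bin k of an `n_fft`-point transform corresponds to k·fs/n_fft Hertz, that channel n (counted from zero) is delayed by n·D·sin(θ)/c, and that c = 343 meters per second. The function name and all parameter names are hypothetical.

```python
import cmath
import math


def steering_vector(k, n_fft, fs, spacing_m, theta_rad, num_sensors,
                    amplitudes=None, c=343.0):
    """Far-field steering vector for a uniformly spaced linear array.

    Assumptions (not taken from the text): bin k maps to k * fs / n_fft Hz,
    and channel n is delayed by n * spacing_m * sin(theta) / c seconds.
    """
    if amplitudes is None:
        amplitudes = [1.0] * num_sensors
    f_k = k * fs / n_fft  # frequency of bin k in Hertz
    e = []
    for n in range(num_sensors):
        delay = n * spacing_m * math.sin(theta_rad) / c  # seconds
        phi = 2.0 * math.pi * f_k * delay                # relative phase delay
        e.append(amplitudes[n] * cmath.exp(1j * phi))    # a_n * exp(+j * phi_n)
    return e


# On-axis look direction: no inter-channel delay, so e is real-valued.
e_on_axis = steering_vector(k=10, n_fft=256, fs=16000.0, spacing_m=0.01,
                            theta_rad=0.0, num_sensors=2)

# Off-axis look direction with unequal channel amplitudes.
e_off_axis = steering_vector(k=10, n_fft=256, fs=16000.0, spacing_m=0.01,
                             theta_rad=math.pi / 6, num_sensors=3,
                             amplitudes=[1.0, 0.8, 0.6])
```

With θ = 0 every element reduces to its amplitude, which agrees with the real-valued vector e described for a source on axis AZ; a nonzero θ yields complex-valued elements whose magnitudes remain the channel amplitudes.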
For inputs XA(k) and XB(k) that generally correspond to stationary random processes (which is typical of speech signals over small periods of time), the following weight vector W(k) in relationship (6) can be determined from relationships (2) and (3):
$$W(k) = \frac{R(k)^{-1}\,e}{e^{H}\,R(k)^{-1}\,e} \qquad (6)$$
where e is the vector associated with the desired reception direction, R(k) is the correlation matrix for the kth frequency, W(k) is the optimal weight vector for the kth frequency and the superscript “−1” denotes the matrix inverse. The derivation of this relationship is explained in connection with a general model of the present invention applicable to embodiments with more than two sensors 22, 24 in array 20.
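For the two-channel case, the constrained minimum-variance weights can be sketched with a closed-form 2×2 matrix inverse. The following is a minimal example for a single frequency bin, assuming the standard solution W = R⁻¹e / (eᴴR⁻¹e) of the variance minimization under the unity-gain constraint of relationship (3); the function names `mvdr_weights` and `beamform` and the numeric values of `R` and `e` are hypothetical.

```python
def mvdr_weights(R, e):
    """Return W = R^{-1} e / (e^H R^{-1} e) for a 2x2 matrix R and 2-vector e."""
    (a, b), (c, d) = R
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("correlation matrix is singular or nearly singular")
    # u = R^{-1} e, using the closed-form inverse of a 2x2 matrix
    u0 = (d * e[0] - b * e[1]) / det
    u1 = (-c * e[0] + a * e[1]) / det
    denom = e[0].conjugate() * u0 + e[1].conjugate() * u1  # e^H R^{-1} e
    return [u0 / denom, u1 / denom]


def beamform(W, X):
    """Y = W^H X = conj(W_A) * X_A + conj(W_B) * X_B for one frequency bin."""
    return W[0].conjugate() * X[0] + W[1].conjugate() * X[1]


# Hypothetical Hermitian correlation matrix and on-axis direction vector.
R = [[2.0 + 0j, 0.5 + 0.5j],
     [0.5 - 0.5j, 1.0 + 0j]]
e = [1.0 + 0j, 1.0 + 0j]

W = mvdr_weights(R, e)

# A component arriving identically on both channels passes with unity gain.
s = 0.3 - 0.7j
Y = beamform(W, [s, s])
```

Here `Y` equals `s` to within rounding, which is the unity-gain behavior required by the constraint, while components whose channel relationship differs from e are weighted so as to reduce output power.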
The correlation matrix R(k) can be estimated from spectral data obtained via a number “F” of fast discrete Fourier transforms (FFTs) calculated over a relevant time interval. For the two channel (channels A and B) embodiment, the correlation matrix for the kth frequency, R(k), is expressed by the following relationship (7):
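$$R(k) = \begin{bmatrix} \dfrac{M}{F}\displaystyle\sum_{n=1}^{F} X_A(n,k)\,X_A^{*}(n,k) & \dfrac{1}{F}\displaystyle\sum_{n=1}^{F} X_A(n,k)\,X_B^{*}(n,k) \\[2ex] \dfrac{1}{F}\displaystyle\sum_{n=1}^{F} X_B(n,k)\,X_A^{*}(n,k) & \dfrac{M}{F}\displaystyle\sum_{n=1}^{F} X_B(n,k)\,X_B^{*}(n,k) \end{bmatrix} = \begin{bmatrix} R_{AA}(k) & R_{AB}(k) \\ R_{BA}(k) & R_{BB}(k) \end{bmatrix} \qquad (7)$$
where XA is the FFT in the frequency buffer for channel A and XB is the FFT in the frequency buffer for channel B obtained from previously stored FFTs that were calculated from an earlier execution of stage 146; “n” is an index to the number “F” of FFTs used for the calculation; and “M” is a regularization parameter. The terms RAA(k), RAB(k), RBA(k), and RBB(k) represent the weighted sums for purposes of compact expression.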
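The estimate of the correlation matrix from F stored spectra can be sketched for one frequency bin as follows. This is an illustrative example only: it assumes that regularization parameter M multiplies the two diagonal terms, that the cross term is formed as XA times the conjugate of XB (chosen to be consistent with an output of the form Y = WᴴX), and that the function name and sample values are hypothetical.

```python
def estimate_correlation(XA_hist, XB_hist, M=1.03):
    """Estimate the 2x2 correlation matrix for one bin from F stored FFT values.

    Assumption: M scales only the diagonal terms R_AA and R_BB.
    """
    F = len(XA_hist)
    r_aa = M * sum(abs(x) ** 2 for x in XA_hist) / F
    r_bb = M * sum(abs(x) ** 2 for x in XB_hist) / F
    r_ab = sum(xa * xb.conjugate() for xa, xb in zip(XA_hist, XB_hist)) / F
    r_ba = r_ab.conjugate()  # the matrix is Hermitian
    return [[r_aa, r_ab], [r_ba, r_bb]]


def determinant(R):
    """Determinant of a 2x2 matrix given as nested lists."""
    return R[0][0] * R[1][1] - R[0][1] * R[1][0]


# Hypothetical stored values for one bin; both channels are identical,
# which makes the unregularized matrix singular.
history = [1 + 1j, 2 - 1j, -0.5j]

R_plain = estimate_correlation(history, history, M=1.0)
R_reg = estimate_correlation(history, history, M=1.03)
```

With identical channel data the determinant of `R_plain` is zero to within rounding, so it cannot be inverted, whereas the determinant of `R_reg` is clearly positive, showing how a value of M slightly greater than one keeps the weight calculation well defined.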
Accordingly, in stage 148 spectra XA(k) and XB(k) previously stored in buffer 54 are read from memory 50 in a First-In-First-Out (FIFO) sequence. Routine 140 then proceeds to stage 150. In stage 150, multiplier weights WA*(k), WB*(k) are applied to XA(k) and XB(k), respectively, in accordance with the relationship (1) for each frequency k to provide the output spectra Y(k). Routine 140 continues with stage 152 which performs an Inverse Fast Fourier Transform (IFFT) to change the Y(k) FFT determined in stage 150 into a discrete time domain form designated y(z). Next, in stage 154, a Digital-to-Analog (D/A) conversion is performed with D/A converter 84 (
After conversion to the continuous time domain form, signal y(t) is input to signal conditioner/filter 86. Conditioner/filter 86 provides the conditioned signal to output device 90. As illustrated in
After stage 154, routine 140 continues with conditional 156. In many applications it may not be desirable to recalculate the elements of weight vector W(k) for every Y(k). Accordingly, conditional 156 tests whether a desired time interval has passed since the last calculation of vector W(k). If this time period has not lapsed, then control flows to stage 158 to shift buffers 52, 54 to process the next group of signals. From stage 158, processing loop 160 closes, returning to conditional 144. Provided conditional 144 remains true, stage 146 is repeated for the next group of samples of xA(z) and xB(z) to determine the next pair of XA(k) and XB(k) FFTs for storage in buffer 54. Also, with each execution of processing loop 160, stages 148, 150, 152, 154 are repeated to process previously stored XA(k) and XB(k) FFTs to determine the next Y(k) FFT and correspondingly generate a continuous y(t). In this manner buffers 52, 54 are periodically shifted in stage 158 with each repetition of loop 160 until either routine 140 halts as tested by conditional 144 or the time period of conditional 156 has lapsed.
If the test of conditional 156 is true, then routine 140 proceeds from the affirmative branch of conditional 156 to calculate the correlation matrix R(k) in accordance with relationship (7) in stage 162. From this new correlation matrix R(k), an updated vector W(k) is determined in accordance with relationship (6) in stage 164. From stage 164, update loop 170 continues with stage 158 previously described, and processing loop 160 is re-entered until routine 140 halts per conditional 144 or the time for another recalculation of vector W(k) arrives. Notably, the time period tested in conditional 156 may be measured in terms of the number of times loop 160 is repeated, the number of FFTs or samples generated between updates, and the like. Alternatively, the period between updates can be dynamically adjusted based on feedback from an operator or monitoring device (not shown).
When routine 140 initially starts, earlier stored data is not generally available. Accordingly, appropriate seed values may be stored in buffers 52, 54 in support of initial processing. In other embodiments, a greater number of acoustic sensors can be included in array 20 and routine 140 can be adjusted accordingly.
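The weighting of stage 150 and the inverse transformation of stage 152 can be sketched for a single frame as shown below. This is a hedged example rather than the implementation of routine 140: naive forward and inverse discrete Fourier transforms stand in for the FFT and IFFT, no windowing or frame overlap is modeled, and the function names and sample values are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive forward DFT (stand-in for the FFT of stage 146)."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]


def idft(Y):
    """Naive inverse DFT (stand-in for the IFFT of stage 152)."""
    n = len(Y)
    return [sum(Y[k] * cmath.exp(2j * math.pi * k * i / n) for k in range(n)) / n
            for i in range(n)]


def apply_weights(W_A, W_B, X_A, X_B):
    """Stage 150: Y(k) = conj(W_A(k)) X_A(k) + conj(W_B(k)) X_B(k) for each bin."""
    return [W_A[k].conjugate() * X_A[k] + W_B[k].conjugate() * X_B[k]
            for k in range(len(X_A))]


# Hypothetical frame received identically on both channels (on-axis source).
x = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.1, -0.6]
X_A = dft(x)
X_B = dft(x)

# Equal weights of 0.5 in every bin satisfy the unity-gain constraint for e = [1, 1].
W_A = [0.5 + 0j] * len(x)
W_B = [0.5 + 0j] * len(x)

Y = apply_weights(W_A, W_B, X_A, X_B)
y = idft(Y)  # discrete time domain output y(z)
```

Because the weights in this example pass the on-axis direction with unity gain, the real part of `y` reproduces the input frame `x` to within rounding.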
Referring to relationship (7), regularization factor M typically is slightly greater than 1.00 to limit the magnitude of the weights in the event that the correlation matrix R(k) is, or is close to being, singular, and therefore noninvertible. This occurs, for example, when time-domain input signals are exactly the same for F consecutive FFT calculations.
In one embodiment, regularization factor M is a constant. In other embodiments, regularization factor M can be used to adjust or otherwise control the array beamwidth, or the angular range at which a sound of a particular frequency can impinge on the array relative to axis AZ and be processed by routine 140 without significant attenuation. This beamwidth is typically larger at lower frequencies than higher frequencies, and increases with regularization factor M. Accordingly, in one alternative embodiment of routine 140, regularization factor M is increased as a function of frequency to provide a more uniform beamwidth across a desired range of frequencies. In another embodiment of routine 140, M is alternatively or additionally varied as a function of time. For example, if little interference is present in the input signals in certain frequency bands, the regularization factor M can be increased in those bands. In a further variation, this regularization factor M can be reduced for frequency bands that contain interference above a selected threshold. In still another embodiment, regularization factor M varies in accordance with an adaptive function based on frequency-band-specific interference. In yet further embodiments, regularization factor M varies in accordance with one or more other relationships as would occur to those skilled in the art.
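One simple way to realize a regularization factor that increases with frequency is a per-bin schedule. The sketch below uses a linear ramp purely as an assumption for illustration; the text does not specify the form of the function, and the name `regularization_schedule` and the end values 1.01 and 1.10 are hypothetical.

```python
def regularization_schedule(num_bins, m_low=1.01, m_high=1.10):
    """Return one regularization factor M(k) per frequency bin.

    Assumption: a linear ramp from m_low at bin 0 to m_high at the last bin.
    """
    if num_bins == 1:
        return [m_low]
    step = (m_high - m_low) / (num_bins - 1)
    return [m_low + step * k for k in range(num_bins)]


# Hypothetical schedule for 65 bins (for example, the non-negative
# frequencies of a 128-point transform).
M_of_k = regularization_schedule(65)
```

Each entry of `M_of_k` would then be used in place of a constant M when forming the correlation matrix for the corresponding bin; a time-varying or interference-dependent rule could replace the ramp in the same position.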
Referring to
In operation, the user of handset 220 can selectively receive an acoustic signal by aligning the corresponding source with a designated direction, such as axis AZ. As a result, sources from other directions are attenuated. Moreover, the user may select a different signal by realigning axis AZ with another desired sound source and correspondingly suppress one or more different off-axis sources. Alternatively or additionally, system 210 can be configured to operate with a reception direction that is not coincident with axis AZ. In a further alternative form, hands-free telephone system 210 includes multiple devices distributed within the passenger compartment of a vehicle to provide hands-free operation. For example, one or more loudspeakers and/or one or more acoustic sensors can be remote from handset 220 in such alternatives.
Under certain circumstances, the directional orientation of a sensor array relative to the target acoustic source changes. Without accounting for such changes, attenuation of the target signal can result. This situation can arise, for example, when a hearing aid wearer turns his or her head so that he or she is not aligned properly with the target source, and the hearing aid does not otherwise account for this misalignment. It has been found that attenuation due to misalignment can be reduced by localizing and/or tracking one or more acoustic sources of interest.
In a further embodiment, one or more transformation techniques are utilized in addition to or as an alternative to Fourier transforms in one or more forms of the invention previously described. One example is the wavelet transform, which mathematically breaks up the time-domain waveform into many simple waveforms, which may vary widely in shape. Typically wavelet basis functions are similarly shaped signals with logarithmically spaced frequencies. As frequency rises, the basis functions become shorter in time duration with the inverse of frequency. Like Fourier transforms, wavelet transforms represent the processed signal with several different components that retain amplitude and phase information. Accordingly, routine 140 and/or routine 520 can be adapted to use such alternative or additional transformation techniques. In general, any signal transform components that provide amplitude and/or phase information about different parts of an input signal and have a corresponding inverse transformation can be applied in addition to or in place of FFTs.
Routine 140 and the variations previously described generally adapt more quickly to signal changes than conventional time-domain iterative-adaptive schemes. In certain applications where the input signal changes rapidly over a small interval of time, it may be desired to be more responsive to such changes. For these applications, the number F of FFTs associated with correlation matrix R(k) (alternatively designated the correlation length F) may provide a more desirable result if it is not constant for all signals. Generally, a smaller correlation length F is best for rapidly changing input signals, while a larger correlation length F is best for slowly changing input signals.
A varying correlation length F can be implemented in a number of ways. In one example, filter weights are determined using different parts of the frequency-domain data stored in the correlation buffers. For buffer storage in the order of the time they are obtained (First-In, First-Out (FIFO) storage), the first half of the correlation buffer contains data obtained from the first half of the subject time interval and the second half of the buffer contains data from the second half of this time interval. Accordingly, the correlation matrices R1(k) and R2(k) can be determined for each buffer half according to relationships (8) and (9) as follows:
R(k) can be obtained by summing correlation matrices R1(k) and R2(k).
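The split of the correlation buffer into two halves can be sketched for one frequency bin as follows. Because the normalization of each half is an assumption here, the sketch scales both partial sums by the full length F so that their element-wise sum reproduces R(k); the function names and sample values are hypothetical, and M is again assumed to scale only the diagonal terms.

```python
def correlation_sum(XA, XB, F_total, M=1.03):
    """Regularized correlation terms over the given entries, scaled by 1/F_total."""
    r_aa = M * sum(abs(x) ** 2 for x in XA) / F_total
    r_bb = M * sum(abs(x) ** 2 for x in XB) / F_total
    r_ab = sum(xa * xb.conjugate() for xa, xb in zip(XA, XB)) / F_total
    return [[r_aa, r_ab], [r_ab.conjugate(), r_bb]]


def split_correlation(XA_hist, XB_hist, M=1.03):
    """Return (R1, R2, R): first-half, second-half, and summed matrices."""
    F = len(XA_hist)
    half = F // 2
    R1 = correlation_sum(XA_hist[:half], XB_hist[:half], F, M)  # older data
    R2 = correlation_sum(XA_hist[half:], XB_hist[half:], F, M)  # newer data
    R = [[R1[i][j] + R2[i][j] for j in range(2)] for i in range(2)]
    return R1, R2, R


# Hypothetical FIFO-ordered buffer contents for one frequency bin.
XA_hist = [1 + 0j, 0.5 + 0.5j, -1j, 2 + 1j]
XB_hist = [0.8 + 0j, 0.4 + 0.4j, -0.5j, 1 - 1j]

R1, R2, R = split_correlation(XA_hist, XB_hist)
R_full = correlation_sum(XA_hist, XB_hist, len(XA_hist))
```

Under this normalization `R` matches `R_full`, the matrix computed over the whole buffer, so the two half-buffer matrices can be compared with one another without a separate pass over the data.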
Using relationship (6) of routine 140, filter coefficients (weights) can be obtained using both R1(k) and R2(k). If the weights differ significantly for some frequency band k between R1(k) and R2(k), a significant change in signal statistics may be indicated. This change can be quantified by examining the change in one weight through determining the magnitude and phase change of the weight and then using these quantities in a function to select the appropriate correlation length F. The magnitude difference is defined according to relationship (10) as follows:
$$\Delta M_A(k) = \bigl|\,|w_{A,1}(k)| - |w_{A,2}(k)|\,\bigr| \qquad (10)$$
where wA,1(k) and wA,2(k) are the weights calculated for channel A using R1(k) and R2(k), respectively. The angle difference is defined according to relationship (11) as follows:
where the factor of ±2π is introduced to provide the actual phase difference in the case of a ±2π jump in the phase of one of the angles. Similar techniques may be used for any other channel such as channel B, or for combinations of channels.
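The magnitude and angle differences between the two channel A weights can be sketched as shown below. The magnitude difference follows relationship (10); the angle difference is computed here by wrapping the raw phase difference into the interval (−π, π], which is one common way to account for ±2π jumps and is offered as an assumption rather than as the exact form of relationship (11). The function name is hypothetical.

```python
import cmath
import math


def weight_change(w1, w2):
    """Return (magnitude difference, angle difference) between two complex weights."""
    delta_m = abs(abs(w1) - abs(w2))
    delta_a = cmath.phase(w1) - cmath.phase(w2)
    # Wrap into (-pi, pi] so that a jump of +/- 2*pi is not counted as a change.
    while delta_a > math.pi:
        delta_a -= 2.0 * math.pi
    while delta_a <= -math.pi:
        delta_a += 2.0 * math.pi
    return delta_m, abs(delta_a)


# Two unit-magnitude weights whose phases lie on either side of the +/- pi cut.
w_first_half = cmath.exp(3.0j)
w_second_half = cmath.exp(-3.0j)
delta_m, delta_a = weight_change(w_first_half, w_second_half)
```

In this example the raw phase difference is 6 radians, but the wrapped value is 2π − 6 radians, which is the actual angular separation of the two weights; the magnitude difference is zero to within rounding.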
The correlation length F for some frequency bin k is now denoted as F(k). An example function is given by the following relationship (12):
$$F(k) = \max\bigl(\,b(k)\cdot\Delta A_A(k) + d(k)\cdot\Delta M_A(k) + c_{max}(k),\;\; c_{min}(k)\,\bigr) \qquad (12)$$
where cmin(k) represents the minimum correlation length, cmax(k) represents the maximum correlation length and b(k) and d(k) are negative constants, all for the kth frequency band. Thus, as ΔAA(k) and ΔMA(k) increase, indicating a change in the data, the output of the function decreases. With proper choice of b(k) and d(k), F(k) is limited between cmin(k) and cmax(k), so that the correlation length can vary only within a predetermined range. It should also be understood that F(k) may take different forms, such as a nonlinear function or a function of other measures of the input signals.
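Relationship (12) translates directly into a short function for one frequency band. In the sketch below the function name and the numeric constants are hypothetical values chosen only to show the behavior.

```python
def correlation_length(delta_a, delta_m, b, d, c_max, c_min):
    """Evaluate F(k) = max(b * dA + d * dM + c_max, c_min) for one band.

    b and d are negative, so larger weight changes shorten the result.
    """
    return max(b * delta_a + d * delta_m + c_max, c_min)


# Hypothetical constants for one band.
b, d, c_max, c_min = -10.0, -20.0, 32.0, 4.0

F_steady = correlation_length(0.0, 0.0, b, d, c_max, c_min)    # no change
F_moderate = correlation_length(1.0, 0.5, b, d, c_max, c_min)  # some change
F_abrupt = correlation_length(3.0, 2.0, b, d, c_max, c_min)    # large change
```

With these values the function returns 32 when the weights are unchanged, 12 for the moderate change, and the floor of 4 for the abrupt change, so the correlation length stays between cmin(k) and cmax(k).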
Values for function F(k) are obtained for each frequency bin k. It is possible that a small number of correlation lengths may be used, so in each frequency bin k the correlation length that is closest to F(k) is used to form R(k). This closest value is found using relationship (13) as follows:
where imin is the index for the minimized function F(k) and c(i) is the set of possible correlation length values ranging from cmin to cmax.
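Selecting the permitted correlation length closest to F(k) can be sketched as a nearest-value search. The example assumes that closeness is measured by absolute difference and that ties resolve to the smaller index; the function name and the candidate set are hypothetical.

```python
def nearest_length(f_k, candidates):
    """Return (i_min, c(i_min)): the candidate closest to f_k in absolute difference."""
    i_min = min(range(len(candidates)), key=lambda i: abs(f_k - candidates[i]))
    return i_min, candidates[i_min]


# Hypothetical set of permitted correlation lengths from c_min to c_max.
c = [4, 8, 16, 32]

index_a, length_a = nearest_length(13.0, c)
index_b, length_b = nearest_length(5.0, c)
```

For a computed value of 13 the selected length is 16 (index 2), and for a computed value of 5 it is 4 (index 0).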
The adaptive correlation length process can be incorporated into the correlation matrix stage 162 and weight determination stage 164 for use in a hearing aid. Logic of processing subsystem 30 can be adjusted as appropriate to provide for this incorporation. The application of adaptive correlation length can be operator selected and/or automatically applied based on one or more measured parameters as would occur to those skilled in the art.
Referring to
Subsystem 30 of systems 700 and/or 800 can be provided with logic in the form of programming, firmware, hardware, and/or a combination of these to implement one or more of the previously described routine 140, variations of routine 140, and/or a different adaptive beamformer routine, such as any of those described in U.S. Pat. No. 5,473,701 to Cezanne; U.S. Pat. No. 5,511,128 to Lindemann; U.S. Pat. No. 6,154,552 to Koroljow; Banks, D. “Localization and Separation of Simultaneous Voices with Two Microphones” IEE Proceedings I 140, 229-234 (1992); Frost, O. L. “An Algorithm for Linearly Constrained Adaptive Array Processing” Proceedings of IEEE 60 (8), 926-935 (1972); and/or Griffiths, L. J. and Jim, C. W. “An Alternative Approach to Linearly Constrained Adaptive Beamforming” IEEE Transactions on Antennas and Propagation AP-30(1), 27-34 (1982), to name just a few. In one alternative embodiment, system 10 operates in accordance with an adaptive beamformer routine other than routine 140 and its variations described herein. In still other embodiments a fixed beamforming routine can be utilized.
In one preferred form of system 10, 700, and/or 800, directional response pattern DP is of any type and has a maximum response direction that provides a response level at least 3 decibels (dB) greater than a minimum response direction at a selected frequency. In a more preferred form, the relative difference between the maximum and minimum response direction levels is at least 6 decibels (dB) at a selected frequency. In a still more preferred embodiment, this difference is at least 12 decibels at a selected frequency and the microphones are matched with generally the same directional response pattern type. In yet another more preferred embodiment, the difference is 3 decibels or more, and the sensors include a pair of matched microphones with a directional response pattern of the cardioid, figure-8, supercardioid, or hypercardioid type. Nonetheless, in other embodiments, the sensor directional response patterns may not be matched.
It has been discovered for directional acoustic sensors with generally symmetrically arranged maximum response directions that are located relatively close to one another, that phase differences of such approximately collocated sensors often can be ignored without undesirably impacting performance. In one such embodiment, routine 140 and its variations (collectively designated the FMV routine) can be simplified to operate based generally on amplitude differences between the sensor signals for each frequency band (designated the AFMV routine). As a result, highly directional responses can be obtained from a relatively small package compared to techniques that require comparatively large sensor-to-sensor distances.
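As previously described in connection with routine 140, relationships (2) and (3) provide variance and gain constraints to determine weights in accordance with relationship (6) as follows:
$$W(k) = \frac{R(k)^{-1}\,e}{e^{H}\,R(k)^{-1}\,e} \qquad (6)$$
It was further described that the correlation matrix R(k) of relationship (6) can be expressed by the following relationship (7):
$$R(k) = \begin{bmatrix} \dfrac{M}{F}\displaystyle\sum_{n=1}^{F} X_A(n,k)\,X_A^{*}(n,k) & \dfrac{1}{F}\displaystyle\sum_{n=1}^{F} X_A(n,k)\,X_B^{*}(n,k) \\[2ex] \dfrac{1}{F}\displaystyle\sum_{n=1}^{F} X_B(n,k)\,X_A^{*}(n,k) & \dfrac{M}{F}\displaystyle\sum_{n=1}^{F} X_B(n,k)\,X_B^{*}(n,k) \end{bmatrix} = \begin{bmatrix} R_{AA}(k) & R_{AB}(k) \\ R_{BA}(k) & R_{BB}(k) \end{bmatrix} \qquad (7)$$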
When two directional sensors are located close enough to one another such that their approximate co-location results in an insignificant phase difference response of the sensors for directions and frequencies of interest, the AFMV routine can be utilized. Examples of such orientations include those shown with respect to sensors 22 and 24 in system 10, sensors 722 and 724 in system 700, and sensors 822 and 824 in system 800; where the sensor-to-sensor separation distance SD is relatively small, or near zero.
In one preferred form, directional sensors based on this model are approximately co-located such that a desired fidelity of an output generated with the AFMV routine is provided over a frequency range and directional range of interest. In a more preferred form, separation distance SD is less than about 2 centimeters (cm). In still a more preferred form, directional sensors implemented with this model have a separation distance SD of less than about 0.5 centimeter (cm). In a most preferred form, directional sensors utilized with this model have a distance of separation less than 0.2 cm. Indeed, it is contemplated in such forms, that two or more directional sensors can be so close to one another as to provide contact between corresponding sensing elements.
The FMV routine can be modified to provide the AFMV routine, which is described starting with relationships (14) as follows:
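$$
\begin{aligned}
s_1 &= s_{1R} + j\,s_{1I} \\
s_2 &= s_{2R} + j\,s_{2I} \\
X_1 &= s_1 + s_2 \\
X_2 &= \alpha\,s_1 + \beta\,s_2
\end{aligned}
\tag{14}
$$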
where s1 and s2 are the complex-valued representations of the sources for the kth frequency band, α and β are real numbers, and X1 and X2 are the complex-valued representations of the signals received by two sensors for the kth frequency band. Correspondingly, the ideal correlation matrix, based on the calculation of the expected value of random variables, is expressed by relationship (15) as follows:
where $\sigma_1^2$ and $\sigma_2^2$ are the powers of s1 and s2, respectively.
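As a hedged illustration of how an ideal correlation matrix of this kind follows from relationships (14), the sketch below assumes that s1 and s2 are zero-mean and mutually uncorrelated, and that the (i, j) entry of the matrix is taken as $E[X_i^* X_j]$. Both the uncorrelatedness assumption and the entry convention are assumptions made here for illustration.

$$
\begin{aligned}
E[X_1^* X_1] &= E[|s_1|^2] + E[|s_2|^2] = \sigma_1^2 + \sigma_2^2 \\
E[X_1^* X_2] &= \alpha\,E[|s_1|^2] + \beta\,E[|s_2|^2] = \alpha\,\sigma_1^2 + \beta\,\sigma_2^2 \\
E[X_2^* X_2] &= \alpha^2\,\sigma_1^2 + \beta^2\,\sigma_2^2
\end{aligned}
$$

$$
R_{ideal} =
\begin{bmatrix}
\sigma_1^2 + \sigma_2^2 & \alpha\,\sigma_1^2 + \beta\,\sigma_2^2 \\
\alpha\,\sigma_1^2 + \beta\,\sigma_2^2 & \alpha^2\,\sigma_1^2 + \beta^2\,\sigma_2^2
\end{bmatrix}
$$

Under these assumptions every entry is real, because α and β are real and the cross terms $E[s_1^* s_2]$ vanish.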
However, the correlation matrix that results from correlating real data is an estimate of this ideal matrix, Rideal, and can contain some error. This error approaches zero as F approaches infinity. This ideal matrix Rideal can be estimated from known data, as follows from relationships (16a-16d):
where subscripts R and I indicate real and imaginary parts, respectively, and n is an index over the stored FFT coefficients for the kth frequency band.
The correlation may now be expressed in terms of Rideal and the real and imaginary parts of the error or bias with relationship (17) as follows:
$$
R_{est} = R_{ideal} + R_{error,R} + j\,R_{error,I} \tag{17}
$$
Using relationships (16a-16d), the matrices can be expressed as follows in relationship (18):
Thus, the imaginary part of the estimated correlation matrix is an error term and can be neglected under suitable conditions, resulting in a substitute correlation matrix relationship (19) and corresponding weight relationship (20) as follows.
Relationships (19) and (20) can be used in place of relationships (6) and (7) in routine 140 to provide the AFMV routine. Further, not only can relationships (19) and (20) be used in the execution of routine 140, but also in embodiments where regularization factor M is adjusted to control beamwidth. Additionally, the steering vector $e_k$ can be modified (for each frequency band k) so that the response of the algorithm is steered in a desired direction. The vector $e_k$ is chosen so that it matches the relative amplitudes in each channel for the desired direction in that frequency band. Alternatively or additionally, the procedure can be adjusted to account for directional pattern asymmetry under appropriate conditions.
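The amplitude-based substitution described above can be sketched for a single frequency band and two sensors as follows. This is a minimal illustration under stated assumptions, not the routine itself: the weight formula used is the standard minimum-variance closed form under a unity gain constraint, $w = R^{-1}e / (e^T R^{-1} e)$, which is assumed here; the way the factor `m` scales the diagonal is likewise an assumed form of regularization; and the names `afmv_weights`, `x1`, `x2`, `e`, and `m` are hypothetical.

```python
def afmv_weights(x1, x2, e, m=1.0):
    """Amplitude-based minimum-variance weights for one frequency band.

    x1, x2: equal-length sequences of complex FFT coefficients, one per
            stored frame, for sensors 1 and 2.
    e:      real steering vector (e1, e2) for the desired direction.
    m:      diagonal scaling used as regularization (assumed form).
    """
    f = len(x1)
    if f == 0 or f != len(x2):
        raise ValueError("x1 and x2 must be non-empty and of equal length")
    # Keep only the real part of the sample correlation estimate;
    # the imaginary part is treated as an error term and discarded.
    r11 = m * sum(abs(a) ** 2 for a in x1) / f
    r22 = m * sum(abs(b) ** 2 for b in x2) / f
    r12 = sum((a.conjugate() * b).real for a, b in zip(x1, x2)) / f
    det = r11 * r22 - r12 * r12
    if det <= 0.0:
        raise ValueError("correlation estimate is singular; increase m")
    # u = R^{-1} e for the symmetric real 2x2 matrix [[r11, r12], [r12, r22]].
    u1 = (r22 * e[0] - r12 * e[1]) / det
    u2 = (-r12 * e[0] + r11 * e[1]) / det
    # Normalize so that the gain in the steered direction is one.
    norm = e[0] * u1 + e[1] * u2
    return (u1 / norm, u2 / norm)


if __name__ == "__main__":
    alpha, beta = 0.2, 1.0  # illustrative channel amplitude ratios
    s1 = [1 + 0j, -1 + 0j, 1j, -1j]              # desired source
    scale = [2.0, 1.0, -2.0, 1.0]
    s2 = [1j * c * a for c, a in zip(scale, s1)]  # interfering source
    x1 = [a + b for a, b in zip(s1, s2)]
    x2 = [alpha * a + beta * b for a, b in zip(s1, s2)]
    w = afmv_weights(x1, x2, e=(1.0, alpha))
    y = [w[0] * a + w[1] * b for a, b in zip(x1, x2)]
    print("weights:", w)
    print("output :", y)
```

In this constructed example the sample cross term between the two sources is purely imaginary, so discarding the imaginary part leaves exactly the ideal matrix. The resulting weights are close to (1.25, -1.25), which pass the desired source with unity gain and cancel the interferer. With real data the cancellation is only approximate, and values of `m` greater than one trade interference rejection for robustness.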
For an embodiment of system 800 with a suitably small separation distance SD between sensors 822 and 824, and with patterns DP of a cardioid type for each sensor, the steering vector is $e_k = [1\ 0]^T$, because a negligible amount, if any, of the signal from straight ahead (along arrow 822a) should be picked up by sensor 824 given its opposite orientation relative to sensor 822.
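The straight-ahead steering vector can be checked against an idealized pattern model. The sketch below assumes an ideal first-order cardioid amplitude response of 0.5(1 + cos θ), sensor axes at 0 and 180 degrees, and normalization of the vector by its largest entry; these modeling choices and the function names `cardioid_gain` and `steering_vector` are assumptions for illustration, since real sensor patterns vary with frequency and mounting.

```python
import math


def cardioid_gain(theta_deg: float, axis_deg: float) -> float:
    """Ideal cardioid amplitude response: 1 on-axis, 0 directly behind."""
    return 0.5 * (1.0 + math.cos(math.radians(theta_deg - axis_deg)))


def steering_vector(theta_deg: float, axes_deg=(0.0, 180.0)):
    """Relative channel amplitudes for a source at theta_deg.

    axes_deg holds the maximum response direction of each sensor; the
    default models two oppositely oriented sensors.
    """
    gains = [cardioid_gain(theta_deg, axis) for axis in axes_deg]
    peak = max(gains)
    return [g / peak for g in gains]


if __name__ == "__main__":
    for theta in (0.0, 45.0, 90.0, 180.0):
        print(theta, [round(g, 3) for g in steering_vector(theta)])
```

For a source straight ahead (0 degrees) this model yields relative amplitudes of 1 and 0 for the two channels, and for a source at 90 degrees it yields equal amplitudes in both channels.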
In another embodiment, a combination of the FMV routine and the AFMV routine is utilized. In this example, a pair of cardioid-pattern sensors is oriented as shown in system 800 for each ear of a listener; the AFMV routine or another fixed or adaptive beamformer routine is utilized to generate an output from each pair; and the FMV routine is utilized to generate an output based on the two resulting outputs, one from each sensor pair, with an appropriate steering vector. The AFMV routine described in connection with relationships (14)-(20) can be used in connection with system 10 or system 700 where sensors 22 and 24 or sensors 722 and 724 have a suitably small separation distance SD. In still other embodiments, different configurations and arrangements of two or more directional microphones can be implemented in connection with the AFMV routine.
Generally, assisted hearing applications of the FMV routine and/or AFMV routine implemented with system 10, 700, 800, and/or 900 can provide an audio signal to the ear of the user and can be of a behind-the-ear, in-the-ear, or implanted type; a combination of these; or of such different form as would occur to those skilled in the art. In one more specific, nonlimiting embodiment,
System 950 further includes integrated circuitry 970 carried by device 960. Circuitry 970 is operatively coupled to sensors 722 and 724 and includes a processor arranged to execute the AFMV routine. Alternatively, the FMV routine, its variations, and/or a different adaptive beamformer routine can be implemented. Device 960 further includes a power supply and such other devices and controls as would occur to one skilled in the art to provide a suitable hearing aid arrangement. System 950 also includes in-the-ear audio output device 980 and cochlear implant 982. Circuitry 970 generates an output signal that is received by in-the-ear audio output device 980 and/or cochlear implant device 982. Cochlear implant 982 is typically disposed along the ear passage of a user and is configured to provide electrical stimulation signals to the inner ear in a standard manner. Transmission between device 960 and devices 980 and 982 can be by wire or through any wireless technique as would occur to one skilled in the art. While devices 980 and 982 are shown in a common system for convenience of illustration, it should be understood that in other embodiments one type of output device 980 or 982 is utilized to the exclusion of the other. Alternatively or additionally, sensors configured to implement the AFMV procedure can be used in other hearing aid embodiments sized and shaped to fit just one ear of the listener with processing adjusted to account for acoustic shadowing caused by the head, torso, or pinnae. In still another embodiment, a hearing aid system utilizing the AFMV procedure could be utilized with a cochlear implant where some or all of the processing hardware is located in the implant device.
Besides hearing aids, the FMV and/or AFMV routines of the present invention can be used together or separately in connection with other aural or audio applications such as the hands-free telephony system 210 of
In one preferred embodiment of the present invention, one or more of the previously described systems and/or attendant processes are directed to the detection and processing of a broadband acoustic signal having a range of at least one-third of an octave. In a more preferred broadband-directed embodiment of the present invention, a frequency range of at least one octave is detected and processed. Nonetheless, in still other preferred embodiments, the processing may be directed to a single frequency or narrow range of frequencies of less than one-third of an octave. In other alternative embodiments, at least one acoustic sensor is of a directional type while at least one other of the acoustic sensors is of an omnidirectional type. In still other embodiments based on more than two sensors, two or more sensors may be omnidirectional and/or two or more may be of a directional type.
Many other further embodiments of the present invention are envisioned. One further embodiment includes: detecting acoustic excitation with a number of acoustic sensors that provide a number of sensor signals; establishing a set of frequency components for each of the sensor signals; and determining an output signal representative of the acoustic excitation from a designated direction. This determination includes weighting the set of frequency components for each of the sensor signals to reduce variance of the output signal and provide a predefined gain of the acoustic excitation from the designated direction.
For other alternative embodiments, directional sensors may be utilized to detect a characteristic different than acoustic excitation or sound, and correspondingly extract such characteristic from noise and/or one of several sources to which the directional sensors are exposed. In one such example, the characteristic is visible light, ultraviolet light, and/or infrared radiation detectable by two or more optical sensors that have directional properties. A change in signal amplitude occurs as a source of the signal is moved with respect to the optical sensors, and an adaptive beamforming algorithm is utilized to extract a target source signal amidst other interfering signal sources. For this system, a desired source can be selected relative to a reference axis such as axis AZ. In still other embodiments, directional antennas with adaptive processing of radar returns or communication signals can be utilized.
Another embodiment includes a number of acoustic sensors in the presence of multiple acoustic sources that provide a corresponding number of sensor signals. A selected one of the acoustic sources is monitored. An output signal representative of the selected one of the acoustic sources is generated. This output signal is a weighted combination of the sensor signals that is calculated to minimize variance of the output signal.
A still further embodiment includes: operating a voice input device including a number of acoustic sensors that provide a corresponding number of sensor signals; determining a set of frequency components for each of the sensor signals; and generating an output signal representative of acoustic excitation from a designated direction. This output signal is a weighted combination of the set of frequency components for each of the sensor signals calculated to minimize variance of the output signal.
Yet a further embodiment includes an acoustic sensor array operable to detect acoustic excitation that includes two or more acoustic sensors each operable to provide a respective one of a number of sensor signals. Also included is a processor to determine a set of frequency components for each of the sensor signals and generate an output signal representative of the acoustic excitation from a designated direction. This output signal is calculated from a weighted combination of the set of frequency components for each of the sensor signals to reduce variance of the output signal subject to a gain constraint for the acoustic excitation from the designated direction.
A further embodiment includes: detecting acoustic excitation with a number of acoustic sensors that provide a corresponding number of signals; establishing a number of signal transform components for each of these signals; and determining an output signal representative of acoustic excitation from a designated direction. The signal transform components can be of the frequency domain type. Alternatively or additionally, a determination of the output signal can include weighting the components to reduce variance of the output signal and provide a predefined gain of the acoustic excitation from the designated direction.
In yet another embodiment, a system includes a number of acoustic sensors. These sensors provide a corresponding number of sensor signals. A direction is selected to monitor for acoustic excitation with the hearing aid. A set of signal transform components for each of the sensor signals is determined and a number of weight values are calculated as a function of a correlation of these components, an adjustment factor, and the selected direction. The signal transform components are weighted with the weight values to provide an output signal representative of the acoustic excitation emanating from the direction. The adjustment factor can be directed to correlation length or a beamwidth control parameter just to name a few examples.
For a further embodiment, a system includes a number of acoustic sensors to provide a corresponding number of sensor signals. A set of signal transform components are provided for each of the sensor signals and a number of weight values are calculated as a function of a correlation of the transform components for each of a number of different frequencies. This calculation includes applying a first beamwidth control value for a first one of the frequencies and a second beamwidth control value for a second one of the frequencies that is different than the first value. The signal transform components are weighted with the weight values to provide an output signal.
For another embodiment, acoustic sensors provide corresponding signals that are represented by a plurality of signal transform components. A first set of weight values are calculated as a function of a first correlation of a first number of these components that correspond to a first correlation length. A second set of weight values are calculated as a function of a second correlation of a second number of these components that correspond to a second correlation length different than the first correlation length. An output signal is generated as a function of the first and second weight values.
In another embodiment, acoustic excitation is detected with a number of sensors that provide a corresponding number of sensor signals. A set of signal transform components is determined for each of these signals. At least one acoustic source is localized as a function of the transform components. In one form of this embodiment, the location of one or more acoustic sources can be tracked relative to a reference. Alternatively or additionally, an output signal can be provided as a function of the location of the acoustic source determined by localization and/or tracking, and a correlation of the transform components.
In a further embodiment, a hearing aid device includes a number of sensors each responsive to detected sound to provide a corresponding number of sound representative sensor signals. The sensors each have a directional response pattern with a maximum response direction and a minimum response direction that differ in sound response level by at least 3 decibels at a selected frequency. A first axis coincident with the maximum response direction of a first one of the sensors is positioned to intersect a second axis coincident with the maximum response direction of a second one of the sensors at an angle in a range of about 10 degrees through about 180 degrees. In one form, the first one of the sensors is separated from the second one of the sensors by less than about two centimeters, and/or the sensors are of a matched cardioid, hypercardioid, supercardioid, or figure-8 type. Alternatively or additionally, the device includes integrated circuitry operable to perform an adaptive beamformer routine as a function of amplitude of the sensor signals and an output device operable to provide an output representative of sound emanating from a direction selected in relation to position of the hearing aid device.
It is contemplated that various signal flow operators, converters, functional blocks, generators, units, stages, processes, and techniques may be altered, rearranged, substituted, deleted, duplicated, combined or added as would occur to those skilled in the art without departing from the spirit of the present inventions. It should be understood that the operations of any routine, procedure, or variant thereof can be executed in parallel, in a pipeline manner, in a specific sequence, as a combination of these appropriate to the interdependence of such operations on one another, or as would otherwise occur to those skilled in the art. By way of nonlimiting example, A/D conversion, D/A conversion, FFT generation, and FFT inversion can typically be performed as other operations are being executed. These other operations could be directed to processing of previously stored A/D or signal transform components, just to name a few possibilities. In another nonlimiting example, the calculation of weights based on the current input signal can at least overlap the application of previously determined weights to a signal about to be output.
Any theory, mechanism of operation, proof, or finding stated herein is meant to further enhance understanding of the present invention and is not intended to make the present invention in any way dependent upon such theory, mechanism of operation, proof, or finding. The following patents, patent applications, and publications are hereby incorporated by reference each in its entirety: U.S. Pat. No. 5,473,701; U.S. Pat. No. 5,511,128; U.S. Pat. No. 6,154,552; U.S. Pat. No. 6,222,927 B1; U.S. patent application Ser. No. 09/568,430; U.S. patent application Ser. No. 09/568,435; U.S. patent application Ser. No. 09/805,233; International Patent Application Number PCT/US01/15047; International Patent Application Number PCT/US01/14945; International Patent Application Number PCT/US99/26965; Banks, D. “Localization and Separation of Simultaneous Voices with Two Microphones” IEE Proceedings I 140, 229-234 (1992); Frost, O. L. “An Algorithm for Linearly Constrained Adaptive Array Processing” Proceedings of IEEE 60 (8), 926-935 (1972); and Griffiths, L. J. and Jim, C. W. “An Alternative Approach to Linearly Constrained Adaptive Beamforming” IEEE Transactions on Antennas and Propagation AP-30(1), 27-34 (1982). While the invention has been illustrated and described in detail in the drawings and foregoing description, the same is to be considered as illustrative and not restrictive in character, it being understood that only the selected embodiments have been shown and described and that all changes, modifications and equivalents that come within the spirit of the invention as defined herein or by the following claims are desired to be protected.
The present application is related to International Patent Application Number PCT/US01/15047 filed on May 10, 2001; International Patent Application Number PCT/US01/14945 filed on May 9, 2001; U.S. patent application Ser. No. 09/805,233 filed on Mar. 13, 2001; U.S. patent application Ser. No. 09/568,435 filed on May 10, 2000; U.S. patent application Ser. No. 09/568,430 filed on May 10, 2000; International Patent Application Number PCT/US99/26965 filed on Nov. 16, 1999; and U.S. Pat. No. 6,222,927 B1; all of which are hereby incorporated by reference.
This invention was made with Government support under agreement 240-6762A awarded by the Defense Advanced Research Projects Agency (DARPA). The Government has certain rights in the invention.