This invention generally relates to audio signal processing, and more specifically to calibrating more than one microphone (e.g., a microphone array) using a signal level difference histogram algorithm.
There are many applications that utilize two or more microphones (e.g., microphone arrays) to pick up an acoustic signal. Separate microphone signals can be processed to obtain enhanced signals. One of the enhancement applications is acoustic beamforming, which means that sounds coming from different directions are attenuated differently. For example, if a person is speaking on the phone in a noisy environment, the acoustic beam can be directed towards the speaker, which will provide an improved signal-to-noise ratio of the picked-up signal, because the background noise is attenuated while the speech is preserved. For implementing acoustic beamforming successfully, matching microphone sensitivity is an important factor.
The majority, if not all, of the applications utilizing several microphones benefit from matching the sensitivities of the microphones. If the frequency responses of the microphones are similar enough, only a full-band gain has to be applied to N−1 of the microphone signals, where N is the number of microphones.
A conventional microphone capsule sensitivity tolerance is within a few decibels. This means that two random microphone capsules of the same type may differ in sensitivity by several decibels. A sensitivity difference of a few decibels can therefore be expected to be quite common in a product utilizing two or more microphones. On the other hand, an acoustic beamformer requires that the microphone sensitivities be matched more accurately; otherwise the beamformer may significantly deteriorate the desired signal.
A conventional way to match the microphone sensitivities is to use a manual calibration. This means that the individual microphone components are first measured using a suitable calibration measurement. After the measurement, matching microphone components are selected to be used in the array. Alternatively, the sensitivity differences found in the measurement can be compensated by building up a matched array. The compensation can be carried out either utilizing microphone-specific full-band gains or, in case of non-similar frequency responses, microphone-specific filters that match both the frequency responses and sensitivities of the microphones of the array. The manual method is obviously too expensive to be used in mass production. Moreover, any later sensitivity mismatch due to the aging of the microphone components requires a new calibration.
Another group of calibration methods utilizes a dedicated signal source for calibrating the microphone array in place. This makes the re-calibration easier to carry out. The method usually requires an accurate knowledge about the placement of the microphones relative to the sound source. Also the calibration environment has to be controlled.
Yet another group of calibration methods is automatic self-calibration methods. These methods exploit the signals picked up by the microphones during normal operation of the array. For the calibration, typical implementations use either the whole signal or time intervals of the signal when the desired signal is active. When dealing with close-talking microphone arrays, the whole signal is not usable for the calibration purposes, since the sound pressure level of the desired signal is different at different microphones whereas the level of usual ambient noise is more or less the same at different microphones. Therefore, a separation between desired signal and ambient noise is required. If the desired signal is utilized for self-calibration, the microphone positions and the direction of arriving sound have to be known or estimated. Any estimation faults of these factors can cause errors in the calibration.
According to a first aspect of the invention, apparatus, comprises: a signal processing module, configured to calculate one or more differences between one or more signal levels from one or more microphones of a plurality of microphones and further one or more signal levels from one or more selected microphones of the plurality of the microphones, configured to create or update one or more difference histograms corresponding to the one or more of selected microphones using the one or more differences, and further configured to determine a sharpness and a sensitivity difference for each of the one or more difference histograms; and a gain control module, configured to adjust one or more amplifying gains for one or more microphone signals corresponding to the one or more microphones using the sensitivity difference for each of the one or more difference histograms corresponding to one of the one or more microphones, if the sharpness meets a predetermined criterion, for matching sensitivities of the plurality of microphones.
According further to the first aspect of the invention, the signal processing module may be configured to update one of the difference histograms corresponding to one of the one or more of the selected microphones if a corresponding difference for the one of the one or more of the selected microphones is within a predetermined range.
Further according to the first aspect of the invention, the signal processing module may be configured to determine the sharpness of the each of the difference histograms only if the each of the one or more difference histograms is matured.
Still further according to the first aspect of the invention, if the sharpness meets a predetermined criterion for one of the one or more microphones, the signal processing module may be configured to determine the sensitivity difference by identifying a maximum peak location on the each of the one or more difference histograms or using an interpolation. Still further, the signal processing module may be configured to provide the sensitivity difference to the gain control module to adjust the one or more amplifying gains. Yet still further, the signal processing module may be configured to update the sensitivity difference using one or more smoothing methods and to provide the sensitivity difference, after being updated using the one or more smoothing methods, to the gain control module to adjust the one or more amplifying gains.
According further to the first aspect of the invention, the apparatus may be a part of an electronic device comprising the plurality of the microphones.
According still further to the first aspect of the invention, the apparatus may further comprise: a low-pass filter or a plurality of low-pass filters configured to eliminate high frequency components from signals with the one or more signal levels and with the further one or more signal levels.
According further still to the first aspect of the invention, the apparatus may further comprise: a signal level calculator, configured to compute the one or more signal levels and the further one or more signal levels for providing to the signal processing module. Still further, the signal level calculator and the signal processing module may be combined.
Yet still further according to the first aspect of the invention, the apparatus may further comprise: a signal classification module, configured to separate a signal from each of the one or more microphones into speech and noise components, and further configured to provide one or more control signals comprising calibration-suitable information to the signal processing module.
Still yet further according to the first aspect of the invention, the apparatus may further comprise: an analog-to-digital converter, configured to convert analog microphone signals of the plurality of the microphones into digital microphone signals before determining the one or more signal levels and the further one or more signal levels.
Still further still according to the first aspect of the invention, the apparatus may further comprise: a memory, configured to store the one or more amplifying gains provided by the gain control module.
Further according to the first aspect of the invention, an integrated circuit may comprise selected or all modules of the apparatus.
Still further according to the first aspect of the invention, the apparatus may be configured to provide the sensitivity difference for the each of the one or more difference histograms and to adjust the one or more amplifying gains independently of locations of the plurality of the microphones.
According further to the first aspect of the invention, the plurality of the microphones may be an array of the microphones.
According still further to the first aspect of the invention, the one or more signal levels and the further one or more signal levels may be power signal levels calculated for a predetermined frame length.
According to a second aspect of the invention, a method, comprises: calculating one or more differences between one or more signal levels from one or more microphones of a plurality of microphones and further one or more signal levels from one or more selected microphones of the plurality of the microphones; creating or updating one or more difference histograms corresponding to the one or more of selected microphones using the one or more differences, and determining a sharpness and a sensitivity difference for each of the one or more difference histograms; and adjusting one or more amplifying gains for one or more microphone signals corresponding to the one or more microphones using the sensitivity difference for each of the one or more difference histograms corresponding to one of the one or more microphones, if the sharpness meets a predetermined criterion, for matching sensitivities of the plurality of microphones.
According further to the second aspect of the invention, the one or more signal levels and the further one or more signal levels may be power signal levels calculated for a predetermined frame length.
Further according to the second aspect of the invention, the updating of one of the difference histograms corresponding to one of the one or more of the selected microphones may be performed if a corresponding difference for the one of the one or more of the selected microphones is within a predetermined range.
Still further according to the second aspect of the invention, the determining the sharpness of the each of the difference histograms may be performed only if the each of the one or more difference histograms is matured.
According further to the second aspect of the invention, if the sharpness meets a predetermined criterion for one of the one or more microphones, the determining of the sensitivity difference may be performed by identifying a maximum peak location on the each of the one or more difference histograms or using an interpolation. Still further, the determining of the sensitivity difference may be performed by updating the sensitivity difference using one or more smoothing methods.
According still further to the second aspect of the invention, prior to the calculating the differences, the method may comprise: filtering high frequency components from signals with the one or more signal levels and with the further one or more signal levels.
According further still to the second aspect of the invention, prior to the calculating the differences, the method may comprise: computing the one or more signal levels and the further one or more signal levels for providing to the gain control module.
According yet further still to the second aspect of the invention, prior to the calculating the differences, the method may comprise: separating a signal from each of the one or more microphones into speech and noise components, and providing one or more control signals comprising calibration-suitable information.
Yet still further according to the second aspect of the invention, the method may further comprise: storing said one or more amplifying gains.
Still yet further according to the second aspect of the invention, the plurality of the microphones may be an array of the microphones.
According to a third aspect of the invention, a computer program product comprises: a computer readable storage structure embodying a computer program code thereon for execution by a computer processor with the computer program code, wherein the computer program code comprises instructions for performing the method of the second aspect of the invention.
According to a fourth aspect of the invention, an electronic device, comprises: a plurality of microphones; and a multiple microphone calibration module, comprising: a signal processing module, configured to calculate one or more differences between one or more signal levels from one or more microphones of a plurality of microphones and further one or more signal levels from one or more selected microphones of the plurality of the microphones, configured to create or update one or more difference histograms corresponding to the one or more of selected microphones using the one or more differences, and further configured to determine a sharpness and a sensitivity difference for each of the one or more difference histograms; and a gain control module, configured to adjust one or more amplifying gains for one or more microphone signals corresponding to the one or more microphones using the sensitivity difference for each of the one or more difference histograms corresponding to one of the one or more microphones, if the sharpness meets a predetermined criterion, for matching sensitivities of the plurality of microphones.
According further to the fourth aspect of the invention, the multiple microphone calibration module may be detachable from the electronic device.
For a better understanding of the nature and objects of the present invention, reference is made to the following detailed description taken in conjunction with the following drawings, in which:
a and 3b are histograms generated according to embodiments of the present invention for two microphones used in a mobile phone:
a-5c are graphs illustrating calibration values (sensitivity differences) determined by different methods as a function of time: a) using a raw histogram maximum value from the histogram peak value, b) using an interpolation parabola value shown in
A new method, apparatus and software product are presented for calibrating multiple microphones (e.g., a microphone array) to match their sensitivity using ambient noise by creating and updating one or more calibration signal level difference histograms. Using ambient noise for the sensitivity calibration can eliminate the requirement for knowing the microphone positions and a direction of arrival of the desired acoustic signal. According to one embodiment, a multiple microphone calibration module performing the sensitivity calibration may be built in as a part of an electronic device comprising the multiple microphones or it may be a stand-alone unit, which can be attached to an electronic device (e.g., a mobile phone) for the sensitivity calibration.
According to embodiments of the present invention, the microphone sensitivity difference may be detected using a signal level (e.g., power level) difference histogram. In one simple scenario, according to one embodiment of the present invention, during the natural operation of the microphone array, the sampled microphone signals may be divided into frames whose power levels are calculated (though this division may not be used for applying the calibration procedure described herein). The frames then may be classified to be either a background noise or a desired signal, e.g., speech. If the frame is classified as the background noise, the difference of the power levels of the microphone signals can be stored into the histogram.
Then the microphone sensitivity difference can be derived from the area around the highest peak in the histogram. Using the signal (e.g., power) level difference histogram instead of direct smoothing of the level difference can provide information whether the found microphone sensitivity difference indicated by the histogram is trustworthy. After the histogram has obtained enough data, i.e., the derived distribution becomes mature enough (e.g., acquiring a predetermined amount of data in the histogram, or a threshold total value of all histogram bins), the shape of the distribution can indicate the reliability of the obtained microphone sensitivity difference: a sharp distribution may indicate a reliable estimate while a broad distribution suggests that the estimate cannot be trusted.
The sensitivity difference estimate can be derived from the peak location of the distribution on the histogram. Whenever the histogram is mature enough and its shape indicates that the estimate is reliable, a sensitivity difference value (or the sensitivity difference) may be used. The obtained sensitivity difference can be further smoothed, e.g., using a suitable IIR (infinite impulse response) filtering, to obtain a more stable estimate. Since this estimate may still fluctuate considerably, it is possible to apply a second smoothing to it, and so on. The number of smoothings may be defined by the required accuracy of calibration. For example, the reason for using two separate smoothing stages may be that the faster one (1st smoothing) can offer a quicker estimate and the slower one (2nd smoothing) can provide a more stable and more precise estimate that may be used in the long run (e.g., stored in a memory). All additional estimates (smoothings) may also be equipped with a maturity check, which can indicate if the estimate is ready to be used.
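The two-stage smoothing described above can be sketched as follows. This is a minimal illustration only: the class name `SmoothedEstimate`, the first-order recursion, the smoothing constants and the update-count maturity check are all assumptions made for the example, and are not values or structures given in this description.

```python
class SmoothedEstimate:
    """First-order IIR smoother with a simple maturity check (hypothetical)."""

    def __init__(self, alpha, mature_after):
        self.alpha = alpha                # smoothing constant, 0 < alpha <= 1
        self.mature_after = mature_after  # updates needed before use
        self.value = None
        self.updates = 0

    def update(self, sample):
        if self.value is None:
            self.value = sample           # initialise with the first sample
        else:
            # y[n] = y[n-1] + alpha * (x[n] - y[n-1])
            self.value += self.alpha * (sample - self.value)
        self.updates += 1
        return self.value

    @property
    def mature(self):
        return self.updates >= self.mature_after


# Assumed constants: a fast first stage feeding a slow second stage.
fast = SmoothedEstimate(alpha=0.1, mature_after=10)
slow = SmoothedEstimate(alpha=0.01, mature_after=100)


def smooth(raw_difference_db):
    """Feed one raw histogram estimate through both smoothing stages."""
    return slow.update(fast.update(raw_difference_db))
```

In such a sketch the output of the fast stage could be used as soon as it is mature, while the slow stage output could be the value stored for long-term use.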
According to a further embodiment, after the microphone sensitivity difference has been detected for a signal from a particular microphone, a corresponding gain may be applied to a channel used for processing that signal. More detailed description of the algorithm is provided herein.
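As an illustration of applying the corresponding gain, the following sketch converts a sensitivity difference in decibels into a linear full-band gain. It assumes that the difference is expressed as the level of the microphone to be matched minus the level of the reference microphone, and that the levels are power levels in dB; the function names are hypothetical.

```python
def matching_gain(sensitivity_diff_db):
    """Linear amplitude gain that cancels a power-level difference in dB.

    Assumes diff = level(other microphone) - level(reference microphone).
    """
    return 10.0 ** (-sensitivity_diff_db / 20.0)


def apply_gain(samples, gain):
    """Apply a full-band gain to one microphone channel."""
    return [gain * x for x in samples]
```

With this sign convention, a microphone that is about 6 dB more sensitive than the reference would be scaled by roughly one half.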
The flow chart of
In a next step 14, N microphone signals may be pre-filtered using one or more low-pass filters (this is an optional step). The low-pass-filtering may be useful, since the microphones (including microphone capsules and surrounding acoustic constructions) are not as directive at low frequencies, and hence using this pre-filtering may lead to better results. For example, 1-kHz roll-off frequency can be used.
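The optional pre-filtering of step 14 can be illustrated with a simple first-order low-pass filter. The filter type is not specified in this description, so the one-pole structure, the function name and the 8 kHz sampling rate below are assumptions; only the 1 kHz roll-off comes from the example above.

```python
import math


def one_pole_lowpass(samples, sample_rate_hz=8000.0, cutoff_hz=1000.0):
    """First-order low-pass filter (an assumed, minimal filter choice)."""
    # Common coefficient approximation for a one-pole smoother.
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    out = []
    y = 0.0
    for x in samples:
        y += a * (x - y)   # y[n] = y[n-1] + a * (x[n] - y[n-1])
        out.append(y)
    return out
```

A steeper filter may be preferable in practice; the sketch only shows where the pre-filter sits in the signal path, before the level computation.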
In a next step 16, signal levels (e.g., power frame signals) for N microphone signals can be computed. This step may be implemented by computing signal powers “frame-wisely” using a suitable frame length, e.g., 5 ms.
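The frame-wise power computation of step 16 can be sketched as below. The 5 ms frame length follows the example above; the 8 kHz sampling rate, the dB scale, the small floor that avoids taking the logarithm of zero, and the function name are assumptions made for illustration.

```python
import math


def frame_power_levels_db(samples, sample_rate_hz=8000, frame_ms=5.0,
                          floor=1e-12):
    """Return the mean power of each complete frame, in dB."""
    frame_len = max(1, int(round(sample_rate_hz * frame_ms / 1000.0)))
    levels = []
    # Step through the signal in non-overlapping frames; drop any remainder.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        levels.append(10.0 * math.log10(max(power, floor)))
    return levels
```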
In a next step 17 a signal classification for a signal from one (or more) reference microphone of the N microphones may be implemented for controlling the calibration status. For example, if one component (e.g., noise) is identified, this will indicate that the calibration is suitable, whereas if another component (e.g., speech) is identified, this will indicate that calibration is not suitable. A simple voice activity detector (VAD) based on the power of one microphone signal may be used to distinguish between speech and noise frames. Then in a next step 18, it is ascertained whether the calibration is suitable as explained herein. If that is not the case (e.g., the frame is classified as speech), the process may go to step 38. However, if it is ascertained that the calibration is suitable (e.g., the frame is classified as noise), in a next step 19, differences between a signal level for a reference microphone and signal levels of other (selected) microphones (or just one microphone for N=2) may be calculated.
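A simple power-based classification of the kind mentioned for step 17 might look like the following sketch. The noise-floor tracking rule, the margin and rise constants, and the function name are assumptions chosen for illustration; this description does not prescribe a particular voice activity detector.

```python
def classify_frames(levels_db, margin_db=6.0, floor_rise_db=0.01):
    """Label each frame 'noise' or 'speech' from one microphone's levels."""
    labels = []
    noise_floor = None
    for level in levels_db:
        if noise_floor is None or level < noise_floor:
            noise_floor = level             # follow decreases immediately
        else:
            noise_floor += floor_rise_db    # creep upward slowly
        is_speech = level > noise_floor + margin_db
        labels.append("speech" if is_speech else "noise")
    return labels
```

Only the frames labelled as noise would then be passed on to the difference calculation of step 19.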
In a next step 20, it is ascertained whether the calculated difference is within a pre-defined range. If that is the case, in a next step 22, the difference may be stored into a histogram (starting or updating the histogram for the corresponding microphone). The acceptance range may be defined according to the sensitivity tolerance of the utilized microphone capsules. For example, with a tolerance of ±3 dB two microphones may have, at most, a sensitivity difference of ±6 dB, and thus the acceptance range can be ±6 dB. The histogram may be updated in such a manner that all bins of the histogram are multiplied by a positive factor less than one, after which the bin corresponding to the amount of difference is increased by adding a constant value to it. If, however, it is ascertained in step 20 that the difference is not within the predetermined range, the process may go to step 38. It is noted that steps 22-38 may be performed for each of the N−1 non-reference microphones separately to match their sensitivity to the selected reference microphone.
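The range check and the histogram update of steps 20 and 22 can be sketched as follows. The ±6 dB acceptance range follows the example above; the class name, the 0.5 dB bin width, the decay factor of 0.999 and the unit increment are assumed values for illustration only.

```python
class DifferenceHistogram:
    """Level-difference histogram with exponential forgetting (a sketch)."""

    def __init__(self, range_db=6.0, bin_width_db=0.5, decay=0.999,
                 increment=1.0):
        self.range_db = range_db
        self.bin_width_db = bin_width_db
        self.decay = decay          # positive factor less than one
        self.increment = increment  # constant added to the hit bin
        n_bins = int(round(2 * range_db / bin_width_db)) + 1
        self.bins = [0.0] * n_bins

    def bin_index(self, diff_db):
        return int(round((diff_db + self.range_db) / self.bin_width_db))

    def bin_center_db(self, index):
        return index * self.bin_width_db - self.range_db

    def update(self, diff_db):
        """Store one difference; return False if it is out of range."""
        if abs(diff_db) > self.range_db:
            return False
        # Scale all bins down, then raise the bin that was hit.
        self.bins = [b * self.decay for b in self.bins]
        self.bins[self.bin_index(diff_db)] += self.increment
        return True
```

Because every update scales the older content down, the histogram gradually forgets old data, which lets it follow slow changes in the sensitivity difference.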
In a next step 24, it is ascertained whether the histogram is mature, i.e., if the histogram has obtained enough data and is ready to be used. If that is not the case, the process may go to step 38. If however, it is ascertained that the histogram is mature, in a next step 26, the sharpness may be determined (calculated). The sharpness can be defined, e.g., by a ratio of the maximum bin height and the sum of all bin heights.
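The maturity check of step 24 and the sharpness measure of step 26 can be illustrated as below. The sharpness ratio follows the definition given above; the use of the bin sum as the maturity measure corresponds to one of the options mentioned earlier, while the threshold value and the function names are assumptions.

```python
def is_mature(bins, maturity_threshold=50.0):
    """Treat the histogram as mature once the bin sum reaches a threshold."""
    return sum(bins) >= maturity_threshold


def sharpness(bins):
    """Ratio of the maximum bin height to the sum of all bin heights."""
    total = sum(bins)
    return max(bins) / total if total > 0 else 0.0
```

For instance, a histogram with bins `[0, 10, 80, 10]` gives a sharpness of 0.8, whereas a flat histogram `[25, 25, 25, 25]` gives 0.25.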
In a next step 28, it is ascertained whether the sharpness is “sharp enough”: the calculated sharpness may be compared against a sharpness threshold and if it exceeds that threshold, the sharpness is sharp enough, otherwise the histogram is broad and is not ready for calibration purposes. The sensitivity difference may be derived if the histogram is sharp enough. This principle relies on the fact that the sensitivity of a microphone stays constant, since changes due to aging of the component or other environmental effects are very slow from the viewpoint of the operating speed of the histogram calibration process. Therefore, the true sensitivity difference of two microphones stays constant meaning that the distribution presented in the histogram can be concentrated around the true sensitivity difference. However, if the histogram becomes broad, this indicates that the signal has not been suitable for calibration purposes (e.g., in case of a wind noise).
a and 3b shows examples among others of sharp and broad power level difference distributions. Histograms shown in
Thus, if it is ascertained in step 28 of
Since the sensitivity difference histogram has a limited resolution, the accuracy of the estimation may be improved if the values between bins are also taken into account. According to one embodiment, the sensitivity difference estimation may be derived from the distribution utilizing Lagrange interpolation. A parabola may be fitted to the points defined by the highest bin and the bins adjacent to it, and the peak of the parabola thus can be determined. The sensitivity difference estimate then may be located at the peak position at the corresponding axis. The peak estimation is illustrated in
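The parabola fit through the highest bin and its two neighbours can be sketched as follows. The function name and the default bin layout (first bin centred at −6 dB, 0.5 dB bin width) are assumptions; the three-point parabola itself is the standard quadratic interpolation through equally spaced points.

```python
def interpolated_peak_db(bins, first_center_db=-6.0, bin_width_db=0.5):
    """Estimate the peak location from the highest bin and its neighbours."""
    k = max(range(len(bins)), key=lambda i: bins[i])
    offset = 0.0
    if 0 < k < len(bins) - 1:
        left, mid, right = bins[k - 1], bins[k], bins[k + 1]
        denom = left - 2.0 * mid + right
        if denom < 0.0:
            # Vertex of the parabola through (-1, left), (0, mid), (1, right),
            # measured in bins relative to the highest bin.
            offset = 0.5 * (left - right) / denom
    return first_center_db + (k + offset) * bin_width_db
```

When the two neighbours are equal, the estimate falls on the centre of the highest bin; otherwise it shifts by at most half a bin towards the higher neighbour.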
In a next step 32 of
For example, a more stable estimate may be derived using a first fast IIR smoothing (see
a-5c show examples of graphs illustrating calibration values (sensitivity differences) determined by different methods as a function of time: a) using a raw histogram maximum value from the histogram peak value as indicated by the arrow 91 in
Finally, in a next step 38 in
It is noted that the order of steps shown in
In reference to the signal classification (step 18 in
As described herein, the histogram sharpness detection may be implemented using the ratio of the highest bin and sum of all bins. Alternatively, the ratio of the sum of the highest bin and two or more adjacent bins and the sum of all bins may be also used. The optimal amount of bins in the numerator may depend on the used level resolution of the histogram. For example in examples shown in
In reference to the histogram peak location estimation, an accurate search of the peak location of the histogram may be carried out using more than just three bins of the histogram as used in the examples of
It is further noted that the simplest implementation form of the algorithm is for matching two microphones (N=2). However, this algorithm may handle more than two microphones, as described herein, wherein one arbitrary microphone is selected to be the reference microphone against which all other microphones are compared. In other words, the power level differences may be calculated between the signal of the reference microphone and the signals of the other microphones. The implementation principles remain the same, and N−1 histograms are needed instead of one histogram (N is the total number of microphones to be matched).
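For the case of more than two microphones, the per-frame differences against one reference microphone can be sketched as below; each returned entry would feed its own histogram. The function name and the sign convention (other microphone minus reference) are assumptions made for illustration.

```python
def level_differences_db(levels_db, reference=0):
    """Differences between each non-reference level and the reference level.

    levels_db holds one frame level per microphone; the result maps the
    microphone index to its difference, giving N-1 entries for N microphones.
    """
    ref_level = levels_db[reference]
    return {i: level - ref_level
            for i, level in enumerate(levels_db) if i != reference}
```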
According to a further embodiment of the present invention, further improvement in the calibration robustness (accuracy) can be achieved by using several reference microphones. In a complete solution of this kind, each microphone is compared to all other N−1 microphones. The solution can offer several redundant decision paths for defining the sensitivity difference between two microphones. In an ideal case all paths should indicate the same sensitivity difference. If this is not the case, it could be decided whether to use the values by averaging them in some suitable manner or to disable the update of the sensitivity difference estimate. A reduced version of the complete solution could be to use more than one but fewer than N microphones as reference microphones (N being the total number of microphones to be matched).
Furthermore, since it is unlikely that all N microphones (e.g., of the microphone array) become matched simultaneously, the control logic may be designed to indicate the maturity state of the calibration of each microphone. This may allow the microphones to be taken into use one by one for the further processing, e.g., beamforming.
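One possible way of combining several decision paths is sketched below: the direct estimate between two microphones is compared with estimates obtained via other microphones, and the values are averaged only if they agree. The function name, the plain averaging and the 0.5 dB tolerance are assumptions; the description leaves the combining rule open.

```python
def combine_paths(direct_db, indirect_dbs, tolerance_db=0.5):
    """Average agreeing path estimates, or return None to skip the update."""
    candidates = [direct_db] + list(indirect_dbs)
    if max(candidates) - min(candidates) > tolerance_db:
        return None   # paths disagree: disable the update for now
    return sum(candidates) / len(candidates)
```

Here an indirect estimate for microphones i and j via a third microphone k would be the sum of the estimated differences for the pairs (i, k) and (k, j).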
The advantage of the algorithm disclosed herein is that it may not require any separate calibration routines to be done, since the calibration can be carried out on the fly during the normal operation of the electronic device utilizing it. This fact can minimize possible errors in manual calibration, and even more importantly, it can minimize the cost of calibration. Furthermore, microphone component aging and environmental effects on microphone component sensitivities may be handled inherently by the algorithm. Finally, the algorithm does not need information about the microphone positions and the direction of arrival of the desired sound (acoustic signal).
An acoustic signal 56 can be received by a microphone array 54 with N microphones for generating N corresponding microphone signals 58, wherein N is a finite integer of at least a value of two. A multi-channel analog-to-digital (A/D) converter 60 (which can be a part of the module 52, or alternatively be a part of the electronic device 50) can provide A/D conversion of the microphone signals 58 into digital signals 76 (the example shown in
The low-pass filter(s) 62 may be optionally used to cut-off high frequency components as described in reference to step 14 of
The module 74 may be used to implement steps 19 through 36 described in reference to
According to an embodiment of the present invention, the modules 74, 64, 62, 66 or 70 may be implemented as a software or a hardware module or a combination thereof. Furthermore, the module 74, 64, 62, 66 or 70 may be implemented as a separate module or may be combined with any other module of the electronic device 50 or it can be split into several modules according to their functionality. Furthermore, an integrated circuit may comprise selected or all modules of the multiple microphones calibration module 52.
As explained above, the invention provides both a method and corresponding equipment consisting of various modules providing the functionality for performing the steps of the method. The modules may be implemented as hardware, or may be implemented as software or firmware for execution by a computer processor. In particular, in the case of firmware or software, the invention can be provided as a computer program product including a computer readable storage structure embodying computer program code (i.e., the software or firmware) thereon for execution by the computer processor.
It is further noted that various embodiments of the present invention recited herein can be used separately, combined or selectively combined for specific applications.
It is to be understood that the above-described arrangements are only illustrative of the application of the principles of the present invention. Numerous modifications and alternative arrangements may be devised by those skilled in the art without departing from the scope of the present invention, and the appended claims are intended to cover such modifications and arrangements.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/FI2009/050314 | 4/22/2009 | WO | 00 | 11/9/2010 |
Number | Date | Country |
---|---|---|
61125475 | Apr 2008 | US |