The present invention relates to a method and to a device for operating voice-enhancement systems, such as communication and/or intercom/two-way intercom and/or duplex telephony devices in motor vehicles, where voice signals are picked up via a microphone system and routed to at least one loudspeaker.
Methods of this kind are used in motor vehicles for voice-supported duplex telephony or for supporting voice input-controlled electronic or electrical components. The fundamental difficulty is that, depending on the particular operating state, background noise is present in the motor vehicle and masks the voice commands. Intercom and two-way intercom systems are used to greatest advantage in large vehicles, minibuses and the like, but they can also be used in normal passenger cars. When voice-controlled input units are used for electrical components in motor vehicles, it remains very important for the background noise to be suppressed or for the voice command to be filtered out.
Thus, a voice-recognition device for a motor vehicle is described in European Patent Application No. 0 078 014, where the status of engine operation and/or motor vehicle movement is signaled or fed in, via sensors, to the amplifier system of the voice-recognition device. Based on this, a noise-level control is then used to attempt to filter out the voice command from the background noise.
A filtering operation is described in PCT International Published Patent Application No. WO 97/34290, where periodic interfering noise signals are filtered out by determining their periods and by using a generator to interfere with them, so that the voice signal remains.
German Published Patent Application No. 197 05 471 describes supporting voice recognition with the aid of transversal filtering.
In German Published Patent Application No. 41 06 405, a method is described for subtracting noise from the voice signal, a multiplicity of microphones being used. A duplex telephony device having a plurality of microphones is discussed in German Published Patent Application No. 199 58 836.
German Published Patent Application No. 39 25 589 describes a multiple-microphone system in which, in motor vehicle applications, one microphone is placed in the engine compartment and another in the passenger compartment. The two signals are then subtracted. The disadvantage in this context is that only the engine noise or the actual running noise of the vehicle itself is subtracted from the total signal in the passenger compartment. Specific secondary noises are disregarded in this case. Feedback suppression is also lacking. Wherever microphones and loudspeakers are placed in acoustically coupled proximity, the acoustic signal emitted at the loudspeaker is fed back into the microphone. The result is so-called feedback and subsequent overmodulation. Methods for avoiding such overmodulation are described in European Published Patent Application No. 1 077 013, PCT International Published Patent Application No. WO 02/069487, and PCT International Published Patent Application No. WO 02/21817.
It is an object of the present invention to provide a method and a device that may improve the verbal communication among the occupants of a vehicle.
The above and other beneficial objects of the present invention may be achieved by providing a method and a device as described herein.
The above object may be attained in that, for the operation of a voice-supported system, such as a communications and/or duplex telephony device in a motor vehicle, using at least one microphone and at least one loudspeaker to reproduce a signal generated by the microphone, as well as using a bandpass filter arranged between the microphone and the loudspeaker, the power of a signal is determined as a function of a frequency, and the bandpass filter is adjusted or set as a function of at least one local maximum of the power of the signal as a function of the frequency.
A local maximum of the power of the signal as a function of the frequency may also include the global maximum of the power of the signal as a function of the frequency.
In an example embodiment of the present invention, the local maximum of the power of the signal may be determined as a function of a derivative, e.g., the first derivative, of the power of the signal with respect to the frequency.
In an example embodiment of the present invention, an edge or slope signal may be formed using the first derivative of the power of the signal with respect to the frequency, which takes on a first binary value when the first derivative of the power of the signal with respect to the frequency is greater than or equal to zero, and which takes on a second binary value when the first derivative of the power of the signal with respect to the frequency is less than zero, the local maximum of the power of the signal being determined as a function of the first derivative of the slope signal.
In an example embodiment of the present invention, the presence of a local maximum of the power of the signal may only be assumed if the first derivative of the slope signal is less than zero.
The foregoing object may additionally be attained in that, for the operation of a voice-supported system, such as a communications and/or duplex telephony device in a motor vehicle, using at least one microphone and at least one loudspeaker to reproduce a signal generated by the microphone, as well as using a bandpass filter arranged between the microphone and the loudspeaker, the power of a signal may be determined as a function of a frequency, and the bandpass filter may be adjusted as a function of a derivative of the power of the signal with respect to the frequency.
In an example embodiment of the present invention, the bandpass filter may be adjusted as a function of at least two local maxima of the power of the signal as a function of the frequency.
In an example embodiment of the present invention, the bandpass filter may be adjusted as a function of the first derivative of the power of the signal with respect to the frequency.
In an example embodiment of the present invention, a slope signal may be formed using the first derivative of the power of the signal with respect to the frequency, which takes on a first binary value when the first derivative of the power of the signal with respect to the frequency is greater than or equal to zero, and which takes on a second binary value when the first derivative of the power of the signal with respect to the frequency is less than zero, the bandpass filter being adjusted as a function of the slope signal or of the first derivative of the slope signal.
In an example embodiment of the present invention, all local maxima may be determined in one frequency range. In an example embodiment of the present invention, the global maximum may be determined in that frequency range.
In an example embodiment of the present invention, the bandpass filter may be adjusted so that it blocks the portion of the signal generated by the microphone at a notch frequency only when the ratio:
to
In an example embodiment of the present invention, the bandpass filter may be adjusted so that it blocks the portion of the signal generated by the microphone at a notch frequency only when the ratio:
to
In an example embodiment of the present invention, the bandpass filter may be adjusted so that it blocks the portion of the signal generated by the microphone at a notch frequency only when the ratio:
to
In an example embodiment of the present invention, the bandpass filter may be adjusted so that it blocks the portion of the signal generated by the microphone at a notch frequency only when the ratio:
to
In an example embodiment of the present invention, the bandpass filter may be adjusted so that it blocks the portion of the signal generated by the microphone at a notch frequency only when the ratio:
to
In an example embodiment of the present invention, the bandpass filter may be adjusted so that it blocks the portion of the signal generated by the microphone at a notch frequency only when the ratio:
to
In an example embodiment of the present invention, the bandpass filter may be adjusted so that it blocks the portion of the signal generated by the microphone at a notch frequency only when the ratio:
to
In an example embodiment of the present invention, the bandpass filter may be adjusted so that it blocks the portion of the signal generated by the microphone at a notch frequency only when the ratio:
to
In an example embodiment of the present invention, the feedback-power threshold (RatioThreshold, OutGrdRatioThreshold) may be established as a function of an output signal of the bandpass filter.
In an example embodiment of the present invention, the feedback-power threshold (RatioThreshold, OutGrdRatioThreshold) may be between 20 and 40.
In an example embodiment of the present invention, the bandpass filter may be adjusted so that it blocks the portion of the signal generated by the microphone at a notch frequency only when the ratio:
to
In an example embodiment of the present invention, the bandpass filter may be adjusted so that it blocks the portion of the signal generated by the microphone at a notch frequency only when the ratio:
to
The power of the signal generated by the microphone at the frequency at which the power of the signal generated by the microphone is a maximum, and/or the power of the signal generated by the microphone at a frequency at which the power of the signal generated by the microphone has a local maximum, in the sense of the foregoing, may alternatively or additionally include the power that the signal has at a frequency closely adjacent to the above-named frequency, provided that this power is (still) similarly high to that of the respective maximum.
In an example embodiment of the present invention, the additional power threshold (RichContentThreshold) may be between 20 and 50, e.g., between 30 and 40.
In an example embodiment of the present invention, the bandpass filter may be adjusted as a function of its output signal.
In an example embodiment of the present invention, the bandpass filter may include a notch filter or a filter bank, e.g., a multifilter, having at least one notch filter. The filter bank may include, for example, 10 notch filters.
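A filter bank of notch filters may be sketched as a cascade of second-order sections. The following is only a minimal illustration under stated assumptions, not the filter of the exemplary embodiment: the coefficient formulas are the widely used "audio EQ cookbook" notch biquad, the quality `q` is the dimensionless cookbook definition (which may differ from the quality Q used in this description), and the names `notch_coefficients`, `apply_biquad`, `apply_filter_bank` and `mean_power` are hypothetical.

```python
import math


def notch_coefficients(fc_hz, q, fs_hz):
    """Normalized biquad notch coefficients (cookbook form, assumed here)."""
    w0 = 2.0 * math.pi * fc_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def apply_biquad(samples, b, a):
    """Run one second-order section in direct form I."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def apply_filter_bank(samples, notches, fs_hz):
    """Cascade one notch per (fc_hz, q) pair, e.g. up to 10 pairs."""
    for fc_hz, q in notches:
        b, a = notch_coefficients(fc_hz, q, fs_hz)
        samples = apply_biquad(samples, b, a)
    return samples


def mean_power(samples):
    """Average of the squared samples."""
    return sum(s * s for s in samples) / len(samples)
```

In such a sketch, a tone at the notch frequency is strongly attenuated once the filter has settled, while a tone well away from the notch frequency passes essentially unchanged.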
Further aspects, features and details are set forth below in the following description of exemplary embodiments.
In the present exemplary embodiment, loudspeakers 9, 17, 18, 19, 20 output a signal generated by microphone 21, loudspeakers 7, 17, 18, 19, 20 output a signal generated by microphone 22, loudspeakers 7, 9, 19, 20 output a signal generated by microphone 23, and loudspeakers 7, 9, 17, 18 output a signal generated by microphone 24. In this manner, the possibility of verbal communication in a motor vehicle is supported. In this context, in principle, the more strongly a signal is amplified between one of microphones 21, 22, 23, 24 and one of loudspeakers 7, 9, 17, 18, 19, 20, the better the communication may be. However, the possibility of implementing such an amplification is limited by possible feedback effects caused by sound radiated by a loudspeaker 7, 9, 17, 18, 19, 20, which is received by microphone 21, 22, 23, 24, and is subsequently amplified and radiated again by loudspeaker 7, 9, 17, 18, 19, 20.
To reduce such a feedback, in accordance with the example embodiment illustrated in
To amplify signal S and/or signal S′, amplifiers may be provided. However, the amplifier function may also be provided by the bandpass filter.
It may be provided to average over time the power at test frequencies fn, fn+1, fn+2, fn+3, fn+4, fn+5, fn+6, fn+7, fn+8 i.e., to develop an average over time, and to test this average value over time of the power instead of the current power of signal S at test frequencies fn, fn+1, fn+2, fn+3, fn+4, fn+5, fn+6, fn+7, fn+8. The foregoing may consequently also include the average value of the power developed over a certain time period. Furthermore, power in the present context may include the amplitude or its average value over time. In the present context, further modifications of power, amplitude or their average values over time may also be included, such as normalized values. Thus, for instance, by the power of signal S at a test frequency fn in the present context, the value of the power of signal S at this test frequency fn divided by the sum of the power of signal S at all test frequencies fn, fn+1, fn+2, fn+3, fn+4, fn+5, fn+6, fn+7, fn+8 may be understood.
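The normalization and the averaging over time described above may be illustrated as follows. This is a sketch under assumptions: the exponential moving average is only one possible way of forming an average over time, and the function names and the `weight` parameter are hypothetical.

```python
def normalized_powers(powers):
    """Divide the power at each test frequency by the sum over all test frequencies."""
    total = sum(powers)
    if total == 0:
        return [0.0] * len(powers)
    return [p / total for p in powers]


def smooth_powers(previous_average, current, weight=0.1):
    """Exponential moving average per test frequency (assumed averaging scheme)."""
    return [(1.0 - weight) * avg + weight * cur
            for avg, cur in zip(previous_average, current)]
```

For example, with powers 1, 4, 16, 2, 1, 0, 0, 0, 1 at nine test frequencies, the normalized power at the third test frequency would be 16/25.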
Step 40 is followed by interrogation 41, e.g., whether the danger of feedback exists at a test frequency fn, fn+1, fn+2, fn+3, fn+4, fn+5, fn+6, fn+7, fn+8. Details pertaining to this query are explained with respect to
If the bandpass filter has not already removed the signal components around the test frequency from signal S generated by microphone 30, then query 42 is followed by an interrogation 43, e.g., whether a bandpass filter is available. If a bandpass filter is available, interrogation 43 is followed by a step 47, in which a bandpass filter is selected and the filter parameters, i.e., the mid-frequency fc and the quality Q of the bandpass filter, are generated. The mid-frequency fc is an example of the notch frequency. The notch frequency may, in particular, be the frequency range about the mid-frequency fc that the bandpass filter actually filters out of signal S generated by microphone 30.
Mid-frequency fc may, for example, be set equal to the test frequency for which feedback has been established. In an example embodiment of the present invention, mid-frequency fc may also be a test frequency having a correction frequency added to it. This correction frequency is formed, for example, as a function of the power of the signal generated by the microphone at the test frequency at which that power is a maximum, as well as of the power of the signal generated by the microphone at at least one test frequency next to this test frequency. Thus, the correction frequency may be generated in accordance with:
fkorr=sign*fdist*Pmaxneigh/(Pmax+Pmaxneigh);
in which:
This is explained in greater detail in the light of the following example:
Then it is true that
fkorr=(−)*40Hz*4/(16+4)=−8Hz
The test frequency at which the power of the signal generated by the microphone is a maximum, is consequently 3840 Hz, and the notch frequency is 3832 Hz.
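The first correction formula may be checked with a short sketch. The definitions of sign, fdist and Pmaxneigh are not reproduced above, so the reading used here is an assumption: Pmaxneigh is taken as the larger of the two neighboring powers, sign is negative when that neighbor lies below the maximum, and fdist is the spacing between test frequencies. The input values (maximum power 16, lower neighbor 4, upper neighbor 2, spacing 40 Hz) are read off the numerical examples, and the function name is hypothetical.

```python
def correction_frequency(p_max, p_left, p_right, f_dist_hz):
    """fkorr = sign * fdist * Pmaxneigh / (Pmax + Pmaxneigh), under the assumed reading."""
    if p_right >= p_left:
        sign, p_max_neigh = 1.0, p_right   # stronger neighbor lies above the maximum
    else:
        sign, p_max_neigh = -1.0, p_left   # stronger neighbor lies below the maximum
    return sign * f_dist_hz * p_max_neigh / (p_max + p_max_neigh)


f_korr = correction_frequency(p_max=16.0, p_left=4.0, p_right=2.0, f_dist_hz=40.0)
notch_frequency_hz = 3840.0 + f_korr
```

With these inputs the sketch yields a correction of −8 Hz and a notch frequency of 3832 Hz.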
The correction frequency may also be formed according to:
fkorr=Δf*(Pneighright−Pneighleft)/(Pmax+|Pneighright−Pneighleft|),
in which:
Pneighleft represents the power of the signal generated by the microphone at the test frequency directly below the test frequency at which the power of the signal generated by the microphone is a maximum.
Based on the above numerical example, it is true in this case that:
fkorr=40Hz*(2−4)/(16+|4−2|)=−4.44Hz
The test frequency, at which the power of the signal generated by the microphone is a maximum, is consequently 3840 Hz and the notch frequency is 3835.56 Hz.
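The second correction formula may be sketched in the same manner. It is assumed here that Δf is the spacing between adjacent test frequencies and that Pneighright is the power at the test frequency directly above the maximum, by analogy with the definition of Pneighleft; the function name is hypothetical.

```python
def correction_frequency_weighted(p_max, p_left, p_right, delta_f_hz):
    """fkorr = Δf * (Pneighright − Pneighleft) / (Pmax + |Pneighright − Pneighleft|)."""
    diff = p_right - p_left
    return delta_f_hz * diff / (p_max + abs(diff))


f_korr_weighted = correction_frequency_weighted(
    p_max=16.0, p_left=4.0, p_right=2.0, delta_f_hz=40.0)
notch_frequency_weighted_hz = 3840.0 + f_korr_weighted
```

For the numerical example this gives approximately −4.44 Hz, i.e., a notch frequency of approximately 3835.56 Hz.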
Quality Q is adjusted to a predefined value of, for example, 1/40 Hz.
If query 43 results in the statement that no bandpass filter is available, query 43 is followed by a step 48, in which the power of signal S is reduced by a reduction factor which may be between 2 dB and 5 dB, e.g., at essentially 3 dB.
If the result of query 42 is that the bandpass filter is already removing the signal portions around the test frequency from signal S generated by microphone 30, a query 44 follows query 42. Query 44 asks whether a further widening of the frequency range in which the bandpass filter blocks, that is, a further reduction of its quality Q, would undershoot a predetermined minimum quality.
If by a further widening of the frequency range a predetermined minimum quality may be undershot, query 44 is followed by a step 45, and otherwise by a step 46. In step 45, which corresponds to step 48, the power of signal S is reduced by a reduction factor, which may be between 2 dB and 5 dB, e.g., at essentially 3 dB. In step 46 quality Q is reduced, i.e., the bandpass filter is widened.
After steps 45, 46, 47 and 48 there is a step 49, in which a time between 0.1 s and 3 s is allowed to elapse.
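The sequence of queries 42, 43, 44 and steps 45 to 48 may be summarized in a sketch such as the following. The class and attribute names, the numerical defaults for the quality values, and the halving of Q in step 46 are assumptions made purely for illustration; only the branching structure and the reduction of essentially 3 dB are taken from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Notch:
    fc_hz: float
    q: float


@dataclass
class FeedbackSuppressor:
    free_filters: int = 10       # e.g., a filter bank of 10 notch filters
    initial_q: float = 30.0      # assumed starting quality
    min_q: float = 5.0           # assumed predetermined minimum quality
    q_factor: float = 0.5        # assumed widening rule for step 46
    gain_step_db: float = 3.0    # reduction factor of steps 45 and 48
    gain_db: float = 0.0
    notches: Dict[float, Notch] = field(default_factory=dict)

    def handle_feedback(self, test_freq_hz: float,
                        fc_hz: Optional[float] = None) -> str:
        notch = self.notches.get(test_freq_hz)
        if notch is None:                        # query 42: not yet filtered
            if self.free_filters > 0:            # query 43: filter available
                self.free_filters -= 1
                center = test_freq_hz if fc_hz is None else fc_hz
                self.notches[test_freq_hz] = Notch(center, self.initial_q)
                return "step 47"
            self.gain_db -= self.gain_step_db
            return "step 48"
        widened_q = notch.q * self.q_factor
        if widened_q < self.min_q:               # query 44: minimum undershot
            self.gain_db -= self.gain_step_db
            return "step 45"
        notch.q = widened_q                      # widen the blocked range
        return "step 46"
```

A caller would pause for the waiting time of step 49 after each call before evaluating the signal again.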
to
Using query 62, e.g., as provided by this exemplary embodiment, the question is put whether the ratio PowerRatio3:
to
It may be provided that query 62 is only answered affirmatively if the global maximum is at a test frequency for longer than a time threshold OutGrdMaxBinTimeThreshold.
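The time condition on the global maximum may be sketched as a simple persistence check. The class name and the way time is supplied are assumptions; only the idea that the global maximum must remain at one test frequency for longer than OutGrdMaxBinTimeThreshold is taken from the description.

```python
class MaxBinTimer:
    """Tracks how long the global maximum has stayed at the same test frequency."""

    def __init__(self, time_threshold_s):
        self.time_threshold_s = time_threshold_s  # e.g., OutGrdMaxBinTimeThreshold
        self.bin_index = None
        self.since_s = 0.0

    def update(self, bin_index, now_s):
        """Return True once the same index has persisted longer than the threshold."""
        if bin_index != self.bin_index:
            self.bin_index = bin_index
            self.since_s = now_s
        return (now_s - self.since_s) > self.time_threshold_s
```

Each time the global maximum moves to a different test frequency, the timer restarts, so that short-lived peaks in the voice signal would not satisfy the condition.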
To carry out query 62, the local maxima are determined first. For this purpose, the first derivative of the power of signal S with respect to frequency f is initially determined for the test frequencies. From the first derivative of the power of signal S with respect to frequency f, a slope signal is subsequently formed, which assumes a first binary value when the first derivative of the power of signal S with respect to frequency f is greater than or equal to zero, and which assumes a second binary value when the first derivative of the power of signal S with respect to frequency f is less than zero. Subsequently, the first derivative of the slope signal is ascertained. In this context, in an example embodiment of the present invention, the presence of a local maximum of the power of signal S as a function of frequency f is only assumed if the first derivative of the slope signal is less than zero.
In this context, Table 1 shows an exemplary embodiment of a program written in the language Matlab™, which ascertains the indices idx_vec of the test frequencies at which there are local maxima according to criteria mentioned above. In this context, x denotes a vector having the powers at the individual test frequencies, and flec_thresh denotes a value between 0 and −1.
The local maximum having the greatest power is regarded as the global maximum.
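The determination of the local maxima via the slope signal may be illustrated by the following Python sketch. It is not a reproduction of the program of Table 1; it merely follows the criteria described above, reusing the names x, idx_vec and flec_thresh. The choice of 1 and 0 as the two binary values, the use of finite differences as the derivative, and the default value of flec_thresh are assumptions.

```python
def local_maxima_indices(x, flec_thresh=-0.5):
    """Indices of x at which the slope signal drops from 1 to 0."""
    # First derivative of the power with respect to frequency (finite difference).
    diffs = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    # Slope signal: 1 for a derivative >= 0, 0 for a derivative < 0.
    slope = [1 if d >= 0 else 0 for d in diffs]
    # First derivative of the slope signal; a value of -1 marks a local maximum.
    slope_diff = [slope[i + 1] - slope[i] for i in range(len(slope) - 1)]
    return [i + 1 for i, s in enumerate(slope_diff) if s < flec_thresh]


def global_maximum_index(x, idx_vec):
    """The local maximum having the greatest power, or None if there is none."""
    return max(idx_vec, key=lambda i: x[i]) if idx_vec else None


x = [1, 4, 16, 2, 1, 3, 9, 5, 1]
idx_vec = local_maxima_indices(x)            # [2, 6]
global_idx = global_maximum_index(x, idx_vec)  # 2
```

In this sketch a maximum at the first or last test frequency is not reported, since no change of the slope signal can be observed there.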
If query 62 is answered in the affirmative, then query 62 is followed by a query 63, and otherwise by a step 64.
By query 63, the question is put as to whether signal S has a strong harmonic component. For this purpose, in an exemplary embodiment, the question is put whether the ratio:
to
If query 63 reveals that the ratio:
to
In step 64, the sequence is stopped for a predetermined retention time, such as 3 s. After the expiration of the retention time, feedback is negated.
If query 61 yields that the power of output signal S′ of bandpass filter 32 does not exceed the output threshold, then query 61 is followed by query 65, which essentially corresponds to query 62. In this context, however, a different feedback-power threshold, RatioThreshold, is used instead of OutGrdRatioThreshold. However, the feedback-power threshold RatioThreshold may also be between 30 and 40.
If query 65 is answered affirmatively, then query 65 is followed by query 66 corresponding to query 63. Otherwise the presence of feedback is negated.
If query 66 reveals that the ratio:
to
The feedback detection is not limited to the example embodiment described above. The feedback detection may, for example, be constituted so that only query 65 is provided. The detection of feedback may also be provided so as to replace the example embodiments in accordance with
Query 63 as in
In
The ratio:
to
The ratio:
to
Number | Name | Date | Kind |
---|---|---|---|
5245665 | Lewis et al. | Sep 1993 | A |
5442712 | Kawamura et al. | Aug 1995 | A |
5677987 | Seki et al. | Oct 1997 | A |
6125187 | Hanajima et al. | Sep 2000 | A |
6252969 | Ando | Jun 2001 | B1 |
6385176 | Lyengar et al. | May 2002 | B1 |
6535609 | Finn et al. | Mar 2003 | B1 |
6674865 | Venkatesh et al. | Jan 2004 | B1 |
20040158460 | Finn et al. | Aug 2004 | A1 |
Number | Date | Country |
---|---|---|
39 25 589 | Feb 1991 | DE |
41 06 405 | Sep 1991 | DE |
197 05 471 | Jul 1997 | DE |
199 58 836 | May 2001 | DE |
0 078 014 | Aug 1986 | EP |
0 599 450 | Jun 1994 | EP |
1 077 013 | Mar 2002 | EP |
1 445 761 | Jan 2004 | EP |
WO 9734290 | Sep 1997 | WO |
WO 0221817 | Mar 2002 | WO |
WO 02069487 | Sep 2002 | WO |
WO 03079721 | Sep 2003 | WO |
Number | Date | Country |
---|---|---|
20050013451 A1 | Jan 2005 | US |