The present invention relates to noise reduction in speech signal processing.
Common noise reduction algorithms make assumptions about the type of noise present in a noisy signal. The Wiener filter, for example, introduces the mean squared error (MSE) cost function as an objective distance measure to minimize the distance between the desired and the filtered signal. The MSE, however, does not account for human perception of signal quality. Also, filtering algorithms are usually applied to each of the frequency bins independently. Thus, all types of signals are treated equally. This allows for good noise reduction performance under many different circumstances.
However, mobile communication situations in an automobile environment are special in that they contain speech as their desired signal. The noise present while driving is mainly characterized by increasing noise levels with lower frequency. Speech signal processing starts with an input audio signal from a speech-sensing microphone. The microphone signal represents a composite of multiple different sound sources. Except for the speech component, all of the other sound source components in the microphone signal act as undesirable noise that complicates the processing of the speech component. Separating the desired speech component from the noise components has been especially difficult in moderate to high noise settings, especially within the cabin of an automobile traveling at highway speeds, when multiple persons are simultaneously speaking, or in the presence of audio content.
In speech signal processing, the microphone signal is usually first segmented into overlapping blocks of appropriate size and a window function is applied. Each windowed signal block is then transformed into the frequency domain using a fast Fourier transform (FFT) to produce noisy short-term spectra signals. In order to reduce the undesirable noise components while keeping the speech signal as natural as possible, SNR-dependent (SNR: signal-to-noise ratio) weighting coefficients are computed and applied to the spectra signals. However, existing conventional methods use an SNR-dependent weighting rule which operates in each frequency independently and which does not take into account the characteristics of the actual speech sound being processed.
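As an illustration of the segmentation, windowing and transformation steps described above, the following Python sketch splits a signal into overlapping blocks, applies a window, and computes a short-term power spectrum. The function names `frame_signal` and `dft_power`, the choice of a Hann window, and the naive DFT (used here in place of an FFT only to keep the sketch free of dependencies) are assumptions made for illustration and are not taken from the specification.

```python
import math

def frame_signal(x, frame_len, hop):
    """Split x into overlapping blocks and apply a Hann window to each block."""
    # Periodic Hann window (assumed; any suitable window function may be used).
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / frame_len)
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        block = x[start:start + frame_len]
        frames.append([s * w for s, w in zip(block, window)])
    return frames

def dft_power(frame):
    """Power spectrum for bins 0..N/2 using a naive O(N^2) DFT."""
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):
        re = sum(frame[t] * math.cos(2.0 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2.0 * math.pi * k * t / n) for t in range(n))
        power.append(re * re + im * im)
    return power
```

With a block length of 32 samples and a hop of 16 samples, consecutive blocks overlap by 50 percent; each windowed block then yields one short-term spectrum.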
Embodiments of the present invention are directed to an arrangement for speech signal processing. The processing may be accomplished on a speech signal prior to speech recognition. The system and methodology may also be employed with mobile telephony signals, and more specifically in noisy automotive environments, so as to increase the intelligibility of received speech signals.
An input microphone signal is received that includes a speech signal component and a noise component. The microphone signal is transformed into a frequency domain set of short-term spectra signals. Then speech formant components within the spectra signals are estimated based on detecting regions of high energy density in the spectra signals. One or more dynamically adjusted gain factors are applied to the spectra signals to enhance the speech formant components.
A computer-implemented method that includes at least one hardware implemented computer processor, such as a digital signal processor, may process a speech signal and identify and boost formants in the frequency domain. An input microphone signal having a speech signal component and a noise component may be received by a microphone.
The speech pre-processor transforms the microphone signal into a frequency domain set of short term spectra signals. Speech formant components are recognized within the spectra signals based on detecting regions of high energy density in the spectra signals. One or more dynamically adjusted gain factors are applied to the spectra signals to enhance the speech formant components.
The formants may be identified and estimated based on finding spectral peaks using a linear predictive coding filter. The formants may also be estimated using an infinite impulse response smoothing filter to smooth the spectral signals. After the formants are identified, the coefficients for the frequency bins where the formants are identified may be boosted using a window function. The window function boosts and shapes the overall filter coefficients. The overall filter can then be applied to the original speech input signal. The gain factors for boosting are dynamically adjusted as a function of formant detection reliability. The shaped windows are dynamically adjusted and applied only to frequency bins that have identified speech. In certain embodiments of the invention, the boosting window function may be adapted dynamically depending on signal to noise ratio.
In embodiments of the invention, the gain factors are applied to underestimate the noise component so as to reduce speech distortion in formant regions of the spectra signals. Additionally, the gain factors may be combined with one or more noise suppression coefficients to increase broadband signal to noise ratio.
The formant detection and formant boosting may be implemented within a system having one or more modules. As used herein, the term module may imply an application specific integrated circuit or a general purpose processor and associated source code stored in memory. Each module may include one or more processors. The system may include a speech signal input for receiving a microphone signal having a speech signal component and a noise component. Additionally, the system may include a signal pre-processor for transforming the microphone signal into a frequency domain set of short term spectra signals. The system includes both a formant estimating module and a formant enhancement module. The formant estimating module estimates speech formant components within the spectra signals based on detecting regions of high energy density in the spectra signals. The formant enhancement module determines one or more dynamically adjusted gain factors that are applied to the spectra signals to enhance the speech formant components.
Various embodiments of the present invention are directed to computationally efficient techniques for enhancing speech quality and intelligibility in speech signal processing by identifying and accentuating speech formants within the microphone signals. Formants represent the main concentration of acoustical energy within certain frequency intervals (the spectral peaks) which are important for interpreting the speech content. Formant identification and accentuation may be used in conjunction with noise reduction algorithms.
As stated above, formants should be accentuated only during voiced speech phonemes and in those formant regions where the SNR (signal-to-noise ratio) is sufficient. Otherwise, noise components will be amplified, which reduces speech quality. In a first step, the inventive method identifies frequency regions of the input speech signal containing voiced speech (301). To accomplish this, a voiced excitation detector is employed. Any known excitation detector may be used, and the detector described below is only exemplary. In one embodiment, the voiced excitation detector module decides whether the mean logarithmic INR (input-to-noise ratio) exceeds a certain threshold PVUD* over a number (MF) of frequency bins:
If the result is true, a voice signal is recognized. If the result is false, the frequency bins in the current frame, denoted here with n, do not contain speech.
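A decision of this kind may be sketched as follows. The sketch assumes that the INR values of the MF bins under test are available as linear power ratios and that the threshold is expressed in dB; the function name `is_voiced` and the small numerical floor that guards the logarithm are illustrative assumptions.

```python
import math

def is_voiced(inr_linear, threshold_db):
    """Return True if the mean logarithmic INR over the given bins exceeds the threshold."""
    # Convert each linear INR to dB; the floor avoids log10(0) (assumption).
    log_inr = [10.0 * math.log10(max(v, 1e-12)) for v in inr_linear]
    return sum(log_inr) / len(log_inr) > threshold_db
```

For example, INR values of 10, 100 and 1000 have a mean logarithmic INR of about 20 dB and would pass a 15 dB threshold, whereas three bins at an INR of 1 (0 dB) would not.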
Once the frames having speech are identified, an optional smoothing function may be applied to the speech signal to eliminate the problem of harmonics masking the superposed formants (302). A first-order infinite impulse response (IIR) filter may be applied for smoothing, although other spectral smoothing techniques (e.g. spline, fast and slow smoothing, etc.) may be applied without deviating from the intent of the invention. The smoothing filter should be designed to provide adequate attenuation of the harmonics' effects while not cancelling out any formants' maxima.
An exemplary filter is defined below and this filter is applied once in forward direction and once in backward direction so as to keep local features in place. It has the form:
With the given transformation parameters (sampling frequency FS=16000 Hz and window width NFFT=512), a good compromise for the numerical smoothing constant was found to be gamma_f=0.92. This corresponds to a natural decay constant of:
for arbitrary short-term Fourier transform (STFT) parameters. The STFT-dependent parameter is then:
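One possible realization of the forward and backward smoothing is sketched below in Python. It assumes the common first-order recursion y[μ] = γ·y[μ−1] + (1−γ)·x[μ], which may differ in detail from the filter of the embodiment. The helper `rescale_gamma` is likewise an assumption: it keeps the decay per hertz constant when the bin spacing FS/NFFT (31.25 Hz for the parameters given above) changes.

```python
def smooth_forward_backward(psd, gamma=0.92):
    """First-order IIR smoothing along frequency, run forward and then backward."""
    out = list(psd)
    for mu in range(1, len(out)):            # forward pass over the bins
        out[mu] = gamma * out[mu - 1] + (1.0 - gamma) * out[mu]
    for mu in range(len(out) - 2, -1, -1):   # backward pass keeps peaks in place
        out[mu] = gamma * out[mu + 1] + (1.0 - gamma) * out[mu]
    return out

def rescale_gamma(gamma_ref, fs_ref, nfft_ref, fs, nfft):
    """Adapt the smoothing constant to a different bin spacing (assumed rule)."""
    df_ref = fs_ref / nfft_ref   # reference bin spacing in Hz
    df = fs / nfft               # target bin spacing in Hz
    return gamma_ref ** (df / df_ref)
```

Running the filter in both directions cancels the shift that a single pass would introduce, so a spectral peak remains at its original bin after smoothing.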
After smoothing the PSD, the local maxima are determined by finding the zeros of the derivative of the smoothed PSD within the respective frequency bins (303). Streaks of zeros are consolidated, and an analysis of the second derivative is used to classify minima, maxima, and saddle points, as is known to those of ordinary skill in the art. The maximum point is assumed to be the central frequency of the formant, fF(iF,n), and, in the case of fast and slow smoothing, the width of the formant, ΔfF(iF,n), will also be known.
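The search for local maxima may be sketched on a discrete spectrum as follows. Instead of an explicit second-derivative test, this simplified sketch looks for a rise followed by a fall in the first difference and consolidates flat streaks to their middle bin; the function name `find_local_maxima` and this simplification are assumptions for illustration.

```python
def find_local_maxima(smoothed):
    """Return the bin indices of local maxima of a smoothed spectrum."""
    peaks = []
    n = len(smoothed)
    mu = 1
    while mu < n - 1:
        if smoothed[mu] > smoothed[mu - 1]:
            # Rising edge reached bin mu; walk across a possible plateau.
            end = mu
            while end + 1 < n and smoothed[end + 1] == smoothed[mu]:
                end += 1
            # A fall after the (possibly flat) top marks a maximum.
            if end + 1 < n and smoothed[end + 1] < smoothed[mu]:
                peaks.append((mu + end) // 2)
            mu = end + 1
        else:
            mu += 1
    return peaks
```

Multiplying a returned bin index by the bin spacing FS/NFFT gives the corresponding central frequency in hertz.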
Once the formants are identified, the formant regions can be accentuated using an adaptive gain factor. A boosting function B(f, n) with codomain [0, 1] is defined, where a value of 0 represents the absence of any formants in the respective frequency bin, while a value of 1 marks a formant's center.
We introduce the prototype boosting window function bprot(x), which maps its argument x to the interval [0, 1] and defines the actual prototype window shape.
Within any formant, the highest signal-to-noise ratio (SNR) can be expected at its center. The introduction of noise by boosting the signal increases towards formants' borders. Thus, typical boosting around a formant's center preferably should fall off gently.
Differently shaped windows, such as Gaussian, cosine, and triangular windows, can be used. Different weighting rules can be utilized to boost the input signal. Preferably, the boosting window emphasizes the center frequencies of formants and is stretched over a frequency range. For each formant detected, the prototype window function is stretched by a factor w(iF, n) to match the formant's width if it is known, as is the case for the approach with fast and slow smoothing. Otherwise, it should be stretched to a constant frequency width of about 600 Hz, although other similar frequency ranges may be employed.
The window must also be shifted by the formant's central frequency to match its location in the frequency domain. The boosting function is defined to be the sum of the stretched and shifted prototype boosting window functions:
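One reading of the preceding sentence is B(f, n) = Σ over iF of bprot((f − fF(iF,n)) / w(iF,n)). The Python sketch below evaluates this sum for one frame. The raised-cosine prototype with support [−0.5, 0.5], the clipping of overlapping windows to 1, and the names `b_prot` and `boosting_function` are assumptions made for illustration; only the 600 Hz default width comes from the description.

```python
import math

def b_prot(x):
    """Prototype window: raised cosine on [-0.5, 0.5], zero elsewhere (assumed shape)."""
    if abs(x) >= 0.5:
        return 0.0
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * x))

def boosting_function(freqs_hz, centers_hz, widths_hz=None, default_width_hz=600.0):
    """Sum of stretched and shifted prototype windows, clipped to [0, 1]."""
    if widths_hz is None:
        # Formant widths unknown: fall back to a constant width.
        widths_hz = [default_width_hz] * len(centers_hz)
    boost = []
    for f in freqs_hz:
        total = sum(b_prot((f - fc) / w) for fc, w in zip(centers_hz, widths_hz))
        boost.append(min(total, 1.0))  # clipping is an assumption for overlapping formants
    return boost
```

For a single formant centered at 500 Hz with the default width, the sketch returns 1 at 500 Hz, 0.5 at 650 Hz, and 0 at 800 Hz and beyond.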
In other embodiments of the invention, the gain values around the center of the shaped windows may be adjusted depending on the presumed reliability of the formant estimation. Thus, if the formant estimation reliability is low, the windowing function will not boost the frequency components as much when compared to a highly reliable formant estimation.
In order to avoid detection of formants within the speech signal (e.g. frame) when no actual speech is present, prior estimated formants can also be taken into account for adjustments to the window function. In general, the formant locations slowly change over time depending on the spoken phoneme.
where α is the overestimation factor and β is the spectral floor. Here, the spectral floor acts both as a feedback limit and as the classical spectral floor that masks musical noise.
can be replaced by INR(fμ,n) to get
To find the equilibrium map in its input-state space, set
H(fμ,n) = H(fμ,n−1) =: H′eq
and
INR(fμ,n)=:INR′eq.
This leads to
This is an implicit representation of the reduced system's equilibrium map. It can be transformed to give the INR′eq as a function of the system's output H′eq:
or to give a quasi-function of H′eq with two branches in the INR′eq domain:
This system has two distinct equilibria. The top branch is stable on both sides, while the lower branch is unstable. Left of the bifurcation point, the filter's output constantly decreases toward zero, so the filter is closed almost completely as soon as a low input INR is reached. The noise reduction filter's output H(fμ, n) represents filter coefficients with values between 0 and 1 for each frequency bin μ in a frame n. It should be understood by one of ordinary skill in the art that other noise reduction filters may be employed in combination with formant detection and boosting without deviating from the intent of the invention and that, therefore, the present invention is not limited solely to recursive Wiener filters. Filters with a feedback structure similar to that of the modified Wiener filter (e.g. modified power subtraction, modified magnitude subtraction) can be further enhanced by placing their hysteresis flanks depending on the formant boosting function. Arbitrary noise reduction filters (e.g., Y. Ephraim, D. Malah: Speech Enhancement Using a Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator, IEEE Trans. Acoust. Speech Signal Process., vol. 32, no. 6, pp. 1109-1121, 1984) can be enhanced by applying additional gain to their output filter coefficients depending on the formant boosting function.
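The hysteresis behavior described above can be reproduced with a short Python sketch. It assumes the update rule H(fμ,n) = max{1 − α / (H(fμ,n−1)·INR(fμ,n)), β}, a form commonly given for a recursive Wiener filter; the equation used in the embodiment may differ. The function names, the initial value of 1, and the requirement β > 0 (so that the division is always defined) are assumptions of the sketch.

```python
def recursive_wiener_step(h_prev, inr, alpha, beta):
    """One update of the assumed recursive Wiener rule with spectral floor beta."""
    return max(1.0 - alpha / (h_prev * inr), beta)

def run_filter(inr_sequence, alpha, beta, h_init=1.0):
    """Run the recursion over a sequence of linear INR values for one frequency bin."""
    h = h_init
    outputs = []
    for inr in inr_sequence:
        h = recursive_wiener_step(h, inr, alpha, beta)
        outputs.append(h)
    return outputs
```

Under this assumed rule with α = 1 and β = 0.1, a bin that starts open settles on the upper equilibrium branch at an INR of 8, closes to the floor when the INR drops to 2, remains closed when the INR returns to 8, and reopens only once the INR rises well above that value (for example to 20). The same input INR of 8 therefore yields two different outputs depending on the filter's history.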
Once the filter coefficients of the noise reduction filter are determined, the coefficients are provided to the formant booster 401. The formant booster 401 first detects formants in the spectrum of the noise-reduced signal. The formant booster may identify all high power density bands as formants or may employ other detection algorithms. The detection of formants can be performed using linear predictive coding (LPC) techniques for estimating the vocal tract information of a speech sound and then searching for the LPC spectral peaks. In one embodiment, a voice excitation detection methodology is employed as described with respect to
After the formants have been boosted within their respective frequency bins, the resultant filter coefficients H(k,μ) are convolved with the digital microphone signal resulting in a reduced noise and formant boosted signal Ŝ(k, μ). The signal, which is still in the frequency domain and composed of frequency bins and temporal frames, is passed through a synthesis filter bank to transform the signal into the time domain. The resulting signal represents an augmented version of the original speech signal and should be better defined, so that a subsequent speech recognition engine (not shown) can recognize the speech.
In contrast to the process described above, where the formants are boosted subsequent to a noise reduction filter, the disclosed formant detection and boosting method can also be applied as a preprocessing stage or as part of a conventional noise suppression filter. This methodology underestimates the background noise in formant regions and can be used to arbitrarily control the filter's parameters depending on the formants. In this approach, the noise suppression filter is provoked to admit formants that would normally be attenuated if all frequency bins were treated equally. As a consequence, the noise suppression filter operates less aggressively and thus reduces speech distortions to a certain extent. As previously indicated, in some embodiments of the invention, a recursive Wiener filter may be used as the noise suppression filter. While the recursive Wiener filter effectively reduces musical noise, it also attenuates speech at low INRs. The placement of the hysteresis edges, or flanks, in the filter's characteristic determines at which INR signals are attenuated down to the spectral floor. Proper placement of the flanks will lead to a good trade-off between musical noise suppression and speech signal fidelity. It is desirable to modify the flanks' positions according to circumstance. In areas with only noise (the term area is used here to describe time spans as well as frequency bands), musical noise suppression should remain prevalent, while in areas with speech signal components (e.g. in formants), preserving the speech signal becomes more important. By detecting important speech components in the form of formants, one obtains a good weighting function between the two. For the recursive Wiener filter, the INR values of the flanks at which the filter closes (INR eq,down) or opens (INR eq,up) are given by:
This system can be rearranged to describe the parameters α and β as functions of the flanks' desired INR:
The flanks can be independently placed by choosing an adequate overestimation α and spectral floor β. If one chose β arbitrarily small, for example, to move the upwards flank towards a higher INR, this would also result in a very low output floor, that is, a very strong maximum attenuation, which might be undesirable. This may be eliminated by introducing a separate parameter Hmin that does not contribute to the feedback but still limits the output attenuation. The proposed system is described by
This filter can be tailored to different conditions better than the conventional recursive Wiener filter. The boosting function can be put to use in this setup by defining the default flank positions (INRup0, INRdown0) and their desired maximum deviations (ΔINRup, ΔINRdown) in the center of formants. Then, the filter parameters are updated in every frame and for every bin according to the presence of formants:
where B(fμ,n) is the formant boost window function. The formants can be determined as described above, and the boost window function may be selected from any of a number of window functions, including Gaussian, triangular, and cosine windows.
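A formant-dependent parameter update of this kind may be sketched as follows. The sketch again assumes the update rule H(fμ,n) = max{1 − α / (H(fμ,n−1)·INR(fμ,n)), β}, for which the closing flank lies at an INR of 4α and the opening flank at an INR of α / (β(1 − β)); these relations, the linear shift of the flanks on a linear INR scale, the sign convention of the deviations, and the function names are assumptions rather than the equations of the embodiment.

```python
import math

def params_from_flanks(inr_down, inr_up):
    """Map desired closing and opening flanks (linear INR) to (alpha, beta)."""
    if inr_up < inr_down:
        raise ValueError("opening flank must not lie below the closing flank")
    alpha = inr_down / 4.0                          # from INR_down = 4 * alpha
    beta = 0.5 - math.sqrt(0.25 - alpha / inr_up)   # from INR_up = alpha / (beta * (1 - beta))
    return alpha, beta

def formant_dependent_params(boost, inr_down0, inr_up0, d_inr_down, d_inr_up):
    """Shift both flanks in proportion to the boosting value B in [0, 1]."""
    inr_down = inr_down0 + boost * d_inr_down
    inr_up = inr_up0 + boost * d_inr_up
    return params_from_flanks(inr_down, inr_up)
```

With negative deviations, both flanks move toward lower INR in the center of a formant (B = 1), so the filter opens earlier and closes later there, while bins without formants (B = 0) keep the default flank positions.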
If the formant boosting is performed prior to or simultaneously with the noise reduction, there is no accentuation of the formants beyond 0 dB. Additionally, there is no further improvement of formants in bins that have good signal-to-noise ratios. Further, performing the boosting before the noise reduction filtering potentially introduces additional noise. If the boosting is performed before the noise reduction filtering, audible speech improvements may occur, especially in the lower frequencies.
Once the formant frequency ranges are determined, the formant frequencies are boosted (504). The frequencies may be boosted based on a number of factors. For example, only the center frequency may be boosted or the entire frequency range may be boosted. The level of boost may depend on the amount of boost provided to the last formant along with a maximum threshold in order to avoid clipping.
Embodiments of the invention may be implemented in whole or in part in any conventional computer programming language such as VHDL, SystemC, Verilog, ASM, etc. Alternative embodiments of the invention may be implemented as pre-programmed hardware elements, other related components, or as a combination of hardware and software components.
Embodiments can be implemented in whole or in part as a computer program product for use with a computer system. Such implementation may include a series of computer instructions fixed either on a tangible medium, such as a computer readable medium (e.g., a diskette, CD-ROM, ROM, or fixed disk) or transmittable to a computer system, via a modem or other interface device, such as a communications adapter connected to a network over a medium. The medium may be either a tangible medium (e.g., optical or analog communications lines) or a medium implemented with wireless techniques (e.g., microwave, infrared or other transmission techniques). The series of computer instructions embodies all or part of the functionality previously described herein with respect to the system. Those skilled in the art should appreciate that such computer instructions can be written in a number of programming languages for use with many computer architectures or operating systems. Furthermore, such instructions may be stored in any memory device, such as semiconductor, magnetic, optical or other memory devices, and may be transmitted using any communications technology, such as optical, infrared, microwave, or other transmission technologies. It is expected that such a computer program product may be distributed as a removable medium with accompanying printed or electronic documentation (e.g., shrink wrapped software), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over the network (e.g., the Internet or World Wide Web). Of course, some embodiments of the invention may be implemented as a combination of both software (e.g., a computer program product) and hardware. Still other embodiments of the invention are implemented as entirely hardware, or entirely software (e.g., a computer program product).
Although various exemplary embodiments of the invention have been disclosed, it should be apparent to those skilled in the art that various changes and modifications can be made which will achieve some of the advantages of the invention without departing from the true scope of the invention.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2012/053666 | 9/4/2012 | WO | 00 | 8/31/2015 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2014/039028 | 3/13/2014 | WO | A |
Number | Date | Country | |
---|---|---|---|
20160035370 A1 | Feb 2016 | US |