Claims
- 1. A system for suppressing background noise in audio data, comprising: a detector configured to perform a manipulation process on said audio data, said detector including a filter bank that generates filtered channel energy by separating said audio data into discrete frequency channels, said detector including a weighting module that weights selected components of said audio data to suppress said background noise, said weighting module generating noise-suppressed channel energy by applying separate weighting values directly to each of said discrete frequency channels of said filtered channel energy, said separate weighting values being related to background noise values of said discrete frequency channels; and a processor coupled to said system to control said detector for suppressing said background noise.
- 2. The system of claim 1 wherein said audio data includes speech information.
- 3. The system of claim 2 wherein said detector comprises a speech detector that includes program instructions which are stored in a memory device coupled to said processor, said speech detector weighting said selected components of said audio data to suppress said background noise.
- 4. The system of claim 3 wherein said speech information includes digital source speech data that is provided to said speech detector by an analog sound sensor and an analog-to-digital converter.
- 5. The system of claim 4 wherein said speech detector comprises a noise suppressor, said noise suppressor including a noise calculator, a speech energy calculator, and said weighting module.
- 6. A system for suppressing background noise in audio data, comprising: a detector configured to perform a manipulation process on said audio data that includes digital source speech data provided to said speech detector by an analog sound sensor and an analog-to-digital converter, said detector including a filter bank that generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said detector including a speech detector with program instructions that are stored in a memory device, said speech detector including a noise suppressor with a noise calculator, a speech energy calculator, and a weighting module, said speech detector weighting selected components of said audio data to suppress said background noise, said noise calculator calculating background noise values during a silent segment of said audio data, said silent segment being located below an ending noise-calculation threshold that is expressed by the formula: Te + 0.125(Ter − Te) where Te is an ending threshold of said audio data and Ter is an ending threshold of a reliable island in said audio data; and a processor coupled to said system to control said detector for suppressing said background noise.
- 7. A system for suppressing background noise in audio data, comprising: a detector configured to perform a manipulation process on said audio data that includes digital source speech data provided to said speech detector by an analog sound sensor and an analog-to-digital converter, said detector including a filter bank that generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said detector including a speech detector with program instructions that are stored in a memory device, said speech detector including a noise suppressor with a noise calculator, a speech energy calculator, and a weighting module, said speech detector weighting selected components of said audio data to suppress said background noise, said noise calculator calculating background noise values during a silent segment of said audio data, said silent segment being located below a beginning noise-calculation threshold that is expressed by the formula: Ts + 0.125(Tsr − Ts) where Ts is a beginning threshold of said audio data and Tsr is a beginning threshold of a reliable island in said audio data; and a processor coupled to said system to control said detector for suppressing said background noise.
- 8. The system of claim 5 wherein said noise calculator derives a channel average background noise value “Ni(m)” for a channel m at a frame i by using an iterative equation $N_i(m) = \alpha N_{i-1}(m) + (1-\alpha)\, y_i(m), \quad m = 0, 1, \ldots, M-1$ where said yi(m) is a signal energy during a silent segment of said channel m at said frame i, said M is a total number of said discrete frequency channels, and said α is a forgetting factor.
- 9. A system for suppressing background noise in audio data, comprising: a detector configured to perform a manipulation process on said audio data that includes digital source speech data provided to said speech detector by an analog sound sensor and an analog-to-digital converter, said detector including a filter bank that generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said detector including a speech detector with program instructions that are stored in a memory device, said speech detector including a noise suppressor with a noise calculator, a speech energy calculator, and a weighting module, said speech detector weighting selected components of said audio data to suppress said background noise, said noise calculator deriving a channel average background noise value “Ni(m)” for a channel m at a frame i by using an iterative equation $N_i(m) = \alpha N_{i-1}(m) + (1-\alpha)\, y_i(m), \quad m = 0, 1, \ldots, M-1$ where said yi(m) is a signal energy during a silent segment of said channel m at said frame i, said M is a total number of said discrete frequency channels, and said α is a forgetting factor, said α being equal to 0.985 which is equivalent to a window size of 145 frames; and a processor coupled to said system to control said detector for suppressing said background noise.
- 10. A system for suppressing background noise in audio data, comprising: a detector configured to perform a manipulation process on said audio data that includes digital source speech data provided to said speech detector by an analog sound sensor and an analog-to-digital converter, said detector including a filter bank that generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said detector including a speech detector with program instructions that are stored in a memory device, said speech detector including a noise suppressor with a noise calculator, a speech energy calculator, and a weighting module, said speech detector weighting selected components of said audio data to suppress said background noise, said noise calculator utilizing a non-linear spectrum subtraction procedure that removes a mean value and produces a channel average background noise variance value “Vi(m)” for a channel m at a frame i, said channel average background noise variance value “Vi(m)” for said channel m at said frame i being calculated using an iterative equation $V_i(m) = \alpha V_{i-1}(m) + (1-\alpha)\, |y_i(m) - N_i(m)|, \quad m = 0, 1, \ldots, M-1$ where said yi(m) is a signal energy during a silent segment of said channel m at said frame i, said Ni(m) is a channel average background noise value, said M is a total number of said discrete frequency channels, and said α is a forgetting factor; and a processor coupled to said system to control said detector for suppressing said background noise.
- 11. The system of claim 10 wherein said α is equal to 0.985 which is equivalent to a window size of 145 frames.
- 12. A system for suppressing background noise in audio data, comprising: a detector configured to perform a manipulation process on said audio data that includes digital source speech data provided to said speech detector by an analog sound sensor and an analog-to-digital converter, said detector including a filter bank that generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said detector including a speech detector with program instructions that are stored in a memory device, said speech detector including a noise suppressor with a noise calculator, a speech energy calculator, and a weighting module, said speech detector weighting selected components of said audio data to suppress said background noise, said weighting module generating noise-suppressed channel energy by applying separate weighting values to each of said discrete frequency channels of said filtered channel energy, said separate weighting values being related to background noise values of said discrete frequency channels; and a processor coupled to said system to control said detector for suppressing said background noise.
- 13. The system of claim 12 wherein said noise-suppressed channel energy “ET” equals a summation of said filtered channel energy from each of said discrete frequency channels “Ei” multiplied by a corresponding one of said weighting values “wi”.
- 14. The system of claim 13 wherein said noise-suppressed channel energy “ET” is defined by a formula: $E_T = \sum_i w_i E_i, \quad i = 0, 1, \ldots, p-1$ where said Ei is a channel energy of said discrete frequency channels.
- 15. The system of claim 12 wherein said weighting module calculates a weighting value “wi(m)” for said channel “i” using a formula $w_i(m) = 1/V_i(m)$ where “Vi(m)” is a channel average background noise variance value for said channel “i” from said filter bank.
- 16. The system of claim 12 wherein said weighting module calculates a weighting value “wi(m)” for said channel “i” using a formula $w_i(m) = 1/\mathrm{MINV}$ where MINV is a minimum variance of channel background noise, said MINV implementing a saturation limit to reduce a dynamic range of said weighting value “wi(m)” when a channel average background noise variance value “Vi(m)” is less than said MINV.
- 17. The system of claim 16 wherein said MINV is equal to one of a value between 0.0001 and 0.0002, and a value equal to 0.00013.
- 18. The system of claim 12 wherein an endpoint detector analyzes said noise-suppressed channel energy to generate an endpoint signal.
- 19. The system of claim 18 wherein said endpoint detector calculates endpoint detection parameters according to a formula $DTF(i) = \sum_{m=0}^{M-1} y_i(m)\, w_i(m)$ where said wi(m) is a respective weighting value, said yi(m) is a channel signal energy value of said channel m at said frame i, and said M is a total number of said channels of said filter bank.
- 20. The system of claim 19 wherein a recognizer analyzes said endpoint signals and feature vectors from a feature extractor to generate a speech detection result for said speech detector.
- 21. A method for suppressing background noise in audio data, comprising: performing a manipulation process on said audio data using a detector that includes a filter bank that generates filtered channel energy by separating said audio data into discrete frequency channels, said detector including a weighting module that weights selected components of said audio data to suppress said background noise, said weighting module generating noise-suppressed channel energy by applying separate weighting values directly to each of said discrete frequency channels of said filtered channel energy, said separate weighting values being related to background noise values of said discrete frequency channels; and controlling said detector with a processor to thereby suppress said background noise.
- 22. The method of claim 21 wherein said audio data includes speech information.
- 23. The method of claim 22 wherein said detector comprises a speech detector that includes program instructions which are stored in a memory device coupled to said processor, said speech detector weighting said selected components of said audio data to suppress said background noise.
- 24. The method of claim 23 wherein said speech information includes digital source speech data that is provided to said speech detector by an analog sound sensor and an analog-to-digital converter.
- 25. The method of claim 24 wherein said speech detector comprises a noise suppressor, said noise suppressor including a noise calculator, a speech energy calculator, and said weighting module.
- 26. A method for suppressing background noise in audio data, comprising: performing a manipulation process on said audio data using a detector, said audio data including digital source speech data provided to said speech detector by an analog sound sensor and an analog-to-digital converter, said detector including a filter bank that generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said detector including a speech detector with program instructions that are stored in a memory device, said speech detector including a noise suppressor with a noise calculator, a speech energy calculator, and a weighting module, said speech detector weighting selected components of said audio data to suppress said background noise, said noise calculator calculating background noise values during a silent segment of said audio data, said silent segment being located below an ending noise-calculation threshold that is expressed by the formula: Te + 0.125(Ter − Te) where Te is an ending threshold of said audio data and Ter is an ending threshold of a reliable island in said audio data; and controlling said detector with a processor to thereby suppress said background noise.
- 27. A method for suppressing background noise in audio data, comprising: performing a manipulation process on said audio data using a detector, said audio data including digital source speech data provided to said speech detector by an analog sound sensor and an analog-to-digital converter, said detector including a filter bank that generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said detector including a speech detector with program instructions that are stored in a memory device, said speech detector including a noise suppressor with a noise calculator, a speech energy calculator, and a weighting module, said speech detector weighting selected components of said audio data to suppress said background noise, said noise calculator calculating background noise values during a silent segment of said audio data, said silent segment being located below a beginning noise-calculation threshold that is expressed by the formula: Ts + 0.125(Tsr − Ts) where Ts is a beginning threshold of said audio data and Tsr is a beginning threshold of a reliable island in said audio data; and controlling said detector with a processor to thereby suppress said background noise.
- 28. The method of claim 25 wherein said noise calculator derives a channel average background noise value “Ni(m)” for a channel m at a frame i by using an iterative equation $N_i(m) = \alpha N_{i-1}(m) + (1-\alpha)\, y_i(m), \quad m = 0, 1, \ldots, M-1$ where said yi(m) is a signal energy during a silent segment of said channel m at said frame i, said M is a total number of said discrete frequency channels, and said α is a forgetting factor.
- 29. A method for suppressing background noise in audio data, comprising: performing a manipulation process on said audio data using a detector, said audio data including digital source speech data provided to said speech detector by an analog sound sensor and an analog-to-digital converter, said detector including a filter bank that generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said detector including a speech detector with program instructions that are stored in a memory device, said speech detector including a noise suppressor with a noise calculator, a speech energy calculator, and a weighting module, said speech detector weighting selected components of said audio data to suppress said background noise, said noise calculator deriving a channel average background noise value “Ni(m)” for a channel m at a frame i by using an iterative equation $N_i(m) = \alpha N_{i-1}(m) + (1-\alpha)\, y_i(m), \quad m = 0, 1, \ldots, M-1$ where said yi(m) is a signal energy during a silent segment of said channel m at said frame i, said M is a total number of said discrete frequency channels, and said α is a forgetting factor, said α being equal to 0.985 which is equivalent to a window size of 145 frames; and controlling said detector with a processor to thereby suppress said background noise.
- 30. A method for suppressing background noise in audio data, comprising: performing a manipulation process on said audio data using a detector, said audio data including digital source speech data provided to said speech detector by an analog sound sensor and an analog-to-digital converter, said detector including a filter bank that generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said detector including a speech detector with program instructions that are stored in a memory device, said speech detector including a noise suppressor with a noise calculator, a speech energy calculator, and a weighting module, said speech detector weighting selected components of said audio data to suppress said background noise, said noise calculator utilizing a non-linear spectrum subtraction procedure that removes a mean value and produces a channel average background noise variance value “Vi(m)” for a channel m at a frame i, said channel average background noise variance value “Vi(m)” for said channel m at said frame i being calculated using an iterative equation $V_i(m) = \alpha V_{i-1}(m) + (1-\alpha)\, |y_i(m) - N_i(m)|, \quad m = 0, 1, \ldots, M-1$ where said yi(m) is a signal energy during a silent segment of said channel m at said frame i, said Ni(m) is a channel average background noise value, said M is a total number of said discrete frequency channels, and said α is a forgetting factor; and controlling said detector with a processor to thereby suppress said background noise.
- 31. The method of claim 30 wherein said α is equal to 0.985 which is equivalent to a window size of 145 frames.
- 32. A method for suppressing background noise in audio data, comprising: performing a manipulation process on said audio data using a detector, said audio data including digital source speech data provided to said speech detector by an analog sound sensor and an analog-to-digital converter, said detector including a filter bank that generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said detector including a speech detector with program instructions that are stored in a memory device, said speech detector including a noise suppressor with a noise calculator, a speech energy calculator, and a weighting module, said speech detector weighting selected components of said audio data to suppress said background noise, said weighting module generating noise-suppressed channel energy by applying separate weighting values to each of said discrete frequency channels of said filtered channel energy, said separate weighting values being related to background noise values of said discrete frequency channels; and controlling said detector with a processor to thereby suppress said background noise.
- 33. The method of claim 32 wherein said noise-suppressed channel energy “ET” equals a summation of said filtered channel energy from each of said discrete frequency channels “Ei” multiplied by a corresponding one of said weighting values “wi”.
- 34. The method of claim 33 wherein said noise-suppressed channel energy “ET” is defined by a formula: $E_T = \sum_i w_i E_i, \quad i = 0, 1, \ldots, p-1$ where said Ei is a channel energy of said discrete frequency channels.
- 35. The method of claim 32 wherein said weighting module calculates a weighting value “wi(m)” for said channel “i” using a formula $w_i(m) = 1/V_i(m)$ where “Vi(m)” is a channel average background noise variance value for said channel “i” from said filter bank.
- 36. The method of claim 32 wherein said weighting module calculates a weighting value “wi(m)” for said channel “i” using a formula $w_i(m) = 1/\mathrm{MINV}$ where MINV is a minimum variance of channel background noise, said MINV implementing a saturation limit to reduce a dynamic range of said weighting value “wi(m)” when a channel average background noise variance value “Vi(m)” is less than said MINV.
- 37. The method of claim 36 wherein said MINV is equal to one of a value between 0.0001 and 0.0002, and a value equal to 0.00013.
- 38. The method of claim 32 wherein an endpoint detector analyzes said noise-suppressed channel energy to generate an endpoint signal.
- 39. The method of claim 38 wherein said endpoint detector calculates endpoint detection parameters according to a formula $DTF(i) = \sum_{m=0}^{M-1} y_i(m)\, w_i(m)$ where said wi(m) is a respective weighting value, said yi(m) is a channel signal energy value of said channel m at said frame i, and said M is a total number of said channels of said filter bank.
- 40. The method of claim 39 wherein a recognizer analyzes said endpoint signals and feature vectors from a feature extractor to generate a speech detection result for said speech detector.
- 41. A computer-readable medium comprising program instructions for suppressing background noise by: performing a manipulation process on said audio data using a detector that includes a filter bank that generates filtered channel energy by separating said audio data into discrete frequency channels, said detector including a weighting module that weights selected components of said audio data to suppress said background noise, said weighting module generating noise-suppressed channel energy by applying separate weighting values directly to each of said discrete frequency channels of said filtered channel energy, said separate weighting values being related to background noise values of said discrete frequency channels; and controlling said detector with a processor to thereby suppress said background noise.
- 42. A system for suppressing background noise in audio data, comprising: means for performing a manipulation process on said audio data, said means for performing including a filter bank that generates filtered channel energy by separating said audio data into discrete frequency channels, said means for performing also including a weighting module that weights selected components of said audio data to suppress said background noise, said weighting module generating noise-suppressed channel energy by applying separate weighting values directly to each of said discrete frequency channels of said filtered channel energy, said separate weighting values being related to background noise values of said discrete frequency channels; and means for controlling said means for performing to thereby suppress said background noise.
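The formulas recited in the claims above (the recursive noise mean and variance, the inverse-variance weights with the MINV saturation limit, the weighted frame energy DTF, and the noise-calculation thresholds) can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names NoiseTracker and noise_calc_threshold, the zero initial state, and the use of the freshly updated mean inside the variance update are choices made here for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Sequence

ALPHA = 0.985   # forgetting factor value recited in the claims
MINV = 0.00013  # minimum-variance value recited in the claims


def noise_calc_threshold(t: float, t_reliable: float) -> float:
    """Return T + 0.125 * (Tr - T), the form of the beginning and
    ending noise-calculation thresholds (hypothetical helper name)."""
    return t + 0.125 * (t_reliable - t)


@dataclass
class NoiseTracker:
    """Per-channel background-noise statistics (hypothetical class)."""
    num_channels: int
    alpha: float = ALPHA
    minv: float = MINV
    mean: List[float] = field(init=False)  # N_i(m)
    var: List[float] = field(init=False)   # V_i(m)

    def __post_init__(self) -> None:
        # Assumption: statistics start at zero; the claims do not
        # state an initial condition.
        self.mean = [0.0] * self.num_channels
        self.var = [0.0] * self.num_channels

    def update(self, y: Sequence[float]) -> None:
        """Fold one silent-segment frame of channel energies y_i(m)
        into the running mean and variance."""
        a = self.alpha
        for m, ym in enumerate(y):
            # N_i(m) = a * N_{i-1}(m) + (1 - a) * y_i(m)
            self.mean[m] = a * self.mean[m] + (1.0 - a) * ym
            # V_i(m) = a * V_{i-1}(m) + (1 - a) * |y_i(m) - N_i(m)|
            self.var[m] = a * self.var[m] + (1.0 - a) * abs(ym - self.mean[m])

    def weights(self) -> List[float]:
        """w_i(m) = 1 / V_i(m), saturating at 1 / MINV when
        V_i(m) is less than MINV."""
        return [1.0 / max(v, self.minv) for v in self.var]

    def dtf(self, y: Sequence[float]) -> float:
        """DTF(i) = sum over m of y_i(m) * w_i(m)."""
        return sum(ym * wm for ym, wm in zip(y, self.weights()))
```

In use, update would be called only on frames that fall below the noise-calculation threshold, and dtf would be evaluated on every frame to feed an endpoint detector. Channels with steady noise (small variance) receive large weights, and the MINV floor keeps those weights from growing without bound.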
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority as a Continuation-in-Part application of U.S. patent application Ser. No. 09/176,178, entitled “Method For Suppressing Background Noise In A Speech Detection System,” filed on Oct. 21, 1998, now U.S. Pat. No. 6,230,122. This application also relates to, and claims priority in, U.S. Provisional Patent Application No. 60/160,842, entitled “Method For Implementing A Noise Suppressor In A Speech Recognition System,” filed on Oct. 21, 1999, and Provisional Pat. Application Ser. No. 60/099,599, filed Sep. 9, 1998. The foregoing related applications are commonly assigned, and are hereby incorporated by reference.
Provisional Applications (2)
| Number | Date | Country |
| --- | --- | --- |
| 60/160842 | Oct 1999 | US |
| 60/099599 | Sep 1998 | US |
Continuation in Parts (1)
| | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | 09/176178 | Oct 1998 | US |
| Child | 09/691878 | | US |