Claims
- 1. A method for detecting speech activity for a signal, the method comprising the steps of:
extracting a plurality of features from the signal;
modeling a first and a second probability density functions (PDFs) of the plurality of features, wherein: the first PDF models active speech conditions for the signal, and the second PDF models inactive speech conditions for the signal;
adapting the first and second PDFs to respond to changes in the signal over time;
probability-based classifying of the signal based, at least in part, on the plurality of features; and
distinguishing speech in the signal based, at least in part, upon the probability-based classifying step.
- 2. The method for detecting speech activity for the signal as recited in claim 1, wherein the probability-based classifying step uses the first and second PDFs.
- 3. The method for detecting speech activity for the signal as recited in claim 1, wherein the modeling step comprises a step of determining a mathematical model for the signal from the plurality of features.
- 4. The method for detecting speech activity for the signal as recited in claim 1, wherein the adapting step comprises a step of increasing a likelihood.
- 5. The method for detecting speech activity for the signal as recited in claim 1, wherein the adapting step comprises a step of identifying extreme values in a long sequence of previous frames.
- 6. The method for detecting speech activity for the signal as recited in claim 1, wherein the probability-based classifying step comprises a step of classifying based on likelihood ratio detection.
- 7. The method for detecting speech activity for the signal as recited in claim 1, wherein the probability-based classifying step comprises applying a log-likelihood ratio test to one of the plurality of features.
- 8. The method for detecting speech activity for the signal as recited in claim 1, wherein at least one of the first and second PDFs comprises a Gaussian mixture model.
- 9. The method for detecting speech activity for the signal as recited in claim 1, wherein at least one of the first and second PDFs uses a non-Gaussian model.
- 10. The method for detecting speech activity for the signal as recited in claim 1, wherein at least one of the first and second PDFs comprises a plurality of basic density models.
- 11. The method for detecting speech activity for the signal as recited in claim 1, wherein at least one of the plurality of features is related to power in a spectral band of the signal.
- 12. The method for detecting speech activity for the signal as recited in claim 1, further comprising a step of smoothing an activity decision for hangover periods to produce a smoothed activity decision.
- 13. A computer-readable medium having computer-executable instructions for performing the computer-implementable method for detecting speech activity for the signal of claim 1.
- 14. A method for detecting sound activity for a signal, the method comprising the steps of:
extracting a plurality of features from the signal;
modeling an active speech probability density function (PDF) of the plurality of features;
modeling an inactive speech PDF of the plurality of features;
adapting the active and inactive speech PDFs to respond to changes in the signal over time;
probability-based classifying of the signal based, at least in part, on the plurality of features; and
distinguishing speech in the signal based, at least in part, upon the probability-based classifying step.
- 15. The method for detecting sound activity for the signal as recited in claim 14, wherein the probability-based classifying step uses the active and inactive speech PDFs.
- 16. The method for detecting sound activity for the signal as recited in claim 14, wherein the adapting step comprises a step of increasing a likelihood.
- 17. The method for detecting sound activity for the signal as recited in claim 14, wherein at least one of the active and inactive speech PDFs uses a non-Gaussian model.
- 18. A computer-readable medium having computer-executable instructions for performing the computer-implementable method for detecting sound activity for the signal of claim 14.
- 19. A method for detecting sound activity for a signal, the method comprising the steps of:
extracting a plurality of features from the signal;
modeling an active speech probability density function (PDF) of the plurality of features;
modeling an inactive speech PDF of the plurality of features, wherein at least one of the active and inactive speech PDFs uses a non-Gaussian model;
adapting the active and inactive speech PDFs to respond to changes in the signal over time;
probability-based classifying of the signal based, at least in part, on the active and inactive speech PDFs; and
distinguishing speech in the signal based, at least in part, upon the probability-based classifying step.
- 20. The method for detecting sound activity for the signal as recited in claim 19, wherein both the active and inactive speech PDFs use a non-Gaussian model.
- 21. A computer-readable medium having computer-executable instructions for performing the computer-implementable method for detecting sound activity for the signal of claim 19.
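By way of illustration only, the steps recited in the method claims above (feature extraction, active and inactive PDF modeling, adaptation over time, likelihood ratio classification, and hangover smoothing) may be sketched as follows. This is a minimal sketch under stated assumptions and is not the claimed implementation: the two features (frame log-energy and zero-crossing rate), the single diagonal Gaussian per class, the exponential-forgetting update used as the adaptation step, and all names and default values (`extract_features`, `DiagGaussian`, `detect_speech`, `rate`, `threshold`, `hangover`) are hypothetical choices made for the example.

```python
import math


def extract_features(frame):
    """Return (log-energy, zero-crossing rate) for one frame of samples.

    Assumption: these two features stand in for the claimed
    "plurality of features"; the claims do not fix a feature set.
    """
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return (math.log(energy + 1e-10), crossings / (len(frame) - 1))


class DiagGaussian:
    """Diagonal-covariance Gaussian PDF (a simplification of a mixture model)."""

    def __init__(self, mean, var):
        self.mean = list(mean)
        self.var = list(var)

    def log_pdf(self, x):
        # Sum of per-dimension Gaussian log-densities.
        return sum(
            -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, self.mean, self.var)
        )

    def adapt(self, x, rate=0.05, var_floor=1e-3):
        # Assumed adaptation rule: exponentially weighted mean and variance.
        for i, xi in enumerate(x):
            delta = xi - self.mean[i]
            self.mean[i] += rate * delta
            self.var[i] = max(
                var_floor, (1 - rate) * (self.var[i] + rate * delta * delta)
            )


def detect_speech(frames, active, inactive, threshold=0.0, hangover=3):
    """Return one smoothed activity decision (True/False) per frame."""
    decisions = []
    hold = 0  # frames of hangover remaining after the last active frame
    for frame in frames:
        x = extract_features(frame)
        # Log-likelihood ratio test between the active and inactive PDFs.
        llr = active.log_pdf(x) - inactive.log_pdf(x)
        is_active = llr > threshold
        # Adapt only the PDF of the class the frame was assigned to.
        (active if is_active else inactive).adapt(x)
        if is_active:
            hold = hangover
            decisions.append(True)
        elif hold > 0:
            hold -= 1
            decisions.append(True)  # hangover keeps the decision active
        else:
            decisions.append(False)
    return decisions
```

In this sketch the hangover counter holds the smoothed decision active for a fixed number of frames after the raw decision falls inactive, which is one simple way to avoid clipping the trailing edge of an utterance. A non-Gaussian model or a Gaussian mixture model could be substituted for `DiagGaussian` provided it exposes the same `log_pdf` and `adapt` operations.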
Parent Case Info
[0001] This application claims the benefit of U.S. Provisional Patent Application No. 60/251,749, filed on Dec. 4, 2000.
Provisional Applications (1)
| Number | Date | Country |
|---|---|---|
| 60251749 | Dec 2000 | US |