Claims
- 1. A speech recognition method using a perceptual harmonic cepstral coefficient comprising:
a) processing a speech frame whereby to obtain a short-term power spectrum;
b) performing a robust pitch estimation;
c) using a peak-picking formula whereby to obtain a pitch harmonic;
d) applying class-dependent harmonic weighting whereby to obtain the harmonics weighted spectrum;
e) applying a mel-scaled filter to the harmonics weighted spectrum; and
f) computing the log energy output which is transformed into cepstrum by the discrete cosine transform.
- 2. The method of claim 1 wherein the processing of said speech frame is by Fast Fourier transform or Discrete Fourier transform.
- 3. The method of claim 1 in which the step of performing the robust pitch estimation is in accordance with the formula:
- 4. The method of claim 3 wherein the step of performing the robust pitch estimation, which allows for classification of the speech as voiced, unvoiced, or transitional, uses the spectro-temporal auto-correlation criterion R(τ), such that if R(τ)>αv, the speech frame is classified as voiced, if R(τ)<αu, the speech frame is classified as unvoiced, and if αv≧R(τ)≧αu, the speech frame is declared transitional, wherein αv=0.8 and αu=0.5.
- 5. The method of claim 1, in which the step of obtaining the harmonics weighted spectrum is in accordance with the formula:
- 6. The method of claim 5, wherein WH=100 for voiced sounds.
- 7. The method of claim 5, wherein WH=10 for transitional sounds.
- 8. The method of claim 1, in which the step of applying mel-scaled filters and computing the log energy output is in accordance with the formula:
- 9. The method of claim 1, further comprising: prior to performing the robust pitch estimation, applying the intensity-loudness power law to the power spectrum to obtain a root-power compressed spectrum.
- 10. A speech recognition method using a harmonic weighting function in accordance with the formula:
- 11. The method of claim 10 wherein ωT, η and γ are about 2.5 kHz, 0.5 and 10, respectively.
- 12. A speech recognition method using a harmonic weighting function in accordance with the formula:
- 13. The method of claim 12 wherein η and γ are 0.5 and 10, respectively.
- 14. The method of claim 12 wherein ωT is 4000π for transitional sounds.
- 15. The method of claim 12 wherein ωT is obtained for voiced sounds in accordance with the formula:
- 16. The method of claim 15 wherein ωT1 is 2000π and ωTh is 6000π.
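The voicing decision of claim 4 and the class-dependent weights of claims 6 and 7 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicants' implementation: the function names `classify_frame` and `harmonic_weight` are hypothetical, the value of R(τ) is assumed to have been computed elsewhere by the pitch estimator, and because the claims give no WH for unvoiced frames the sketch returns `None` in that case.

```python
from typing import Optional

ALPHA_V = 0.8  # voiced threshold (alpha_v) stated in claim 4
ALPHA_U = 0.5  # unvoiced threshold (alpha_u) stated in claim 4


def classify_frame(r_tau: float,
                   alpha_v: float = ALPHA_V,
                   alpha_u: float = ALPHA_U) -> str:
    """Classify a speech frame from its auto-correlation criterion R(tau).

    R(tau) > alpha_v            -> "voiced"
    R(tau) < alpha_u            -> "unvoiced"
    alpha_u <= R(tau) <= alpha_v -> "transitional"
    """
    if r_tau > alpha_v:
        return "voiced"
    if r_tau < alpha_u:
        return "unvoiced"
    return "transitional"


def harmonic_weight(frame_class: str) -> Optional[float]:
    """Return W_H for a frame class (hypothetical helper).

    Claim 6 gives W_H = 100 for voiced sounds and claim 7 gives
    W_H = 10 for transitional sounds. No value is claimed for
    unvoiced frames, so None is returned for any other class.
    """
    weights = {"voiced": 100.0, "transitional": 10.0}
    return weights.get(frame_class)


if __name__ == "__main__":
    # Illustrative R(tau) values only; they are not data from the source.
    for r in (0.95, 0.65, 0.20):
        cls = classify_frame(r)
        print(r, cls, harmonic_weight(cls))
```

Both boundary values (R(τ) equal to 0.8 or to 0.5) fall in the transitional class, matching the inclusive inequality in claim 4.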
Priority Claims (1)
Number | Date | Country | Kind
60237285 | Oct 2000 | US |
CROSS REFERENCE TO RELATED APPLICATION
[0001] This application claims the benefit of U.S. Provisional Patent Application No. 60/237,285, which application is herein incorporated by reference.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
[0002] This invention was made with Government support under Grant No. IIS-9978001, awarded by the National Science Foundation. The Government has certain rights in this invention.
PCT Information
Filing Document | Filing Date | Country | Kind
PCT/US01/30909 | 10/2/2001 | WO |