Claims
- 1. A method for determining a voicing probability of a speech signal comprising the steps of: generating an original speech spectrum Sω(ω) of the speech signal, where ω is a frequency; generating a synthetic speech spectrum Ŝω(ω) from the original speech spectrum Sω(ω) based on the assumption that the speech signal is purely voiced; dividing the original speech spectrum Sω(ω) and the synthetic speech spectrum Ŝω(ω) into a plurality of bands B each containing a plurality of frequencies ω; comparing said original and synthetic speech spectra within each band; and determining a voicing probability for each band on the basis of said comparison, wherein said voicing probability is an energy ratio between a total number of voiced harmonics within each band and a total number of harmonics within each band.
- 2. A method according to claim 1, where ω represents a harmonic of a fundamental frequency of said speech signal, and said comparing step comprises comparing the original speech spectrum and the synthetic speech spectrum for each harmonic of each band b of the plurality of bands B to determine a difference between the original speech spectrum and the synthetic speech spectrum for each harmonic of each band b of the plurality of decision bands B; and said determining step comprises: determining whether each harmonic of the original speech spectrum is voiced, V(k)=1, or unvoiced, V(k)=0, based on the difference between the original speech spectrum and the synthetic speech spectrum for each harmonic k, wherein V(k) is a binary voicing determination, 1<k≦L, and L is the total number of harmonics within a 4 kHz speech band; and determining a voicing probability Pv(b) for each band b, wherein $P_v(b) = \frac{\sum_{k \in W_b} V(k)\,(A(k))^2}{\sum_{k \in W_b} (A(k))^2}$, where A(k) is a spectral amplitude for the kth harmonic in bth band.
- 3. A method for determining a voicing probability of a speech signal according to claim 2, wherein said step of generating a synthetic speech spectrum comprises the steps of: sampling the original speech spectrum at harmonics of a fundamental frequency of said speech signal to obtain a harmonic magnitude of each harmonic; generating a harmonic lobe for each harmonic based on the harmonic magnitude of each harmonic; and normalizing the harmonic lobe for each harmonic to have a peak amplitude which is equal to the harmonic magnitude of each harmonic to generate the synthetic speech spectrum.
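
The band voicing probability of claim 2 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the per-harmonic `errors` input (a hypothetical normalized difference between the original and synthetic spectra), and the `threshold` value are assumptions made for illustration, since the claims do not state how the difference is turned into the binary decision V(k). Harmonics are indexed from 0 here, and each band is given as a list of harmonic indices standing in for the set W_b.

```python
from typing import List, Sequence


def harmonic_voicing(errors: Sequence[float], threshold: float = 0.2) -> List[int]:
    """Return V(k) per harmonic: 1 if voiced, 0 if unvoiced.

    Assumption: `errors[k]` is some normalized difference between the
    original and synthetic spectra around harmonic k, and a harmonic is
    declared voiced when that difference is below `threshold`. Both the
    measure and the threshold are hypothetical.
    """
    return [1 if e < threshold else 0 for e in errors]


def band_voicing_probability(
    amplitudes: Sequence[float],
    voicing: Sequence[int],
    bands: Sequence[Sequence[int]],
) -> List[float]:
    """Compute Pv(b) = sum_{k in W_b} V(k) A(k)^2 / sum_{k in W_b} A(k)^2."""
    probabilities = []
    for band in bands:
        total_energy = sum(amplitudes[k] ** 2 for k in band)
        voiced_energy = sum(voicing[k] * amplitudes[k] ** 2 for k in band)
        # An empty or zero-energy band is treated as unvoiced (assumption).
        probabilities.append(voiced_energy / total_energy if total_energy > 0 else 0.0)
    return probabilities


if __name__ == "__main__":
    amplitudes = [3.0, 4.0, 2.0, 1.0]   # A(k), illustrative values
    errors = [0.05, 0.60, 0.10, 0.90]   # hypothetical difference measures
    bands = [[0, 1], [2, 3]]            # two bands of two harmonics each
    v = harmonic_voicing(errors)
    print(v)
    print(band_voicing_probability(amplitudes, v, bands))
```

Because each harmonic is weighted by its energy, a band in which one strong harmonic is voiced and one weak harmonic is unvoiced receives a probability close to 1, rather than the 0.5 that a plain count of harmonics would give.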
Parent Case Info
This is a continuation of application Ser. No. 09/255,263, filed Feb. 23, 1999, now U.S. Pat. No. 6,253,171, issued Jun. 26, 2001, the disclosure of which is incorporated herein by reference.
US Referenced Citations (4)
| Number | Name | Date | Kind |
| --- | --- | --- | --- |
| 5715365 | Griffin et al. | Feb 1998 | A |
| 5774837 | Yeldener et al. | Jun 1998 | A |
| 5890108 | Yeldener | Mar 1999 | A |
| 6052658 | Wang et al. | Apr 2000 | A |
Non-Patent Literature Citations (2)
| Entry |
| --- |
| Daniel Wayne Griffin and Jae S. Lim, “Multiband Excitation Coder,” IEEE Trans on Acoustics, Speech, and Signal Processing, vol. 36, No. 8, p. 1223-1235, Aug. 1988. |
| Suat Yeldener and Marion R. Baraniecki, “A Mixed Harmonic Excitation Linear Predictive Speech Coding For Low Bit Rate Applications,” Proc. 32nd IEEE Asilomar Conference on Signals, Systems & Computers, vol. 1, p. 348-351, Nov. 1998. |
Continuations (1)
| | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | 09/255263 | Feb 1999 | US |
| Child | 09/794150 | | US |