Claims
- 1. A method for determining if a plurality of parametric model data of a compressed bit stream contain voice data, comprising:
a) computing normalized signal levels for a plurality of frequency sub-bands of said compressed bit stream using at least one of said parametric model data;
b) determining a stability level for said compressed bit stream using at least one of said parametric model data;
c) estimating a background noise level for said frequency sub-bands based on at least one of said stability level and said normalized signal levels; and
d) identifying the presence of voice data in said compressed bit stream based on said estimation and said normalized signal levels.
- 2. The method according to claim 1, wherein said parametric model data comprise at least one of:
a) short term filter coefficients;
b) overall frame gain;
c) voice cutoff level; and
d) pitch.
- 3. The method according to claim 1, further comprising:
e) identifying periods of inactivity between identified voice data; and
f) removing said periods of inactivity from said compressed bit stream.
- 4. The method according to claim 2, wherein said short term filter coefficients comprise Line Spectral Frequency form coefficients.
- 5. The method according to claim 2, wherein said compressed bit stream is divided into frames, each frame having a corresponding plurality of parametric model data, and wherein said computing normalized signal levels comprises:
a) computing a spectral envelope of a frame based on said short term filter coefficients;
b) computing signal levels for said plurality of frequency sub-bands based on said spectral envelope;
c) calculating a frame gain based on said short term filter coefficients; and
d) normalizing said computed signal levels based on said overall frame gain and said frame gain based on said short term filter coefficients.
- 6. The method according to claim 1, wherein step b) comprises determining a frequency level of said compressed bit stream above which no voice activity is expected to be present, based on at least one of said parametric model data.
- 7. The method according to claim 5, wherein step c) comprises estimating and updating the background noise level present in each frame at each of said plurality of frequency sub-bands.
- 8. The method according to claim 1, wherein step d) comprises:
a) deciding if a voice signal is present based on at least one of said background noise estimate and said normalized signal levels; and
b) indicating the presence of voice activity.
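
For illustration only, the steps recited in claims 1, 3, 5, 7 and 8 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the method as claimed or disclosed: the claims give no formulas, so the function names, the four equal-width sub-bands, the 64-point envelope, the relative-change stability test, the smoothing factor `alpha`, and the decision `ratio` are all hypothetical choices. The short term filter coefficients are assumed to be already converted to direct-form predictor coefficients of A(z) = 1 + sum_k a_k z^-k (an LSF-to-LPC conversion, as in claim 4, is not shown).

```python
import cmath
import math


def spectral_envelope(lpc, n_bins=64):
    """Power envelope 1/|A(e^jw)|^2 with A(z) = 1 + sum_k lpc[k] * z^-(k+1)."""
    env = []
    for i in range(n_bins):
        w = math.pi * i / n_bins  # frequencies sampled in [0, pi)
        a = 1 + sum(c * cmath.exp(-1j * w * (k + 1)) for k, c in enumerate(lpc))
        env.append(1.0 / max(abs(a) ** 2, 1e-12))
    return env


def normalized_band_levels(lpc, overall_gain, n_bands=4, n_bins=64):
    """Claim 5 (assumed form): band level * overall gain / filter-implied gain."""
    env = spectral_envelope(lpc, n_bins)
    width = n_bins // n_bands
    bands = [sum(env[b * width:(b + 1) * width]) / width for b in range(n_bands)]
    filter_gain = sum(env) / n_bins  # gain implied by the filter coefficients alone
    return [overall_gain * level / filter_gain for level in bands]


def is_stable(levels, previous, tolerance=0.1):
    """Assumed stability measure: every band changed by at most `tolerance`."""
    return all(abs(l - p) <= tolerance * p for l, p in zip(levels, previous))


def update_noise(noise, levels, stable, alpha=0.9):
    """Per-band noise floor: follow drops at once, rises only on stable frames."""
    updated = []
    for n, l in zip(noise, levels):
        if l < n:
            updated.append(l)
        elif stable:
            updated.append(alpha * n + (1 - alpha) * l)
        else:
            updated.append(n)
    return updated


def detect_voice(frames, ratio=4.0):
    """Return one flag per (lpc, overall_gain) frame; True marks voice activity."""
    flags, noise, previous = [], None, None
    for lpc, gain in frames:
        levels = normalized_band_levels(lpc, gain)
        if noise is None:  # assume the first frame contains background noise only
            noise, previous = levels, levels
        flags.append(any(l > ratio * n for l, n in zip(levels, noise)))
        noise = update_noise(noise, levels, is_stable(levels, previous))
        previous = levels
    return flags


# Hypothetical frames: (predictor coefficients, overall frame gain)
frames = [([-0.5], 1.0), ([-0.5], 1.0), ([-0.9], 50.0), ([-0.5], 1.0)]
flags = detect_voice(frames)
# Claim 3: drop the frames identified as inactive
active = [frame for frame, voiced in zip(frames, flags) if voiced]
```

In this sketch the decision for a frame is made against the noise estimate from earlier frames, and the estimate is updated only afterwards, so a sudden loud frame cannot raise its own threshold. Everything operates on the parametric model data alone; no decoded audio samples are needed.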
CROSS-REFERENCE TO RELATED APPLICATIONS:
[0001] This application is a continuation of U.S. patent application Ser. No. 09/822,503 filed Apr. 2, 2001 (“Compressed Domain Universal Transcoder”).
Continuations (1)
| | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | 09822503 | Apr 2001 | US |
| Child | 10242465 | Sep 2002 | US |