Claims
- 1. A method for low bit rate speech coding of unvoiced speech, comprising:
identifying an incoming speech frame as an unvoiced speech frame; performing linear predictive analysis on the unvoiced speech frame to create an unvoiced linear predictive residue; extracting high-time-resolution energy parameters from the unvoiced linear predictive residue; encoding the high-time-resolution energy parameters; quantizing the high-time-resolution energy parameters to form quantized energy vectors; forming a high-time-resolution energy envelope; generating a quantized unvoiced residue by coloring random noise with the high-time-resolution energy envelope; and generating a quantized unvoiced speech frame.
- 2. The method of claim 1 wherein the extracting high-time-resolution energy parameters comprises extracting a number (M) of local energy parameters Ei, where i=1, 2, . . . , M, from an unvoiced residue R[n] by performing the following steps:
dividing N-sample residue R[n] into (M−2) sub-blocks Xi, where i=2, 3, . . . , M−1, with each block Xi having a length of L=N/(M−2); obtaining an L-sample past residue block X1 from a past quantized residue of a previous frame; obtaining an L-sample future residue block XM from the linear predictive residue of a following frame; and creating a number M of local energy parameters Ei, where i=1, 2, . . . , M, from each of the M blocks Xi, where i=1, 2, . . . , M, in accordance with the following equation:
$$E_i = \frac{1}{L} \sum_{m=1}^{L} X_i[m] \cdot X_i[m].$$
- 3. The method of claim 1 wherein the forming a high-time-resolution energy envelope comprises using look ahead parameter values from a next frame and previous parameter values from a preceding frame to smooth the energy envelope for a current frame at the frame boundaries.
- 4. The method of claim 1 wherein the forming a high-time-resolution energy envelope comprises forming an N-sample high-time-resolution energy envelope ENV[n], where N is the length of a speech frame and n=1, 2, 3, . . . , N, from decoded energy values Wi, where i=1, 2, 3, . . . , M, in accordance with the following computations where:
M energy values represent the energies of M−2 sub-frames of a current residue of speech, each sub-frame having a length L=N/M; values W1 and WM represent the energy of the past L samples of the last frame of residue and the energy of the future L samples of the next frame of residue, respectively; and Wm−1, Wm, and Wm+1 are representative of the energies of the (m−1)-th, m-th, and (m+1)-th sub-band, respectively; and samples of the energy envelope ENV[n], for n=m*L−L/2 to n=m*L+L/2, representing the m-th sub-frame are computed as:
$$\mathrm{ENV}[n] = \sqrt{W_{m-1}} + \frac{1}{L}\,(n - mL + L)\left(\sqrt{W_m} - \sqrt{W_{m-1}}\right)$$
for n=m*L−L/2 until n=m*L; and
$$\mathrm{ENV}[n] = \sqrt{W_m} + \frac{1}{L}\,(n - mL)\left(\sqrt{W_{m+1}} - \sqrt{W_m}\right)$$
for n=m*L until n=m*L+L/2, wherein the steps for computing the energy envelope ENV[n] are repeated for each of the M−1 bands, letting m=2, 3, 4, . . . , M, to compute the entire energy envelope ENV[n], where n=1, 2, . . . , N, for a current residue frame.
- 5. The method of claim 1 wherein the encoding the high-time-resolution energy parameters comprises encoding the energy parameters according to a pyramid vector quantization method.
- 6. A speech coder for low bit rate speech coding of unvoiced speech, comprising:
means for identifying an incoming speech frame as an unvoiced speech frame; means for performing linear predictive analysis on the unvoiced speech frame to create an unvoiced linear predictive residue; means for extracting high-time-resolution energy parameters from the unvoiced linear predictive residue; means for encoding the high-time-resolution energy parameters; means for quantizing the high-time-resolution energy parameters to form quantized energy vectors; means for forming a high-time-resolution energy envelope; means for generating a quantized unvoiced residue by coloring random noise with the high-time-resolution energy envelope; and means for generating a quantized unvoiced speech frame.
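For illustration only, the energy extraction of claim 2, the envelope interpolation of claim 4, and the noise-coloring step of claim 1 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation. The function names (`local_energies`, `energy_envelope`, `color_noise`) are hypothetical. The sketch assumes 0-based sample indices, a sub-block length L=N/(M−2) as in claim 2, interpolation knots placed at sub-block centres (one possible reading of the indexing in claim 4), and coloring by sample-wise scaling of unit-variance Gaussian noise. Quantization and linear predictive synthesis are omitted.

```python
import math
import random


def local_energies(residue, past_block, future_block, M):
    """Mean-square energy of M blocks: one past block, M-2 current
    sub-blocks, and one future block (claim 2)."""
    N = len(residue)
    L = N // (M - 2)  # assumes N is divisible by M-2
    if len(past_block) != L or len(future_block) != L:
        raise ValueError("past and future blocks must each hold L samples")
    current = [residue[k * L:(k + 1) * L] for k in range(M - 2)]
    blocks = [past_block] + current + [future_block]
    return [sum(x * x for x in block) / L for block in blocks]


def energy_envelope(W, N):
    """Piecewise-linear interpolation of sqrt(W) across the current frame.

    Assumption: block j (0-based, j=0 is the past block) has its knot at
    the block centre, so every current-frame sample lies between two knots.
    """
    M = len(W)
    L = N // (M - 2)
    roots = [math.sqrt(w) for w in W]
    env = []
    for n in range(N):
        t = n / L + 0.5          # position measured in block units
        j = int(math.floor(t))   # knot at or to the left of sample n
        frac = t - j
        env.append(roots[j] + frac * (roots[j + 1] - roots[j]))
    return env


def color_noise(env, seed=0):
    """Scale unit-variance Gaussian noise by the envelope, sample by sample."""
    rng = random.Random(seed)
    return [e * rng.gauss(0.0, 1.0) for e in env]


# Usage: an 8-sample frame with M=6 energies, so L=2.
W = local_energies([1.0] * 8, [2.0, 2.0], [0.0, 0.0], M=6)
envelope = energy_envelope(W, N=8)
quantized_residue = color_noise(envelope)
```

Interpolating the square roots of the energies rather than the energies themselves follows the equations of claim 4, and using the past and future blocks as end knots is what lets the envelope vary smoothly across frame boundaries, as claim 3 describes.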
CLAIM OF PRIORITY UNDER 35 U.S.C. §120
[0001] The present Application for Patent is a Continuation and claims priority to patent application Ser. No. 09/191,633 entitled “LOW BIT-RATE CODING OF UNVOICED SEGMENTS OF SPEECH,” filed Nov. 13, 1998, assigned to the assignee hereof and hereby expressly incorporated by reference herein.
Continuations (1)
| | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | 09191633 | Nov 1998 | US |
| Child | 10196973 | Jul 2002 | US |