Claims
- 1. A method of encoding a signal having a speech component, the signal being organized as a plurality of frames, the method comprising the steps, performed for each frame, of:
- analyzing a first linear prediction window to generate a first set of filter coefficients for a frame;
- analyzing a second linear prediction window to generate a second set of filter coefficients for the frame;
- analyzing a first pitch analysis window to generate a first pitch estimate for the frame;
- analyzing a second pitch analysis window to generate a second pitch estimate for the frame;
- determining whether the frame is one of a first mode, a second mode and a third mode, depending on measures of energy content of the frame and spectral content of the frame;
- encoding the frame, depending on the second set of filter coefficients and the first and the second pitch estimates, independently of the first set of filter coefficients, when the frame is determined to be the third mode;
- encoding the frame, depending on the first and the second sets of filter coefficients, independently of the first and the second pitch estimates, when the frame is determined to be the second mode; and
- encoding the frame, depending on the second set of filter coefficients, independently of the first set of filter coefficients and the first and the second pitch estimates, when the frame is determined to be the first mode.
- 2. The method of claim 1, wherein the determining step includes the substep of:
- determining a mode depending on a determined mode of a previous frame.
- 3. The method of claim 1 wherein the determining step includes the substep of:
- determining the mode to be the first mode only when the determined mode of a previous frame is either the first mode or the second mode.
- 4. The method of claim 1, wherein the determining step includes the substep of:
- determining the mode to be the third mode only when the determined mode of a previous frame is either the third mode or the second mode.
- 5. The method of claim 1 wherein the determining step further depends on measures of pitch stationarity between the frame and a previous frame.
- 6. The method of claim 1 wherein the determining step further depends on measures of short-term level gradient within the frame.
- 7. The method of claim 1 wherein the determining step further depends on measures of a zero-crossing rate within the frame.
- 8. The encoding method of claim 1, wherein the first linear prediction window is contained within the frame and the second linear prediction window begins during the frame and extends into the next frame.
- 9. The encoding method of claim 1, wherein the first pitch estimate window is contained within the frame and the second pitch estimate window begins during the frame and extends into the next frame.
- 10. The encoding method of claim 1, wherein a frame determined to be of a third mode contains a signal with a speech component composed of primarily voiced speech.
- 11. The encoding method of claim 1, wherein a frame determined to be of a second mode contains a signal with a speech component composed of primarily unvoiced speech.
- 12. The encoding method of claim 1, wherein a frame determined to be of a first mode contains a signal with a low speech component.
- 13. An encoder for encoding a signal having a speech component, the signal being organized as a plurality of frames, comprising:
- a filter coefficient generator for analyzing a first linear prediction window to generate a first set of filter coefficients for a frame and for analyzing a second linear prediction window to generate a second set of filter coefficients for the frame;
- a pitch estimator for analyzing a first pitch analysis window to generate a first pitch estimate for the frame and analyzing a second pitch analysis window to generate a second pitch estimate for the frame;
- a mode determinator for determining whether the frame is one of a first mode, a second mode and a third mode, depending on measures of energy content of the frame and spectral content of the frame; and
- a frame encoder for encoding the frame depending on the determined mode of the frame, wherein
- a frame determined to be of a third mode is encoded depending on the second set of filter coefficients and the first and the second pitch estimates, independently of the first set of filter coefficients,
- a frame determined to be of a second mode is encoded depending on the first and the second sets of filter coefficients, independently of the first and the second pitch estimates, and
- a frame determined to be of a first mode is encoded depending on the second set of filter coefficients, independently of the first set of filter coefficients and the first and the second pitch estimates.
- 14. The encoder of claim 13, wherein the mode determinator determines the mode depending on a determined mode of a previous frame.
- 15. The encoder of claim 13, wherein the mode determinator determines the frame to be of the first mode only when the determined mode of a previous frame is either the first mode or the second mode.
- 16. The encoder of claim 13, wherein the mode determinator determines the frame to be of the third mode only when the determined mode of a previous frame is either the third mode or the second mode.
- 17. The encoder of claim 13 wherein the mode determinator further depends on measures of pitch stationarity between the frame and a previous frame.
- 18. The encoder of claim 13 wherein the mode determinator further depends on measures of short-term level gradient within the frame.
- 19. The encoder of claim 13 wherein the mode determinator further depends on measures of a zero-crossing rate within the frame.
- 20. The encoder of claim 13, wherein the first linear prediction window is contained within the frame and the second linear prediction window begins during the frame and extends into the next frame.
- 21. The encoder of claim 13, wherein the first pitch estimate window is contained within the frame and the second pitch estimate window begins during the frame and extends into the next frame.
- 22. The encoder of claim 13, wherein a frame determined to be of a third mode contains a signal with a speech component composed of primarily voiced speech.
- 23. The encoder of claim 13, wherein a frame determined to be of a second mode contains a signal with a speech component composed of primarily unvoiced speech.
- 24. The encoder of claim 13, wherein a frame determined to be of a first mode contains a signal with a low speech component.
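The mode-dependent parameter selection recited in claims 1 and 13, together with the previous-frame constraints of claims 3, 4, 15 and 16, can be illustrated with a minimal sketch. It is not the patented implementation: the names `Mode`, `allowed_modes` and `encoding_inputs` are hypothetical, the dictionary layout is an illustrative choice, and applying the constraints of claims 3 and 4 together (with the second mode always reachable) is an assumption, since the claims recite them as separate dependent claims. The classification itself, based on energy and spectral measures, is not shown because the claims give no thresholds.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical labels for the three frame modes of claim 1."""
    FIRST = 1   # low speech component (claims 12, 24)
    SECOND = 2  # primarily unvoiced speech (claims 11, 23)
    THIRD = 3   # primarily voiced speech (claims 10, 22)


def allowed_modes(previous):
    """Modes a frame may take given the previous frame's mode.

    Assumption: the constraints of claims 3 and 4 are applied together,
    and the second mode is always reachable because no claim restricts it.
    """
    allowed = {Mode.SECOND}
    if previous in (Mode.FIRST, Mode.SECOND):
        allowed.add(Mode.FIRST)   # claim 3: first only after first or second
    if previous in (Mode.THIRD, Mode.SECOND):
        allowed.add(Mode.THIRD)   # claim 4: third only after third or second
    return allowed


def encoding_inputs(mode, lpc_first, lpc_second, pitch_first, pitch_second):
    """Return the analysis results the encoding step depends on (claim 1)."""
    if mode is Mode.THIRD:
        # Second filter coefficient set plus both pitch estimates.
        return {"lpc": [lpc_second], "pitch": [pitch_first, pitch_second]}
    if mode is Mode.SECOND:
        # Both filter coefficient sets, no pitch estimates.
        return {"lpc": [lpc_first, lpc_second], "pitch": []}
    # First mode: second filter coefficient set only.
    return {"lpc": [lpc_second], "pitch": []}
```

Under these assumptions, a frame that follows a first-mode frame cannot be classified directly as third mode; it must pass through the second mode first.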
BACKGROUND OF THE INVENTION
This is a division of application Ser. No. 08/229,271 filed Apr. 18, 1994, which is a continuation-in-part of prior application Ser. No. 08/227,881 filed Apr. 15, 1994, of Kumar Swaminathan, Kalyan Ganesan, and Prabhat K. Gupta for METHOD OF ENCODING A SIGNAL CONTAINING SPEECH, which is a continuation-in-part of prior application Ser. No. 07/905,992, filed Jun. 25, 1992, of Kumar Swaminathan for HIGH QUALITY LOW BIT RATE CELP-BASED SPEECH CODEC, issued as U.S. Pat. No. 5,495,555, which is a continuation-in-part application under 37 C.F.R. § 1.162 of prior application Ser. No. 07/891,596, filed Jun. 1, 1992, of Kumar Swaminathan for CELP EXCITATION ANALYSIS FOR VOICED SPEECH (abandoned). The contents of patent application Ser. No. 07/905,992 entitled "HIGH QUALITY LOW BIT RATE CELP-BASED SPEECH CODEC" are hereby incorporated by reference.
US Referenced Citations (4)
Number | Name | Date | Kind
4058676 | Wilkes et al. | Nov 1977 |
4771465 | Bronson et al. | Sep 1988 |
5459814 | Gupta et al. | Oct 1995 |
5495555 | Swaminathan | Feb 1996 |
Non-Patent Literature Citations (4)
Entry
ICC'93, 23 May 1993, Geneva pp. 406-409 P. Lupini et al. "A multi-mode variable rate CELP coder based on frame classification" see the whole document.
ICASSP 90, vol. 1, 3 Apr. 1990, Albuquerque pp. 477-480 T. Tanguichi et al. "Combined source and channel coding based on multimode coding" see p. 477 left column, paragraph 1-right column, paragraph 2 see Fig. 1,2.
Atal et al., "A Pattern Recognition Approach to Voiced-Unvoiced-Silence Classification With Applications to Speech Recognition," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-24, No. 3, Jun. 1976.
Rabiner et al., "Application of an LPC Distance Measure to the Voiced-Unvoiced-Silence Detection Problem," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-25, No. 4, Aug. 1977.
Divisions (1)
| Number | Date | Country
Parent | 229271 | Apr 1994 |
Continuation in Parts (3)
| Number | Date | Country
Parent | 227881 | Apr 1994 |
Parent | 905992 | Jun 1992 |
Parent | 891596 | Jun 1992 |