Claims
- 1. A method for encoding speech comprising the steps of: obtaining first pitch periods of an input speech signal; changing the pitch periods according to a condition of the input speech signal to obtain second pitch periods; determining encoding sections corresponding to said second pitch periods, respectively; generating an excitation signal by which distortion of a synthesized speech signal is minimized for each of said encoding sections, the synthesized speech signal being generated by subjecting the excitation signal to synthesis filtering; and outputting at least information representing said changed pitch periods and information on said synthesized speech signal as encoded data.
- 2. The method for encoding speech according to claim 1, further comprising the step of concatenating said second pitch periods which are at least partially adjacent to each other to obtain concatenated pitch periods, the step of determining encoding sections determining encoding sections based on said concatenated pitch periods as well as said changed pitch periods, said step of outputting encoded data comprising the step of outputting information representing said concatenated pitch periods as well as the information representing said changed pitch periods and the information on said synthesized speech signal as encoded data.
- 3. A method for encoding speech comprising the steps of: obtaining synthesis filter characteristic information representing the transfer characteristics of a synthesis filter which receives an excitation signal and generates a synthesized speech signal; obtaining first pitch periods of an input speech signal; changing the pitch periods according to a condition of the input speech signal to obtain second pitch periods; determining encoding sections corresponding to said second pitch periods, respectively; generating said excitation signal by which distortion of said synthesized speech signal is minimized for each of said encoding sections; and outputting at least said synthesis filter characteristic information, information representing said second pitch periods and information representing said excitation signal as encoded data.
- 4. The method for encoding speech according to claim 3, further comprising the step of concatenating said second pitch periods which are at least partially adjacent to each other to obtain concatenated pitch periods, the step of determining encoding sections determining encoding sections based on said concatenated pitch periods as well as said changed pitch periods, said step of outputting encoded data comprising the step of outputting information representing said concatenated pitch periods as well as said synthesis filter characteristic information, information representing said second pitch periods and information representing said excitation signal as encoded data.
- 5. A method for encoding speech comprising: setting a plurality of pitch marks in each frame of an input speech signal, each of the pitch marks indicating a position in the frame at which a pitch waveform is to be put; obtaining a plurality of pitch periods corresponding to the pitch marks, respectively, the pitch periods being changed according to a condition of the input speech signal; generating an excitation signal by which distortion of a synthesized speech signal is minimized, for each of said pitch periods, the synthesized speech signal being generated by subjecting the excitation signal to synthesis filtering; and outputting at least information representing said pitch periods and information on said synthesized speech signal as encoded data.
- 6. A method according to claim 5, wherein the step of generating an excitation signal includes putting pitch waveforms on the pitch marks and applying a gain thereto to generate the excitation signal.
- 7. A method according to claim 5, wherein the step of generating an excitation signal includes calculating an error between the synthesized speech signal and the input speech signal, weighting the error with a perceptual weighting method, and selecting an excitation signal for which distortion of the input speech signal is minimum.
- 8. A method according to claim 6, which includes generating the pitch waveforms by storing a plurality of template pitch waveforms in a codebook in advance and selecting the optimum pitch waveforms from the template pitch waveforms through a closed-loop search.
- 9. A method for encoding speech comprising the steps of: obtaining local pitch periods representing time lengths of one-pitch waveforms of an input speech signal from said input speech signal; determining encoding sections based on said local pitch periods; generating a synthesized speech signal for which distortion from said input speech signal is minimized in each of said encoding sections; outputting at least information representing said local pitch periods and information on said synthesized speech signal as encoded data; and concatenating said local pitch periods which are at least partially adjacent to each other to obtain local concatenated pitch periods, said step of generating a synthesized speech signal comprising the steps of determining encoding sections based on said local pitch periods and said local concatenated pitch periods and generating a synthesized speech signal for which distortion from said input speech signal is minimized in each of said encoding sections, said step of outputting encoded data comprising the step of outputting at least information representing said local pitch periods, information representing said local concatenated pitch periods and information on said synthesized speech signal as encoded data.
- 10. A method for encoding speech comprising the steps of: obtaining synthesis filter characteristic information representing the transfer characteristics of a synthesis filter which receives the input of an excitation signal and generates a synthesized speech signal and obtaining local pitch periods representing time lengths of one-pitch waveforms of an input speech signal from said input speech signal; determining encoding sections based on said local pitch periods; generating said excitation signal for which distortion of said synthesized speech signal is minimized in each of said encoding sections; outputting at least said synthesis filter characteristic information, information representing said local pitch periods and information representing said excitation signal as encoded data; concatenating said local pitch periods which are at least partially adjacent to each other to obtain local concatenated pitch periods, said step of generating an excitation signal comprising the steps of determining encoding sections based on said local pitch periods and said local concatenated pitch periods and generating said excitation signal for which distortion of said synthesized speech signal is minimized in each of said encoding sections, said step of outputting encoded data comprising the step of outputting at least said synthesis filter characteristic information, information representing said local pitch periods, information representing said local concatenated pitch periods and information representing said excitation signal as encoded data.
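The pitch-synchronous, closed-loop procedure recited in claims 5, 6 and 8 can be illustrated with a minimal sketch. It is not the patented implementation. The function names (`synthesize`, `search_section`, `encode_frame`), the all-pole filter form, the zero-state filtering of each section, and the plain squared-error measure (no perceptual weighting as in claim 7) are all assumptions made for illustration. Each encoding section runs from one pitch mark to the next, and samples before the first mark are ignored.

```python
from typing import List, Sequence, Tuple


def synthesize(excitation: Sequence[float], lpc: Sequence[float]) -> List[float]:
    """Assumed all-pole synthesis filter: y[n] = x[n] - sum_k lpc[k] * y[n-1-k]."""
    out: List[float] = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out


def search_section(
    target: Sequence[float],
    codebook: Sequence[Sequence[float]],
    lpc: Sequence[float],
) -> Tuple[int, float, float]:
    """Closed-loop search over template pitch waveforms for one section.

    Returns (codebook index, gain, squared error) of the best template.
    """
    best = (-1, 0.0, float("inf"))
    for index, template in enumerate(codebook):
        # Truncate or zero-pad the template to the section length.
        shape = (list(template) + [0.0] * len(target))[: len(target)]
        # Zero-state filtering: filter memory from earlier sections is ignored.
        synth = synthesize(shape, lpc)
        energy = sum(s * s for s in synth)
        if energy == 0.0:
            continue
        # Least-squares optimal gain for this template.
        gain = sum(t * s for t, s in zip(target, synth)) / energy
        error = sum((t - gain * s) ** 2 for t, s in zip(target, synth))
        if error < best[2]:
            best = (index, gain, error)
    return best


def encode_frame(
    speech: Sequence[float],
    pitch_marks: Sequence[int],
    codebook: Sequence[Sequence[float]],
    lpc: Sequence[float],
) -> List[Tuple[int, int, float]]:
    """Return (pitch period, codebook index, gain) for each pitch mark."""
    bounds = list(pitch_marks) + [len(speech)]
    encoded = []
    for start, end in zip(bounds, bounds[1:]):
        index, gain, _ = search_section(speech[start:end], codebook, lpc)
        encoded.append((end - start, index, gain))
    return encoded
```

For example, `encode_frame(speech, [0, 4], codebook, lpc)` treats samples 0 to 3 and samples 4 to the end of the frame as two encoding sections. A fuller coder would carry filter state across sections and apply perceptual weighting to the error before comparing candidates.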
Priority Claims (4)
| Number | Date | Country | Kind |
| --- | --- | --- | --- |
| 9-063450 | Mar 1997 | JP | |
| 9-179677 | Jul 1997 | JP | |
| 9-235129 | Aug 1997 | JP | |
| 9-354806 | Dec 1997 | JP | |
Parent Case Info
This application is a division of application Ser. No. 09/039,317, filed Mar. 16, 1998, now U.S. Pat. No. 6,167,375.
US Referenced Citations (18)