Claims
- 1. A speech analyzing apparatus comprising:
- linear predictive analysis means for performing a linear predictive analysis of an input speech signal for each analysis window of a fixed length to obtain prediction coefficients, said linear predictive analysis means including means for determining whether said input speech signal in an analysis window of fixed length is voiced or unvoiced and for providing a voiced/unvoiced decision signal;
- inverse filter means controlled by said prediction coefficients, for deriving a prediction residual from said input speech signal;
- speech phase equalizing filter means for rendering the phase of said input speech signal into a zero phase to obtain a phase-equalized speech signal;
- prediction residual phase equalizing filter means for rendering the phase of said prediction residual into a zero phase to obtain a phase-equalized prediction residual signal;
- reference time point gathering means for detecting impulses of magnitudes larger than a predetermined threshold value in said phase-equalized prediction residual signal and for outputting the positions of said impulses as reference time points;
- impulse position generating means responsive to said reference time points and said voiced/unvoiced decision signal for producing, based on said reference time points when said decision signal indicates that said speech signal is a voiced sound, differences between successive intervals of said reference time points for comparing the differences with a predetermined limit range, and for determining positions of impulses such that when the differences are within said predetermined limit range, said reference time points are determined as impulse positions, and when said differences are in excess of said predetermined limit range, impulse positions are determined by adding a time point to said reference time points or by omission of one of said reference time points or by shift of one of said reference time points so that the differences between the successive intervals of the processed reference time points are held within said limit range, said impulse positions thus determined being one of the parameters representing the excitation signal as a result of the speech analysis;
- impulse sequence generating means for receiving said impulse positions from said impulse position generating means and generating impulses at said impulse positions;
- all-pole filter means controlled by said prediction coefficients and excited by said generated impulse sequence to generate a synthesized speech; and
- impulse magnitude calculating means for determining magnitude values of said impulses generated by said impulse sequence generating means which minimize an error between a waveform of a synthesized speech obtainable by exciting said all-pole filter means with said impulse sequence and a waveform of said phase-equalized speech supplied from said speech phase equalizing filter means, and means for outputting said impulse magnitudes for use as another one of the parameters representing the excitation signal as a result of the speech analysis by said speech analyzing apparatus.
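The impulse position generating step recited in claim 1 can be illustrated with a minimal sketch. The claim only requires that a reference time point be added, omitted, or shifted so that the differences between successive intervals stay within the limit range; it does not say how to choose among the three. The function name `regularize_positions`, the parameter `max_delta`, and the specific rules used here to pick between adding, omitting, and shifting are all assumptions made for illustration.

```python
def regularize_positions(ref_points, max_delta):
    """Turn reference time points into quasi-periodic impulse positions.

    ref_points: sample indices of detected residual impulses (any order).
    max_delta:  largest allowed change between successive intervals
                (hypothetical stand-in for the claimed limit range, >= 0).
    """
    pts = sorted(set(ref_points))
    if len(pts) < 3:
        return pts  # fewer than two intervals: nothing to compare
    out = pts[:2]
    i = 2
    while i < len(pts):
        prev = out[-1] - out[-2]      # last accepted interval (always > 0)
        interval = pts[i] - out[-1]   # candidate interval
        if interval <= 0:
            i += 1                    # overtaken by an earlier shift: omit
        elif abs(interval - prev) <= max_delta:
            out.append(pts[i])        # within the limit range: keep as is
            i += 1
        elif interval > prev + max_delta:
            if interval >= 2 * prev - max_delta:
                # Gap is long enough to hold a missing pulse: add one
                # and re-examine the same reference point.
                out.append(out[-1] + prev)
            else:
                # Slightly late: shift the point earlier.
                out.append(out[-1] + prev + max_delta)
                i += 1
        elif 2 * interval < prev:
            i += 1                    # far too early: treat as spurious, omit
        else:
            # Slightly early: shift the point later.
            out.append(out[-1] + prev - max_delta)
            i += 1
    return out
```

With these assumed rules, `regularize_positions([0, 80, 160, 320, 400], 5)` fills the double-length gap with a point at 240, and every change between successive output intervals is at most `max_delta`.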
- 2. The apparatus according to claim 1 further comprising:
- zero filter means for providing said impulse sequence with features of the waveform of said phase-equalized prediction residual signal and supplying the output thereof to said all-pole filter means as the excitation signal; and
- zero filter coefficient calculating means for establishing the coefficients of said zero filter means which minimize an error between a waveform of a synthesized speech obtained by exciting said all-pole filter means with the output of said zero filter means and a waveform of said phase-equalized speech.
- 3. The apparatus of claim 1 or 2, wherein said apparatus further includes random pattern generating means for generating a random pattern which minimizes an error between a waveform of a synthesized speech obtained by exciting said all-pole filter means with one of a plurality of predetermined random patterns and a waveform of said phase-equalized speech in a window during which said decision signal is unvoiced.
- 4. The apparatus of claim 1 or 2, wherein said impulse sequence generating means includes vector quantizing means for vector quantizing the magnitude values of said impulses determined by said impulse magnitude calculating means.
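As a rough illustration of the vector quantizing recited in claim 4, the sketch below maps a vector of impulse magnitudes to the nearest entry of a codebook. The claim does not specify the distortion measure or the codebook design; the squared Euclidean distance, the function name `quantize_magnitudes`, and the codebook layout are assumptions.

```python
def quantize_magnitudes(magnitudes, codebook):
    """Return (index, codeword) of the codebook entry nearest to magnitudes.

    magnitudes: list of impulse magnitude values for one analysis window.
    codebook:   list of candidate vectors of the same length (hypothetical).
    """
    def distortion(codeword):
        # Squared Euclidean distance; the measure is an assumption.
        return sum((m - c) ** 2 for m, c in zip(magnitudes, codeword))

    best_index = min(range(len(codebook)), key=lambda k: distortion(codebook[k]))
    return best_index, codebook[best_index]
```

Only the index would need to be transmitted if the decoder holds the same codebook.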
- 5. A method for analyzing a speech to generate parameters representing an input speech waveform including parameters of an excitation signal for exciting a linear filter representing a speech spectral envelope characteristic, comprising the steps of:
- producing a phase-equalized prediction residual of the input speech waveform;
- determining reference time points where levels of said phase-equalized prediction residual exceed a predetermined threshold;
- determining whether the input speech waveform in each of a plurality of successive analysis windows, each of which is of fixed time length, is voiced or unvoiced sound;
- obtaining the difference between intervals of successive ones of said reference time points in each analysis window;
- when the input speech waveform is voiced sound, selecting impulse positions based on said reference time points such that when the difference between the intervals of the successive reference time points in each analysis window is within a predetermined range, the reference time points are selected as impulse positions, and when the difference between the intervals of the successive reference time points exceeds the predetermined range, impulse positions are selected by moving or deleting the reference time points or inserting reference time points to define a sequence of quasi-periodic impulses so that the differences between successive reference time points are within said predetermined range, the positions of said quasi-periodic impulse sequence being one of the parameters representing said excitation signal; and
- so selecting magnitudes of the respective impulses of the quasi-periodic sequence in each analysis window as to minimize an error between the phase-equalized speech waveform and a synthesized speech waveform obtained by exciting said linear filter with said quasi-periodic impulse sequence, the magnitudes of the quasi-periodic impulses being another of the parameters representing said excitation signal.
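The magnitude selection step of claim 5 can be sketched as a linear least-squares problem, because the synthesized waveform is a weighted sum of the filter's responses to unit impulses at the chosen positions. The sketch below assumes a plain squared-error criterion, zero filter memory at the start of the window, and the sign convention `y[n] = x[n] + sum(coeffs[i] * y[n-1-i])`; none of these is stated in the claim, and all function names are hypothetical.

```python
def all_pole_filter(excitation, coeffs):
    """y[n] = x[n] + sum_i coeffs[i] * y[n-1-i] (sign convention assumed)."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for i, a in enumerate(coeffs):
            if n - 1 - i >= 0:
                acc += a * out[n - 1 - i]
        out.append(acc)
    return out


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[r]] for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def impulse_magnitudes(target, positions, coeffs):
    """Magnitudes minimizing the squared error between target and synthesis."""
    n = len(target)
    basis = []
    for p in positions:
        unit = [0.0] * n
        unit[p] = 1.0
        basis.append(all_pole_filter(unit, coeffs))  # response to one impulse
    # Normal equations: (B B^T) g = B target, with basis responses as rows of B.
    gram = [[sum(u * v for u, v in zip(bi, bj)) for bj in basis] for bi in basis]
    proj = [sum(u * t for u, t in zip(bi, target)) for bi in basis]
    return solve_linear(gram, proj)
```

Here `target` stands for the phase-equalized speech waveform in one analysis window and `positions` for the quasi-periodic impulse positions within that window.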
- 6. The method of claim 5 wherein, before being applied to said linear filter, said quasi-periodic impulses are processed by a zero filter, said method including the step of selecting coefficients of said zero filter which minimize an error between said phase-equalized speech waveform and a synthesized speech waveform obtained by exciting said linear filter with the output of said zero filter, whereby said processing of said quasi-periodic impulses by said zero filter gives the sequence of said quasi-periodic impulses features of the waveform of said phase-equalized prediction residual signal, and using said coefficients of said zero filter as one of said parameters representing said excitation signal.
- 7. The method of claim 5 or 6 wherein said excitation signal is used for a voiced sound and a random sequence selected from a plurality of predetermined random patterns is used as an excitation signal for an unvoiced sound, said method including so selecting one of said predetermined random patterns representing said excitation signal for said unvoiced sound as to minimize an error between said phase-equalized speech waveform and a synthesized speech waveform obtainable by exciting said linear filter with said random patterns, and using said selected one of the predetermined random patterns to produce one of the parameters representing the input speech waveform.
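The random pattern selection of claim 7 amounts to an exhaustive search over a fixed set of patterns. The following minimal sketch excites an all-pole filter with each candidate and keeps the one with the smallest squared error against the target waveform. The squared-error measure, the absence of a gain term, the filter sign convention, and the function names are assumptions, not details taken from the claim.

```python
def all_pole_filter(excitation, coeffs):
    """y[n] = x[n] + sum_i coeffs[i] * y[n-1-i] (sign convention assumed)."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for i, a in enumerate(coeffs):
            if n - 1 - i >= 0:
                acc += a * out[n - 1 - i]
        out.append(acc)
    return out


def select_random_pattern(target, patterns, coeffs):
    """Return (index, error) of the pattern whose synthesis best matches target.

    target:   phase-equalized speech samples for one unvoiced window.
    patterns: predetermined random excitation patterns (hypothetical codebook).
    coeffs:   prediction coefficients driving the all-pole filter.
    """
    best_index, best_error = None, float("inf")
    for index, pattern in enumerate(patterns):
        synthesized = all_pole_filter(pattern, coeffs)
        error = sum((t - y) ** 2 for t, y in zip(target, synthesized))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error
```

The returned index is the kind of value that could serve as the transmitted parameter for an unvoiced window.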
Priority Claims (1)
Number | Date | Country | Kind
1-257503 | Oct 1989 | JPX |
Parent Case Info
This application is a continuation of Ser. No. 07/592,444, filed on Oct. 2, 1990, now abandoned.
US Referenced Citations (6)
Continuations (1)
| Number | Date | Country
Parent | 592444 | Oct 1990 |