Claims
- 1. A speech signal analyzer for producing a transmission data signal in response to an input speech signal, said speech signal analyzer comprising:
- preliminary processing means supplied with said input speech signal for producing a sequence of processed digital signals sampled from said input speech signal and arranged within an analysis frame, said analysis frame having a predetermined frame time interval, there being defined pulse sequences of said analysis frame each comprising a respectively exclusive plurality of equidistantly timed pulses each corresponding to one of said processed digital signals, each of said processed digital signals corresponding to one of said pulses in one of said pulse sequences, and said pulse sequences defining corresponding phases of said analysis frame;
- parameter calculating means for calculating a sequence of preselected parameters at said analysis frame as regards said input speech signal to produce a parameter signal representative of said preselected parameter sequence;
- impulse response calculating means supplied with said parameter signal for calculating corresponding impulse responses;
- cross correlating coefficient calculating means supplied with said impulse responses and said processed digital signal sequence for calculating series of cross correlation coefficients between said impulse responses and said processed digital signal sequence within said analysis frame to produce a representative cross correlation coefficient signal;
- autocorrelation coefficient calculating means for calculating autocorrelation coefficients based on said impulse responses, said autocorrelation coefficient calculating means producing a respective series of said autocorrelation coefficients for each of said phases; and
- maximum similarity series extracting means, coupled to said cross correlation coefficient calculating means and said autocorrelation coefficient calculating means, for producing a series of excitation pulses and a pulse phase signal which identifies a selected phase, each of said excitation pulses having an equidistant time interval and an identical amplitude, each of said excitation pulses having a respective polarity such that said respective series of autocorrelation coefficients exhibits, with respect to others of said respective series of autocorrelation coefficients, a maximum similarity to said representative cross correlation coefficient signal for said selected phase;
- wherein said maximum similarity series extracting means comprises:
- autocorrelation series calculating means for successively summing up, as a waveform, said respective series of said autocorrelation coefficients to successively produce a corresponding summation result signal for each said phase;
- similarity measuring means responsive to said corresponding summation result signal and said representative cross correlation coefficient signal for (1) measuring a respective degree of similarity between each said respective series of autocorrelation coefficients and said representative cross correlation coefficient signal, (2) determining said respective polarity of each of said excitation pulses by selecting the maximum similarity, and (3) successively producing a sequence of polarity signals for each phase to provide corresponding provisional excitation pulse sequences; and
- phase determining means responsive to said polarity signal sequences for selecting said selected phase and for producing, as said series of excitation pulses, said provisional excitation pulse sequence corresponding to said selected phase;
- wherein said respective polarity of each of said excitation pulses is determined by dynamic programming;
- wherein said dynamic programming determines said respective polarity based on an accumulated similarity evaluation measure.
- 2. A speech signal analyzer as claimed in claim 1, wherein:
- said preselected parameters are specified by linear predictive coding parameters; and
- said parameter calculating means comprises:
- interpolating means for interpolating said linear predictive coding parameters at every one of a plurality of interpolation periods, each of said plurality of interpolation periods being shorter than said analysis frame, said interpolating means producing a sequence of interpolated parameters obtained by interpolating said linear predictive coding parameters; and
- means for producing said interpolated parameters as said parameter signal.
- 3. A speech signal analyzer as claimed in claim 2, wherein said impulse response calculation means comprises:
- calculation means coupled to said interpolating means for calculating the impulse response of an all-pole filter defined by said interpolated parameters; and
- means for supplying said impulse responses to said cross correlation coefficient calculating means.
- 4. A speech signal analyzer as claimed in claim 1, wherein said preliminary processing means comprises:
- spectrum modifying means for modifying said input speech signal in its spectrum into a modified speech signal with reference to said predetermined parameters and attenuated parameters calculated on the basis of said predetermined parameters; and
- means for producing said modified speech signal as said digital signal sequence.
- 5. A speech signal analyzer as claimed in claim 1, wherein:
- each of said impulse responses appears at a predetermined time interval; and
- said dynamic programming is carried out during said predetermined time interval.
- 6. A speech signal synthesizer communicable with said speech signal analyzer claimed in claim 1, comprising:
- a demultiplexer supplied with said transmission data signal sequence for demultiplexing said transmission data signals into said preselected parameters and a synthesized signal which is produced by synthesizing said phase signal with said polarity signal;
- sound source generating means connected to said demultiplexer and responsive to said phase signal and said polarity signal for generating a series of sound source pulses;
- interpolating means connected to said demultiplexer for interpolating said preselected parameters at every one of interpolation periods to produce a sequence of interpolated parameters obtained by interpolating the preselected parameters; and
- means for processing said sound source pulse series into an output speech signal with reference to said interpolated parameter sequence.
- 7. A speech signal encoding system comprising:
- an analyzing side for analyzing a speech signal into a set of analyzed data signals, and
- a synthesizing side for synthesizing said speech signal from said set of said analyzed data signals;
- said speech signal being a sequence of digital speech signals divisible into frames;
- said analyzing side comprising:
- LPC analyzing means for carrying out linear prediction of said digital speech signals at each of said frames to produce a sequence of linear prediction coding coefficients;
- impulse response calculating means for calculating impulse responses of an all-pole filter defined by said sequence of linear prediction coding coefficients;
- cross correlation calculation means for calculating cross correlations between said impulse responses and said digital speech signals in each of said frames to produce a set of cross correlation coefficients;
- autocorrelation calculation means for calculating autocorrelations of said impulse responses to produce a set of autocorrelation coefficients for each of a plurality of series of said digital speech signals, each said series of said digital speech signals comprising nonadjacent, equidistantly spaced ones of said digital speech signals, each of said digital speech signals of one of said frames corresponding to one of said series of digital speech signals, said plurality of series of digital speech signals defining phases of said frame;
- pulse polarity searching means supplied with said set of cross correlation coefficients and said set of autocorrelation coefficients for each of said phases of said frame, each of said phases comprising polar pulses, each of said polar pulses having an identical pulse period and an identical amplitude, said pulse polarity searching means (1) calculating autocorrelation coefficient waveform summation series obtained by adding, as a waveform, each of said plurality of pulse series to corresponding ones of said set of autocorrelation coefficients corresponding to said polar pulses, and (2) searching, by the use of dynamic programming using a degree of an accumulated similarity as an evaluation measure, for each polarity of said polar pulses that has said corresponding ones of said set of autocorrelation coefficients which define a waveform most closely resembling a waveform defined by said set of cross correlation coefficients;
- pulse series phase searching means for searching for a most similar one of said plurality of pulse series which has a maximum waveform similarity between said autocorrelation coefficient waveform summation series and said set of cross correlation coefficients; and
- transmitting means for (1) producing a synthesized signal by synthesizing pulse information obtained by said searching operation of said pulse series phase searching means and said sequence of linear prediction coding coefficients, and (2) transmitting said synthesized signal as said set of said analyzed data signals; and
- said synthesizing side comprising:
- exciting source generating means for generating a sequence of exciting source pulses in response to said pulse series information; and
- synthesizing means for synthesizing a reproduction of said speech signal by the use of said sequence of linear prediction coding coefficients.
- 8. A speech signal encoding system as claimed in claim 7, further comprising:
- first LPC coefficient interpolating means for interpolating said sequence of linear prediction coding coefficients at every one of a plurality of predetermined periods to produce a first sequence of interpolated parameters; and
- second LPC coefficient interpolating means for interpolating, at each of said plurality of predetermined periods, said sequence of linear prediction coding coefficients transmitted from said analysis section;
- said impulse response calculating means calculating said impulse responses on the basis of said sequence of linear prediction coding coefficients interpolated by said first LPC coefficient interpolating means;
- said synthesizing means synthesizing said speech signal by the use of said sequence of linear prediction coding coefficients interpolated by said second LPC coefficient interpolating means.
- 9. A speech signal encoding system as claimed in claim 7, wherein:
- each of said impulse responses is provided to said pulse series phase searching means at a predetermined time interval; and
- said dynamic programming method is carried out during said predetermined time interval.
- 10. A pulse producing circuit for use in a speech signal analyzer and for producing a series of excitation pulses in response to an input speech signal, each pulse of said series of excitation pulses appearing at an equidistant time interval and an identical amplitude, said pulse producing circuit comprising:
- summation means for successively summing up, as a waveform, a series of autocorrelation coefficients to produce a series of summation result coefficients, each one of said series of autocorrelation coefficients corresponding to polarized pulses, each of said polarized pulses being equal to one another in pulse interval and pulse amplitude, and each of said polarized pulses belonging to one of a plurality of pulse sequences, each of said plurality of pulse sequences having a different respective phase;
- extracting means for extracting a respective pulse polarity of each of said polarized pulses by the use of dynamic programming using a degree of an accumulated similarity as an evaluation measure; and
- selecting means for selecting, as said series of excitation pulses, one of said plurality of pulse sequences which provides a maximum waveform similarity between said series of summation result coefficients and a series of cross correlation coefficients relating to said input speech signal.
- 11. A pulse producing method for use in a speech signal analyzer and for producing a series of excitation pulses in response to an input speech signal, each pulse of said series of excitation pulses appearing at an equidistant time interval and an identical amplitude, said pulse producing method comprising the steps of:
- successively summing up, as a waveform, a series of autocorrelation coefficients to produce a series of summation result coefficients, said autocorrelation coefficients corresponding to polarized pulses which are equal to one another in pulse interval and pulse amplitude, and form a plurality of pulse sequences having phases different from one another;
- using dynamic programming to determine a respective polarity of each of said polarized pulses, wherein a degree of accumulated similarity is used as an evaluation measure; and
- selecting, as said series of excitation pulses, one of said plurality of pulse sequences which provides a maximum waveform similarity between said series of summation result coefficients and a series of cross correlation coefficients relating to said input speech signal.
- 12. A speech signal analyzer for producing a transmission data signal in response to a sampled speech signal, each sample of which is represented by a corresponding digital signal, a predetermined number of said digital signals forming a digital signal sequence corresponding to an analysis frame of predetermined duration and defining pulses of said analysis frame, said analysis frame having said pulses in said predetermined number, said analysis frame having a predetermined number of phases defined such that each phase corresponds to a series of equidistantly timed nonadjacent ones of said pulses, and said pulses each correspond to only one of said phases of said analysis frame, each of said pulses having a respective pulse polarity, said speech signal analyzer comprising:
- preliminary processing means for producing said digital signal sequence from said speech signal;
- parameter calculating means for producing a parameter signal representative of a sequence of preselected parameters which are calculated on the basis of said digital signals of said analysis frame;
- impulse response calculating means for calculating impulse responses on the basis of said parameter signal;
- cross correlating coefficient calculating means for producing a cross correlation coefficient signal representative of a series of cross correlation coefficients which are calculated on the basis of said impulse responses and said digital signals of said analysis frame;
- autocorrelation coefficient calculating means for producing, for each said phase of said analysis frame, a respective series of autocorrelation coefficients of said impulse responses;
- maximum similarity series extracting means for producing an excitation pulse series and a pulse phase signal, and comprising:
- autocorrelation series calculating means for producing, for each said phase of said analysis frame, a corresponding summation result signal which is calculated by successively summing, as a waveform, said respective series of autocorrelation coefficients corresponding to said phase;
- similarity measuring means for producing, for each said phase of said analysis frame, a corresponding provisional excitation pulse sequence based on said corresponding summation result signal of said phase and on said representative cross correlation coefficient signal, such that:
- each pulse of said provisional excitation pulse sequence corresponds to one of said pulses of said phase, is equidistant in time with respect to adjacent pulses of said provisional excitation pulse sequence of said phase, and is identical in amplitude with all other pulses of said provisional excitation pulse sequence;
- each pulse of said provisional excitation pulse sequence has said respective pulse polarity determined through dynamic programming by selecting the maximum degree of similarity, according to an accumulated similarity evaluation measure, between said respective series of autocorrelation coefficients corresponding to said phase and said representative cross correlation coefficient signal; and
- phase determining means for producing, as said excitation pulse series, a selected one of said provisional excitation pulse sequences, said selected provisional excitation pulse sequence having, in comparison to a remainder of said provisional excitation pulse sequences, a maximum similarity to said representative cross correlation coefficient signal, wherein said pulse phase signal identifies said one of said phases to which said selected provisional excitation pulse sequence corresponds.
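The polarity search and phase selection recited in claims 1, 7, and 10 to 12 can be illustrated with a minimal sketch. It is not the claimed implementation: the claims do not state the similarity measure, so the code assumes a hypothetical accumulated measure in which each pulse contributes its polarity times the cross correlation coefficient at its position, minus a `coupling` term (standing in for the pulse amplitude times the autocorrelation coefficient at a lag of one pulse spacing) for each pair of adjacent pulses. Pulses further apart are assumed not to interact. All function names, the `coupling` parameter, and the sample values are illustrative assumptions.

```python
def phase_positions(frame_len, spacing, phase):
    """Sample indices of the equidistant pulses that belong to one phase."""
    return list(range(phase, frame_len, spacing))


def accumulated_similarity(polarities, c, coupling):
    """Hypothetical similarity measure for one complete polarity assignment.

    c[i] is the cross correlation coefficient at the i-th pulse position.
    coupling models the interaction between adjacent pulses (assumption).
    """
    direct = sum(g * ci for g, ci in zip(polarities, c))
    cross = sum(a * b for a, b in zip(polarities, polarities[1:]))
    return direct - coupling * cross


def best_polarities(c, coupling):
    """Dynamic programming over pulse polarities (+1 or -1).

    The state is the polarity of the most recent pulse; for each state we keep
    the best accumulated similarity and the polarity path that achieves it.
    """
    states = (1, -1)
    best = {g: (g * c[0], [g]) for g in states}
    for ci in c[1:]:
        nxt = {}
        for g in states:
            candidates = [
                (best[p][0] + g * ci - coupling * g * p, best[p][1] + [g])
                for p in states
            ]
            nxt[g] = max(candidates, key=lambda t: t[0])
        best = nxt
    return max(best.values(), key=lambda t: t[0])


def select_phase(phi, spacing, coupling):
    """Run the polarity search for every phase and keep the most similar one."""
    results = []
    for phase in range(spacing):
        c = [phi[m] for m in phase_positions(len(phi), spacing, phase)]
        score, polarities = best_polarities(c, coupling)
        results.append((score, phase, polarities))
    return max(results, key=lambda t: t[0])


# Illustrative cross correlation coefficients for a 9-sample frame (made up).
phi = [0.9, -0.2, 0.1, -0.8, 0.3, 0.05, 0.7, -0.1, -0.4]
score, phase, polarities = select_phase(phi, spacing=3, coupling=0.25)
```

With these made-up values the search selects phase 0 with polarities `[1, -1, 1]`. Because only adjacent pulses are assumed to interact, tracking the polarity of the previous pulse is sufficient state for the dynamic programming to find the same optimum as an exhaustive search over all polarity combinations.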
Priority Claims (1)
Number | Date | Country | Kind
5-192740 | Jul 1993 | JPX |
Parent Case Info
This is a Continuation of application Ser. No. 08/271,505, filed on Jul. 7, 1994, now abandoned.
US Referenced Citations (8)
Foreign Referenced Citations (2)
Number | Date | Country
2204766 | Nov 1988 | GBX
2200819 | Nov 1988 | GBX
Non-Patent Literature Citations (1)
Entry
Parsons, Thomas, Voice and Speech Processing, McGraw-Hill Book Co., 1986, pp. 180-182.
Continuations (1)
| Number | Date | Country
Parent | 271505 | Jul 1994 |