This application claims the benefit of Korean Patent Application No. 10-2005-0010992, filed on Feb. 5, 2005, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein by reference.
1. Field of the Invention
The present invention relates to a method and an apparatus for recovering a line spectrum pair (LSP) parameter for speech decoding, and more particularly, to a method and an apparatus for recovering an LSP parameter when frame loss occurs and a speech decoding apparatus using the same.
2. Description of the Related Art
To transmit data in a limited bandwidth environment, a speech coding apparatus does not transmit an actual speech signal but extracts parameters representing the speech signal, encodes the extracted parameters, and generates a speech packet including the coded parameters. A speech decoding apparatus decodes the coded parameters included in the generated speech packet and recovers the speech signal using the decoded parameters.
A line spectrum pair (LSP) parameter is one parameter representing the speech signal. The LSP parameter has good coding characteristics since it is closely related to a speech frequency. Most speech coding apparatuses generate and code the LSP parameter, and speech decoding apparatuses decode the coded LSP parameter.
However, to remove an error from a received speech packet, speech coding apparatuses usually check the received speech packet and, if it is determined that the received speech packet has an error, erase the speech packet. Such erasure of a speech packet causes loss of the LSP parameter and breaking of the recovered speech signal.
To solve such problems, a method of recovering the lost LSP parameter in speech decoding has been proposed.
However, since the same speech signal is recovered for the L frames, continuity between a speech signal recovered for the L subsequent erased frames and a speech signal recovered based on a next good frame (NGF) deteriorates.
The letter w denotes a weight with a value from 0 to 1, determined according to the number of erased frames and whether the transmission position of the erased frame is closer to the previous good frame (PGF) or to the NGF. Accordingly, the LSP parameters of the L erased frames generated using the LSP parameters of the PGF and the NGF have different values LSP(m+1) . . . LSP(m+x) . . . LSP(m+L).
However, since the LSP parameters are recovered in an LSP parameter region, it is difficult to define a spectrum region, develop an algorithm, and apply the method to a variety of technologies.
An aspect of the present invention provides a method and an apparatus for recovering a line spectrum pair (LSP) parameter in a spectrum region when frame loss occurs during speech decoding, and a speech decoding apparatus using the same.
According to an aspect of the present invention, there is provided a method of recovering a line spectrum pair (LSP) parameter for speech decoding, the method including: (a) converting an LSP parameter of a previous good frame (PGF) of an erased frame into a spectrum region to obtain a spectrum envelope of the PGF, when it is determined that a received speech packet has an erased frame; (b) recovering a spectrum envelope of the erased frame using the obtained spectrum envelope of the PGF; and (c) converting the recovered spectrum envelope of the erased frame into an LSP parameter of the erased frame.
According to another aspect of the present invention, there is provided a method of recovering a line spectrum pair (LSP) parameter in speech decoding, the method including: (a) converting an LSP parameter of a previous good frame (PGF) of an erased frame and an LSP parameter of a next good frame (NGF) of the erased frame into spectrum regions and obtaining spectrum envelopes of the PGF and NGF, when it is determined that a received speech packet has an erased frame; (b) recovering a spectrum envelope of the erased frame using the spectrum envelopes of the PGF and the NGF; and (c) converting the recovered spectrum envelope of the erased frame into an LSP parameter of the erased frame.
According to still another aspect of the present invention, there is provided an apparatus for recovering a line spectrum pair (LSP) parameter during speech decoding, the apparatus including: a first converter, when it is determined that a received speech packet has an erased frame, receiving an LSP parameter of a previous good frame (PGF) of the erased frame and converting the received LSP parameter of the PGF into a spectrum region of the PGF, and obtaining a spectrum envelope of the PGF; a spectrum recovering unit recovering a spectrum envelope of the erased frame using the spectrum envelope of the PGF; and a second converter converting the spectrum envelope of the erased frame into an LSP parameter of the erased frame.
According to yet another aspect of the present invention, there is provided an apparatus for recovering a line spectrum pair (LSP) parameter in speech decoding, the apparatus including: a first converter, when it is determined that a received speech packet has an erased frame, converting an LSP parameter of a previous good frame (PGF) of the erased frame into a spectrum region and obtaining a spectrum envelope of the PGF; a second converter, when it is determined that the received speech packet has an erased frame, converting an LSP parameter of a next good frame (NGF) of the erased frame into a spectrum region and obtaining a spectrum envelope of the NGF; a recovering unit recovering a spectrum envelope of the erased frame using the spectrum envelopes of the PGF and the NGF; and a third converter converting the recovered spectrum envelope of the erased frame into an LSP parameter region of the erased frame.
According to a further aspect of the present invention, there is provided a speech decoding apparatus, including: an excitation signal decoder decoding parameters of a current frame and outputting an excitation signal; a line spectrum pair (LSP) parameter decoder decoding an LSP parameter of the current frame; a frame erasure concealment unit, when a received coded speech packet has an erased frame, recovering an LSP parameter of the erased frame and the excitation signal of the erased frame using parameters of a previous good frame (PGF) or parameters of the PGF and a next good frame (NGF) of the erased frame in order to conceal the erasure of the erased frame; a parameter transmitter, when the received coded speech packet does not have an erased frame, transmitting the parameters of the current frame to the excitation signal decoder and the LSP parameter decoder and, if the received coded speech packet has the erased frame, transmitting the parameters of the PGF of the erased frame or the parameters of the PGF and the NGF of the erased frame to the frame erasure concealment unit; a converter converting the decoded LSP parameters transmitted from the LSP parameter decoder or the LSP parameter transmitted from the frame erasure concealment unit into an LPC; and a combination filter receiving the excitation signal output from the excitation signal decoder or the excitation signal output from the frame erasure concealment unit and outputting a combined speech signal using the LPC output from the converter.
According to other aspects of the present invention, there are provided computer-readable recording media encoded with processing instructions for causing a processor to execute the aforementioned methods of the present invention.
Additional and/or other aspects and advantages of the present invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
The above and/or other aspects and advantages of the present invention will become apparent and more readily appreciated from the following detailed description, taken in conjunction with the accompanying drawings of which:
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments are described below in order to explain the present invention by referring to the figures.
A coded speech packet is input to the parameter transmitter 310 after an error check is performed, in which frames with errors are erased from the input coded speech packet.
The parameter transmitter 310 checks each of the frames of the input coded speech packet and transmits parameters included in the speech packet according to whether the frame is erased (or lost). If the speech packet is not received for a predetermined time, the parameter transmitter 310 can determine that frames included in a section corresponding to the predetermined time have been erased.
If the input coded speech packet is a good frame, the parameter transmitter 310 transmits to the excitation signal decoder 320 parameters necessary for decoding an excitation signal among parameters included in the received speech packet and transmits an LSP parameter (or an LSP coefficient) having ten roots to the LSP parameter decoder 330.
If the speech decoding apparatus is a code-excited linear prediction (CELP) speech decoding apparatus, the parameters necessary for decoding the excitation signal may include a pitch used for an adaptive codebook, a codebook index used for a fixed codebook, a gain value gp of the adaptive codebook, and a gain value gc of the fixed codebook.
The excitation signal decoder 320 decodes input parameters and outputs the excitation signal. The output excitation signal is transmitted to the combination filter 350. The LSP parameter decoder 330 decodes the input LSP parameter. The decoded LSP parameter is transmitted to the LSP/LPC converter 340. The LSP/LPC converter 340 converts the decoded LSP parameter into an LPC parameter. The converted LPC parameter is transmitted to the combination filter 350.
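For illustration only, the conversion performed by the LSP/LPC converter 340 can be sketched as follows. The sketch assumes that the LSP parameter is given as line spectral frequencies in radians, sorted in ascending order, with an even order (ten in the example above), and that the LPC polynomial is written A(z) = 1 + a1 z^-1 + ... + ap z^-p. The function names `poly_mul` and `lsf_to_lpc` are hypothetical and are not taken from the embodiments.

```python
import math


def poly_mul(x, y):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xv in enumerate(x):
        for j, yv in enumerate(y):
            out[i + j] += xv * yv
    return out


def lsf_to_lpc(lsf):
    """Convert ascending line spectral frequencies (radians) to [1, a1, ..., ap].

    Assumption: even order p. Frequencies at even list positions are roots of
    the symmetric polynomial P(z); the others are roots of the antisymmetric
    polynomial Q(z).
    """
    if len(lsf) % 2 != 0:
        raise ValueError("this sketch assumes an even LPC order")
    p_poly = [1.0, 1.0]    # fixed factor (1 + z^-1) of P(z)
    q_poly = [1.0, -1.0]   # fixed factor (1 - z^-1) of Q(z)
    for idx, w in enumerate(lsf):
        factor = [1.0, -2.0 * math.cos(w), 1.0]
        if idx % 2 == 0:
            p_poly = poly_mul(p_poly, factor)
        else:
            q_poly = poly_mul(q_poly, factor)
    # A(z) = (P(z) + Q(z)) / 2; the z^-(p+1) terms cancel, so drop the last one.
    return [(pc + qc) / 2.0 for pc, qc in zip(p_poly, q_poly)][:-1]
```

For a tenth-order LSP parameter the function returns eleven coefficients, the first of which is 1.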
The combination filter 350 performs synthesis filtering of the excitation signal using the LPC parameter and outputs a synthesized speech signal. The output synthesized speech signal is the recovered speech signal.
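As a minimal sketch of the filtering performed by the combination filter 350, the excitation signal can be passed through an all-pole filter 1/A(z). The sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p and the function name `synthesize` are assumptions made for this example only.

```python
def synthesize(excitation, a):
    """All-pole synthesis: s[n] = e[n] - sum_{k=1..p} a[k] * s[n-k].

    `a` is [1, a1, ..., ap]; samples before the start of the frame are
    assumed to be zero (no filter memory carried over in this sketch).
    """
    order = len(a) - 1
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k in range(1, order + 1):
            if n - k >= 0:
                acc -= a[k] * out[n - k]
        out.append(acc)
    return out
```

A practical decoder would carry the filter memory from one frame to the next; that state handling is omitted here for brevity.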
However, if the frame is erased (or lost), the parameter transmitter 310 transmits the LSP parameter of the previous good frame (PGF) or the LSP parameters of the PGF and the next good frame (NGF), and the parameters for decoding the excitation signal to the frame erasure concealment unit 360 in order to recover an LSP parameter of the erased (or lost) frame.
The frame erasure concealment unit 360 can recover the LSP parameter of the erased frame using an extrapolation method or an interpolation method, and also recovers the excitation signal.
The excitation signal recovering unit 401 receives the parameters for generating the excitation signal of the PGF transmitted from the parameter transmitter 310 of
The LSP/spectrum converter 402 receives an LSP parameter having ten roots of the PGF from the parameter transmitter 310 of
The spectrum recovering unit 403 transforms the spectrum envelope of the PGF using a predetermined method and recovers a spectrum envelope of the erased frame. The erased frame may be a current frame. The predetermined method can be defined, for example, so that the spectrum envelope of the PGF is spectrally shifted toward a predetermined region. The predetermined region is a low-frequency region or a high-frequency region, toward which the envelope is shifted by degrees.
The spectrum recovering unit 403 transforms the spectrum envelope of the PGF using a weight determined according to the correlation between the erased frame and the PGF and outputs the transformed spectrum envelope as the recovered spectrum envelope of the erased frame.
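One possible reading of this extrapolation is sketched below. The description does not specify how the shift or the weight is applied, so the sketch assumes that the envelope is a list of magnitude samples on a uniform frequency grid, that the shift is a whole number of bins with the edge value held, and that the weight (assumed to lie between 0 and 1) simply scales the shifted envelope. The names `shift_envelope` and `extrapolate_envelope` are hypothetical.

```python
def shift_envelope(env, bins):
    """Shift an envelope by `bins` samples along the frequency axis.

    Positive `bins` moves content toward high frequency, negative toward low
    frequency. Samples shifted in from outside the grid repeat the edge value.
    """
    n = len(env)
    return [env[min(max(k - bins, 0), n - 1)] for k in range(n)]


def extrapolate_envelope(pgf_env, weight, shift_bins=0):
    """Recover an erased-frame envelope from the PGF envelope alone.

    Assumption: the weight acts as a plain scale factor on the shifted
    envelope; the embodiments may apply it differently.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight is assumed to lie in [0, 1]")
    return [weight * v for v in shift_envelope(pgf_env, shift_bins)]
```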
The spectrum/LSP converter 404 receives the recovered spectrum envelope of the erased frame and converts the recovered spectrum envelope into an LSP parameter of the erased frame. The LSP parameter is then transmitted to the LSP/LPC converter 340 of
The LSP/spectrum converter 402 can convert the LSP parameter of the PGF into an LPC parameter, convert the LPC parameter into a Cepstrum of the PGF, and convert the Cepstrum into the spectrum region. In this case, the spectrum/LSP converter 404 can convert the recovered spectrum envelope of the erased frame into a Cepstrum of the erased frame, convert the Cepstrum into the LPC parameter of the erased frame, and convert the LPC parameter into the LSP parameter of the erased frame.
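The forward half of this path, from the LPC parameter through the cepstrum to the spectrum region, can be sketched with the standard LPC-to-cepstrum recursion. The sketch assumes A(z) = 1 + a1 z^-1 + ... + ap z^-p, a unit filter gain (so the zeroth cepstral coefficient is zero), and a log power envelope sampled on a uniform grid over [0, pi). The function names are hypothetical.

```python
import math


def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of 1/A(z) for a = [1, a1, ..., ap].

    Recursion: c[n] = -a[n] - sum_{k} (k/n) * c[k] * a[n-k], where a[n] is
    taken as zero for n > p. c[0] is left at zero (unit gain assumed).
    """
    order = len(a) - 1
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = -a[n] if n <= order else 0.0
        for k in range(max(1, n - order), n):
            acc -= (k / n) * c[k] * a[n - k]
        c[n] = acc
    return c


def cepstrum_to_log_envelope(c, n_bins):
    """Log power envelope ln|1/A|^2 = 2 * sum_n c[n] * cos(n*w) on [0, pi)."""
    return [
        2.0 * sum(c[n] * math.cos(n * math.pi * k / n_bins)
                  for n in range(1, len(c)))
        for k in range(n_bins)
    ]
```

Truncating the cepstrum to a small number of coefficients smooths the resulting envelope; the number retained is a design choice not fixed by the description.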
Alternatively, the LSP/spectrum converter 402 can convert the LSP parameter of the PGF into the LPC parameter and convert the LPC parameter into the spectrum region. In this case, the spectrum/LSP converter 404 can convert the recovered spectrum envelope of the erased frame into an auto-correlation coefficient (ACC) parameter of the erased frame, convert the ACC parameter into the LPC parameter of the erased frame, and convert the LPC parameter into the LSP parameter of the erased frame.
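The conversions in this alternative can be sketched as a round trip: the LPC parameter is evaluated on the unit circle to give a power envelope, the envelope is inverse-transformed to auto-correlation coefficients, and the Levinson-Durbin recursion returns an LPC parameter. The envelope recovery step that would sit between the two halves is omitted. The grid size, the convention A(z) = 1 + a1 z^-1 + ... + ap z^-p, and all function names are assumptions of this example.

```python
import cmath
import math


def lpc_to_envelope(a, n_bins=256):
    """Power envelope 1/|A(e^{jw})|^2 on n_bins points over [0, 2*pi)."""
    env = []
    for k in range(n_bins):
        w = 2.0 * math.pi * k / n_bins
        a_w = sum(c * cmath.exp(-1j * w * i) for i, c in enumerate(a))
        env.append(1.0 / (abs(a_w) ** 2))
    return env


def envelope_to_acc(env, order):
    """Auto-correlation r[0..order] as the inverse DFT of the power envelope."""
    n = len(env)
    return [
        sum(env[k] * math.cos(2.0 * math.pi * k * lag / n) for k in range(n)) / n
        for lag in range(order + 1)
    ]


def levinson_durbin(r):
    """Solve the normal equations for [1, a1, ..., ap] from r[0..p]."""
    order = len(r) - 1
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a
```

With an unmodified envelope, the round trip returns the original LPC parameter to within numerical precision, which offers a simple sanity check before a recovery step is inserted between `lpc_to_envelope` and `envelope_to_acc`.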
Alternatively, the LSP/spectrum converter 402 can convert the LSP parameter of the PGF into the LPC parameter, convert the LPC parameter into the Cepstrum of the PGF, and convert the Cepstrum into the spectrum region. In this case, the spectrum/LSP converter 404 can convert the recovered spectrum envelope of the erased frame into the ACC parameter of the erased frame, convert the ACC parameter into the LPC parameter of the erased frame, and convert the LPC parameter into the LSP parameter of the erased frame.
Alternatively, the LSP/spectrum converter 402 can convert the LSP parameter of the PGF into a pseudo-cepstrum (PCEP) of the PGF and convert the PCEP into the spectrum region. In this case, the spectrum/LSP converter 404 converts the recovered spectrum envelope of the erased frame into the PCEP of the erased frame and converts the PCEP into the LSP parameter of the erased frame.
An apparatus for recovering the LSP parameter of the erased frame according to an embodiment of the present invention shown in
The apparatus for recovering the LSP parameter of the erased frame according to an embodiment of the present invention shown in
The excitation signal recovering unit 501 receives the parameters for generating excitation signals of the PGF and the NGF transmitted from the parameter transmitter 310 of
The first LSP/spectrum converter 502 receives an LSP parameter having ten roots of the PGF from the parameter transmitter 310 of
The second LSP/spectrum converter 503 receives an LSP parameter having ten roots of the NGF from the parameter transmitter 310 of
The recovering unit 504 includes a first spectrum envelope transformer 506, a second spectrum envelope transformer 507, and a combiner 508.
The first spectrum envelope transformer 506 transforms the spectrum envelope of the PGF using a weight determined according to the correlation between the erased frame and the PGF, the correlation between the erased frame and the NGF, and the number of erased frames. The correlation is determined based on the proximity of the erased frame to the PGF and the NGF. The weight has a value from 0 to 1. If the erased frame is closer to the PGF, an input weight of the first spectrum envelope transformer 506 is greater than an input weight of the second spectrum envelope transformer 507. For example, if the input weight of the first spectrum envelope transformer 506 is w, the input weight of the second spectrum envelope transformer 507 is 1−w.
The second spectrum envelope transformer 507 transforms the spectrum envelope of the NGF using the weight.
The combiner 508 combines the transformed spectrum envelope of the PGF received from the first spectrum envelope transformer 506 and the transformed spectrum envelope of the NGF received from the second spectrum envelope transformer 507. The combination may be, for example, the sum of the two transformed spectrum envelopes. The combined spectrum envelope is the recovered spectrum envelope of the erased frame.
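The weighting and combining performed by the transformers 506 and 507 and the combiner 508 can be sketched as follows. The description states only that the weight lies between 0 and 1 and depends on proximity and on the number of erased frames; the linear rule in `pgf_weight` is a hypothetical choice made for this example, as are the function names.

```python
def pgf_weight(x, num_erased):
    """Weight w for the PGF envelope of the x-th erased frame (1-based).

    Hypothetical rule: w falls linearly as the erased frame moves away from
    the PGF and toward the NGF.
    """
    if not 1 <= x <= num_erased:
        raise ValueError("x must satisfy 1 <= x <= num_erased")
    return 1.0 - x / (num_erased + 1)


def combine_envelopes(pgf_env, ngf_env, w):
    """Sum of the PGF envelope scaled by w and the NGF envelope scaled by 1 - w."""
    if len(pgf_env) != len(ngf_env):
        raise ValueError("envelopes must share one frequency grid")
    return [w * p + (1.0 - w) * q for p, q in zip(pgf_env, ngf_env)]
```

With three erased frames, for instance, this rule gives the first erased frame a PGF weight of 0.75 and the last a PGF weight of 0.25.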
The spectrum/LSP converter 505 receives the spectrum envelope of the erased frame and converts the spectrum envelope into the LSP parameter. The LSP parameter is transmitted to the LSP/LPC converter 340. With the spectrum/LSP converter 404 of
Referring to
The recovering unit 704 nonlinearly matches the spectrum bands of the PGF and the NGF using a dynamic frequency warping (DFW) method, obtains a warping path and recovers the spectrum envelope of the erased frame based on the obtained warping path as shown in
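A minimal sketch of a warping-path computation is given below. It is not the DFW procedure of the embodiment, whose details are not reproduced here; it is a generic dynamic-programming alignment (in the manner of dynamic time warping, applied along the frequency axis) that uses the absolute difference of envelope samples as the local cost. The interpolation along the path, the weight `w`, and the function names are likewise assumptions of this example.

```python
def dfw_path(p, q):
    """Minimum-cost monotonic path aligning envelope p with envelope q."""
    n, m = len(p), len(q)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    cost[0][0] = abs(p[0] - q[0])
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = min(
                cost[i - 1][j] if i > 0 else inf,
                cost[i][j - 1] if j > 0 else inf,
                cost[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            cost[i][j] = abs(p[i] - q[j]) + best
    # Backtrack from the last bin pair to the first.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return path


def warp_interpolate(p, q, path, w):
    """Envelope between p (weight w) and q (weight 1 - w) along the path.

    Each matched pair (i, j) contributes a blended value at the blended bin
    position; bins that receive no contribution repeat the previous value.
    """
    if len(p) != len(q):
        raise ValueError("this sketch assumes equal-length envelopes")
    n = len(p)
    acc, cnt = [0.0] * n, [0] * n
    for i, j in path:
        k = int(w * i + (1.0 - w) * j + 0.5)
        acc[k] += w * p[i] + (1.0 - w) * q[j]
        cnt[k] += 1
    out, last = [], 0.0
    for k in range(n):
        if cnt[k]:
            last = acc[k] / cnt[k]
        out.append(last)
    return out
```

Unlike a plain weighted sum, interpolating along the path moves a spectral peak gradually from its PGF position to its NGF position instead of producing two attenuated peaks.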
The obtained spectrum envelope of the PGF is transformed using one of four conversion methods as described above for the spectrum recovering unit 403 of
The recovered spectrum envelope of the erased frame is converted into an LSP parameter (Operation 904) and the LSP parameter is provided as a recovered LSP parameter of the erased frame (Operation 905).
One of four conversion methods as described above for the LSP/spectrum converter 402 of
If the received speech packet does not have an erased frame (Operation 901), an LSP parameter of a current frame is decoded (Operation 906), and the decoded LSP parameter is provided as the LSP parameter of the current frame (Operation 907).
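The branch structure of this method can be summarized in the following sketch. The conversion, recovery, and decoding steps are passed in as callables because their internals are described elsewhere; the function name `provide_lsp`, its parameters, and the assignment of Operation 902 to the LSP-to-spectrum conversion are assumptions made for illustration.

```python
def provide_lsp(frame_erased, current_coded_lsp, pgf_lsp,
                decode_lsp, lsp_to_spectrum, recover_spectrum, spectrum_to_lsp):
    """Return the LSP parameter to use for the current frame."""
    if frame_erased:                               # Operation 901: erased frame?
        pgf_env = lsp_to_spectrum(pgf_lsp)         # Operation 902 (numbering assumed)
        erased_env = recover_spectrum(pgf_env)     # Operation 903
        return spectrum_to_lsp(erased_env)         # Operations 904 and 905
    return decode_lsp(current_coded_lsp)           # Operations 906 and 907
```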
The obtained spectrum envelopes of the PGF and the NGF are used to recover a spectrum envelope of the erased frame (Operation 1003) using one of the methods described above for the recovering unit 504 of
The recovered spectrum envelope of the erased frame is converted into an LSP parameter (Operation 1004) and the LSP parameter is provided as a recovered LSP parameter of the erased frame (Operation 1005).
One of four conversion methods described above for the LSP/spectrum converter 402 of
If the received speech packet does not have an erased frame (Operation 1001), an LSP parameter of a current frame is decoded (Operation 1006), and the decoded LSP parameter is provided as the LSP parameter of the current frame (Operation 1007).
Methods of the present invention can also be embodied as a computer readable storage medium including computer readable code. A computer readable recording medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of the computer readable recording medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, and optical data storage devices. The computer readable recording medium can also be distributed over network-coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.
The above-described embodiments of the present invention can improve the quality of a recovered speech signal, be applied to a variety of technologies, and provide a method of recovering an LSP parameter for the easy development of an algorithm for speech decoding.
Although a few embodiments of the present invention have been shown and described, the present invention is not limited to the described embodiments. Instead, it would be appreciated by those skilled in the art that changes may be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
10-2005-0010992 | Feb 2005 | KR | national |
Number | Date | Country |
---|---|---|
20060178872 A1 | Aug 2006 | US |