These and/or other aspects of the present general inventive concept will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
Reference will now be made in detail to the embodiments of the present general inventive concept, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout. The embodiments are described below in order to explain the present general inventive concept by referring to the figures.
The core layer generation unit 100 generates a core layer that includes encoding information and restores a minimal quality of the speech signal. To achieve this, the core layer generation unit 100 filters an input speech signal using a linear prediction coding (LPC) method to produce an excitation signal corresponding to the speech signal.
The core layer generation unit 100 includes a preprocessor 102, an LPC analyzer 104, an LPC coefficient quantizer 106, a first synthesis filter 108, an adder 110, a first subtractor 112, a first perceptual weighting filter 114, a pitch analyzer 116, a pitch contribution remover 118, a fixed codebook 120, a codebook searcher 122, an adaptive codebook 124, a space determiner 130, an identifier generator 132, a gain quantizer 140, a first multiplier 141, and a second multiplier 142.
The preprocessor 102 removes a direct current (DC) component from a speech signal received via an input port IN. More specifically, the preprocessor 102 removes a noise component in a low frequency band by filtering the speech signal using a high pass filter included in the preprocessor 102.
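For illustration only, the DC removal described above may be sketched as follows. The description does not specify the high pass filter, so this sketch assumes a first-order DC-blocking filter; the function name `remove_dc` and the pole value 0.95 are hypothetical.

```python
def remove_dc(samples, pole=0.95):
    """Assumed first-order DC blocker: y[n] = x[n] - x[n-1] + pole * y[n-1].

    The filter form and the pole value are illustrative assumptions; the
    description only states that a high pass filter is used.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + pole * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out
```

With a constant (pure DC) input, the output of this sketch decays toward zero, which is the behavior the preprocessor is described as providing.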
The LPC analyzer 104 extracts an LPC coefficient from the speech signal from which the DC component has been removed by the preprocessor 102.
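The description does not state how the LPC coefficient is extracted. One common approach, shown here only as a sketch under that assumption, is the autocorrelation method followed by the Levinson-Durbin recursion. The function names `autocorrelation` and `levinson_durbin` are hypothetical, and the coefficients follow the predictor convention x[n] ≈ Σ a[k]·x[n-k].

```python
def autocorrelation(x, max_lag):
    """Return r[0..max_lag], where r[lag] = sum over n of x[n] * x[n + lag]."""
    return [
        sum(x[n] * x[n + lag] for n in range(len(x) - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[1..order].

    Returns (coefficients, residual_energy). Assumes r[0] > 0.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        k_i = acc / err
        new_a = a[:]
        new_a[i] = k_i
        for k in range(1, i):
            new_a[k] = a[k] - k_i * a[i - k]
        a = new_a
        err *= 1.0 - k_i * k_i
    return a[1:], err
```

For a decaying exponential such as x[n] = 0.9^n, a second-order analysis with this sketch yields a first coefficient close to 0.9 and a second coefficient close to 0.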
The LPC coefficient quantizer 106 vector-quantizes the LPC coefficient extracted by the LPC analyzer 104.
The first synthesis filter 108 generates a synthesized signal corresponding to an excited signal output by the adder 110, using the result of the vector quantization by the LPC coefficient quantizer 106.
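As a minimal sketch of the synthesis step, assuming the usual all-pole form 1/A(z) of a CELP synthesis filter (the description does not give the filter equation), the synthesized signal may be computed from the excitation as shown below. The function name `synthesize` and the sign convention of the coefficients are assumptions.

```python
def synthesize(excitation, lpc):
    """All-pole synthesis: s[n] = e[n] + sum over k of lpc[k-1] * s[n-k].

    `lpc` holds predictor coefficients a[1..p]; samples before the start of
    the frame are taken to be zero in this sketch.
    """
    out = []
    for n, e in enumerate(excitation):
        s = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                s += a * out[n - k]
        out.append(s)
    return out
```

For example, `synthesize([1.0, 0.0, 0.0, 0.0], [0.5])` returns `[1.0, 0.5, 0.25, 0.125]`.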
The first subtractor 112 subtracts the synthesized signal output by the first synthesis filter 108 from the speech signal output by the preprocessor 102.
The first perceptual weighting filter 114 filters the signal output by the first subtractor 112 so that the quantization noise of the signal becomes less than or equal to a masking threshold in order to utilize the masking effect of a human's hearing structure. The first perceptual weighting filter 114 generates a signal including a weight so as to minimize the quantization noise of the signal output by the first subtractor 112.
The pitch analyzer 116 divides the signal output by the first perceptual weighting filter 114 into a plurality of sub-frames and analyzes the pitch of each of the sub-frames so as to generate an index and a gain of the adaptive codebook 124.
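The pitch analysis of a single sub-frame may be sketched as a search for the lag that best matches the sub-frame against earlier samples. The correlation criterion, the lag range, and the function name `pitch_search` are assumptions made for illustration; the description only states that an index and a gain of the adaptive codebook 124 are generated for each sub-frame.

```python
def pitch_search(signal, start, size, min_lag, max_lag):
    """Return (lag, gain) for the sub-frame signal[start:start + size].

    The lag maximizes corr^2 / energy against the signal delayed by `lag`;
    the gain is corr / energy at the selected lag.
    """
    cur = signal[start:start + size]
    best_lag, best_score, best_gain = min_lag, float("-inf"), 0.0
    for lag in range(min_lag, max_lag + 1):
        if start - lag < 0:
            break  # not enough history for this lag
        past = signal[start - lag:start - lag + size]
        corr = sum(c * p for c, p in zip(cur, past))
        energy = sum(p * p for p in past)
        if energy == 0.0:
            continue
        score = corr * corr / energy if corr > 0 else 0.0
        if score > best_score:
            best_lag, best_score, best_gain = lag, score, corr / energy
    return best_lag, best_gain
```

For a signal that repeats every 5 samples, the sketch selects a lag of 5 with a gain of 1.0 when the lag range covers that period.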
The pitch contribution remover 118 detects a target signal needed to search for a fixed codebook vector corresponding to the signal output by the first perceptual weighting filter 114 from the fixed codebook 120, using the index of the adaptive codebook 124.
The fixed codebook 120 is configured by classifying combinations of possible pulse positions into a plurality of spaces.
As illustrated in
The first and second spaces 610 and 620 may be distinguished from each other according to whether possible pulse positions are even or odd.
Referring back to
The codebook searcher 122 searches the fixed codebook 120 for a fixed codebook vector corresponding to the target signal detected by the pitch contribution remover 118 and outputs an index and a gain of the fixed codebook 120. More specifically, the codebook searcher 122 searches for a fixed codebook vector that minimizes a mean square error (MSE) of the target signal.
When the codebook searcher 122 searches for the fixed codebook vector, a plurality of spaces included in the fixed codebook 120 are each searched. If the fixed codebook 120 is divided into the first and second spaces 610 and 620 (See
The space determiner 130 detects a least distorted fixed codebook vector from the fixed codebook vectors found in all of the spaces of the fixed codebook 120 by the codebook searcher 122 and outputs the space to which the detected fixed codebook vector belongs.
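The per-space search and the subsequent space determination may be sketched as follows. This is a simplified illustration: the frame length of 8, the two-pulse codebook vectors, the even/odd split, and the function names are hypothetical, and the error is computed directly against the target signal without a synthesis filter.

```python
from itertools import combinations, product

FRAME = 8
# Hypothetical split: space 0 holds even positions, space 1 holds odd positions.
SPACES = {0: [0, 2, 4, 6], 1: [1, 3, 5, 7]}


def build_vector(positions, signs):
    """Place signed unit pulses at the given positions."""
    v = [0.0] * FRAME
    for p, s in zip(positions, signs):
        v[p] = float(s)
    return v


def search_space(target, positions, pulses=2):
    """Return (error, positions, signs, gain) of the best vector in one space."""
    best = None
    energy_t = sum(t * t for t in target)
    for pos in combinations(positions, pulses):
        for signs in product((1, -1), repeat=pulses):
            c = build_vector(pos, signs)
            corr = sum(t * x for t, x in zip(target, c))
            energy_c = sum(x * x for x in c)
            gain = corr / energy_c  # gain minimizing the squared error
            err = energy_t - corr * corr / energy_c
            if best is None or err < best[0]:
                best = (err, pos, signs, gain)
    return best


def determine_space(target):
    """Search every space, then return the space holding the least error."""
    results = {sid: search_space(target, pos) for sid, pos in SPACES.items()}
    space_id = min(results, key=lambda sid: results[sid][0])
    return space_id, results[space_id]
```

In this sketch, `search_space` plays the role of the codebook searcher 122 applied to one space, and `determine_space` plays the role of the space determiner 130.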
The identifier generator 132 generates an identifier indicating the space determined by the space determiner 130. For example, a bit “offset” illustrated in
The adaptive codebook 124 outputs an adaptive codebook vector corresponding to the index output by the pitch analyzer 116.
The gain quantizer 140 quantizes the gain of the fixed codebook 120 output by the codebook searcher 122 and the gain of the adaptive codebook 124 output by the pitch analyzer 116 and outputs the results of the quantizations. The gain quantizer 140 outputs a quantized gain Gc of the fixed codebook 120 to the first multiplier 141 and a quantized gain Gp of the adaptive codebook 124 to the second multiplier 142.
The first multiplier 141 multiplies the fixed codebook vector output by the fixed codebook 120 by the quantized gain Gc of the fixed codebook 120 received from the gain quantizer 140.
The second multiplier 142 multiplies the adaptive codebook vector output by the adaptive codebook 124 by the quantized gain Gp of the adaptive codebook 124 received from the gain quantizer 140.
The adder 110 adds the product received from the first multiplier 141 to the product received from the second multiplier 142.
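The operations of the first multiplier 141, the second multiplier 142, and the adder 110 amount to a gain-scaled sum of the two codebook vectors. A minimal sketch, with the hypothetical function name `build_excitation`:

```python
def build_excitation(fixed_vec, adaptive_vec, gc, gp):
    """Excitation sample = Gc * fixed codebook sample + Gp * adaptive sample."""
    return [gc * f + gp * a for f, a in zip(fixed_vec, adaptive_vec)]
```

For example, `build_excitation([1.0, 0.0, -1.0], [0.5, 0.5, 0.5], 2.0, 0.5)` returns `[2.25, 0.25, -1.75]`.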
The enhancement layer generation unit 150 generates an enhancement layer that supplies bits in addition to the bits provided by the core layer generation unit 100, in order to enhance the quality of the restored sound. For example, when the core layer provides a bit rate of 8 kbps, the enhancement layer may provide an additional bit rate of 4 kbps.
The enhancement layer generation unit 150 includes a second subtractor 152, a second perceptual weighting filter 154, a codebook searcher 156, a gain difference quantizer 158, a fixed codebook 160, a third multiplier 162, and a second synthesis filter 164.
The second subtractor 152 subtracts a result output by the second perceptual weighting filter 154 from a result output by the first subtractor 112.
The second perceptual weighting filter 154 performs a filtering operation so that quantization noise is less than or equal to a masking threshold in order to utilize the masking effect of a human's hearing structure. More specifically, the second perceptual weighting filter 154 produces a signal including a weight in order to minimize the quantization noise of the signal output by the second subtractor 152.
The fixed codebook 160 outputs a fixed codebook vector corresponding to an index obtained by the codebook searcher 156. The fixed codebook 160 of the enhancement layer generation unit 150 is divided into a plurality of spaces corresponding to the spaces (i.e., the first and second spaces 610 and 620 of
The codebook searcher 156 searches the fixed codebook 160 for a fixed codebook vector corresponding to the result of the filtering by the second perceptual weighting filter 154 and outputs an index and a gain of the fixed codebook 160.
When the codebook searcher 156 searches for the fixed codebook vector, spaces of the fixed codebook 160 excluding the space determined by the space determiner 130 of the core layer generation unit 100 are each searched. Accordingly, if each of the fixed codebooks 120 and 160 of the core layer generating unit 100 and the enhancement layer generation unit 150, respectively, is divided into the first and second spaces 610 and 620 (See
The gain difference quantizer 158 obtains a difference between the gain of the fixed codebook 160 output by the codebook searcher 156 of the enhancement layer generation unit 150 and the quantized gain Gc of the fixed codebook 120 output by the gain quantizer 140 of the core layer generation unit 100 and quantizes the difference. The gain difference quantizer 158 outputs the quantized gain difference Gce to the third multiplier 162 and the multiplexing unit 190.
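The gain difference quantization may be sketched as follows. The description does not specify the quantizer, so a uniform scalar quantizer is assumed here; the step size, the number of levels, and the function name `quantize_gain_difference` are hypothetical.

```python
def quantize_gain_difference(enh_gain, core_gain_q, step=0.25, levels=16):
    """Quantize (enhancement gain - quantized core gain Gc).

    Returns (index, Gce), where Gce = index * step. The uniform quantizer,
    step size, and level count are illustrative assumptions.
    """
    diff = enh_gain - core_gain_q
    half = levels // 2
    index = max(-half, min(half - 1, round(diff / step)))
    return index, index * step
```

Because only the difference from the quantized gain Gc is coded, the decoder can recover the enhancement layer gain by adding the decoded difference to Gc.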
The third multiplier 162 multiplies the fixed codebook vector output by the fixed codebook 160 of the enhancement layer generation unit 150 by the quantized gain difference Gce received from the gain difference quantizer 158.
The second synthesis filter 164 generates a synthesized signal corresponding to the product output by the third multiplier 162, using the result of the vector quantization by the LPC coefficient quantizer 106.
The multiplexing unit 190 generates a bitstream from the outputs of the LPC coefficient quantizer 106, the pitch analyzer 116, the codebook searcher 122, the identifier generator 132, the gain quantizer 140, the codebook searcher 156, and the gain difference quantizer 158. The multiplexing unit 190 then outputs the bitstream via an output port OUT.
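Bitstream generation may be sketched as packing the quantized parameters into fixed-width fields. The field names and bit widths below are hypothetical placeholders and are not taken from the description; only the list of contributing units (and the one-bit space identifier for two spaces) follows the text.

```python
# Hypothetical field layout: (name, width in bits). Widths are assumptions.
FIELDS = [
    ("lpc_index", 10),        # LPC coefficient quantizer 106
    ("pitch_index", 8),       # pitch analyzer 116
    ("fcb_index", 12),        # codebook searcher 122
    ("space_id", 1),          # identifier generator 132
    ("gain_index", 7),        # gain quantizer 140
    ("enh_fcb_index", 12),    # codebook searcher 156
    ("gain_diff_index", 4),   # gain difference quantizer 158
]


def pack(values):
    """Concatenate the fields, first field in the most significant bits."""
    word = 0
    for name, width in FIELDS:
        v = values[name]
        if not 0 <= v < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | v
    return word


def unpack(word):
    """Inverse of pack(): recover every field from the packed integer."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values
```

The `unpack` function corresponds to the analysis that a demultiplexing unit would perform on the same layout.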
The demultiplexing unit 200 receives a bitstream via an input port IN and analyzes the bitstream. The demultiplexing unit 200 outputs LPC coefficient quantization information to the LPC coefficient decoding unit 210, an index and identifier of a fixed codebook 222 to a fixed codebook decoder 224, an index of an adaptive codebook 226 to an adaptive codebook decoder 228, an index and identifier of a fixed codebook 232 to a fixed codebook decoder 234, gain quantization information to the gain decoding unit 240, and gain difference quantization information to the gain difference decoding unit 250.
The LPC coefficient decoding unit 210 decodes an LPC coefficient using the LPC coefficient quantization information received from the demultiplexing unit 200.
The core layer decoding unit 220 decodes a core layer. The core layer decoding unit 220 includes the fixed codebook 222, the fixed codebook decoder 224, the adaptive codebook 226, and the adaptive codebook decoder 228.
The fixed codebook 222 of the core layer decoding unit 220 is configured by classifying combinations of possible pulse positions into a plurality of spaces, as in the fixed codebooks 120 and 160 of the core layer generation unit 100 and the enhancement layer generation unit 150 of
The fixed codebook 222 may be configured by classifying combinations of possible pulse positions into the first and second spaces 610 and 620, as illustrated in
The first and second spaces 610 and 620 may be distinguished from each other according to whether the possible pulse positions are even or odd.
Referring back to
The adaptive codebook decoder 228 searches the adaptive codebook 226 for the codeword corresponding to the index output by the demultiplexing unit 200 and decodes the codeword.
The enhancement layer decoding unit 230 decodes an enhancement layer. The enhancement layer decoding unit 230 includes the fixed codebook 232 and the fixed codebook decoder 234.
The fixed codebook 232 is divided into a plurality of spaces corresponding to the spaces of the fixed codebook 222 of the core layer decoding unit 220.
The fixed codebook decoder 234 searches spaces of the fixed codebook 232 excluding the space determined by the fixed codebook decoder 224 of the core layer decoding unit 220 for a codeword corresponding to the index output by the demultiplexing unit 200 and decodes the found codeword. Accordingly, if each of the fixed codebooks 222 and 232 of the core layer decoding unit 220 and the enhancement layer decoding unit 230, respectively, is divided into the first and second spaces 610 and 620, and the first space 610 is determined by the fixed codebook decoder 224, the fixed codebook decoder 234 searches the second space 620 for the codeword. If the second space 620 is determined by the fixed codebook decoder 224, the fixed codebook decoder 234 searches the first space 610 for the codeword.
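The choice of spaces searched by the fixed codebook decoder 234 may be sketched as the complement of the space determined for the core layer. The integer space numbering and the function name `enhancement_spaces` are assumptions for illustration.

```python
def enhancement_spaces(core_space_id, num_spaces=2):
    """Return the spaces to search in the enhancement layer fixed codebook.

    Every space except the one determined for the core layer is returned.
    Spaces are numbered 0..num_spaces-1 in this sketch (0 for the first
    space 610, 1 for the second space 620).
    """
    return [s for s in range(num_spaces) if s != core_space_id]
```

With two spaces, `enhancement_spaces(0)` returns `[1]` and `enhancement_spaces(1)` returns `[0]`, matching the two cases described above.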
The gain decoding unit 240 decodes the gain quantization information received from the demultiplexing unit 200, the information including a fixed codebook gain Gc and an adaptive codebook gain Gp of the core layer, and outputs the fixed codebook gain Gc and the adaptive codebook gain Gp.
The gain difference decoding unit 250 decodes a difference between the gains of the fixed codebooks of the core layer and the enhancement layer output by the demultiplexing unit 200.
The first adder 260 adds a result output by the fixed codebook decoder 224 of the core layer decoding unit 220 to a result output by the fixed codebook decoder 234 of the enhancement layer decoding unit 230.
The first switching unit 270 selects either the result output by the fixed codebook decoder 224 or the result of the addition by the first adder 260 according to a control signal.
The third adder 268 adds the fixed codebook gain Gc of the core layer output by the gain decoding unit 240 to a result output by the gain difference decoding unit 250.
The second switching unit 275 selects either the fixed codebook gain Gc of the core layer output by the gain decoding unit 240 or the result of the addition by the third adder 268 according to a control signal.
The second multiplier 264 multiplies the result output by the first switching unit 270 by the result output by the second switching unit 275.
The first multiplier 262 multiplies the result of the decoding by the adaptive codebook decoder 228 by the adaptive codebook gain Gp output by the gain decoding unit 240.
The second adder 266 adds the result of the multiplication by the first multiplier 262 to the result of the multiplication by the second multiplier 264.
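The signal path formed by the adders 260, 266, and 268, the switching units 270 and 275, and the multipliers 262 and 264 may be sketched as follows. The sketch assumes that both switching units are driven by the same control signal, represented by the hypothetical flag `use_enhancement`; the description only states that each switch operates according to a control signal.

```python
def decode_excitation(core_vec, enh_vec, adaptive_vec, gc, gce, gp,
                      use_enhancement):
    """Rebuild the excitation from the decoded codebook vectors and gains."""
    if use_enhancement:
        # First adder 260 and third adder 268 outputs are selected.
        fixed = [c + e for c, e in zip(core_vec, enh_vec)]
        gain = gc + gce
    else:
        # Core layer only: the switches pass the core vector and Gc.
        fixed = list(core_vec)
        gain = gc
    # Second multiplier 264, first multiplier 262, and second adder 266.
    return [gp * a + gain * f for a, f in zip(adaptive_vec, fixed)]
```

The resulting excitation is what the synthesis filter 280 would receive in order to restore the speech signal.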
The synthesis filter 280 synthesizes the result of the addition by the second adder 266 using the decoded LPC coefficient received from the LPC coefficient decoding unit 210, to thereby restore the speech signal.
The postprocessing unit 290 improves the quality of the speech signal restored by the synthesis filter 280 and outputs the improved speech signal via an output port OUT. More specifically, the postprocessing unit 290 filters the restored speech signal using a high pass filter and the decoded LPC coefficient output by the LPC coefficient decoding unit 210, in order to improve the quality of the speech signal restored by the synthesis filter 280.
A codebook searching apparatus according to embodiments of the present general inventive concept is included in the speech signal encoding apparatus of
In operation 304, an LPC coefficient is extracted from the speech signal from which the DC component has been removed in the operation 302.
In operation 306, the LPC coefficient extracted in the operation 304 is vector quantized.
In operation 308, a subtractor subtracts a signal output by a synthesis filter of a core layer from the speech signal from which the DC component has been removed.
In operation 310, in order to use the masking effect of a human's hearing structure, a perceptual weighting filter of the core layer filters the result of the subtraction in the operation 308 so that quantization noise becomes less than or equal to a masking threshold. In the operation 310, a signal including a weight is generated so as to minimize the quantization noise of the signal output in the operation 308.
In operation 312, the signal filtered in the operation 310 is divided into a plurality of sub-frames, and the pitch of each of the sub-frames is analyzed to output an index and gain of an adaptive codebook.
In operation 314, a target signal needed to search a fixed codebook for a fixed codebook vector corresponding to the signal filtered in the operation 310 is detected using the index of the adaptive codebook output in the operation 312.
In operation 316, the fixed codebook is searched for a fixed codebook vector corresponding to the target signal detected in the operation 314. In the operation 316, a fixed codebook vector that minimizes a mean squared error (MSE) of the target signal is searched for.
The fixed codebook of the core layer is configured by classifying combinations of possible pulse positions into a plurality of spaces.
As illustrated in
The first and second spaces 610 and 620 may be distinguished from each other according to whether possible pulse positions are even or odd.
Referring back to
In operation 318, the least distorted fixed codebook vector is detected from the fixed codebook vectors found in the spaces of the fixed codebook of the core layer, and the space from which the detected fixed codebook vector is found is output. In the operation 318, an index and gain of the fixed codebook belonging to the determined space are output.
In operation 320, an identifier indicating the space determined in the operation 318 is generated. For example, the bit “offset” illustrated in
In operation 322, the gain of the fixed codebook output in the operation 318 and the gain of the adaptive codebook output in operation 312 are quantized to generate a quantized fixed codebook gain Gc and a quantized adaptive codebook gain Gp.
In operation 324, the fixed codebook vector detected in the operation 318 is multiplied by the quantized fixed codebook gain Gc generated in the operation 322.
In operation 326, the adaptive codebook vector detected in the operation 312 is multiplied by the quantized adaptive codebook gain Gp generated in the operation 322.
In operation 328, the result of the multiplication in the operation 324 is added to the result of the multiplication in the operation 326.
In operation 330, a synthesis filter outputs a synthetic signal corresponding to an excitation signal obtained in the operation 328, using the result of the vector quantization in operation 306.
After the operation 308, a signal corresponding to the result of the subtraction in the operation 308 is filtered so that quantization noise of the signal becomes less than or equal to a masking threshold, in order to utilize the masking effect of the human's hearing structure, in operation 354. In other words, in the operation 354, a signal including a weight is generated so as to minimize the quantization noise of the signal obtained in the operation 308.
In operation 356, a fixed codebook vector corresponding to the result of the filtering in the operation 354 is searched for in the fixed codebook, and an index and a gain of the found fixed codebook vector are output.
The fixed codebook of the enhancement layer is divided into a plurality of spaces corresponding to the spaces of the fixed codebook of the core layer.
Upon the fixed codebook vector search in the operation 356, spaces of the fixed codebook of the enhancement layer excluding the space determined in the operation 318 are each searched. Accordingly, if each of the fixed codebooks of the core layer and the enhancement layer is divided into the first and second spaces 610 and 620 (See
In operation 358, a difference between the gain of the fixed codebook output in the operation 356 and the quantized gain Gc of the fixed codebook output in the operation 322 is obtained and quantized to generate a quantized gain difference Gce.
In operation 360, the fixed codebook vector output in the operation 356 is multiplied by the quantized gain difference Gce output in the operation 358.
In operation 362, a synthesis filter generates a synthesized signal corresponding to the result of the multiplication in the operation 360, using the result of the vector quantization in the operation 306.
In operation 380, a bitstream is generated from the results output in the operations 306, 312, 318, 320, 322, 356, and 358.
In operation 405, an LPC coefficient is decoded using the LPC coefficient quantization information output in the operation 400.
In operation 415, a to-be-searched space of the spaces of the fixed codebook of the core layer is determined using the identifier output in the operation 400, the determined space is searched for a codeword corresponding to the index output in the operation 400, and the codeword is decoded. Here, the identifier represents a specific space provided in the fixed codebook of the core layer as a bit “offset” illustrated in
The fixed codebook of the core layer is configured by classifying combinations of possible pulse positions into a plurality of spaces, as in the fixed codebook of the enhancement layer.
The fixed codebook of the core layer may be configured by classifying combinations of possible pulse positions into the first and second spaces 610 and 620, as illustrated in
The first and second spaces 610 and 620 may be distinguished from each other according to whether possible pulse positions are even or odd.
Referring back to
In operation 425, a codeword corresponding to the index of the fixed codebook of the enhancement layer output in the operation 400 is searched for in spaces of the fixed codebook of the enhancement layer excluding the space determined in the operation 415 and is decoded. Accordingly, if each of the fixed codebooks of the core layer and the enhancement layer is divided into the first and second spaces 610 and 620 (See
The fixed codebook of the enhancement layer is configured by classifying combinations of possible pulse positions into spaces corresponding to the spaces of the fixed codebook of the core layer.
In operation 430, the fixed codebook gain and the adaptive codebook gain output in the operation 400 are decoded.
In operation 435, a difference between the fixed codebook gains of the core layer and the enhancement layer output in the operation 400 is decoded.
In operation 440, a predetermined operation is executed on the results of the decoding in the operations 415, 420, 430, and 435.
In operation 445, the result of the operation performed in the operation 440 is synthesized in a synthesis filter using the decoded LPC coefficient output in the operation 405, to thereby restore the speech signal.
In operation 450, the quality of the speech signal restored in the operation 445 is improved to thereby output an improved restored speech signal. More specifically, in the operation 450, the restored speech signal is filtered using a high pass filter and the decoded LPC coefficient output in the operation 405.
A codebook searching method according to embodiments of the present general inventive concept is performed during the speech signal encoding method of
The first space 610 may include the possible positions of pulses that are highly likely to be searched for in a core layer.
The first and second spaces 610 and 620 may be distinguished from each other according to whether possible pulse positions are even or odd.
Referring back to
In operation 510, a distorted value D1 of the fixed codebook vector selected from the second space 620 of the fixed codebook of the core layer in the operation 500 is subtracted from a distorted value D0 of the fixed codebook vector selected from the first space 610 of the fixed codebook of the core layer in the operation 500.
In operation 520, it is determined whether a value D0-D1 corresponding to the result of the subtraction in the operation 510 is larger than 0.
In operation 530, if it is determined in the operation 520 that the value D0-D1 is larger than 0, an identifier of the first space 610 of the fixed codebook of the core layer is generated. Here, the identifier represents a specific space provided in the fixed codebook of the core layer as a bit “offset” illustrated in
After the operation 530, in operation 540, only the second space 620 of the fixed codebook of the enhancement layer is searched for a fixed codebook vector.
In operation 550, if it is determined in the operation 520 that the value D0-D1 is less than or equal to 0, an identifier of the second space 620 of the fixed codebook of the core layer is generated.
In operation 560, only the first space 610 of the fixed codebook of the enhancement layer is searched for a fixed codebook vector.
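The decision flow of operations 510 through 560 may be sketched as follows. The sketch follows the comparison exactly as stated above and does not interpret how the values D0 and D1 are computed; the constants `FIRST_SPACE` and `SECOND_SPACE`, the dictionary keys, and the function name `select_spaces` are hypothetical.

```python
FIRST_SPACE = 610
SECOND_SPACE = 620


def select_spaces(d0, d1):
    """Apply operations 510-560 to the values D0 and D1.

    Returns the core layer space for which an identifier is generated and
    the enhancement layer space that is subsequently searched.
    """
    if d0 - d1 > 0:  # operations 510 and 520
        # Operations 530 and 540.
        return {"core_identifier": FIRST_SPACE,
                "enhancement_search": SECOND_SPACE}
    # Operations 550 and 560.
    return {"core_identifier": SECOND_SPACE,
            "enhancement_search": FIRST_SPACE}
```

In both branches, the enhancement layer searches only the space other than the one identified for the core layer.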
In a fixed codebook searching method and apparatus according to embodiments of the present general inventive concept and a speech signal encoding/decoding method and apparatus using the fixed codebook searching method and apparatus, in order to reduce a bit rate without degrading performance in an enhancement layer based on CELP, each of a fixed codebook of a core layer and a fixed codebook of the enhancement layer is divided into a plurality of spaces. Accordingly, the spaces of the fixed codebook of the enhancement layer are searched, excluding the space corresponding to the least distorted space determined from among the spaces of the fixed codebook of the core layer.
By doing this, bits for the position values represented with underlining do not need to be allocated to the fixed codebooks of
The general inventive concept can be embodied as computer readable codes on a computer readable recording medium, where a computer denotes any device having an information processing function. The computer readable recording medium is any data storage device that can store programs or data which can be thereafter read by a computer system. Examples of the computer readable recording medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, hard disks, floppy disks, flash memory, optical data storage devices, and so on.
Although a few embodiments of the present general inventive concept have been shown and described, it will be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the general inventive concept, the scope of which is defined in the appended claims and their equivalents.
| Number | Date | Country | Kind |
|---|---|---|---|
| 2006-47118 | May 2006 | KR | national |