Embodiments of the present invention relate to the field of communications technologies, and in particular, to a method for predicting a bandwidth extension frequency band signal, and a decoding device.
In the field of digital communications, there are extremely widespread application requirements for voice, picture, audio, and video transmission, such as a phone call, an audio and video conference, broadcast television, and multimedia entertainment. To reduce a resource occupied in a process of storing or transmitting an audio and video signal, an audio and video compression and encoding technology comes into existence. Many different technical branches emerge in the development of the audio and video compression and encoding technology, where a technology in which a signal is encoded and processed after being transformed from a time domain to a frequency domain is widely applied due to a good compression characteristic, and the technology is also referred to as a domain transformation encoding technology.
An increasing emphasis is placed on audio quality in communication transmission; therefore, there is a need to increase quality of a music signal as much as possible on a premise that voice quality is ensured. Meanwhile, the amount of information of an audio signal is extremely rich; therefore, a code excited linear prediction (CELP) encoding mode of conventional voice cannot be adopted; instead, generally, to process the audio signal, a time domain signal is transformed into a frequency domain signal using an audio encoding technology of domain transformation encoding, thereby enhancing encoding quality of the audio signal.
In an existing audio encoding technology, generally, by adopting a transformation technology, such as a fast Fourier transform (FFT) or a modified discrete cosine transform (MDCT) or a discrete cosine transform (DCT), a high frequency band signal in an audio signal is transformed from a time domain signal to a frequency domain signal, and then, the frequency domain signal is encoded.
In the case of a low bit rate, limited quantization bits cannot quantize all to-be-quantized audio signals; therefore, an encoding device uses most bits to precisely quantize relatively important low frequency band signals in audio signals, that is, quantization parameters of the low frequency band signals occupy most bits, and only a few bits are used to roughly quantize and encode high frequency band signals in the audio signals to obtain frequency envelopes of the high frequency band signals. Then, the frequency envelopes of the high frequency band signals and the quantization parameters of the low frequency band signals are sent to a decoding device in a form of a bitstream. The quantization parameters of the low frequency band signals may include excitation signals and frequency envelopes. When being quantized, the low frequency band signals may first also be transformed from time domain signals to frequency domain signals, and then, the frequency domain signals are quantized and encoded into excitation signals.
Generally, the decoding device may restore the low frequency band signals according to the quantization parameters that are of the low frequency band signals and in the received bitstream, then acquire the excitation signals of the low frequency band signals according to the low frequency band signals, predict excitation signals of the high frequency band signals using a bandwidth extension (BWE) technology and a spectrum filling technology and according to the excitation signals of the low frequency band signals, and modify the predicted excitation signals of the high frequency band signals according to the frequency envelopes that are of the high frequency band signals and in the bitstream, to obtain the predicted high frequency band signals. Herein, the obtained high frequency band signals are frequency domain signals.
In the BWE technology, a highest frequency bin to which a bit is allocated may be a highest frequency bin to which an excitation signal is decoded, that is, no excitation signal is decoded on a frequency bin greater than the highest frequency bin. A frequency band greater than the highest frequency bin to which a bit is allocated may be referred to as a high frequency band, and a frequency band less than the highest frequency bin to which a bit is allocated may be referred to as a low frequency band. That an excitation signal of a high frequency band signal is predicted according to an excitation signal of a low frequency band signal may be as follows. The highest frequency bin to which a bit is allocated is used as a center, an excitation signal that is of the low frequency band signal and less than the highest frequency bin to which a bit is allocated is copied into a high frequency band signal that is greater than the highest frequency bin to which a bit is allocated and whose bandwidth is equivalent to bandwidth of the low frequency band signal, and the excitation signal is used as the excitation signal of the high frequency band signal.
In a process of implementing the present invention, the inventor finds that at least the following problem exists in the prior art. According to the foregoing prior-art method for predicting a bandwidth extension frequency band signal, an excitation signal of a high frequency band signal is predicted according to an excitation signal of a low frequency band signal. As a result, excitation signals of different low frequency band signals may be copied into a same high frequency band signal in different frames, which causes discontinuity of the excitation signal and reduces quality of the predicted bandwidth extension frequency band signal, thereby reducing auditory quality of an audio signal.
Embodiments of the present invention provide a method for predicting a bandwidth extension frequency band signal, and a decoding device, so as to improve quality of the predicted bandwidth extension frequency band signal, thereby enhancing auditory quality of an audio signal.
According to a first aspect, an embodiment of the present invention provides a method for predicting a bandwidth extension frequency band signal. The method includes demultiplexing a received bitstream and decoding the demultiplexed bitstream to obtain a frequency domain signal; determining whether a highest frequency bin, to which a bit is allocated, of the frequency domain signal is less than a preset start frequency bin of a bandwidth extension frequency band; when the highest frequency bin to which a bit is allocated is less than the preset start frequency bin of the bandwidth extension frequency band, predicting an excitation signal of the bandwidth extension frequency band according to an excitation signal within a predetermined frequency band range of the frequency domain signal and the preset start frequency bin of the bandwidth extension frequency band; when the highest frequency bin to which a bit is allocated is greater than or equal to the preset start frequency bin of the bandwidth extension frequency band, predicting the excitation signal of the bandwidth extension frequency band according to the excitation signal within the predetermined frequency band range of the frequency domain signal, the preset start frequency bin of the bandwidth extension frequency band, and the highest frequency bin to which a bit is allocated; and predicting the bandwidth extension frequency band signal according to the predicted excitation signal of the bandwidth extension frequency band and a frequency envelope of the bandwidth extension frequency band.
With reference to the first aspect, in a first implementation manner of the first aspect, predicting an excitation signal of the bandwidth extension frequency band according to an excitation signal within a predetermined frequency band range of the frequency domain signal and the preset start frequency bin of the bandwidth extension frequency band includes making n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal, and using the n copies of the excitation signal as an excitation signal between the preset start frequency bin of the bandwidth extension frequency band and a highest frequency bin of the bandwidth extension frequency band, where n is an integer or a non-integer greater than 0, and n is equal to a ratio of a quantity of frequency bins between the preset start frequency bin of the bandwidth extension frequency band and the highest frequency bin of the bandwidth extension frequency band to a quantity of frequency bins within the predetermined frequency band range of the frequency domain signal.
With reference to the first aspect and the foregoing implementation manner of the first aspect, in a second implementation manner of the first aspect, making n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal, and using the n copies of the excitation signal as an excitation signal between the preset start frequency bin of the bandwidth extension frequency band and a highest frequency bin of the bandwidth extension frequency band includes, when the prediction is started from the preset start frequency bin of the bandwidth extension frequency band, sequentially making integer copies in the n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal and non-integer copies in the n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal, and using the two parts of excitation signals as the excitation signal between the preset start frequency bin of the bandwidth extension frequency band and the highest frequency bin of the bandwidth extension frequency band, where the non-integer part of n is less than 1; or, when the prediction is started from the highest frequency bin of the bandwidth extension frequency band, sequentially making non-integer copies in the n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal and integer copies in the n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal, and using the two parts of excitation signals as the excitation signal between the preset start frequency bin of the bandwidth extension frequency band and the highest frequency bin of the bandwidth extension frequency band, where the non-integer part of n is less than 1.
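The copying described in the first and second implementation manners can be illustrated with a minimal sketch. The function name fill_band_with_copies and its parameters are hypothetical and are used only for illustration. The sketch assumes that the non-integer copy is a contiguous slice of the source excitation: the lowest bins of the source when the prediction is started from the start frequency bin, and the highest bins of the source when the prediction is started from the highest frequency bin. The text above does not fix which slice is used, so this choice is an assumption.

```python
def fill_band_with_copies(source, target_len, from_start=True):
    """Fill target_len bins with n = target_len / len(source) copies of source.

    source      -- excitation within the predetermined frequency band range
    target_len  -- quantity of bins between the start bin and the highest bin
                   of the bandwidth extension frequency band
    from_start  -- True: integer copies first, then the non-integer copy;
                   False: non-integer copy first, then the integer copies
    """
    source = list(source)
    if not source:
        raise ValueError("source excitation must not be empty")
    if target_len < 0:
        raise ValueError("target_len must be non-negative")

    # Quotient = number of whole copies, remainder = bins in the partial copy.
    whole, rest = divmod(target_len, len(source))

    if from_start:
        # Assumption: the partial copy takes the lowest `rest` bins of source.
        return source * whole + source[:rest]
    # Assumption: the partial copy takes the highest `rest` bins of source,
    # so that the whole copies end exactly at the highest frequency bin.
    return source[len(source) - rest:] + source * whole


if __name__ == "__main__":
    exc = [1, 2, 3]
    print(fill_band_with_copies(exc, 8, from_start=True))
    print(fill_band_with_copies(exc, 8, from_start=False))
```

In this sketch n is never computed as a floating-point value; the quotient and the remainder of the two bin counts carry the same information and avoid rounding.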
With reference to the first aspect, in a third implementation manner of the first aspect, predicting the excitation signal of the bandwidth extension frequency band according to the excitation signal within the predetermined frequency band range of the frequency domain signal, the preset start frequency bin of the bandwidth extension frequency band, and the highest frequency bin, to which a bit is allocated, of the frequency domain signal includes making a copy of an excitation signal from the mth frequency bin fexc
With reference to the first aspect and the foregoing implementation manners of the first aspect, in a fourth implementation manner of the first aspect, making a copy of an excitation signal from the mth frequency bin fexc
With reference to the first aspect and the foregoing implementation manners of the first aspect, in a fifth implementation manner of the first aspect, before the predicting of the bandwidth extension frequency band signal according to the predicted excitation signal of the bandwidth extension frequency band and a frequency envelope of the bandwidth extension frequency band, the method further includes decoding the bitstream to obtain the frequency envelope of the bandwidth extension frequency band.
With reference to the first aspect and the foregoing implementation manners of the first aspect, in a sixth implementation manner of the first aspect, before the predicting of the bandwidth extension frequency band signal according to the predicted excitation signal of the bandwidth extension frequency band and a frequency envelope of the bandwidth extension frequency band, the method further includes decoding the bitstream to obtain a signal type; and acquiring the frequency envelope of the bandwidth extension frequency band according to the signal type.
With reference to the first aspect and the foregoing implementation manners of the first aspect, in a seventh implementation manner of the first aspect, acquiring the frequency envelope of the bandwidth extension frequency band according to the signal type includes, when the signal type is a non-harmonic signal, demultiplexing the received bitstream, and decoding the demultiplexed bitstream to obtain the frequency envelope of the bandwidth extension frequency band; or, when the signal type is a harmonic signal, demultiplexing the received bitstream, decoding the demultiplexed bitstream to obtain an initial frequency envelope of the bandwidth extension frequency band, and using a value that is obtained by performing weighting calculation on the initial frequency envelope and N adjacent initial frequency envelopes as the frequency envelope of the bandwidth extension frequency band, where N is greater than or equal to 1.
According to a second aspect, an embodiment of the present invention provides a decoding device, including a decoding module configured to demultiplex a received bitstream and decode the demultiplexed bitstream to obtain a frequency domain signal; a determining module configured to determine whether a highest frequency bin, to which a bit is allocated, of the frequency domain signal is less than a preset start frequency bin of a bandwidth extension frequency band; a first processing module configured to, when the determining module determines that the highest frequency bin to which a bit is allocated is less than the preset start frequency bin of the bandwidth extension frequency band, predict an excitation signal of the bandwidth extension frequency band according to an excitation signal within a predetermined frequency band range of the frequency domain signal and the preset start frequency bin of the bandwidth extension frequency band; a second processing module configured to, when the determining module determines that the highest frequency bin to which a bit is allocated is greater than or equal to the preset start frequency bin of the bandwidth extension frequency band, predict the excitation signal of the bandwidth extension frequency band according to the excitation signal within the predetermined frequency band range of the frequency domain signal, the preset start frequency bin of the bandwidth extension frequency band, and the highest frequency bin to which a bit is allocated; and a predicting module configured to predict a bandwidth extension frequency band signal according to the predicted excitation signal of the bandwidth extension frequency band and a frequency envelope of the bandwidth extension frequency band.
With reference to the second aspect, in a first implementation manner of the second aspect, the first processing module is configured to make n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal and use the n copies of the excitation signal as an excitation signal between the preset start frequency bin of the bandwidth extension frequency band and a highest frequency bin of the bandwidth extension frequency band, where n is an integer or a non-integer greater than 0, and n is equal to a ratio of a quantity of frequency bins between the preset start frequency bin of the bandwidth extension frequency band and the highest frequency bin of the bandwidth extension frequency band to a quantity of frequency bins within the predetermined frequency band range of the frequency domain signal.
With reference to the second aspect and the foregoing implementation manner of the second aspect, in a second implementation manner of the second aspect, the first processing module is configured to, when the prediction is started from the preset start frequency bin of the bandwidth extension frequency band, sequentially make integer copies in the n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal and non-integer copies in the n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal, and use the two parts of excitation signals as the excitation signal between the preset start frequency bin of the bandwidth extension frequency band and the highest frequency bin of the bandwidth extension frequency band, where the non-integer part of n is less than 1; or the first processing module is configured to, when the prediction is started from the highest frequency bin of the bandwidth extension frequency band, sequentially make non-integer copies in the n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal and integer copies in the n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal, and use the two parts of excitation signals as the excitation signal between the preset start frequency bin of the bandwidth extension frequency band and the highest frequency bin of the bandwidth extension frequency band, where the non-integer part of n is less than 1.
With reference to the second aspect, in a third implementation manner of the second aspect, the second processing module is configured to make a copy of an excitation signal from the mth frequency bin above a start frequency bin fexc
With reference to the second aspect and the foregoing implementation manners of the second aspect, in a fourth implementation manner of the second aspect, the second processing module is configured to, when the prediction is started from the highest frequency bin to which a bit is allocated, sequentially make a copy of the excitation signal from the fexc
With reference to the second aspect and the foregoing implementation manners of the second aspect, in a fifth implementation manner of the second aspect, the decoding module is further configured to, before the predicting module predicts the bandwidth extension frequency band signal according to the predicted excitation signal of the bandwidth extension frequency band and the frequency envelope of the bandwidth extension frequency band, decode the bitstream to obtain the frequency envelope of the bandwidth extension frequency band.
With reference to the second aspect and the foregoing implementation manners of the second aspect, in a sixth implementation manner of the second aspect, the device further includes an acquiring module, where the decoding module is further configured to, before the predicting module predicts the bandwidth extension frequency band signal according to the predicted excitation signal of the bandwidth extension frequency band and the frequency envelope of the bandwidth extension frequency band, decode the bitstream to obtain a signal type; and the acquiring module is configured to acquire the frequency envelope of the bandwidth extension frequency band according to the signal type.
With reference to the second aspect and the foregoing implementation manners of the second aspect, in a seventh implementation manner of the second aspect, the acquiring module is configured to, when the signal type is a non-harmonic signal, demultiplex the received bitstream and decode the demultiplexed bitstream to obtain the frequency envelope of the bandwidth extension frequency band; or the acquiring module is configured to, when the signal type is a harmonic signal, demultiplex the received bitstream, decode the demultiplexed bitstream to obtain an initial frequency envelope of the bandwidth extension frequency band, and use a value that is obtained by performing weighting calculation on the initial frequency envelope and N adjacent initial frequency envelopes as the frequency envelope of the bandwidth extension frequency band, where N is greater than or equal to 1.
According to the method for predicting a bandwidth extension frequency band signal, and the decoding device in the embodiments of the present invention, a start frequency bin of bandwidth extension is set, and a highest frequency bin to which a frequency domain signal is decoded and the start frequency bin are compared, to perform excitation restoration of a bandwidth extension frequency band, so that extended excitation signals are continuous between frames, and a frequency bin of a decoded excitation signal is maintained, thereby ensuring auditory quality of a restored bandwidth extension frequency band signal and enhancing auditory quality of an output audio signal.
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments or the prior art. The accompanying drawings in the following description show some embodiments of the present invention, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative efforts.
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. The described embodiments are some but not all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts shall fall within the protection scope of the present invention.
In the field of digital signal processing, an audio coder-decoder (codec) and a video codec are widely applied to various electronic devices such as a mobile phone, a wireless apparatus, a personal data assistant (PDA), a handheld or portable computer, a global positioning system (GPS) receiver/navigator, a camera, an audio/video player, a camcorder, a videorecorder, and a monitoring device. Generally, this type of electronic device includes an audio coder or an audio decoder, where the audio coder or decoder may be directly implemented by a digital circuit or a chip such as a digital signal processor (DSP), or be implemented by driving, by software code, a processor to execute a process in the software code.
For example, an audio encoder first performs framing processing on an input signal to obtain time domain data with one frame being 20 milliseconds (ms), then performs windowing processing on the time domain data to obtain a windowed signal, performs frequency domain transformation on the windowed time domain signal to transform the signal from the time domain to the frequency domain, encodes the frequency domain signal, and transmits the encoded frequency domain signal to a decoder side. After receiving the compressed bitstream transmitted by the encoder side, the decoder side performs a corresponding decoding operation on the bitstream, performs, on the frequency domain signal obtained by decoding, an inverse transformation corresponding to the transformation used by the encoder side, to transform the signal from the frequency domain to the time domain, and performs post-processing on the time domain signal to obtain a synthesized signal, that is, the signal output by the decoder side.
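The encoder-side framing, windowing, and frequency domain transformation described above can be sketched as follows. This is an illustration only: the sampling rate, the sine window, the absence of frame overlap, and the plain discrete Fourier transform are all assumptions, and the function names are hypothetical. A practical codec would typically use overlapping frames and an MDCT or an FFT.

```python
import cmath
import math


def frame_signal(samples, sample_rate=16000, frame_ms=20):
    """Split samples into consecutive frames of frame_ms milliseconds.

    Assumption: frames do not overlap and a trailing partial frame is dropped.
    """
    frame_len = sample_rate * frame_ms // 1000
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def window_frame(frame):
    """Apply a sine window (an assumed window shape) to one frame."""
    n = len(frame)
    return [x * math.sin(math.pi * (k + 0.5) / n) for k, x in enumerate(frame)]


def dft(frame):
    """Naive discrete Fourier transform, used here in place of an FFT or MDCT."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


if __name__ == "__main__":
    rate = 1000  # small assumed rate so that a 20 ms frame has only 20 samples
    tone = [math.cos(2 * math.pi * 100 * t / rate) for t in range(45)]
    frames = frame_signal(tone, sample_rate=rate, frame_ms=20)
    spectrum = dft(window_frame(frames[0]))
    print(len(frames), len(spectrum))
```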
As shown in
As shown in
100. The decoding device demultiplexes a received bitstream and decodes the demultiplexed bitstream to obtain a frequency domain signal.
101. The decoding device determines whether a highest frequency bin, to which a bit is allocated, of the frequency domain signal is less than a preset start frequency bin of a bandwidth extension frequency band; when the highest frequency bin to which a bit is allocated is less than the preset start frequency bin of the bandwidth extension frequency band, the decoding device executes step 102; otherwise, when the highest frequency bin to which a bit is allocated is greater than or equal to the preset start frequency bin of the bandwidth extension frequency band, the decoding device executes step 103.
102. The decoding device predicts an excitation signal of the bandwidth extension frequency band according to an excitation signal within a predetermined frequency band range of the frequency domain signal and the preset start frequency bin of the bandwidth extension frequency band, and executes step 104.
103. The decoding device predicts the excitation signal of the bandwidth extension frequency band according to the excitation signal within the predetermined frequency band range of the frequency domain signal, the preset start frequency bin of the bandwidth extension frequency band, and the highest frequency bin to which a bit is allocated, and executes step 104.
104. The decoding device predicts the bandwidth extension frequency band signal according to the predicted excitation signal of the bandwidth extension frequency band and a frequency envelope of the bandwidth extension frequency band.
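The branch taken in step 101 and the shaping in step 104 can be sketched as follows. The function names are hypothetical. The sketch assumes that step 104 multiplies each predicted excitation bin by the frequency envelope of the sub-band that contains it, and that all sub-bands have the same width; the steps above do not specify either point.

```python
def choose_prediction_step(highest_allocated_bin, bwe_start_bin):
    """Step 101: return 102 or 103 depending on the comparison."""
    if highest_allocated_bin < bwe_start_bin:
        return 102  # predict from the predetermined range and the start bin
    return 103      # also take the highest allocated bin into account


def shape_excitation(excitation, envelopes, bins_per_subband):
    """Step 104 (assumed form): scale each bin by its sub-band envelope."""
    if bins_per_subband <= 0:
        raise ValueError("bins_per_subband must be positive")
    if len(excitation) != len(envelopes) * bins_per_subband:
        raise ValueError("excitation length must match the sub-band layout")
    return [e * envelopes[i // bins_per_subband] for i, e in enumerate(excitation)]


if __name__ == "__main__":
    print(choose_prediction_step(30, 40))
    print(shape_excitation([1.0, -1.0, 0.5, 0.5], [2.0, 4.0], 2))
```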
According to the method for predicting a bandwidth extension frequency band signal in this embodiment, a start frequency bin of bandwidth extension is set, and a highest frequency bin to which a frequency domain signal is decoded and the start frequency bin are compared, to perform excitation restoration of a bandwidth extension frequency band, so that extended excitation signals are continuous between frames, and a frequency bin of a decoded excitation signal is maintained, thereby ensuring auditory quality of a restored bandwidth extension frequency band signal and enhancing auditory quality of an output audio signal.
Optionally, on the basis of the technical solutions of the foregoing embodiment, the following extension technical solutions may also be included to form an extended embodiment of the embodiment shown in
For a specific process in which the decoding device acquires the excitation signal of the low frequency band signal according to the quantization parameter of the low frequency band signal, refer to the prior art. For example, when the quantization parameter of the low frequency band signal includes the excitation signal of the low frequency band signal and a frequency envelope of the low frequency band signal, the decoding device may acquire the excitation signal of the low frequency band signal as follows. The decoding device first restores the low frequency band signal (herein, the low frequency band signal is a frequency domain signal) according to the excitation signal of the low frequency band signal and the frequency envelope of the low frequency band signal, and then performs self-adaptive normalization processing on the low frequency band signal, to obtain the excitation signal of the low frequency band signal. Alternatively, when the excitation signal of the low frequency band signal that is carried in the quantization parameter can meet an energy requirement of a high frequency band signal when it is used to predict the excitation signal of the bandwidth extension frequency band, that excitation signal may be used directly to predict the excitation signal of the bandwidth extension frequency band.
The self-adaptive normalization processing may be performed in either of the following manners.
(1) The decoding device restores the low frequency band signal using the decoded quantization parameter of the low frequency band signal (such as the excitation signal of the low frequency band signal and the frequency envelope of the low frequency band signal). A moving window is set over the frequency domain coefficients, and an average value of the frequency domain coefficient amplitudes in each moving window is calculated, where the quantity of calculated average values is the same as the quantity of frequency domain coefficients of the low frequency band signal. Each frequency domain coefficient of the low frequency band signal (the frequency domain signal) is then divided by the corresponding average value of frequency domain coefficient amplitudes, to obtain the excitation signal of the low frequency band signal. For example, the low frequency band signal has N1 frequency domain coefficients. An average value of the amplitudes of the first to the tenth frequency domain coefficients is calculated, an average value of the amplitudes of the second to the eleventh frequency domain coefficients is calculated, and an average value of the amplitudes of the third to the twelfth frequency domain coefficients is calculated. By analogy, N1 average values are calculated. Then, the N1 frequency domain coefficients of the low frequency band signal are divided by the corresponding average values, to obtain the excitation signal of the low frequency band signal (the frequency domain signal).
(2) The decoding device restores the low frequency band signal (the frequency domain signal) by decoding the quantization parameter of the low frequency band signal (such as the excitation signal of the low frequency band signal and the frequency envelope of the low frequency band signal). For a harmonic signal, an average value of N (N>1) adjacent frequency envelopes of the low frequency band signal is calculated and used as a frequency envelope of N adjacent sub-bands, and all frequency domain signals of the N adjacent sub-bands are divided by the average value, to obtain an excitation signal of the low frequency band signals of the N adjacent sub-bands. By analogy, the excitation signal of the entire low frequency band signal is calculated. For a non-harmonic signal, each sub-band of the low frequency band signal is further divided into M (M>1) small sub-bands, a frequency envelope is further calculated for each small sub-band, and a frequency domain signal of the small sub-band is divided by the calculated frequency envelope of the small sub-band, to obtain an excitation signal of the small sub-band. By analogy, the excitation signal of the entire low frequency band signal is obtained. For a detailed process of self-adaptive normalization processing, refer to the prior art. Details are not described herein again.
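Manner (1) and the harmonic case of manner (2) can be sketched as follows. The function names are hypothetical. Three points are assumptions that the description leaves open: a moving window that reaches past the last coefficient is truncated at that coefficient, a window or sub-band group whose average is zero yields zero excitation, and all sub-bands have the same width.

```python
def moving_window_normalize(coeffs, window=10):
    """Manner (1): divide each coefficient by a moving average of amplitudes.

    The i-th average covers coefficients i .. i + window - 1, matching the
    example of the first to tenth, second to eleventh, and so on.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    excitation = []
    for i in range(len(coeffs)):
        segment = coeffs[i:i + window]  # assumption: truncated at the upper edge
        mean_amp = sum(abs(c) for c in segment) / len(segment)
        # Assumption: an all-zero window produces zero excitation.
        excitation.append(coeffs[i] / mean_amp if mean_amp > 0 else 0.0)
    return excitation


def harmonic_normalize(coeffs, envelopes, bins_per_subband, group=2):
    """Manner (2), harmonic case: share one averaged envelope per N sub-bands."""
    if group < 2:
        raise ValueError("group (N) must be greater than 1")
    if len(coeffs) != len(envelopes) * bins_per_subband:
        raise ValueError("coefficient count must match the sub-band layout")
    excitation = []
    for first in range(0, len(envelopes), group):
        env_group = envelopes[first:first + group]
        mean_env = sum(env_group) / len(env_group)
        lo = first * bins_per_subband
        hi = lo + len(env_group) * bins_per_subband
        for c in coeffs[lo:hi]:
            excitation.append(c / mean_env if mean_env > 0 else 0.0)
    return excitation


if __name__ == "__main__":
    print(moving_window_normalize([2.0, -2.0, 4.0], window=2))
    print(harmonic_normalize([3.0, 3.0, 6.0, 6.0], [2.0, 4.0], 2, group=2))
```

The moving-window form produces as many averages as there are coefficients, as stated in manner (1), whereas the harmonic form produces one shared divisor per group of N sub-bands.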
Optionally, in this extended embodiment, before step 104, the method may further include the following step. The decoding device decodes the bitstream to obtain the frequency envelope of the bandwidth extension frequency band so that step 104 can be executed.
Optionally, before step 104, the method may further include the following step. The decoding device decodes the bitstream to obtain a signal type and acquires the frequency envelope of the bandwidth extension frequency band according to the signal type.
For example, when the signal type is a non-harmonic signal, the decoding device demultiplexes the received bitstream and decodes the demultiplexed bitstream to obtain the frequency envelope of the bandwidth extension frequency band. When the signal type is a harmonic signal, the decoding device demultiplexes the received bitstream, decodes the demultiplexed bitstream to obtain an initial frequency envelope of the bandwidth extension frequency band, and uses a value that is obtained by performing weighting calculation on the initial frequency envelope and N adjacent initial frequency envelopes as the frequency envelope of the bandwidth extension frequency band, where N is greater than or equal to 1.
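The choice between the two signal types can be sketched as follows. The function name, the equal default weights, and the decision to take the N adjacent envelopes from the sub-bands immediately above the current one (clipped at the highest sub-band) are assumptions made only for illustration; the description states only that a weighting calculation is performed on the initial frequency envelope and N adjacent initial frequency envelopes.

```python
def envelopes_for_signal_type(initial, harmonic, n_adjacent=1, weights=None):
    """Return the frequency envelopes of the bandwidth extension band.

    initial    -- decoded (initial) frequency envelopes, one per sub-band
    harmonic   -- True for a harmonic signal, False for a non-harmonic signal
    n_adjacent -- N, the quantity of adjacent envelopes used in the weighting
    weights    -- n_adjacent + 1 positive weights; equal weights by default
    """
    if not harmonic:
        # Non-harmonic signal: the decoded envelopes are used unchanged.
        return list(initial)
    if n_adjacent < 1:
        raise ValueError("N must be greater than or equal to 1")
    if weights is None:
        weights = [1.0] * (n_adjacent + 1)  # assumption: plain average
    if len(weights) != n_adjacent + 1 or any(w <= 0 for w in weights):
        raise ValueError("weights must be n_adjacent + 1 positive values")

    smoothed = []
    for i in range(len(initial)):
        # Assumption: neighbours are the next N sub-bands, clipped at the top.
        window = initial[i:i + n_adjacent + 1]
        used = weights[:len(window)]
        total = sum(w * e for w, e in zip(used, window))
        smoothed.append(total / sum(used))
    return smoothed


if __name__ == "__main__":
    print(envelopes_for_signal_type([2.0, 4.0, 6.0], harmonic=True))
    print(envelopes_for_signal_type([2.0, 4.0, 6.0], harmonic=False))
```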
Using the method for predicting a bandwidth extension frequency band signal in the foregoing embodiment, continuity, between a former frame and a latter frame, of the predicted excitation signals of the bandwidth extension frequency band signal can be effectively ensured, thereby ensuring auditory quality of a restored bandwidth extension frequency band signal and enhancing auditory quality of an audio signal.
200. A decoding device receives a bitstream sent by an encoding device and decodes the received bitstream to obtain a frequency domain signal.
The bitstream carries a quantization parameter of a low frequency band signal and a frequency envelope of the bandwidth extension frequency band signal.
201. The decoding device acquires an excitation signal of the low frequency band signal according to the quantization parameter of the low frequency band signal.
202. The decoding device determines a highest frequency flast
In this embodiment, the flast
203. The decoding device determines whether the flast
Referring to schematic diagrams of frequency bins in a frequency band in
In this embodiment, the fbwe
Returning to
In this embodiment, the predetermined frequency band range of the frequency domain signal is a predetermined frequency band range that is from the fexc
For example, the decoding device may make n copies of the excitation signal within the predetermined frequency band range from the fexc
For example, in an implementation, when the prediction is started from the preset start frequency fbwe
The highest frequency bin of the bandwidth extension frequency band refers to a highest frequency, at which a signal needs to be output, of a frequency band or a specified frequency. For example, a wideband signal may be 7 kHz or 8 kHz, and an ultra-wideband signal may be 14 kHz or 16 kHz or another preset specific frequency.
In this embodiment, that when the prediction is started from the preset start frequency fbwe
In this embodiment, the n copies of the excitation signal within the predetermined frequency band range from the fexc
Alternatively, when the prediction is started from the preset highest frequency ftop
When the prediction is started from the highest frequency ftop
In the foregoing two manners, regardless of whether to predict the excitation signal of the bandwidth extension frequency band between the start frequency fbwe
In an implementation process of the foregoing solution, a quotient and a remainder may first be calculated and acquired by dividing a frequency bandwidth between the preset start frequency fbwe
205. The decoding device predicts the excitation signal of the bandwidth extension frequency band according to the excitation signal within a range from the fexc
For example, the decoding device may make a copy of an excitation signal from the mth frequency bin above the start frequency bin fexc
For example, when the prediction is started from the highest frequency flast
In an implementation, when the prediction is started from the highest frequency flast
Alternatively, when the prediction is started from the highest frequency ftop
In an implementation, when the prediction is started from the highest frequency ftop
When the decoding device performs prediction starting from the highest frequency ftop
In the foregoing two manners, regardless of whether to predict the excitation signal of the bandwidth extension frequency band between the highest frequency flast
In addition, in the foregoing solution, when a bandwidth from the (fexc
In an implementation process of the foregoing solution, a quotient and a remainder may first be calculated and acquired by dividing a difference between (fexc
For example, when the encoding rate is 24 kbps, the preset start frequency fbwe
The highest frequency bin of the bandwidth extension frequency band is determined according to a type of the frequency domain signal. For example, when the type of the frequency domain signal is an ultra-wideband signal, the highest frequency ftop
206. The decoding device predicts the bandwidth extension frequency band signal according to the predicted excitation signal of the bandwidth extension frequency band and a frequency envelope of the bandwidth extension frequency band.
It may be found from the foregoing prediction of the excitation signal of the bandwidth extension frequency band that although start frequency bins of bandwidth extension in the Nth frame and (N+1)th frame are different, an excitation signal of a same frequency band greater than 8 kHz is predicted from an excitation signal of a same frequency band of the low frequency band signal; therefore, continuity between frames can be ensured. Then, step 206 is used so as to implement accurate prediction of the bandwidth extension frequency band.
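Step 206 may be sketched as follows. The sketch assumes that the bandwidth extension frequency band signal is obtained by multiplying the predicted excitation of each sub-band by the frequency envelope of that sub-band, that is, the inverse of the normalization in which a frequency domain signal is divided by its envelope. The function name and the fixed sub-band width `band_width` are hypothetical.

```python
def predict_bwe_signal(excitation, envelopes, band_width):
    """Scale each sub-band of the predicted excitation by its frequency envelope.

    excitation -- predicted excitation bins of the bandwidth extension band
    envelopes  -- one frequency envelope per sub-band
    band_width -- number of bins per sub-band (assumed constant)
    """
    if len(excitation) != len(envelopes) * band_width:
        raise ValueError("excitation length must equal sub-bands * band_width")
    signal = []
    for b, env in enumerate(envelopes):
        start = b * band_width
        signal.extend(env * e for e in excitation[start:start + band_width])
    return signal
```

Because only the per-band scaling changes from frame to frame, the continuity of the excitation itself is not affected by this step.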
Using the technical solutions of the foregoing embodiment, continuity of the predicted excitation signals of the bandwidth extension frequency band between a former frame and a latter frame can be effectively ensured, thereby ensuring auditory quality of the restored bandwidth extension frequency band signal and enhancing auditory quality of the audio signal.
A person of ordinary skill in the art may understand that all or a part of the steps of the foregoing method embodiments may be implemented by a program instructing relevant hardware. The program may be stored in a computer readable storage medium. When the program runs, the steps of the foregoing method embodiments are performed. The foregoing storage medium includes any medium that can store program code, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The decoding module 30 is configured to demultiplex a received bitstream and decode the demultiplexed bitstream to obtain a frequency domain signal. The determining module 31 is connected to the decoding module 30, and the determining module 31 is configured to determine whether a highest frequency bin, to which a bit is allocated, of the frequency domain signal obtained by decoding by the decoding module 30 is less than a preset start frequency bin of a bandwidth extension frequency band. The first processing module 32 is connected to the determining module 31, and the first processing module 32 is configured to, when the determining module 31 determines that the highest frequency bin to which a bit is allocated is less than the preset start frequency bin of the bandwidth extension frequency band, predict an excitation signal of the bandwidth extension frequency band according to an excitation signal within a predetermined frequency band range of the frequency domain signal and the preset start frequency bin of the bandwidth extension frequency band. The second processing module 33 is also connected to the determining module 31, and the second processing module 33 is configured to, when the determining module 31 determines that the highest frequency bin to which a bit is allocated is greater than or equal to the preset start frequency bin of the bandwidth extension frequency band, predict the excitation signal of the bandwidth extension frequency band according to the excitation signal within the predetermined frequency band range of the frequency domain signal, the preset start frequency bin of the bandwidth extension frequency band, and the highest frequency bin to which a bit is allocated. The predicting module 34 is connected to the first processing module 32 or the second processing module 33. 
When the determining module 31 determines that the highest frequency bin to which a bit is allocated is less than the preset start frequency bin of the bandwidth extension frequency band, the predicting module 34 is connected to the first processing module 32. When the determining module 31 determines that the highest frequency bin to which a bit is allocated is greater than or equal to the preset start frequency bin of the bandwidth extension frequency band, the predicting module 34 is connected to the second processing module 33. The predicting module 34 is configured to predict a bandwidth extension frequency band signal according to the excitation signal that is of the bandwidth extension frequency band and is predicted by the first processing module 32 or the second processing module 33 and a frequency envelope of the bandwidth extension frequency band.
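The routing performed by the determining module 31 may be sketched as follows. The names `f_last` (highest frequency bin to which a bit is allocated) and `f_bwe` (preset start frequency bin of the bandwidth extension frequency band) and the two callables standing in for the processing modules are hypothetical; only the comparison itself is taken from the description.

```python
def select_processing(f_last, f_bwe):
    """Return which processing module handles the frame."""
    # "Less than" selects the first module; "greater than or equal to" the second.
    return "first" if f_last < f_bwe else "second"


def predict_bwe_excitation(f_last, f_bwe, first_processing, second_processing):
    """Dispatch to the processing module chosen by the comparison."""
    if select_processing(f_last, f_bwe) == "first":
        # The first module needs only the preset start bin.
        return first_processing(f_bwe)
    # The second module additionally uses the highest bit-allocated bin.
    return second_processing(f_bwe, f_last)
```

The boundary case in which the two bins are equal is routed to the second processing module, matching the "greater than or equal to" condition.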
According to the decoding device in this embodiment, an implementation process of using the foregoing modules to implement prediction of a bandwidth extension frequency band signal is the same as the implementation process in the foregoing related method embodiments. For details, refer to the descriptions of the foregoing related method embodiments. Details are not described herein again.
According to the decoding device in this embodiment, by using the foregoing modules, a start frequency bin of bandwidth extension is set, and the highest frequency bin up to which the frequency domain signal is decoded is compared with the start frequency bin to perform excitation restoration of the bandwidth extension frequency band. In this way, extended excitation signals are continuous between frames and the frequency bins of the decoded excitation signal are maintained, thereby ensuring auditory quality of the restored bandwidth extension frequency band signal and enhancing auditory quality of the output audio signal.
As shown in
Further optionally, in this embodiment, the first processing module 32 in the decoding device is configured to, when the prediction is started from the preset start frequency bin of the bandwidth extension frequency band, sequentially make integer copies in the n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal and non-integer copies in the n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal, and use the two parts of excitation signals as the excitation signal between the preset start frequency bin of the bandwidth extension frequency band and the highest frequency bin of the bandwidth extension frequency band, where the non-integer part of n is less than 1; or the first processing module 32 is configured to, when the prediction is started from the highest frequency bin of the bandwidth extension frequency band, sequentially make non-integer copies in the n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal and integer copies in the n copies of the excitation signal within the predetermined frequency band range of the frequency domain signal, and use the two parts of excitation signals as the excitation signal between the preset start frequency bin of the bandwidth extension frequency band and the highest frequency bin of the bandwidth extension frequency band, where the non-integer part of n is less than 1.
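The two copy orders used by the first processing module 32 may be sketched as follows. The quotient and remainder of the target bandwidth divided by the source bandwidth give the integer and non-integer parts of n. The function name is hypothetical, and which portion of the source range supplies the non-integer copy (the lowest bins when filling upward, the highest bins when filling downward from the top) is an assumption that the text does not confirm.

```python
def fill_by_copies(source, target_len, from_top=False):
    """Fill target_len bins with n = target_len / len(source) copies of source.

    source     -- excitation bins within the predetermined frequency band range
    target_len -- number of bins between the start bin and the highest bin
    from_top   -- False: integer copies first, then the non-integer copy
                  True:  non-integer copy first, then the integer copies
    """
    if not source:
        raise ValueError("source excitation must not be empty")
    full, rem = divmod(target_len, len(source))  # integer and non-integer parts of n
    if from_top:
        # Assumed: the partial copy uses the upper rem bins of the source.
        partial = source[len(source) - rem:]
        return partial + source * full
    # Assumed: the partial copy uses the lower rem bins of the source.
    return source * full + source[:rem]
```

For example, a source of 3 bins and a target of 8 bins give n = 8/3, that is, two integer copies and a non-integer copy of 2 bins.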
Optionally, in this embodiment, the second processing module 33 in the decoding device is configured to make a copy of an excitation signal from the mth frequency bin above a start frequency bin fexc
Further optionally, in this embodiment, the second processing module 33 in the decoding device is configured to, when the prediction is started from the highest frequency bin to which a bit is allocated, sequentially make a copy of an excitation signal within a frequency band range, from the fexc
Optionally, in this embodiment, the decoding module 30 is further configured to, before the predicting module 34 predicts the bandwidth extension frequency band signal according to the predicted excitation signal of the bandwidth extension frequency band and the frequency envelope of the bandwidth extension frequency band, decode the bitstream to obtain the frequency envelope of the bandwidth extension frequency band. In this case, the corresponding predicting module 34 is further connected to the decoding module 30, and the predicting module 34 is configured to predict the bandwidth extension frequency band signal according to the excitation signal that is of the bandwidth extension frequency band and is predicted by the first processing module 32 or the second processing module 33 and the frequency envelope that is of the bandwidth extension frequency band and is obtained by decoding by the decoding module 30.
Further optionally, in this embodiment, the decoding device further includes an acquiring module 35.
The decoding module 30 is further configured to, before the predicting module 34 predicts the bandwidth extension frequency band signal according to the predicted excitation signal of the bandwidth extension frequency band and the frequency envelope of the bandwidth extension frequency band, decode the bitstream to obtain a signal type. The acquiring module 35 is connected to the decoding module 30, and the acquiring module 35 is configured to acquire the frequency envelope of the bandwidth extension frequency band according to the signal type obtained by decoding by the decoding module 30. In this case, the corresponding predicting module 34 is connected to the acquiring module 35, and the predicting module 34 is configured to predict the bandwidth extension frequency band signal according to the excitation signal that is of the bandwidth extension frequency band and is predicted by the first processing module 32 or the second processing module 33 and the frequency envelope that is of the bandwidth extension frequency band and is obtained by the acquiring module 35.
Further optionally, the acquiring module 35 is configured to, when the signal type obtained by decoding by the decoding module 30 is a non-harmonic signal, demultiplex the received bitstream, and decode the demultiplexed bitstream to obtain the frequency envelope of the bandwidth extension frequency band; or the acquiring module 35 is configured to, when the signal type obtained by decoding by the decoding module 30 is a harmonic signal, demultiplex the received bitstream, and decode the demultiplexed bitstream to obtain an initial frequency envelope of the bandwidth extension frequency band, and use a value that is obtained by performing weighting calculation on the initial frequency envelope and N adjacent initial frequency envelopes as the frequency envelope of the bandwidth extension frequency band, where N is greater than or equal to 1.
According to the decoding device in the foregoing embodiment, the present invention is described using all of the foregoing optional technical solutions as examples. In an actual application, the foregoing optional technical solutions may be combined in any manner to form an optional embodiment of the present invention. Details are not described herein again.
According to the decoding device in the foregoing embodiment, an implementation process of using the foregoing modules to implement prediction of a bandwidth extension frequency band signal is the same as the implementation process in the foregoing related method embodiments. For details, refer to the descriptions of the foregoing related method embodiments. Details are not described herein again.
According to the decoding device in the foregoing embodiment, by using the foregoing modules, a start frequency bin of bandwidth extension is set, and the highest frequency bin up to which the frequency domain signal is decoded is compared with the start frequency bin to perform excitation restoration of the bandwidth extension frequency band. In this way, extended excitation signals are continuous between frames and the frequency bins of the decoded excitation signal are maintained, thereby ensuring auditory quality of the restored bandwidth extension frequency band signal and enhancing auditory quality of the output audio signal.
Functions of the decoding device shown in
The decoding device in this embodiment of the present invention may be used together with the encoding device shown in
The methods disclosed in the foregoing embodiments of the present invention may be applied to the decoding processor 803 or implemented by the decoding processor 803. The decoding processor 803 may be an integrated circuit chip and has a signal processing capability. In an implementation process, steps in the foregoing method embodiments may be completed by an integrated logic circuit of hardware in the decoding processor 803 or by instructions in the form of software. These instructions may be implemented and controlled by working with the processing unit 804. The foregoing decoding processor may be a general purpose processor, a DSP, an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic component, a discrete gate or a transistor logic component, or a discrete hardware component. The decoding processor may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention. The general purpose processor may be a microprocessor, or the processor may be any conventional processor, translator, or the like. Steps of the methods disclosed with reference to the embodiments of the present invention may be directly executed and accomplished by a decoding processor embodied as hardware, or may be executed and accomplished using a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium that is mature in the art, such as a RAM, a flash memory, a ROM, a programmable ROM, an electrically-erasable programmable memory, or a register. The storage medium is located in the memory 805. The decoding processor 803 reads information from the memory 805, and completes the steps of the foregoing methods in combination with the hardware.
For example, the signal decoding device in
The memory 805 stores instructions to enable the processing unit 804 or the decoding processor 803 to implement the following operations: demultiplexing a received bitstream and decoding the demultiplexed bitstream to obtain a frequency domain signal; determining whether a highest frequency bin, to which a bit is allocated, of the frequency domain signal is less than a preset start frequency bin of a bandwidth extension frequency band; when the highest frequency bin to which a bit is allocated is less than the preset start frequency bin of the bandwidth extension frequency band, predicting an excitation signal of the bandwidth extension frequency band according to an excitation signal within a predetermined frequency band range of the frequency domain signal and the preset start frequency bin of the bandwidth extension frequency band; when the highest frequency bin to which a bit is allocated is greater than or equal to the preset start frequency bin of the bandwidth extension frequency band, predicting the excitation signal of the bandwidth extension frequency band according to the excitation signal within the predetermined frequency band range of the frequency domain signal, the preset start frequency bin of the bandwidth extension frequency band, and the highest frequency bin to which a bit is allocated; and predicting a bandwidth extension frequency band signal according to the predicted excitation signal of the bandwidth extension frequency band and a frequency envelope of the bandwidth extension frequency band.
The described apparatus embodiment is merely exemplary. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on at least two network units. Some or all of the modules may be selected according to an actual need to achieve the objectives of the solutions of the embodiments. A person of ordinary skill in the art may understand and implement the embodiments of the present invention without creative efforts.
Finally, it should be noted that the foregoing embodiments are merely intended for describing the technical solutions of the present invention but not for limiting the present invention. Although the present invention is described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that they may still make modifications to the technical solutions described in the foregoing embodiments or make equivalent replacements to some technical features thereof, without departing from the spirit and scope of the technical solutions of the embodiments of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
201310034240.9 | Jan 2013 | CN | national |
This application is a continuation of International Application No. PCT/CN2013/079883, filed on Jul. 23, 2013, which claims priority to Chinese Patent Application No. 201310034240.9, filed on Jan. 29, 2013, both of which are hereby incorporated by reference in their entireties.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/CN2013/079883 | Jul 2013 | US |
Child | 14806896 | | US |