The present disclosure relates generally to audio signal processing and, more particularly, to all-pass filter phase linearization of elliptic filters in signal decimation and interpolation for an audio codec.
The Enhanced Voice Services (EVS) codec under consideration for implementation by the Third Generation Partnership Project (3GPP) Long Term Evolution (LTE) wireless communication protocol has ambitious requirements for both speech signals and music and mixed-content signals. One way to meet these requirements would be to use two parallel cores, one optimized for each of the two signal types: speech signals and non-speech signals, e.g., music (otherwise referred to as generic audio signals). To process both speech and generic audio signals, a classifier or discriminator determines, on a frame-by-frame basis, whether an audio signal is more or less speech-like and directs the signal to either a speech codec or a generic audio codec based on the classification. The EVS and other hybrid coders code more speech-like (speech audio) signals using Linear Predictive Coding (LPC). The coding of less speech-like (generic audio) signals is generally performed using a frequency domain transform codec. For example, a codec optimized for use in 3GPP EVS could code more speech-like signals using a critically sampled Code Excited Linear Prediction (CELP)-based codec core sampled at 12 kHz or 16 kHz and code less speech-like signals using a Modified Discrete Cosine Transform (MDCT)-based codec core.
A good decimator is required for the CELP core, and seamless switching between the different core types, e.g., the LPC core and the frequency domain core, is also required. Elliptic filters have fast roll-offs with modest orders and low delays, making them good candidate decimation filters. In Elliptic filters, as illustrated in
The various aspects, features and advantages of the invention will become more fully apparent to those having ordinary skill in the art upon careful consideration of the following Detailed Description thereof with the accompanying drawings described below. The drawings may have been simplified for clarity and are not necessarily drawn to scale.
Generally, many audio signals have both speech-like and non-speech-like characteristics. For example, an audio signal may include both speech and music. As used herein, a speech signal refers to an audio signal having more speech-like characteristics and a generic audio signal refers to an audio signal having less speech-like characteristics, e.g., music. Whether an audio signal is treated as a speech signal or a generic audio signal depends on its classification, usually on a frame-by-frame basis, by a classifier or discriminator. Audio signal classifiers are generally well known to those of ordinary skill in the art and hence are not described further herein.
In
In
Linear predictive cores are well suited for encoding speech signals. In this regard, the first resampling filter may be a lowpass filter. In embodiments where both encoder paths include a linear predictive encoder, the second resampling filter may also be a lowpass filter. In one embodiment, the resampling filter is an Elliptic filter. As noted, Elliptic filters have fast roll-offs with modest orders and low delays, making them good candidate decimation filters. In Elliptic filters, however, the phase is non-linear, so switching between cores is not seamless. In other embodiments, the resampling filter may be any of a family of Infinite Impulse Response (IIR) filters that exhibit a non-linear phase or non-uniform group delay property. In some embodiments, a delay element is disposed in the encoder path without the resampling filter, wherein the delay element compensates for delay associated with the first resampling filter.
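By way of illustration only, the delay element described above may be sketched as a fixed integer-sample delay applied to the path that has no resampling filter. The function name `delay_signal` and the value `RESAMPLER_DELAY` are hypothetical and are not taken from the disclosure; a practical delay element would also carry state across frames, which this sketch omits.

```python
def delay_signal(samples, delay):
    """Delay a frame by `delay` samples, zero-padding the start.

    The output has the same length as the input, so the last `delay`
    input samples are dropped (a real codec would carry them over
    to the next frame).
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    padded = [0.0] * delay + list(samples)
    return padded[:len(samples)]


# Hypothetical nominal delay, in samples, of the resampling filter.
RESAMPLER_DELAY = 3

frame = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
aligned = delay_signal(frame, RESAMPLER_DELAY)
print(aligned)
```

A fixed delay of this kind can only match a nominal delay of the resampling filter; it does not address the frequency-dependent delay that arises from a non-linear phase characteristic.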
The reason for resampling is that the speech coder may operate at a lower sampling rate than the audio coder. There may also be auxiliary coding of higher frequency information in the speech path. The coding of higher frequencies is optional, but will be used in practice to equalize the coded bandwidths of the speech and audio paths. Speech coding at higher sampling rates is subject to much higher complexity demands, as well as lower coding efficiency (i.e., more bits are required to produce equivalent quality) and thus will not be used in some applications.
In one embodiment, an all-pass filter is used to compensate for lack of phase linearity in the filter path or in the alternate coded path of the encoder. Alternatively, two all-pass filters may be combined and placed up-front in either branch or path of the encoder. Thus in
The phase compensation filter is configured to filter the input signal before encoding such that characteristics of the first audio signal and the second audio signal are substantially similar. In other words, the first and second audio signals are more similar in the presence of the phase compensation filter than they would be in its absence. The similarity of the first and second audio signals may be measured quantitatively in terms of phase, correlation, signal-to-noise ratio (SNR), some other measurable signal characteristic, or a combination of such characteristics. The result is a reduction in audible artifacts that result from the non-linear phase characteristic of the resampling filter when the first audio signal is combined with the second audio signal, for example during playback of the audio signal.
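As a minimal sketch of two of the similarity measures named above, the following computes a normalized correlation and an SNR between two equal-length signals. The function names and the test signals (a sinusoid and a phase-shifted copy standing in for the outputs of the two paths) are assumptions made for illustration; the disclosure does not prescribe a particular formula.

```python
import math


def normalized_correlation(a, b):
    """Zero-lag correlation of a and b, normalized to the range [-1, 1]."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0


def snr_db(reference, test):
    """SNR in dB, treating (reference - test) as the noise."""
    signal = sum(x * x for x in reference)
    noise = sum((x - y) ** 2 for x, y in zip(reference, test))
    if noise == 0.0:
        return math.inf
    if signal == 0.0:
        return -math.inf
    return 10.0 * math.log10(signal / noise)


# Hypothetical signals: two full periods of a sinusoid, and the same
# sinusoid with a phase offset that mimics a phase mismatch between paths.
PHASE_OFFSET = 0.5  # radians, illustrative only
first = [math.sin(2.0 * math.pi * k / 32.0) for k in range(64)]
second = [math.sin(2.0 * math.pi * k / 32.0 + PHASE_OFFSET) for k in range(64)]

corr = normalized_correlation(first, second)
snr = snr_db(first, second)
print(f"correlation = {corr:.4f}, SNR = {snr:.2f} dB")
```

Under these assumptions, a smaller phase mismatch between the two signals yields a correlation closer to one and a higher SNR.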
In one embodiment, the all-pass filter structure has unity gain at all frequencies. Also, the numerator and denominator coefficients exhibit a time reversal property. In other words, for any value of z on the unit circle, the numerator and denominator have the same magnitude, as in the following ratio.
$$H(z)=\frac{0.481177-1.150582\,z^{-1}-0.053944\,z^{-2}+2.226390\,z^{-3}-1.394225\,z^{-4}-1.042799\,z^{-5}+z^{-6}}{1.0-1.042799\,z^{-1}-1.394225\,z^{-2}+2.226390\,z^{-3}-0.053944\,z^{-4}-1.150582\,z^{-5}+0.481177\,z^{-6}}$$
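The unity-gain and time reversal properties of the ratio above can be checked numerically. The sketch below, offered as an illustration rather than as part of the disclosed embodiments, builds the numerator by reversing the denominator coefficients and evaluates the magnitude response at frequencies strictly between 0 and π. The helper name `freq_response` and the frequency grid are assumptions.

```python
import cmath
import math

# Denominator coefficients of H(z), in ascending powers of z^-1.
DEN = [1.0, -1.042799, -1.394225, 2.226390, -0.053944, -1.150582, 0.481177]
# Time reversal property: the numerator is the denominator reversed.
NUM = DEN[::-1]


def freq_response(num, den, omega):
    """Evaluate num(z)/den(z) at z = exp(j*omega)."""
    z_inv = cmath.exp(-1j * omega)
    n = sum(c * z_inv ** k for k, c in enumerate(num))
    d = sum(c * z_inv ** k for k, c in enumerate(den))
    return n / d


# Frequencies strictly inside (0, pi), in radians per sample.
omegas = [k * math.pi / 64.0 for k in range(1, 64)]
magnitudes = [abs(freq_response(NUM, DEN, w)) for w in omegas]
max_dev = max(abs(m - 1.0) for m in magnitudes)
print(f"max deviation of |H| from unity: {max_dev:.2e}")
```

Because the coefficients are real and the numerator is the reversed denominator, the numerator evaluated on the unit circle is the complex conjugate of the denominator multiplied by a pure delay term, so only the phase of the signal is altered.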
For a phase compensation filter cascaded with a lowpass filter as in
In one embodiment, the resampling filter and the phase compensation filter are in the first encoder path wherein the first resampling filter and the phase compensation filter have a joint phase characteristic that is nearly linear in a pass band.
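One way to quantify how close a joint phase characteristic is to linear is to examine the spread of the group delay across the pass band, since a perfectly linear phase gives a constant group delay. The following is a minimal sketch under stated assumptions: the first-order lowpass and first-order all-pass sections, the pass band edge, and all function names are hypothetical stand-ins and are not the Elliptic filter or the compensation filter of the disclosure, and no claim is made that these stand-in values linearize the phase.

```python
import cmath
import math


def freq_response(num, den, omega):
    """Evaluate num(z)/den(z) at z = exp(j*omega)."""
    z_inv = cmath.exp(-1j * omega)
    n = sum(c * z_inv ** k for k, c in enumerate(num))
    d = sum(c * z_inv ** k for k, c in enumerate(den))
    return n / d


def group_delay(sections, omega, eps=1e-5):
    """Group delay (samples) of a cascade: minus the phase slope,
    estimated per section by a central difference."""
    total = 0.0
    for num, den in sections:
        ratio = (freq_response(num, den, omega + eps)
                 / freq_response(num, den, omega - eps))
        total += -cmath.phase(ratio) / (2.0 * eps)
    return total


def group_delay_spread(sections, band_edge, points=64):
    """Max minus min group delay over [0, band_edge]; zero for linear phase."""
    delays = [group_delay(sections, band_edge * k / points)
              for k in range(points + 1)]
    return max(delays) - min(delays)


# Hypothetical stand-in sections, as (numerator, denominator) pairs.
LOWPASS = ([0.5], [1.0, -0.5])       # first-order IIR lowpass
ALLPASS = ([0.5, 1.0], [1.0, 0.5])   # first-order all-pass
PASS_BAND_EDGE = math.pi / 4.0       # radians per sample, illustrative

print(group_delay_spread([LOWPASS], PASS_BAND_EDGE))
print(group_delay_spread([LOWPASS, ALLPASS], PASS_BAND_EDGE))
```

In such a measurement, a compensation filter would be judged by how much it reduces the spread of the cascade relative to the resampling filter alone.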
Generally, the required accuracy of the phase correction depends on the accuracy of the speech coder. For example, a lower order phase compensation filter may be sufficient in cases where higher frequency coding of the original signal is not very accurate, as is typical of a low bit rate speech codec. Thus, in the case where higher frequency mapping of the original signal is not very accurate, the approximation of the phase characteristic of the resampling filters need not be as accurate because the speech coder will distort the signal to some extent. Where higher frequency mapping of the original signal is more accurate, as is typical of higher bit rate speech codecs, the phase correction is more critical since these codecs code higher frequency content better.
It may be possible to balance the complexity of the encoder and the decoder. For example, on the encoder side, the speech path is usually the worst case complexity path. Thus, in some embodiments, worst case complexity can be reduced by placing the phase compensation filter in the generic signal coder path. On the decoder side, however, the generic signal coder path is likely the worst case complexity path. Thus, in the decoder, the compensation filter is disposed in the speech signal coder path.
In
As discussed, linear predictive cores are well suited for encoding speech signals. In this regard, the first resampling filter may be a lowpass filter. In embodiments where both encoder paths include a linear predictive coder, the second resampling filter may also be a lowpass filter. In one embodiment, the resampling filter is an Elliptic filter. As noted, Elliptic filters have fast roll-offs with modest orders and low delays, making them good candidate decimation filters. In Elliptic filters, however, the phase is non-linear, so switching between cores is not seamless. In other embodiments, the resampling filter may be any of a family of Infinite Impulse Response (IIR) filters that exhibit a non-linear phase or non-uniform group delay property. In some embodiments, a delay element is disposed in the decoder path without the resampling filter, wherein the delay element compensates for delay associated with the first resampling filter.
In one embodiment, an all-pass filter is used to compensate for lack of phase linearity in the filter path or in the alternate coded path of the decoder. Alternatively, two all-pass filters may be combined and placed at the decoder output of either branch or path. Thus in
The phase correction filters on the encoder and decoder may or may not be grouped together. That is, there may be an advantage to implementing He(z) and Hd(z) as a series combination He(z)*Hd(z). For example, if He(z) is an all-pass filter that linearizes the phase of the resampling filter at the encoder side and Hd(z) is a corresponding all-pass filter that linearizes the phase of the resampling filter at the decoder side, then instead of using He(z) and Hd(z) at the encoder and decoder respectively, alternate all-pass filters He′(z) and Hd′(z) can be used at the encoder and decoder sides such that the phase characteristic of He′(z)*Hd′(z) is equal to the phase characteristic of He(z)*Hd(z). This may be true of the filter in the speech path, or in the alternative audio path embodiment.
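The series combination He(z)*Hd(z) can be sketched by multiplying the numerator polynomials and the denominator polynomials of the two filters. In the illustration below, the first-order all-pass sections and their coefficients (0.5 and -0.3) are hypothetical placeholders for He(z) and Hd(z), and the helper names are assumptions; the sketch only shows that the combined filter has the same response as the two filters applied in series and remains all-pass.

```python
import cmath
import math


def poly_mul(a, b):
    """Multiply two polynomials in z^-1 (coefficient convolution)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def first_order_allpass(c):
    """Return (num, den) of (c + z^-1) / (1 + c*z^-1), with |c| < 1."""
    return [c, 1.0], [1.0, c]


def freq_response(num, den, omega):
    """Evaluate num(z)/den(z) at z = exp(j*omega)."""
    z_inv = cmath.exp(-1j * omega)
    n = sum(k_c * z_inv ** k for k, k_c in enumerate(num))
    d = sum(k_c * z_inv ** k for k, k_c in enumerate(den))
    return n / d


# Hypothetical placeholders for the encoder and decoder side filters.
he_num, he_den = first_order_allpass(0.5)
hd_num, hd_den = first_order_allpass(-0.3)

# Series combination He(z)*Hd(z) as a single second-order filter.
comb_num = poly_mul(he_num, hd_num)
comb_den = poly_mul(he_den, hd_den)

omega = 0.7  # radians per sample, arbitrary test frequency
separate = (freq_response(he_num, he_den, omega)
            * freq_response(hd_num, hd_den, omega))
combined = freq_response(comb_num, comb_den, omega)
print(abs(combined), cmath.phase(combined), cmath.phase(separate))
```

One possible choice of alternate filters consistent with the paragraph above is to place the combined filter on one side and no phase compensation filter on the other, since only the phase characteristic of the product matters.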
The phase compensation filter is configured to filter the first audio signal after decoding such that characteristics of the first audio signal and the second audio signal are substantially similar. In other words, the first and second audio signals are more similar in the presence of the phase compensation filter than they would be in its absence. As noted, the similarity of the first and second audio signals may be measured quantitatively in terms of phase, correlation, signal-to-noise ratio (SNR) or some other measurable signal characteristic.
In
In one embodiment, the resampling filter and the phase compensation filter are in the first decoder path wherein the first resampling filter and the phase compensation filter have a joint phase characteristic that is nearly linear in a pass band.
An all-pass filter may also be used to compensate for lack of phase linearity in a system including an encoder and a decoder. This embodiment combines the phase correction filters from each of the encoder and decoder paths into a single phase correction filter at the decoder. The phase compensation filter may be disposed in either the encoder path or the decoder path. The system 400 of
In the system 600 of
In the system 700 of
In the system 800 of
While the present disclosure and the best modes thereof have been described in a manner establishing possession and enabling those of ordinary skill to make and use the same, it will be understood and appreciated that there are equivalents to the exemplary embodiments disclosed herein and that modifications and variations may be made thereto without departing from the scope and spirit of the inventions, which are to be limited not by the exemplary embodiments but by the appended claims.
The present disclosure is related to co-pending and commonly assigned U.S. application Ser. No. 13/342,462 filed 3 Jan. 2012 entitled “Method and Apparatus for Processing Audio Frames to Transition Between Different Codecs”, the contents of which are incorporated herein by reference.