This invention relates to methods and apparatus for conversion of input audio frequencies to output audio frequencies.
Digital audio is carried by many different means of communication. The different digital media generally have conflicting sampling frequencies, each chosen in accordance with the Nyquist sampling theorem. For example, digital transmission of broadcast programs uses 32 kHz, compact discs use 44.1 kHz, digital video discs use 48 kHz and speech recordings use 6 kHz to 8 kHz, as described in “High Quality Digital Audio in the Entertainment Industry”, IEEE ASSP Magazine, 1985, pages 2–25. Digital audio therefore requires a sampling frequency conversion technique that handles simple as well as non-trivial ratios efficiently.
Conversion by going from digital to analogue (through a DAC and a low-pass filter) and then re-sampling the smoothed signal at the output rate is simple, but it is costly and limited by the imperfections (non-linearity, phase response, noise) of the analogue filter, as described in “High Quality Analogue Filters for Digital Audio”, 67th AES Convention, November 1980.
Conversion in simple integer or rational ratios fi/f0 can be performed by single or multi-stage FIR filter design, as described in Crochiere and Rabiner, Multirate Digital Signal Processing, Prentice Hall, 1983. However, this approach is not well suited to handling many arbitrary ratios, as it leads to far too many filter configurations. Any individual filter configuration is well suited to only a subset of these ratios.
In accordance with the present invention, there is provided a method for conversion of input audio data, at an input audio data frequency, to output audio data, at an output audio data frequency, including the steps of:
(a) sampling the input audio data;
(b) expanding the input audio data, to produce expanded data; and
(c) interpolating the expanded data to produce output audio data,
wherein the step of interpolating bandlimits the expanded data to either the sampling frequency divided by two or the output audio data frequency divided by two, whichever is the lesser.
Preferably, the step of interpolating includes sinc interpolation substantially in accordance with a sinc interpolation function.
Preferably, said sinc interpolation function is substantially in accordance with:
In accordance with the present invention, there is also provided a method for conversion of input audio data, at a sample frequency, to output audio data, at an output audio data frequency, including the steps of:
Preferably, the commutator, at any one time, selects only two outputs from the polyphase filters.
In accordance with the present invention, there is also provided a frequency converter for conversion of input audio data, at an input audio data frequency, to output audio data, at an output audio data frequency, including:
(a) means for sampling the input audio data;
(b) means for expanding the input audio data to produce expanded data; and
(c) means for interpolating the expanded data to produce output audio data, wherein the interpolating bandlimits the expanded data to either the sampling frequency divided by two or the output audio data frequency divided by two, whichever is the lesser.
In accordance with the present invention, there is also provided a frequency converter for conversion of input audio data, at an input audio data frequency, to output audio data, at an output audio data frequency, including:
(a) means for upsampling the input audio data by an integer factor, so as to increase the sampling rate of the input audio data to produce expanded data; and
(b) means for interpolating the expanded data to produce output audio data, wherein the interpolating is linear interpolation, said upsampling includes polyphase filters for filtering said expanded data, and said polyphase filters are in parallel and said upsampling includes a commutator for selecting the outputs of the filters.
Preferably, the commutator, at any one time, selects only two outputs from the polyphase filters.
Advantageously the invention will be a single simple structure, often desired in Audio applications, for conversion between commonly occurring audio frequencies. The advantage of using a single structure is that for conversion between different frequency combinations, the same block code and same coefficients can be used. This reduces the program code size. A single simple structure also means it can be implemented efficiently as a hardware block, without excessive chip area.
The invention is further described by way of example only with reference to the accompanying drawings, in which:
Consider that x[n] is a uniformly sampled version of the bandlimited analogue signal x(t). If the sampling frequency is Fs, so that the sampling period is Ts=1/Fs, then x[n]=x(nTs).
Moreover, if x(t) was band-limited to Fs/2, then perfect reconstruction of x(t) from x[n] can be obtained by applying the interpolation function (sampling theorem)
where
and
ωc=πFs, the cutoff frequency.
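For reference, the textbook statement of the sampling-theorem reconstruction can be sketched in the symbols used here. The kernel name φ is an assumption introduced for illustration, and this standard form is not necessarily the exact layout of the equations referred to in the text.

```latex
x(t) = \sum_{n=-\infty}^{\infty} x[n]\,\phi(t - nT_s),
\qquad
\phi(t) = \frac{\sin(\omega_c t)}{\omega_c t},
\qquad
\omega_c = \pi F_s = \frac{\pi}{T_s}
```

With this kernel, φ(0)=1 and φ(kTs)=0 for every non-zero integer k, so the sum returns x[n] exactly at each original sampling instant.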
Since the summation runs from −∞ to ∞, it cannot be implemented in practice. If non-uniform sampling or finite length is considered (about the point of reconstruction), other types of interpolation functions such as spline and Lagrange can be used. Equation (3) is an example of a Lagrange interpolator.
The advantage of the Lagrange interpolator is that it results in a polynomial fit, constructed in such a way that each sample is represented by a function which has zero values at all other sampling points.
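The property described above, that each sample is weighted by a polynomial which is zero at every other sampling point, can be illustrated with a minimal Python sketch. The function names `lagrange_basis` and `lagrange_interpolate` are hypothetical, and the code uses the general textbook Lagrange form rather than any specific equation from this document.

```python
def lagrange_basis(ts, k, t):
    """Value at time t of the k-th Lagrange basis polynomial for sample times ts.

    The polynomial equals 1 at ts[k] and 0 at every other sample time.
    """
    value = 1.0
    for j, tj in enumerate(ts):
        if j != k:
            value *= (t - tj) / (ts[k] - tj)
    return value


def lagrange_interpolate(ts, xs, t):
    """Polynomial fit through the points (ts[k], xs[k]), evaluated at time t."""
    return sum(x * lagrange_basis(ts, k, t) for k, x in enumerate(xs))


# Three samples of x(t) = t*t; a quadratic is recovered exactly from three points.
sample_times = [0.0, 1.0, 2.0]
sample_values = [0.0, 1.0, 4.0]
midpoint_value = lagrange_interpolate(sample_times, sample_values, 1.5)
```

Because the sample times are passed in explicitly, the same sketch applies to non-uniform sampling.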
Evaluating x(t) for all possible values is physically impossible. However, reconstruction only requires evaluation of x(t) at points t=mT′, corresponding to re-sampling with the new sampling frequency F′, with an associated period T′. Therefore:
The above described technique functions adequately when Fs<F′. However, when reconstructing audio data where Fs>F′, the output audio data will be affected by aliasing. Aliasing is frequency fold-over due to undersampling, and can be removed by bandlimiting the audio data to F′/2. This requires prefiltering of the data before reconstruction.
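The fold-over can be made concrete with a small Python sketch. The helper name `alias_frequency` is hypothetical; it simply reports where a pure tone appears after sampling at a given rate, assuming no prefiltering.

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Apparent frequency (Hz) of a pure tone after sampling at sample_rate_hz."""
    folded = tone_hz % sample_rate_hz
    # Components above half the sampling rate fold back below it.
    if folded > sample_rate_hz / 2:
        return sample_rate_hz - folded
    return folded


# A 20 kHz component is representable at Fs = 48 kHz, but after conversion to
# F' = 32 kHz without bandlimiting to F'/2 = 16 kHz it would appear at 12 kHz.
folded_tone = alias_frequency(20000, 32000)
```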
In converters constructed by
where
A=minimum [Fs/2, F′/2]
Equation (5) represents a sinc interpolation reconstruction formula in accordance with the invention. The integral limits, ±A, of this function effectively bandlimit the interpolation, so that the interpolation also filters the output data. When Fs is less than F′, equation (5) functions as a standard sinc interpolator and the reconstructed data is bandlimited to Fs/2. When Fs is greater than F′, equation (5) functions as a sinc interpolator whose reconstructed data is bandlimited to F′/2, and the reconstructed data is thereby protected from aliasing.
Therefore, the prefiltering step, to remove an aliasing effect in reconstructed data, is no longer required. The cutoff frequency, ωc, is effectively constrained to the minimum of (πFs, πF′), thereby limiting the integral of the reconstruction formula of equation (5) to ±A. Sinc interpolation may therefore interpolate and filter the expanded data in a single step.
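A minimal Python sketch of a bandlimited sinc resampler along these lines is given below. It rests on several assumptions that are not taken from the text: the kernel is the closed form obtained by integrating over ±A, the infinite sum is truncated to `half_width` input samples on each side, a gain of 2A/Fs keeps unity gain in the passband, and the sampling rates are integers in Hz. The function names are hypothetical.

```python
import math


def sinc(u):
    """Normalised sinc: sin(pi*u)/(pi*u), with sinc(0) = 1."""
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)


def resample_bandlimited(x, fs_in, fs_out, half_width=32):
    """Resample x from fs_in to fs_out with a sinc kernel cut off at A.

    A = min(fs_in / 2, fs_out / 2), so the same formula interpolates when
    fs_in < fs_out and also suppresses aliasing when fs_in > fs_out.
    """
    two_a = min(fs_in, fs_out)      # 2A in Hz
    gain = two_a / fs_in            # equals 1 when fs_in <= fs_out
    n_out = len(x) * fs_out // fs_in
    y = []
    for m in range(n_out):
        t = m / fs_out              # time of output sample m in seconds
        centre = int(t * fs_in)     # nearest input sample at or before t
        first = max(0, centre - half_width + 1)
        last = min(len(x), centre + half_width + 1)
        acc = 0.0
        for n in range(first, last):
            acc += x[n] * gain * sinc(two_a * (t - n / fs_in))
        y.append(acc)
    return y
```

Truncating the kernel without a window leaves a small ripple, so a practical design would normally taper the kernel; the sketch omits this to stay close to the formula.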
The second stage 34 may comprise a simple linear interpolator, which interpolates the denser expanded samples of y[n] at frequency L*Fs to generate output at the required frequency F′ 36. Upsampling reduces the interpolation error considerably.
This process is known as ‘upsampling’. Upsampling considerably reduces the errors which occur during interpolation. Upsampling by a factor of 16 followed by linear interpolation leads to an SNR of approximately 60 dB for a conversion ratio F′/Fs=4.
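The linear interpolation stage can be sketched in Python as follows. This is an illustrative sketch only: the function name `linear_resample` is hypothetical, the rates are assumed to be integers in Hz so that sample positions can be computed exactly, and the dense input is assumed to have already been upsampled and filtered.

```python
def linear_resample(dense, dense_rate, out_rate):
    """Linearly interpolate samples taken at dense_rate to give samples at out_rate."""
    out = []
    m = 0
    while True:
        # Position of output sample m measured in dense sample periods,
        # split into an integer index q and a fractional remainder r / out_rate.
        q, r = divmod(m * dense_rate, out_rate)
        if q + 1 >= len(dense):
            break
        frac = r / out_rate
        out.append((1.0 - frac) * dense[q] + frac * dense[q + 1])
        m += 1
    return out


# Dense samples at 16 * 32 kHz = 512 kHz converted to 44.1 kHz.
ramp = [float(n) for n in range(100)]
converted = linear_resample(ramp, 16 * 32000, 44100)
```

Each output sample depends on only two neighbouring dense samples, which is what later allows most of the upsampled values to be left uncomputed.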
Converter 30 is simplified by using the same interpolation factor, 16, for all conversion ratios. In effect, the said common interpolation factor enables the same filter coefficients to be used for all ratios. Upsampling may include a normal polyphase filter.
Converter 40 is simplified by using the same interpolation factor, 16, for all conversion ratios. In effect, the said common interpolation factor enables the same filter coefficients to be used for all ratios. A polyphase filter implements the upsampling stage.
For simple operations, converter 30 would be used in preference to converter 40.
Upsampling, in the embodiments of
Insertion of I−1 zeros means that Y′(z)=X(z^I), where y′[n] is the sequence generated by inserting I−1 zeros between successive values of x[n]. In the frequency domain, Y′(e^(jω))=X(e^(jωI)), which means that the spectrum of x[n] has been compressed by a factor of I. Since X(e^(jω)) is periodic with period 2π, the compressed spectrum X(e^(jωI)) repeats every 2π/I, which creates extra images in the spectrum. These images are removed by a low-pass filter with a cutoff of ωc=π/I.
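The zero-insertion step can be sketched in a few lines of Python. The function name `insert_zeros` is hypothetical and the sketch covers only the expansion, not the image-rejection filter that follows it.

```python
def insert_zeros(x, factor):
    """Return y' where factor - 1 zeros follow each sample of x."""
    expanded = []
    for sample in x:
        expanded.append(sample)
        expanded.extend([0] * (factor - 1))
    return expanded


# Expanding three samples by I = 3 gives nine samples at three times the rate.
expanded_example = insert_zeros([1, 2, 3], 3)
```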
Computational efficiency is obtained in the filter structure above by splitting the large FIR filter h[n] of length M into a set of smaller polyphase filters of length K=M/I. Since the upsampling process inserts I−1 zeros between successive values of x[n], only K out of the M input values stored in the FIR filter at any time are non-zero. This observation leads to the well-known polyphase filters
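A minimal Python sketch of this decomposition is shown below, assuming the standard textbook form in which branch k takes every I-th coefficient of h[n] starting at index k. The function names are hypothetical, and `upsample_direct` is included only to show that the polyphase form gives the same output as zero insertion followed by the full-length filter.

```python
def polyphase_split(h, factor):
    """Split h into `factor` branches; branch k holds h[k], h[k+I], h[k+2I], ..."""
    return [h[k::factor] for k in range(factor)]


def upsample_polyphase(x, h, factor):
    """Upsample x by `factor` using the polyphase branches of h."""
    branches = polyphase_split(h, factor)
    y = []
    for n in range(len(x)):
        # One output per branch for each input sample, taken in branch order.
        for p in branches:
            y.append(sum(p[j] * x[n - j] for j in range(len(p)) if n - j >= 0))
    return y


def upsample_direct(x, h, factor):
    """Reference form: insert zeros, then apply the full-length filter h."""
    stuffed = []
    for v in x:
        stuffed.extend([v] + [0] * (factor - 1))
    return [sum(h[m] * stuffed[q - m] for m in range(len(h)) if q - m >= 0)
            for q in range(len(stuffed))]
```

Each branch multiplies only K=M/I coefficients per output, and none of those multiplications involves an inserted zero.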
The set of I polyphase filters can be arranged as a parallel realisation 62, as shown in
In the case of linear interpolation, two adjacent polyphase filter outputs are required at each time. Further reduction in computation is achieved by noting that in the case of linear interpolation, not all polyphase filter outputs are used in generating the samples at the output.
In a specific example of converter 30, the process of
Since only specific polyphase outputs are required, computation can be reduced by skipping those polyphase filters whose outputs are not required for that period of time. Unless the conversion ratio is an integer, no polyphase filter can be avoided altogether, since each is needed at some time. The above described example achieves a computation gain of about four.
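The saving can be sketched in Python by evaluating, for each output sample, only the two polyphase branches that the linear interpolator needs. This is an illustrative sketch rather than the converter of the embodiments: the function names are hypothetical, the rates are assumed to be integers in Hz, and the branch layout assumes the standard decomposition in which branch k holds h[k], h[k+I], h[k+2I] and so on.

```python
def polyphase_split(h, factor):
    """Branch k holds every `factor`-th coefficient of h starting at index k."""
    return [h[k::factor] for k in range(factor)]


def dense_sample(x, branches, q):
    """Upsampled, filtered sample q, computed from a single polyphase branch."""
    n, k = divmod(q, len(branches))   # input index and branch number
    p = branches[k]
    return sum(p[j] * x[n - j] for j in range(len(p)) if n - j >= 0)


def convert(x, h, factor, fs_in, fs_out):
    """Convert x from fs_in to fs_out, evaluating two branches per output."""
    branches = polyphase_split(h, factor)
    dense_len = len(x) * factor
    out = []
    m = 0
    while True:
        # Exact position of output m on the dense grid of rate factor * fs_in.
        q, r = divmod(m * factor * fs_in, fs_out)
        if q + 1 >= dense_len:
            break
        frac = r / fs_out
        a = dense_sample(x, branches, q)       # commutator position q mod I
        b = dense_sample(x, branches, q + 1)   # the adjacent branch
        out.append((1.0 - frac) * a + frac * b)
        m += 1
    return out
```

Dense samples that fall between the selected pairs are never computed, which is where the reduction in computation comes from.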
Internal clock inconsistencies may be a problem in digital frequency conversion. Consider the example of conversion from 32 kHz to 44.1 kHz. Real-time systems work on limited buffer space and on blocks of data. Suppose the constraint on the system is that it always operates on N output samples. Each time N samples are transmitted at the output, the system receives an interrupt for DMA (Direct Memory Access) and all the samples collected at the input since the last DMA are copied to an internal buffer. Similarly, N samples must be ready to be transferred to the output buffer.
Now, the input and output clocks are free running so there is no guarantee that the ratio between the time periods of the two clocks will be exactly as computed. As a result it may happen that either the number of samples obtained from input is too few to produce N samples at output or they produce more than N samples.
If Fs is the input sampling frequency and F′ is the required output sampling frequency, each time N samples are transmitted at the output, [N*Fs/F′] samples should accumulate at the input. A small deviation may occur, but on average the above relation must hold. In a case where the deviation is appreciable, samples may have to be dropped. This case arises when the input rate is higher than the output rate. Conversely, when too few samples arrive at the input, samples may have to be repeated.
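The block bookkeeping can be sketched in Python as follows. The function names, the `tolerance` parameter and the block size of 441 are hypothetical choices for illustration, and the square brackets in [N*Fs/F′] are assumed here to denote the integer part.

```python
def expected_input_samples(n_out, fs_in, fs_out):
    """Integer part of N * Fs / F' for a block of n_out output samples."""
    return (n_out * fs_in) // fs_out


def buffer_action(received, expected, tolerance=1):
    """Classify the number of input samples received during one output block."""
    if received > expected + tolerance:
        return "drop"      # too many input samples for N outputs
    if received < expected - tolerance:
        return "repeat"    # too few input samples for N outputs
    return "ok"


# For 32 kHz to 44.1 kHz with N = 441, exactly 320 input samples are expected.
expected = expected_input_samples(441, 32000, 44100)
```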
Therefore, when the input data frequency is higher than the output data frequency, more samples are produced at the output than the buffer 68 can hold. Overwriting the older samples in the buffer produces a discontinuity, which is heard as a clicking sound.
In the cross fading scheme of
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/SG00/00093 | 6/23/2000 | WO | 00 | 6/12/2003 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO01/99277 | 12/27/2001 | WO | A |