When performing Digital Signal Processing (DSP) on a digital signal, it is often necessary to convert the sampling rate associated with the signal. For example, a signal associated with a source sampling rate (Fx) may need to be converted into a signal associated with a different, destination sampling rate (Fy).
Typically, the source sampling rate and the destination sampling rate are pre-determined and an appropriate Sampling Rate Conversion (SRC) structure is designed to perform the task. For example, a filter may be designed for a specific application that requires a particular sampling rate ratio (Fx/Fy).
For example, converting a source sampling rate Fx by a factor of 1.5 can be achieved through interpolation 110 by a factor of three and then decimation 130 by a factor of two. This approach, however, can be inefficient when a conversion requires interpolation 110 and/or decimation 130 by a large factor. By way of example, a conversion from 48 Kilohertz (KHz) to 44.1 KHz requires interpolation 110 by a factor of 147 and then decimation 130 by a factor of 160. These types of substantial interpolation and/or decimation factors may also require impractically large filters in order to meet Nyquist requirements.
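The interpolation and decimation factors quoted above follow from reducing the ratio Fy/Fx to lowest terms. A minimal Python sketch, assuming integer sampling rates in Hertz (the function name `conversion_factors` is hypothetical and is not part of the described embodiments):

```python
from fractions import Fraction


def conversion_factors(fx_hz: int, fy_hz: int) -> tuple[int, int]:
    """Return (interpolation, decimation) factors for converting fx_hz to fy_hz.

    Illustrative helper only: Fraction reduces fy_hz / fx_hz to lowest terms,
    so the numerator is the interpolation factor and the denominator is the
    decimation factor of the traditional approach.
    """
    ratio = Fraction(fy_hz, fx_hz)
    return ratio.numerator, ratio.denominator


print(conversion_factors(48_000, 44_100))  # (147, 160)
print(conversion_factors(2, 3))            # (3, 2), i.e. a factor of 1.5
```

The size of the reduced factors, rather than the size of the rates themselves, determines how large the traditional structure becomes.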
Another disadvantage with the traditional approach is that a filter must be designed for a specific pair of sampling rates. That is, a filter that is designed to convert a source signal with a sample rate of 48 KHz into a destination signal with a sampling rate of 44.1 KHz cannot be used for source and/or destination signals that have other sampling rates.
Some embodiments described herein are associated with “arbitrary” sampling rates (e.g., arbitrary source or destination sampling rates). As used here, the term “arbitrary” may refer to any sampling rate that is not pre-determined (e.g., that is not known when a SRC structure is designed).
Moreover, as used herein, lowercase variables denote sequences in the time domain, while uppercase variables denote frequency representations (e.g., x is a sequence represented in the time domain while X is the same variable represented in the frequency domain). The index [n] is used to represent the time index of sequences in the time domain (e.g., x[n]).
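In addition, ω is the radian frequency normalized to the sampling frequency Fs, for a frequency f in Hertz:
$$\omega = \frac{2\pi f}{F_s}$$
The convolution of two time domain sequences x[n] and g[n] to produce the time domain sequence y[n] is expressed as:
$$y[n] = x[n] * g[n] = \sum_{k=0}^{N-1} g[k]\,x[n-k]$$
where N is the length of the sequence g[n].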
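The notation above may be illustrated with a short Python sketch. The function names are hypothetical, and samples of x[n] before n = 0 are assumed to be zero, which the text does not specify:

```python
import math


def normalized_omega(f_hz: float, fs_hz: float) -> float:
    """Radian frequency normalized to the sampling frequency: 2*pi*f/Fs."""
    return 2.0 * math.pi * f_hz / fs_hz


def convolve(x, g):
    """Direct-form convolution y[n] = sum over k of g[k] * x[n - k].

    ASSUMPTION: x[n] is treated as zero for n < 0, and one output sample is
    produced per input sample.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(len(g)):      # k runs over the N taps of g
            if n - k >= 0:
                acc += g[k] * x[n - k]
        y.append(acc)
    return y


print(normalized_omega(1000.0, 8000.0))            # pi/4
print(convolve([1.0, 2.0, 3.0, 4.0], [0.5, 0.5]))  # [0.5, 1.5, 2.5, 3.5]
```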
Sampling Rate Conversion System
The information is then received by a router or switch 220, which could simply forward the information to a destination 240 (e.g., the information would still be associated with Fs—as is the case with the second destination 240 illustrated in
The router or switch 220 may also perform a rate conversion 230 on the received information. As a result of the rate conversion 230, information can be transmitted to a destination 240 in accordance with a different sampling rate (e.g., as is the case with the first and third destinations 240). For example, a destination 240 associated with a slower link, such as a Digital Subscriber Line (DSL) or a dial-up connection, might receive information associated with a reduced sampling rate (e.g., 16 or 8 KHz) in order to avoid audio breaks or missed packets.
Note that the potential speeds of various links between the router or switch 220 and the destinations 240 (and thus the sampling rates that will be appropriate for those links) might not be known when the rate conversions 230 are designed (e.g., a new technology might require a new sampling rate). Moreover, the speed of a particular link might dynamically change (e.g., the speed of a link might change due to traffic congestion).
The target sampling rate may then be used during a rate conversion 320 to generate information for the destination (e.g., by converting x[n] associated with Fx into y[n] associated with Fy). Thus, this embodiment may dynamically determine an appropriate sampling rate for a link in accordance with the link's capacity.
Referring again to
Although the SRC system 200 illustrated in
Moreover, a single element described in
According to some embodiments, the source 210 and/or a destination 240 may comprise a device, such as a PC, a wireless telephone, or a Personal Digital Assistant (PDA). According to other embodiments, the source 210 and/or a destination 240 instead comprises a software application or a peripheral (e.g., a sound card in a PC). According to still other embodiments, the source 210 and/or a destination 240 may comprise an information file (e.g., the source 210 may be a locally stored MP3 file).
The information received from the source 210 and/or transmitted to a destination 240 may comprise, for example, streaming information (e.g., the source 210 may be a Web server adapted to stream video information). The information may also comprise stored information (e.g., a destination 240 may be an MP3 file that will be attached to an email message).
In the SRC system 200 illustrated in
Sampling Rate Conversion Structure
A polynomial interpolator 420 receives at least some of the over-sampled signals from the poly-phase filter 410. Based on the received over-sampled signals, the polynomial interpolator 420 provides information (i.e., y[n]) associated with an output sampling rate (i.e., Fy). Note that the output sampling rate may be an arbitrary sampling rate. The information may be provided to, for example, a destination device or information file.
The polynomial interpolation 420 may perform a non-exact rate conversion, such as a zero or higher order approximation. For example,
Sampling Rate Conversion Example
Any non-exact approximation will introduce distortion to the signal. In particular, if a signal with a flat spectrum up to ωx is interpolated by a factor I, the Signal-to-Distortion Ratio (SDR) introduced by linear interpolation is:
By way of example, consider a system that must have a minimum SDR of 96 dB. Because the SDR is a function of distortion introduced by both the anti-aliasing filters and the linear interpolation, half the distortion might be allocated to the filter (and the other half to the interpolation). Halving the allowed distortion power raises the required SDR by approximately 3 dB, so in this case the SDR requirement for the linear interpolator would be 99 dB. For a signal with a flat spectrum up to 0.8π, the amount of interpolation required such that the SDR is at least 99 dB is:
That is, the input signal must be over-sampled by a factor of at least 251 in order to meet the SDR requirement. Once the signal is interpolated by 251, a sampling rate conversion of any arbitrary factor may be accomplished.
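The SDR expression itself is not reproduced in the text above. The following Python sketch therefore assumes the relation SDR = 10·log10(80·(I/ωx)^4), chosen here only because it reproduces the factor of 251 quoted for ωx = 0.8π and 99 dB. Both the relation and the function names are assumptions for illustration:

```python
import math


def linear_interp_sdr_db(omega_x: float, factor: float) -> float:
    """SDR in dB under the ASSUMED relation 10*log10(80 * (I / omega_x)**4)."""
    return 10.0 * math.log10(80.0 * (factor / omega_x) ** 4)


def min_interp_factor(omega_x: float, sdr_db: float) -> int:
    """Smallest integer factor I meeting sdr_db under the same assumed relation."""
    required = omega_x * (10.0 ** (sdr_db / 10.0) / 80.0) ** 0.25
    return math.ceil(required)


omega_x = 0.8 * math.pi
print(min_interp_factor(omega_x, 99.0))  # 251
```

Under this assumed relation the SDR grows by about 12 dB for every doubling of the interpolation factor, which is why a large over-sampling factor is needed before linear interpolation becomes acceptable.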
Note that it may not be necessary to compute all 256 outputs for each input sample in order to achieve sampling rate conversion. That is, a subset of the information from the sub-filters 610 may be selected based on the desired input/output sampling rate relationship. The desired delay is given by:
Note that 0≦δm≦1. Also, L/M is the integer ratio for the desired conversion (e.g., 147/160 in the case of a 44.1 KHz to 48 KHz conversion). In addition, the delay wraps around at m=M, and thus the input and the output are synchronized at every M output samples. The two closest outputs from the poly-phase filter bank to be used for linear interpolation are given by k and k+1, where k is obtained by finding the nearest filter delay:
$$k = \lfloor \delta_m \cdot 256 \rfloor$$
Therefore, to construct the output sample y[n], linear interpolation is performed on the convolved outputs of filters k and k+1:
Note that the coefficient αm represents the distance between the selected filter k and the desired value (e.g., as illustrated in FIG. 5):
$$\alpha_m = \delta_m \cdot 256 - \lfloor \delta_m \cdot 256 \rfloor$$
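The selection of k and αm may be sketched in Python as follows. The delay expression is not reproduced in the text, so the sketch assumes that δm is the fractional part of m·L/M, which is consistent with the delay wrapping around at m = M. It also assumes the usual linear interpolation form (1 − αm)·(output of filter k) + αm·(output of filter k+1). The function names are hypothetical:

```python
from fractions import Fraction
from math import floor

NUM_PHASES = 256  # number of poly-phase sub-filters used in the text


def phase_selection(m: int, l: int, big_m: int, phases: int = NUM_PHASES):
    """Return (delta_m, k, alpha_m) for output sample m.

    ASSUMPTION: delta_m = fractional part of m * L / M. Exact rational
    arithmetic is used so that the wrap-around at m = M is exact.
    """
    delta = Fraction(m * l, big_m) % 1
    scaled = delta * phases
    k = floor(scaled)       # index of the nearest sub-filter at or below the delay
    alpha = scaled - k      # distance from sub-filter k to the desired delay
    return delta, k, alpha


def interpolate(out_k: float, out_k1: float, alpha) -> float:
    """ASSUMED linear interpolation between the outputs of sub-filters k and k+1."""
    a = float(alpha)
    return (1.0 - a) * out_k + a * out_k1


for m in (0, 1, 2, 160):
    print(m, phase_selection(m, 147, 160))
```

For L/M = 147/160 and m = 1, the scaled delay is 235.2, so k = 235 and αm = 0.2, and the selection repeats exactly every 160 output samples.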
Sampling Rate Conversion Method
At 702, information associated with an input sampling rate is received. For example, x[n] associated with an arbitrary source sampling rate might be received from a source device or information file.
A set of over-sampled signals is then generated via a poly-phase filter based on x[n] at 704. At 706, a polynomial interpolation (e.g., a zero or higher order approximation) is performed on a subset of the over-sampled signals to generate information associated with an output sampling rate. For example, y[n] associated with an arbitrary destination sampling rate might be generated.
Information associated with the output sampling rate is then provided at 708. For example, y[n] may be provided to a destination device or information file.
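The method of 702 through 708 may be sketched end to end in Python. This is a minimal illustration under stated assumptions and not the filter design of any embodiment: the prototype is a Hann-windowed sinc with 8 taps per sub-filter, each sub-filter is normalized to unit DC gain, samples outside the input block are treated as zero, the group delay of the filter is not compensated, and all function names are hypothetical:

```python
import math
from fractions import Fraction


def design_prototype(phases: int, taps_per_phase: int, cutoff: float = 0.8):
    """Hann-windowed sinc low-pass at the over-sampled rate (assumed design)."""
    length = phases * taps_per_phase
    center = (length - 1) / 2.0
    h = []
    for n in range(length):
        t = (n - center) / phases                      # time in input samples
        arg = math.pi * cutoff * t
        sinc = 1.0 if arg == 0.0 else math.sin(arg) / arg
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * (n + 0.5) / length)
        h.append(cutoff * sinc * window)
    return h


def split_phases(h, phases: int):
    """Split the prototype into sub-filters h_k[j] = h[j*phases + k], unit DC gain."""
    bank = [h[k::phases] for k in range(phases)]
    return [[c / sum(sub) for c in sub] for sub in bank]


def phase_output(x, bank, n0: int, k: int) -> float:
    """Output of sub-filter k at input index n0 (zero outside the block)."""
    if k == len(bank):           # k+1 past the last phase: phase 0, next input
        n0, k = n0 + 1, 0
    acc = 0.0
    for j, c in enumerate(bank[k]):
        if 0 <= n0 - j < len(x):
            acc += c * x[n0 - j]
    return acc


def resample(x, fx: int, fy: int, phases: int = 256, taps_per_phase: int = 8):
    """702: receive x; 704: poly-phase bank; 706: linear interpolation; 708: return y."""
    bank = split_phases(design_prototype(phases, taps_per_phase), phases)
    y = []
    for m in range((len(x) * fy) // fx):
        t = Fraction(m * fx, fy)             # output instant in input samples
        n0 = math.floor(t)
        scaled = (t - n0) * phases
        k = math.floor(scaled)
        alpha = float(scaled - k)
        y.append((1.0 - alpha) * phase_output(x, bank, n0, k)
                 + alpha * phase_output(x, bank, n0, k + 1))
    return y


y = resample([1.0] * 64, 48_000, 44_100)
print(len(y), y[20])
```

Only the two sub-filters adjacent to the desired delay are evaluated for each output sample, so the cost per output sample does not depend on the number of sub-filters in the bank.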
Multi-Stage Sampling Rate Conversion Structure
It might be difficult to design an appropriate filter for interpolation by a large factor (e.g., by a factor of 256) using standard Finite Impulse Response (FIR) filter techniques. According to some embodiments, a multi-stage filtering scheme is used to reduce this problem. For example, the multi-stage SRC structure 800 illustrated in
As a more specific example,
A second stage 930 comprises a poly-phase filter bank that interpolates the signal from H(ω) 920 by a factor of 128. Note that the first stage interpolation 910 may relax the transition and stop band requirements of the poly-phase filter as compared to the structure described with respect to
The first stage may be designed to have a pass band extending from 0 to 0.8π/2, and the transition band extending up to π/2. The second stage may be designed to have a pass band from 0 to 0.8π/256, and a transition band extending to π/64-π/256. The equations described above with respect to
$$k = \lfloor \delta_m \cdot 128 \rfloor$$
$$\alpha_m = \delta_m \cdot 128 - \lfloor \delta_m \cdot 128 \rfloor$$
Note that due to the linear property of convolution, the equation for y[n] may become:
$$y[n] = x[n] * g_m[n]$$
where:
Thus, in applications where the input and output sampling rate ratio is known beforehand, the filter coefficients gm[n] may be pre-computed. In situations where input and output sampling rates change periodically, the filter coefficients might also be pre-computed once per dynamic period. As a result, only one convolution of the interpolated signal may be required and computational performance may be improved. In other words, the signal may be convolved with an interpolated version of the filters.
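The saving described above may be illustrated in Python. The definition of gm[n] is not reproduced in the text, so the sketch assumes gm[n] = (1 − αm)·hk[n] + αm·hk+1[n], where hk and hk+1 are the two selected sub-filters. The coefficient values and function names are hypothetical:

```python
def convolve(x, g):
    """Direct-form convolution, treating x[n] as zero for n < 0."""
    return [sum(g[k] * x[n - k] for k in range(len(g)) if n - k >= 0)
            for n in range(len(x))]


def combine_filters(h_k, h_k1, alpha: float):
    """ASSUMED combined filter g_m[n] = (1 - alpha) * h_k[n] + alpha * h_k1[n]."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(h_k, h_k1)]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
h_k = [0.25, 0.50, 0.25]     # hypothetical coefficients of sub-filter k
h_k1 = [0.10, 0.60, 0.30]    # hypothetical coefficients of sub-filter k+1
alpha = 0.2

# Two convolutions followed by linear interpolation of the outputs.
direct = [(1.0 - alpha) * a + alpha * b
          for a, b in zip(convolve(x, h_k), convolve(x, h_k1))]

# One convolution with the pre-computed, interpolated filter.
combined = convolve(x, combine_filters(h_k, h_k1, alpha))

print(max(abs(a - b) for a, b in zip(direct, combined)))
```

Because convolution is linear, the two results agree to within floating-point rounding, while the second form needs only one convolution per output sample once gm[n] has been computed.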
Multi-Stage Buffer
Consider now a streaming application (e.g., a streaming audio application) where operations are performed on a block-by-block basis. When the SRC operation is performed in blocks of Pulse Code Modulated (PCM) samples, it may be the case that the input PCM samples are not fully consumed to generate the converted output samples.
To deal with this potential problem, an intermediate buffer can be provided in a multi-stage filter design. For example,
In this case, however, a PCM digital audio sample buffer 1030 stores information from H(ω) 1020 and provides information to a second stage 1040 (i.e., a poly-phase filter bank that interpolates the signal from H(ω) 1020 by a factor of 128 and provides a set of signals to a polynomial interpolation 1050). That is, the PCM buffer 1030 keeps interpolated samples from the first stage. The output stage retrieves samples from the PCM buffer 1030 based on the demand for samples in the second stage 1040, in order to generate the required number of output samples.
The size of the PCM buffer 1030 may be associated with the input/output sampling ratio and the number of samples in each input block to be processed. The number of output samples at any specific input block instance may vary depending on the samples available in the PCM buffer 1030. For example, the number of output samples generated for every input block i may be:
where Nbuffer is the number of samples available in the buffer at any specific block instance, and is given by:
$$N_{buffer}(i) = 2N_{in}(i) + N_{extra}(i)$$
where Nin is the number of samples in one input block. Nextra is the number of samples remaining in the buffer from the previous iteration:
where Nextra(0)=0 and the second term in the equation represents the total number of samples consumed during the previous iteration. Moreover, the size of the PCM buffer 1030 may be adjusted based on input and/or output sampling rates (e.g., by dynamically adjusting the size of the PCM buffer 1030).
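The buffer bookkeeping may be sketched in Python. The expressions for the number of output samples and for Nextra are not reproduced in the text, so the sketch assumes Nout(i) = floor(Nbuffer(i)·Fy/(2·Fx)) and assumes that the samples consumed are floor(Nout(i)·2·Fx/Fy). Both are assumptions, the function name is hypothetical, and the fractional phase carried between blocks is ignored for simplicity:

```python
def simulate_buffer(fx: int, fy: int, block_sizes):
    """Track (N_buffer, N_out, N_extra for the next block) per input block.

    ASSUMPTIONS: the first stage interpolates by 2, so each input block adds
    2 * N_in samples to the buffer; N_out and the consumed count use the
    floor-based expressions stated in the lead-in, which are not taken from
    the text.
    """
    n_extra = 0                                  # N_extra(0) = 0
    history = []
    for n_in in block_sizes:
        n_buffer = 2 * n_in + n_extra            # N_buffer(i) = 2*N_in(i) + N_extra(i)
        n_out = (n_buffer * fy) // (2 * fx)      # assumed output count
        consumed = (n_out * 2 * fx) // fy        # assumed samples consumed
        n_extra = n_buffer - consumed            # left over for the next block
        history.append((n_buffer, n_out, n_extra))
    return history


print(simulate_buffer(48_000, 44_100, [100, 100, 100]))
# [(200, 91, 2), (202, 92, 2), (202, 92, 2)]
```

In this example the number of output samples varies from block to block (91, then 92) even though every input block has the same size, which is the behavior the intermediate buffer is intended to absorb.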
Thus, embodiments may provide a SRC structure to handle an arbitrary source sampling rate and/or an arbitrary destination sampling rate.
Additional Embodiments
The following illustrates various additional embodiments. These do not constitute a definition of all possible embodiments, and those skilled in the art will understand that many other embodiments are possible. Further, although the following embodiments are briefly described for clarity, those skilled in the art will understand how to make any changes, if necessary, to the above description to accommodate these and other embodiments and applications.
Although particular embodiments have been described herein (e.g., a VoIP network), any number of other embodiments may also be implemented. For example, a software application may convert an MP3 file encoded at one sampling rate into an MP3 file encoded at another sampling rate.
Moreover, although hardware or software implementations have been described with respect to some embodiments, embodiments may be implemented using any combination of software, such as the INTEL® Integrated Performance Primitives (IPP) Version 3.0 library, and/or hardware, such as hardware associated with Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL) logic.
The several embodiments described herein are solely for the purpose of illustration. Persons skilled in the art will recognize from this description that other embodiments may be practiced with modifications and alterations limited only by the claims.
Number | Date | Country | |
---|---|---|---|
20040052300 A1 | Mar 2004 | US |