In many applications, there is a need to convert a digital signal that has been sampled at a first sampling rate to another digital signal sampled at a higher second sampling rate. Sample rate conversion is a computationally intensive operation that requires real-time computations involving addition and multiplication.
Continuing with
The up-sampling function 102 up-samples by inserting P-1 zero-valued samples between successive samples of the input sequence x[n] to increase the sampling rate. The up-sampled sequence u[n] therefore consists of the samples of x[n] with P-1 zeros inserted between successive samples, and is given by:

$$u[n] = \begin{cases} x[n/P], & n = 0, \pm P, \pm 2P, \ldots \\ 0, & \text{otherwise} \end{cases}$$

For example, with P=5, four (P-1) zero values are inserted in the up-sampled sequence u[n] between successive input values of x[n], such that u[0]=x[0]; u[1], u[2], u[3] and u[4] are 0; and u[5]=x[1].
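The zero-insertion step can be sketched in a few lines of Python. This is only an illustration of the up-sampling function described above; the function name `upsample` and the input values are assumptions chosen for the example.

```python
def upsample(x, P):
    """Insert P-1 zero-valued samples after each sample of x (zero stuffing)."""
    u = []
    for sample in x:
        u.append(sample)
        u.extend([0] * (P - 1))
    return u


x = [7, 8, 9]            # arbitrary illustrative input values
u = upsample(x, 5)
print(u[:6])             # [7, 0, 0, 0, 0, 8]
```

With P=5 the first six values reproduce the pattern u[0]=x[0], four zeros, u[5]=x[1].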
Continuing with
Assuming the filtering function 104 has NP filtering coefficients h[n], the filtered sequence v[n] is given by:

$$v[n] = \sum_{k=n-NP+1}^{n} h[n-k]\, u[k]$$

where N is the number of samples of the input sequence x[n] spanned by the filter and h[n-k] are pre-stored filtering coefficients of the filter.
To regenerate the data at the desired sampling rate, the filtered signal v[n] is down-sampled to provide an output signal y[n], which only includes samples at the new sampling rate. Each filtered sample in the filtered signal v[n] is dependent on the previous NP values of the up-sampled sequence u[n] input to the filter. The down-sampling operation 106 down-samples by taking every Qth sample from the filtered sequence v[n] to provide the output sequence y[n]. Thus, the output sequence {y[n] y[n+1] y[n+2] . . . } corresponds to {v[n] v[n+Q] v[n+2Q] . . . }.
v[n+0]={h[0] h[1] h[2] . . . h[19]} {u[n+0] u[n−1] u[n−2] . . . u[n−19]}
Thus, as the number of samples in the input sequence and the output sequence increases, the number of MAC cycles and taps in the filter increases accordingly. As is well known in the art of digital signal processing, there are restrictions on the sampling rate; that is, the sampling rate must be at least twice the highest frequency of the signal that is being sampled (Nyquist sampling rate), so that the sampled signal can be reconstructed without aliasing.
As the number of taps is increased, the number of Multiply-and-Accumulate (MAC) cycles needed to generate each successive output may make it infeasible to compute the output sequence in real time or with restricted hardware.
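To make the cost of the three-stage approach concrete, the following Python sketch models zero insertion, filtering and down-sampling directly and counts the MAC operations. It is a behavioural illustration only: the name `naive_resample`, the all-ones placeholder coefficients and the input values are assumptions, and samples before the start of the sequence are taken as zero.

```python
def naive_resample(x, h, P, Q):
    """Zero-stuff by P, filter with h, keep every Q-th filtered sample.

    Returns the output list and the number of multiply-accumulates performed.
    """
    u = []
    for sample in x:
        u.append(sample)
        u.extend([0] * (P - 1))

    macs = 0
    v = []
    for n in range(len(u)):
        acc = 0
        for k, coeff in enumerate(h):
            # Samples before the start of the sequence are taken as zero.
            operand = u[n - k] if n - k >= 0 else 0
            acc += coeff * operand
            macs += 1
        v.append(acc)
    return v[::Q], macs


h = [1] * 20                  # placeholder coefficients, NP = 20
x = [1, 2, 3, 4, 5, 6]        # arbitrary input samples
y, macs = naive_resample(x, h, P=5, Q=3)
print(len(y), macs, macs // len(y))   # 10 600 60
```

In this model every filtered sample costs NP=20 MACs and only one in Q=3 filtered samples is kept, so each retained output costs 60 MACs, most of them multiplications by inserted zeros or contributions to discarded samples.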
A scalable, efficient, programmable, hardware architecture for a sample rate conversion engine is presented. A sample rate conversion engine for upconversion, filtering with a set of filtering coefficients applied to delayed samples and down conversion includes a multiplier accumulator unit, a sample delay line and a bank of coefficient registers. Each register in the bank of coefficient registers contains a reduced set of the set of filtering coefficients to be applied to input samples for each of successive outputs. The sample rate conversion engine also includes control logic which, for each of the successive outputs, selects a reduced set of filtering coefficients from the bank of coefficient registers and a reduced set of samples from the sample delay line to be applied to the multiplier accumulator unit.
The sequence of multiply operations performed in the MAC processing unit and the sets of filtering coefficients stored in the coefficient registers are selected dependent on the conversion ratio P/Q and the number of multipliers N in the MAC unit, so that computations are not performed for intermediate samples u[n] that are later discarded by the down-sampling function. The sequence of multiply operations is also selected so that computations are not performed for the P-1 interpolated zero-valued samples that have no effect on the output signal y[n]. Thus, the number of multipliers required to derive the output samples is less than in the prior sample rate converter described in conjunction with
The bank of coefficient registers is programmable, allowing the conversion rate to be varied by modifying the reduced sets of filtering coefficients stored in the bank of coefficient registers. The number of reduced sets of filtering coefficients stored in the bank of coefficient registers is dependent on a number of interpolated zero samples and the number of reduced sets of samples is dependent on the output sampling rate. The number of multipliers in the multiplier accumulator unit is fixed.
In one example, with a conversion ratio P/Q, the number of multipliers is N and the number of sets (banks) of filtering coefficients is P. To compute successive Q output samples one of the sets of filtering coefficients is applied to each of Q sets of input samples in the multiplier accumulator unit.
The reduced set of filtering coefficients in each bank is h[(r*P)+(n*Q (mod P))], the reduced set of input samples is x[floor(n*(Q/P))−r], where floor denotes truncation to an integer index, and a successive output y[n] is the sum, for r=0 to N−1, of the products of the reduced set of filtering coefficients and the reduced set of input samples.
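The expression above can be evaluated directly, as in the following Python sketch. It is a minimal illustration under stated assumptions: the function name `direct_output` is hypothetical, the input index n*(Q/P) is truncated with integer division, samples before the start of x are taken as zero, and the coefficient values are placeholders set equal to their own index so that the selected taps are easy to read off.

```python
def direct_output(x, h, P, Q, N, n):
    """Evaluate y[n] as the sum over r of h[r*P + (n*Q mod P)] * x[floor(n*Q/P) - r]."""
    phase = (n * Q) % P      # selects one reduced set of N coefficients
    base = (n * Q) // P      # index of the newest input sample used
    acc = 0
    for r in range(N):
        idx = base - r
        if idx >= 0:         # samples before the start are taken as zero
            acc += h[r * P + phase] * x[idx]
    return acc


h = list(range(20))          # placeholder: h[i] == i, so sums reveal the taps used
x = [1, 1, 1, 1, 1, 1]
print(direct_output(x, h, P=5, Q=3, N=4, n=5))   # 30 = h[0] + h[5] + h[10] + h[15]
print(direct_output(x, h, P=5, Q=3, N=4, n=1))   # 3  = h[3] (only x[0] is available)
```

Each call performs at most N=4 multiplications, regardless of P and Q.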
The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.
A description of preferred embodiments of the invention follows.
The amplified signal is coupled to a wideband data converter 404. The wideband data converter 404 includes a high speed, high resolution Analog to Digital (A/D) and a Digital to Analog (D/A) converter. The Analog to Digital (A/D) converter digitizes received wideband signals for subsequent digital signal processing by a digital processor. For example, the 2.4 GHz wideband signal can be digitized by a 200 MHz A/D converter.
The high sampling rate and bit resolution offered by the A/D data converter enables the design of systems where multiple data channels and/or protocols can be processed simultaneously and cost effectively. A fundamental requirement in such wideband multi-protocol systems is the ability to seamlessly change sampling rates based on protocol specific requirements.
The data in the received wideband signal is modulated using modulation schemes such as Orthogonal Frequency Division Multiplexing (OFDM) and Direct Sequence Spread Spectrum (DSSS) modulation. OFDM modulates with a sample rate of 20 Msps and DSSS modulates with a sample rate of 11 Msps. The IEEE 802.11g wireless LAN protocol supports both OFDM and DSSS.
The digital signal is forwarded to a digital processing circuit 406 for processing. To support the ability to change sampling rates based on the 802.11g wireless LAN protocol, a filter 408 in the digital processing circuit 406 includes a sample rate conversion engine which converts received signals with a sampling rate of 11 Msps to signals with a sampling rate of 20 Msps. After the sampling rate has been converted in the filter, the converted signal is forwarded for further processing to a protocol processor 410.
The input sequence x[n] includes a plurality of samples that are sampled at an input sampling rate ƒin. Sample rate conversion engine 500 minimizes the number of computations by computing output sequence samples y[n] directly from input sequence samples x[n] in the Multiplier and Accumulator (MAC) processing unit 504. The input sampling rate is ƒin and the desired output sampling rate is ƒout. The ratio ƒout/ƒin is rational and can therefore be expressed as a fraction P/Q.
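As a small worked example, the ratio P/Q can be obtained by reducing ƒout/ƒin to lowest terms. The sketch below uses Python's standard `fractions.Fraction`; the helper name `conversion_ratio` is an assumption, and the rates are the 11 Msps and 20 Msps figures mentioned earlier in this description.

```python
from fractions import Fraction


def conversion_ratio(f_in, f_out):
    """Return (P, Q) such that f_out / f_in == P / Q in lowest terms."""
    ratio = Fraction(f_out, f_in)
    return ratio.numerator, ratio.denominator


P, Q = conversion_ratio(11_000_000, 20_000_000)
print(P, Q)   # 20 11
```

Converting 11 Msps to 20 Msps therefore corresponds to P=20 and Q=11 under this reduction.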
The MAC processing unit 504 includes N multipliers and an N-input adder capable of computing one sample of the output y[n] in a single cycle. A reduced number of computations per output sample y[n] is performed by using a subset of the NP filtering coefficients h[n] for successive output samples y[n].
An ordered sequence of P reduced sets of filtering coefficients is stored sequentially in a bank of P coefficient registers 502. Each coefficient register in the bank of coefficient registers 502 stores a reduced set of N filtering coefficients, one filtering coefficient per multiplier in the MAC processing unit 504. The ordered sequence of P reduced sets of filtering coefficients (filter banks) is modular; that is, the sequence is repeated for every P output samples.
The reduced set of filtering coefficient inputs 512 is applied in parallel, from successive filter banks (coefficient registers), for each output sample y[n] computed. Under control of the control logic 516, a modulo P counter 510 selects a reduced set of filtering coefficients to be applied to the MAC processing unit 504 through mux logic 508 and a bank of coefficient (H) inputs 512.
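The contents of the P coefficient registers can be sketched as follows. This is an illustrative reorganization of the NP coefficients using the index rule h[(r*P)+(m*Q (mod P))] for bank m; the function name `build_banks` is hypothetical, and the coefficients are placeholders equal to their own index so that the printed values are coefficient indices.

```python
def build_banks(h, P, Q):
    """Reorganize NP coefficients into P banks of N coefficients each."""
    N = len(h) // P
    return [[h[r * P + (m * Q) % P] for r in range(N)] for m in range(P)]


banks = build_banks(list(range(20)), P=5, Q=3)
for m, bank in enumerate(banks):
    print(m, bank)
# 0 [0, 5, 10, 15]
# 1 [3, 8, 13, 18]
# 2 [1, 6, 11, 16]
# 3 [4, 9, 14, 19]
# 4 [2, 7, 12, 17]

# A modulo-P counter selects the bank: output n uses banks[n % P].
assert banks[7 % 5] == banks[2]
```

In this example the five banks do not overlap and together cover all twenty coefficients, and because (n*Q) mod P depends only on n mod P, the same bank order repeats every P outputs.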
A sample delay line implemented by an N deep shift register 506 buffers successive input samples x[n]. The data shift registers 506 are also controlled by the control logic 516. Every incoming data sample x[n] is associated with an input data valid signal that is set to a logic “high” for a single clock cycle of the engine. The average frequency of the input data valid pulses corresponds to the input sampling rate ƒin.
The engine 500 issues output data valid pulses with an average frequency ƒout=(P/Q)ƒin. Control logic 516 synchronizes the input samples x[n] and output samples y[n] in such a way that for every Q input samples x[n] shifted in, P output samples y[n] are computed. For example, with Q=3 and P=5, the control logic issues data valid pulses to the data shift register 506 such that for every 3 samples shifted in, 5 output samples are computed.
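The interaction of the delay line, the coefficient banks and the modulo P counter can be modelled behaviourally as below. This is a software sketch under stated assumptions, not a description of the hardware timing: the class name `ResamplerModel` and its `run` method are hypothetical, the delay line starts filled with zeros, `run` is meant to be called once on a whole list, and the model keeps a full output index where hardware would only need it modulo P.

```python
from collections import deque


class ResamplerModel:
    """N-deep delay line, P coefficient banks and a modulo-P bank selector."""

    def __init__(self, h, P, Q):
        self.P, self.Q = P, Q
        self.N = len(h) // P
        self.banks = [[h[r * P + (m * Q) % P] for r in range(self.N)]
                      for m in range(P)]
        # delay[0] holds the newest sample; the oldest falls off the far end.
        self.delay = deque([0] * self.N, maxlen=self.N)
        self.counter = 0      # modulo-P counter selecting the bank
        self.n = 0            # output index

    def run(self, x):
        out, consumed = [], 0
        while True:
            base = (self.n * self.Q) // self.P   # newest input needed for y[n]
            while consumed <= base:
                if consumed == len(x):
                    return out                   # input exhausted
                self.delay.appendleft(x[consumed])
                consumed += 1
            bank = self.banks[self.counter]
            out.append(sum(c * s for c, s in zip(bank, self.delay)))
            self.counter = (self.counter + 1) % self.P
            self.n += 1


model = ResamplerModel(list(range(20)), P=5, Q=3)   # placeholder coefficients h[i] == i
print(model.run([1, 1, 1]))   # [0, 3, 7, 13, 21]
```

Three input samples produce five output samples, matching the Q=3, P=5 example above.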
The sequence of multiply operations performed in the MAC processing unit 504 and the sets of filtering coefficients stored in the coefficient registers 502 are selected dependent on the conversion ratio P/Q and the number of multipliers N in the MAC unit, so that computations are not performed for intermediate samples u[n] that are later discarded by the down-sampling function. The sequence of multiply operations is also selected so that computations are not performed for the P-1 interpolated zero-valued samples that have no effect on the output signal y[n]. Thus, the number of multipliers required to derive the output samples is less than in the prior sample rate converter described in conjunction with
The bank of coefficient registers 502 is software programmable through control logic 516, allowing different conversion rates to be supported; that is, the re-sampling ratio P/Q is variable. The number of multipliers in the MAC processing unit 504 is fixed. Thus, only the number of coefficient registers is modified when the conversion ratio P/Q is modified.
The sequence of multiply operations to be performed with a reduced set of multipliers for a sampling ratio P/Q is derived as follows. As described in conjunction with
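Assuming the interpolating filter has NP taps, where N is the number of samples of the input sequence spanned by the filter, the interpolated sequence v[n] is given by

$$v[n] = \sum_{k=n-NP+1}^{n} h[n-k]\, u[k]$$

The number of interpolated sequence values computed can be reduced, knowing that the P-1 zeros do not add to the accumulator; u[k] is non-zero only when k is a multiple of P. Substituting for index variable k such that

$$k = P\left(\lfloor n/P \rfloor - r\right), \quad r = 0, 1, \ldots, N-1$$

the following expression for the interpolated sequence v[n] is obtained

$$v[n] = \sum_{r=0}^{N-1} h\big[rP + (n \bmod P)\big]\, x\big[\lfloor n/P \rfloor - r\big]$$

Only every Qth sample of v[n] is output as the output signal y[n], that is, y[n]=v[nQ]. The output sequence y[n] is therefore obtained by taking every Qth sample from v[n]:

$$y[n] = \sum_{r=0}^{N-1} h\big[rP + (nQ \bmod P)\big]\, x\big[\lfloor nQ/P \rfloor - r\big]$$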
Thus, from the above expression for y[n], the output signal y[n] can be selected directly from the input sequence x[n] by observing the following:
(i) The total number of Multiply and Accumulate (MAC) operations required per output sample y[n] is N for an NP tap interpolation. (ii) A sequence of P consecutive output samples y[n] uses P distinct, non-overlapping filter banks where each filter bank is a subset of N coefficients from the set of NP coefficients, with one of the N coefficients used for each of the N MAC operations. (iii) For a given re-sampling ratio, P/Q, the P filter banks are uniquely determined and the NP coefficients can be reorganized into P groups of N coefficients each and stored in sequential memory locations. (iv) The input sample x[n] selection is determined by the ratio Q/P and dependent on this ratio, input samples may be reused or skipped in successive computations of y[n]. The P reduced sets of filtering coefficients are pre-computed for a P/Q ratio and stored in the coefficient registers.
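The observations above can be checked numerically by comparing the three-stage computation with the direct N-term sum. The following Python sketch does this for one case; the function names `reference` and `direct`, the random integer test data and the zero initial conditions are all assumptions made for the check.

```python
import random


def reference(x, h, P, Q):
    """Zero-stuff, filter with all NP taps, then keep every Q-th sample."""
    u = [0] * (len(x) * P)
    u[::P] = x
    v = [sum(h[k] * u[n - k] for k in range(len(h)) if n - k >= 0)
         for n in range(len(u))]
    return v[::Q]


def direct(x, h, P, Q):
    """Compute each output from N coefficients and N input samples only."""
    N = len(h) // P
    out, n = [], 0
    while n * Q < len(x) * P:
        phase, base = (n * Q) % P, (n * Q) // P
        out.append(sum(h[r * P + phase] * x[base - r]
                       for r in range(N) if base - r >= 0))
        n += 1
    return out


random.seed(0)
P, Q, N = 5, 3, 4
h = [random.randint(-9, 9) for _ in range(N * P)]
x = [random.randint(-9, 9) for _ in range(12)]
assert reference(x, h, P, Q) == direct(x, h, P, Q)
print("direct form matches the three-stage reference")
```

Integer data is used so that the comparison is exact; the direct form uses N multiplications per output where the reference uses NP per filtered sample.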
Table 1 illustrates P reduced sets of coefficients selected from the set of NP coefficients stored in P coefficient registers 502. Each output sample y[n] requires N=4 Multiply and Accumulate (MAC) cycles. The ratio of the output sampling rate to the input sampling rate is P/Q where P=5 and Q=3. The operation for each output sample y[n] is represented by a dot product between vectors in Table 1.
The P reduced sets of N filtering coefficients for computing output samples y[n+0]-y[n+4] shown in Table 1 are stored in the bank of coefficient registers 502. The first vector represents the reduced set of N coefficient values h[n] stored in the coefficient registers 502. The second vector represents the subset of input samples in the data shift registers 506 to be applied to the reduced set of coefficient values. In this example there are P=5 coefficient registers with non-overlapping coefficients. For subsequent samples of y[n], the coefficients stored in the coefficient registers are used in the same order. The input samples are dithered in the ratio Q/P=3/5 with respect to the output samples, i.e., for every 5 output samples of y, only 3 input samples of x are used.
The filter has 20 taps (NP=20); thus, there are twenty coefficient values h[0]-h[19]. There are 4 (N=4) multipliers in the multiplier accumulator unit; therefore, a reduced set of four of the twenty coefficient values h[0]-h[19] and a reduced set of four samples from the sample delay line are applied to the multiplier accumulator unit 504 to compute each output sample y[n]. Each reduced set of coefficient values is selected so that zero-valued interpolation samples are not computed. Only every Qth sample is computed, that is, y[n]=v[Qn]; thus, three subsets (Q=3) of input samples are selected to compute each group of five (P=5) successive output sample values.
The coefficients stored in the filter banks are based on the following expression for y[n]:

$$y[n] = \sum_{r=0}^{N-1} h\big[rP + (nQ \bmod P)\big]\, x\big[\lfloor nQ/P \rfloor - r\big]$$
The set of operations to be performed for each output sample y[n] for a conversion ratio of P/Q with N=4, P=5 and Q=3 is shown in Table 1. Referring to Table 1, the subset of coefficient values for y[n+0], that is, h[0], h[5], h[10], h[15], is computed by substituting n=0, P=5 and Q=3 in the expression for y[n]; that is, r*5+(0*3 (mod 5)) for r=0 to 3. The subset of coefficient values for y[n+1], that is, h[3], h[8], h[13], h[18], is computed by substituting n=1, Q=3 and P=5 in the expression for y[n] for r=0 to 3; that is, r*5+(1*3 (mod 5)), where 3 (mod 5)=3 (the remainder resulting from dividing 3 by 5). The input data samples are shifted by one sample to compute the next output sample y[n+2]. The subset of coefficient values for y[n+2], that is, h[1], h[6], h[11], h[16], is computed by substituting n=2, Q=3 and P=5 in the expression for y[n] for r=0 to 3, where 6 (mod 5)=1. The subset of coefficients for y[n+3] is computed in a similar manner with n=3. The input data samples are shifted by one sample to compute the next output sample y[n+4] with n=4. Thus, by appropriate selection of filtering coefficients, no computations are performed for the P-1 inserted zero values, and output samples that are later discarded are not computed.
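The coefficient and sample indices walked through above can be generated mechanically. The sketch below lists, for each of the P outputs, the coefficient indices r*P+(m*Q (mod P)) and the input sample indices floor(m*Q/P)−r; the function name `table_rows` is hypothetical, the sample indices are relative to the first input sample of the block, and the exact layout of Table 1 may differ from this printout.

```python
def table_rows(P, Q, N):
    """Return (output offset, coefficient indices, input sample indices) per output."""
    rows = []
    for m in range(P):
        phase = (m * Q) % P
        base = (m * Q) // P
        coeffs = [r * P + phase for r in range(N)]
        samples = [base - r for r in range(N)]
        rows.append((m, coeffs, samples))
    return rows


rows = table_rows(P=5, Q=3, N=4)
for m, coeffs, samples in rows:
    print(f"y[n+{m}]: h{coeffs} . x{samples}")
# y[n+0]: h[0, 5, 10, 15] . x[0, -1, -2, -3]
# y[n+1]: h[3, 8, 13, 18] . x[0, -1, -2, -3]
# y[n+2]: h[1, 6, 11, 16] . x[1, 0, -1, -2]
# y[n+3]: h[4, 9, 14, 19] . x[1, 0, -1, -2]
# y[n+4]: h[2, 7, 12, 17] . x[2, 1, 0, -1]
```

The newest input index advances before y[n+2] and before y[n+4], which corresponds to the two one-sample shifts described above; a third shift occurs before the next group of five outputs begins.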
The resampling engine 500 is programmable to support a range of re-sampling ratios given by (P,Q) where Pmin≦P≦Pmax and Qmin≦Q≦Qmax. The MAC processing unit 504 is clocked at a maximum speed that sets the limit on the maximum input and output sampling rate that can be supported. Noise is introduced during resampling by aliasing of unsuppressed images. The minimum attenuation required to suppress images introduced during upsampling is based on a minimum signal to noise ratio (SNR) allowed for rate conversion. The maximum number of taps is dependent on minimum attenuation required and the maximum up-sampling rate Pmax to be supported. This determines the size of the coefficient register array (NPmax) and the number of MACs required in the resampling engine (N).
Perfect interpolation with no aliasing requires an ideal filter if the input signal x is sampled at the Nyquist rate, that is, the bandwidth B=π. However, such a filter is impossible to realize in a real system because the transition bandwidth is zero. In practice, signals are usually oversampled such that B<π. The greater the transition bandwidth, the easier it is to realize a filter. Since the spectral images are equal in power, an efficient equiripple, low pass Finite Impulse Response (FIR) filter can be used to suppress the spectral images below a specified level in the stopband.
The transition bandwidth is at its minimum when the upsampling ratio is Pmax; thus, the maximum number of filter taps (Pmax N) is required to obtain the desired attenuation. Assuming the maximum input signal bandwidth is Bmax, similar curves for attenuation as a function of filter taps and upsampling rate can be obtained. Thus, the number of taps can be computed for a range of minimum attenuation values and sampling rates. As shown, for an attenuation of 50 dB at an upsampling rate of 7, a filter with 72 taps is required.
If the range of re-sampling ratios is limited and more flexibility is desired, a plurality of engines can be cascaded in series such that

$$\frac{P}{Q} = \prod_{k} \frac{P_k}{Q_k}$$

where Pk/Qk is the re-sampling ratio of the kth engine. In such cascaded systems, the re-sampling is distributed uniformly over individual engines.
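The overall ratio of a cascade is the product of the individual stage ratios, which can be checked with exact rational arithmetic. In the sketch below the function name `cascade_ratio` and the two stage ratios are purely illustrative assumptions.

```python
from fractions import Fraction


def cascade_ratio(stages):
    """Multiply the per-engine ratios (P_k, Q_k) into one overall ratio P/Q."""
    total = Fraction(1)
    for p_k, q_k in stages:
        total *= Fraction(p_k, q_k)
    return total


overall = cascade_ratio([(5, 3), (4, 3)])   # hypothetical two-engine cascade
print(overall)                              # 20/9
```

Two engines with ratios 5/3 and 4/3 in series give an overall ratio of 20/9, which neither engine would need to support on its own.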
The engine has been described for use in a wideband system to efficiently extract and process digital sequences with protocol specific sampling rate requirements. The engine can also be used as a hardware accelerator for software defined radios and for communication systems that require adaptive sampling rates.
While this invention has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.
This application claims the benefit of U.S. Provisional Application No. 60/424,375, filed on Nov. 6, 2002. The entire teachings of the above application are incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
20040117764 A1 | Jun 2004 | US |
Number | Date | Country | |
---|---|---|---|
60424375 | Nov 2002 | US |