This invention relates to digital signal processing (“DSP”) circuitry, especially on field-programmable gate array (“FPGA”) devices. More particularly, the invention relates to such DSP circuitry that is adapted to implement resampling filters.
A resampling filter, or a sample rate converter, converts an input sample rate to a different output sample rate. It is a widely used filter structure that appears in a variety of applications including general signal processing, medical imaging, wireless communications, and military applications. One particularly useful type of resampling filter is a fractional rate resampling filter. In this type of resampling filter, the output sample rate is a well-defined fraction, U/D, of the input sample rate, where U and D are integers, typically co-prime.
A common implementation of a resampling filter uses decimation or interpolation, or a combination of both. Decimation (or equivalently downsampling) decreases the number of samples of an input signal by a factor of D by removing D minus 1 out of every D samples. Decimation may therefore result in aliasing unless the input signal is band limited in such a way that it is possible to recover the input signal from the downsampled signal without loss of information. Conversely, interpolation (or equivalently upsampling) increases the number of samples of an input signal by a factor of U by inserting (“interpolating”) U minus 1 samples between adjacent samples.
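As a minimal sketch of the two operations just described, the following Python functions decimate by keeping one of every D samples and interpolate by inserting U minus 1 samples after each input sample. The function names are hypothetical, and filling the inserted positions with zeros (zero-stuffing, normally followed by a filter) is an assumption, since the text does not state what values are inserted.

```python
def downsample(x, D):
    """Keep every D-th sample, dropping the D - 1 samples in between."""
    return x[::D]


def upsample(x, U):
    """Insert U - 1 zeros after each sample (zero-stuffing, an assumption)."""
    out = []
    for s in x:
        out.append(s)
        out.extend([0] * (U - 1))
    return out


down = downsample(list(range(1, 11)), 5)   # 10 samples in, 2 samples out
up = upsample([1, 2, 3], 2)                # 3 samples in, 6 samples out
print(down)  # [1, 6]
print(up)    # [1, 0, 2, 0, 3, 0]
```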
In conventional fractional rate resampling filters, interpolation is generally performed before decimation to preserve the properties of the input signal spectrum and to protect the input signal from aliasing. For example, a conventional U/D fractional rate resampling filter first upsamples the input signal by an upsampling or interpolation factor, U, and second, downsamples the upsampled signal by a downsampling or decimation factor, D. Conventional fractional rate resampling filters thus need to first raise the input signal sample rate before processing and/or downsampling. However, if the input signal sample rate is too high, such implementations of the conventional form of a fractional rate resampling filter may not be feasible. For instance, an FPGA may receive input data from a high speed Analog to Digital Converter at a rate of 500 MHz. If a 2/5 fractional rate conversion is desired (i.e., U=2 and D=5) and upsampling by a factor of U=2 is performed first, the FPGA will need to process signals at a rate of 1 GHz. Such high rates may not be feasible on some devices. Furthermore, even for moderate input sample rates, a large interpolation factor U may raise the sample rate higher than is feasible on some devices.
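The rate arithmetic in this example can be checked with a few lines of Python. This is only an illustrative calculation; the function name `conventional_rates` is hypothetical.

```python
from fractions import Fraction


def conventional_rates(fs_hz, U, D):
    """Return (intermediate_rate, output_rate) in Hz for upsample-then-downsample."""
    intermediate = fs_hz * U           # rate the device must sustain after upsampling
    output = Fraction(fs_hz * U, D)    # final rate, (U/D) * fs
    return intermediate, output


intermediate, output = conventional_rates(500_000_000, U=2, D=5)
print(intermediate)  # 1000000000
print(output)        # 200000000
```

For the 500 MHz input and the 2/5 conversion above, the intermediate rate is 1 GHz even though the final output rate is only 200 MHz.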
An alternative to the interpolation and decimation cascade described above is a Farrow filter. A Farrow filter uses polynomial approximation to replace a conventional resampling filter, such that the approximation is done section by section. Commonly used Farrow filters interpolate neighboring sample points via cubic or parabolic interpolation. However, if the application has a strict requirement on the filter response, approximations, and therefore a Farrow filter, cannot be used. Furthermore, if the input sample rate exceeds the device clock rate, a Farrow filter cannot be used either.
The present invention relates to circuitry and methods for effectively implementing a fractional rate resampling filter. In particular, a programmable logic device can be configured as a fractional rate resampling filter capable of performing downsampling prior to upsampling without modifying the overall filter response.
In some embodiments, a method and circuit are provided for resampling data from a first input sample rate to a second output sample rate. Received input data may be downsampled to generate downsampled data at a rate lower than the input sample rate. A first portion of the downsampled data may be output to a first filtering path and a second portion of the downsampled data may be output to a second filtering path. Each filtering path may include a cluster of filter components such that a first portion of the cluster is operable to receive and process, during a first cycle, one of the portions of the downsampled data and a second portion of each cluster is operable to receive and process, during a second cycle, the portion of the downsampled data received by the first portion of the cluster. Outputs of each cluster of the first and second filtering paths may be combined to generate output data at a second sample rate. In some implementations, the first portion of each cluster of the first and second filtering paths may respectively process a first subband of the portions of the downsampled data, while the second portion of each cluster of the first and second filtering paths may respectively process a second subband of the portions of the downsampled data. In some implementations, outputs of each cluster of the first and second filtering paths may be combined by upsampling the outputs of each cluster of the first and second filtering paths following the downsampling. In some implementations, outputs of each cluster of the first and second filtering paths may be combined by summing outputs of the first portion of each cluster of the first and second filtering paths to generate a first output, and summing outputs of the second portion of each cluster of the first and second filtering paths to generate a second output. A final output may then be generated by selectively outputting, using selection circuitry, one of the first and second outputs at the second sample rate.
In some embodiments, the first and second portions of each cluster may be operable to share resources such that the first portion of each cluster uses the resources during the first cycle and the second portion of each cluster uses the resources during the second cycle. In some implementations, these shared resources may include multiplier circuits and selection circuitry associated with each multiplier circuit. In some implementations, the selection circuitry may operate to selectively output to each multiplier circuit one of a first filtering coefficient and a second filtering coefficient associated, respectively, with the first and second portions of each cluster. The first filtering coefficient may be selected during the first cycle and the second filtering coefficient may be selected during the second cycle. In some implementations, one of the first and second portions of downsampled data may be delayed and output into one of the multiplier circuits for multiplying with one of the first and second filtering coefficients. In some implementations, outputs of each cluster of the first and second filtering paths may be combined by summing outputs of the multiplier circuits. In some implementations, the downsampling may be performed using a low-voltage differential signaling (LVDS) receiver implemented on a field-programmable gate array (FPGA).
Further features of the invention, its nature and various advantages will be apparent upon consideration of the following detailed description, taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:
A simplified block diagram of an integrated circuit (“IC” or “device”) 100 in accordance with embodiments of the present invention is shown in
Unlike conventional resampling filters that implement upsampling prior to downsampling, circuitry 100 of
As will be illustrated in connection with
The following description of exemplary embodiments of the present disclosure provides illustration and description for the case of a 2/5 fractional rate resampling filter with an upsampling factor U=2 and a downsampling factor D=5. This is not intended to be exhaustive or to limit the disclosure to the precise form disclosed. Modifications and generalizations to an arbitrary resampling rate are possible in light of the present teachings or may be acquired from practice of what is disclosed herein, and will be further discussed below.
The structure shown in
The prototype filter response function H(z) can further be decomposed by U to yield U filtering paths. This is shown in
In the example of
$H(z) = H_0(z^2) + H_1(z^2)\,z^{-1}$, (EQ. 1)
where H0(z) corresponds to a first polyphase component or subfilter of H(z), and H1(z) corresponds to a second polyphase component or subfilter of H(z). For example, H0(z) may represent the even-numbered coefficients of the prototype filter response H(z) while H1(z) may represent the odd-numbered coefficients of the prototype filter response H(z). Polyphase decomposition thus offers one way to decompose the filtering response into two separate filtering paths 202 and 204 corresponding, respectively, to subfilters H0(z) and H1(z). Input signal X(z), having an input sampling rate fs, may thus be filtered independently along the two filtering paths through subfilters 206 corresponding to H0(z) and 212 corresponding to H1(z). In the top filtering path 202, upsampler block 208 upsamples the output of subfilter 206 (i.e., of subfilter H0(z)) by 2, and downsampler block 210 downsamples the upsampled data by 5. In the bottom filtering path 204, upsampler block 214 upsamples the output of subfilter 212 (i.e., of subfilter H1(z)) by 2, delay block 216 introduces a delay of a single time cycle according to the polyphase decomposition of EQ. 1, then downsampler block 218 downsamples the upsampled data by 5. Finally, adder circuitry 220 combines the outputs of filtering paths 202 and 204 to generate output signal Y(z). This structure is equivalent to a filter that first upsamples by 2, processes the upsampled signal using prototype filter response function H(z), and finally downsamples by 5.
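The equivalence stated at the end of the preceding paragraph can be checked numerically. The following is a minimal sketch, not the disclosed circuit: it assumes zero-stuffing upsamplers, finite sequences padded with zeros, and hypothetical function names. Each path filters the input with one subfilter at the input rate, upsamples by U, applies its delay, and the summed result is downsampled by D.

```python
def convolve(x, h):
    """Full linear convolution of two sequences."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def upsample(x, U):
    """Zero-stuffing upsampler: U - 1 zeros after each sample."""
    out = [0] * (len(x) * U)
    out[::U] = x
    return out


def conventional(x, h, U, D):
    """Upsample by U, filter with the prototype h, downsample by D."""
    return convolve(upsample(x, U), h)[::D]


def polyphase_paths(x, h, U, D):
    """One path per subfilter H_p: filter at the input rate, upsample, delay by p."""
    length = len(x) * U + len(h) - 1
    acc = [0] * length
    for p in range(U):
        branch = upsample(convolve(x, h[p::U]), U)   # H_p(z), then upsample by U
        for n, v in enumerate(branch):
            if n + p < length:                       # only trailing zeros are dropped
                acc[n + p] += v                      # delay of p samples, then adder
    return acc[::D]                                  # downsample by D


x = [1, 2, 3, 4, 5, 6, 7]
h = [1, 2, 3, 4, 5, 6]          # arbitrary prototype coefficients, for checking only
print(polyphase_paths(x, h, 2, 5) == conventional(x, h, 2, 5))  # True
```

Because downsampling is linear, downsampling the sum of the paths gives the same result as downsampling each path separately and then adding, as the adder circuitry in the text does.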
Recognizing that a single cycle delay element $z^{-1}$ can be expressed as a combination of advance and delay elements, delay element 216 can be decomposed into elements whose exponents are multiples of U or D. In some implementations, a factorization is selected into advance and delay elements, such that a first exponent is the smallest integer multiple of the upsampling factor U, and a second exponent is the smallest integer multiple of the downsampling factor D. In this particular example, a delay by 1 can be represented by an advance of 4 followed by a delay of 5, i.e., $z^{-1} = (z^{2})^{2}\,(z^{-1})^{5}$. Through this decomposition, the filtering blocks may be rearranged by applying the Noble Identity for commuting downsamplers and/or upsamplers with filter response function blocks, as illustrated in
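The factorization of a delay into an advance that is a multiple of U and a delay that is a multiple of D can be found by a short search. The sketch below is illustrative only; the function name `split_delay` is hypothetical, and it assumes U and D are co-prime.

```python
from math import gcd


def split_delay(p, U, D):
    """Return (advance, delay) with delay - advance == p,
    advance a multiple of U and delay the smallest suitable multiple of D."""
    if gcd(U, D) != 1:
        raise ValueError("U and D are assumed to be co-prime")
    for b in range(U):
        delay = b * D
        if (delay - p) % U == 0:
            return delay - p, delay
    raise ValueError("no factorization found")


print(split_delay(1, 2, 5))  # (4, 5)
```

For U=2 and D=5 the search returns an advance of 4 and a delay of 5, matching the example in the text.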
According to the Noble Identity, the delay component whose exponent is a multiple of U may be moved to the left of an upsampler by U, and the delay component whose exponent is a multiple of D may be moved to the right of the downsampler by D. Thus, the $z^{-1}$ delay element 216 of
At the end of performing these operations, the upsampler and downsampler blocks of
In addition, the $z^{2}$ delay element 232 of
The filtering systems in
As shown in
Commutator circuitry 306 cycles through filtering paths 310 through 314, delivering one input sample to a path at each time unit, e.g., every clock cycle. It thus takes D=5 cycles for the next valid sample to appear on each filtering path. Accordingly, the sample rate on each path 310 through 314 is fs/D, and commutator circuitry 306 thus performs downsampling by a factor of D. The same applies to commutator circuitry 308 and filtering paths 320 through 324.
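The commutator behavior can be sketched as a round-robin distribution of samples. This is a simplified model with a hypothetical function name; the order in which the commutator visits the paths is an assumption, and the 500 MHz figure is the input rate from the earlier example.

```python
def commutate(samples, D):
    """Deal samples round-robin onto D paths; path k receives samples k, k+D, k+2D, ..."""
    return [samples[k::D] for k in range(D)]


fs_hz = 500_000_000
D = 5
paths = commutate(list(range(10)), D)
path_rate_hz = fs_hz // D       # each path sees one sample every D clock cycles

print(paths)         # [[0, 5], [1, 6], [2, 7], [3, 8], [4, 9]]
print(path_rate_hz)  # 100000000
```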
Because of the two-cycle advance introduced in
Adder circuitry 326 combines the outputs of filter components A0(z) through A4(z). Upsampler block 330 upsamples the output of adder circuitry 326 to generate output data at (U/D)fs. Similarly, adder circuitry 328 combines the outputs of filter components B0(z) through B4(z). Upsampler block 332 upsamples the combined result to generate output data at (U/D)fs. Delay element 334 introduces a single-cycle delay in accordance with the polyphase decomposition of EQ. 1. Finally, adder circuitry 336 combines the two branches and outputs the final filtered output data, Y(z), at a rate of (U/D)fs.
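A software model of a downsample-first structure of this kind can be compared against the conventional upsample-filter-downsample form. The following is a sketch under stated assumptions, not the disclosed circuit: U and D are co-prime, upsampling is zero-stuffing, the input is a stored finite buffer (so the advance is handled by index arithmetic rather than by timing), and the assignment of coefficients to components A_i and B_i is one possible choice. All function names are hypothetical.

```python
def convolve(x, h):
    """Full linear convolution of two sequences."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def conventional(x, h, U, D):
    """Reference: upsample by U (zero-stuffing), filter with h, downsample by D."""
    up = [0] * (len(x) * U)
    up[::U] = x
    return convolve(up, h)[::D]


def sample(x, n):
    """x[n], with zeros outside the stored range."""
    return x[n] if 0 <= n < len(x) else 0


def downsample_first(x, h, U, D, n_out):
    """Downsample-first resampler model; assumes U and D are co-prime."""
    y = [0] * n_out
    for p in range(U):                     # one branch per polyphase subfilter H_p
        h_p = h[p::U]
        # Split z^-p into an advance of a input samples and a delay of b output samples.
        b = next(k for k in range(U) if (k * D - p) % U == 0)
        a = (b * D - p) // U
        components = [h_p[i::D] for i in range(D)]   # A_i for p = 0, B_i for p = 1
        # After upsampling by U and delaying by b, only outputs j = b, b+U, ... are nonzero.
        for j in range(b, n_out, U):
            m = (j - b) // U               # time index at the low rate fs/D
            acc = 0
            for i, coeffs in enumerate(components):
                for r, c in enumerate(coeffs):
                    # Component i sees the commutated stream x[k*D + a - i]; tap r uses k = m - r.
                    acc += c * sample(x, (m - r) * D + a - i)
            y[j] += acc                    # final adder across branches
    return y


x = list(range(1, 24))
h = list(range(1, 21))                     # arbitrary 20-tap prototype, for checking only
reference = conventional(x, h, 2, 5)
result = downsample_first(x, h, 2, 5, len(reference))
print(result == reference)  # True
```

In this model every multiplication takes place at the low rate fs/D, and for U=2 each output sample comes from exactly one of the two branches.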
Filter 300 of
Indices i0 through i4 of filter components Ai
The LVDS receiver 404 functions similarly to commutator circuitries 306 and 308 from
Filter 400 may be configured to accommodate additional delays or advances, for example, as illustrated in
Adder circuitry 430 sums the outputs of filter components Ai
The structure shown in
These various optimizations are illustrated in filter 500 of
In each filtering cluster 510-514, sharing multipliers may involve the use of a multiplexer to select which filter component coefficient to use. This selection may be controlled by control inputs 525-529, respectively. For example, in filtering cluster 510, control input 525 may control the switching between coefficients corresponding to filter component A3(z) and coefficients corresponding to filter component B0(z). The switching occurs at rate (U/D)fs, e.g., at each clock cycle. Therefore, even though data is fed at fs/D to filtering cluster 510, data is output at rate (U/D)fs because of the time division multiplexing in filtering cluster 510.
It should be noted that switching between filter coefficients in filtering clusters 510-514 does not require the use of selection circuitry. For example, instead of a multiplexer, a dual memory bank can be used to store coefficients corresponding to the two filter components, and a single bit bank selector may be used to determine which coefficient to use at each clock cycle.
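The time-division multiplexing of a shared multiplier can be modeled with a two-entry coefficient bank and a single-bit selector. This is a behavioral sketch only; the class and function names are hypothetical, and holding each input sample for U=2 cycles is an assumption made for illustration.

```python
class SharedMultiplier:
    """One multiplier time-shared by two filter components via a 1-bit bank select."""

    def __init__(self, coeff_first, coeff_second):
        self.banks = (coeff_first, coeff_second)

    def multiply(self, data, bank_select):
        return self.banks[bank_select] * data


def run_cluster_tap(tap, downsampled, U=2):
    """Hold each input sample for U cycles, switching the coefficient bank each cycle."""
    out = []
    for data in downsampled:
        for cycle in range(U):
            out.append(tap.multiply(data, cycle))
    return out


tap = SharedMultiplier(coeff_first=3, coeff_second=10)
products = run_cluster_tap(tap, [1, 2])
print(products)  # [3, 10, 6, 20]
```

Two input samples produce four products, which illustrates how data fed at fs/D can leave the cluster at (U/D)fs.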
Finally, adder circuitry 530 combines the outputs of filtering clusters 510-514, each having a rate of (U/D)fs. Adder circuitry 530 thus generates final output signal Y(z) at a rate of (U/D)fs.
Because of the timing adjustment introduced in
A FIR filter calculates a weighted sum of a finite number of inputs, summing a number of multiplication results, where each multiplication is between a sample and a coefficient. Each such multiplication may be referred to as a tap. Mathematically, a FIR filter may be described as:
$$Y_k = \sum_{i=0}^{n-1} c_i \, s_{k-i}$$
where $Y_k$ is the kth output term, $c_i$ is the ith coefficient, $s_{k-i}$ is the (k−i)th sample, and n is the number of taps in the filter. For example, an n-tap implementation of filter component A0(z) may be implemented using a bank of n filtering coefficients A0(1), A0(2), . . . , A0(n). Similarly, an n-tap implementation of filter component B2(z) may be implemented using a bank of n filtering coefficients B2(1), B2(2), . . . , B2(n).
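The weighted sum described above can be sketched directly in Python. The function name is hypothetical, and indexing the coefficients from 0 and treating samples before the start of the input as zero are assumptions.

```python
def fir(samples, coeffs):
    """Return Y_k = sum over i of coeffs[i] * samples[k - i] for each k."""
    out = []
    for k in range(len(samples)):
        acc = 0
        for i, c in enumerate(coeffs):      # one multiplication per tap
            if k - i >= 0:                  # samples before the start count as zero
                acc += c * samples[k - i]
        out.append(acc)
    return out


y = fir([1, 0, 0, 1], [1, 2, 3])
print(y)  # [1, 2, 3, 1]
```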
As illustrated in
Downsampled data, e.g., from filtering path 507 of
The output of multiplier circuit 604 is fed to adder tree circuitry 650, which collects data from all multiplier circuits of filtering cluster 512. The data output at 652 corresponds to the output of filtering cluster 512, and is at rate (U/D)fs.
As can be seen from
At 802, input data is received at an input sample rate fs. For example, input data may be received at 500 MHz using receiver circuitry such as commutator circuitry 306 of
At 804, the data received at 802 is downsampled along a number of filtering paths. The downsampling may be performed using commutator circuitry, e.g., commutator circuitry 306 of
At 806, outputs of the filtering paths are combined to generate output data at an output sample rate. For example, this step may involve upsampling the outputs of each cluster of the first and second filtering paths following the downsampling. In some embodiments, 806 may be implemented using upsampler blocks and adder circuitry, e.g., upsampler blocks 330 and 332 and adder circuitry 336 of
The structures described above allow for significant reuse of resources. The extent to which resources can be reused depends on a number of factors, including the device clock rate, the number of supported input channels, and the desired interpolation and decimation factors (U, D). In the embodiments illustrated in
The above examples illustrate the case where U=2 and D=5. One of ordinary skill in the art would appreciate that similar techniques may be generalized to other resampling factors as well. The approaches disclosed above may for example be used with any (U,D) combination where U and D are co-prime, which is commonly the case in resampling filters. The disclosed approach may also apply to arbitrary (U, D) values and output rates by using different levels of hardware resource reuse.
The structures discussed above may support both single-channel and multiple-channel inputs. Depending on the device clock rate, resource reuse can be applied across multiple channels and/or across multiple filter components Ai(z) and Bj(z).
One advantage of the structures discussed above is that they can be configured as a conventional decimation or interpolation filter by setting either U or D to 1. This would require no modification to the hardware. Another advantage of the structures discussed above is that they allow easy run-time reconfiguration of the resampling rate without requiring any hardware change. For example, one may configure the system to not perform any upsampling (i.e., U=1) or to perform upsampling by a configurable ratio (U>1) simply by adjusting the control inputs of filtering clusters, e.g., by setting the control inputs 525-529 of filtering clusters 510-514 in the example of
The embodiments shown above are merely exemplary. These and other configurations in accordance with embodiments of the present invention can be implemented in programmable integrated circuit devices such as programmable logic devices, where programming software can be generated to allow users to configure a programmable device to perform the various multiplications and other operations. Although the filters illustrated in
The structures described above also may be generated in fixed logic, in which case the sizes of the various computational components may be fixed to a particular application. Alternatively, the fixed logic circuitry could allow for limited parameterization.
Instructions for carrying out a method according to embodiments of the present invention for programming a programmable device to perform sample rate conversion may be encoded on a machine-readable medium, to be executed by a suitable computer or similar device to implement the method of embodiments of the present invention for programming or configuring programmable logic devices (PLDs) or other programmable devices. For example, a personal computer may be equipped with an interface to which a PLD can be connected, and the personal computer can be used by a user to program the PLD using a suitable software tool, such as the QUARTUS® II software available from Altera Corporation, of San Jose, Calif.
The magnetic domains of coating 852 of medium 850 are polarized or oriented so as to encode, in a manner which may be conventional, a machine-executable program, for execution by a programming system such as a personal computer or other computer or similar system, having a socket or peripheral attachment into which the PLD to be programmed may be inserted, to configure appropriate portions of the PLD, including its specialized processing blocks, if any, in accordance with embodiments of the present invention.
In the case of a CD-based or DVD-based medium, as is well known, coating 812 is reflective and is impressed with a plurality of pits 813, arranged on one or more layers, to encode the machine-executable program. The arrangement of pits is read by reflecting laser light off the surface of coating 812. A protective coating 814, which preferably is substantially transparent, is provided on top of coating 812.
In the case of magneto-optical disk, as is well known, coating 812 has no pits 813, but has a plurality of magnetic domains whose polarity or orientation can be changed magnetically when heated above a certain temperature, as by a laser (not shown). The orientation of the domains can be read by measuring the polarization of laser light reflected from coating 812. The arrangement of the domains encodes the program as described above.
A PLD 90 programmed according to embodiments of the present invention may be used in many kinds of electronic devices. One possible use is in a data processing system 900 shown in
System 900 can be used in a wide variety of applications, such as computer networking, data networking, instrumentation, video processing, digital signal processing, or any other application where the advantage of using programmable or reprogrammable logic is desirable. PLD 90 can be used to perform a variety of different logic functions. For example, PLD 90 can be configured as a processor or controller that works in cooperation with processor 901. PLD 90 may also be used as an arbiter for arbitrating access to shared resources in system 900. In yet another example, PLD 90 can be configured as an interface between processor 901 and one of the other components in system 900. It should be noted that system 900 is only exemplary, and that the true scope and spirit of the invention should be indicated by the following claims.
Various technologies can be used to implement PLDs 90 as described above and incorporating the embodiments of the present invention.
It will be understood that the foregoing is only illustrative of the principles of the invention, and that various modifications can be made by those skilled in the art without departing from the scope and spirit of the invention. For example, the various elements of this invention can be generated on a PLD in any desired number and/or arrangement. One skilled in the art will appreciate that the present invention can be practiced by other than the described embodiments, which are presented for purposes of illustration and not of limitation, and the present invention is limited only by the claims that follow.