This invention relates to the field of digital signal processing, and more particularly to a decimator.
Digital filtering is employed in many areas, for example, in measurement systems, source coding, echo cancellation etc. A relatively common filter function is decimation.
A decimator is a structure that combines samples into a single sample. It typically consists of an electronic hardware structure (ASICs, FPGAs and the like) or software (DSPs), and can be used in any environment where sampling is possible. The decimation function typically has two objectives, namely a reduction of the number of samples and an increase in the accuracy of the samples.
Sometimes a side effect of decimation is the most useful property, namely its low pass characteristic. Fast variations between samples ‘disappear’, or better, average out. Although the low pass characteristic really is a side effect, it is possible to make a low pass function without reducing the number of samples.
The way that the decimation function performs this operation is relatively straightforward. The samples are mixed together and averaged. The increase in accuracy is related to averaging the spread of the samples.
In electronic and software environments such decimation is a common function, used in many applications. In these fields a few factors typically influence the decimator, namely the input sample rate, the output sample rate, and the allowable chip real estate (hardware) and time (software).
Sometimes these factors are hard to satisfy. For example, it can happen that the input rate has a large dynamic range, whereas the output does not scale along with it. Such an example can be seen in PLLs, where the reference frequency may be as low as the Hz range or as high as 10 GHz. Sample processing at 10 GHz is not feasible with current technologies; only limited processing, such as counting, is really feasible at that rate. Up to about 1 GHz, current technologies can properly handle the processing, albeit at the cost of power and complexity.
A large dynamic range for sampling implies that the decimator function must be flexible. Normally flexibility in a decimator requires extra hardware.
A conventional decimator is a structure with a group of memory buffers. The simplest form of decimator decimates by two. A single memory stores a first sample, which is added to a second sample to yield one combined sample. The second sampler runs at half the rate of the first. Such a structure is shown in
If the circuit needs to be expanded so that three samples are combined, an extra memory and an extra adder operation are added. Such an arrangement is shown in
A decimator that averages for instance 128 samples requires a lot of hardware with the above structures. It is possible to change the structure slightly, so that at least the number of adder stages is limited, as is shown in
There is another problem with this structure, and that is the output divider. Only representations that fit well with the division are simple to divide. For example, a ternary coding scheme allows simple division by 3. However, most digital hardware is based on binary coding, and thus is only simple to use with divisors that are powers of two; in that case the division is a simple shift, which does not cost any hardware. Most applications use division by powers of two and rate reduction. This can be achieved by using the circuit of
Which of the above prior art structures is most attractive for a particular application depends on many factors, and is not very easily established. Typical design factors are the process for which the design is intended, the sample rate and the sample size (word size). Microcontrollers and DSPs find a large memory with a round-robin structure attractive and fast; memory is low cost. Thus the structure shown in
For all three structures, however, it is not very simple to introduce flexibility. Existing flexible structures normally use a mixed approach, as shown in
The structures that are relatively common have two or three modules, a fact which illustrates the attractiveness of this approach. The flexibility that typically is required will, even with this structure, require a considerable amount of programming of the constituent parts. In general this is highly unattractive.
According to the present invention there is provided a decimator for use in digital signal processing comprising an input line for receiving a sequence of input samples at a first sampling rate; a first register for accumulating input samples for which the order of said sequence is a power of a predetermined number greater than one; and a control unit for outputting samples from said first register at a second sampling rate.
In a preferred embodiment the invention also comprises a second register for accumulating input samples for which the order of said sequence is not a power of said predetermined number, and wherein said first register accumulates input samples for which the order of said sequence is a power of said predetermined number combined with a current accumulated value in said second register. The predetermined number is preferably two, although other numbers greater than one can be employed.
The invention also provides a method of decimating an input signal in the form of a sequence of input samples at a first sampling rate, comprising accumulating input samples for which the order of said sequence is a power of a predetermined number greater than one; and outputting the accumulated samples at a second sampling rate.
The invention will now be described in more detail, by way of example only, with reference to the accompanying drawings, in which:
The novel decimator is best explained first in mathematical equivalencies, since these best illustrate the underlying principles of the invention. A normal decimator over M will perform the following operation:
This means that the output process is M times slower than the input process, and that the output is the average of the last block of M samples.
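By way of illustration only, the conventional operation described above can be sketched in software. The function name `block_average_decimate` and the choice to discard an incomplete trailing block are assumptions made for this sketch; they are not taken from the specification.

```python
def block_average_decimate(samples, m):
    """Average each complete block of m input samples into one output sample."""
    if m < 1:
        raise ValueError("m must be at least 1")
    outputs = []
    # Step through the input in blocks of m; an incomplete trailing block is dropped.
    for start in range(0, len(samples) - m + 1, m):
        block = samples[start:start + m]
        outputs.append(sum(block) / m)
    return outputs


# Eight input samples decimated over M = 4 give two output samples.
decimated = block_average_decimate([1, 3, 5, 7, 2, 4, 6, 8], 4)
print(decimated)  # [4.0, 5.0]
```

The output list is M times shorter than the input list, and each entry is the average of one block of M samples.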
It is simple for a decimator to process only a number of samples that is equal to a power of two. This makes the division equal to a simple shift, which in general is preferable. The operation of the decimator is to collect samples until a power of two is reached (so 1, 2, 4 etc), and then keep that value as a decimated value. If the value is sampled, that value is presented. If the next power of two is reached without an output sample having occurred, the new (larger) power of two is kept as the decimated value. Thus each sample effectively contains the largest number of samples in the recent history of input samples, for which the number of samples is a power of two.
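As a hedged illustration of this rule, the number of samples represented by the kept value can be computed from the number of samples collected since the last output. The helper name `largest_power_of_two_at_most` is hypothetical.

```python
def largest_power_of_two_at_most(count):
    """Return the largest power of two that is <= count (count must be >= 1)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    # bit_length() - 1 is floor(log2(count)) for positive integers.
    return 1 << (count.bit_length() - 1)


# Samples represented by the kept value after 1, 2, ..., 10 collected samples.
held = [largest_power_of_two_at_most(n) for n in range(1, 11)]
print(held)  # [1, 2, 2, 4, 4, 4, 4, 8, 8, 8]
```

The kept value only changes when the collected count reaches the next power of two, which is why the list moves in steps.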
The method can be formalized by using a mathematical recurrent expression:
This formula is quite complex and is best discussed in its parts. First the summation of the decimation:
This sets the output as the remainder of the total of all N input samples minus all M samples that have already been sampled on the output (each multiplied by the number of input samples that were part of that output sample, for correct weighting) minus the part of the input samples (the most recent few samples) that are not yet contained in the output.
The formula can be rearranged by moving the left hand side to the right hand side, to obtain a slightly more compact and more mathematical formula:
No information is lost or repeated as long as the decimator remembers the summation of all input samples minus all the samples that have already left the circuit, minus all the samples that are still somewhere in processing. The proof that no information is lost is of some importance for correctness under all circumstances.
Another form of rewriting the same formula yields:
which states that the current output, multiplied by the number of samples (so the undivided output), plus the most recent piece of data that is not yet ‘decimated’, equals the complete history of input minus the output up to the present time. This formula is central to the implementation of the invention.
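The equations themselves are not reproduced in this text. One way to write the relation just described, offered only as a reading of the prose and not as the original formula, is given below. It assumes that Input(i) is the i-th input sample, that Output(M) is the current output sample with ns(M) input samples in it, that outputs 1 to M-1 have already left the circuit, and that r (a symbol introduced here) is the number of most recent input samples not yet decimated.

```latex
\mathrm{Output}(M)\, ns(M) \;+\; \sum_{i=N-r+1}^{N} \mathrm{Input}(i)
  \;=\; \sum_{i=1}^{N} \mathrm{Input}(i) \;-\; \sum_{j=1}^{M-1} \mathrm{Output}(j)\, ns(j)
```

The left-hand side is the undivided current output plus the pending samples; the right-hand side is the complete input history minus everything already output, each output weighted by its sample count.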
Then the part that chooses how many samples there are in a specific output sample, namely
can also be rewritten as:
This makes the number of samples (ns) in each output sample equal to the maximum power of two that fits (since the remainder that is not yet processed is smaller than that number).
The formulae explain the implementation of the invention well. However, some aspects of the formulae are of particular importance. If the input and output sample rates have a fixed ratio that is a power of two, 2^β, the integers x for the respective M samples will all be identical to β. Rephrased, the number of input samples in each output sample will be fixed and equal to 2^β. If the input and output sample rates have a fixed ratio that is not a power of two, but a number γ with 2^β<γ<2^(β+1), the number of samples in any output sample will be either 2^β or 2^(β+1). Only when the sampling rates vary relative to each other will the number of samples in a single output sample vary by more than a factor of 2. This factor of two is the ratio between the numbers 2^β and 2^(β+1).
From these observations a number of derivative observations can be made. For normal processes where the sample rates have a fixed ratio which is not a power of two, the formulae lead to a sample distance that varies by a factor of 2, which is the ratio between 2^β and 2^(β+1). It may be expected that the processes will use some form of oversampling. The places where 2^(β+1) samples are summed, the oversample rate will be 2^(β+1−
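The fixed-ratio behaviour can be checked with a small count-only simulation. This is a sketch under stated assumptions: exactly `ratio` input samples arrive between consecutive output samples, only sample counts are tracked (no data), and the function name `samples_per_output` is hypothetical.

```python
def samples_per_output(ratio, num_outputs):
    """Return the number of input samples contained in each output sample."""
    counter = 0  # input samples not yet handed to the output
    ns = 0       # samples represented by the held value (a power of two, or 0)
    counts = []
    for _ in range(num_outputs):
        for _ in range(ratio):
            counter += 1
            if (counter & (counter - 1)) == 0:  # counter is a power of two
                ns = counter
        counts.append(ns)
        counter -= ns  # the output takes ns samples away
        ns = 0
    return counts


print(samples_per_output(4, 5))  # ratio 2^2: [4, 4, 4, 4, 4]
print(samples_per_output(3, 4))  # 2^1 < 3 < 2^2: [2, 4, 2, 4]
print(samples_per_output(5, 4))  # 2^2 < 5 < 2^3: [4, 4, 4, 8]
```

For a ratio of 3 the counts alternate between 2 and 4, and for a ratio of 5 they are 4 or 8, so over time the average number of samples per output matches the ratio.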
If the oversample rate is high, the reduction of the sample rate yields only limited inaccuracies. Depending on the precise error behaviour (so also dependent on the feeding process) the error behaviour can be calculated or estimated. If the rate variations force a bigger change of sample rate ratio than 2 it is relatively simple to develop some idea of the related inaccuracies.
The circuit has an input, indicated as in, with an enable line in-en that provides the sample signal for the input, and an output, indicated by data lines out, with an enable line out-en that carries the sample signal from an external circuit into the flexible decimator.
The circuit has several registers, decimation_part 16, decimation_passed 18 and decimation_divided 20. The critical memory elements are the decimation_passed and decimation_part registers.
The register decimation_passed 18 contains all accumulated samples up to the most recently occurring sample whose position is a power of 2, i.e. 2^0, 2^1, 2^2, 2^3 etc., or 1, 2, 4, 8 etc.
The register decimation_part 16 contains all accumulated samples that have not been stored in decimation_passed register 18. Thus its contents may be sample 3, or sample 5, or sample 5 plus sample 6 etc.
The register decimation_passed 18 contains the total accumulated version. However, a decimator should also divide its output by the number of samples accumulated in the register. The register that contains the divided version of decimation_passed is register decimation_divided 20.
The shift unit 22 performs the division. Since the circuit operates on powers of two, a simple shift block is sufficient to implement a divider. The shift value, held by the block shift_indicator 12, is a function of the value of the counter 10 at the moment when the last power of two was reached.
The counter 10 is incremented by one for each incoming sample and decremented by the most recently reached power of two for each sample on the output. The only limitation on the latter is that the counter must never become negative. This means that two samples on the output that appear relatively quickly after each other may lead to only one decrement, namely on the first sample.
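A minimal sketch of division by a right shift follows. The function name `divide_by_shift` is hypothetical, and the sketch assumes integer samples, so the shift truncates any fractional part.

```python
def divide_by_shift(accumulated, ns):
    """Divide an accumulated sum by ns, where ns is a power of two, using a right shift."""
    if ns < 1 or (ns & (ns - 1)) != 0:
        raise ValueError("ns must be a power of two")
    shift = ns.bit_length() - 1  # log2(ns): the value a shift indicator would hold
    return accumulated >> shift


average = divide_by_shift(40, 8)
print(average)  # 5
```

Because ns is always a power of two, the shift amount depends only on which power of two was last reached, not on the data.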
The & blocks 24, 26, 28 are AND gates; by having the control unit 30 put a 0 signal on the control inputs of the AND blocks, all output bits can be reset. In this way AND block 24 resets the decimation_part register, AND block 26 makes sure that the decimation_passed register is maintained at the same value (no extra added value), and AND block 28 resets the decimation_passed register.
Decimation_part register 16 is always reset unless there is an input sample when the power of two limit is not yet reached; then the sample must be stored in decimation_part register 16.
Decimation_passed register 18 is always kept stable by resetting AND block 26 unless there is an input sample available, and the power of two limit is reached. The decimation_passed register 18 is reset with AND block 28 when the output sample is taken.
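The register behaviour described above can be summarised in a behavioural software model. This is only a sketch, not a description of the actual circuit: the class name `FlexibleDecimatorModel`, the method names `push` and `fetch`, the explicit `ns` field, and the choice to return `None` when no value is held are all assumptions made for illustration.

```python
class FlexibleDecimatorModel:
    """Behavioural model of the register updates described in the text (not RTL)."""

    def __init__(self):
        self.counter = 0            # counter 10
        self.decimation_part = 0    # register 16
        self.decimation_passed = 0  # register 18
        self.ns = 0                 # samples summed in decimation_passed (0 if empty)

    def push(self, sample):
        """Input event: accept one integer sample."""
        self.counter += 1
        if (self.counter & (self.counter - 1)) == 0:
            # Power-of-two limit reached: fold part and the new sample into passed.
            self.decimation_passed += self.decimation_part + sample
            self.decimation_part = 0
            self.ns = self.counter
        else:
            # Limit not reached: passed stays stable, the sample is stored in part.
            self.decimation_part += sample

    def fetch(self):
        """Output event: return the divided value, or None if nothing is held."""
        if self.ns == 0:
            return None  # assumption: a second quick output causes no decrement
        shift = self.ns.bit_length() - 1
        decimation_divided = self.decimation_passed >> shift
        self.counter -= self.ns
        self.decimation_passed = 0  # passed is reset when the output is taken
        self.ns = 0
        return decimation_divided


d = FlexibleDecimatorModel()
for s in [10, 20, 30, 40, 50]:
    d.push(s)
first = d.fetch()   # average of the first four samples
d.push(70)
second = d.fetch()  # average of the pending sample 50 and the new sample 70
third = d.fetch()   # nothing held
print(first, second, third)  # 25 60 None
```

In this run the fifth sample (50) is not lost when the first output is taken: it waits in decimation_part and is folded into the next output.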
In the memory decimation_passed register 18 the first term is found:
Output(M)*ns(M)
with the ns(M) always a power of two.
In the memory decimation_part register 16, the second term can be found:
A typical sequence of contents that might appear in these two components would, assuming no samples are fetched from the output, be as follows:
The first two samples go into the decimation_passed register since they are powers of two. The third sample goes into the decimation_part register since it is not a power of two.
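The table of contents is not reproduced in this text. A sequence consistent with the description can be generated as follows, assuming no output samples are taken; the function name `trace_without_output` is hypothetical, and sample numbers are used in place of sample values.

```python
def trace_without_output(num_samples):
    """List which sample numbers sit in each register after every input sample."""
    passed, part, rows = [], [], []
    for n in range(1, num_samples + 1):
        if (n & (n - 1)) == 0:  # n is a power of two
            passed = passed + part + [n]
            part = []
        else:
            part = part + [n]
        rows.append((n, list(passed), list(part)))
    return rows


for counter, in_passed, in_part in trace_without_output(8):
    print(counter, in_passed, in_part)
# 1 [1] []
# 2 [1, 2] []
# 3 [1, 2] [3]
# 4 [1, 2, 3, 4] []
# 5 [1, 2, 3, 4] [5]
# 6 [1, 2, 3, 4] [5, 6]
# 7 [1, 2, 3, 4] [5, 6, 7]
# 8 [1, 2, 3, 4, 5, 6, 7, 8] []
```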
Of course output samples may also be taken at some points, which drastically changes the contents. In the designation used, 0 means no output sample and 1 means an output sample.
It will be seen that the new counter position is now radically different. For each output sample the decimation_passed register is emptied into the output. Thus each next counter position is, relative to the previous one, 1 higher (1 sample in) minus the number of samples in decimation_passed*(output sample). So each time the output is sampled, the counter position is reduced by the number of samples in decimation_passed.
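The counter behaviour can be sketched as follows. The sketch assumes that each step consists of one input sample, followed by an output sample when the corresponding flag is 1; the function name `counter_trace` and this step structure are assumptions for illustration only.

```python
def counter_trace(output_flags):
    """Return the counter position after each step.

    Each step is one input sample, followed by an output sample when the
    flag is 1 (0 means no output sample).
    """
    counter, ns, positions = 0, 0, []
    for flag in output_flags:
        counter += 1
        if (counter & (counter - 1)) == 0:  # power of two reached
            ns = counter
        if flag:
            counter -= ns  # reduced by the number of samples in decimation_passed
            ns = 0
        positions.append(counter)
    return positions


print(counter_trace([0, 0, 1, 0, 0, 0, 1, 1]))  # [1, 2, 1, 2, 3, 4, 1, 0]
```

At the third step the counter reaches 3, the output takes the 2 samples held, and the counter drops to 1; at the seventh step it reaches 5 and drops by 4.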
From these tables it becomes apparent that the AND functions are used for resetting the decimation_passed and decimation_part registers, and also enabling the addition of extra samples from the decimation_part register into the decimation_passed register.
The major advantage of this approach relative to older approaches is that the external sample rate may be chosen independently of the internal process without requiring any setting. A large dynamic range may be achieved by adding enough bits to the word size of decimation_part, decimation_passed and the counter. Each factor of 2 adds one bit to each of these structures. In older structures the extra hardware requires complete registers. The hardware in the new solution is therefore relatively small, which makes it less costly and less power hungry. A dynamic range of a factor 2^16 is for instance quite simple to implement. The decimator also works for any external sample rate ratio (input/output), and the ratio itself may even be dynamic.
There are a few other properties that make the above solution relatively advantageous compared to other circuits that could be derived from the same formulae. Samples are stored in one memory only. In case of hardware failures this ensures that no long term errors can arise. This makes the design robust. The checks of the counter positions are quite straightforward to implement, just like the remainders of the datapath. The separation of the shift operator from the internal memories makes it simple to do the division as a post process, instead of inline. The inline process would require shifting on each storage and fetch from decimation_passed. The post process in the form of a shift also yields the possibility to perform some other operation for the shift.
For many applications decimation is a simple matter of combining the samples, and the actual number used for division is not very relevant. In fact, if the division is done with the wrong number, the results will simply be wrong by some gain factor. In the situation where the input and output sample rates have a fixed ratio which is not a power of two, using a fixed division by a power of two is fully acceptable in many applications. Thus each quantity of samples in the decimator is in fact equal to a possible end value. Thus the structure would change to a simpler circuit as shown in
The circuit will now just integrate the samples from the input and present the integrated value, not necessarily with the correct division, but with an approximately correct shift. Therefore the maximum division error is a factor of almost 2.
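The size of this gain error can be illustrated with a short sketch. It assumes the shift is chosen as the integer part of log2 of the number of integrated samples; the function name `approximate_average` is hypothetical and the figure referred to above is not reproduced here.

```python
def approximate_average(samples):
    """Sum all samples, then shift by floor(log2(count)) instead of dividing by count."""
    count = len(samples)
    if count < 1:
        raise ValueError("need at least one sample")
    shift = count.bit_length() - 1
    return sum(samples) >> shift


result = approximate_average([8] * 7)  # exact mean is 8
print(result)  # 14, a gain error of 7/4 = 1.75
print(approximate_average([8] * 8))  # 8, exact when the count is a power of two
```

The gain error equals the sample count divided by the largest power of two not exceeding it, so it is always at least 1 and below 2.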
If the division error is too large, a reduced complexity divider may be appropriate as shown in
In the first formula the base number was always a power of two since that yielded a simple division in the form of a binary shift. In fact, if the decimation is coded as a number of BCD terms, shifting over a BCD section would yield a division by 10. If that is more attractive than the standard binary shift, such coding is implicitly attractive, although the hardware is slightly more complex. Such coding is applicable to any base number, including 3 (ternary coding), 4 (which is just a power of 2), 5 (quinary coding) etc. This change does not in fact change the block diagram, but merely the coding inside the blocks.
Other variations of the invention are also possible. Of course the block diagrams allow for quite a variety of implementations. One such variant, shown in
In this embodiment there is a shift 34 inside the decimation memory loop. This block diagram has two shifts (in order to maintain correct addition) in the memories. The drawbacks, however, are extra hardware requirements (two shifters instead of one) and smaller margins on timing.
Another embodiment with redundant data is shown in
The invention is in effect an auto-adaptive structure that does not require any setting, as shown in
It will be apparent to one skilled in the art that many additional variations of the invention are possible without departing from the scope of the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
0201333 | Jan 2002 | GB | national |
Number | Date | Country |
---|---|---|
20030177156 A1 | Sep 2003 | US |