The present invention is generally directed to computer pipelines. More specifically, the present invention is directed to increasing the processing performance of a pipelined averaging filter.
The technique of pipelining to increase data throughput has long been known in the art. A long task is divided into components and each component is distributed to one processor. A new task can begin even though the former tasks have not been completed. In pipelined operation, different components of different tasks are executed at the same time by different processors. Presently, pipelines are in widespread use in nearly all types of data processing electronic equipment, such as sophisticated supercomputers, in which fast and efficient processing of data is essential to the overall operation of the system.
Pipelines have been developed in a wide variety of electronic manufacturing and circuit design configurations. One example of the use of a pipeline is an averaging filter used in digital signal processing. An averaging filter generally consists of at least one subtractor module in series with at least one adder module. Each subtractor and adder module typically has numerous adder logic units and data registers. In order to increase the processing efficiency and speed in each of the subtractor and the adder modules, their respective internal adder logic units and registers are typically placed in pipelined arrangements. While an effective approach for increasing processing efficiency and speed, the typical pipelined configuration is not without shortcomings in other aspects. These shortcomings are even more apparent in high speed averaging filters which operate at high clock rates.
A pipelined processor, such as an averaging filter, including at least one subtractor section and at least one adder section is disclosed. Both the subtractor section and the adder section have a plurality of adder logic units. In comparison to the conventional processor, the processor of the present invention is streamlined by the application of one or more of three techniques. First, there is the interleaving approach where the subtractor section and the adder section are interleaved with one another. Second, there is the one delay feedback approach where the adder section includes a one delay feedback for each of the adder logic units. Third, there is the delay enable signal output approach where the averaging filter includes a delay enable signal output for each of the adder logic units of the adder section.
The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate one or more exemplary embodiments of the present invention, and together with the detailed description, serve to explain the principles and exemplary implementations of the invention.
In the drawings:
Various exemplary embodiments of the present invention are described herein in the context of methods and apparatus for increasing the processing performance of pipelined averaging filters. Those of ordinary skill in the art will realize that the following detailed description of the present invention is illustrative only and is not intended to be in any way limiting. Other embodiments of the present invention will readily suggest themselves to such skilled persons having the benefit of this disclosure. Reference will now be made in detail to exemplary implementations of the present invention as illustrated in the accompanying drawings. The same reference indicators will be used throughout the drawings and the following detailed descriptions to refer to the same or like parts.
In the interest of clarity, not all of the routine features of the exemplary implementations described herein are shown and described. It will, of course, be appreciated that in the development of any such actual implementation, numerous implementation-specific decisions must be made in order to achieve the developer's specific goals, such as compliance with application- and business-related constraints, and that these specific goals will vary from one implementation to another and from one developer to another. Moreover, it will be appreciated that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking of engineering for those of ordinary skill in the art having the benefit of this disclosure.
Turning first to
y(n)=ax(n)+(1−a)y(n−1) Eq. 1
where y(n) represents the filter output at time n, ax(n) represents the adder component, (1−a) represents the subtractor component, and y(n−1) represents the delay element at time n−1.
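The recurrence of Eq. 1 can be illustrated in software. The following is a minimal sketch only, not the disclosed hardware: it uses floating point arithmetic, and the function name averaging_filter, the initial value y0, and the restriction of a to the range 0 to 1 are assumptions made for illustration.

```python
def averaging_filter(samples, a, y0=0.0):
    """Apply y(n) = a*x(n) + (1 - a)*y(n-1) to a sequence of samples.

    y0 is the assumed initial value of the delayed output y(n-1).
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a is assumed to lie in the range [0, 1]")
    outputs = []
    y_prev = y0  # y(n-1): the delayed output fed back into the filter
    for x in samples:
        y = a * x + (1.0 - a) * y_prev  # Eq. 1
        outputs.append(y)
        y_prev = y  # becomes y(n-1) for the next sample
    return outputs
```

For example, averaging_filter([1, 1, 1], 0.5) returns [0.5, 0.75, 0.875], showing the output converging toward a constant input.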
For discussion purposes, the adder module 100 shown operates on thirty six bit numbers and may be used in a conventional thirty six bit averaging filter. The pipelining takes the form of splitting each add operation into smaller add blocks, each of which can be completed within one clock period of the digital clock that drives the circuit. The faster the clock, the smaller the add blocks have to be. In this case, assume that the thirty six bit addition is broken into six blocks of six bit additions. Other combinations of blocks are also possible, with multiples of two bits being preferred. The addition results are stored in registers or D-type flip flops (DFFs). As shown, the thirty six bit adder module 100 includes adding blocks in the form of six adder logic units a1–a6 and five associated carry delay gates c1–c5. The adder module 100 also includes three hundred delay elements in the form of fifty delayed flip flop gate icons D0–D50 where each of the fifty icons D0–D50 represents six actual DFFs.
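The splitting of a thirty six bit addition into six blocks of six bit additions linked by carries can be sketched as follows. This is an arithmetic illustration only, with no clocking or registers; the names BLOCK_BITS, NUM_BLOCKS, split_blocks, and blockwise_add are hypothetical and are not taken from the drawings.

```python
BLOCK_BITS = 6                        # width of each add block
NUM_BLOCKS = 6                        # six blocks cover thirty six bits
BLOCK_MASK = (1 << BLOCK_BITS) - 1    # 0b111111


def split_blocks(value):
    """Split a 36-bit value into six 6-bit segments, least significant first."""
    return [(value >> (BLOCK_BITS * i)) & BLOCK_MASK for i in range(NUM_BLOCKS)]


def blockwise_add(x, y):
    """Add two 36-bit numbers as six 6-bit additions chained by carries.

    Returns (sum modulo 2**36, carry out of the last block).
    """
    carry = 0
    result = 0
    for i, (xs, ys) in enumerate(zip(split_blocks(x), split_blocks(y))):
        total = xs + ys + carry                       # at most 7 bits wide
        result |= (total & BLOCK_MASK) << (BLOCK_BITS * i)
        carry = total >> BLOCK_BITS                   # carry into the next block
    return result, carry
```

Each pass through the loop corresponds to the work of one adder logic unit, and the carry variable plays the role that a carry delay gate plays between adjacent units.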
In order to complete the addition process, more than one processing cycle is required. The thirty six bit number (bits 0–35) is first parsed into six segments of six bit lengths and entered into the adder module 100 via input gates 120–125. Each six bit segment is added in turn along parallel paths. In the first clock cycle, the adder logic unit a1 receives the first of six bit segments x(5–0), performs the addition operation, and stores the result in DFF 10 with the carry going to carry delay gate c1. Simultaneously, the bits in each of the other five segments of the thirty six bit number are loaded into the DFFs 1, 3, 5, 7, 9, respectively. Subsequent additions of the next bit segments are completed in the subsequent clock cycles in a similar manner via adder logic units a2–a5. The addition result for each six bit segment is stored or accumulated until all of the six bit segments have been processed. For the first six bit segment, the stored result is passed down a chain of DFFs 10, 20, 29, 37, 44, 50 with each subsequent clock cycle. Finally, once all of the necessary operations are carried out, the results are outputted in the form of outputs 110–116.
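The cycle-by-cycle behavior described above can be approximated with a small simulation. The sketch below is a simplified model under stated assumptions, not a reproduction of the adder module 100: each in-flight addition is held in a single record that stands in for the chains of DFFs, one new operand pair is assumed to enter per clock cycle, and the name pipelined_add_stream is hypothetical.

```python
def pipelined_add_stream(pairs, block_bits=6, num_blocks=6):
    """Simulate a block-pipelined adder, one block addition per job per clock.

    pairs: iterable of (x, y) operands, one new pair entering per clock cycle.
    Returns a list of (sum, carry_out, completion_cycle) in input order.
    """
    mask = (1 << block_bits) - 1
    pending = list(pairs)
    stages = [None] * num_blocks   # stages[i]: job handled by block i this cycle
    finished = []
    cycle = 0
    while pending or any(job is not None for job in stages):
        cycle += 1
        if pending:
            x, y = pending.pop(0)
            stages[0] = [x, y, 0, 0]   # operands, accumulated sum, carry register
        # Every occupied block works in the same clock cycle on a different job.
        for i, job in enumerate(stages):
            if job is None:
                continue
            x, y, acc, carry = job
            shift = block_bits * i
            total = ((x >> shift) & mask) + ((y >> shift) & mask) + carry
            job[2] = acc | ((total & mask) << shift)   # store this block's result
            job[3] = total >> block_bits               # carry for the next block
        done = stages[-1]
        if done is not None:
            finished.append((done[2], done[3], cycle))
        stages = [None] + stages[:-1]   # clock edge: every job advances one block
    return finished
```

In this model the first sum completes after six clock cycles and each later sum completes one cycle after the previous one, which illustrates why a pipelined adder needs more than one cycle per addition yet can still accept new data every cycle.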
The subtractor module that is necessary to the operation of an averaging filter is also made of adders and therefore also uses adding blocks as shown in
Turning now to
As shown, the averaging filter 200 employs an interleaving approach. According to this approach, the adder logic units s1–s4 in the subtractor section 220 are interleaved by being coupled to a corresponding adder logic unit a1–a4 in the adder section 210. In this way an output of each of the adder logic units s1–s4 in the subtractor section 220 is inputted directly into an input of a corresponding adder logic unit in the adder section 210. One advantage of the interleaving approach is that each segment of the inputted data string which is processed by an adder logic unit in the subtractor section 220 is then processed by an adder logic unit in the adder section 210 without first having to await the processing completion of the entire string in the subtractor section 220. For example, segment x(5–0) is processed by the adder logic unit s1 in the subtractor section 220 and then by the adder logic unit a1 in the adder section 210 before segment x(18) is processed by the adder unit s4 in the subtractor section 220. By comparison to the sequential two module approach described with respect to
Also shown in
Also shown in
Turning now to
Applied together, the three approaches presented with respect to
It should be noted that the three approaches presented with respect to
Other embodiments, features, and advantages of the present invention will be apparent to those skilled in the art from a consideration of the foregoing specification as well as through practice of the invention and alternative embodiments and methods disclosed herein. Therefore, it should be emphasized that the specification and examples are exemplary only, and that the true scope and spirit of the invention is limited only by the following claims.
This application claims the benefit of the U.S. Provisional Patent Application Ser. No. 60/287,229, filed on Apr. 27, 2001.