1. Field of Invention
The present invention relates to logic elements for use with programmable logic devices or other similar devices.
2. Description of Related Art
Programmable logic devices (PLDs) (also sometimes referred to as CPLDs, PALs, PLAs, FPLAs, EPLDs, EEPLDs, LCAs, FPGAs, or by other names) are well-known integrated circuits that combine the advantages of fixed integrated circuits with the flexibility of custom integrated circuits. Such devices typically provide an “off the shelf” device having at least a portion that can be electrically programmed to meet a user's specific needs. Application specific integrated circuits (ASICs) have traditionally been fixed integrated circuits; however, it is possible to provide an ASIC that has a portion or portions that are programmable. Thus, it is possible for an integrated circuit device to have qualities of both an ASIC and a PLD. The term PLD as used herein will be considered broad enough to include such devices.
PLDs typically include blocks of logic elements, which are sometimes referred to as logic array blocks (LABs) or “configurable logic blocks” (CLBs). Logic elements (LEs), which are also referred to by other names such as “logic circuits” or “logic cells”, may include a look-up table (LUT), product term, carry-out chain, register, and other elements.
Logic elements, including LUT-based logic elements, typically include configurable elements holding configuration data that determine the particular function or functions carried out by the logic element. A typical LUT circuit may include RAM bits that hold data (a “1” or “0”). However, other types of configurable elements may be used. Some examples may include static, magnetic, ferro-electric or dynamic random access memory, electrically erasable read-only memory, flash, fuse, and anti-fuse programmable connections. The programming of configuration elements could also be implemented through mask programming during fabrication of the device. While mask programming may have disadvantages relative to some of the field programmable options already listed, it may be useful in certain high volume applications. For purposes herein, the generic term “memory element” will be used to refer to any programmable element that may be configured to determine functions implemented by a PLD.
As discussed above, PLDs are commonly constructed using a lookup table (LUT) as the basic logic element. For example, a K-input lookup table (K-LUT) typically includes 2^K programmable memory elements, and a 2^K to 1 multiplexer, selecting one of the storage elements under the control of the K select inputs to the multiplexer. These K inputs can be considered to be the inputs to a K-input logic function which can implement any particular required logic function by setting the contents of the memory elements to the appropriate values.
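By way of illustration only, the K-LUT described above may be modeled in software as a list of 2^K stored bits indexed by the K select inputs. The following Python sketch is a behavioral model under stated assumptions, not a description of any circuit; the names make_lut and read_lut, and the choice of the first input as the least significant index bit, are hypothetical.

```python
from itertools import product

def make_lut(k, func):
    """Return the 2**k memory-element contents that implement func (assumed helper)."""
    # Entry i stores func applied to the bits of i; input 0 is the least significant bit.
    return [int(bool(func(*[(i >> j) & 1 for j in range(k)]))) for i in range(2 ** k)]

def read_lut(memory, inputs):
    """Model the 2**k-to-1 multiplexer: the k select inputs pick one memory element."""
    index = sum(bit << j for j, bit in enumerate(inputs))
    return memory[index]

# Example: configure a 4-LUT as (a AND b) XOR (c OR d).
def example(a, b, c, d):
    return (a & b) ^ (c | d)

memory = make_lut(4, example)
# The configured LUT reproduces the function for every input combination.
matches = all(read_lut(memory, bits) == example(*bits)
              for bits in product((0, 1), repeat=4))
```

Changing only the stored contents changes the implemented function, which is the sense in which the K-LUT can implement any K-input logic function.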
There is a tradeoff between cost and speed of a logic circuit constructed with LUTs. Typically the cost of each LUT grows exponentially with the choice of K, but the number of LUTs required to build a logic circuit decreases more slowly with larger values of K. However, the number of LUTs that are in series for a larger value of K will be reduced, making the logic circuit faster. For example, with K=4, sixteen memory elements and a 16:1 multiplexer are required to build a single LUT, and for K=6, sixty-four memory elements and a 64:1 multiplexer are required. A given logic circuit might require one-thousand 4-LUTs, but only eight-hundred 6-LUTs. Under these assumptions, more hardware is required to construct the 6-LUT logic elements because the reduced number of LUTs is insufficient to compensate for the larger complexity of each LUT. However, the increased hardware requirements for the 6-LUT circuitry are offset by a reduction in the delay. The longest path through a logic circuit might be ten 4-LUTs versus eight 6-LUTs. Thus the 6-LUT version of the circuit might be larger, but faster. Further, the 6-LUT circuit would likely require less programmable routing in a PLD, offsetting some of its higher cost.
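The arithmetic behind this example may be sketched as follows. The Python fragment counts memory elements only (multiplexer and routing costs are not modeled), using the LUT counts and path lengths given in the example above; the function name is hypothetical.

```python
def total_memory_elements(num_luts, k):
    """Memory elements needed for num_luts K-LUTs, at 2**k elements per LUT."""
    return num_luts * 2 ** k

# Figures from the example: one-thousand 4-LUTs versus eight-hundred 6-LUTs.
elements_4lut = total_memory_elements(1000, 4)   # 1000 * 16
elements_6lut = total_memory_elements(800, 6)    # 800 * 64

# Longest path in LUT levels, also from the example.
depth_4lut, depth_6lut = 10, 8

print(f"4-LUT circuit: {elements_4lut} memory elements, depth {depth_4lut}")
print(f"6-LUT circuit: {elements_6lut} memory elements, depth {depth_6lut}")
```

Under these assumptions the 6-LUT circuit uses 51,200 memory elements against 16,000 for the 4-LUT circuit, while its longest path has two fewer LUT levels.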
One reason for the lack of efficiency of larger LUTs is that not all logic functions will use all K inputs. For the example described above, the eight-hundred 6-LUTs might actually include three-hundred 6-input functions, three-hundred 5-input functions, one-hundred 4-input functions, and one-hundred 3-input functions. Thus, the LE based on 6-LUTs is only being used to its fullest extent in three-hundred out of eight-hundred instances.
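The extent of the under-use may be estimated with a short calculation. The Python sketch below takes the function mix from the example above and assumes, for illustration only, that a j-input function needs 2^j memory elements; the variable names are hypothetical.

```python
K = 6
# Number of functions of each input count, from the example above.
function_mix = {6: 300, 5: 300, 4: 100, 3: 100}

total_luts = sum(function_mix.values())
fully_used_luts = function_mix[K]

# Memory elements provided by the 6-LUTs versus those a j-input function needs.
elements_available = total_luts * 2 ** K
elements_needed = sum(count * 2 ** j for j, count in function_mix.items())
utilization = elements_needed / elements_available

print(f"{fully_used_luts} of {total_luts} LUTs use all {K} inputs")
print(f"memory-element utilization: {utilization:.1%}")
```

By this measure only 31,200 of the 51,200 available memory elements (about 61%) carry useful configuration data.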
In addition to LUT operations, some PLDs have included specialized circuitry to perform arithmetic operations efficiently. However, these examples have typically been limited to simple arithmetic operations (e.g., an addition of two inputs) and have generally not exploited internal LUT structures. Increasing the capability of a logic element to perform more complex arithmetic functions while adding only a small amount of additional logic can significantly increase the effective logic density of a PLD and thereby decrease costs.
Thus, there is a need for logic elements that incorporate arithmetic structures with conventional LUT structures to provide greater functionality.
According to some embodiments, arithmetic structures in logic elements result from combining inverters and pass gates (or other multiplexing hardware) with LUT hardware. According to other embodiments, arithmetic structures in logic elements result from combining dedicated adder hardware (e.g., including XOR units) and fracturable LUT hardware. According to other embodiments, arithmetic structures in logic elements result from providing complementary input connections between multiplexers and LUT hardware. In this way, the present invention enables the incorporation of arithmetic structures with LUT structures in a number of ways. The present invention provides flexibility for combining functions in logic elements, including various arithmetic functions, with a range of relative speeds and costs. In some operational settings, the preferred choice will depend on the typical mix of functions to be implemented in a PLD and the relative importance of speed and area.
Inverters and Pass Gates
Arithmetic structures can result from combining inverters and pass gates (or other multiplexing hardware) with LUT hardware in a logic element.
The pass gates 106 are controlled by pass gate controls en0, en1, and en2 so that when a pass gate control is zero the signal does not pass and when a pass gate control is one the signal does pass. A pass-gate control element 110 determines the pass gate controls en0, en1, and en2 for the pass gates 106 from the value of c in the logic mode and from the value of cin in the arithmetic mode as shown in the adjacent logic table 112. Preferably a configuration bit in the signal generation for en0, en1, and en2 controls how these signals are generated, depending on whether the logic element 100 is in arithmetic or logic mode. These control values enable the logic element 100 to provide a general logic function in the logic mode and an arithmetic sum in the arithmetic mode. As provided by the table 112, single values pass to x0 and x1, the inputs of the output multiplexer 108, in both the logic and arithmetic modes.
For the arithmetic mode, two multiplexers 114 are connected to y0, y1, y2, and y3, the output values of the 2-LUTs 102. A carry chain element 116 is connected to the multiplexers 114 as well as a carry-chain input cin and a carry-chain output cout.
In the conventional logic mode of operation, the output multiplexer 108 produces z1(a,b,c,d), a logical function of the four control inputs a, b, c, d. The pass gate controls are set by the pass-gate control element 110 so that the four controls 101, the four 2-LUTs 102, the two inverters 104, and the six pass gates 106 form a conventional 4-LUT. The penultimate stage of this LUT is controlled by the c input 101 of the LUT so that the signals x0 and x1 are two 3-LUT outputs and the final stage is controlled by the d input 101 in the output multiplexer 108.
In the arithmetic mode of operation, the output multiplexer 108 produces
sum(a,b,cin,d)=f0(a,b,d)⊕f1(a,b,d)⊕cin,
an arithmetic sum of two logical functions of the control inputs a, b, and d, where cin is the carry-chain input and cout is the corresponding carry-chain output. The functions f0 and f1 are not stored directly in the LUT. Instead, the propagate and generate functions, p and g, are stored. These are well known carry logic signals defined in terms of f0 and f1 as:
p(a,b,d)=f0(a,b,d)⊕f1(a,b,d),
g(a,b,d)=f0(a,b,d)·f1(a,b,d).
Because f0 and f1 are both known functions, the p and g functions can be computed in software and loaded into the LUT contents (i.e., the memory elements of the 2-LUTs 102). More precisely, the first and third of the 2-LUTs 102 are used to compute the propagate function p under the two conditions d=0 (shown as p
The embodiment shown in
It can also be appreciated that although the carry chain element 116 is illustrated for a single bit ripple carry, the use of propagate and generate functions (p and g) allows any one of a number of multi-bit carry chain structures to be implemented.
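The propagate and generate formulation described in this section may be sketched in software as follows. The Python model is a minimal illustration only: it computes the p and g tables from two example functions f0 and f1, as configuration software might, and then forms the sum as p⊕cin. The carry-out expression g | p·cin is the conventional ripple-carry relation and is assumed here rather than taken from the text; the function names are hypothetical.

```python
from itertools import product

def pg_tables(f0, f1):
    """Precompute propagate and generate over all (a, b, d), as software would."""
    p, g = {}, {}
    for a, b, d in product((0, 1), repeat=3):
        x, y = f0(a, b, d), f1(a, b, d)
        p[(a, b, d)] = x ^ y   # p = f0 XOR f1
        g[(a, b, d)] = x & y   # g = f0 AND f1
    return p, g

def arithmetic_bit(p, g, a, b, d, cin):
    """Return (sum, cout) for one bit using the stored p and g values."""
    pv, gv = p[(a, b, d)], g[(a, b, d)]
    total = pv ^ cin               # sum = p XOR cin = f0 XOR f1 XOR cin
    cout = gv | (pv & cin)         # assumed conventional single-bit ripple carry
    return total, cout

# Example functions (arbitrary choices for illustration).
f0 = lambda a, b, d: a & b
f1 = lambda a, b, d: a | d
p, g = pg_tables(f0, f1)

# The result matches ordinary addition of f0, f1 and cin for every input.
consistent = all(
    2 * arithmetic_bit(p, g, a, b, d, cin)[1] + arithmetic_bit(p, g, a, b, d, cin)[0]
    == f0(a, b, d) + f1(a, b, d) + cin
    for a, b, d, cin in product((0, 1), repeat=4)
)
```

Because only p and g are consumed by the carry logic, the same stored tables could feed a different multi-bit carry structure without changing the LUT contents.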
Dedicated Adder Hardware and Fracturable LUT Hardware
Arithmetic structures can result from combining dedicated adder hardware (e.g., including XOR units) and fracturable LUT hardware in a logic element.
Fracturable LUTs may be understood as modifications or adaptations of conventional LUTs. In general, a fracturable LUT includes a conventional LUT design that has been modified to include additional outputs possibly with additional multiplexers.
In general, a conventional K-LUT includes a configuration of K control inputs and 2^K memory elements together with associated multiplexing.
Those skilled in the art will appreciate that a 4-LUT such as the 4-LUT 200 of
For example, the multiplexer 241 closest to the output 215 may be called a first level of multiplexers in the overall 2:1 multiplexer tree of the 4-LUT 200 and the next set of two multiplexers 242 may be called a second level in that tree. By extending the structure of
As will be appreciated by those skilled in the art, a 4:1 multiplexer may be implemented in a manner other than the illustrated multiplexer 240, which has a “tree” of three 2:1 multiplexers 241, 242 at two distinct levels. For example, a 4:1 multiplexer might be implemented by four pass gates with each of the pass gates being controlled by the decoded output of two control signals. In such an example, the four pass gates themselves would not be differentiated by levels relative to each other; however, the 4:1 multiplexer would effectively implement two levels of 2:1 multiplexing.
The principle of a fracturable LUT is illustrated in
The fracturable LUT 300 includes two additional output functions. A first additional output function z0(a,b,c) is provided by the first internal multiplexer 308, and a second additional output function z2(a,b,d) is provided by an additional internal multiplexer 310.
The additional output functions provide additional output capabilities so that, for example, in one operational mode the LUT 300 provides a complete function of the four controls (i.e., z1(a,b,c,d)) while in another mode the top half of the LUT 300 provides a complete function of three controls (i.e., z0(a,b,c)) and the bottom half of the LUT 300 also provides a complete function of three controls (i.e., z2(a,b,d)). Thus the LUT 300 can implement two 3-input functions that share the inputs a and b. It can be appreciated that there are a variety of ways to select pieces of a LUT to use to provide different numbers of functions with different numbers of signals used for their inputs.
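The two operating modes may be illustrated with a behavioral model. The Python sketch below assumes one possible arrangement, which is not taken from the figures: sixteen memory bits grouped as four 2-LUTs of (a,b), a top half selected by c, and a bottom half that is selected by c for the full output z1 but by d for the additional output z2. The class and helper names are hypothetical.

```python
from itertools import product

class Fracturable4LUT:
    """Behavioral model of a 4-LUT with two extra 3-input outputs (assumed topology)."""

    def __init__(self, bits):
        if len(bits) != 16:
            raise ValueError("a 4-LUT needs 16 memory bits")
        # Four 2-LUTs, each a function of (a, b).
        self.quarters = [bits[i:i + 4] for i in range(0, 16, 4)]

    def _y(self, q, a, b):
        return self.quarters[q][a + 2 * b]

    def z0(self, a, b, c):
        # Top half: 3-input function of (a, b, c).
        return self._y(1, a, b) if c else self._y(0, a, b)

    def z2(self, a, b, d):
        # Bottom half through the additional multiplexer, controlled by d.
        return self._y(3, a, b) if d else self._y(2, a, b)

    def z1(self, a, b, c, d):
        # Unfractured mode: complete function of all four controls.
        bottom = self._y(3, a, b) if c else self._y(2, a, b)
        return bottom if d else self.z0(a, b, c)

def program_two(f, g):
    """Memory contents placing f(a,b,c) in the top half and g(a,b,d) in the bottom half."""
    bits = []
    for func in (f, g):
        for third in (0, 1):
            for b in (0, 1):
                for a in (0, 1):
                    bits.append(func(a, b, third) & 1)
    return bits

xor3 = lambda a, b, c: a ^ b ^ c
maj3 = lambda a, b, d: (a & b) | (a & d) | (b & d)
lut = Fracturable4LUT(program_two(xor3, maj3))
```

In this model the same sixteen bits serve either one 4-input function (through z1) or two 3-input functions sharing a and b (through z0 and z2).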
Dedicated adder hardware can be configured for example by combining XOR units and multiplexers. In
In each of these examples, the function a⊕b is used to select between the cin and one of the two inputs a or b. It can be seen that it does not matter whether a or b is used for the input to the multiplexer since, if a⊕b is false, which is the case when the multiplexer is selecting the a or b signal, then a and b must have the same value.
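This equivalence can be checked exhaustively. The Python sketch below models the carry-out multiplexer with a⊕b as its select signal and confirms that feeding either a or b to the data input yields the same result, namely the carry of a+b+cin; the function name and the use_a flag are hypothetical.

```python
from itertools import product

def carry_out(a, b, cin, use_a=True):
    """Carry-out multiplexer: a XOR b selects cin, otherwise one of the inputs."""
    select = a ^ b
    if select:
        return cin
    # When a XOR b is false, a and b are equal, so either may be forwarded.
    return a if use_a else b

results = [
    (carry_out(a, b, cin, True), carry_out(a, b, cin, False), (a + b + cin) >> 1)
    for a, b, cin in product((0, 1), repeat=3)
]
# Both variants agree with each other and with the arithmetic carry.
equivalent = all(x == y == expected for x, y, expected in results)
```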
In the embodiments presented below, fracturable LUTs are combined with alternative versions of addition hardware such as those illustrated in
In a conventional logic mode of operation, the output multiplexer 506 provides a logic function of the controls: z1(a,b,c,d).
In an arithmetic mode of operation, the logic element 500 uses a dedicated XOR gate 512 to perform the output function and a multiplexer 510 to perform the carry-out function. Two of the 2-LUTs 504 are used to compute a function z0 and its complement. Thus the first internal multiplexer 508 controlled by c causes the 3-LUT to compute p=z0(a,b)⊕c. The XOR 512 and carry-out select multiplexer 510 compute the arithmetic function sum=p⊕cin and the carry-chain output cout=p·cin|
In contrast to the embodiment shown in
In this embodiment a 4-LUT is fractured so that a 3-LUT is used to generate a function of two inputs which is XOR-ed with one of the controls c, which is denoted as an additive control. More generally, a K-LUT can be fractured so that a (K−1)-LUT is used to generate a function of (K−2) inputs which is XOR-ed with one of the controls
This embodiment is similar to the one shown in
In a conventional logic mode of operation, the output multiplexer 906 provides a logic function of the controls: z1(a,b,c,d).
In an arithmetic mode of operation, the logic element 900 uses a dedicated XOR gate 912 to perform the output function and a multiplexer 910 to perform the carry-out function. The internal multiplexers 908, 909 are used to compute a function z0(a,b,c) and its complement. The output multiplexer 906 controlled by d computes p=z0(a,b,c)⊕d. The XOR 912 and carry-out select multiplexer 910 compute the arithmetic function sum=p⊕cin and the carry-chain output cout=p·cin|
The embodiments shown
In a conventional logic mode of operation, the output multiplexer 1106 provides a logic function of the controls: z1(a,b,c,d).
In an arithmetic mode of operation, the first internal multiplexer 1108 produces a logic function z0(a,b,c), the second internal multiplexer 1109 produces a logic function z2(a,b,c), and the first XOR unit 1112 produces an arithmetic function sum=z2(a,b,c)⊕D⊕d⊕cin, where cout is the corresponding carry-chain output given by cout=p·cin|
Additional embodiments result from the incorporation of dedicated hardware for arithmetic rather than using one or more XOR gates as in the above embodiments.
In a conventional logic mode of operation, the output multiplexer 1306 provides a logic function of the controls: z1(a,b,c,d).
In an arithmetic mode of operation, the first internal multiplexer 1308 provides a logic function z0(a,b,c), the third internal multiplexer provides a logic function z2(a,b,d), and the adder unit 1312 provides an arithmetic function z0(a,b,c)+z2(a,b,d).
The embodiment of
Multiplexers with Complementary Input Connections
Arithmetic structures in a logic element can result from providing complementary input connections between multiplexers and LUT hardware.
In a conventional logic mode of operation, the first complementary multiplexer 1406 provides a logic function of the controls: z0(a,b,c,d).
In an arithmetic mode of operation, the output multiplexer 1410 provides an arithmetic function z0(a,b,c,d⊕cin), where cout is the corresponding carry-chain output. In this case, the control d 1402 is the additive control, and the complementary multiplexers 1406, 1408 provide functions that are complementary with respect to this argument (i.e., z0(a,b,c,d) and z0(a,b,c,¬d), where ¬d denotes the complement of d). The output multiplexer 1410 is controlled by the carry-chain input cin to select between these two functions to produce z0(a,b,c,d⊕cin), an output form that includes the functional form f(a,b,c)⊕d⊕cin used in above embodiments (cf.
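The selection between complementary functions may be sketched as follows. The Python model is illustrative only: it evaluates an arbitrary example function with d and with the complement of d, lets cin choose between the two results, and checks that the outcome equals the function evaluated at d⊕cin. The function names are hypothetical.

```python
from itertools import product

def complementary_select(z0, a, b, c, d, cin):
    """Output multiplexer controlled by cin, fed by two complementary evaluations."""
    direct = z0(a, b, c, d)            # first complementary multiplexer
    complemented = z0(a, b, c, 1 - d)  # second one sees the complement of d
    return complemented if cin else direct

# Arbitrary example function of four controls.
example = lambda a, b, c, d: (a & d) | (b ^ c)

general_form_holds = all(
    complementary_select(example, a, b, c, d, cin) == example(a, b, c, d ^ cin)
    for a, b, c, d, cin in product((0, 1), repeat=5)
)

# Special case z0 = f(a,b,c) XOR d, which yields f(a,b,c) XOR d XOR cin.
f = lambda a, b, c: (a & b) | c
xor_form = lambda a, b, c, d: f(a, b, c) ^ d
special_form_holds = all(
    complementary_select(xor_form, a, b, c, d, cin) == f(a, b, c) ^ d ^ cin
    for a, b, c, d, cin in product((0, 1), repeat=5)
)
```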
Use of Larger LUTs with Split Inputs
It can be appreciated that all of the above methods can be used with LUTs of any size. For example, in
For large LUTs, such as 6-LUTs, it may be desirable to perform two bits of arithmetic per LE to mitigate the larger cost of these LEs. Further, it may be desirable to perform two logic operations in a larger LE for similar reasons. For this purpose, the concept of a fracturable LUT can be extended to also split the inputs of the LUT. Thus, a fracturable (K,M)-LUT has 2^K CRAM cells and can implement a single arbitrary K-input function. To increase its efficiency when a mix of function sizes is to be implemented, as will typically occur in a PLD, the (K,M)-LUT can also be used as two independent logic functions, each of up to (K−1) inputs. Because this will require more than K logic signals, extra inputs must be provided to the LE. In the (K,M)-LUT, an extra M signals are included as inputs, so the LE has a total of K+M inputs. This allows it to implement two functions that have a total of K+M unique signals. For example, a (6,2)-LUT has a total of eight input signals and can implement a five-input function and a three-input function if all the signals are different. Alternatively, it can implement two different five-input functions if two signals are identical, so that there are only eight unique signals required for the LE.
In the case of M>0, it is necessary to split the LUT inputs to create extra inputs. This will be done by breaking one or more of the common lines to the two halves of the LUT into two separate signals.
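The input-count constraint of a (K,M)-LUT may be sketched as a simple check. The Python fragment below is a simplified counting model only: it tests that each function has at most K−1 inputs and that the two functions together use at most K+M unique signals, and it ignores which particular inputs are split. The function name and signal names are hypothetical.

```python
def fits_km_lut(k, m, inputs_first, inputs_second):
    """Counting check: can two functions share one fractured (K,M)-LUT?"""
    first, second = set(inputs_first), set(inputs_second)
    each_small_enough = len(first) <= k - 1 and len(second) <= k - 1
    enough_le_inputs = len(first | second) <= k + m
    return each_small_enough and enough_le_inputs

# (6,2)-LUT: a five-input and a three-input function with all signals different.
case_disjoint = fits_km_lut(6, 2, "abcde", "fgh")

# (6,2)-LUT: two five-input functions sharing two signals (eight unique signals).
case_shared_two = fits_km_lut(6, 2, "abcde", "abfgh")

# Sharing only one signal would need nine unique signals, one more than available.
case_shared_one = fits_km_lut(6, 2, "abcde", "afghi")
```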
The embodiments described above with respect to
Analogously to the logic element 100 of
The pass gates 1606 are controlled by pass gate controls en0, en1, en2, en3, en4, and en5 in an arrangement that duplicates the structures in
For the arithmetic mode, two carry-chain elements 1616, 1617 are cascaded. Two multiplexers 1614 are connected to y0, y1, y2, and y3, from the output values of the 3-LUTs 1602. A first carry chain element 1616 is connected to the multiplexers 1614 as well as cin, as a carry-chain input, and cmid, as a carry-chain output. Similarly two multiplexers 1615 are connected to y4, y5, y6, and y7, also from the output values of the 3-LUTs 1602. A second carry chain element 1617 is connected to the multiplexers 1615 as well as cmid, as a carry-chain input, and cout, as a carry-chain output.
In a conventional logic mode of operation with c=c1=c2 and d=d1=d2, the output multiplexer 1608 produces z1(a,b,c,d,e,f), a logical function of the six control inputs a, b, c, d, e, f.
In a second logic mode of operation, the first intermediate multiplexer 1607 produces z0(a,b,c1,d1,e) and the third intermediate multiplexer 1607 produces z2(a,b,c2,d2,f), each of which is a logical function of its arguments.
In an arithmetic mode of operation, the first intermediate multiplexer 1607 produces
sum0(a, b, c1, cin, e)=f0(a, b, c1, e)⊕f1(a, b, c1,e)⊕cin,
an arithmetic sum of two logical functions of the control inputs a, b, c1, and e, where cin is the carry-chain input and cmid is the corresponding carry-chain output. The third intermediate multiplexer 1607 produces
sum1(a, b, c2, cmid, f)=f2(a, b, c2, f)⊕f3(a, b, c2, f)⊕cmid,
an arithmetic sum of two logical functions of the control inputs a, b, c2, and f, where cmid is the carry-chain input and cout is the corresponding carry-chain output.
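The cascading of the two carry-chain stages through cmid may be sketched as follows. The Python model is a minimal illustration: each stage forms its sum as in the expressions above, and the intermediate and final carries use the conventional generate/propagate relation, which is an assumption here. The function names pg_stage and two_bit_arithmetic are hypothetical.

```python
from itertools import product

def pg_stage(x, y, carry_in):
    """One carry-chain stage: returns (sum bit, carry out); carry relation is assumed."""
    p, g = x ^ y, x & y
    return p ^ carry_in, g | (p & carry_in)

def two_bit_arithmetic(f0, f1, f2, f3, a, b, c1, c2, e, f, cin):
    """Two cascaded stages: cin -> cmid -> cout, producing sum0 and sum1."""
    sum0, cmid = pg_stage(f0(a, b, c1, e), f1(a, b, c1, e), cin)
    sum1, cout = pg_stage(f2(a, b, c2, f), f3(a, b, c2, f), cmid)
    return sum0, sum1, cout

# Example configuration: add the two-bit values (b, a) and (f, e).
pick_a = lambda a, b, c1, e: a
pick_e = lambda a, b, c1, e: e
pick_b = lambda a, b, c2, f: b
pick_f = lambda a, b, c2, f: f

def add_two_bits(a, b, e, f, cin):
    s0, s1, cout = two_bit_arithmetic(pick_a, pick_e, pick_b, pick_f,
                                      a, b, 0, 0, e, f, cin)
    return s0 + 2 * s1 + 4 * cout

adds_correctly = all(
    add_two_bits(a, b, e, f, cin) == (a + 2 * b) + (e + 2 * f) + cin
    for a, b, e, f, cin in product((0, 1), repeat=5)
)
```

With this configuration a single logic element performs two bits of addition per carry-chain step, which is the motivation given above for large LUTs.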
Analogously to the logic element 1300 of
In a conventional logic mode of operation with c=c1=c2 and d=d1=d2, the output multiplexer 1706 produces z1(a,b,c,d,e,f), a logical function of the six control inputs a, b, c, d, e, f.
In a second logic mode of operation, a first intermediate multiplexer 1708 produces z2(a,b,c1,d1,e) and a second intermediate multiplexer 1708 produces z4(a,b,c2,d2,f), each of which is a logical function of its arguments.
In an arithmetic mode of operation, the first dedicated adder 1707 produces
sum0(a,b, c1, d1, cin, e)=z0(a, b, c1, d1)⊕z1(a, b, c1, e)⊕cin,
an arithmetic sum of two logical functions, where cin is the carry-chain input and cmid is the corresponding carry-chain output. The second dedicated adder 1713 produces
sum1(a, b, c2, d2, cmid, f)=z5(a, b, c2, d2)⊕z6(a, b, c2, f)⊕cmid,
an arithmetic sum of two logical functions, where cmid is the carry-chain input and cout is the corresponding carry-chain output.
Analogously to the logic element 1400 of
In a conventional logic mode of operation with c=c1=c2 and d=d1=d2, the output multiplexer 1810 produces z3(a,b,c,d,e,f), a logical function of the six control inputs a, b, c, d, e, f.
In a second logic mode of operation, an intermediate multiplexer 1806 produces z2(a,b,c1,d1,e) and another intermediate multiplexer 1806 produces z4(a,b,c2,d2,f), each of which is a logical function of its arguments.
In an arithmetic mode of operation, another intermediate multiplexer 1806 produces
z1(a,b,c1,d1,e⊕cin),
an arithmetic function where cin is the carry-chain input and cmid is the corresponding carry-chain output. In this case the first pair 1814 of intermediate multiplexers operate as complementary multiplexers. Another intermediate multiplexer 1806 produces
z5(a,b,c2,d2,f⊕cmid),
an arithmetic function where cmid is the carry-chain input and cout is the corresponding carry-chain output. In this case the second pair 1816 of intermediate multiplexers operate as complementary multiplexers.
The embodiments shown above are applicable generally to data processing environments. For example,
The system 1900 can be used in a wide variety of applications, such as computer networking, data networking, instrumentation, video processing, digital signal processing, or any other application where the advantage of using programmable or reprogrammable logic is desirable. The PLD 1910 can be used to perform a variety of different logic functions. For example, the PLD 1910 can be configured as a processor or controller that works in cooperation with processor 1940 (or, in alternative embodiments, a PLD might itself act as the sole system processor). The PLD 1910 may also be used as an arbiter for arbitrating access to shared resources in the system 1900. In yet another example, the PLD 1910 can be configured as an interface between the processor 1940 and one of the other components in system 1900. It should be noted that system 1900 is only exemplary.
Although only certain exemplary embodiments of this invention have been described in detail above, those skilled in the art will readily appreciate that many modifications are possible in the exemplary embodiments without materially departing from the novel teachings and advantages of this invention. Accordingly, all such modifications are intended to be included within the scope of this invention.