Recent years have seen a rise in the use of programmable hardware to perform various computing tasks. Indeed, it is now common for many computing applications to make use of programmable arrays of blocks to perform various tasks. These programmable blocks of memory elements provide a useful alternative to application-specific integrated circuits having a more specialized or specific set of tasks. For example, field programmable gate arrays (FPGAs) provide programmable blocks that can be programmed individually and provide significant flexibility to perform various tasks.
As programmable hardware increases in size and complexity, it has become a challenge to implement configurations of hardware that provide fast and effective processing. For example, as logic functions are configured to accept an increasing number of input signals, conventional hardware units (e.g., logic modules) are often unable to produce outputs without causing some amount of delay. In many cases, these delays are unacceptable and potentially cause other logic modules to produce incorrect outputs. Furthermore, conventional approaches to fixing these delays are often complicated and difficult to implement on a given programmable hardware unit.
These and other problems exist with regard to implementing logic functions on programmable hardware units, such as FPGAs.
The present disclosure relates to features and functionality of a carry chain logic system that leverages carry in and carry out signals from logic blocks to implement AND/OR logic functions on programmable hardware (e.g., FPGA hardware). In particular, embodiments of the carry chain logic system described herein enable implementation of large (e.g., high number of inputs) AND/OR logic gates within the framework of additional logic functions without incurring a significant delay as a result of routing inputs between multiple series of logic modules (e.g., multiple logic levels).
As an illustrative example, one or more embodiments of the carry chain logic system may involve a method or series of acts being implemented on a carry chain of logic modules that are implemented on programmable hardware. In accordance with one or more embodiments described herein, the carry chain logic system may receive an input vector including a plurality of input bits. The carry chain logic system may further receive a bit vector, separate from the input vector, including a plurality of vector bits. The carry chain logic system may cause an input bit to be an input to a first adder in the carry chain of logic modules. In some embodiments, the bit vector includes a primer bit (e.g., a least significant bit (LSB) of the bit vector) that is used as a first or primer input to the first adder to begin the logic chain. The carry chain logic system may then provide each vector bit and an associated bit from the plurality of inputs as inputs to additional adders in the carry chain. In one or more embodiments described herein, carry out signals from the adders can be provided as carry in signals to subsequent adders in the carry chain to generate an output based on a carry out signal from a last adder in the carry chain.
As another example, the carry chain logic system may include a carry chain logic function that is implementable on programmable hardware. In accordance with one or more embodiments described herein, the carry chain logic system may include a first logic module having a first adder and a second adder, a second logic module having a third adder and a fourth adder, and any number of additional logic modules having additional adder components. In this example, the first logic module may be configured to receive, at the first adder, a first input and a primer bit of a bit vector to generate a carry out signal that is provided to the second adder. The first logic module may be further configured to receive, at the second adder, the carry out signal from the first adder in combination with a second input and an associated bit value from the bit vector to generate another carry out signal. The carry out signal(s) may be fed to additional adders on other logic modules in conjunction with additional inputs and additional bit values of bit vectors to implement a logic function in accordance with one or more embodiments described herein.
The present disclosure includes a number of practical applications that provide benefits and/or solve problems associated with configuring logic functions on programmable hardware. Examples of some of these benefits are discussed in further detail below.
For example, features of the carry chain logic system enable large input logic gates without incurring propagation delays as a result of multiple logic levels being coupled together in series on programmable hardware. For instance, conventional logic module configurations often involve logic modules that receive inputs and feed outputs from different logic levels in a typical hardware configuration. These additional logic levels result in longer propagation delays that require additional pipelining or a lower clock frequency.
The carry chain logic system additionally provides features that enable logic modules to be combined on the same logic level to create AND/OR logic of a higher number of inputs than any individual logic module(s). For example, a logic module may include hardware that limits a specific logic module to receive and process six inputs. Notwithstanding this limitation, the carry chain logic system utilizes carry in and carry out signals to increase the number of inputs that can be considered by a given logic gate without routing input and output signals between logic levels.
As a more specific example, where a lookup table (LUT) in a conventional logic module may only support a fixed number of inputs (e.g., six inputs), any operation such as an AND/OR reduce over more than six inputs may require multiple logic levels if implemented in the LUTs alone. As a result, implementing an operation that requires more than the fixed number of inputs that an LUT is preconfigured to receive may involve routing inputs between logic levels and incurring delays caused as a result of routing inputs via routing fabric of the programmable logic.
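The growth in logic levels described above can be illustrated with a short sketch. It assumes a simple balanced tree of fixed-width LUTs, which is only one way a reduction might be mapped; the function name `lut_levels` and its parameters are hypothetical and chosen for illustration.

```python
def lut_levels(num_inputs: int, lut_width: int = 6) -> int:
    """Count the LUT levels needed to reduce num_inputs signals to one.

    Assumes a balanced tree in which every LUT accepts up to lut_width
    inputs (an illustrative model, not a specific device architecture).
    """
    if num_inputs < 1 or lut_width < 2:
        raise ValueError("need at least one input and a LUT width of 2 or more")
    levels = 0
    remaining = num_inputs
    while remaining > 1:
        # Each level groups the remaining signals into LUTs of lut_width inputs.
        remaining = -(-remaining // lut_width)  # ceiling division
        levels += 1
    return levels


if __name__ == "__main__":
    for n in (6, 7, 24, 36, 37):
        print(n, "inputs ->", lut_levels(n), "LUT level(s)")
```

Under this model, six inputs fit in one level, while seven or more inputs already require a second level and the routing between levels that comes with it.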
In one or more embodiments, the carry chain logic system improves upon this limitation of conventional systems by utilizing the carry in and carry out signals as inputs to additional logic functions (e.g., on a same logic level as the logic modules associated with the respective carry in and carry out functions). Indeed, as will be discussed in further detail herein, the carry chain logic system may decrease a reliance on input and output signals being routed via routing fabric of the programmable logic hardware. As routing inputs between logic levels can take hundreds of picoseconds or even nanoseconds, this delay can cause logic functions to fail or require complicated pipeline configurations and/or lower clock frequencies. Conversely, by utilizing the carry out signals as inputs (e.g., carry in signals) to logic modules on the same logic level, this delay can be reduced significantly, to only a few (e.g., 10) picoseconds.
The features described herein can be implemented in a number of ways to provide additional benefits and optimizations within programmable hardware systems. For example, in one or more embodiments described herein, the carry chain logic system may treat an input to be reduced as a bit vector, which can be implemented within an OR reduce function by adding the input bit vector to a bit vector of all 1s. Similarly, by adding the input bit vector to a bit vector of 0s with a primer input of 1 in a least significant bit (LSB), a 1 may not be seen as the carry out signal unless all bits on the input are 1, which may provide an AND reduce operation. As carry chains produce faster signals than LUTs, this enables performance of reductions on a larger number of inputs using carry chains rather than LUTs. This example and additional variations will be discussed below in connection with
As illustrated in the foregoing discussion, the present disclosure utilizes a variety of terms to describe features and advantages of the carry chain logic system. Additional detail will now be provided regarding the meaning of some of these terms.
As used herein, a “logic module” may refer to any discrete component of hardware capable of receiving a plurality of inputs and producing an output based on a logic function implemented thereon. In one or more embodiments, a logic module may include components such as LUTs, adders, registers, multiplexors, and routing components. In one or more embodiments described herein, a logic module refers to a hardware component, available from any of a number of manufacturers, that allows configuration of an N-input (e.g., 4 input, 6 input, 8 input) logic function and that can be implemented in combination with additional logic modules on a programmable hardware device. In one or more embodiments, a programmable hardware device (e.g., an FPGA device) has hundreds, thousands, or millions of logic modules implemented thereon, the logic modules being programmable to implement a wide variety of functions.
As used herein, a “logic level” may refer to a grouping of one or more logic modules separated by another grouping of one or more logic modules via a routing fabric between the modules. For example, in one or more embodiments described herein, a logic level may refer to a carry chain of logic modules having carry signals fed as carry signal inputs between the logic modules. Logic modules of the logic level may provide output signals to additional logic layers directly or via registers capable of capturing data from one clock cycle to the next. In one or more embodiments described herein, a logic chain of logic modules may refer to logic modules within a common logic level. Conversely, a series of logic modules or logic module series may refer to logic modules that feed signals to one another across different logic levels (e.g., via a routing fabric).
As used herein, a bit vector may be a pre-determined vector or set of bit values that may be used as an input to one or more logic modules. The bit vector is composed of multiple vector bits. The bit values of the vector bits may be based on the type of logic function the carry chain logic system is performing. For example, and as will be discussed in further detail herein, a carry chain logic system that is configured to perform an OR reduction may include a bit vector having vector bits with bit values of one. In some examples, a carry chain logic system that is configured to perform an AND reduction may include a bit vector having vector bits with bit values of zero, and a first vector bit with a bit value of one.
In one or more implementations described herein, a first vector bit is referred to as a primer bit. As used herein, a primer bit may be an initial bit that may be added to the first logic module of the carry chain logic system. The primer bit may have a bit value that is based on the configuration of the carry chain logic system. For example, for an AND or an OR reduction, the primer bit may have a bit value of one to allow the first logic module to perform the AND or the OR reduction. In some embodiments, the primer bit may be included as a vector bit in the bit vector.
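The bit vector and primer bit definitions above can be checked with ordinary integer addition. The following is a minimal sketch, assuming the input is treated as an unsigned integer of a known width and that the carry out is the bit just above that width; the function names are hypothetical.

```python
def or_reduce_by_add(value: int, width: int) -> int:
    """OR-reduce: add a bit vector of all 1s; the carry out is 1 iff value != 0."""
    all_ones = (1 << width) - 1
    if not 0 <= value <= all_ones:
        raise ValueError("value does not fit in the given width")
    total = value + all_ones
    return (total >> width) & 1  # carry out of the top adder


def and_reduce_by_add(value: int, width: int) -> int:
    """AND-reduce: add zeros with a primer 1 in the LSB; carry out is 1 iff all bits are 1."""
    all_ones = (1 << width) - 1
    if not 0 <= value <= all_ones:
        raise ValueError("value does not fit in the given width")
    total = value + 1  # bit vector 0...01 (primer bit in the LSB)
    return (total >> width) & 1


if __name__ == "__main__":
    print(or_reduce_by_add(0b0000, 4), or_reduce_by_add(0b0100, 4))
    print(and_reduce_by_add(0b1110, 4), and_reduce_by_add(0b1111, 4))
```

The addition overflows into the carry out position only when at least one input bit is set (OR case) or when every input bit is set (AND case), which is the behavior the bit vector values are chosen to produce.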
As used herein, an input vector may include a plurality of input bits for the carry chain logic system. Each logic module may receive an input bit from the input vector. As discussed in further detail herein, the input bits of the input vector may be received from other functions in the programmable hardware. For example, the input bits may be outputs from a logic function on a different logic level than the carry chain logic system. In some examples, the input bits may be outputs from other logic functions, combinations, subtractions, or any other function performed on the programmable hardware. In some embodiments, the input vector has a quantity of input bits. The number of logic modules in the carry chain logic system may be equal to the quantity of input bits.
Additional detail will now be discussed regarding a carry chain logic system in relation to illustrative figures portraying example embodiments. For example,
As shown in
As an example, and as shown in
As further shown, the logic modules 106 may receive and provide carry signals 108 between logic modules 106 of the logic chain 104. For example, each logic module 106 may receive a carry in signal and provide a carry out signal. As shown in
As shown in
As discussed herein, the logic chain 104 may be of any length, with the carry signal 108 being propagated through any number of logic modules 106. For example, the logic chain 104 shown in
The second logic module 106-2 may receive as inputs the second carry signal 108-2 (e.g., the carry in signal) and a second input 110-2. The second input 110-2 may include a second vector bit from the bit vector and a second input bit from the input vector. Thus, the second logic module 106-2 may receive three inputs. The second logic module 106-2 may generate a second output 112-2 and a further carry signal 108. The logic chain 104 may be extended to any length, with any number n of logic modules 106-n each receiving a carry signal 108-n and an input 110-n and generating an output 112-n. This may allow long logic chains 104 to process large numbers of inputs on the same logic level with reduced processing delays.
As shown in
As shown in
In the embodiment shown, the logic module 206 may receive six LUT inputs 215 at the LUT 216. In the first logic portion 218-1, the LUT 216 may output a first input 224-1 to the first adder 220-1 and provide the first adder 220-1 with a first vector bit 226-1. The first adder 220-1 may receive a first carry in bit 222-1. The first adder 220-1 may process the first carry in bit 222-1, the first input 224-1, and the first vector bit 226-1 and generate a first output 212-1 and a first carry out bit 228-1.
In the second logic portion 218-2, the LUT 216 may generate a second input 224-2. The LUT 216 may provide the second adder 220-2 with the second input 224-2 and a second vector bit 226-2. The second adder 220-2 may receive the first carry out bit 228-1 as a second carry in bit 222-2. The second adder 220-2 may process the second carry in bit 222-2, the second input 224-2, and the second vector bit 226-2 to generate a second output 212-2 and a second carry out bit 228-2. The second carry out bit 228-2 may be used at another logic module 206 on the same logic level as a carry in bit 222.
In one or more embodiments described herein, the logic module 206 may be configured as an AND gate or an OR gate and be configured to add inputs from the respective logic portions of the logic module.
As shown in
As will be discussed in connection with multiple examples below, the logic module 206 may be combined with one or more additional logic modules 206 to create a logic gate having a greater number of inputs than are available for a single logic module 206. For example, the logic module 206 may be combined with a plurality of additional logic modules 206 to provide a large AND gate having a significantly higher number of inputs than would be available in conventional configurations (e.g., thirty, fifty, or hundreds of inputs).
In one or more embodiments described herein, the logic module 206 may implement the larger logic gate by implementing a logic chain having a plurality of logic modules 206 that are configured to receive a bit vector, an input vector, and carry signal(s). In particular, the bit vector may include a vector of 1s and/or 0s in addition to a primer bit fed to a first logic module 206 of the plurality of logic modules 206 in the logic chain. The bit vector may be fed as inputs to the logic modules 206 of the logic chain together with an input vector of input bits to which the logic gate is to be applied.
In addition to the bit vector and input vector being provided as inputs, the logic modules 206 may use the carry inputs fed to the adders to create a ripple adder in which each adder produces a two-bit value having a most significant bit (MSB) and a least significant bit (LSB) based on a combination of a relevant bit vector value, input value, and carry signal provided to the adder. In one or more embodiments described herein, the MSB becomes the carry out bit for an adder or logic module while the LSB becomes the output for the respective adder or logic module.
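The MSB/LSB behavior of a single adder can be sketched as a one-bit full adder. This is a minimal illustration, assuming each adder sums exactly three one-bit values (input bit, vector bit, carry in); the function name `full_adder` is hypothetical.

```python
def full_adder(input_bit: int, vector_bit: int, carry_in: int) -> tuple[int, int]:
    """Return (carry_out, output) for a one-bit full adder.

    The three one-bit operands sum to a two-bit value (0..3): the MSB is the
    carry out and the LSB is the adder's output.
    """
    total = input_bit + vector_bit + carry_in
    return total >> 1, total & 1


if __name__ == "__main__":
    # With the vector bit fixed, the carry out acts as a two-input gate on
    # the input bit and the carry in.
    for a in (0, 1):
        for c in (0, 1):
            or_like, _ = full_adder(a, 1, c)   # vector bit 1: carry = a OR c
            and_like, _ = full_adder(a, 0, c)  # vector bit 0: carry = a AND c
            print(f"a={a} cin={c} -> vector 1: {or_like}, vector 0: {and_like}")
```

Fixing the vector bit at 1 makes the carry out the OR of the input bit and the carry in, while fixing it at 0 makes the carry out their AND, which is why the bit vector values select the logic gate that the chain provides.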
It will be appreciated that the MSB and/or LSB may be used differently depending on the specific logic gate the logic modules are programmed to provide. In addition, the specific values of the bit vector, including the set of 0s and/or 1s as well as the primer bit may be determined to be specific values based on the logic gate that the logic modules have been programmed to provide.
Additional detail will now be discussed in connection with a number of example implementations. For example,
A third adder 320-3 may receive as inputs the second carry signal C2, a third vector bit b3 of the bit vector 332, and a third input bit a3 from the input vector 334. The third adder 320-3 may generate a third carry signal C3. A fourth adder 320-4 may receive as inputs the third carry signal C3, a fourth vector bit b4 of the bit vector 332, and a fourth input bit of the input vector 334. The fourth adder 320-4 may generate a fourth carry signal C4. The third adder 320-3 and the fourth adder 320-4 may be part of a second logic module. Thus, the implementation shown in
As shown in
As shown in
As shown in
As mentioned above, the specific values of the bit vectors 332 may be determined based on the specific logic gate to be implemented by the logic modules of the logic chain 304. For example, in the illustrated configuration, the bit vector 332 may include a first primer input (e.g., a “1” bit value) provided to the first adder in combination with additional “1” inputs provided to the additional adders of the OR-reduce logic chain.
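A behavioral sketch of such an OR-reduce chain follows. It assumes the first adder receives the primer bit together with the first input bit and an initial carry in of 0, which is one reading of the configuration described above; the names `carry_chain` and `or_reduce_chain` are hypothetical.

```python
from typing import Sequence


def carry_chain(input_bits: Sequence[int], vector_bits: Sequence[int]) -> int:
    """Ripple a carry through one adder per (input bit, vector bit) pair.

    Assumes the carry in to the first adder is 0 (an assumption made for
    this sketch). Returns the carry out of the last adder.
    """
    if len(input_bits) != len(vector_bits):
        raise ValueError("input vector and bit vector must have the same length")
    carry = 0
    for a, b in zip(input_bits, vector_bits):
        carry = (a + b + carry) >> 1  # MSB of the two-bit sum is the carry out
    return carry


def or_reduce_chain(input_bits: Sequence[int]) -> int:
    """OR-reduce: primer bit of 1 followed by vector bits of 1."""
    return carry_chain(input_bits, [1] * len(input_bits))


if __name__ == "__main__":
    print(or_reduce_chain([0, 0, 0, 0]))  # no input set
    print(or_reduce_chain([0, 0, 1, 0]))  # one input set
```

Once any adder sees an input bit of 1, its carry out becomes 1 and every later adder (vector bit 1, carry in 1) keeps it at 1, so the last carry out equals the OR of all input bits.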
In the example shown in
As mentioned above,
The input bit may include the output of an LUT reduce function 438. The LUT reduce function 438 may receive multiple bits and reduce the multiple inputs to a single input for the adder 420. For example, the LUT reduce function 438 may reduce six bits from a bit vector at a time to generate the various inputs a1, a2, a3, a4. In this manner, rather than each input from the bit vector being an individual input to an adder 420, the LUT reduce function 438 may perform an initial reduction of the input bits, thereby reducing the total number of logic modules that may be used when performing a given operation. The output of the adders 420 may then be used in a reduce output 436.
In the example shown in
With reference to the specific example shown in
A second LUT OR reduce may reduce inputs A7 through A12 to generate the second input bit a2. A second adder 420-2 may receive the second input bit a2, a bit vector having a value of 1, and the first carry bit C1. The output of the second adder 420-2 may include a second carry bit C2.
A third LUT OR reduce may reduce inputs A13 through A18 to generate the third input bit a3. A third adder 420-3 may receive the third input bit a3, a bit vector having a value of 1, and the second carry bit C2. The output of the third adder 420-3 may include a third carry bit C3.
A fourth LUT OR reduce may reduce inputs A19 through A24 to generate the fourth input bit a4. A fourth adder 420-4 may receive the fourth input bit a4, a bit vector having a value of 1, and the third carry bit C3. The output of the fourth adder 420-4 may include a fourth carry bit C4.
An OR reduce may receive the fourth carry bit C4 and perform an OR reduce. In this manner, the carry chain 404 may perform an OR reduce on a 24 bit input vector in a single logic level, thereby reducing the total number of logic modules used to perform the OR reduce and increasing the speed of the OR reduce.
With reference to the specific example shown in
A second LUT AND reduce may reduce inputs A7 through A12 to generate the second input bit a2. A second adder 420-2 may receive the second input bit a2, a bit vector having a value of 0, and the first carry bit C1. The output of the second adder 420-2 may include a second carry bit C2.
A third LUT AND reduce may reduce inputs A13 through A18 to generate the third input bit a3. A third adder 420-3 may receive the third input bit a3, a bit vector having a value of 0, and the second carry bit C2. The output of the third adder 420-3 may include a third carry bit C3.
A fourth LUT AND reduce may reduce inputs A19 through A24 to generate the fourth input bit a4. A fourth adder 420-4 may receive the fourth input bit a4, a bit vector having a value of 0, and the third carry bit C3. The output of the fourth adder 420-4 may include a fourth carry bit C4.
An AND reduce may receive the fourth carry bit C4 and perform an AND reduce. In this manner, the carry chain 404 may perform an AND reduce on a 24 bit input vector in a single logic level, thereby reducing the total number of logic modules used to perform the AND reduce and increasing the speed of the AND reduce.
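The two 24-input examples above can be modeled together in a short behavioral sketch. It assumes six-input LUT groups, a primer bit of 1 at the first adder with an initial carry in of 0, and vector bits of 1 (OR) or 0 (AND) at the remaining adders; the function name `lut_carry_reduce` and its `mode` parameter are hypothetical.

```python
from typing import Sequence


def lut_carry_reduce(inputs: Sequence[int], mode: str, lut_width: int = 6) -> int:
    """Reduce inputs with per-group LUT reduces followed by a carry chain.

    mode is "or" or "and". Each group of lut_width inputs is reduced by a
    LUT to one input bit (a1, a2, ...), and one adder per group ripples the
    carry. Returns the carry out of the last adder.
    """
    if mode not in ("or", "and"):
        raise ValueError("mode must be 'or' or 'and'")
    if not inputs:
        raise ValueError("at least one input is required")
    lut = any if mode == "or" else all
    groups = [inputs[i:i + lut_width] for i in range(0, len(inputs), lut_width)]
    input_bits = [int(lut(group)) for group in groups]
    # Primer bit of 1 at the first adder, then 1s for OR or 0s for AND.
    fill = 1 if mode == "or" else 0
    vector_bits = [1] + [fill] * (len(input_bits) - 1)
    carry = 0  # assumed initial carry in
    for a, b in zip(input_bits, vector_bits):
        carry = (a + b + carry) >> 1
    return carry


if __name__ == "__main__":
    one_hot = [0] * 24
    one_hot[17] = 1
    print(lut_carry_reduce(one_hot, "or"), lut_carry_reduce(one_hot, "and"))
    print(lut_carry_reduce([1] * 24, "and"))
```

With 24 inputs this uses four LUT reduces and four adders, matching the a1 through a4 and C1 through C4 signals in the examples, and the final carry C4 carries the reduction result.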
An example implementation of this configuration could include a counter configured to increment or decrement based on some detected condition. In this example, a registered function could be configured and driven as an input to an adder (e.g., on the chain of adders). Where this would conventionally be performed by providing an output from a logic module of another logic layer, this framework enables feeding the carry signal as an input to the logic chain within the same logic layer and significantly reduces latency that would otherwise be caused by routing the signal via a routing fabric between logic layers. As shown in
For example, a first logic function (e.g., a subtractor function) can supply a carry out indicating the result of a compare operation. This may be provided in conjunction with a different set of combinational signals that are used to provide a carry-in value to another counter. Thus, this framework may be utilized to implement an entire logic of multiple logic functions using the carry chain configuration described herein.
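As a hedged illustration of a subtractor's carry out serving as a compare result that drives a counter, the sketch below assumes unsigned operands and a subtractor built as a + (NOT b) + 1, a common two's-complement construction that the text does not itself specify; the function names are hypothetical.

```python
def subtract_carry_out(a: int, b: int, width: int) -> int:
    """Carry out of a - b computed as a + (~b) + 1 on width-bit unsigned values.

    The carry out is 1 exactly when a >= b, so it doubles as a compare result.
    """
    mask = (1 << width) - 1
    total = (a & mask) + (~b & mask) + 1
    return (total >> width) & 1


def conditional_increment(counter: int, a: int, b: int, width: int) -> int:
    """Feed the compare carry out into a counter as its carry in.

    The counter increments (wrapping at width bits) only when a >= b.
    """
    mask = (1 << width) - 1
    carry_in = subtract_carry_out(a, b, width)
    return (counter + carry_in) & mask


if __name__ == "__main__":
    print(conditional_increment(5, a=9, b=3, width=8))  # a >= b: increments
    print(conditional_increment(5, a=2, b=3, width=8))  # a < b: unchanged
```

In this model the compare result never leaves the carry path: it is produced as a carry out and consumed as a carry in, which mirrors the idea of chaining logic functions without routing through the fabric.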
As shown in
As a general example, the illustrated example shows a first logic function (e.g., a subtraction function). The result of the logic function is fed to an abort stage (e.g., a second logic function), which is fed to an adder function (e.g., a third logic function). This implementation enables a logic chain to be implemented as a more complex function that would normally involve a multiplexor (MUX) or other function that involves multiple logic stages that have the potential of causing an unacceptable amount of delay. Indeed, by using the carry out signal to drive additional logic functions, the configurations described herein enable a logic chain to consider multiple conditions within a single logic stage.
From a more general view, similar principles of the above examples may be implemented to combine multiple logic functions before or after a reduce logic function (e.g., AND-reduce, OR-reduce), such as before or after a subtraction function as shown in
Example implementations may include incrementing a queue, taking a comparison and using it as a logic function without a routing penalty, updating a read pointer, an abort function, or any other combinational logic.
The first adder generates a first carry out bit based on the primer bit, the first vector bit, and the first input bit at 748. The logic module provides each vector bit from the plurality of vector bits and an associated input bit from the plurality of input bits as inputs to additional adders in the carry chain at 750. The programmable hardware provides a carry out bit from each adder to a next adder in the carry chain to generate an output based on a last carry out bit from a last adder in the carry chain at 752.
In some embodiments, the input vector includes input values based on outputs of combinational logic from additional logic modules implemented on programmable hardware. In some embodiments, additional logic modules are implemented on the same logic level as the carry chain of logic modules. In some embodiments, the input values include carry out signals from adders of the additional logic modules.
In some embodiments, values of the bit vector are based on a configuration of the logic modules to act as a corresponding logic function. In some embodiments, the values of the bit vector include the primer bit and a set of one bit values based on the logic modules being configured to act as an OR-reduce logic function. In some embodiments, the values of the bit vector include the primer bit, a one bit value for the first vector bit, and a set of zero bit values based on the logic modules being configured to act as an AND-reduce logic function. In some embodiments, the primer bit is a one bit value.
The techniques described herein may be implemented in hardware, software, firmware, or any combination thereof, unless specifically described as being implemented in a specific manner. Any features described as modules, components, or the like may also be implemented together in an integrated logic device or separately as discrete but interoperable logic devices. If implemented in software, the techniques may be realized at least in part by a non-transitory processor-readable storage medium comprising instructions that, when executed by at least one processor, perform one or more of the methods described herein. The instructions may be organized into routines, programs, objects, components, data structures, etc., which may perform particular tasks and/or implement particular data types, and which may be combined or distributed as desired in various embodiments.
The steps and/or actions of the methods described herein may be interchanged with one another without departing from the scope of the claims. In other words, unless a specific order of steps or actions is required for proper operation of the method that is being described, the order and/or use of specific steps and/or actions may be modified without departing from the scope of the claims.
The term “determining” encompasses a wide variety of actions and, therefore, “determining” can include calculating, computing, processing, deriving, investigating, looking up (e.g., looking up in a table, a database or another data structure), ascertaining and the like. Also, “determining” can include receiving (e.g., receiving information), accessing (e.g., accessing data in a memory) and the like. Also, “determining” can include resolving, selecting, choosing, establishing and the like.
The terms “comprising,” “including,” and “having” are intended to be inclusive and mean that there may be additional elements other than the listed elements. Additionally, it should be understood that references to “one embodiment” or “an embodiment” of the present disclosure are not intended to be interpreted as excluding the existence of additional embodiments that also incorporate the recited features. For example, any element or feature described in relation to an embodiment herein may be combinable with any element or feature of any other embodiment described herein, where compatible.
The present disclosure may be embodied in other specific forms without departing from its spirit or characteristics. The described embodiments are to be considered as illustrative and not restrictive. The scope of the disclosure is, therefore, indicated by the appended claims rather than by the foregoing description. Changes that come within the meaning and range of equivalency of the claims are to be embraced within their scope.
This application claims the benefit of and priority to U.S. Provisional Application No. 63/295,323 filed Dec. 30, 2021, the entirety of which is incorporated herein by reference.
Number | Date | Country
---|---|---
63295323 | Dec 2021 | US