This invention relates generally to a system having a carry look-ahead adder.
Carry look-ahead (CLA) adders are used in many data processing systems. An n-bit CLA adder can add two n-bit operands and provide a sum of the two operands through the use of propagate and generate terms. The speed of adders within a data processing system can affect operation speed of the data processing system itself. Therefore, it is desirable to improve the speed of adders, such as CLA adders, in order to improve performance of the data processing system.
The present invention is illustrated by way of example and is not limited by the accompanying figures, in which like references indicate similar elements.
Skilled artisans appreciate that elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions of some of the elements in the figures may be exaggerated relative to other elements to help improve the understanding of the embodiments of the present invention.
An (n+1)-bit CLA adder provides a sum of two (n+1)-bit operands, a(0:n) and b(0:n), through the use of fast carry signals created by the carry look-ahead tree. The operation of conventional CLA adders is known in the art. The basic concept is the use of propagate and generate terms which contribute towards determining the carry signals. In the most common implementation, the propagate and generate terms are initially determined for each single-bit pair of input operands that are to be added. This determination of propagate and generate terms occurs in parallel for all the operand bit pairs. Additional stages of logic are used to subsequently take these single-bit propagate and generate terms to create multi-bit propagate and generate signals corresponding to multiple bit pairs of input operands. Again, this operation occurs in parallel. Hence, a carry look-ahead tree results in the creation of several propagate and generate signals, each of which represents a group containing a varying number of bit pairs of input operands. Each propagate and generate signal can be either asserted or deasserted. The significance of an asserted generate signal is that it represents the creation of a carry within that group. Similarly, an asserted propagate signal indicates that any carry entering the group will be allowed to propagate out of the group. It is thus seen that propagate and generate terms contribute towards determining the carry value creation and propagation along a carry tree which represents addition of two (n+1)-bit operands.
In systems using conventional CLA adders, each bit of operands a and b is stored in a corresponding latch, where these latched values of a and b are used in the CLA adder to create propagate and generate terms used in providing the final sum. However, in one embodiment of a system using a modified CLA adder as will be described herein, operands a and b are not individually latched. Instead, logic combinations of a and b, corresponding to a propagate term and a generate term, are latched within the modified CLA. That is, as will be described in more detail below, each bit of operands a and b is provided directly from combinational logic circuitry within the system, without being stored, as inputs to logic gates in a first stage of the modified CLA adder whose outputs are latched. These latched outputs correspond to a generate term, which, in one embodiment, is equivalent to the logical expression “ai·bi” and a propagate term, which, in one embodiment, is equivalent to the logical expression “ai+bi,” where i corresponds to a particular bit location within operands a and b. In a first stage of the modified CLA adder to be described herein, a propagate term and a generate term are generated for each of the n+1 bits of operands a(0:n) and b(0:n).
Note that in alternate embodiments, each of the generate terms and propagate terms can refer to any logical expression or combination of ai and bi. For example, in one alternate embodiment, the generate term may be equivalent to the logical expression “ai_bar·bi_bar” (where “bar” indicates the logical complement of the corresponding signal). Alternatively, other expressions may be used to define each of the generate and propagate terms. However, for ease of explanation herein, it will be assumed that the generate term corresponds to “ai·bi” and the propagate term to “ai+bi.”
As used herein, the term “bus” is used to refer to a plurality of signals or conductors which may be used to transfer one or more various types of information, such as data, addresses, control, or status. The conductors as discussed herein may be illustrated or described in reference to being a single conductor, a plurality of conductors, unidirectional conductors, or bidirectional conductors. However, different embodiments may vary the implementation of the conductors. For example, separate unidirectional conductors may be used rather than bidirectional conductors and vice versa. Also, a plurality of conductors may be replaced with a single conductor that transfers multiple signals serially or in a time multiplexed manner. Likewise, single conductors carrying multiple signals may be separated out into various different conductors carrying subsets of these signals. Therefore, many options exist for transferring signals.
The terms “assert” or “set” and “negate” (or “deassert” or “clear”) are used when referring to the rendering of a signal, status bit, or similar apparatus into its logically true or logically false state, respectively. If the logically true state is a logic level one, the logically false state is a logic level zero. And if the logically true state is a logic level zero, the logically false state is a logic level one.
Therefore, each signal described herein may be designed as positive or negative logic, where negative logic can be indicated by a bar over the signal name, the term “bar” following the signal name, or an asterisk (*) following the name. In the case of a negative logic signal, the signal is active low where the logically true state corresponds to a logic level zero. In the case of a positive logic signal, the signal is active high where the logically true state corresponds to a logic level one. Note that any of the signals described herein can be designed as either negative or positive logic signals. Therefore, in alternate embodiments, those signals described as positive logic signals may be implemented as negative logic signals, and those signals described as negative logic signals may be implemented as positive logic signals.
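The relationship between asserted or negated states and logic levels for positive and negative logic can be illustrated with a short sketch. The helper names is_asserted and level_for, and the active_low flag, are hypothetical and chosen only for this illustration.

```python
def is_asserted(level, active_low=False):
    """True when the logic level (0 or 1) is the signal's logically true state."""
    true_level = 0 if active_low else 1
    return level == true_level


def level_for(asserted, active_low=False):
    """Logic level (0 or 1) that renders the signal asserted or negated."""
    # Positive logic: asserted -> 1. Negative logic: asserted -> 0.
    return int(asserted) ^ int(active_low)
```

For example, asserting a negative logic signal such as g0_bar corresponds to driving a logic level zero, so level_for(True, active_low=True) evaluates to 0.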
Parentheses are used to indicate the conductors of a bus or the bit locations of a value. For example, “bus 60 (0:7)” or “conductors (0:7) of bus 60” indicates the eight lower order conductors of bus 60, and “address bits (0:7)” or “address (0:7)” indicates the eight lower order bits of an address value. Also, as used in the descriptions herein, note that bit location 0 corresponds to the least significant bit; however, in alternate embodiments, bit location 0 may correspond to the most significant bit.
Furthermore, note that other flip flops and combinational circuitry would be present in system 10 to provide each bit of operands a and b. That is, each of a1-an, and b1-bn, is also provided from other combinational logic circuitries within system 10 to CLA adder 20. Therefore, each bit of operands a and b is provided from combinational logic circuitries (i.e. from various cones of logic) to CLA adder 20. As with flip flops 12-13 and 16-17, these flip flops can be located anywhere within system 10, and may be located at distances far away from CLA adder 20. Also, note that the flip flops, such as flip flops 12-13 and 16-17, can be referred to as storage elements and can be implemented using different types of storing or latching elements.
Note that, as used herein, combinational logic refers to logic which does not include storage elements. For example, combinational logic 14 receives the latched outputs of flip flops 12-13 (X0_lat to XI_lat), and provides a0, but combinational logic 14 does not include storage elements and thus does not store any of the latched outputs of flip flops 12-13, a0, or any intermediate values which may be determined within combinational logic 14.
In one embodiment, combinational logic circuitry 14 may be an I+1 to 1 multiplexer which provides one of the latched outputs of flip flops 12-13 as operand a0. Therefore, note that combinational logic circuitry 14 may simply provide the value of one of X0_lat to XI_lat as operand a0 without modifying the value, through the use of combinational logic such as a multiplexer. Alternatively, combinational logic circuitry 14 may include any type of logic circuits and any number of logic gates which provide operand a0 based on a logic combination of the latched outputs of flip flops 12-13. The same examples apply to any of the combinational logic circuitry of system 10.
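As an illustration of combinational logic that holds no state, the following sketch models two of the possibilities mentioned for combinational logic circuitry 14: a multiplexer that passes one latched value through unmodified, and a logic combination of all latched values. The function names and the select argument are assumptions made for this example; the description does not specify how a multiplexer select would be driven.

```python
def mux(latched_inputs, select):
    """(I+1)-to-1 multiplexer: return one latched value without modifying it."""
    return latched_inputs[select]


def and_combination(latched_inputs):
    """One possible logic combination: an (I+1)-input AND of the latched values."""
    return int(all(latched_inputs))
```

Both functions depend only on their current inputs and keep no stored values, which is the property that distinguishes combinational logic from storage elements.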
CLA 20 receives operands a(0:n) and b(0:n), computes the arithmetic sum of a and b, and provides sum(0:n), where sum(0:n)=a(0:n)+b(0:n). CLA 20 also receives two clocks, C1_CLK and C2_CLK. Operation of CLA 20 will be described in more detail in reference to
Referring to
Latching element 27 includes NAND gate 22, a switch 26, and inverters 30, 32, and 34. (Note that inverter 28 may also be considered part of latching element 27.) An output of NAND gate 22 is connected to an input of switch 26 and an output of switch 26 is connected to an input of inverter 32 and an output of inverter 30. An output of inverter 32 is connected to an input of inverter 30. C1_CLK is provided as an input to an inverter 28 whose output is provided to a first control input of switch 26. Switch 26 also receives C1_CLK at a second control input. C1_CLK is also provided to an enable input of inverter 30. The output of switch 26 and inverter 30 is provided as generate term g0_bar and is provided to the input of an inverter 34 which provides as its output generate term g0. Therefore, g0 and g0_bar are provided by single bit carry tree stage 46 as the generate terms for single bit location 0. In the illustrated embodiment, g0 represents the logical value of a0·b0 (i.e. of “a0 AND b0”). In alternate embodiments, other logic gates may be used in place of NAND 22, and/or the output of inverter 34 may instead provide g0_bar.
Still referring to
Therefore, single bit carry tree stage 46 includes a total of n+1 latching elements for latching and providing generate bits g0, g0_bar through gn, gn_bar, respectively, (based on a logical combination of a0, b0 to an, bn, respectively), and a total of n+1 latching elements for latching and providing propagate bits p0, p0_bar through pn, pn_bar, respectively (based on a logical combination of a0, b0 to an, bn, respectively). Therefore, a total of 2n+2 latching elements are used within single bit carry tree stage 46, each latching element storing a generate or a propagate bit, each based on a logical combination of a particular bit location of operand a and the same bit location of operand b.
Furthermore, note that a NAND gate and a NOR gate are used in the illustrated embodiment of
In the illustrated embodiment of
The determination of latched generate terms g(0:n) and g_bar(0:n) and latched propagate terms p(0:n) and p_bar(0:n) occurs in parallel for all the operand bit pairs. This is referred to as the first stage of the carry tree. Additional stages of logic represented by the multiple bit carry tree stages 48 are used to subsequently take these latched single-bit propagate and generate terms to create multi-bit propagate and generate signals corresponding to multiple bit pairs of input operands. As an example, multiple bit carry tree stages 48 includes the second stage of the carry tree which is directly connected to a plurality of latched single-bit generate and propagate terms. This second stage can be used for determining propagate and generate terms corresponding to multiple bit groupings of operand a and operand b. For example, the multiple bit grouping could represent 3 bits of operand a and 3 bits of operand b. The determination of multi-bit propagate and generate terms would then occur in parallel such that a plurality of 3-bit propagate and 3-bit generate terms would be computed. As is known in the art, additional stages of logic in 48 are used to create propagate and generate terms representing even larger numbers of operand bit pairs. The number of logic stages in 48 depends on the number of bits (n+1) in the adder and on the details of the sum stage 52. The implementation shown in
Referring now to partial sum logic 50, the XOR(0:n) outputs represent true values of partial sums of individual bit pairs a0+b0 to an+bn, and the XOR_bar(0:n) outputs represent complementary values of partial sums of individual bit pairs a0+b0 to an+bn. The values of XOR(0:n) and XOR_bar(0:n) are directly computed from latched bit-wise propagate and generate inputs, such as p(0:n), p_bar(0:n), g(0:n), and g_bar(0:n). The creation of latched bit-wise propagate and generate inputs, such as p(0:n), p_bar(0:n), g(0:n), and g_bar(0:n), may provide a benefit over the prior art because this approach may eliminate time delay resulting from explicitly latching operand a and operand b prior to computing the bit-wise propagate and generate terms.
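One way to see that the partial sums can be derived from the latched terms alone, without access to the operand bits, is the following sketch. It assumes generate = a AND b and propagate = a OR b, and uses the Boolean identity a XOR b = p AND g_bar. The function name partial_sum is hypothetical, and the gate-level structure of partial sum logic 50 is not specified here, so this shows one possible derivation rather than the actual circuit.

```python
def partial_sum(p, p_bar, g, g_bar):
    """Partial sum (XOR) and its complement from latched p/g terms for one bit."""
    # g is not used for xor and p is not used for xor_bar; all four latched
    # terms are accepted to mirror the inputs named in the description.
    xor = p & g_bar        # exactly one operand bit is 1
    xor_bar = p_bar | g    # both operand bits are 0, or both are 1
    return xor, xor_bar
```

Because each output uses only one level of logic on already-latched true and complement terms, no inversion of p or g is needed at this point in the model.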
Still referring to
Similarly, the output of logic gate 24 is stored by inverters 42 and 40 (where inverters 42 and 40 may be referred to as clocked latching circuitry). That is, when C1_CLK is high, switch 36 (which, in the illustrated embodiment is represented by a transmission gate, but may alternatively be formed differently, such as by using a single transistor) provides the output of logic gate 24 to the input of inverter 40. However, while C1_CLK is high, note that inverter 42 remains disabled, so as to prevent contention at storage node 39. When C1_CLK goes low, switch 36 is disabled (becomes open) and inverter 42 is enabled such that the value from logic gate 24 is now stored by inverters 42 and 40 and available at storage node 39 (also referred to as latch node 39). Therefore, p0, which is at the output of inverter 44, corresponds to “a0+b0”.
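The latching behavior described above, in which the storage node follows the gate output while C1_CLK is high and holds its value once C1_CLK goes low, can be captured in a small behavioral model. The class names LevelLatch and PGBitSlice are hypothetical, the initial node value of 0 is an arbitrary assumption, and logic gate 24 is assumed to be a NOR gate. The model ignores propagation delays and contention entirely.

```python
class LevelLatch:
    """Level-sensitive latch: transparent while clk is high, holding while low."""

    def __init__(self):
        self.node = 0  # storage node; initial value is an arbitrary assumption

    def step(self, d, clk):
        if clk:
            # Switch closed, feedback inverter disabled: node follows the input.
            self.node = d
        # clk low: switch open, feedback inverter enabled: node keeps its value.
        return self.node


class PGBitSlice:
    """One bit position of the first stage: latches g_bar and p_bar."""

    def __init__(self):
        self.g_latch = LevelLatch()
        self.p_latch = LevelLatch()

    def step(self, a, b, c1_clk):
        g_bar = self.g_latch.step(1 - (a & b), c1_clk)  # NAND output
        p_bar = self.p_latch.step(1 - (a | b), c1_clk)  # NOR output (assumed)
        return {"g": 1 - g_bar, "g_bar": g_bar, "p": 1 - p_bar, "p_bar": p_bar}
```

Calling step with c1_clk set to 1 and then to 0 shows the values captured during the high phase persisting after the falling edge, even if a and b change.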
In a conventional CLA adder, each latch in the single bit carry tree stage stores a(0:n) and b(0:n). In this conventional case, inverters are used in place of logic gates 22 and 24, where each inverter receives a particular bit of operand a or b, and the outputs of inverters 34 and 44 would then provide the latched values of the particular bit of operand a or b. The latched outputs in the conventional CLA adder would then be combined to create propagate and generate terms. However, as will be discussed in reference to
Note that the length of time between X0_lat being valid and a0 being valid is based on the propagation delay of the slowest latched output of flip flops 12-13 through combinational logic circuitry 14. That is, each of values X0_lat through XI_lat needs to be valid and propagated through combinational logic circuitry 14 to provide the 0th bit of operand a, i.e. a0. For example, if combinational logic circuitry 14 were an I+1 input AND logic gate, then the slowest input to the AND logic gate would determine when a0 becomes valid. Therefore, the time at which a0 becomes valid may not depend directly on X0_lat, but could depend on another latched output of flip flops 12-13.
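The dependence of the a0 valid time on the slowest input can be expressed as a simple worst-case calculation. The sketch below is a simplified timing model with hypothetical names and arbitrary time units: each latched input has a time at which it becomes valid and a propagation delay through the combinational logic, and the output is treated as valid once the last input has arrived and propagated.

```python
def output_valid_time(input_valid_times, path_delays):
    """Time at which the combinational output becomes valid (worst-case input)."""
    return max(t + d for t, d in zip(input_valid_times, path_delays))


def slowest_input(input_valid_times, path_delays):
    """Index of the input that determines when the output becomes valid."""
    arrivals = [t + d for t, d in zip(input_valid_times, path_delays)]
    return arrivals.index(max(arrivals))
```

With input valid times of 10, 30, and 20 units and a uniform delay of 5 units, the output becomes valid at 35 units and is determined by the second input rather than by the first.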
When a0 is valid, the outputs of logic gates 22 and 24 become valid. This occurs at some time 76 prior to falling edge 72 of C1_CLK and thus, the outputs of logic gates 22 and 24 (corresponding to g and p terms, respectively) can be latched by inverters 32 and 30 and inverters 42 and 40, respectively, at falling edge 72 of C1_CLK (at which point switches 26 and 36 are disabled and storage nodes 29 and 39 now provide the latched values of g and p). Therefore, at some short time after a0 becomes valid (equivalent to the propagation delay through logic gates 22 and 24), the outputs of logic gates 22 and 24 become valid, as illustrated by arrow 74. The values of p and g (such as, for example, g0, g0_bar, p0, and p0_bar) then remain valid for a full phase of C2_CLK (i.e. phase 66 of C2_CLK). With the values of p and g being valid, the output sum becomes valid at some point after rising edge 68, where the timing of sum being valid is based on the propagation delay through multiple bit carry tree stages 48, XOR and XOR_bar creation 50, and sum stage 52 (which are all dynamic logic) starting from the time at which p and g are latched, such as by latching elements 27 and 37.
Note that, in the illustrated embodiment, a0 and p and g all become valid within a same phase 64 of C2_CLK (and also of C1_CLK). In this manner, the values of p and g are available at the falling edge 72 of C1_CLK for use by multiple bit carry tree stages 48 and XOR and XOR_bar creation 50. Note that in conventional CLA adders in which the operands a and b are latched, the latched values of a and b would be valid at a time later than the time at which operand a0 is valid in
By now it should be appreciated that there has been provided an improved CLA adder in which logical combinations of a and b are stored in preparation for addition rather than operands a and b themselves. That is, the outputs of the combinational logic circuitry (such as circuitry 14 and 18) provide operands (such as a0-an and b0-bn) that are to be added by a CLA adder, but these outputs of the combinational logic circuitry are not latched prior to the CLA adder performing the addition of the two operands. Instead, logic combinations, such as those performed by logic gates 22 and 24, of particular bit locations of operands a and b are latched or stored in order to possibly provide the final sum faster than was previously possible with conventional CLA adders.
Because the apparatus implementing the present invention is, for the most part, composed of electronic components and circuits known to those skilled in the art, circuit details will not be explained to any greater extent than that considered necessary as illustrated above, for the understanding and appreciation of the underlying concepts of the present invention and in order not to obfuscate or distract from the teachings of the present invention.
It should also be understood that all circuitry described herein may be implemented either in silicon or another semiconductor material or alternatively by software code representation of silicon or another semiconductor material.
Although the invention has been described with respect to specific conductivity types or polarity of potentials, skilled artisans appreciate that conductivity types and polarities of potentials may be reversed.
In one embodiment, system 10 is a portion of a computer system such as a personal computer system. Other embodiments may include different types of computer systems. Computer systems are information handling systems which can be designed to give independent computing power to one or more users. Computer systems may be found in many forms including but not limited to mainframes, minicomputers, servers, workstations, personal computers, notepads, personal digital assistants, electronic games, automotive and other embedded systems, cell phones and various other wireless devices. A typical computer system includes at least one processing unit, associated memory and a number of input/output (I/O) devices.
In the foregoing specification, the invention has been described with reference to specific embodiments. However, one of ordinary skill in the art appreciates that various modifications and changes can be made without departing from the scope of the present invention as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the present invention.
Benefits, other advantages, and solutions to problems have been described above with regard to specific embodiments. However, the benefits, advantages, solutions to problems, and any element(s) that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as a critical, required, or essential feature or element of any or all the claims. As used herein, the terms “comprises,” “comprising,” or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The term “plurality”, as used herein, is defined as two or more than two. The term “another”, as used herein, is defined as at least a second or more.
The term “coupled”, as used herein, is defined as connected, although not necessarily directly, and not necessarily mechanically.
Because the above detailed description is exemplary, when “one embodiment” is described, it is an exemplary embodiment. Accordingly, the use of the word “one” in this context is not intended to indicate that one and only one embodiment may have a described feature. Rather, many other embodiments may, and often do, have the described feature of the exemplary “one embodiment.” Thus, as used above, when the invention is described in the context of one embodiment, that one embodiment is one of many possible embodiments of the invention.
Notwithstanding the above caveat regarding the use of the words “one embodiment” in the detailed description, it will be understood by those within the art that if a specific number of an introduced claim element is intended in the below claims, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such limitation is present or intended. For example, in the claims below, when a claim element is described as having “one” feature, it is intended that the element be limited to one and only one of the feature described.
Furthermore, the terms “a” or “an”, as used herein, are defined as one or more than one. Also, the use of introductory phrases such as “at least one” and “one or more” in the claims should not be construed to imply that the introduction of another claim element by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim element to inventions containing only one such element, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an.” The same holds true for the use of definite articles.