This invention relates to logic circuits and, more particularly, to an adiabatic flip-flop and memory cell design and a method of use thereof.
Despite exponential progress over the past fifty years, the growth in performance of computing devices is coming to an end. Modern microprocessors are limited by heat generation, and their speeds have been capped at 4 GHz since 2004. Traditional complementary MOSFET (CMOS) logic circuits dissipate energy as heat every time they switch. CMOS devices switch using sharp transitions, which yields the following equation for their dissipation energy:
$$E_{CMOS} = \tfrac{1}{2} C V_{DD}^{2} \qquad (1)$$
Here, C is the load capacitance of the logic gate, and VDD is the power supply voltage. This energy is dissipated as heat after each switching event, which imposes a speed limit on the operation of modern computing devices. A diagram of a traditional CMOS inverter is presented in
Adiabatic reversible computing is a viable alternative to traditional circuit implementations since it reduces heat generation by avoiding unnecessary dissipation. Adiabatic reversible computing, or simply adiabatic computing, uses reversible logic and quasi-adiabatic transitions to reduce heat generation by introducing a trade-off between speed and power. This can be implemented by using a slowly ramping clock as a voltage supply yielding the following expression for energy dissipation:
Here, R is the resistance of the logic gate, C is the load capacitance of the logic gate, Vt is the voltage of the ramping power supply, RC is the intrinsic time constant of the gate, and T is the ramping time of the power supply from a null value, e.g., 0 volts, to Vt, or vice versa. The term including the RC time constant of the gate and the ramping time T of the power supply allows for further reduction in the dissipation energy. When T is much greater than the RC time constant of the gate, energy recovery can be enhanced and energy dissipation can be reduced considerably. Stated differently, the larger the value of the inverse of the ratio RC/T, i.e., T/RC, the more energy may be recovered from the logic gate and the less energy may be dissipated by the logic gate.
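The expression referred to as equation (2) is not reproduced in the text above. As a hedged sketch only, the textbook analysis of a capacitance C charged through a resistance R by a supply that ramps linearly from the null value to Vt over a time T much greater than RC gives the relations below. The symbol E_ramp and the constant-current approximation are assumptions introduced here for illustration; they are not taken from the source and may differ from the form the authors intended.

```latex
% Assumption: T >> RC, so the charging current is approximately constant
% and the load voltage closely tracks the ramping supply.
\begin{align*}
I &\approx \frac{C V_t}{T} \\
E_{\mathrm{ramp}} &\approx I^{2} R\, T = \frac{RC}{T}\, C V_t^{2} \\
\frac{E_{\mathrm{ramp}}}{E_{CMOS}} &\approx \frac{2RC}{T}
  \qquad \text{(taking } V_t = V_{DD}\text{)}
\end{align*}
```

Under these assumptions the dissipation falls in proportion to RC/T, which is consistent with the statement that a larger T/RC allows more energy to be recovered.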
Adiabatic computing can be implemented as split-rail charge recovery logic (SCRL). An inverter using adiabatic SCRL logic is shown in
Since the logic values of SCRL gates are not valid when the clocks are ramping up and down, or during the null state, any following gates will require clocks with a different phase. In some non-limiting embodiments or examples, the null state can be 0 Volts. However, this is not to be construed in a limiting sense, since the null state may be another suitable and/or desirable voltage value selected by one skilled in the art, e.g., for a particular application.
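The phase requirement above can be illustrated with a small scheduling sketch. It assumes the nested ordering commonly called Bennett clocking (a term used later in this description), in which the clocks of cascaded stages ramp up in stage order and ramp back down in reverse order. The function names `bennett_schedule` and `active_stages`, and the event labels, are hypothetical and are introduced only for illustration.

```python
def bennett_schedule(num_stages):
    """Return the ordered clock events for `num_stages` cascaded stages.

    Each event is a (stage_index, action) tuple, with stages numbered from 1.
    Assumption: nested (Bennett-style) ordering, i.e. clocks ramp up in stage
    order and ramp back down to the null state in reverse stage order.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    ramp_up = [(stage, "ramp_up") for stage in range(1, num_stages + 1)]
    ramp_down = [(stage, "ramp_down") for stage in range(num_stages, 0, -1)]
    return ramp_up + ramp_down


def active_stages(num_stages, events_done):
    """Stages whose clocks are fully ramped after `events_done` events."""
    active = set()
    for stage, action in bennett_schedule(num_stages)[:events_done]:
        if action == "ramp_up":
            active.add(stage)
        else:
            active.discard(stage)
    return sorted(active)


if __name__ == "__main__":
    for stage, action in bennett_schedule(3):
        print(f"stage {stage}: {action}")
```

In this sketch a stage is only ramped down after every later stage has returned to the null state, so no stage loses its input while its own output is still valid.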
A chain of three SCRL inverters is shown in
Even though adiabatic microprocessors using SCRL circuits as described above have been successfully implemented, sequential elements were not implemented using adiabatic logic. Modern CMOS circuits have both combinational logic, such as the non-limiting three-inverter chain shown in
Herein, disclosed are designs of example sequential elements, in particular, an example flip-flop and an example SRAM cell that can implement adiabatic computing using SCRL logic. The example adiabatic flip-flop and memory designs are believed to be the first examples of sequential elements that may be used in any practical implementation of adiabatic computing, such as adiabatic microprocessors. Adiabatic microprocessors using the proposed sequential elements may fully implement adiabatic computing.
Generally, provided, in some non-limiting embodiments or examples, is a computer storage element and method of use thereof. In some non-limiting embodiments, the computer storage element may be a flip-flop or memory. In some non-limiting embodiments, the computer storage element may be operated in a manner that reduces, minimizes, or avoids electrical power entering the computer storage element, thereby reducing power consumption and, hence, heat generation in the computer storage element compared with prior methods of use.
Further preferred and non-limiting embodiments or examples are set forth in the following numbered clauses:
Clause 1: A method comprising: (a) in a computer storage element having first and second power inputs separated by an array of transistors of the computer storage element configured for storing a computer bit of data, applying to an input of the array of transistors a logic value “1” or “0”; (b) concurrent with step (a), applying to the first power input a first clock signal having a leading edge that changes from a null value to VDD, or vice versa, over a time period T1; (c) concurrent with step (b), applying to the second power input a second clock signal having a leading edge that changes from the null value to VSS, or vice versa, over the time period T1, whereupon the logic value applied to the input of the array of transistors is stored in the array of transistors; (d) following step (c), causing the first clock signal to change from VDD to the null value, or vice versa, over a time period T2; and (e) concurrent with step (d), causing the second clock signal to change from VSS to the null value, or vice versa, over the time period T2, whereupon a portion or part of electrical charge or energy associated with the logic value stored in the array of transistors is provided to circuitry that generates the first clock signal, the second clock signal, or both the first and second clock signals, wherein the value of time period T1, or time period T2, or both time periods T1 and T2 is/are greater than a product of RC, where R is resistance associated with the computer storage element, and C is a load capacitance associated with the computer storage element.
Clause 2: A method comprising: (a) in a computer storage element having first and second clock inputs separated by an array of transistors of the computer storage element storing a first bit of data applied to a bit input, applying to the first clock input a first clock signal having a leading edge that changes from a null value to VDD, or vice versa, over a time period T1; (b) concurrent with step (a), applying to the second clock input a second clock signal having a leading edge that changes from the null value to VSS, or vice versa, over the time period T1, whereupon a portion or part of electrical charge or energy associated with the first bit of data stored in the array of transistors is provided to circuitry that generates the first clock signal, the second clock signal, or both the first and second clock signals; (c) following step (b), causing a second bit of data to be stored in the array of transistors; and (d) following step (c), causing, over the time period T2, the first clock signal to return from VDD back to the null value, or vice versa, and the second clock signal to return from VSS back to the null value, or vice versa, whereupon the logic value applied to the bit input is stored in the array of transistors, wherein the value of time period T1, or time period T2, or both time periods T1 and T2 is/are greater than a product of RC, where R is resistance associated with the computer storage element, and C is a load capacitance associated with the computer storage element.
These and other features of the present invention will become more apparent from the following description wherein reference is made to the appended drawings wherein:
The timing requirements for sequential elements separating Bennett clocked combinational logic are somewhat complicated, since the data should be latched in when the Bennett clock phases are all active, but the data should not appear on the latch output until all Bennett phases have ramped back down, e.g., to their null states. This can be accomplished by using a master-slave flip-flop as the sequential element. We present the design of an energy-recovery, adiabatic, master-slave flip-flop compatible with SCRL logic.
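The latching requirement described above can be sketched as a purely behavioral model. This is a minimal illustration, not the transistor-level design. It assumes that the master latch captures the input at the leading edge of the master clock, that a slave clock pulse issued after the combinational clocks have returned to their null states copies the stored value to the output, and that signal polarity and analog ramp shapes can be ignored. The class and method names are hypothetical.

```python
class MasterSlaveFlipFlop:
    """Behavioral master-slave model; analog ramps and polarity are ignored."""

    def __init__(self):
        self.master = None  # value held by the master latch (None = nothing held)
        self.slave = None   # value held by the slave latch and seen at the output

    def master_clock_rise(self, data_in):
        """Leading edge of the master clock: capture the input in the master."""
        self.master = data_in

    def slave_clock_pulse(self):
        """Slave clock pulse: copy the master contents to the output side."""
        self.slave = self.master

    def master_clock_fall(self):
        """Trailing edge of the master clock: the master stops holding its value."""
        self.master = None

    @property
    def output(self):
        return self.slave


if __name__ == "__main__":
    ff = MasterSlaveFlipFlop()
    ff.master_clock_rise(1)   # data latched while the logic clocks are active
    print(ff.output)          # output has not changed yet
    ff.slave_clock_pulse()    # assumed to occur after the logic clocks ramp down
    print(ff.output)
    ff.master_clock_fall()
    print(ff.output)          # the slave keeps the value for the next stage
```

Separating capture from release in this way is what keeps new data from reaching the output while the upstream combinational logic is still active.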
The design of a non-limiting example adiabatic master-slave flip-flop is presented in
The adiabatic flip-flop comprises twelve transistors as seen in
A non-limiting example timing diagram of the master-slave flip-flop is presented in
Having thus generally described the example adiabatic master-slave flip-flop shown in
In
Referring now to
In response to the input of a first logic level, e.g., logic level 1 (e.g., VDD), into input “In” prior to time t0 and, beginning at time t0, the subsequent change over a period of time T1 of the value of MClk+ from 0V to VDD and the corresponding change over the period of time T1 of the value of MClk− (not shown) from 0V to VSS, i.e., at the leading edges of MClk+ and MClk−, transistors M1 and M2 turn on or enter a conducting state. With transistors M1 and M2 on, the logic level 1 at “In” appears at the source-drain junction between transistors M3 and M4 and the gates of transistors M5 and M6.
Transistors M3 and M4, operating as a first retention or storage cell, retain the logic level 1 appearing at the source-drain junction between transistors M3 and M4 until the trailing edges of MClk+ and MClk− return to 0V, following the abrupt clock pulse SClk and its inverse
Following the time period T1 of MClk+ and MClk−, the logic level at the input “In” is returned to a null logic state (e.g., don't care state). This change in the logic level at the input “In”, however, has no effect on the logic levels of the first retention cell and the first inverter since the values of MClk+ and MClk− are not changing.
Beginning at time t1, the values of SClk and
In an example, in response to the leading and trailing edges of the abrupt clock pulses SClk and
Following the trailing edges of the SClk and
Following the trailing edges of MClk+ and MClk−, the sequence of signals beginning with the leading edge of BClk1+ (and its inverse BClk1−) through the trailing edge of MClk+ (and its inverse MClk−) can be repeated as deemed suitable and/or desirable for one or more subsequent logic level inputs, i.e., “1” or “0”, into input “In”. However, this is not to be construed in a limiting sense.
For example, in response to the input of a second logic level, e.g., logic level 0 (e.g., VSS), into input “In” prior to an instance of time t0, and beginning at time t0, the subsequent change over a period of time T1 of the value of MClk+ from 0V to VDD and the corresponding change over the period of time T1 of the value of MClk− from 0V to VSS, i.e., at the leading edges of MClk+ and MClk−, transistors M1 and M2 turn on or enter a conducting state. With transistors M1 and M2 on, the logic level 0 at “In” appears at the source-drain junction between transistors M3 and M4 and the gates of transistors M5 and M6.
Transistors M3 and M4, operating as a first retention or storage cell, retain the logic level 0 appearing at the source-drain junction between transistors M3 and M4 until the trailing edges of MClk+ and MClk− return to 0V, following the abrupt clock pulse SClk and its inverse
Following the time period T1, the logic level at the input “In” is returned to a null logic state (e.g., don't care). This change in the logic level at the input “In”, however, has no effect on the logic levels of the first retention cell and the first inverter since the values of MClk+ and MClk− are not changing.
Beginning at time t1, the values of SClk and
In an example, in response to the leading and trailing edges of the abrupt clock pulses SClk and
While not wishing to be bound by any particular theory, it is believed that the time period(s) T1, or T2, or both T1 and T2 is/are tradeoffs between switching speed and power dissipation. For example, it is believed that longer time period(s) T1, or T2, or both T1 and T2 for MClk+ and MClk− result in electrical charge moving more gradually between the external logic circuit that generates MClk+ and MClk− and at least transistors M3-M6 of the Master Latch, with the result being that for time period T2 less energy is dissipated, e.g., by the resistive, capacitive, and/or inductive elements of the Master Latch, and more electrical charge is recovered by at least the portion of the external logic circuit that generates MClk+ and MClk−.
In contrast, shorter time period(s) T1, or T2, or both T1 and T2 for MClk+ and MClk− result in charge moving more quickly to and from the external logic circuit that generates MClk+ and MClk− and at least transistors M3-M6 of the Master Latch. As a result, more power is dissipated, e.g., by the resistive, capacitive, and/or inductive elements of the Master Latch, during this movement of charge and less charge is recovered by at least the portion of the external logic circuit that generates MClk+ and MClk−.
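The trade-off described in the two preceding paragraphs can be illustrated numerically. The sketch below assumes the simplest possible model, a single series resistance and capacitance driven by an ideal linear ramp, and ignores inductive elements and any transistor behavior. The functions `ramp_dissipation` and `closed_form`, and all component values, are hypothetical and chosen only for illustration.

```python
import math


def ramp_dissipation(resistance, capacitance, v_final, ramp_time, steps=100_000):
    """Energy (J) dissipated in R while C is charged by a ramp 0 -> v_final.

    Assumption: single series R-C and an ideal linear ramp of duration
    `ramp_time`, followed by a 20-time-constant hold so the capacitor settles.
    """
    tau = resistance * capacitance
    total_time = ramp_time + 20.0 * tau
    dt = total_time / steps
    v_cap = 0.0
    energy = 0.0
    for i in range(steps):
        t = i * dt
        v_src = v_final * min(t / ramp_time, 1.0)
        v_drop = v_src - v_cap                      # voltage across the resistance
        energy += v_drop ** 2 / resistance * dt     # resistive loss in this step
        v_cap += v_drop / tau * dt                  # forward-Euler capacitor update
    return energy


def closed_form(resistance, capacitance, v_final, ramp_time):
    """Analytic dissipation for the same ideal series R-C ramp model."""
    x = resistance * capacitance / ramp_time        # this is RC/T
    return capacitance * v_final ** 2 * x * (1.0 + x * math.expm1(-1.0 / x))


if __name__ == "__main__":
    r, c, v = 1e3, 1e-12, 1.0                       # hypothetical: RC = 1 ns
    step_energy = 0.5 * c * v ** 2                  # abrupt-switching reference
    for ratio in (0.01, 1.0, 10.0, 100.0):          # values of T/RC
        e = ramp_dissipation(r, c, v, ratio * r * c)
        print(f"T/RC = {ratio:g}: dissipated / (0.5*C*V^2) = {e / step_energy:.4f}")
```

In this model a very short ramp approaches the abrupt-switching loss of ½CV², while a ramp much longer than RC dissipates only a small fraction of it.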
In some non-limiting embodiments or examples, the time period T1 can range between one picosecond and 100 seconds, or between one picosecond and one millisecond, or between one nanosecond and 100 milliseconds. Similarly, time period T2 can range between one picosecond and 100 seconds, or between one picosecond and one millisecond, or between one nanosecond and 100 milliseconds. However, these values are exemplary only and are not to be construed in a limiting sense. Moreover, T1 and T2 may be the same or different time periods.
In some non-limiting embodiments or examples, the value of T in equation (2) above, which can represent time period(s) T1 and/or T2 in the above flip-flop example, may be tuned or selected such that the value of the inverse of RC/T, i.e., T/RC, is between 2 and 5000, or between 2 and 100, or between 2 and 50. However, this is not to be construed in a limiting sense since the value of T, i.e., time period(s) T1 and/or T2 in the above flip-flop example, may be tuned or selected to be any value deemed suitable and/or desirable by one of ordinary skill in the art, e.g., for a particular application and/or based on practicable considerations in the design of the external logic circuit. In an example, the larger the value of T/RC is, the more energy is recovered by the external logic circuit and the less energy is dissipated by the flip-flop shown in
With regard to the above flip-flop example and the variables R and C in equation (2) above, the value of R in equation (2) may be the resistance between the terminals for MClk+ and MClk− at least during time period T2, and the value of C in equation (2) may be the capacitance between the terminals for MClk+ and MClk− at least during time period T2.
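As a hedged illustration of how a ramp time might be chosen from R and C, the sketch below computes T/RC and reports which of the example ranges given above (2 to 5000, 2 to 100, 2 to 50) contain it. The helper names and the component values are hypothetical; real values of R and C would have to be measured or extracted for the terminals in question.

```python
# Example T/RC ranges quoted in the description above.
EXAMPLE_RANGES = ((2, 5000), (2, 100), (2, 50))


def ramp_ratio(resistance, capacitance, ramp_time):
    """Return T/RC for ramp time T (s), resistance R (ohm), capacitance C (F)."""
    return ramp_time / (resistance * capacitance)


def ramp_time_for_ratio(resistance, capacitance, ratio):
    """Return the ramp time T (s) that gives the requested T/RC."""
    return ratio * resistance * capacitance


def matching_ranges(ratio):
    """Return the example ranges (low, high) that contain `ratio`."""
    return [(low, high) for low, high in EXAMPLE_RANGES if low <= ratio <= high]


if __name__ == "__main__":
    r, c = 10e3, 10e-15            # hypothetical values: RC is about 100 ps
    t_ramp = 2e-9                  # hypothetical 2 ns ramp
    ratio = ramp_ratio(r, c, t_ramp)
    print(f"T/RC = {ratio:.1f}, within ranges: {matching_ranges(ratio)}")
```

A designer could equally start from a target ratio and use `ramp_time_for_ratio` to obtain the ramp time the clock generator must provide.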
The adiabatic master-slave flip-flop design presented in
With reference to
In some non-limiting embodiments or examples, the external logic circuit used with the adiabatic SRAM cell and/or the external logic circuit used with the adiabatic master-slave flip-flop is/are designed or configured to not only provide the signals shown, for example, in
The enable signals En and
The remaining, trapped, energy in the cell will be dissipated when new data is written into the cell. When the power clock signals SRClk+ and SRClk− are at the null value, the Write signal is asserted (e.g., to VDD), whereupon the new data is applied to the cell from the bit lines Bit and
As shown in
As shown in
After the SRAM cell is erased and at least part of the stored energy is recovered, then the Write signal ramps up, e.g., from a null value or 0 volts to VDD, turning on transistors M3 and M6. This writes the data at the inputs Bit and
Finally, the SRAM cell can be placed back into read mode by ramping down the enable signal En to the null value, e.g., 0 volts, and ramping up the
During a Read cycle, as seen in
In some non-limiting embodiments or examples, the value of T in equation (2) above, which can represent time period(s) T1 and/or T2 in the above SRAM cell example, may be tuned or selected such that the value of the inverse of RC/T, i.e., T/RC, is between 2 and 5000, or between 2 and 100, or between 2 and 50. However, this is not to be construed in a limiting sense since the value of T, i.e., time period(s) T1 and/or T2 in the above SRAM cell example, may be tuned or selected to be any value deemed suitable and/or desirable by one of ordinary skill in the art, e.g., for a particular application and/or based on practicable considerations in the design of the external logic circuit. In an example, the larger the value of T/RC is, the more energy is recovered by the external logic circuit and the less energy is dissipated by the SRAM cell shown in
With regard to the example SRAM cell shown in
Although the invention has been described in detail for the purpose of illustration based on what is currently considered to be the most practical non-limiting embodiments or examples, it is to be understood that such detail is solely for that purpose and that the invention is not limited to the disclosed non-limiting embodiments or examples, but, on the contrary, is intended to cover modifications and equivalent arrangements that are within the spirit and scope of the following claims. For example, it is to be understood that the present invention contemplates that, to the extent possible, one or more features of any non-limiting embodiment or example can be combined with one or more features of any other non-limiting embodiment or example.
This application claims the benefit of U.S. Provisional Patent Application No. 63/012,367, filed Apr. 20, 2020, the disclosure of which is incorporated herein by reference.
This invention was made with government support under FA9453-19-P-0519 awarded by the United States Air Force. The government has certain rights in the invention.
Number | Date | Country |
---|---|---|
20210327496 A1 | Oct 2021 | US |
Number | Date | Country |
---|---|---|
63012367 | Apr 2020 | US |