Multiple power supply circuit architecture

Information

  • Patent Grant
  • 6366061
  • Patent Number
    6,366,061
  • Date Filed
    Wednesday, January 13, 1999
  • Date Issued
    Tuesday, April 2, 2002
Abstract
A multiple power supply circuit architecture, such as a circuit power system including a first voltage rail, a first reference rail, a second voltage rail, a second reference rail, and a first selective connector between the first and second voltage rails.
Description




CROSS REFERENCE TO RELATED APPLICATIONS




Not Applicable.




STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT




Not Applicable.




BACKGROUND OF THE INVENTION




1. Field of the Invention




The present invention is directed generally to a multiple power supply circuit architecture and, more particularly, to a method and apparatus for significantly reducing power consumption during sleep-mode without reducing circuit speed.




2. Description of the Background




Many modern integrated circuit systems shut down certain circuit blocks when their capabilities are not needed, in order to save power; e.g., sleep mode in a laptop computer. For simple static CMOS logic, sleep mode can be implemented by gating the clock that drives the latches at the input to the logic functions. For static CMOS logic, if the inputs do not change value, then only static leakage power is dissipated. Normally, static logic circuits dissipate 3 to 6 orders of magnitude less power during sleep mode, so power dissipation during sleep mode is minimal.




However, it is known to design a circuit with a two power supply system. See, for example, U.S. Pat. No. 5,814,845, issued to Carley. Such a system can reduce power consumption and maintain circuit speed. In such a circuit, however, the static leakage power is a significant fraction of the total power. That is because multiple power supply circuits sometimes cause “underdriving” of the input of static CMOS logic gates, which results in a higher leakage current, just as lowering the V_T does. In general, for systems which employ CMOS logic gates without any form of preamplifiers, the voltage of the smaller power supply is adjusted such that during normal operation the power dissipated by switching (both capacitive charging power and short-circuit power) is approximately equal to the power dissipated by static leakage currents.




Some circuits have tried to address increased sleep-mode power dissipation with multiple V_T MOS devices, but they require additional masks, additional space, and result in large time delays when transitioning between “sleep” mode and normal operating mode.




Therefore, the need exists for a multiple power supply architecture that reduces leakage current and delays, particularly when transitioning between normal operating mode and “sleep” mode.




BRIEF SUMMARY OF THE INVENTION




The present invention is directed to a multiple power supply circuit architecture. For example, the present invention may be embodied as a circuit power system including a first voltage rail, a first reference rail, a second voltage rail, a second reference rail, and a first selective connector between the first and second voltage rails.




The present invention may also be embodied as a circuit, including a first circuit, a first voltage rail connected to the first circuit, a first reference rail connected to the first circuit, a second circuit, a second voltage rail connected to the second circuit, a second reference rail connected to the second circuit, and a first selective connector between the first and second voltage rails.




The present invention also includes a method of controlling a power system for a circuit, including providing a first power supply, providing a second power supply, connecting the first power supply to the second power supply for sleep mode, and disconnecting the first power supply from the second power supply for non-sleep mode.




The present invention solves problems experienced with the prior art by providing a circuit with reduced sleep-mode power consumption without reduced circuit speed. Those and other advantages and benefits of the present invention will become apparent from the description of the preferred embodiments hereinbelow.











BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING




For the present invention to be clearly understood and readily practiced, the present invention will be described in conjunction with the following figures, wherein:





FIG. 1 is a block diagram illustrating a circuit in accordance with the present invention;

FIG. 2 is a circuit schematic illustrating a counter constructed according to the present invention;

FIG. 3 is a circuit schematic illustrating a series regulator circuit according to one embodiment of the present invention;

FIG. 4 is a circuit schematic illustrating an embodiment of the present invention with external power;

FIG. 5 is a circuit schematic illustrating another embodiment of the present invention with external power;

FIG. 6 is a circuit schematic illustrating a circuit including a controller and a dummy critical path;

FIG. 7 is a circuit schematic illustrating a circuit for dynamically adjusting the second voltage and reference rails based on delay tracking;

FIG. 8 is a circuit schematic illustrating another embodiment of the present invention;

FIG. 9 is a circuit schematic illustrating a circuit for monitoring supply voltage and generating bias voltages;

FIG. 10 is a circuit schematic illustrating another embodiment of the circuit illustrated in FIG. 8;

FIG. 11 is a plan view of an application of the present invention in which the local area adjustment divides a die into smaller regions;

FIG. 12 is a circuit schematic illustrating a Class B driver/buffer according to the present invention;

FIG. 13 is a circuit schematic illustrating a portion of FIG. 8 integrated with the circuit of FIG. 12;

FIG. 14 is a circuit schematic illustrating another embodiment of the circuit of FIG. 13;

FIG. 15 is a block diagram illustrating a 16*16+36-bit MAC architecture;

FIG. 16 is a pie chart illustrating power distribution on a 0.5 μm static CMOS implementation of the invention;

FIGS. 17 and 18 are charts illustrating static CMOS versus QuadRail power-delay comparison measurements;

FIG. 19 is a chart illustrating 0.5 μm series-regulated QuadRail MAC measured power-rail waveforms;

FIG. 20 shows microphotographs of the static CMOS and QuadRail MAC dies;

FIGS. 21-23 are charts illustrating static CMOS versus QuadRail power-delay comparisons in 0.35 μm CMOS, 0.25 μm FDSOI, and 0.16 μm CMOS processes; and

FIGS. 24 and 25 are charts illustrating static CMOS versus series-regulated QuadRail power*delay dispersion analysis in 0.5 μm processes.











DETAILED DESCRIPTION OF THE INVENTION




It is to be understood that the figures and descriptions of the present invention have been simplified to illustrate elements that are relevant for a clear understanding of the present invention, while eliminating, for purposes of clarity, other elements. Those of ordinary skill in the art will recognize that other elements may be desirable. However, because such elements are well known in the art, and because they do not facilitate a better understanding of the present invention, a discussion of such elements is not provided herein. In the described embodiments, logic signals with an “L” subscript swing between V_DDL and V_SSL, and logic signals with an “H” subscript swing between V_DDH and V_SSH. The “L” and “H” subscripts distinguish between the “low-swing” and “high-swing” of the circuit, respectively.




The present invention will be described in terms of a doped silicon semiconductor substrate, although advantages of the present invention may be realized using other structures and technologies, such as silicon-on-insulator, silicon-on-sapphire, and thin film transistor.





FIG. 1 is a circuit schematic illustrating a circuit 10 in accordance with the present invention. The circuit 10 employs multiple voltages at the gate level while still allowing for the retention of a static CMOS-based logic gate structure. That structure mixes high-swing and low-swing signals by, for example, operating non-critical path gates with the low-swing voltages and operating critical path gates with high swing voltages. Significant power reductions are realized because there are no DC paths between the power supplies.




The circuit 10 includes a first voltage rail 12, a first reference rail 13, a second voltage rail 14, and a second reference rail 15. A first selective connector 16 is connected between the first and second voltage rails 12, 14, and a second selective connector 18 is connected between the first and second reference rails 13, 15. A first circuit 20 is connected to the first voltage and reference rails 12, 13, and a second circuit 22 is connected to the second voltage and reference rails 14, 15. The first and second circuits 20, 22 may be any types of circuits such as, for example, logic circuits.




The voltage and reference rails 12-15, under normal operation, are two separate power supplies. The first power supply is formed by the first voltage and reference rails 12, 13, and the second power supply is formed by the second voltage and reference rails 14, 15. However, the power supplies formed by the voltage and reference rails 12-15 are not identical. One power supply typically has a larger voltage swing than the other. In addition, the voltage swings may be overlapping or non-overlapping, and centered or non-centered. However, certain benefits are realized if the power supplies are centered (that is, the midpoint of one power supply is the same as the midpoint of the other, even though the power supplies have different voltage swings). For example, if the supplies are centered, high and low noise margins are maximized and rising and falling delays are equalized. Although the present invention is illustrated as having four rails 12-15, forming two power supplies, and two selective connectors 16, 18, the present invention is not limited to that embodiment. For example, a six rail, three power supply system using three selective connectors can also realize the benefits of the present invention. More rails, connectors, and circuits may also be used.
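The centering condition described above can be illustrated with a short sketch. The function names and the example voltages are assumptions chosen for illustration and do not come from the described embodiments; each supply is modeled simply as a (voltage rail, reference rail) pair.

```python
def midpoint(v_rail, v_ref):
    """Midpoint of a supply formed by a voltage rail and a reference rail."""
    return (v_rail + v_ref) / 2.0


def is_centered(supply_a, supply_b, tol=1e-9):
    """True when two supplies share the same midpoint within a tolerance.

    Each supply is a (voltage_rail, reference_rail) tuple in volts.
    The swings may differ; only the midpoints are compared.
    """
    return abs(midpoint(*supply_a) - midpoint(*supply_b)) <= tol


# Hypothetical example values (volts), not taken from the patent:
high_swing = (3.0, 0.0)       # rails 12 and 13
low_centered = (2.0, 1.0)     # rails 14 and 15, same midpoint as high_swing
low_offset = (1.0, 0.0)       # rails 14 and 15, shares the reference level instead

print(is_centered(high_swing, low_centered))
print(is_centered(high_swing, low_offset))
```

In this sketch the first pair is centered even though the swings are 3.0 volts and 1.0 volt, while the second pair is overlapping but non-centered.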




The first and second selective connectors 16, 18 are sleep-mode enable devices that keep the power supplies separate during normal operation. However, during the sleep mode, or low power mode, the first and second voltage rails 12, 14 are shorted together, and the first and second reference rails 13, 15 are shorted together, thereby eliminating the DC path power consumption that exists during normal operating mode. When the rails 12-15 are shorted together, both power supplies are operating at the same or nearly the same voltage. The present invention will be described in terms of the shorted power supplies operating at the high swing voltage, although benefits of the present invention may also be realized if the shorted power supplies are instead operated at the low swing voltage.
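As a rough behavioral illustration of the two operating modes, the following sketch models the four rails as ideal voltages and the selective connectors as ideal shorts. The `Rails` type, the `apply_mode` function, and the example voltages are hypothetical, and the sketch follows the described case in which the shorted supplies operate at the high swing voltage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rails:
    v12: float  # first voltage rail (high swing)
    v13: float  # first reference rail (high swing)
    v14: float  # second voltage rail (low swing)
    v15: float  # second reference rail (low swing)


def apply_mode(normal: Rails, sleep: bool) -> Rails:
    """Return the rail voltages for normal or sleep mode.

    Normal mode: the two supplies stay separate.
    Sleep mode: connector 16 shorts rail 14 to rail 12 and connector 18
    shorts rail 15 to rail 13, so both supplies sit at the high swing.
    """
    if not sleep:
        return normal
    return Rails(v12=normal.v12, v13=normal.v13, v14=normal.v12, v15=normal.v13)


# Hypothetical normal-mode voltages in volts.
normal = Rails(v12=3.0, v13=0.0, v14=2.0, v15=1.0)
asleep = apply_mode(normal, sleep=True)
print(asleep)
```

With the connectors closed, the potential between rails 12 and 14, and between rails 13 and 15, is zero in this idealized model.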




The selective connectors 16, 18 may be, for example, mechanical switches or solid state switches, such as transistors. The selective connectors 16, 18 may also be more complex devices, such as power supplies, to selectively create a potential between the rails when no connection is desired, and to selectively create a zero potential between the rails when a connection or short is desired. Examples of such power supplies are series-regulated power supplies and switching power supplies.




An advantage of shorting the power supplies together to enter sleep mode is that it results in extremely little static leakage power dissipation. Unlike prior art circuits, however, the present invention provides a circuit 10 that is fully functional at all times, even in sleep mode. More particularly, when the first and second power supplies are shorted together, the entire circuit is still functional at full clock speed. Furthermore, the circuit 10 does not suffer from any recovery delay when it operates in sleep mode. For example, if the circuit 10 is in sleep mode, the second circuit 22 (as well as the first circuit 20) is still completely functional because it is powered by the high swing voltage. In fact, the second circuit may operate more quickly in sleep mode than in normal mode because it is being driven by a higher voltage. However, operating the second circuit 22 in sleep mode may result in more power being consumed because of the higher voltage driving the second circuit.




Alternatively, only one selective connector, such as 16, may be provided, so that only one pair of rails, such as 12, 14, are connected together during sleep mode. In that embodiment, the other selective connector 18 is eliminated and the rails 13, 15 are not connected together during sleep mode. For example, the rails 13, 15 not connected together during sleep mode may be at the same potential so that there is no need to connect them together. In that embodiment, one of those rails, such as 13, may be eliminated and all of the circuits may be tied to the remaining rail 15.





FIG. 2 is a circuit schematic illustrating a counter constructed according to the present invention. In that embodiment, the first circuit 20 is a logic stage and the second circuit 22 is a driver/buffer stage. The high swing power supply and low swing power supply are approximately centered. The PMOS devices may have independent N-wells for minimal body-effect on the buffer stage PMOS devices. In addition, the NMOS devices may reside in the native P-substrate to facilitate a single threshold, N-well based process.





FIG. 3 is a circuit schematic illustrating a series-regulator circuit for regulating the high swing and low swing power supplies for the counter illustrated in FIG. 2. The high swing power (first voltage and reference rails 12, 13) may be supplied either off-chip or on-chip. The low swing power (second voltage and reference rails 14, 15) may be servoed to maintain a fixed ratio of off-drive to average on-drive current (I_off/I_on) in order to balance static and dynamic power. As a result, total power may be minimized without any process modifications.




In one embodiment, the transistor pairs M3:M4 and M7:M8 are ratioed Nx:1x, where 1x is the minimum-width transistor and N is the target I_on/I_off ratio. The PMOS devices may be ratioed wider than the NMOS devices in order to equalize their respective drive capabilities. The current mirror devices M1:M2 and M5:M6 may be ratioed 1:1. M9 and M10 provide the DC series path between the power rails and are sized to be able to source and sink the peak on-drive current requirement. Three local inter-rail decoupling capacitors (C_d) each with a value of, for example, 4 pF may be used to reduce rippling on the low-swing rails 14, 15 caused by simultaneous switching noise on the low-swing and high-swing rails.




Transistors M11 and M12 are disabled (SLP=Vs1) during normal operation. However, during sleep mode (SLP=Vd1), or low power mode, the low swing rails are shorted to the high swing rails, eliminating DC path power consumption that exists during active mode.





FIG. 4 is a circuit schematic illustrating an embodiment of the present invention with external power. Power supplies V_B1, V_B2, and V_B3 are provided external of the device 10, such as off-chip. In sleep mode, first and second selective connectors 16, 18 are closed and connectors 23, 23′ are open to remove power supply V_B2 from the second voltage and reference rails 14, 15. In normal mode, selective connectors 16, 18 are open and connectors 23, 23′ are closed.
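The switch settings for the two modes of FIG. 4 can be summarized in a small lookup, sketched below. The function name and the dictionary keys are hypothetical labels for the connectors; `True` means closed and `False` means open.

```python
from typing import Dict


def connector_states(sleep: bool) -> Dict[str, bool]:
    """Closed (True) or open (False) state of each connector of FIG. 4.

    Sleep mode: connectors 16 and 18 close, connectors 23 and 23' open,
    which removes V_B2 from rails 14 and 15.
    Normal mode: the opposite setting.
    """
    return {
        "16": sleep,
        "18": sleep,
        "23": not sleep,
        "23'": not sleep,
    }


for mode in (False, True):
    states = connector_states(mode)
    # V_B2 is never connected while the rails are shorted together.
    assert not (states["16"] and states["23"])
    print("sleep" if mode else "normal", states)
```

The assertion in the loop records the property implied by the description: the shorting connectors and the V_B2 connectors are never closed at the same time.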





FIG. 5 is a circuit schematic illustrating another embodiment of the present invention with external power. A single power supply V_B1 provides power to voltage regulators 25, 25′, which regulate the second voltage and reference rails 14, 15. In sleep mode the voltage regulators 25, 25′ connect the first and second voltage rails 12, 14 together and connect the first and second reference rails 13, 15 together. In normal mode, the voltage regulators 25, 25′ generate separate swing voltages on the rails 12-15. V_B1 may be located external of the device 10, such as off-chip, while the voltage regulators 25, 25′ and all other illustrated components may be located on the device 10.





FIG. 6 is a circuit schematic illustrating another embodiment of the present invention with a dummy critical path 29 and a controller 30. The circuit 10 may be used in situations where it is important to optimize latch-to-latch delay and timing. The circuit 10 includes a circuit block 24 including the first and second circuits 20, 22 and connecting first and second latches 26, 28. It also includes a dummy critical path 29 and a controller 30. As described hereinbelow, the dummy critical path 29 may be eliminated in some embodiments.




The dummy critical path 29 simulates the critical path of the logic block 24, so as to provide feedback to the controller 30 indicative of the speed at which signals are propagating through the critical path of the logic block 24. As a result, the dummy critical path 29 provides feedback to the controller 30 regarding factors that affect the speed of the circuit 10, such as changes in temperature, changes in operating voltage, and manufacturing variations. The dummy critical path 29 does not necessarily have to simulate the entire logic block 24 to be effective. For example, the dummy critical path 29 may simulate only a portion of the logic block 24, such as the second circuit 22 which, in the illustrated embodiment, is operating at the lower voltage.




The controller 30 controls the voltage of the second voltage and reference rails 14, 15. The controller 30 may control the voltage on the rails 14, 15 directly, or it may control them indirectly, such as by controlling the first and second selective connectors 16, 18 (as illustrated with broken lines in FIG. 6). The controller 30 may also receive feedback from the second voltage and reference rails 14, 15. The controller 30 may also receive feedback from the dummy critical path 29. The controller 30 uses the feedback from the dummy critical path 29 to adjust the low swing voltage of the second voltage and reference rails 14, 15. For example, the low swing voltage may be reduced until the signals do not propagate quickly enough through the dummy critical path, thereby minimizing power consumption and still maintaining adequate signal speed. Alternatively, the low swing voltage may be adjusted until dynamic power and static power are equal, such as may be determined from the ratio of I_off/I_on. The controller 30 may periodically check the dummy critical path 29 to compensate for changing conditions, such as temperature variations.
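The first adjustment strategy described above, reducing the low swing until the dummy critical path no longer meets timing, can be sketched as a simple search. Everything in this sketch is an assumption made for illustration: the delay model, its constants, the millivolt step, and the function names are hypothetical and are not taken from the described controller 30.

```python
def dummy_path_delay_ps(swing_mv: int, vt_mv: int = 300, k: int = 400_000) -> float:
    """Hypothetical delay model for the dummy critical path.

    Delay grows as the low swing approaches an assumed threshold voltage.
    Valid only for swing_mv > vt_mv.
    """
    return k / (swing_mv - vt_mv)


def lowest_passing_swing(start_mv: int, floor_mv: int, step_mv: int, target_ps: float) -> int:
    """Step the low swing down while the next lower setting still meets timing.

    Returns the smallest swing (in mV) whose modeled delay is within target_ps,
    never going below floor_mv. Assumes start_mv already meets timing and
    floor_mv is above the model's threshold.
    """
    swing = start_mv
    while (swing - step_mv >= floor_mv
           and dummy_path_delay_ps(swing - step_mv) <= target_ps):
        swing -= step_mv
    return swing


# Hypothetical operating point: start at 1000 mV, 50 mV steps, 1000 ps target.
print(lowest_passing_swing(start_mv=1000, floor_mv=400, step_mv=50, target_ps=1000.0))
```

Re-running the search periodically would correspond to the controller rechecking the dummy critical path as conditions such as temperature change. Integer millivolts are used so that repeated stepping does not accumulate floating-point error.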




In another embodiment, the first and second selective connectors 16, 18 may be eliminated and the circuit 10 may operate in a more conventional mixed swing quadrail configuration.




In another embodiment, the dummy critical path 29 may be eliminated. For example, the controller 30 may measure signal propagation through the actual critical path when the circuit 10 is not otherwise being used. In that embodiment, the controller 30 may be connected to the front and back of the critical path, such as near the first and second latches 26, 28, so as to produce and measure the propagation of a signal through the critical path.





FIG. 7 is a circuit schematic illustrating a circuit for dynamically adjusting the second voltage and reference rails 14, 15 based on delay tracking. The dummy critical path 29 includes a dummy circuit and associated control circuitry. The dummy circuit may be located in close physical proximity to the second circuit 22 so that the dummy circuit is very similar to the second circuit 22 in variations, such as process and temperature variations, and therefore is representative of the worst case performance of the second circuit 22. Nonetheless, additional “slack”, such as about ten percent, may be added to the dummy circuit as a safety margin. The charge pumps in the controller 30 decrease or increase the low voltage swing on rails 14, 15, depending on whether or not, respectively, the dummy circuit meets the target clock CLK performance. As a result, the voltage on rails 14, 15 may be fine tuned to the point where the dummy circuit has a delay that matches the target delay. A voltage minimum level (Vddmin/Vssmax) determines the minimum allowable low swing defined by rails 14, 15, which may be desired for balancing static and dynamic power or for other reasons, such as maintaining minimum allowed noise margins. The common mode comparison block helps to keep the rails 14, 15 centered. The buffer drivers in the controller 30 supply the voltages carried on rails 14, 15 to other parts of the circuit 22.
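One way to picture the delay-tracking loop of FIG. 7 is as a repeated up/down update of the low swing. The sketch below is a behavioral approximation only: the function name, the step size, and the way the minimum swing stands in for the Vddmin/Vssmax limit are assumptions, and the midpoint is simply held fixed to mimic the effect of the common mode comparison block.

```python
def charge_pump_step(v_ddl, v_ssl, meets_timing, step=0.01, min_swing=0.2):
    """One update of the low-swing rails 14 and 15 (voltages in volts).

    If the dummy circuit meets the target clock, the swing is decreased by
    `step`; otherwise it is increased. The swing is never allowed below
    `min_swing`, and the rails stay symmetric about their current midpoint.
    """
    midpoint = (v_ddl + v_ssl) / 2.0
    swing = v_ddl - v_ssl
    swing = swing - step if meets_timing else swing + step
    swing = max(swing, min_swing)  # stand-in for the Vddmin/Vssmax limit
    return midpoint + swing / 2.0, midpoint - swing / 2.0


# Hypothetical starting point: rails at 2.0 V and 1.0 V, timing currently met.
v_ddl, v_ssl = charge_pump_step(2.0, 1.0, meets_timing=True, step=0.1)
print(v_ddl, v_ssl)
```

Calling the function once per evaluation of the dummy circuit makes the swing hover around the point where the dummy delay matches the target delay, which is the fine tuning described above. An upper limit on the swing is not modeled here.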





FIG. 8 is a circuit schematic illustrating another embodiment of the present invention. The first and second selective connectors 16, 18 are embodied as NMOS and PMOS transistors, respectively. The NMOS and PMOS transistors are controlled by sleep signals SLP* and SLP, respectively, at their gates. The signals SLP* and SLP may be provided to the selective connectors 16, 18 by, for example, a logic circuit (not shown), such as may be used to produce other control signals for the circuit 10. The first circuit 20 includes a PMOS transistor 31 and a current source 32. The second circuit 22 includes an NMOS transistor 34 and a current source 42.





FIG. 9 is a circuit schematic illustrating a circuit for monitoring the supply voltages at the rails 12-15, and for generating the bias voltages. Such a circuit is sometimes desirable because there are often significant variations in threshold voltages. Additionally, threshold voltages may change over time or as a result of changes in temperature. Accordingly, it is sometimes desirable to monitor at least some of the voltages carried by the rails 12-15, as well as to back bias the substrate and wells carrying the transistors 20, 22. In circumstances where a circuit such as that illustrated in FIG. 9 is not necessary, the voltages carried by the rails 12-15 may be supplied by fixed power supplies, such as batteries.




Back biasing of the substrate is accomplished by a floating power supply 44 connected to the substrate via a conductor 46. Once substrate voltage V_SUBS is set, it remains substantially fixed. Accordingly, it may be more appropriate to refer to power supply 44 as an adjustable power supply. One reason for back biasing the substrate is to match the threshold voltages with V_WELL above the value of the voltage V_DDH. For example, to substantially reverse bias the PMOS junction capacitances one may place a large back bias on the substrate, e.g. V_SUBS = V_SSL − 3 volts.




Typical values which may be used in the circuit shown in FIG. 9 include V_SSL set to ground potential and V_SUBS set at −3 volts. The voltage difference across second voltage and reference rails 14, 15 may be small (e.g. 0.25 volts) and is set by a floating power supply 48 connected across those rails. V_DDH − V_SSH may be equal to V_DDL − V_SSL (e.g. 0.25 volts). V_SSH and V_WELL may then be determined because the voltage difference between rails 12, 15 must be greater than the threshold voltages of the devices, and V_WELL must be greater than V_DDH.




V_SSH − V_SSL determines the off current flowing through NMOS input transistor 34. Where V_SSL is zero volts, V_SSH determines the off current. A typical value for V_SSH − V_SSL is approximately one volt. One of the benefits of the multiple power supply architecture of the present invention is that the value V_SSH − V_SSL may be adjusted to make up for variations in the threshold voltages of the n-type devices. The value of V_SSH may be allowed to float to compensate for V_TN. A floating power supply 50 is provided across first voltage and reference rails 12, 13 so as to apply approximately 1.25 volts to the first voltage rail 12 and one volt to the first reference rail 13. However, the first reference rail 13 is also connected to a negative feedback loop comprised of a constant current source 52 and NMOS transistor 54 connected across rails 14 and 15. The transistor 54 receives a signal at its gate terminal which is representative of the midpoint between the voltages carried by rails 12, 13, i.e., (V_DDH + V_SSH)/2. The output of the transistor 54 is connected to a non-inverting input terminal of an operational amplifier 56. An inverting input terminal of the operational amplifier 56 receives a voltage representative of the midpoint of the voltages carried by rails 14 and 15, i.e., (V_DDL + V_SSL)/2. An output terminal of the operational amplifier 56 is connected to rail 13. Because of the negative feedback loop comprised of current source 52, transistor 54, and operational amplifier 56, V_SSH is allowed to float to precisely compensate for the value of V_TN.




The threshold of transistor 34, V_TNS, will likely be large when several volts of negative bias are applied to the substrate to decrease the junction capacitances of the n-type devices. However, the exact value of V_SSH − V_SSL is derived from the feedback loop comprised of current source 52, transistor 54, and operational amplifier 56, which determine the necessary difference to achieve a desired mid-point (half way between “on” and “off”) current level for transistor 34. The on current level is the current through transistor 34 when its gate to source voltage V_GS is at V_DDH − V_SSL. It is typical, but not necessary, that V_DDH − V_SSH = V_DDL − V_SSL. The exact opposite is true for the PMOS input gate 31. In that case, the off current is given by the current through the PMOS transistor 31 with V_GS = V_DDL − V_DDH and its on current is determined by V_GS = V_SSL − V_DDH. Because the same voltage difference determines the off current for the NMOS and PMOS devices, this circuit will work correctly when V_TN = V_TP. A feedback loop adjusts the value of V_WELL until the threshold of the n-type devices and the p-type devices match. Another reason for back biasing the substrate is to ensure that V_TS can be matched with V_WELL above V_DDH.
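The gate-to-source expressions above can be checked with a short worked example. The sketch below plugs in the typical values mentioned for FIG. 9 (V_SSL at ground, a 0.25 volt low swing, approximately 1.25 volts on rail 12 and one volt on rail 13); the function name and dictionary keys are hypothetical.

```python
def gate_source_voltages(v_ddh, v_ssh, v_ddl, v_ssl):
    """On and off gate-to-source voltages for the input transistors.

    NMOS transistor 34: off at V_SSH - V_SSL, on at V_DDH - V_SSL.
    PMOS transistor 31: off at V_DDL - V_DDH, on at V_SSL - V_DDH.
    """
    return {
        "nmos_off": v_ssh - v_ssl,
        "nmos_on": v_ddh - v_ssl,
        "pmos_off": v_ddl - v_ddh,
        "pmos_on": v_ssl - v_ddh,
    }


# Typical values from the description, in volts.
vgs = gate_source_voltages(v_ddh=1.25, v_ssh=1.0, v_ddl=0.25, v_ssl=0.0)
print(vgs)
# {'nmos_off': 1.0, 'nmos_on': 1.25, 'pmos_off': -1.0, 'pmos_on': -1.25}
```

With these values the off-state gate drive has the same one volt magnitude for both device types, which is the condition under which the description states the circuit works correctly when V_TN = V_TP.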





FIG. 9 also illustrates a feedback loop for adjusting V_WELL. That feedback loop includes a transistor 58 series-connected with a current source 60 across first voltage and reference rails 12, 13. The transistor 58 receives at its gate terminal a signal representative of the midpoint in the voltage across the second voltage and reference rails 14, 15, i.e., (V_DDL + V_SSL)/2. The output of the transistor 58 is input to a non-inverting input terminal of an operational amplifier 62. An inverting input terminal of the operational amplifier 62 receives a voltage representative of the midpoint in the voltages across rails 12, 13, i.e., (V_DDH + V_SSH)/2. The voltage V_WELL available at an output terminal of the operational amplifier 62 is connected to the well through a conductor 63.




The proposed architecture is able to offset the nominal value of V_T of each component and nearly all of the variation in V_T. Alternatively, V_T may be controlled by varying the nominal value of V_T during the manufacturing process, and by imposing more stringent limitations on its variance during manufacturing.





FIG. 10 is a circuit schematic illustrating another embodiment of the circuit illustrated in FIG. 8. The current sources 32, 42 are implemented by transistors 62, 64. Transistor 64 acts as a variable current source so the load capacitance can be charged up in the required fraction of a clock cycle. For example, the signal VB_1L input on the gate terminal of the transistor 64 may be on the order of −0.75 volts to −2 volts. The signal VB_2H input to the gate terminal of the transistor 62 provides a similar function of setting the value of the current source and may assume a value of 2 volts to 3.5 volts.




The follower circuit 66 is comprised of two series connected PMOS transistors 68 and 70 connected across rails 12 and 13. The transistor 68 acts as a constant current source. Its value is set by an input signal VB_3H in a manner similar to that previously described in conjunction with the signal VB_1L. Transistor 70 receives at its gate terminal the output signal OUT1_L. The follower circuit 66 produces an output signal OUT1_H. In the illustrated embodiment, the follower has a gain substantially less than one (0.5 to 0.8), so its output swing will not be full rail-to-rail. Accordingly, the output signal may be buffered, such as with another logic gate.




The PMOS transistors 68, 70 may be fabricated in a well separate from the well of the other p-type transistors. Thus, a separate well bias voltage V_WELL2 may be provided. The signal V_WELL2 can be produced using the concepts illustrated in conjunction with FIG. 3 but using a reference circuit matched to transistors 68, 70 and connecting the inverting input terminal of the operational amplifier to the reference circuit output.




The circuit architecture of the present invention can be applied at two different levels of threshold offset adjustment: local-area adjustment and die-level adjustment. Die-level adjustment would use the same values for V_SSH and V_WELL across the entire die. That embodiment will offset some of the systemic variations in V_TN and V_TP across the wafer and will offset all of the variations between runs. Local-area adjustment divides the die into smaller regions 72, as illustrated in FIG. 11. In each region 72, the values for V_SSH and V_WELL would be determined by a local circuit 74, such as that illustrated in FIG. 9. To facilitate better voltage range compatibility, only the outputs from the substrate device gates may be distributed between regions 72. For example, for an n-type well process, the output swinging from V_SSL to V_DDL should be distributed between regions because the value of V_SSH varies between regions. That would also hold true for interconnections between different integrated circuits.





FIG. 12 illustrates a Class B driver/buffer 76. Like static CMOS, either M1 is on and M2 is off, or vice versa. No static power is dissipated by the Class B buffer 76 except for leakage currents. However, because M1 is operating in common-source mode and M2 is operating in common-drain mode, the well voltages of M1 and M2 may be adjusted separately by area-wide or chip-wide bias generators to make the switching point of the buffer 76 occur at the midpoint of the input swing.





FIG. 13 is a circuit schematic illustrating the second circuit 22 of FIG. 8 connected to a Class B buffer circuit 76 of the type shown in FIG. 12. A transistor 34′ and a current source 42′ provide a signal that is the complement of the signal to be buffered.





FIG. 14 is another embodiment of the device illustrated in FIG. 13. The current source 42′ is embodied by a transistor 78′ which is responsive to the complement of the signal input to transistor 34′. Because the transistors 78′ and 34′ are responsive to the true and complement, respectively, of the same signal, power is dissipated only during switching. Similarly, the current source 42 is embodied as a transistor 78 so that power is dissipated by those transistors only during switching. Thus, while the circuit shown in FIG. 13 may be viewed as a Class A/B circuit, the circuit shown in FIG. 14 is a Class B/B circuit.




The transistors 34′, 78′, 34, 78 may all be located on the same substrate such that adjustment of the well potential as was done with transistors M1 and M2 is not possible. Under such conditions, one may ratio the widths of the transistors to compensate for differences in gain caused, for example, by different modes of operation. Thus, in FIG. 13, the width of transistor 34 is greater than the width of transistor 78 and the width of transistor 34′ is greater than the width of transistor 78′. Appropriate ratios may be arrived at by running simulations seeking the largest possible noise margins. Of course, combinations of ratioing and control of well potential may also be used where appropriate.




A two's complement, fixed-point 16*16+36-bit multiply-accumulate unit (MAC) was fabricated in a commercial 0.5 μm CMOS process. The MAC comprises an overlapped bit-pair Booth-recoded, (3,2) counter-based Wallace tree 16*16-bit multiplier and a 36-bit block carry-lookahead final accumulator, with a single pipeline stage between the multiplier and accumulator for enhanced throughput, as shown in FIG. 15. The power distribution measured on a static CMOS implementation of the MAC is shown in FIG. 16. The Wallace tree multiplier is the most power-critical MAC component, consuming 75% of total power. This is due to the substantial interconnect capacitances driven by the 28-transistor-based (3,2) counter within the Wallace tree. In order to lower the multiplier power, three versions of the MAC are fabricated with the multiplier constructed in series-regulated QuadRail, off-chip regulated QuadRail, and conventional static CMOS to study the relative power-delay trade-offs. The final accumulator, due to its higher logic depth than the multiplier, is the most time-critical MAC component and hence sets the maximum clock frequency. It is therefore implemented in full-swing static CMOS in all MAC versions to retain a fixed, high throughput. All three MACs have CMOS-level I/Os to enable interfacing with external CMOS circuitry without level conversion.
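The arithmetic performed by such a MAC can be illustrated with a purely behavioral sketch. The Python below is not the fabricated circuit: it only models radix-4 (overlapped bit-pair) Booth recoding of the 16-bit multiplier, a carry-save reduction built from (3,2) counters, and a 36-bit two's complement accumulate. All function names are hypothetical, the grouping of terms in the reduction is an assumption, and the final carry-lookahead additions are stood in for by ordinary integer addition.

```python
ACC_BITS = 36
MASK = (1 << ACC_BITS) - 1


def to_signed(value, bits):
    """Interpret the low `bits` of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def booth_digits(y, bits=16):
    """Radix-4 Booth digits (each in -2..2) of a signed multiplier, LSB first."""
    padded = (y & ((1 << bits) - 1)) << 1  # append the implicit y[-1] = 0
    digits = []
    for i in range(0, bits, 2):
        triple = (padded >> i) & 0b111  # bits y[i+1], y[i], y[i-1]
        digits.append(((triple >> 1) & 1) + (triple & 1) - 2 * (triple >> 2))
    return digits


def counter_3_2(a, b, c):
    """Bitwise (3,2) counter: three words in, a sum word and a carry word out."""
    sum_word = a ^ b ^ c
    carry_word = (((a & b) | (a & c) | (b & c)) << 1) & MASK
    return sum_word, carry_word


def wallace_reduce(terms):
    """Reduce a list of words to two words using layers of (3,2) counters."""
    terms = [t & MASK for t in terms]
    while len(terms) > 2:
        nxt = []
        for i in range(0, len(terms) - 2, 3):
            nxt.extend(counter_3_2(*terms[i:i + 3]))
        nxt.extend(terms[len(terms) - len(terms) % 3:])  # pass leftovers through
        terms = nxt
    return terms


def mac(acc, x, y):
    """Return acc + x*y as a 36-bit two's complement value (x, y are 16-bit)."""
    x, y = to_signed(x, 16), to_signed(y, 16)
    partials = [(d * x) << (2 * i) for i, d in enumerate(booth_digits(y))]
    sum_word, carry_word = wallace_reduce(partials)
    product = (sum_word + carry_word) & MASK  # stands in for the final adder
    return to_signed(product + acc, ACC_BITS)
```

Calling `mac(10, -7, 9)` accumulates the product of -7 and 9 onto a running total of 10. Eight Booth partial products are reduced to two words in four counter layers before the final addition.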





FIGS. 17 and 18 show the measured Wallace tree multiplier power-delay comparisons for static CMOS vs. the QuadRail methodologies over a range of operating voltages (2.5-1.5V), i.e., Vdd for CMOS and Vlogic for QuadRail. QuadRail's corresponding buffer voltages are selected to maintain an Ioff/Ion ratio of 1:150, which balances static and dynamic power within the QuadRail multiplier while meeting the target delay constraints set by the CMOS MAC. FIG. 19 shows the low-swing rail waveforms from the series-regulated QuadRail MAC at Vd1=2V, Vs1=0V. Measured peak-to-peak power/ground bounce on the low-swing power rails is confined to within 8% of the low-swing voltage with 4 pF on-chip inter-rail decoupling capacitors.




Power and delay are measured across 500 pseudo-random input vectors. The off-chip regulated QuadRail approach shows energy/operation savings ranging up to 3.79× over static CMOS, with the savings increasing with voltage scaling. The savings are attributed to the following:




Average point-to-point net capacitance (due to both interconnect and fanout gate loading) extracted from the Wallace tree multiplier layout is 48 fF. This, coupled with the inherently high switching activities of Wallace trees, makes the effective switched capacitance per cycle substantial. A full quadratic reduction in buffer stage dynamic power is achieved due to the lowered output swing across this capacitance.




28% of the dynamic power within the multiplier is due to short-circuit power dissipation, despite the multiplier being optimally sized to maintain steep input rise/fall times. Thus, the reduced buffer stage swing offers a nearly cubic reduction in its short-circuit power component as well, contributing to the additional energy/operation savings.
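The quadratic dependence of switching energy on output swing can be made concrete with a short calculation. The sketch below uses the textbook C*V^2 model for the energy drawn from a supply per charge/discharge cycle, applied to the 48 fF average net capacitance reported above. The 1.5 V and 0.5 V swings are illustrative assumptions rather than measured operating points, and the function names are hypothetical.

```python
NET_CAP_F = 48e-15  # average point-to-point net capacitance from the text


def switching_energy(cap_f, swing_v):
    """Energy per charge/discharge cycle when the net swings by swing_v
    and is charged from a supply equal to that swing (C * V^2 model)."""
    return cap_f * swing_v ** 2


def quadratic_saving(full_swing_v, low_swing_v):
    """Ratio of full-swing to low-swing switching energy on the same net."""
    return (full_swing_v / low_swing_v) ** 2


# Assumed example swings: 1.5 V full swing versus a 0.5 V reduced swing.
full = switching_energy(NET_CAP_F, 1.5)
low = switching_energy(NET_CAP_F, 0.5)
print(f"full swing: {full * 1e15:.1f} fJ, low swing: {low * 1e15:.1f} fJ, "
      f"saving: {quadratic_saving(1.5, 0.5):.1f}x")
```

Under these assumptions a threefold reduction in swing gives a ninefold reduction in switching energy on the net. This first-order model covers only the capacitive component and ignores the short-circuit component discussed above.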




Series-regulated QuadRail offers relatively lower energy/operation savings than off-chip regulated QuadRail, due to the DC series path between the power supplies. Therefore, the buffer stage dynamic power reduction factor drops from quadratic to linear. However, the nearly cubic reduction in buffer stage short-circuit power is still retained, contributing to an energy/operation savings slightly larger than linear. The savings range up to 2.55×, i.e., up to a 35% loss in savings compared to off-chip regulated QuadRail. At 67 MHz/23 MHz (maximum/minimum measured clock speed), the total series-regulated QuadRail MAC power (i.e., multiplier, accumulator, and registers) is 16.6 mW/2.06 mW. Series-regulated QuadRail's DC power disadvantage is offset by the following advantages:




Standby power (152.5 nW) is nearly three orders of magnitude lower than off-chip regulated QuadRail's standby power (143.8 μW), because of the absence of the Vd1−Vs1 totem-pole current path during sleep mode. Further, transition between sleep and active mode is accomplished in a single clock cycle. Since transitioning to sleep mode essentially transforms QuadRail into conventional static CMOS, circuit state is still retained during standby. Thus, transitioning between sleep and active modes eliminates the need for any explicit state data transferring schemes.




Since the additional low-voltage supply is not required, series-regulated QuadRail is a self-contained methodology that can replace static CMOS operating from a regular, high-swing supply without mandating any system-level modifications.
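The figures quoted for the series-regulated MAC can be cross-checked with simple arithmetic. The sketch below takes the reported standby powers and the reported total power at the maximum and minimum measured clock speeds, and derives the standby ratio and the energy per clock cycle. The helper names are hypothetical, and treating power divided by clock frequency as energy per cycle is an assumption made here for illustration.

```python
import math

STANDBY_SERIES_W = 152.5e-9    # series-regulated QuadRail standby power
STANDBY_OFFCHIP_W = 143.8e-6   # off-chip regulated QuadRail standby power


def orders_of_magnitude(larger, smaller):
    """Base-10 logarithm of the ratio between two positive quantities."""
    return math.log10(larger / smaller)


def energy_per_cycle(power_w, clock_hz):
    """Average energy per clock cycle, assuming one operation per cycle."""
    return power_w / clock_hz


ratio = STANDBY_OFFCHIP_W / STANDBY_SERIES_W
print(f"standby ratio: {ratio:.0f}x "
      f"({orders_of_magnitude(STANDBY_OFFCHIP_W, STANDBY_SERIES_W):.2f} orders)")

# Total series-regulated MAC power at the fastest and slowest measured clocks.
for power_w, clock_hz in ((16.6e-3, 67e6), (2.06e-3, 23e6)):
    pj = energy_per_cycle(power_w, clock_hz) * 1e12
    print(f"{clock_hz / 1e6:.0f} MHz: {pj:.1f} pJ per cycle")
```

The standby ratio comes out a little under 1000, consistent with the "nearly three orders of magnitude" statement. The per-cycle energies are roughly 248 pJ at 67 MHz and roughly 90 pJ at 23 MHz for the whole MAC.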





FIG. 20 shows the static CMOS and QuadRail MAC die microphotographs. The off-chip regulated QuadRail MAC occupies about 10% larger layout area due to intrinsic cell-layout area penalty incurred by its dual-well requirement. Series-regulated QuadRail MAC incurs an additional 8% area penalty due to the on-chip decoupling capacitors.




The power-delay comparisons are extended over three additional commercial single-threshold processes: 0.35 μm CMOS, 0.25 μm FDSOI, and 0.16 μm CMOS, to study the impact of process scaling on energy/operation savings (FIGS. 21-23). Series-regulated QuadRail energy/operation savings increase with process scaling: up to 3.2× in 0.35 μm, 3.45× in 0.25 μm, and 3.8× in 0.16 μm processes. The 0.25 μm implementation's lowest energy/operation (at Vlogic=0.75V, Vbuffer=0.35V) is 6 pJ. This is nearly 3.3× lower than one of the lowest reported energy/operation implementations in the literature in a comparable multi-threshold 0.25 μm process. Since interconnect capacitance scales slower than gate capacitance with process scaling, the Wallace tree multiplier, because of its interconnect-dominated point-to-point net capacitances, becomes more and more power-critical. This, coupled with the increasing ratios of logic to buffer swings with process scaling, means that driving the multiplier's load capacitances at lower swings offers improved energy/operation savings. The savings increase even further with process scaling beyond our range of analysis.




To study the impact of series-regulated QuadRail on manufacturability, worst-case process and temperature corner analysis is performed across industrial Slow-NMOS-Slow-PMOS and Fast-NMOS-Fast-PMOS corners of the CMOS and QuadRail multipliers in the 0.5 μm process, shown in FIGS. 24 and 25. QuadRail demonstrates power*delay dispersions similar to those of CMOS at high voltages. With voltage scaling, the dispersion remains well controlled and at Vlogic=1.5V, Vbuffer=0.5V, the power*delay dispersion is 1.8× lower than CMOS, demonstrating improved low-voltage parametric yield. This is attributed to (i) the low-swing rails being dynamically offset across corners to maintain the target Ioff/Ion ratio, thereby significantly compensating for the manufacturing variations, and (ii) the reduced output swings of QuadRail gates causing the power and delay sensitivities to worst-case corners to be relatively lower than in static CMOS. Further control of electronic variations for both QuadRail and CMOS may be achieved through substrate/well back-biasing schemes.




In summary, up to 2.55× energy/operation savings were measured over static CMOS, while offering a simultaneous 1.8× low-voltage manufacturability improvement, without requiring any process or system-level modifications. Experimental results from three additional processes were also presented to show increased savings over static CMOS with process scaling.




The present invention may be utilized in many different devices, such as application specific integrated circuits, single-chip or multi-chip microprocessors, and special purpose microprocessors, such as a digital signal processor or a graphics processor.




The present invention also includes a method of operating a multiple power supply architecture, including controlling a power system for a circuit. The method includes providing a first power supply, providing a second power supply, connecting the first power supply to the second power supply for sleep mode, and disconnecting the first power supply from the second power supply for non-sleep mode. Connecting the power supplies may be accomplished by shorting the first and second power supplies together, such as with switches or power supplies, as discussed hereinabove. Similarly disconnecting the power supplies may be accomplished by opening a switch or transistor, or by using a power supply to produce a voltage between the first and second power supplies. The method may be used locally in a circuit or globally, as discussed hereinabove. For example, the method may be used in a circuit as described with regard to FIG. 6, such as by producing a signal indicative of a signal propagating through a critical path of at least one of the first and second circuits, and by controlling one of the first and second power supplies in response to the signal. That method may use a dummy critical path, or may utilize the actual critical path, as discussed hereinabove.
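The connect-for-sleep, disconnect-for-active behavior of the method can be pictured with a small behavioral model. The Python below is only a sketch under stated assumptions: the class and method names are hypothetical, the selective connectors are treated as ideal switches that short the second supply's rails to the first supply's rails, and the rail voltages in the usage lines are made-up example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Supply:
    voltage_rail: float    # upper rail, in volts
    reference_rail: float  # lower rail, in volts

    @property
    def swing(self):
        return self.voltage_rail - self.reference_rail


class PowerSystem:
    """Two supplies joined by selective connectors (modeled as ideal switches)."""

    def __init__(self, first, second_active):
        self.first = first
        self._second_active = second_active
        self.sleeping = False

    def enter_sleep(self):
        """Close the connectors: the second rails are shorted to the first rails."""
        self.sleeping = True

    def exit_sleep(self):
        """Open the connectors: the second supply returns to its own rails."""
        self.sleeping = False

    @property
    def second(self):
        if self.sleeping:
            return Supply(self.first.voltage_rail, self.first.reference_rail)
        return self._second_active


# Example values (assumed): a 2 V high-swing supply and a centered 0.5 V supply.
system = PowerSystem(Supply(2.0, 0.0), Supply(1.25, 0.75))
print(system.second.swing)   # low swing while active
system.enter_sleep()
print(system.second.swing)   # full swing while asleep
```

In this model the second circuit sees the full-swing rails during sleep, which mirrors the statement above that sleep mode essentially turns the low-swing logic into conventional static CMOS.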




Those of ordinary skill in the art will recognize that many modifications and variations of the present invention may be implemented. For example, although the invention has been described largely in terms of using at least two selective connectors 16, 18, the present invention may be utilized with only one selective connector or, in some embodiments, without any selective connectors. The foregoing description and the following claims are intended to cover all such modifications and variations.



Claims
  • 1. A power system, comprising: a first voltage rail; a first reference rail, wherein said first voltage rail and said first reference rail form a first power supply for powering a first circuit; a second voltage rail; a second reference rail, wherein said second voltage rail and said second reference rail form a second supply for powering a second circuit; and a first selective connector between said first and second voltage rails.
  • 2. The system of claim 1, further comprising a second selective connector between said first and second reference rails.
  • 3. A power system, comprising: a first voltage rail; a first reference rail; a second voltage rail; a second reference rail; a first selective connector between said first and second voltage rails; a second selective connector between said first and second reference rails; at least one additional voltage rail; at least one additional reference rail; at least one additional selective connector between said at least one additional voltage rail and at least one of said first and second voltage rails; and another at least one additional selective connector between said at least one additional reference rail and at least one of said first and second reference rails.
  • 4. The system of claim 1, wherein: said first voltage and reference rails form a first power supply; said second voltage and reference rails form a second power supply; and said first and second power supplies have voltage swings that are overlapping.
  • 5. The system of claim 4, wherein said first and second power supplies are centered.
  • 6. The system of claim 1, wherein: said first voltage and reference rails form a first power supply; said second voltage and reference rails form a second power supply; and said first and second power supplies have voltage swings that are not overlapping.
  • 7. The system of claim 1, wherein: said first voltage and reference rails form a first power supply having a first voltage swing; and said second voltage and reference rails form a second power supply having a second voltage swing, wherein said first voltage swing is greater than said second voltage swing.
  • 8. The system of claim 2, wherein said first and second selective connectors are selected from a group consisting of mechanical switches, transistors, and power supplies.
  • 9. A circuit comprising: a first circuit; a first voltage rail connected to said first circuit; a first reference rail connected to said first circuit; a second circuit; a second voltage rail connected to said second circuit; a second reference rail connected to said second circuit; and a first selective connector between said first and second voltage rails.
  • 10. The circuit of claim 9, further comprising a second selective connector between said first and second reference rails.
  • 11. A circuit, comprising: a first circuit; a first voltage rail connected to said first circuit; a first reference rail connected to said first circuit; a second circuit; a second voltage rail connected to said second circuit; a second reference rail connected to said second circuit; a first selective connector between said first and second voltage rails; at least one additional circuit; at least one additional voltage rail connected to said at least one additional circuit; at least one additional reference rail connected to said at least one additional circuit; at least one additional selective connector between said at least one additional voltage rail and at least one of said first and second voltage rails; and another at least one additional selective connector between said at least one additional reference rail and at least one of said first and second reference rails.
  • 12. The circuit of claim 9, wherein said first and second circuits form a CMOS circuit architecture.
  • 13. The circuit of claim 9, wherein: said first voltage and reference rails form a first power supply; said second voltage and reference rails form a second power supply; and said first and second power supplies have voltage swings that are overlapping.
  • 14. The circuit of claim 13, wherein said first and second power supplies are centered.
  • 15. The circuit of claim 9, wherein: said first voltage and reference rails form a first power supply; said second voltage and reference rails form a second power supply; and said first and second power supplies have voltage swings that are not overlapping.
  • 16. The circuit of claim 9, wherein: said first voltage and reference rails form a first power supply having a first voltage swing; and said second voltage and reference rails form a second power supply having a second voltage swing, wherein said first voltage swing is greater than said second voltage swing.
  • 17. The circuit of claim 10, wherein said first and second selective connectors are selected from a group consisting of mechanical switches, transistors, and power supplies.
  • 18. The circuit of claim 10, further comprising a controller connected to said second voltage rail, connected to said second reference rail, and responsive to a signal indicative of signal propagation through at least one of said first and second circuits.
  • 19. The circuit of claim 18, wherein said controller is directly connected to said second voltage rail and said second reference rail.
  • 20. The circuit of claim 18, wherein said controller is connected to said second voltage rail and said second reference rail via said first and second selective connectors.
  • 21. The circuit of claim 18, further comprising a dummy critical path connected to said controller.
  • 22. The circuit of claim 10, further comprising a controller responsive to a signal indicative of signal propagation through at least one of said first and second circuits, and having a first output terminal connected to said first selective controller and a second output terminal connected to said second selective controller.
  • 23. The circuit of claim 22, further comprising a dummy critical path connected to said controller.
US Referenced Citations (16)
Number Name Date Kind
4920284 Denda Apr 1990 A
4977335 Ogawa Dec 1990 A
5196743 Brooks Mar 1993 A
5206544 Chen et al. Apr 1993 A
5218247 Ito et al. Jun 1993 A
5266848 Nakagome et al. Nov 1993 A
5315173 Lee et al. May 1994 A
5399920 Van Tran Mar 1995 A
5442218 Seidel et al. Aug 1995 A
5448526 Horiguchi et al. Sep 1995 A
5604453 Pedersen Feb 1997 A
5659258 Tanabe et al. Aug 1997 A
5736869 Wei Apr 1998 A
5814845 Carley Sep 1998 A
5844441 Phoenix Dec 1998 A
6034400 Waggoner et al. Mar 2000 A
Foreign Referenced Citations (5)
Number Date Country
0116820 Aug 1984 EP
0381237 Aug 1990 EP
2073519 Oct 1981 GB
362029315 Feb 1987 JP
WO 8602201 Apr 1986 WO
Non-Patent Literature Citations (6)
Entry
R.K. Krishnamurthy et al., “Mixed Swing QuadRail: Exploring Multiple Voltage Swings for Low Energy/Operation Digital Circuits,” SRC Research Report C96538, Nov. 1996.
R.K. Krishnamurthy et al., “Static Power Driven Voltage Scaling and Delay Driven Buffer Sizing in Mixed Swing QuadRail for Sub-1V I/O Swings,” IEEE/ACM Intl. Symposium on Low Power Electronics & Design, Aug. 1996, pp. 381-386.
R.K. Krishnamurthy et al., “Exploring the Design Space of Mixed Swing QuadRail for Low Power Digital Circuits,” IEEE Trans. on VLSI Systems: Special Issue on Low Power Electronics & Design, vol. 5, No. 4, Dec. 1997.
L.R. Carley et al., “QuadRail: A Design Methodology for Low Power ICs,” Proc. NAPA Valley Workshop on Low Power Design, Apr. 1994.
Y. Nakagome et al., “Sub-1-V Swing Internal Bus Architecture for Future Low-Power ULSI's,” IEEE Journal of Solid State Circuits, vol. 28, No. 4, Apr. 1993, pp. 414-419.
A. Chandrakasan et al., “Low-Power CMOS Digital Design,” IEEE Journal of Solid State Circuits, vol. 27, No. 4, Apr. 1992, pp. 473-484.