The present disclosure relates to circuit structures designed to conserve power during operation. More specifically, the present disclosure concerns a circuit design that relies on multiple flip-flops to conserve power.
Flip-flops are common electronic circuit elements. Normally, a typical single flip-flop includes one internal clock buffer. The internal clock buffer typically includes two inverters and drives four switches.
In a traditional circuit design, the size of the two inverters was quite large to assure that the flip-flops meet the performance requirements for the particular circuit.
One advantage of using a large clock inverter is that the circuit designer may realize a significant increase in the drive capability of the circuit. Moreover, a large clock inverter may increase circuit speed.
Unfortunately, these advantages were offset by at least one drawback associated with the use of a large clock inverter. Specifically, large clock inverters increase the power consumption of the circuit, and this increase may be significant for the overall system implementation. In circuits where power consumption is to be minimized, this may present an obstacle to practical implementation of traditional flip-flop circuits.
It is, therefore, one aspect of the invention to provide a circuit design, incorporating multiple flip-flops, where power consumption is made more efficient.
In other words, the invention provides for a circuit design that incorporates flip-flops but does not also include large clock inverters that consume undesirable amounts of power.
The invention will be described in connection with several figures in which:
Several embodiments of the invention will now be described in connection with the drawings appended hereto. As should be immediately apparent to those skilled in the art, there are numerous variations and equivalents of the enumerated embodiments that may be employed without departing from the scope of the invention. The invention is intended to encompass any variations and equivalents that would be appreciated by those of ordinary skill in the art upon reading and understanding this disclosure.
As illustrated in
The clock buffer 12 includes two inverters 16, 18 connected to one another in series. The inverter 16 receives an external clock signal CK. The inverter 16 then inverts the clock signal CK to generate a clock signal CKi, which follows two paths. In a first path, the clock signal CKi is provided to the inverter 18. In a second path, the clock signal CKi is provided to two switches 20, 26. In response to receipt of the clock signal CKi, the inverter 18 inverts it and generates a complementary clock signal, which is provided to two switches 22, 24.
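The behavior of the two series inverters can be illustrated with a minimal logic-level sketch. It assumes ideal, zero-delay inverters and single-bit signals; the function name `clock_buffer` and the variable names `cki` and `cki_b` are hypothetical labels chosen for this illustration and do not appear in the disclosure.

```python
def clock_buffer(ck):
    """Model the two series inverters (16 and 18) as ideal logic gates.

    ck is the external clock level, 0 or 1. Returns the two internal
    clock phases: the output of inverter 16 and the output of inverter 18.
    """
    cki = 1 - ck       # inverter 16 inverts the external clock CK
    cki_b = 1 - cki    # inverter 18 inverts the output of inverter 16
    return cki, cki_b


if __name__ == "__main__":
    for ck in (0, 1):
        cki, cki_b = clock_buffer(ck)
        # One phase goes to switches 20, 26 and the other to switches 22, 24.
        print(f"CK={ck} -> phase to 20,26: {cki}, phase to 22,24: {cki_b}")
```

In this idealized model the two phases are always complementary, which is why one pair of switches is driven while the other pair is not.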
As shown in
As shown in
As also shown in
As noted above, one advantage to this flip-flop approach is that the clock inverters 16, 18 may be powered such that there is a significant gain and also a significant speed for the flip-flop circuit 10. A drawback for this implementation of flip-flop circuit 10 is that the clock inverters 16, 18 are large and, therefore, consume a proportionately large amount of power for the flip-flop circuit 10.
Today, process technology has extended into deep sub-micron levels for integrated circuits (i.e., <90 nm). As should be appreciated by those skilled in the art, transistor size is made increasingly smaller as manufacturers produce circuits at this sub-micron level (or below). The size of the transistors created at this level places a limit on the size of the clock inverters in a flop. In other words, as transistors are minimized in size, so too are the associated clock inverters in a flop. Small size, however, does not necessarily result in decreased performance or capability. Minimally-sized inverters are capable of driving more than four switches without losing speed performance (e.g., maximum frequency, transition time, setup time, hold time, etc.). In other words, it is possible to connect two identical flops together so that both are driven by only one internal clock buffer (made up of two inverters). In so doing, it is possible to save the power that would otherwise be consumed by the two inverters of the second flop's internal clock buffer in the traditional circuit design.
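The bookkeeping behind this saving can be sketched as follows. The sketch only counts components, using the figures given in this disclosure (two inverters per clock buffer, four switches per flop). It does not model power, and the function name `shared_buffer_budget` is a hypothetical label for this illustration.

```python
def shared_buffer_budget(num_flops, flops_per_buffer):
    """Count clock inverters and switch load when flops share clock buffers.

    Assumes two inverters per buffer and four clocked switches per flop.
    """
    # Ceiling division: a partially filled group still needs its own buffer.
    buffers = -(-num_flops // flops_per_buffer)
    return {
        "clock_inverters": 2 * buffers,
        "max_switches_per_buffer": 4 * min(num_flops, flops_per_buffer),
    }


if __name__ == "__main__":
    print("two single flops:", shared_buffer_budget(2, 1))
    print("one dual flop:   ", shared_buffer_budget(2, 2))
```

Under these assumptions, sharing one buffer between two flops halves the clock inverter count while doubling the number of switches that each buffer must drive.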
In a practical design, when two flops (comprising eight switches) are connected together, it may be necessary to marginally adjust the second clock inverter size to assure that the dual flip-flop has a similar performance to a single flop.
As illustrated in
The clock signal CKi is provided to switches 48, 54, 56, and 62. Its complement, in turn, is provided to switches 50, 52, 58, and 60.
As shown in
As also shown in
With respect to the flip-flop 40, the data D8 passes through the switch 56 to the inverter 72. The inverter 72 produces a transformed data signal D9 that takes one of two possible routes. In a first route, the data D9 proceeds to the inverter 74. At the inverter 74, the data D9 is transformed into data D10. The data D10 rejoins the data stream and provides input to the inverter 72 at a point upstream of the inverter 72 but downstream of the switch 56.
As also shown in
As is immediately apparent from
Experiments indicate that, for both 90 nm and 65 nm process technology, when two flops 38, 40 are combined together with only two minimum size clock inverters 44, 46, it is possible to create a dual flip-flop circuit 36 that performs similarly to a single flop. Specifically, the dual flip-flop circuit 36 operates at the same speed or almost the same speed as a single flop. Moreover, internal switching power consumption is reduced by about 15% to 25%, as compared with the power consumption of two single flip-flops.
Using a dual flop circuit 36 instead of two single flops in a large design is contemplated to result in significant power savings for the device's clock tree since the sink number for the clock tree is reduced by at least a factor of two (or more). The reduced clock tree load permits reliance on a smaller number of clock tree buffers. In one contemplated embodiment, the number of clock tree buffers may be reduced by half (or more) based on a comparison with prior art designs.
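As a rough illustration of these figures, the sketch below estimates the clock tree sink count and the relative internal switching power for a design in which flops are paired into dual flop cells. It assumes that every pair achieves a saving somewhere in the 15% to 25% range reported above and that an unpaired flop saves nothing. Applying that per-pair range uniformly across a design is an assumption of this sketch, and the name `dual_flop_estimate` is hypothetical.

```python
def dual_flop_estimate(num_flops, saving_low=0.15, saving_high=0.25):
    """Estimate sinks and relative internal switching power after pairing.

    Returns (sinks, best_case_power, worst_case_power), where the power
    values are relative to a design built entirely from single flops (1.0).
    """
    if num_flops <= 0:
        raise ValueError("num_flops must be positive")
    pairs, leftover = divmod(num_flops, 2)
    sinks = pairs + leftover  # one clock sink per dual cell or leftover flop
    paired_fraction = (2 * pairs) / num_flops
    best = 1.0 - saving_high * paired_fraction
    worst = 1.0 - saving_low * paired_fraction
    return sinks, best, worst


if __name__ == "__main__":
    sinks, best, worst = dual_flop_estimate(1000)
    print(f"sinks: {sinks}, relative power: {best:.2f} to {worst:.2f}")
```

The sink count falls by a factor of two when every flop can be paired, which corresponds to the reduced clock tree load described above.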
As illustrated in
As also shown in
As is immediately apparent from
In further embodiments contemplated by this disclosure four or more flip-flops may be associated with a single clock buffer.
In addition, it is contemplated that in one or more of the embodiments described herein, one or both of the clock inverters 44, 46 may be sized to present a minimal aspect. When so sized, the clock inverter 44, 46 may be sized to consume a minimal amount of power. Alternatively, the clock inverter 44, 46 may be made as small in physical size as may be created for the process technology employed (e.g., <90 nm). Other variations should be apparent to those skilled in the art.
In keeping with these variations, only one of the clock inverters 44 or 46 may be sized to present a minimal aspect. This may include only the inverter 44 or the inverter 46. In another contemplated embodiment, both of the clock inverters 44 and 46 may be sized to present a minimal aspect.
Embedding in an ASIC Processor
In a typical Application-Specific Integrated Circuit (“ASIC”) design flow, a complex circuit is designed using a high-level hardware specification language such as VHSIC (Very High Speed Integrated Circuit) Hardware Description Language (“VHDL”) or Verilog (a hardware description language (“HDL”)). A synthesis tool converts this design into a gate-level netlist. These gates then undergo further processing, which results in a final mask that can be used to manufacture chips that implement the complex circuit.
When converting from the high-level specification to a gate-level netlist, the synthesis tool selects combinations of gates to implement the behavior specified using the high-level language. The gates are selected from a library that specifies the available gates and their functional behavior, size, speed, and power. The synthesis tool is given a set of constraints, i.e., targets for total size, speed, power, etc. The synthesis tool selects, from the available library elements, the combinations of gates that yield the same function as specified at the high level and that best meet the targets.
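The selection step can be pictured with a deliberately simplified sketch. It assumes a library in which each element records a function, an area, a delay, and a power figure, and it picks, for a single gate, the lowest-power element that meets a delay target. All cell names and numbers below are hypothetical, and an actual synthesis tool optimizes across the whole netlist rather than one gate at a time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str        # hypothetical library element name
    function: str    # functional behavior, e.g. "dff" or "inv"
    area: float      # arbitrary units
    delay: float     # arbitrary units
    power: float     # arbitrary units


def pick_cell(library, function, max_delay):
    """Return the lowest-power cell with the given function that meets
    the delay target, breaking ties by area; None if no cell qualifies."""
    candidates = [
        c for c in library
        if c.function == function and c.delay <= max_delay
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: (c.power, c.area))


# Hypothetical library contents, for illustration only.
LIBRARY = [
    Cell("DFF_X1", "dff", area=4.0, delay=0.30, power=1.0),
    Cell("DFF_X2", "dff", area=5.5, delay=0.22, power=1.6),
    Cell("INV_X1", "inv", area=1.0, delay=0.05, power=0.2),
]

if __name__ == "__main__":
    print(pick_cell(LIBRARY, "dff", max_delay=0.35))
    print(pick_cell(LIBRARY, "dff", max_delay=0.25))
```

With a relaxed delay target the lower-power element is chosen, and a tighter target forces the faster, higher-power element.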
This library, and equivalent other databases, provide information to other tools used in the ASIC design flow. An element whose information has been added to all databases used in the ASIC flow is called a cell.
For dual and multi flip-flops to be useful in an ASIC flow, it is contemplated that they are available as cells. In other words, they are provided as library elements so that the synthesis tool may select them, and information about dual and multi flip-flops is added to all databases, so that all tools involved in the flow may operate on them.
Typically, a cell is made available in different versions with different design strengths, which have identical function but have different performance, size and power characteristics.
With reference to
In variations of the cells, the flip-flops 38, 40, 82 may be available as cells in a standard ASIC flow. Similarly, the flip-flops 38, 40, 82 and the clock buffer 42 may be available as a cell or as plural cells in a standard ASIC flow. As noted above, one or both of the inverters 44, 46 in the clock buffer 42 may be sized to present a minimal aspect.
As noted above, several embodiments of the invention have been described. There are numerous variations and equivalents of the enumerated embodiments that may be employed without departing from the scope of the invention, as recited by the claims appended hereto. The invention is intended to encompass those variations and equivalents.
The present application is the U.S. National Phase of International Application PCT/US2009/043175, filed May 7, 2009, which claims priority benefit under 35 U.S.C. §119(e) to U.S. Provisional Patent Application Ser. No. 61/056,195, filed on May 27, 2008, the contents of which are incorporated herein by reference.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2009/043175 | 5/7/2009 | WO | 00 | 7/1/2011 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2009/146241 | 12/3/2009 | WO | A |
Number | Date | Country | |
---|---|---|---|
20110254588 A1 | Oct 2011 | US |
Number | Date | Country | |
---|---|---|---|
61056195 | May 2008 | US |