The present invention is related to integrated circuits and more particularly to storage devices of integrated circuits.
In general, a decrease in power consumption of an integrated circuit included in portable applications or other target applications increases the battery life and may provide an advantage in the marketplace. Clock switching from global clock distribution, local clock distribution (e.g., Clock Tree Synthesis (CTS)), or synchronous devices (e.g., flip-flops) is a substantial source of integrated circuit power consumption. The latter components, e.g., CTS and flip-flop power consumption, are interrelated, since CTS is meant to distribute clock signals from the global distribution to all flip-flops in a physical area. However, data indicates that flip-flop power consumption dominates total integrated circuit power consumption in some applications. For example, flip-flops included in a processor core consume four times more power than CTS. In an exemplary Graphics Processing Unit (GPU), flip-flops consume three to three-and-a-half times more power than CTS. In some portions of the integrated circuit, the flip-flop power consumption is approximately the same as power consumption due to CTS. Accordingly, improved flip-flop topologies that consume less power are desired.
In at least one embodiment of the invention, an apparatus includes a clock node configured to receive a single-phase clock signal and an input node configured to receive an input signal. The apparatus includes a complementary input node configured to receive a complementary input signal that is complementary to the input signal. The apparatus further includes a first differential latch. The first differential latch includes a first pair of complementary devices including a first device of a first type and a second device of a second type and includes a second pair of complementary devices cross-coupled to the first pair of complementary devices. The second pair of complementary devices includes a third device of the first type and a fourth device of the second type. The first differential latch further includes a first pair of input devices including a fifth device of the first type and a sixth device of the first type and a second pair of input devices including a seventh device of the second type and an eighth device of the second type. The first pair of input devices and the second pair of input devices are configured to write an intermediate node with the complementary input signal and to write a complementary intermediate node with the input signal in response to a first state of the single-phase clock signal.
The apparatus may include a second differential latch connected to the clock node. The second differential latch may be complementary to the first differential latch and configured to update an output node and a complementary output node based on the intermediate node and the complementary intermediate node and in response to a second state of the single-phase clock signal. The first and second differential latches may be configured as an edge-triggered master-slave flip-flop. The edge-triggered master-slave flip-flop may not include a transmission gate. The edge-triggered master-slave flip-flop may operate using the single-phase clock signal and no additional phases of the clock signal. The edge-triggered master-slave flip-flop may include at most six transistors driven by the clock signal. The edge-triggered master-slave flip-flop may include only four transistors connected to the clock node.
In at least one embodiment of the invention, a method includes providing a first reference voltage to a first storage element. The method includes providing a second reference voltage to one of a first node of the first storage element and a complementary first node of the first storage element according to an input signal and a complementary input signal during a first state of a clock signal. The method includes writing the first node with the complementary input signal and writing the complementary first node with the input signal using the first reference voltage and the second reference voltage during the first state of the clock signal. The method may include providing the second reference voltage to a second storage element. The method may include providing the first reference voltage to one of a second node of the second storage element and a complementary second node of the second storage element according to an intermediate signal on the complementary first node and a complementary intermediate signal on the first node during a second state of the clock signal. The method may include writing the second node with the intermediate signal and writing the complementary second node with the complementary intermediate signal using the first reference voltage and the second reference voltage during the second state of the clock signal. The method may include providing the second reference voltage to the first storage element during the second state of the clock signal and providing the first reference voltage to the second storage element during the first state of the clock signal. The first storage element and the second storage element may be included in an edge-triggered master-slave flip-flop using the clock signal and no additional phases of the clock signal.
The present invention may be better understood, and its numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings.
The use of the same reference symbols in different drawings indicates similar or identical items.
A native edge-triggered master-slave flip-flop exploits native latch topologies to create an edge-triggered master-slave flip-flop that uses a single clock phase and has substantially reduced clock power consumption and substantially improved hold timing margin as compared to a conventional master-slave flip-flop. The native edge-triggered master-slave flip-flop is formed from a native active low latch (i.e., a native B-latch) and a native active high latch (i.e., a native A-latch) and includes, at most, six clocked transistors, i.e., transistors having a gate terminal coupled directly to a clock net (or clock terminal) of the latch. Those native latches may each be driven directly by a clock net or through an inverted version of the single clock phase, which reduces external capacitive loading. The native A-latch and the native B-latch use complementary circuit topologies. The native A-latch and the native B-latch may be cascaded together to form a native rising-edge-triggered master-slave flip-flop or a native falling-edge-triggered master-slave flip-flop. In at least one embodiment, each of the native latches includes two clocked transistors (one n-type transistor and one p-type transistor) and the native edge-triggered master-slave flip-flop includes only four clocked transistors driven directly by the clock net, with a total of twenty transistors. The reduced number of transistors in each native latch reduces wire loading of the clock net, area, and clock power consumption of each instantiation of a flip-flop as compared to a conventional edge-triggered master-slave flip-flop. The native edge-triggered master-slave flip-flop topology has reduced dynamic power consumption and low hold time requirements.
Referring to
In emerging manufacturing technologies (e.g., FinFET manufacturing technology), wire contributions to the overall load have increased significantly, may dominate the gate loading, and may increase internal power consumption of the conventional edge-triggered master-slave flip-flop. For example, in a conventional integrated circuit using standard master-slave flip-flops, the internal dynamic power consumption may range from 50% to as high as 80% of the local dynamic power consumption. Techniques that may reduce the power consumption of a flip-flop include multi-bit master-slave flip-flops. Referring to
Reduction of flip-flop power consumption may be achieved using a pulsed flip-flop technique. In a pulsed flip-flop, only a single latch (e.g., a single active high latch for rising edge operation and a single active low latch for falling-edge operation) is included in the flip-flop. However, to guarantee edge-triggered behavior with the latch, pulse-generator clock shaping circuitry is required. Referring to
Pulsed flip-flops consume substantially less clock power than a standard master-slave flip-flop or even the multi-bit master-slave flip-flop described above. However, pulsed flip-flops require a pulsed clock signal that is generated by a pulse-generator. That pulsed clock signal has a duty cycle that is skewed with respect to clock signal CLK to ensure that the hold time of the pulsed flip-flop is sufficiently small. Yet, the pulsed clock still needs to be wide enough to ensure that the latch is writable. That is, the pulsed clock, which is generated from clock signal CLK and has an extra insertion delay, may have an active pulse width that is up to 5 or 6 gate delays, which accounts for process variations. As a result, the hold time requirement of the pulsed flip-flop can be significantly greater than the hold time of a standard master-slave flip-flop or multi-bit flip-flop, which can heavily penalize an integrated circuit design for a target application. The pulsed flip-flop trades reduced dynamic power consumption for the cost of increased hold buffering.
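The hold-buffering cost described above can be illustrated with a rough calculation. The sketch below assumes that the hold requirement of a pulsed flip-flop is approximately equal to the active pulse width (the latch is transparent for that long) and that a hold violation is repaired by inserting delay buffers in the shortest data path. The function names and all numeric values (gate delay, path delay, buffer delay) are hypothetical and are not taken from this description.

```python
def pulse_width_ps(gate_delays, gate_delay_ps):
    """Active pulse width, expressed as a number of gate delays (hypothetical values)."""
    return gate_delays * gate_delay_ps


def hold_buffers_needed(hold_requirement_ps, shortest_path_ps, buffer_delay_ps):
    """Number of delay buffers needed so the shortest data path meets the hold requirement.

    Assumes each inserted buffer adds a fixed delay (a simplification).
    """
    shortfall = hold_requirement_ps - shortest_path_ps
    if shortfall <= 0:
        return 0  # the path is already slow enough; no hold fix is needed
    # Integer ceiling division: round the shortfall up to a whole number of buffers.
    return -(-shortfall // buffer_delay_ps)


# Illustration with assumed numbers: a 6-gate-delay pulse at 10 ps per gate,
# a 20 ps shortest path, and 15 ps of delay per hold buffer.
width = pulse_width_ps(6, 10)
buffers = hold_buffers_needed(width, 20, 15)
```

Under these assumed numbers, widening the pulse from 5 to 6 gate delays raises the estimated hold requirement by one gate delay, and the buffer count grows accordingly for every short path that terminates at a pulsed flip-flop.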
A native edge-triggered master-slave flip-flop topology reduces power consumption without the drawbacks of the schemes described above. The native edge-triggered master-slave flip-flop topology provides clock power reduction comparable to that of the pulsed flip-flop for small bank sizes (i.e., smaller multi-bit clusters) but does not have the hold time or writability overhead since the topology maintains a master-slave configuration. The native edge-triggered master-slave flip-flop topology eliminates a multi-phase clock requirement. The single-phase clocking reduces the wire loading on the clock net. In addition, the low-power master-slave flip-flop topology reduces the number of clocked transistors driven by the clock net in each instantiation of a flip-flop, thereby reducing the required integrated circuit area.
As referred to herein, a native circuit (i.e., a native latch or native edge-triggered master-slave flip-flop) is a circuit that can be driven directly by the clock net (i.e., clock terminal) of the latch, and the circuit topology guarantees appropriate behavior. The native edge-triggered master-slave flip-flop topology includes two native latches: one that operates as an active low latch with respect to a signal on the clock net (i.e., a native B-latch) and another that operates as an active high latch with respect to a signal on the clock net (i.e., a native A-latch).
An exemplary native rising-edge-triggered master-slave flip-flop is formed by coupling a native B-latch to receive input data. That native B-latch is configured to provide an intermediate signal to a native A-latch. Similarly, a native falling-edge-triggered master-slave flip-flop is formed by coupling a native A-latch to receive input data. That native A-latch is configured to provide an intermediate signal to a native B-latch. The native edge-triggered master-slave flip-flops each have relatively low clock net loading. Each of the latches in a native edge-triggered master-slave flip-flop includes no more than three clocked transistors (e.g., two n-type transistors and one p-type transistor in a native B-latch or one n-type transistor and two p-type transistors in a native A-latch), for a total of, at most, six clocked transistors. Each of the clocked transistors is driven directly from the clock net, which therefore has reduced wire loading and gate capacitance loading.
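The cascading of a native B-latch and a native A-latch can be sketched with a purely behavioral model. The sketch below is illustrative only: it ignores transistor-level detail, timing, and the complementary signal paths, and the class and function names (NativeLatch, MasterSlaveFlipFlop, rising_edge_ff, falling_edge_ff) are hypothetical and do not appear in this description.

```python
class NativeLatch:
    """Idealized level-sensitive latch: transparent while clk equals active_level."""

    def __init__(self, active_level, q=0):
        self.active_level = active_level  # 0 models a B-latch, 1 models an A-latch
        self.q = q

    def update(self, clk, d):
        if clk == self.active_level:
            self.q = d  # transparent: the stored value follows the input
        return self.q   # otherwise the latch holds its previous value


class MasterSlaveFlipFlop:
    """Two latches of opposite polarity sharing a single clock phase."""

    def __init__(self, master, slave):
        self.master = master
        self.slave = slave

    def update(self, clk, d):
        intermediate = self.master.update(clk, d)
        return self.slave.update(clk, intermediate)


def rising_edge_ff():
    # B-latch (active low) receives the data; A-latch (active high) drives the output.
    return MasterSlaveFlipFlop(NativeLatch(0), NativeLatch(1))


def falling_edge_ff():
    # A-latch (active high) receives the data; B-latch (active low) drives the output.
    return MasterSlaveFlipFlop(NativeLatch(1), NativeLatch(0))


# Each tuple is (clk, d). The output changes only when clk goes from 0 to 1.
ff = rising_edge_ff()
trace = [ff.update(clk, d) for clk, d in [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]]
# trace == [0, 1, 1, 1, 0]
```

In this model, swapping the order of the two latch polarities is the only difference between the rising-edge and falling-edge variants, and neither variant needs an inverted or delayed copy of the clock.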
Referring to
Native A-latch 500 has a circuit topology that is complementary to the circuit topology of native B-latch 400. N-type input devices 508 and 510 receive complementary versions of the input signal, input signal DIN and complementary input signal DX, respectively. A high state of clock signal CLK received by clocked device 516 causes one of n-type input devices 508 and 510 to write logic zero onto a corresponding node of intermediate node QF and complementary intermediate node QX of storage element 502, which includes two cross-coupled pairs of complementary devices. Input signal DIN and complementary input signal DX, which are mutually exclusive signals, cause one of p-type input devices 504 and 506 to provide a high voltage reference to storage element 502 to guarantee no write contention during the high state of clock signal CLK. Clocked devices 512 and 514 provide a high voltage reference during a low state of clock signal CLK. During the low state of clock signal CLK, input signal DIN and complementary input signal DX can change rapidly. Clocked devices 512 and 514 ensure a stable high voltage reference that prevents data stored in the latch from being altered during the low state of clock signal CLK. Positive feedback causes n-type devices of storage element 502 to switch the state of native A-latch 500.
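The write and hold behavior of native A-latch 500 can be summarized in a small logic-level sketch. It is a simplification under stated assumptions: the mapping of input devices to nodes (the device gated by DIN discharging QF, and the device gated by DX discharging QX) is assumed here so that the intermediate node carries the complementary input signal, as in the summary above; the function name a_latch_500_step is hypothetical; and device strengths, contention, and timing are not modeled.

```python
def a_latch_500_step(clk, din, qf, qx):
    """Return the next (QF, QX) values for one evaluation of the latch.

    clk, din, qf, and qx are logic levels (0 or 1). DX is derived as the
    complement of DIN because the two inputs are mutually exclusive.
    """
    dx = 1 - din
    if clk == 1:
        # High clock state: the clocked n-type device conducts, so the n-type
        # input device whose gate is high writes logic zero onto its node
        # (assumed mapping: DIN -> QF, DX -> QX). Positive feedback in the
        # cross-coupled storage element then drives the opposite node high.
        if din == 1:
            qf, qx = 0, 1
        elif dx == 1:
            qf, qx = 1, 0
    # Low clock state: the clocked p-type devices keep the high voltage
    # reference stable, so the stored values are unchanged whatever DIN does.
    return qf, qx


# Write DIN = 1 during the high clock state, then toggle DIN while the clock is low.
state = a_latch_500_step(1, 1, qf=1, qx=0)   # writes QF = 0, QX = 1
state = a_latch_500_step(0, 0, *state)       # clock low: value is held
```

The second call shows the property the paragraph emphasizes: changes on DIN and DX during the low clock state do not disturb the stored nodes.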
Referring to
Referring to
Referring to
Other embodiments of native latches, when configured in native edge-triggered master-slave flip-flops, can achieve dynamic power savings that may match or exceed those of pulsed flip-flops for small multi-bit clusters, without the associated hold time or writability overhead, and have a reduced transistor count as compared to B-latch 400, A-latch 500, B-latch 900, and A-latch 1000. Referring to
Referring to
The native edge-triggered master-slave flip-flops described herein may not only substantially reduce local dynamic power consumption as compared to conventional master-slave flip-flops, but also have reduced hold times and reduced area as compared to other reduced power consumption solutions (e.g., pulsed flip-flops). While circuits and physical structures have been generally presumed in describing embodiments of the invention, it is well recognized that in modern semiconductor design and fabrication, physical structures and circuits may be embodied in computer-readable descriptive form suitable for use in subsequent design, simulation, test or fabrication stages. Structures and functionality presented as discrete components in the exemplary configurations may be implemented as a combined structure or component. Various embodiments of the invention are contemplated to include circuits, systems of circuits, related methods, and tangible computer-readable media having encodings thereon (e.g., VHSIC Hardware Description Language (VHDL), Verilog, GDSII data, Electronic Design Interchange Format (EDIF), and/or Gerber file) of such circuits, systems, and methods, all as described herein. In addition, the computer-readable media may store instructions as well as data that can be used to implement the invention. The instructions/data may be related to hardware, software, firmware or combinations thereof.
The description of the invention set forth herein is illustrative, and is not intended to limit the scope of the invention as set forth in the following claims. For example, while the invention has been described in an embodiment functioning as a master-slave flip-flop, one of skill in the art will appreciate that the teachings herein can be utilized with other native A-latch or native B-latch configurations. Variations and modifications of the embodiments disclosed herein may be made based on the description set forth herein, without departing from the scope of the invention as set forth in the following claims.