Embodiments of the present invention relate to the field of integrated circuit design and operation. More specifically, embodiments of the present invention relate to systems and methods for pipelined one cycle throughput for single-port 6T RAMs.
A read access from a small-signal, differential memory, e.g., a static random access memory (SRAM), generally comprises three operations. A first operation, referred to as “development,” applies a voltage differential to the sense nodes of a sense amplifier. A second operation, referred to as “evaluate,” amplifies the small voltage differential on the sense nodes into a full swing, e.g., “rail to rail,” signal to determine the value of the memory cell. A third operation, referred to as “precharge,” charges the sense nodes so that they are ready for a subsequent access.
Often, under the conventional art, the development and evaluate operations are timed using a “replica,” e.g., delay, circuit. A replica circuit generally comprises a bit line with the same load as the functional bit lines, with the circuit designed so that it always discharges. This replica can be in addition to, or instead of, a delay chain, e.g., of inverters. The replica is not used to store data; rather, it is used to track various delays of a memory circuit. Because the replica circuit is formed on the same die as the memory circuits, there is a degree of correspondence between the analog characteristics, e.g., capacitance, threshold voltage, static and dynamic leakage, switching rate and the like, of a replica and “real” memory circuits. For example, a replica circuit may track changes in operating conditions, e.g., Vdd and/or operating temperature, as well as global changes in process variation.
Unfortunately, the replica circuit and the real memory circuits are not identical. For example, a replica circuit generally does not track local process variations, e.g., statistical variations in dopant density, that may cause timing differences between a replica and “real” memory circuits. Consequently, a replica timer is usually designed to be slower than a mirrored memory circuit. In addition, there is usually some variation among memory cells and sense amplifiers within a memory array. Accordingly, a replica timer must be designed to leave timing margin to allow for the slowest memory cells and sense amplifiers to complete their operations. The accumulation of timing margins to allow for worst case differences between replicas and actual memory, and to allow for the slowest memory cells and sense amplifiers, typically results in memory accesses, e.g., reads and/or writes, that are slower than necessary for most memory cells.
An alternative to a replica circuit is to use a separate clock phase for each operation, e.g., one phase for development, one phase for evaluate and one phase for precharge. Thus three clock phases, one and one half clock cycles, are required to complete all three operations. In addition, conventional-art phase-based designs typically add an extra clock phase to align the memory operations with the same clock phase, e.g., a rising edge. Accordingly, the conventional art typically utilizes four clock phases, two clock cycles, to complete the three operations, further slowing memory throughput under the conventional art.
Therefore, what is needed are systems and methods for pipelined one cycle throughput for single-port 6T RAMs. What is additionally needed are systems and methods for pipelined one cycle throughput for single-port 6T RAMs that detect the completion of an evaluate operation. A further need exists for systems and methods for pipelined one cycle throughput for single-port 6T RAMs that allow for single cycle throughput on consecutive memory accesses. A still further need exists for systems and methods for pipelined one cycle throughput for single-port 6T RAMs that are compatible and complementary with existing systems and methods of integrated circuit design, manufacturing and test. Embodiments of the present invention provide these advantages.
In accordance with a first embodiment of the present invention, an electronic circuit is configured to perform consecutive read accesses using one sense amplifier. The electronic circuit includes circuitry configured to precharge sense nodes of the sense amplifier, and circuitry configured to develop the sense nodes. The electronic circuit also includes circuitry configured to sense the sense nodes to read a first bit, and circuitry configured to detect a completion of an evaluate operation on the sense nodes. The consecutive read accesses may be conducted with single cycle throughput of a synchronizing clock signal. The circuitry configured to detect a completion of an evaluate operation on the sense nodes may include a three state latch.
In accordance with another embodiment of the present invention, an electronic circuit includes a sense amplifier circuit coupled to a sense node and an inverted sense node. The electronic circuit also includes a precharge circuit coupled to the sense node and to the inverted sense node, and a bridge transistor configured to selectively couple the sense node to the inverted sense node. The electronic circuit further includes a sense amplifier enable transistor configured to selectively enable the sense amplifier circuit, and a self-timing circuit coupled to the sense node and to the inverted sense node. The self-timing circuit is configured to turn off the sense amplifier responsive to a completion of an evaluate operation by the sense amplifier.
In accordance with a method embodiment of the present invention, a method of performing consecutive read accesses using one sense amplifier includes, for a first read access, first precharging the sense nodes, first developing the sense nodes, and evaluating the sense nodes to read a first bit. Responsive to completion of evaluating the sense nodes to read the first bit, the method continues with second precharging the sense nodes, second developing the sense nodes, and evaluating the sense nodes to read a second bit.
The accompanying drawings, which are incorporated in and form a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention. Unless otherwise noted, the drawings are not drawn to scale.
Reference will now be made in detail to various embodiments of the present invention, examples of which are illustrated in the accompanying drawings. While the invention will be described in conjunction with these embodiments, it is understood that they are not intended to limit the invention to these embodiments. On the contrary, the invention is intended to cover alternatives, modifications and equivalents, which may be included within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description of the invention, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be recognized by one of ordinary skill in the art that the invention may be practiced without these specific details. In other instances, well known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the invention.
Pipelined One Cycle Throughput for Single-Port 6T RAMs
It is to be appreciated that the term “three state” as used herein does not refer to, and is not analogous to the term “Tri-state®,” a registered trademark of Texas Instruments, Inc., of Dallas, Tex. As is known to those of skill in the art, a Tri-state® device includes conventional “high” and “low” outputs, as well as a high impedance, or “hi-Z,” output state. Embodiments in accordance with the present invention store three (or more) states in a single latch.
If all inputs 131 A, 132 B and 133 C are set to one, then the output of latch 100 will retain the state it had last, as indicated by the last row of truth table 150. The “star” notation, e.g., “X*,” indicates the previous state of the output signal line. For example, if inputs 131 A and 132 B are set to one, and input 133 C is set to zero, outputs 121 X and 122 Y will be zero, and output 123 Z will be set to one. Changing input 133 C from zero to one will result in all inputs set to one, and the outputs will retain their previous state. In this example, outputs 121 X and 122 Y will remain zero, and output 123 Z will remain one. In accordance with embodiments of the present invention, whichever input is the last to transition from zero to one will have its output remain one.
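By way of illustration only, the hold behavior described above may be modeled in software. The following minimal Python sketch assumes that each output is produced by an OR-AND-invert function of its own input and the other two outputs, e.g., X = not(A and (Y or Z)); this gate equation is an inference from the OR-AND-invert structure discussed in this description and is not taken from truth table 150. The names `latch_step` and `settle` are hypothetical, and iterating all three gates together until the outputs stop changing is a modeling convenience rather than a representation of actual gate timing.

```python
def latch_step(inputs, outputs):
    """Evaluate the three assumed OR-AND-invert gates once."""
    a, b, c = inputs
    x, y, z = outputs
    return (
        int(not (a and (y or z))),  # X: input A gated by outputs Y and Z
        int(not (b and (x or z))),  # Y: input B gated by outputs X and Z
        int(not (c and (x or y))),  # Z: input C gated by outputs X and Y
    )


def settle(inputs, outputs, max_iters=8):
    """Repeat latch_step until the outputs no longer change."""
    for _ in range(max_iters):
        nxt = latch_step(inputs, outputs)
        if nxt == outputs:
            return outputs
        outputs = nxt
    raise RuntimeError("latch did not settle")


# A = B = 1, C = 0: outputs X and Y go to zero, output Z goes to one.
state = settle((1, 1, 0), (1, 0, 0))
print(state)  # (0, 0, 1)

# C returns to one: all inputs are one and the outputs are retained.
print(settle((1, 1, 1), state))  # (0, 0, 1)
```

In this model, lowering input A instead and then raising it again leaves output X at one, consistent with the statement that the last input to transition from zero to one has its output remain one.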
It is appreciated that embodiments in accordance with the present invention offer several advantages in comparison to a three state circuit based on multiple conventional, e.g., two-state, latches. For example, there are no transitory states. In addition, embodiments in accordance with the present invention may operate asynchronously, e.g., with unclocked handshaking signals. Further, embodiments in accordance with the present invention generally require fewer gates, less die area and are thus less expensive in comparison to the conventional art. Still further, embodiments in accordance with the present invention will generally operate faster, e.g., with fewer gate delays, than under the conventional art. For example, in accordance with embodiments of the present invention, the worst case delay from input to output is two gate delays.
It is appreciated that three state latch 100 (
Latch 100 of
Accordingly, embodiments in accordance with the present invention may utilize an OAI gate structure, e.g., OAI gate 199, or an AOI gate structure. However, the schematic representations presented herein illustrate the logical function of the separate gates. For example, all inputs of OAI gate 199 do not have the same logical function, and hence schematics utilizing the logical function of the separate gates represent a preferred approach to illustrate aspects of the present invention. With reference to
In addition, in accordance with embodiments of the present invention, latches with an arbitrary number of inputs may be formed by “widening” the first part of the gate, e.g., the OR gate in the exemplary OAI gate structure. For example, to form a four-input latch, the OR gates of
As illustrated in three state latch 100 of
In addition, handshaking sense amplifier electronic circuit 200 comprises three P-type metal oxide semiconductor (PMOS) devices for precharging sense nodes SEN 254 and SENB 255. Responsive to a signal 253 “precharge bar” (inverted precharge), PMOS device 222 couples the sense node SEN 254 and the inverted sense node SENB 255. Responsive to the same precharge 253 signal, PMOS devices 221 and 223 pull the sense node and the inverted sense node high, to precharge the sense nodes.
Handshaking sense amplifier electronic circuit 200 also comprises a pair of cross coupled inverters configured to function as a sense amplifier. A first inverter, comprising PMOS device 231 and NMOS device 232 accepts a “sense bar” SENB 255 signal as input. A second inverter, comprising PMOS device 234 and NMOS device 235 accepts a “sense” SEN 254 signal as input. NMOS device 233 functions as an enable device for the sense amplifier, responsive to a “sense amplifier enable” SAE signal 252.
Handshaking sense amplifier electronic circuit 200 further comprises a three state latch, e.g., three state latch 100 as previously described with respect to
In accordance with embodiments of the present invention, use of the novel three state latch enables an “unset” state that the latch enters during the evaluate operation. The three state latch will stay in “unset” until evaluate finishes, and the latch is in either of the “set to zero” or “set to one” states. This allows the evaluate state to be self timed by a handshake. Development keeps an entire phase of the clock, while evaluate and precharge share the other phase.
Synchronizing clock signal, CLK 251, is a periodic clock signal. Sense amplifier enable, SAE 252, is the logical AND of DONEB 256 and CLK 251. Precharge bar, PCB 253, is the logical OR of DONEB 256 and the inverse of CLK 251. The done inverted signal, DONEB 256, is the logical OR of the inverse of CLK 251 with the logical AND of the inverse of sense latched, not(SENL 258), and the inverse of sense bar latched, not(SENBL 257). The sense bar latched signal, SENBL 257, is the logical OR of the inverse of sense, not(SEN 254), with the logical AND of the inverse of done inverted, not(DONEB 256), and the inverse of sense latched, not(SENL 258). The sense latched signal, SENL 258, is the logical OR of the inverse of sense bar, not(SENB 255), with the logical AND of the inverse of done inverted, not(DONEB 256), and the inverse of sense bar latched, not(SENBL 257).
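For convenience, the foregoing signal relationships may be written compactly in Boolean notation. The following restatement uses the signal names without their reference numerals and takes the third latch output to be the sense latched signal, SENL 258, as named in connection with timing diagram 300; it adds no relationship beyond those recited above.

```latex
\begin{align*}
\mathrm{SAE}   &= \mathrm{DONEB} \land \mathrm{CLK} \\
\mathrm{PCB}   &= \mathrm{DONEB} \lor \lnot\mathrm{CLK} \\
\mathrm{DONEB} &= \lnot\mathrm{CLK} \lor (\lnot\mathrm{SENL} \land \lnot\mathrm{SENBL}) \\
\mathrm{SENBL} &= \lnot\mathrm{SEN} \lor (\lnot\mathrm{DONEB} \land \lnot\mathrm{SENL}) \\
\mathrm{SENL}  &= \lnot\mathrm{SENB} \lor (\lnot\mathrm{DONEB} \land \lnot\mathrm{SENBL})
\end{align*}
```

By De Morgan's laws, each of the last three expressions is equivalent to an OR-AND-invert form, e.g., DONEB = not(CLK and (SENL or SENBL)).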
Timing diagram 300 illustrates the timing relationship among a synchronizing clock signal, CLK 251, sense amplifier enable, SAE 252, precharge inverted (bar), PCB 253, sense, SEN 254, sense inverted (bar), SENB 255, done inverted (bar), DONEB 256, sense bar latched, SENBL 257, and sense latched, SENL 258.
SAE 252 and PCB 253 depend on CLK 251 and DONEB 256. All signals depend on CLK 251, with the set and reset of SAE 252 both coming from the rising edge of CLK 251. SAE 252 sets immediately after CLK 251 rise, and does not reset until DONEB 256 goes low (which indicates that the latch has been set).
Edges 302, 303, 304, 305, 306a, 306b and 307 are generated off of the rising edge 301 of CLK 251. The “a” and “b” notation, e.g., of edges 306a and 306b, indicates that these edges occur substantially in parallel. Edges 312a, 312b and 313 are generated off of the falling edge 311 of CLK 251.
Edges 322, 323, 324, 325, 326a, 326b and 327 are generated off of the rising edge 321 of CLK 251. Edges 332a, 332b and 333 are generated off of the falling edge 331 of CLK 251.
It is appreciated that when CLK 251 has a rising edge, all inputs to the three state latch, e.g., CLK 251, SEN 254 and SENB 255, are one and the three state latch retains its last state. The development operation occurs while CLK 251 is low. The evaluate and precharge operations take place while CLK 251 is high.
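By way of illustration only, one read cycle may be walked through in software using the signal relationships recited for SAE 252, PCB 253, DONEB 256, SENBL 257 and SENL 258. The following minimal Python sketch rests on several assumptions that are not part of this description: the sense nodes are treated as logic values (the small development differential is not modeled), reading a one is assumed to pull SENB 255 low while reading a zero pulls SEN 254 low, and the names `settle_latch` and `read_cycle` are hypothetical.

```python
def settle_latch(clk, sen, senb, state):
    """Iterate the latch equations until (DONEB, SENBL, SENL) stops changing."""
    for _ in range(8):
        doneb, senbl, senl = state
        nxt = (
            int((not clk) or ((not senl) and (not senbl))),    # DONEB
            int((not sen) or ((not doneb) and (not senl))),    # SENBL
            int((not senb) or ((not doneb) and (not senbl))),  # SENL
        )
        if nxt == state:
            return state
        state = nxt
    raise RuntimeError("latch did not settle")


def read_cycle(bit):
    """Return (label, CLK, SAE, PCB, DONEB, SENBL, SENL) rows for one read."""
    rows = []
    state = (1, 0, 0)  # assumed starting point: the "unset" state

    def step(label, clk, sen, senb):
        nonlocal state
        state = settle_latch(clk, sen, senb, state)
        doneb, senbl, senl = state
        sae = int(doneb and clk)       # SAE = DONEB AND CLK
        pcb = int(doneb or (not clk))  # PCB = DONEB OR not(CLK)
        rows.append((label, clk, sae, pcb, doneb, senbl, senl))

    step("development", 0, 1, 1)    # CLK low, sense nodes near precharge level
    step("clock rises", 1, 1, 1)    # all latch inputs one: state is retained
    # Assumed polarity: reading a 1 pulls SENB low; reading a 0 pulls SEN low.
    sen, senb = (1, 0) if bit else (0, 1)
    step("evaluate done", 1, sen, senb)
    step("precharge", 1, 1, 1)      # nodes restored high while CLK is still high
    step("clock falls", 0, 1, 1)
    return rows


for row in read_cycle(1):
    print(row)
```

In this model SAE 252 is high only between the rising clock edge and the fall of DONEB 256, PCB 253 is low from that point until the falling clock edge, and the latch returns to the unset state once CLK 251 is low.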
For example, with respect to
In accordance with embodiments of the present invention, the timing of the handshaking sense amplifier is based on a clock signal, CLK 251, and a handshake signal, DONEB 256. It is appreciated that the self-timing handshake signal DONEB 256 will generally occur prior to a subsequent phase of CLK 251. In contrast, absent such a handshake signal, the conventional art typically requires at least one clock phase for each of the three operations, e.g., one phase for development, one phase for evaluate and one phase for precharge. Thus three clock phases, one and one half clock cycles, are required to complete all three operations. In addition, conventional-art phase-based designs typically add an extra clock phase to align the memory operations with the same clock phase, e.g., a rising edge. Accordingly, the conventional art typically utilizes four clock phases, two clock cycles, to complete the three operations. Embodiments in accordance with the present invention, by comparison, are able to complete the three operations of development, evaluate and precharge in a single clock cycle.
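The phase counts discussed above may be tallied as a short worked example. The following Python sketch only restates that arithmetic; the constant and function names are hypothetical, and two phases per clock cycle is the only assumption.

```python
PHASES_PER_CYCLE = 2  # one high phase and one low phase per clock cycle


def cycles_per_read(phases):
    """Convert a number of clock phases into clock cycles."""
    return phases / PHASES_PER_CYCLE


three_phase = cycles_per_read(3)      # development, evaluate, precharge
aligned = cycles_per_read(3 + 1)      # plus one phase for edge alignment
self_timed = cycles_per_read(2)       # development in one phase; evaluate
                                      # and precharge share the other

print(three_phase, aligned, self_timed)  # 1.5 2.0 1.0
```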
Accordingly, fewer clock cycles and less timing margin are required in comparison to a conventional art, replica-based design. During an evaluation of the integrated circuit design, the frequency of each integrated circuit may be increased until the circuit fails, in order to determine an operating frequency. Moreover, there is no need to reprogram a replica timer in order to adjust a speed of memory operations, as may be the case under the conventional art.
In addition, each sense amplifier may find its own tradeoff between evaluate and precharge timings. If reading a weak cell, evaluate may take longer, but that time may be “borrowed” from a precharge time. For example, the operating frequency is limited only in a case in which the sense amplifier has both weak evaluate (due to the cell being read or the sense amplifier itself) and weak precharge. Without requiring such extra timing margin, a higher performance design may be enabled by embodiments in accordance with the present invention.
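The time borrowing described above may be illustrated numerically. The following Python sketch is a simplified model under stated assumptions: the evaluate and precharge operations are taken to share a single clock phase, a hypothetical fixed-timing scheme is used for comparison in which each operation receives its own fixed budget, and all durations (in picoseconds) are invented solely for illustration.

```python
def meets_fixed_split(t_eval, t_pre, eval_budget, pre_budget):
    """Fixed budgets: each operation must fit within its own window."""
    return t_eval <= eval_budget and t_pre <= pre_budget


def meets_self_timed(t_eval, t_pre, shared_phase):
    """Self-timed: precharge starts when evaluate completes, so only the
    sum of the two durations must fit within the shared phase."""
    return t_eval + t_pre <= shared_phase


SHARED_PHASE = 500               # hypothetical phase length, ps
EVAL_BUDGET = PRE_BUDGET = 250   # hypothetical fixed split of that phase

# Weak evaluate, strong precharge: evaluate borrows time from precharge.
print(meets_fixed_split(320, 150, EVAL_BUDGET, PRE_BUDGET))  # False
print(meets_self_timed(320, 150, SHARED_PHASE))              # True

# Weak evaluate and weak precharge: the frequency is limited either way.
print(meets_self_timed(320, 220, SHARED_PHASE))              # False
```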
It is to be further appreciated that embodiments in accordance with the present invention have less requirement for a 50% duty cycle clock signal in comparison to the conventional art. For example, few state changes take place during a low phase of CLK 251, and such state changes are triggered off the falling edge 311 of CLK 251. As is known to those of ordinary skill in the art, it is difficult to obtain a 50% duty cycle clock signal, and it is additionally difficult to propagate and distribute such a clock signal across an integrated circuit. Accordingly, embodiments in accordance with the present invention are more tolerant of duty cycles that are not close to 50%, and more tolerant of duty cycle degradation in distribution, in comparison to the conventional art.
In a first read access 510 read access 0, responsive to the rising edge of a clock signal 501, e.g., CLK 251 of
In a second, pipelined read access 520 read access 1, as soon as a full swing output has been detected, sense amplifier precharge operation 515 may be initiated. Sense amp precharge operation 515 is then terminated on the subsequent falling edge of clock signal 501. The development 524 of 520 read access 1 may now take place without conflict. If there is no subsequent read access, sense amplifier precharge 525 may be held until a subsequent development operation, e.g., development 544 as shown in 540 read access 3.
In a write access 530 write access 2, it is not necessary to precharge a sense amplifier, as a sense amplifier is not used during a write access. After evaluation of the 520 read access 1, a bit line may be precharged 532, followed by a write operation 534. For example, a multiplexor may isolate a sense amplifier from a word line. In general, when not reading, e.g., when a next access is a write or no access, a sense amplifier should be in a precharge condition.
In 635, the sense amplifier is turned on, e.g., by asserting SAE 252. In 640, the sense nodes are evaluated by the sense amplifier. In 650 the process determines if the evaluate operation is complete. If the evaluate operation is not complete, the process waits. If the evaluate operation is complete, process flow transfers to 610 to read a second cell, e.g., a second cell coupled to the same sense amplifier, e.g., within a same memory word.
In optional 660, the sense amplifier is turned off responsive to the determination of completion of the evaluate operation. Process flow transfers to 610 to read a second cell.
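The process flow described above may be sketched as a simple loop. The following Python sketch is illustrative only: the function name `read_consecutive`, the event strings, and the representation of evaluate completion as a per-cell count of wait iterations are all hypothetical, and the precharge and develop steps are included ahead of 635 on the assumption that they precede enabling the sense amplifier, as in the method embodiment summarized earlier.

```python
def read_consecutive(stored_bits, waits_until_done):
    """Read each stored bit in turn through one modeled sense amplifier."""
    read_bits, events = [], []
    for index, bit in enumerate(stored_bits):
        events.append((index, "precharge"))
        events.append((index, "develop"))
        events.append((index, "sense amplifier on"))    # 635
        events.append((index, "evaluate"))              # 640
        waited = 0
        while waited < waits_until_done[index]:         # 650: not complete, wait
            waited += 1
        events.append((index, f"complete after {waited} waits"))
        events.append((index, "sense amplifier off"))   # optional 660
        read_bits.append(bit)                           # then on to the next cell
    return read_bits, events


bits, events = read_consecutive([1, 0], [1, 3])
for event in events:
    print(event)
```

The second cell is not precharged until the evaluate of the first cell has been determined to be complete, which is the ordering the handshake is intended to enforce.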
In this novel manner, a single-port memory may be pipelined in order to achieve single cycle throughput. This desirable feature is enabled by self-timing a completion of a sense amplifier precharge operation, e.g., via the previously described novel three state latch.
Embodiments in accordance with the present invention provide systems and methods for pipelined one cycle throughput for single-port 6T RAMs. In addition, embodiments in accordance with the present invention provide systems and methods for pipelined one cycle throughput for single-port 6T RAMs that detect the completion of an evaluate operation. Further, embodiments in accordance with the present invention provide systems and methods for pipelined one cycle throughput for single-port 6T RAMs that allow for single cycle throughput on consecutive memory accesses. Still further, embodiments in accordance with the present invention provide systems and methods for pipelined one cycle throughput for single-port 6T RAMs that are compatible and complementary with existing systems and methods of integrated circuit design, manufacturing and test.
Various embodiments of the invention are thus described. While the present invention has been described in particular embodiments, it should be appreciated that the invention should not be construed as limited by such embodiments, but rather construed according to the below claims.
This application is related to co-pending, commonly owned U.S. patent application Ser. No. 13/910,001, attorney docket NVID-PSC 120850, filed Jun. 4, 2013, entitled “Handshaking Sense Amplifier,” to Gotterba and Wang and to U.S. patent application Ser. No. 13/909,981, attorney docket NVID-PSC 120851, filed Jun. 4, 2013, entitled “Three State Latch,” to Gotterba and Wang. Both applications are hereby incorporated herein by reference in their entireties for all purposes.