1. Field of the Invention
The present invention generally relates to a delay circuit and more specifically to a configurable delay circuit.
2. Description of the Related Art
Delay circuits are used to align signals relative to each other, such as aligning a rising and/or falling edge of a clock signal to capture data signals. A conventional circuit that is used to delay a signal includes multiple inverters connected in series, where the output of the last inverter is a delayed version of the input signal. The amount of delay that is incurred may be increased by increasing the number of inverters that are connected in series.
When the amount of delay that is needed is variable, a delay circuit may be used that includes a multiplexor. In such a circuit, the multiplexor receives the outputs of two or more of the different inverters that are connected in series, so that each input to the multiplexor is a different delayed version of the input signal. The multiplexor then selects one of the inputs as the output signal.
While the multiplexor enables the selection of one of several different delays, the multiplexor itself also delays the output signal by an additional amount. The additional delay is referred to as “insertion delay” and is incurred by each delayed version of the input signal. Problematically, the insertion delay may vary from multiplexor to multiplexor due to fabrication process variations, thereby complicating the alignment of signals that are delayed using a given multiplexor.
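The effect of insertion delay in the conventional circuit can be illustrated with a minimal numerical sketch. The function name and all picosecond values below are hypothetical and chosen only for illustration; none of them come from this description.

```python
def mux_delay_ps(tap, inverter_delay_ps, mux_insertion_ps):
    """Total delay when the multiplexor selects the output of inverter `tap`.

    tap               -- number of series inverters the signal has passed through
    inverter_delay_ps -- delay of one inverter (hypothetical value)
    mux_insertion_ps  -- insertion delay of the multiplexor (hypothetical value)
    """
    return tap * inverter_delay_ps + mux_insertion_ps


# Two multiplexors whose insertion delays differ because of process variation
# (illustrative numbers only).
delay_a = mux_delay_ps(tap=4, inverter_delay_ps=10.0, mux_insertion_ps=15.0)
delay_b = mux_delay_ps(tap=4, inverter_delay_ps=10.0, mux_insertion_ps=19.0)

# The same tap selection yields different total delays on the two paths.
print(delay_a, delay_b)
```

In this simple model every tap carries the same insertion term, so the spacing between taps is preserved, but the absolute delay of two otherwise identical paths differs by the mismatch between their multiplexors.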
Accordingly, what is needed in the art is a technique for delaying signals by varying amounts without also incurring insertion delay due to a multiplexor.
One embodiment of the present invention sets forth a technique for delaying signals by varying amounts. A configurable delay circuit includes fixed and tri-state inverters. Pullup and pulldown transistors within one or more tri-state inverters may be activated to reduce the delay introduced by fixed inverters. The pullup and pulldown transistors within one or more tri-state inverters may be separately activated to independently adjust the rising delay and the falling delay incurred by the input signal.
Various embodiments of the invention comprise a configurable delay circuit that includes a fixed inverter element coupled in parallel with a tri-state inverter element. The fixed inverter element is configured to receive an input signal and generate an inverted input signal that is delayed relative to the input signal by a first amount of time. The tri-state inverter element is configured to receive the input signal and reduce the first amount of time that the inverted input signal is delayed relative to the input signal when at least one of a first control signal and second control signal is activated.
Various embodiments of the invention for generating an output signal that is delayed relative to an input signal include receiving a first control signal that controls a first delay of a rising edge of the output signal relative to a rising edge of the input signal and receiving a second control signal that controls a second delay of a falling edge of the output signal relative to a falling edge of the input signal. The first control signal and the second control signal are applied to a configurable delay circuit that receives the input signal and generates the output signal such that the output signal is delayed by the first delay and the second delay relative to the input signal.
One advantage of the disclosed mechanism is that the configurable delay circuit delays signals by varying amounts without incurring an additional insertion delay from a multiplexor.
So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
In the following description, numerous specific details are set forth to provide a more thorough understanding of the present invention. However, it will be apparent to one of skill in the art that the present invention may be practiced without one or more of these specific details. In other instances, well-known features have not been described in order to avoid obscuring the present invention.
A configurable delay circuit can be used to correct mismatches in delays between signals, such as between clock signals and data signals and between different bits of data within a multi-bit data bus. Misaligned clock edges relative to data signals can result in functional errors, e.g., timing errors. The configurable delay circuit may be used to align the clock relative to the data signals and ensure that timing requirements are better met.
Mismatches in delays between different signals of a multi-bit data bus present challenges for meeting the timing requirements to correctly sample all signals of the multi-bit data bus. The mismatches are typically caused by varying wire lengths and variations due to the silicon fabrication process for the different data signals of the multi-bit data bus. In particular, the delays of different repeater elements that are inserted along the length of data and clock signal wires may vary, resulting in mismatches between the different data signals and between clock signals relative to the data signals. The configurable delay circuit may be used to minimize the variation between the valid sampling windows for each data signal of a multi-bit bus, thereby reducing functional errors.
Other potential sources of systematic skew between clock and data signals are asymmetry in the clock buffers at the transmitter and receiver ends of a link over which the data is transmitted, and aperture offsets in the receiver flip-flops. Adjustments of the forwarded clock phase can be made using the configurable delay circuit to compensate for such offsets. The ability to independently adjust the rising delay and falling delay provided by the configurable delay circuit allows for trimming of the data signals and for adjustment of a clock signal duty-cycle or pulse-width. Adjustment of the rising-edge timing should be essentially independent of the falling-edge timing. Otherwise, if the adjustments to each edge interact strongly, it is difficult to find a suitable tuning algorithm for removing timing offsets.
At each stage of the configurable delay circuit 100, the rising edge at the output of a particular stage can be delayed by de-asserting the respective control signal (en2L, en1L, or en0L) for that stage. The falling edge at the output of a particular stage can be delayed by de-asserting the respective control signal (en2H, en1H, or en0H) for that stage. By assembling a series of these stages of the configurable delay circuit 100, a range of control for the timing of each output edge may be achieved. For example, the rising-edge timing at the output signal 131 is controlled by the set of controls en2H, en1L, and en0H. The falling-edge timing at the output signal 131 is controlled by the remaining three controls, i.e., controls en2L, en1H, and en0L. The structure of multiple stages provides a very flexible mechanism for controlling the relative delay between the output and input of each stage and the overall delay of the output signal 131 relative to the input signal 101, because the overall sizing of each stage and the relative sizes of the fixed and adjustable tri-state inverters are free parameters.
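The alternating pattern of controls (H, L, H for one edge and L, H, L for the other) follows from each stage inverting its input. The following sketch traces a single input edge through three inverting stages and records which control acts on it at each stage. The function name and the stage order ("en2", "en1", "en0") are assumptions made for illustration; the sketch does not model any buffering between the last stage and the output signal 131.

```python
def controls_for_input_edge(input_edge, stages=("en2", "en1", "en0")):
    """Return the control signal that acts on a given input edge at each stage.

    Each stage inverts, so a rising edge at a stage input becomes a falling
    edge at that stage's output. A rising stage output is sped up by the
    pull-up control (suffix "L"), a falling one by the pull-down control
    (suffix "H"). The stage order is an assumption, not taken from the text.
    """
    if input_edge not in ("rising", "falling"):
        raise ValueError("input_edge must be 'rising' or 'falling'")
    edge = input_edge
    controls = []
    for name in stages:
        # The stage inverts the edge it receives.
        edge = "falling" if edge == "rising" else "rising"
        controls.append(name + ("L" if edge == "rising" else "H"))
    return controls


print(controls_for_input_edge("rising"))
print(controls_for_input_edge("falling"))
```

Because the two sets of controls are disjoint, adjusting one set leaves the timing of the opposite edge untouched in this model.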
The enL control signal enables and disables the pull-up transistor of the tri-state inverter 105. When the active-low enL control signal is asserted (i.e., driven low), the pull-up operation of the tri-state inverter 105 is enabled. When the active-high enH control signal is asserted (i.e., driven high), the pull-down operation of the tri-state inverter 105 is enabled. When neither enL nor enH is asserted the output of the tri-state inverter 105 is in a high impedance state and the output is driven only by the fixed inverter 110.
The fixed inverter 110 provides a first level of drive strength to drive a load at the output. When enL is asserted, the drive strength of a rising transition at the output is greater due to the tri-state inverter 105 pull-up, so the delay of the rising transition is reduced. Similarly, when enH is asserted, the drive strength of a falling transition at the output is greater due to the tri-state inverter 105 pull-down, so the delay of the falling transition is reduced. Assuming that the logical effort, a measure of drive strength, for a fixed inverter 110 is 1, the logical effort of the tri-state inverter 105 is 2 when all transistors are equally sized. Therefore, the drive strength of the stage of the configurable delay circuit 100 is increased by 50% when the tri-state inverter 105 is enabled.
The relative drive strength of each stage is determined based on the widths of the transistors comprising the tri-state inverter 105 and the fixed inverter 110. Each stage of the configurable delay circuit 100 can be configured to provide four different delay variations using the control signals enL and enH. A first delay is incurred by the input to generate the output when enL and enH are both de-asserted. The first delay is reduced for the rising edge of the output and the falling edge of the output when enL and enH are both asserted to increase the drive strength of the stage of the configurable delay circuit 100. The first delay is reduced only for the rising edge of the output when enL is asserted and enH is de-asserted. Finally, the first delay is reduced only for the falling edge of the output when enH is asserted and enL is de-asserted.
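The four per-stage configurations can be summarized in a small behavioral sketch. It is not a circuit simulation. It assumes the equal-sized case described above, in which an enabled tri-state inverter adds 50% to the drive of the fixed inverter; the function name and the normalized drive values are illustrative choices.

```python
FIXED_DRIVE = 1.0      # normalized drive of the fixed inverter
TRISTATE_DRIVE = 0.5   # extra drive of an equal-sized tri-state inverter (+50%)


def stage_drive(enL_level, enH_level,
                fixed=FIXED_DRIVE, tristate=TRISTATE_DRIVE):
    """Return (rising_drive, falling_drive) at the stage output.

    enL is active low: logic level 0 enables the pull-up transistor.
    enH is active high: logic level 1 enables the pull-down transistor.
    Higher drive corresponds to a shorter delay for that transition.
    """
    pull_up_on = (enL_level == 0)
    pull_down_on = (enH_level == 1)
    rising = fixed + (tristate if pull_up_on else 0.0)
    falling = fixed + (tristate if pull_down_on else 0.0)
    return rising, falling


# Enumerate the four configurations of one stage.
for enL_level in (1, 0):
    for enH_level in (0, 1):
        print(enL_level, enH_level, stage_drive(enL_level, enH_level))
```

With other transistor widths, the `fixed` and `tristate` arguments would change, which is how relative sizing sets the size of each delay reduction.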
The relative sizing of the transistors comprising the tri-state inverter 105 and the fixed inverter 110 may be used to control the possible delays and reduced delays that are generated by each stage of the configurable delay circuit 100. For example, assuming that each stage in the configurable delay circuit 100 shown in
The delay transfer characteristic 150 corresponds to a configurable delay circuit 100 where the first stage has a tri-state inverter of size 1S and a fixed inverter of size 3S, the second stage has a tri-state inverter of size 2S and a fixed inverter of size 2S, and the third stage has a tri-state inverter of size 3S and a fixed inverter of size 1S.
The lowest delay of approximately 30 picoseconds occurs when the en2H, en1L, and en0H control signals are asserted so that the respective pull-down devices and pull-up device in the tri-state inverter elements are activated. The largest delay of approximately 58 picoseconds occurs when the en2H, en1L, and en0H control signals are de-asserted so that the respective pull-down devices and pull-up device in the tri-state inverter elements that are controlled by the en2H, en1L, and en0H control signals are deactivated.
The rising edge of the input signal 101 is delayed by an increasing amount of time as the en2H, en1L, and en0H control signals progress through the following eight different binary values that each correspond to a different delay step:
101, 100, 111, 110, 001, 000, 011, 010, where the minimum delay is specified by 101 and the maximum delay is specified by 010 because en1L is active low. While adjustments in the en2H, en1L, and en0H control signals affect the delay generated on the rising edge of the output signal 131, the adjustments to the en2H, en1L, and en0H control signals do not affect the delay of the falling edge of the output signal 131. As shown in
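The ordering of the eight control values can be reproduced by first converting the raw bits to "asserted" form, which flips the active-low en1L bit, and then counting down from all-asserted to none-asserted. The sketch below does this. The bit order (en2H, en1L, en0H) from most to least significant follows the listing above. The function names are hypothetical, and the delay estimate assumes equal steps between the approximate 30 picosecond and 58 picosecond endpoints, which is an idealization rather than measured data.

```python
ACTIVE_LOW_MASK = 0b010  # en1L is the middle bit and is asserted when 0


def delay_step(word):
    """Map a raw control word (en2H, en1L, en0H) to a delay step 0..7.

    Step 0 is the minimum delay (all three controls asserted) and
    step 7 is the maximum delay (none asserted).
    """
    asserted = word ^ ACTIVE_LOW_MASK  # 1 now means "asserted" for every bit
    return 7 - asserted


def words_by_increasing_delay():
    """Raw control words ordered from minimum to maximum delay."""
    return [format(w, "03b") for w in sorted(range(8), key=delay_step)]


def approx_delay_ps(word, min_ps=30.0, max_ps=58.0):
    """Idealized delay estimate, assuming equally spaced steps (an assumption)."""
    return min_ps + delay_step(word) * (max_ps - min_ps) / 7


print(words_by_increasing_delay())
```

Under the equal-step assumption, each step would correspond to roughly 4 picoseconds of additional rising-edge delay.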
The following table represents the different drive strengths of the stages controlled as en2H, en1L, and en0H are adjusted to progressively decrease the delay of the rising edge at the output 131.
As shown in
At step 210, control signal settings are received that control a delay incurred by the falling edge of the input signal 101 to generate the output signal 131. In other words, the control signal settings control the delay of the falling edge of the output signal 131 relative to the falling edge of the input signal 101. The control signal settings that control a delay of the falling edge are en2L, en1H, and en0L. At step 215, the control signal settings are applied to the configurable delay circuit 100 to control the amount of delay incurred by the input signal to generate the output signal. At step 220, the output signal that is delayed relative to the input signal is generated.
The control signals of the configurable delay circuit 100 may be adjusted to independently increase or decrease the delay of a rising transition at the output separately from a falling transition at the output. The configurable delay circuit 100 may be adjusted via the control signals to reduce the delay variation between different signals of a multi-bit bus for rising and/or falling data transitions. A predetermined acceptable delay variation may be identified. The predetermined acceptable delay variation may be identified to improve the functional yield of an integrated circuit for a particular performance level, e.g., clock rate. In one embodiment, the relative drive strengths of the fixed inverter and the tri-state inverter are implemented in the configurable delay circuit 100 so that one or more delay steps equals the predetermined acceptable delay variation.
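One way to picture this use is a simple trimming routine that picks a delay step for each bit of a bus so that all bits arrive close to the latest one. The sketch below is an illustration only. The arrival times, the step size, the number of steps, and the function names are hypothetical, and no particular tuning algorithm is prescribed by this description.

```python
import math


def choose_steps(arrival_ps, step_ps, num_steps=8):
    """Pick a delay step (0 = minimum delay) for each bit of a bus.

    Each bit is delayed toward the latest-arriving bit, rounding to the
    nearest available step and clamping to the available range.
    """
    latest = max(arrival_ps)
    steps = []
    for t in arrival_ps:
        k = math.floor((latest - t) / step_ps + 0.5)  # round to nearest step
        steps.append(min(max(k, 0), num_steps - 1))
    return steps


def spread_after(arrival_ps, steps, step_ps):
    """Remaining delay variation across the bus after trimming."""
    adjusted = [t + k * step_ps for t, k in zip(arrival_ps, steps)]
    return max(adjusted) - min(adjusted)


arrivals = [0.0, 4.0, 9.0, 12.0]   # hypothetical per-bit arrival times in ps
steps = choose_steps(arrivals, step_ps=4.0)
print(steps, spread_after(arrivals, steps, step_ps=4.0))
```

When the required correction is within the adjustment range, each bit lands within half a step of the target, so the step size sets the achievable residual variation. This is consistent with sizing the inverters so that a delay step matches the predetermined acceptable delay variation.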
I/O bridge 307, which may be, e.g., a Southbridge chip, receives user input from one or more user input devices 308 (e.g., keyboard, mouse) and forwards the input to CPU 302 via communication path 306 and memory bridge 305. A parallel processing subsystem 312 is coupled to memory bridge 305 via a bus or second communication path 313 (e.g., a Peripheral Component Interconnect (PCI) Express, Accelerated Graphics Port, or HyperTransport link); in one embodiment parallel processing subsystem 312 is a graphics subsystem that delivers pixels to a display device 310 (e.g., a conventional cathode ray tube or liquid crystal display based monitor). A system disk 314 is also connected to I/O bridge 307. A switch 316 provides connections between I/O bridge 307 and other components such as a network adapter 318 and various add-in cards 320 and 321. Other components (not explicitly shown), including universal serial bus (USB) or other port connections, compact disc (CD) drives, digital video disc (DVD) drives, film recording devices, and the like, may also be connected to I/O bridge 307. The various communication paths shown in
In one embodiment, the parallel processing subsystem 312 incorporates circuitry optimized for graphics and video processing, including, for example, video output circuitry, and constitutes a graphics processing unit (GPU). In another embodiment, the parallel processing subsystem 312 incorporates circuitry optimized for general purpose processing, while preserving the underlying computational architecture, described in greater detail herein. In yet another embodiment, the parallel processing subsystem 312 may be integrated with one or more other system elements in a single subsystem, such as joining the memory bridge 305, CPU 302, and I/O bridge 307 to form a system on chip (SoC).
It will be appreciated that the system shown herein is illustrative and that variations and modifications are possible. The connection topology, including the number and arrangement of bridges, the number of CPUs 302, and the number of parallel processing subsystems 312, may be modified as desired. For instance, in some embodiments, system memory 304 is connected to CPU 302 directly rather than through a bridge, and other devices communicate with system memory 304 via memory bridge 305 and CPU 302. In other alternative topologies, parallel processing subsystem 312 is connected to I/O bridge 307 or directly to CPU 302, rather than to memory bridge 305. In still other embodiments, I/O bridge 307 and memory bridge 305 might be integrated into a single chip instead of existing as one or more discrete devices. Large embodiments may include two or more CPUs 302 and two or more parallel processing subsystems 312. The particular components shown herein are optional; for instance, any number of add-in cards or peripheral devices might be supported. In some embodiments, switch 316 is eliminated, and network adapter 318 and add-in cards 320, 321 connect directly to I/O bridge 307.
In sum, the configurable delay circuit includes tri-state inverters that are coupled in parallel with fixed inverters and that are selectively activated to reduce the delay introduced into the input signal by the fixed inverters. The pullup and pulldown transistors within one or more tri-state inverters may be separately activated to independently adjust the rising edge delay and the falling edge delay incurred by the input signal. When the configurable delay circuit is implemented with three stages of tri-state and fixed inverter pairs, the transistors may be sized such that the different delays incurred by the rising and/or falling edges of the input signal vary linearly.
Advantageously, the configurable delay circuit delays signals by varying amounts without incurring an additional insertion delay. In particular, the adjustment to the delay for either the rising or the falling edge does not interact with the delay incurred by the opposing edge. Therefore, the rising and falling edges may be independently adjusted to control a clock duty factor or a pulse width.
One embodiment of the invention may be implemented as a program product for use with a computer system. The program(s) of the program product define functions of the embodiments (including the methods described herein) and can be contained on a variety of computer-readable storage media. Illustrative computer-readable storage media include, but are not limited to: (i) non-writable storage media (e.g., read-only memory devices within a computer such as CD-ROM disks readable by a CD-ROM drive, flash memory, ROM chips or any type of solid-state non-volatile semiconductor memory) on which information is permanently stored; and (ii) writable storage media (e.g., floppy disks within a diskette drive or hard-disk drive or any type of solid-state random-access semiconductor memory) on which alterable information is stored.
The invention has been described above with reference to specific embodiments. Persons skilled in the art, however, will understand that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. The foregoing description and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
Therefore, the scope of embodiments of the present invention is set forth in the claims that follow.