The present invention relates generally to clock distribution circuitry, and more particularly relates to techniques for correcting duty cycle error in a clock distribution network.
In many high-performance very large scale integration (VLSI) chips, including, for example, microprocessor chips, a reference clock signal, which may be generated externally and supplied to the chip, is distributed globally throughout the chip using a wiring network. The wiring network, which is typically either a tree-based network, a grid-based network, or a combination of a tree-based and a grid-based network, is re-powered at a number of points by buffers. Each buffer ideally generates a signal that is identical to the original reference clock signal.
A clock signal generally comprises a logic “high” portion and a logic “low” portion. A desired characteristic of a global clock distribution architecture is that it ideally exhibits a 50 percent duty cycle, meaning that the duration of the logic high pulse in the clock signal is exactly 50 percent of the full clock cycle. Thus, the duration of the logic high portion of the clock signal is ideally the same as the duration of the logic low portion of the clock signal for any given clock cycle. Due to model inaccuracies and process, voltage and/or temperature (PVT) variations to which the chip is subjected, rising and falling edges of the clock signal are transmitted with slightly different delays resulting in a duty cycle error. Duty cycle error may be defined herein as the difference between the actual duty cycle and the desired 50 percent duty cycle. In addition, buffered clock distributions employing wires that are represented only by resistance (R) and capacitance (C) are known to be unstable with respect to duty cycle error in the sense that any duty cycle error will be increased by the clock distribution network.
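By way of illustration only, the definition of duty cycle error given above can be expressed as a short calculation. The following is a minimal sketch, not part of the described circuitry. The function names are hypothetical, and the path is assumed to be non-inverting, so that a delay on the rising edge shortens the logic high portion and a delay on the falling edge lengthens it.

```python
def duty_cycle_percent(high_time_ps: float, period_ps: float) -> float:
    """Duty cycle: logic-high duration as a percentage of the clock period."""
    if period_ps <= 0:
        raise ValueError("period must be positive")
    return 100.0 * high_time_ps / period_ps


def duty_cycle_error(high_time_ps: float, period_ps: float,
                     target_percent: float = 50.0) -> float:
    """Duty cycle error: actual duty cycle minus the desired duty cycle."""
    return duty_cycle_percent(high_time_ps, period_ps) - target_percent


def high_time_after_delays(high_time_ps: float, rise_delay_ps: float,
                           fall_delay_ps: float) -> float:
    """High duration after unequal edge delays (assumes a non-inverting path).

    The rising edge arrives rise_delay_ps late and the falling edge
    fall_delay_ps late, so the high pulse changes by their difference.
    """
    return high_time_ps + fall_delay_ps - rise_delay_ps


if __name__ == "__main__":
    # Hypothetical numbers: a 200 ps period with an ideal 100 ps high time,
    # where the falling edge is delayed 2 ps more than the rising edge.
    high = high_time_after_delays(100.0, rise_delay_ps=10.0, fall_delay_ps=12.0)
    print(duty_cycle_error(high, 200.0))
```

Equal rising and falling delays leave the duty cycle unchanged; only the difference between the two delays contributes to the error.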
Timing errors attributable to duty cycle error can significantly degrade the overall performance and/or reliability of the chip, and it is therefore beneficial to minimize such duty cycle error. Some advanced circuit techniques, such as, for example, limited switch dynamic logic (LSDL), are especially sensitive to duty cycle error. This problem becomes even more pronounced as the total clock delay between the clock source and the various destination points in the chip exceeds one clock cycle. For instance, the expected duty cycle error from a clock distribution network having a total clock delay of four cycles is approximately four times as large as the duty cycle error expected from a clock distribution network having a one cycle clock delay.
Solutions for reducing duty cycle error are well known, such as, for example, using differential clock distribution architectures, or employing active and/or passive duty cycle correction circuits. However, these known approaches are often complex or require additional design and testing resources, and are therefore undesirable. Moreover, PVT variations can cause the duty cycle to vary substantially from one chip to another, thereby requiring individual integrated circuit (IC) configuration to correct the duty cycle error on each chip, which can significantly increase testing time and cost.
Accordingly, there exists a need for techniques for reducing duty cycle error in a clock distribution network that do not suffer from one or more of the problems exhibited by conventional clock distribution architectures and methodologies.
The present invention meets the above-noted need by providing, in an illustrative embodiment, techniques for reducing duty cycle error associated with a repetitive high-frequency timing signal (e.g., a global clock signal) in an IC device.
In accordance with one aspect of the invention, a clock distribution network for distributing a repetitive timing signal throughout an integrated circuit, the timing signal being within a range of frequencies about a first frequency, includes multiple buffer circuits and at least one conductive segment connecting one of the buffers to another of the buffers. The conductive segment has a length selected so as to be less than a quarter-wave resonance length of the conductive segment at the first frequency to thereby achieve duty cycle correction. The length of the conductive segment is preferably selected such that a time taken for the timing signal to traverse the conductive segment follows the relation
where T represents the time taken for the timing signal to traverse the at least one conductive segment, P represents a period of the timing signal, and n represents a positive integer based on at least one of harmonics of a resonance of the conductive segment and harmonics of a frequency of the timing signal.
In accordance with another aspect of the invention, a method of reducing duty cycle error of a repetitive timing signal in a clock distribution network including a plurality of buffers and a plurality of conductive segments, each of the plurality of conductive segments providing electrical connection between a respective pair of buffers in the plurality of buffers, includes the steps of: adjusting an output impedance of a first buffer of a given pair of buffers so that the output impedance of the first buffer is less than a characteristic impedance of a given one of the conductive segments, the first buffer having an output connected to the given conductive segment; adjusting an input impedance of a second buffer of the given pair of buffers so that the input impedance of the second buffer is greater than the characteristic impedance of the given conductive segment; and adjusting a length of the given conductive segment so that a time taken by the timing signal to traverse the given conductive segment is less than the time taken to traverse a quarter-wave resonance length of the conductive segment at a frequency of operation of the timing signal.
In accordance with a third aspect of the invention, one or more clock distribution networks and/or the method of reducing duty cycle error is implemented in an integrated circuit.
These and other features and advantages of the present invention will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.
The present invention will be described herein in the context of an illustrative clock distribution network for use, for example, in a high-speed (e.g., greater than about one gigahertz (GHz)) microprocessor. It should be understood, however, that the present invention is not limited to clock distribution networks. Rather, the invention is more generally applicable to techniques for advantageously distributing a repetitive signal throughout an integrated circuit in such a manner as to reduce duty cycle error in the integrated circuit. The techniques of the invention can therefore be used for improving clock distribution characteristics in the integrated circuit, without the use of additional active or passive duty cycle correction (DCC) circuitry. Furthermore, aspects of the invention can be used to preserve the width of non-repetitive pulses that are distributed in an integrated circuit, as will become apparent to those skilled in the art using the techniques described herein.
When high-frequency signals (e.g., above about 1 GHz) are carried on wire segments, the wire segments behave as transmission lines. Therefore, when employing transmission lines of any significant length, care must be taken that the transmission medium is matched to its terminations. Increasing clock frequencies and use of lower-loss transmission lines to carry the clock signal have resulted in wavelengths similar to optimal wire segment lengths for on-chip clock distribution. Generally, the source and load impedances should equal the characteristic impedance of the transmission line, as this minimizes signal reflections. Signal reflection occurs as a transmitted signal is at least partially reflected back toward its origin due to differences in impedance along the transmission line. The transmission lines on a given chip, however, are rarely driven or terminated with their characteristic impedances, and therefore significant signal reflections will most likely occur. The degree of signal reflection will primarily be a function of the magnitude of the difference between the characteristic impedance of the wire segment and the load impedance at an end of the wire segment. If these signal reflection effects are not considered, the clock distribution may degrade overall chip performance.
Alternatively, in accordance with one aspect of the invention, signal reflection effects can be beneficially used to create a clock distribution network which reduces duty cycle errors, thereby enhancing the effective bandwidth of the clock distribution network, reducing the number of buffers required in the clock distribution network, and reducing clock jitter and skew. Known methodologies which attempt to reduce duty cycle error in a clock distribution network, which have conventionally involved the inclusion of active or passive DCC circuitry, are complex and require additional design and test resources, as previously stated. The addition of active or passive DCC circuitry is particularly undesirable in highly dense VLSI chips, such as, for example, microprocessors, where semiconductor area is already at a premium.
The term “wire segment” as used herein may be defined as any conductor used to convey an electrical signal or signals between two or more nodes. Wire segments need not be limited to any particular shape (e.g., straight, curved, etc.) or dimensions and may include, but are not limited to, integrated circuit traces, printed circuit board traces, etc., formed of a conductive material (e.g., metal, polysilicon, etc.). The term “wire segment” may be used synonymously herein with the terms “line,” “transmission line,” “line segment,” “wire,” etc.
Consider the case where the duty cycle of a driving buffer is less than 50 percent.
Pulse transmission through the wire segments can be simulated using, for example, PowerSpice, which is commercially available from International Business Machines (IBM) Corp., although alternative circuit simulation programs and modeling methodologies are similarly contemplated. At low frequencies (e.g., approximately 1 megahertz (MHz) or less), there may be a wide choice of signal return paths in the chip, some quite distant from the signal wire. At high frequencies (e.g., above about 1 GHz), however, only wires substantially proximate to the signal wire will provide an effective signal return path. Frequency-dependent transmission line models of the wires may be constructed using, for example, AQUAIA (see, e.g., I. M. Elfadel, et al., “AQUAIA: A CAD Tool for On-Chip Interconnect Modeling, Analysis, and Optimization,” Dig. Electr. Perf. Electronic Packaging, Vol. 11, pp. 337-340, Monterey, Calif., October 2002, which is incorporated by reference herein), developed by Ibrahim M. Elfadel and Alina Deutsch, although alternative on-chip interconnect modeling programs are similarly contemplated.
By way of example only, and without loss of generality, a copper distribution wire segment is preferably selected having dimensions of 3.0 micrometers (μm) wide and 1.2 μm thick, and with spaces of 1.3 μm between the wire segment and the nearest adjacent power supply or ground conductor. There would be no reflection at the end of the wire segment if the source and load impedances were precisely matched (i.e., equal) to the characteristic impedance of the wire segment. However, this is rarely the case. The characteristic impedance of the wire segment is about 50 ohms (Ω) in the present example, and the wire segment is terminated by the input of a field-effect transistor (FET) buffer having a substantially high input impedance (e.g., greater than about 1 megohm) and which presents a small capacitive load (e.g., about a few picofarads or less). To achieve full-swing clock signals, a driving buffer will typically have an output impedance which is substantially less than the characteristic impedance of the wire segment. Accordingly, signal reflection effects will almost always be present. These reflected waves will travel back and forth along the wire segment until their respective amplitudes are diminished, primarily by absorption of energy by the terminating resistances and/or transmission line losses.
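By way of illustration only, the degree of mismatch at each end of the wire segment can be estimated using the standard voltage reflection coefficient, (Z − Z0)/(Z + Z0), for a line of characteristic impedance Z0 terminated by an impedance Z. The following is a minimal sketch under simplifying assumptions: the terminations are treated as purely resistive (the small capacitive load of the receiving buffer is ignored), and the 10 Ω driver output impedance is a hypothetical value chosen only to be substantially less than 50 Ω.

```python
def reflection_coefficient(z_term_ohm: float, z0_ohm: float) -> float:
    """Voltage reflection coefficient at a resistive termination.

    Returns 0 for a matched termination, approaches +1 for an open
    circuit and -1 for a short circuit.
    """
    return (z_term_ohm - z0_ohm) / (z_term_ohm + z0_ohm)


if __name__ == "__main__":
    z0 = 50.0            # characteristic impedance from the example above
    z_load = 1.0e6       # FET buffer input, about 1 megohm or more
    z_source = 10.0      # hypothetical driver output impedance (assumption)

    print("load end:  ", reflection_coefficient(z_load, z0))
    print("source end:", reflection_coefficient(z_source, z0))
    print("matched:   ", reflection_coefficient(z0, z0))
```

With these values, the load end reflects nearly the full incident wave with the same polarity, while the low-impedance source end reflects with inverted polarity, which is why the reflected waves travel back and forth until attenuated.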
Reflected waves have an important effect on the timing of rising and falling edges of the clock signal at both ends of the wire segment. When the duty cycle of the clock signal is exactly 50 percent, the time between the rising and falling edges will be substantially the same, and therefore the effects of the reflected waves on both rising and falling edges of the clock signal will be substantially the same. If, however, the duty cycle of the clock signal is not 50 percent, as is sometimes the case, the timing of the rising and falling edges of the clock signal can be influenced in different ways by the reflected waves. This makes sense since the clock signal is no longer symmetrical when the duty cycle is not 50 percent. One significant effect is that the arrival of a reflected wave at the output of the driving buffer can either aid or oppose the buffer output transition, and therefore change the buffer delay and/or slew rate. Furthermore, when a reflected wave opposes the buffer output transition, the buffer may consume more power in order to generate a desired output signal. Since a given chip often includes many buffers, overall power consumption in the chip can significantly increase as a result of signal reflection effects.
By way of example only, assume that the velocity of pulses traveling along a given wire segment is 90 mm/nanosecond (ns). Using the expression c=λf, where c is pulse velocity (mm/ns), λ is wavelength (mm) and f is frequency (GHz), it can be determined that a 4.5 mm wire segment has a quarter-wave resonant frequency, FQ, of 5 GHz. For a wire segment of this length, which may be referred to herein as the quarter-wave resonance length of the wire segment, a reflected pulse will return to the buffer output after a delay of 100 picoseconds (ps), coinciding with the next transition for a symmetric 5 GHz pulse train.
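The arithmetic in the foregoing example can be reproduced as follows. This is a minimal sketch with hypothetical function names. It assumes an ideal line with a single, frequency-independent pulse velocity, and it takes the wire segment length to be one quarter of the wavelength, so that FQ = c/(4L).

```python
def quarter_wave_frequency_ghz(length_mm: float, velocity_mm_per_ns: float) -> float:
    """Frequency (GHz) at which the segment is one quarter wavelength long."""
    return velocity_mm_per_ns / (4.0 * length_mm)


def round_trip_delay_ps(length_mm: float, velocity_mm_per_ns: float) -> float:
    """Time (ps) for a pulse to reach the far end and reflect back to the driver."""
    return 1000.0 * 2.0 * length_mm / velocity_mm_per_ns


def half_period_ps(frequency_ghz: float) -> float:
    """Time (ps) between successive transitions of a symmetric clock."""
    return 1000.0 / (2.0 * frequency_ghz)


if __name__ == "__main__":
    velocity = 90.0   # mm/ns, from the example above
    length = 4.5      # mm, quarter-wave resonance length at 5 GHz

    fq = quarter_wave_frequency_ghz(length, velocity)
    print("FQ (GHz):", fq)
    print("round trip (ps):", round_trip_delay_ps(length, velocity))
    print("half period (ps):", half_period_ps(fq))
```

At the quarter-wave resonance length, the round-trip delay equals the half period, which is the condition under which the reflected pulse coincides with the next transition.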
In accordance with one aspect of the invention, wire segment lengths are preferably selected such that the time taken by the clock signal to traverse the length of the wire segment follows the relation
where T is the time it takes the clock signal to travel the length of the wire segment, P is the clock period (1/f), and n is a positive integer which may be based, at least in part, on harmonics of the wire segment resonance and/or harmonics of the clock frequency. Assuming a typical case of n=1, the length of the wire segment is preferably selected such that the time taken by the clock pulses to travel the length of the wire segment is less than about one sixth and more than about one tenth of the clock period. By way of example only, optimal duty cycle correction for a 5 GHz clock signal occurs for a shorter wire segment length of about 2.5 mm (compared to the quarter-wave resonance length of about 4.5 mm), when the reflected pulse returns after a delay of only 55 ps, well before the next transition (e.g., rising edge or falling edge) of the clock signal. Duty cycle correction, as the term is used herein, is intended to refer to an adjustment of the duty cycle of a periodic signal having a measured duty cycle which is not exactly 50 percent so as to make the duty cycle more closely equal to 50 percent.
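By way of illustration only, the n=1 case described above can be checked numerically. The following is a minimal sketch that uses only the bounds stated for n=1 (T more than about one tenth and less than about one sixth of the clock period); it does not attempt to model other values of n. The function names are hypothetical, and the 90 mm/ns pulse velocity is carried over from the earlier example as an assumption.

```python
def traversal_time_ps(length_mm: float, velocity_mm_per_ns: float) -> float:
    """One-way time (ps) for a pulse to travel the length of the segment."""
    return 1000.0 * length_mm / velocity_mm_per_ns


def in_n1_window(t_ps: float, period_ps: float) -> bool:
    """True if T lies between one tenth and one sixth of the period (n=1 only)."""
    return period_ps / 10.0 < t_ps < period_ps / 6.0


def n1_length_range_mm(period_ps: float, velocity_mm_per_ns: float) -> tuple:
    """Segment lengths (mm) whose traversal time falls in the n=1 window."""
    period_ns = period_ps / 1000.0
    return (velocity_mm_per_ns * period_ns / 10.0,
            velocity_mm_per_ns * period_ns / 6.0)


if __name__ == "__main__":
    velocity = 90.0    # mm/ns, assumed from the earlier example
    period = 200.0     # ps, i.e., a 5 GHz clock

    for length in (2.5, 4.5):
        t = traversal_time_ps(length, velocity)
        print(length, "mm: T =", round(t, 1), "ps, round trip =",
              round(2.0 * t, 1), "ps, in window:", in_n1_window(t, period))
    print("length range (mm):", n1_length_range_mm(period, velocity))
```

Under these assumptions, the 2.5 mm segment falls inside the window and the 4.5 mm quarter-wave resonance length falls outside it, consistent with the example above.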
In this example, the buffer rising and falling delays are 9.9 ps when the duty cycle is 50 percent, but change to 11.8 ps and 9.2 ps, respectively, when the duty cycle falls to 40 percent. A 40 percent duty cycle effectively becomes 41.3 percent, and a 60 percent duty cycle becomes 59.1 percent, both of which are closer to the desired 50 percent duty cycle. Thus, in accordance with an aspect of the invention, by appropriate selection of the wire segment length, the signal reflection of one clock pulse is advantageously used to affect the timing (e.g., delay and slew) of the next clock pulse in the clock signal to thereby achieve a certain amount of duty cycle correction. A typical clock distribution network comprises many such wire segments connected in series between adjacent pairs of buffers (e.g., as shown in
Another desired feature of clock distribution networks is insensitivity to PVT variations to which the buffers may be subjected. A second graphical view 404 shows the PVT sensitivity of the buffer to variations in buffer strength as a function of wire segment length. Using reflection effects and proper selection of wire segment lengths, it is possible to design buffered transmission line networks having positive, negative, or substantially zero sensitivity to buffer strength, in accordance with another aspect of the invention. In the exemplary case shown in
In addition to the quarter-wave resonance frequency FQ, higher-harmonic modes exist at odd multiples of FQ. There are also resonances where the reflected signal makes an odd number of round trips through the wire in a half-cycle before returning to have a small effect on buffer delay: (FQ/3, FQ/5, . . . ). The wire segments employed in the present example are too lossy to observe any significant effects of harmonic frequencies, and therefore hypothetical wire segments having artificially low losses, such as, for example, ⅙th the resistive losses of actual wire segments, are simulated. These potential effects of harmonic frequencies for the illustrative case described above are shown graphically in
By way of example only, a clock distribution network design for a POWER6™ (a trademark of IBM Corporation) microprocessor is optimized for frequencies exceeding 5 GHz using a compromise between duty cycle correction and PVT insensitivity, in accordance with the techniques of the present invention. The chip was fabricated using a 65 nanometer complementary metal-oxide-semiconductor (CMOS) process. The clock distribution network employs a length-matched tree with a path length of 19.2 mm from the clock source and requires eight levels of repowering buffers, ending with 176 final buffers, all driving a clock grid area of 88 mm².
At least a portion of the methodologies of the present invention may be implemented in an integrated circuit. In forming integrated circuits, a plurality of identical die is typically fabricated in a repeated pattern on a surface of a semiconductor wafer. Each die includes a device described herein, and may include other structures and/or circuits. The individual die are cut or diced from the wafer, then packaged as an integrated circuit. One skilled in the art would know how to dice wafers and package die to produce integrated circuits. Integrated circuits so manufactured are considered part of this invention.
Although illustrative embodiments of the present invention have been described herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various other changes and modifications may be made therein by one skilled in the art without departing from the scope of the appended claims.