This invention relates generally to the field of skew correction of clock signals. More particularly, in accordance with certain embodiments consistent with the present invention, this invention relates to a clock de-skewing arrangement that utilizes a delay line architecture.
Clock skew problems can manifest themselves in several environments, including but not limited to chip-level, System-on-Chip (SoC), and board/system-level designs. As an example of chip-level skew, consider a microprocessor whose synchronous circuitry (e.g., flip-flops) is spread across a wide area of an integrated circuit chip. Now consider a single clock signal that is to be distributed across the chip in such a way that the rising edge of each clock cycle reaches each flip-flop at the same point in time. Skew in this environment is becoming more problematic as device sizes shrink, clock speeds increase, and chip sizes increase. This means that the path delay in a signal trace may differ by many clock cycles from one section of the chip to another. With system clock frequencies well into the gigahertz range, clock skews on the order of picoseconds can produce adverse effects on system performance, or even disrupt system functionality.
A similar problem arises in so-called “System-on-Chip” scenarios. A clock signal should be routed to a baseband section, a microprocessor, and a memory block (or other functional blocks) with minimal skew. Again, the length of an on-chip signal path from the clock generator to the various functional blocks can be long enough to introduce significant delay and thereby affect the maximum operating frequency. Skew can also be a problem on a board level system for the same reasons outlined above. But on a board level system, the problem can be even further exacerbated by even longer signal traces and more severe loading caused by signal paths that are routed on and off chips and other components.
The features of the invention believed to be novel are set forth with particularity in the appended claims. The invention itself however, both as to organization and method of operation, together with objects and advantages thereof, may be best understood by reference to the following detailed description of the invention, which describes certain exemplary embodiments of the invention, taken in conjunction with the accompanying drawings in which:
While this invention is susceptible of embodiment in many different forms, there is shown in the drawings and will herein be described in detail specific embodiments, with the understanding that the present disclosure is to be considered as an example of the principles of the invention and not intended to limit the invention to the specific embodiments shown and described. In the description below, like reference numerals are used to describe the same, similar or corresponding elements in the several views of the drawings.
The terms “a” or “an”, as used herein, are defined as one or more than one. The term “plurality”, as used herein, is defined as two or more than two. The term “another”, as used herein, is defined as at least a second or more. The terms “including” and/or “having”, as used herein, are defined as comprising (i.e., open language). The term “coupled”, as used herein, is defined as connected, although not necessarily directly, and not necessarily mechanically.
This invention, in certain embodiments consistent therewith, provides a flexible, integrated solution whereby a clock de-skew operation is performed within the clock generation function. Due to the inherent nature of the clock generation, the adjustable phase resolution of the output signal(s) can be made very fine as will be demonstrated below.
With reference to the exemplary architecture of a programmable skew clock generator circuit shown in
Accumulator 108 is referred to herein as a “frequency accumulator” 108 and the accumulator 112 is referred to herein as a “phase accumulator” 112. The frequency accumulator 108 is clocked by a reference clock with frequency Fref, and operates according to an input value K0 which serves as a frequency division constant that is loaded into the accumulator. The value of K0 is determined by the desired output frequency Fout according to the relationship:
where KMAX is the maximum count of the frequency accumulator 108.
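The behavior of a frequency accumulator of this kind can be illustrated with a short simulation. The following is only a minimal sketch, assuming the accumulator adds K0 on every reference clock edge and wraps modulo KMAX (the exact modulus convention is an assumption); the function name and the numeric values are hypothetical. Under that assumption the overflow signal occurs, on average, K0/KMAX times per reference clock cycle.

```python
def count_overflows(k: int, k_max: int, n_cycles: int, preload: int = 0) -> int:
    """Count overflows of a modulo-k_max accumulator over n_cycles reference clocks.

    Assumes 0 < k < k_max, so at most one overflow can occur per clock.
    """
    acc = preload % k_max
    overflows = 0
    for _ in range(n_cycles):
        acc += k
        if acc >= k_max:   # overflow: wrap around and flag the event
            acc -= k_max
            overflows += 1
    return overflows


K_MAX = 1 << 16      # hypothetical accumulator capacity
K0 = 7209            # hypothetical frequency division constant
N_CYCLES = 100_000   # number of reference clock cycles simulated

rate = count_overflows(K0, K_MAX, N_CYCLES) / N_CYCLES
print(f"average overflows per reference cycle: {rate:.6f}")
print(f"K0 / KMAX:                             {K0 / K_MAX:.6f}")
```

The two printed values agree to within one overflow over the simulated interval, which is the sense in which the accumulator sets the average rate of overflow events rather than the exact timing of each one.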
The phase accumulator 112 is clocked by an overflow signal 116 from the frequency accumulator 108 and operates according to an input value C0 which serves as a phase offset constant. The overflow signal 116 of frequency accumulator 108 provides a signal whose average frequency is Fφ0. The value of C0 is also a function of the desired output frequency and is given by:
where CMAX is the maximum count of the phase accumulator 112 and in this case Fout is equal to Fφ0.
If a switching event is defined as a transition in the output signal from high to low or low to high, then one can view the frequency accumulator 108 as controlling the average frequency at which a switching event occurs. Meanwhile the phase accumulator 112 determines the phase of the transition relative to Fref by selecting the appropriate tap from tapped delay line 120. This is accomplished by providing the output 124 from phase accumulator 112 to a tap selection logic circuit 126 that controls a multiplexer 128. Tapped delay line 120 receives the input reference clock Fref and produces a sequence of delayed versions of Fref at a sequence of output taps in a known manner.
Tap selection logic circuit 126 determines which of the plurality of taps from delay line 120 should be selected to produce the desired output signal. The tap selection logic circuit 126 translates the contents of the phase accumulator 112 to a binary coded (or analogous) tap address. Multiplexer 128 receives the tap address from the tap selection logic circuit 126, and that address determines which tap of delay line 120 is passed to the output to produce Fφ0. The coarseness or fineness of the output frequency resolution is dependent upon the capacity of phase accumulator 112. The higher the capacity of the phase accumulator 112, the finer the resolution obtainable at the output.
The delay line 120 may have minor errors from delay element to delay element and across the delay line. Accordingly, the delay line may be locked in a delay locked loop and/or may incorporate any suitable mechanism to individually tune the delay of one or more of the delay elements forming a part thereof (shown as a “tune” input to the delay line).
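As an illustration of the translation performed by tap selection logic circuit 126, the sketch below maps a phase accumulator value to a tap address and reports the delay that tap contributes. It is a sketch under stated assumptions, not the disclosed logic: it assumes a linear mapping, equal delay elements, and a delay line whose total delay is locked to exactly one reference clock period. The function names and the tap count are hypothetical.

```python
def tap_address(phase_value: int, c_max: int, n_taps: int) -> int:
    """Map a phase accumulator value in [0, c_max) to a tap index in [0, n_taps).

    Assumption: the mapping is linear, i.e. the accumulator range is divided
    into n_taps equal slices.
    """
    if not 0 <= phase_value < c_max:
        raise ValueError("phase_value must lie in [0, c_max)")
    return (phase_value * n_taps) // c_max


def tap_delay_seconds(tap: int, f_ref_hz: float, n_taps: int) -> float:
    """Delay of a given tap, assuming n_taps equal elements span one Fref period."""
    return tap / (n_taps * f_ref_hz)


C_MAX = 1024     # hypothetical phase accumulator capacity
N_TAPS = 32      # hypothetical number of delay elements
F_REF = 500e6    # reference clock frequency in Hz (example value)

tap = tap_address(512, C_MAX, N_TAPS)
print("tap address:", tap)
print("tap delay (s):", tap_delay_seconds(tap, F_REF, N_TAPS))
print("delay step (s):", tap_delay_seconds(1, F_REF, N_TAPS))
```

Under these assumptions the delay step, and hence the smallest increment by which an output edge can be moved, is the reference period divided by the number of delay elements.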
The number of delay elements in the delay line 120 will determine the quantization error associated with placing an output edge at a precise moment in time. This quantization error will result in phase jitter on the output signal. Therefore, by use of the frequency accumulator 108 and phase accumulator 112 and other circuitry 100 as shown, one can generate any desired output value of frequency up to Fref within the resolution of the circuitry. The output frequency Fφ0 can therefore be defined (within the resolution of the circuit) by the equation:
The above definition of a switching event can be used to see how altering the contents of each of accumulators 108 and 112 can be used to control the phase of the output signal. If the frequency accumulator 108 is forced to begin the accumulation process from some number other than zero by preloading it (e.g., at the time of a circuit reset), then the amount of time it takes for an overflow of accumulator 108 to occur can be decreased. This results in a switching event occurring during an earlier reference clock cycle than would be the case if the accumulator 108 had started counting from zero. Since the operation of the phase accumulator 112 is dependent on the frequency accumulator 108's overflow, a similar adjustment is provided there as well.
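The effect of preloading can be seen in a small simulation. The following is a minimal sketch, assuming a frequency accumulator that adds a constant on every reference clock edge and overflows when it reaches its maximum count; the function name and the numeric values are hypothetical.

```python
def first_overflow_cycle(k: int, k_max: int, preload: int = 0) -> int:
    """Return the reference clock cycle (1-based) on which the first overflow occurs.

    Assumes k > 0 and 0 <= preload < k_max.
    """
    acc = preload
    cycle = 0
    while True:
        cycle += 1
        acc += k
        if acc >= k_max:   # accumulator has reached its maximum count
            return cycle


K = 100        # hypothetical frequency division constant
K_MAX = 1000   # hypothetical maximum count

print("no preload:  ", first_overflow_cycle(K, K_MAX, preload=0))
print("preload 500: ", first_overflow_cycle(K, K_MAX, preload=500))
```

With a nonzero preload the first overflow, and therefore the first switching event, lands on an earlier reference clock cycle than it does when accumulation starts from zero.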
In order to add any desired time skew to the output, the frequency accumulator can be preloaded with a frequency accumulator preload value given by:
where φ is the desired phase shift in radians. The phase accumulator preload value for the same phase shift is then given by:
The output of multiplexer 128 is given by the following equation:
where Tp is the pulse width, Tφ0 is the period and rect is a rectangular function as defined in Equation 6. If preload values PK and PC are set to zero for circuit 104, as illustrated in this exemplary embodiment, then there will be zero phase adjustment to the output.
Thus, a reference generator circuit 104 consistent with certain embodiments of the invention has a reference frequency accumulator 108 that is preloaded with a preload value PK0 and receives one cycle of the reference clock signal, followed by constant K0 as the accumulator input during subsequent clock cycles. The frequency accumulator has a maximum count KMAX and produces an overflow output when the maximum count is reached. A reference phase accumulator 112 is preloaded with a preload value PC0 and receives one overflow output from the frequency accumulator as a clock signal. It then receives a phase offset constant C0 as an input thereto, with the phase accumulator having a maximum count CMAX and producing a phase accumulator overflow output. A reference delay line 120 is clocked by the reference input signal and produces a plurality of delayed reference clock signals at a plurality of tap outputs. A reference tap selecting circuit receives the phase accumulator output and selects at least one of the tap outputs in response thereto to produce an output Fφ0.
For the reference generator 104, the preload values may be zero. Other similar circuits can be used to produce clock outputs that are skewed with reference thereto as will be described below. By use of the accumulator preload values of equations 4 and 5, any number of phase adjusted outputs can be created by duplication of the circuit arrangement of FIG. 1. Two such duplicates are shown in
The second duplicate receives input K2 at frequency accumulator 152 and C2 at phase accumulator 156 to determine the output frequency. The delay φ2 is determined by PK2 and PC2, again by application of equations 4 and 5 above. The output of phase accumulator 156 drives tap select logic 160 which controls multiplexer 164 to select one or more delay line taps from delay line 168 to produce output signal Fφ2 which is offset in time from Fφ0 by delay φ2.
For the two phase shifted outputs Fφ1 and Fφ2, the values of PK1, PC1, PK2 and PC2 are given by:
By analogy to the output equations 6 and 7 above, the outputs of multiplexers 144 and 164 are respectively given by:
where φ1 and φ2 represent the time delay measured from the reference signal Fφ0. So, in general, the output of any of the clock circuits is given by:
Thus, in accordance with certain embodiments consistent with the present invention, a programmable skew clock signal generator has a frequency generator circuit that produces an output signal Fφ0 from a reference signal Fref. A frequency accumulator is preloaded with a preload value PK1 and receives the reference signal as a clock signal, receives a frequency division constant K1 as an input thereto, with the frequency accumulator having a maximum count KMAX and producing an overflow output. A phase accumulator is preloaded with a preload value PC1 and receives the overflow output from the frequency accumulator as a clock signal and receives a phase offset constant C1 as an input thereto. The phase accumulator has a maximum count CMAX and produces a phase accumulator output. A delay line is clocked by the reference signal Fref and produces a plurality of delayed reference clock signals at a plurality of tap outputs. A tap selecting circuit receives the phase accumulator output and selects at least one of the tap outputs in response thereto to produce an output Fφ1 whose phase shift φ1 relative to Fφ0 is a function of PK1 and PC1.
The results of a simulation of system 100 generating three independent outputs are shown in FIG. 2. The top waveform 202 is a 500 MHz reference clock Fref. The second waveform 206 represents a 55 MHz output that has no skew adjustment applied to it, i.e., accumulator 112 is preloaded with zero as PC0. The third waveform 210 is phase shifted by φ1=π/8 relative to the first output signal. The last signal 214 is phase shifted by φ2=π/4 relative to the first output signal.
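To give a sense of scale for these phase shifts, the short calculation below converts a phase shift in radians at the 55 MHz output frequency into a time delay, and expresses that delay in periods of the 500 MHz reference clock. It uses only the general relationship that a phase shift φ corresponds to a fraction φ/2π of the output period; the function name is hypothetical, and the calculation is an illustration rather than part of the simulation itself.

```python
import math


def phase_to_delay(phi_rad: float, f_out_hz: float) -> float:
    """Time delay in seconds equivalent to a phase shift phi_rad at frequency f_out_hz."""
    return phi_rad / (2.0 * math.pi) / f_out_hz


F_REF = 500e6   # reference clock frequency from the simulation, in Hz
F_OUT = 55e6    # output frequency from the simulation, in Hz

for name, phi in (("phi1 = pi/8", math.pi / 8), ("phi2 = pi/4", math.pi / 4)):
    delay = phase_to_delay(phi, F_OUT)
    print(f"{name}: {delay * 1e9:.3f} ns, {delay * F_REF:.3f} reference periods")
```

The first shift is a fraction of one reference period while the second exceeds one reference period, so in general a requested skew has a whole-cycle part and a sub-cycle part.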
Referring now to
In addition, the accuracy of the delay line 320's individual delays can be enhanced by locking the delay line to the reference clock Fref in a delay locked loop, and further by any suitable tuning mechanism (shown as a “tune” input that equalizes the delays of the individual delay elements). In the exemplary delay locked loop depicted, the input reference clock Fref is compared in a phase comparator circuit 324 with a delayed version of the reference clock. This produces an output that is low pass filtered at filter 330 to produce a correction signal that is used to correct the overall delay of the delay line 320. If each delay element in the delay line is approximately equal, locking the delay line in the delay locked loop will bring the individual delays close to a desired value to accurately generate the output signals from the multiplexers. In other variations, one or more sets of phase detectors and low pass filters can be used to provide correction to the individual delay elements to further enhance the accuracy and consistency of the delay elements. Accordingly, many variations of the invention will occur to those skilled in the art upon consideration of the present teaching.
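The locking behavior described above can be sketched with a simple discrete-time model. This is only an illustrative model under stated assumptions, not the disclosed circuit: it assumes an idealized linear phase comparator, a first-order loop in which the filtered error directly adjusts the total delay, and a lock target of exactly one reference period. The function name, loop gain, and starting delay are hypothetical.

```python
def lock_delay_line(t_ref: float, initial_total_delay: float,
                    loop_gain: float = 0.2, n_steps: int = 200) -> list:
    """Simulate a first-order delay locked loop and return the total delay per step.

    Each step compares the delay line's total delay with the reference period
    (idealized phase comparator) and applies a scaled correction (the filtered
    control signal). Assumes 0 < loop_gain < 1 so the loop converges smoothly.
    """
    total = initial_total_delay
    history = []
    for _ in range(n_steps):
        error = t_ref - total        # idealized phase comparator output
        total += loop_gain * error   # correction applied to the overall delay
        history.append(total)
    return history


T_REF = 2e-9               # period of a 500 MHz reference clock, in seconds
START = 2.6e-9             # hypothetical unlocked total delay, in seconds

trace = lock_delay_line(T_REF, START)
print("total delay after 1 step   (s):", trace[0])
print("total delay after 200 steps (s):", trace[-1])
```

Once the total delay has settled, equal delay elements each contribute the same fraction of the reference period, which is what makes the tap positions predictable.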
In another variation of the present invention, the multiplexer (e.g., multiplexer 128) can be integrated into the delay line circuit itself. This is illustrated in
Thus, transmission gate switches 412, 414, 416, . . . , 418 and 420 are coupled to the outputs of the plurality of delay elements 402, 404, 406, . . . , 408 and 410. Each of the transmission gate switches is controlled by a control signal that either effectively open circuits or short circuits the switch. These control signals may be individually brought out of the integrated circuit, or they may be processed by a decoder circuit 430 to reduce the number of input/output lines associated with the integrated circuit.
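The role of a decoder such as decoder circuit 430 can be sketched as follows. The example assumes a one-hot scheme in which exactly one transmission gate is closed at a time, which is only one possible arrangement; the function names and the switch count are hypothetical.

```python
from typing import List


def decode_tap(address: int, n_switches: int) -> List[bool]:
    """Return one control signal per switch; True closes the switch (one-hot assumption)."""
    if not 0 <= address < n_switches:
        raise ValueError("address must lie in [0, n_switches)")
    return [i == address for i in range(n_switches)]


def address_lines(n_switches: int) -> int:
    """Number of binary address lines needed to select one of n_switches switches."""
    if n_switches < 1:
        raise ValueError("n_switches must be at least 1")
    return (n_switches - 1).bit_length()


N_SWITCHES = 32   # hypothetical number of transmission gate switches

controls = decode_tap(5, N_SWITCHES)
print("closed switch index:", controls.index(True))
print("control lines without decoder:", N_SWITCHES)
print("address lines with decoder:   ", address_lines(N_SWITCHES))
```

In this sketch a binary address replaces one control line per switch, which is the reduction in input/output lines that a decoder provides.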
In the above embodiments, certain accumulators are preloaded with preload values in order to establish the desired skewing of the output clock signals. With reference to
In
In order to assure that a continuous overflow is not produced for large values of PK1, a set-reset (S/R) flip-flop 546 can be provided so that an average value of Fφ1 is produced if desired. By imposing a small delay 550 between the line 116 and the reset input of the flip-flop 546, a reset can be assured if set and reset signals occur simultaneously. If the circuit of
The following relationships govern the operation of the circuit of FIG. 5 and are similar to the prior equations:
In this embodiment, it should be recognized that not only can the phase shift be skewed by use of constants PK1 and PC1, but in addition, the phase skew can be made time varying by making the values of PK1 and PC1 vary with time in any desired manner as follows:
Thus, in accordance with certain embodiments consistent with the present invention, a programmable skew clock signal generator circuit has a reference frequency accumulator clocked by a reference frequency and receiving a constant K0 as an input, the frequency accumulator having a maximum count KMAX and producing an output and an overflow output. A reference phase accumulator receives a phase offset constant C0 as an input thereto, the phase accumulator having a maximum count CMAX and producing a phase accumulator output. A first adder is clocked by the reference frequency and adds the accumulator output with a value PK1 to produce a first adder overflow output. A second adder is clocked by the first adder overflow output and adds the reference phase accumulator output with a value PC1 to produce a second adder output. A delay line is clocked by the reference signal Fref and produces a plurality of delayed reference clock signals at a plurality of tap outputs. A tap selecting circuit receives the reference phase accumulator output and selects at least one of the tap outputs in response thereto to produce an output Fφ0, and receives the second adder output and selects at least one of the tap outputs in response thereto to produce an output Fφ1; wherein a phase shift φ1 relative to Fφ0 is a function of PK1 and PC1.
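One way to see why an adder on the output of a shared accumulator can stand in for a separately preloaded accumulator is through modular arithmetic. The sketch below is a simplified illustration, assuming accumulators that wrap modulo their maximum count; it does not model the overflow timing, the S/R flip-flop, or the delay element of the circuit. The function names and numeric values are hypothetical.

```python
def accumulator_states(k: int, k_max: int, n_cycles: int, preload: int = 0) -> list:
    """States of a modulo-k_max accumulator after each of n_cycles clock edges."""
    acc = preload % k_max
    states = []
    for _ in range(n_cycles):
        acc = (acc + k) % k_max
        states.append(acc)
    return states


def offset_states(reference_states: list, offset: int, k_max: int) -> list:
    """Add a fixed offset to each reference accumulator state, wrapping modulo k_max."""
    return [(s + offset) % k_max for s in reference_states]


K_MAX = 1 << 16   # hypothetical maximum count
K0 = 7209         # hypothetical accumulator input constant
PK1 = 12345       # hypothetical offset value

reference = accumulator_states(K0, K_MAX, 1000)
via_adder = offset_states(reference, PK1, K_MAX)
via_preload = accumulator_states(K0, K_MAX, 1000, preload=PK1)
print("adder and preload sequences match:", via_adder == via_preload)
```

Because the offset is applied on every cycle rather than once at reset, the value added need not stay constant, which is consistent with the time-varying skew described for this embodiment.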
Thus, certain embodiments consistent with the present invention can provide the ability to control the phase of the output signals. Compared to other de-skew approaches that seek to provide skew correction as a post processing function, the phase control (de-skewing) of certain embodiments of the present invention can now be achieved in the frequency generation function. In certain embodiments, this can provide better phase resolution, higher level of integration, and an ability to adjust the phase over a wider range of output frequencies.
Certain embodiments consistent with the present invention can find broad application for potential use in circuits needing synchronous operation among multiple functional circuit blocks. One exemplary application is computing hardware; from low-end personal computers to high-end workstations and even supercomputers that utilize parallel processing. Other potential uses will occur to those skilled in the art, upon consideration of the present teachings.
While the present invention has been disclosed using several exemplary embodiments in which three output signals are generated, the invention itself should not be considered similarly limited. Embodiments of the present invention can be extended by repeating the circuit configurations disclosed any number of times to create any number of output clock signals having any desired phase relationship. The resolution of the clock signals generated can be extended to any desired accuracy limited only by virtue of the input clock signal frequency, the number of delay line delay elements, and the control exercised on the variation of the delay of the individual delay elements of the delay lines. Also, although circuit 104 uses accumulator preload values of zero, this is not a requirement by any means. The output frequencies of the plurality of clock generator circuits can be selected to be the same value as that of circuit 104 or different as desired or required for the particular use at hand.
While the invention has been described in conjunction with specific embodiments, it is evident that many alternatives, modifications, permutations and variations will become apparent to those of ordinary skill in the art in light of the foregoing description. Accordingly, it is intended that the present invention embrace all such alternatives, modifications and variations as fall within the scope of the appended claims.
Number | Date | Country |
---|---|---|
20040257130 A1 | Dec 2004 | US |