The following prior applications are herein incorporated by reference in their entirety for all purposes:
U.S. Patent Publication 2011/0268225 of application Ser. No. 12/784,414, filed May 20, 2010, naming Harm Cronie and Amin Shokrollahi, entitled “Orthogonal Differential Vector Signaling” (hereinafter “Cronie I”).
U.S. Patent Publication 2011/0302478 of application Ser. No. 12/982,777, filed Dec. 30, 2010, naming Harm Cronie and Amin Shokrollahi, entitled “Power and Pin Efficient Chip-to-Chip Communications with Common-Mode Resilience and SSO Resilience” (hereinafter “Cronie II”).
U.S. patent application Ser. No. 13/030,027, filed Feb. 17, 2011, naming Harm Cronie, Amin Shokrollahi and Armin Tajalli, entitled “Methods and Systems for Noise Resilient, Pin-Efficient and Low Power Communications with Sparse Signaling Codes” (hereinafter “Cronie III”).
U.S. patent application Ser. No. 13/176,657, filed Jul. 5, 2011, naming Harm Cronie and Amin Shokrollahi, entitled “Methods and Systems for Low-power and Pin-efficient Communications with Superposition Signaling Codes” (hereinafter “Cronie IV”).
U.S. patent application Ser. No. 13/542,599, filed Jul. 5, 2012, naming Armin Tajalli, Harm Cronie, and Amin Shokrollahi entitled “Methods and Circuits for Efficient Processing and Detection of Balanced Codes” (hereafter called “Tajalli I”.)
U.S. patent application Ser. No. 13/842,740, filed Mar. 15, 2013, naming Brian Holden, Amin Shokrollahi and Anant Singh, entitled “Methods and Systems for Skew Tolerance in and Advanced Detectors for Vector Signaling Codes for Chip-to-Chip Communication”, hereinafter identified as [Holden I].
U.S. Provisional Patent Application No. 61/946,574, filed Feb. 28, 2014, naming Amin Shokrollahi, Brian Holden, and Richard Simpson, entitled “Clock Embedded Vector Signaling Codes”, hereinafter identified as [Shokrollahi I].
U.S. patent application Ser. No. 14/612,241, filed Aug. 4, 2015, naming Amin Shokrollahi, Ali Hormati, and Roger Ulrich, entitled “Method and Apparatus for Low Power Chip-to-Chip Communications with Constrained ISI Ratio”, hereinafter identified as [Shokrollahi II].
U.S. patent application Ser. No. 13/895,206, filed May 15, 2013, naming Roger Ulrich and Peter Hunt, entitled “Circuits for Efficient Detection of Vector Signaling Codes for Chip-to-Chip Communications using Sums of Differences”, hereinafter identified as [Ulrich I].
U.S. patent application Ser. No. 14/816,896, filed Aug. 3, 2015, naming Brian Holden and Amin Shokrollahi, entitled “Orthogonal Differential Vector Signaling Codes with Embedded Clock”, hereinafter identified as [Holden II].
U.S. patent application Ser. No. 14/926,958, filed Oct. 29, 2015, naming Richard Simpson, Andrew Stewart, and Ali Hormati, entitled “Clock Data Alignment System for Vector Signaling Code Communications Link”, hereinafter identified as [Stewart I].
U.S. patent application Ser. No. 14/925,686, filed Oct. 28, 2015, naming Armin Tajalli, entitled “Advanced Phase Interpolator”, hereinafter identified as [Tajalli II].
U.S. Provisional Patent Application No. 62/286,717, filed Jan. 25, 2016, naming Armin Tajalli, entitled “Voltage Sampler Driver with Enhanced High-Frequency Gain”, hereinafter identified as [Tajalli III].
The following additional references to prior art have been cited in this application:
U.S. Pat. No. 6,509,773, filed Apr. 30, 2001 by Buchwald et al., entitled “Phase interpolator device and method” (hereafter called [Buchwald]).
“Linear phase detection using two-phase latch”, A. Tajalli, et al., IEE Electronic Letters, 2003, (hereafter called [Tajalli IV].)
“A Low-Jitter Low-Phase-Noise 10-GHz Sub-Harmonically Injection-Locked PLL With Self-Aligned DLL in 65-nm CMOS Technology”, Hong-Yeh Chang, Yen-Liang Yeh, Yu-Cheng Liu, Meng-Han Li, and Kevin Chen, IEEE Transactions on Microwave Theory and Techniques, Vol 62, No. 3, March 2014 pp. 543-555, (hereafter called [Chang et al.])
“Low Phase Noise 77-GHz Fractional-N PLL with DLL-based Reference Frequency Multiplier for FMCW Radars”, Herman Jalli Ng, Rainer Stuhlberger, Linus Maurer, Thomas Sailer, and Andreas Stelzer, Proceedings of the 6th European Microwave Integrated Circuits Conference, 10-11 Oct. 2011, pp. 196-199, (hereafter called [Ng et al.])
“Design of Noise-Robust Clock and Data Recovery using an Adaptive-Bandwidth Mixed PLL/DLL”, Han-Yuan Tan, Doctoral Thesis, Harvard University November 2006, (hereafter called [Tan]).
The present embodiments relate to communications systems circuits generally, and more particularly to obtaining a stable, correctly phased receiver clock signal from a high-speed multi-wire interface used for chip-to-chip communication.
In modern digital systems, digital information has to be processed in a reliable and efficient way. In this context, digital information is to be understood as information available in discrete, i.e., discontinuous values. Bits, collections of bits, and also numbers from a finite set can be used to represent digital information.
In most chip-to-chip, or device-to-device communication systems, communication takes place over a plurality of wires to increase the aggregate bandwidth. A single wire or a pair of these wires may be referred to as a channel or link, and multiple channels create a communication bus between the electronic components. At the physical circuitry level, in chip-to-chip communication systems, buses are typically made of electrical conductors in the package between chips and motherboards, on printed circuit boards (“PCBs”), or in cables and connectors between PCBs. In high frequency applications, microstrip or stripline PCB traces may be used.
Common methods for transmitting signals over bus wires include single-ended and differential signaling methods. In applications requiring high-speed communications, those methods can be further optimized in terms of power consumption and pin-efficiency. More recently, vector signaling methods have been proposed to further optimize the trade-offs between power consumption, pin efficiency and noise robustness of chip-to-chip communication systems. In such vector signaling systems, digital information at the transmitter is transformed into a different representation space in the form of a vector codeword that is chosen in order to optimize the power consumption, pin-efficiency and speed trade-offs based on the transmission channel properties and communication system design constraints. Herein, this process is referred to as “encoding”. The encoded codeword is communicated as a group of signals from the transmitter to one or more receivers. At a receiver, the received signals corresponding to the codeword are transformed back into the original digital information representation space. Herein, this process is referred to as “decoding”.
Regardless of the encoding method used, the received signals presented to the receiving device must be sampled (or their signal value otherwise recorded) at intervals best representing the original transmitted values, regardless of transmission channel delays, interference, and noise. Such Clock and Data Recovery (CDR) not only determines the appropriate sample timing, but may continue to do so continuously, providing dynamic compensation for varying signal propagation conditions.
Many known CDR systems utilize a Phase-Locked Loop (PLL) or Delay-Locked Loop (DLL) to synthesize a local receive clock having an appropriate frequency and phase for accurate receive data sampling.
To reliably detect the data values transmitted over a communications system, a receiver must accurately measure the received signal value amplitudes at carefully selected times. Various methods are known to facilitate such receive measurements, including reception of one or more dedicated clock signals associated with the transmitted data stream, extraction of clock signals embedded within the transmitted data stream, and synthesis of a local receive clock from known attributes of the communicated data stream.
In general, the receiver embodiments of such timing methods are described as Clock-Data Recovery (CDR), often based on Phase-Lock Loop (PLL) or Delay-Locked Loop (DLL) synthesis of a local receive clock having the desired frequency and phase characteristics.
In both PLL and DLL embodiments, a phase comparator compares the relative phase (and in some variations, the relative frequency) of a received reference signal and a local clock signal to produce an error signal, which is subsequently used to correct the phase and/or frequency of the local clock source and thus minimize the error. As this feedback loop behavior will lead to a given PLL embodiment producing a fixed phase relationship (as examples, 0 degrees or 90 degrees of phase offset) between the reference signal and the local clock, an additional fixed or variable phase adjustment is often introduced to permit the phase offset to be set to a different desired value (as one example, 45 degrees of phase offset) to facilitate receiver data detection.
Below, methods and systems are described for receiving N phases of a local clock signal and M phases of a reference signal, wherein M is an integer greater than or equal to 1 and N is an integer greater than or equal to 2, generating a plurality of partial phase error signals, each partial phase error signal formed at least in part by comparing (i) a respective phase of the M phases of the reference signal to (ii) a respective phase of the N phases of the local clock signal, and generating a composite phase error signal by summing the plurality of partial phase error signals, and responsively adjusting a fixed phase of a local oscillator using the composite phase error signal.
In some embodiments, M=1, and N partial phase error signals are summed to generate the composite phase error signal. Alternatively, the plurality of partial phase error signals includes M=N partial phase error signals, and wherein a given phase of the N phases of the local clock signal and a given phase of the M phases of the reference signal are each used to generate a single partial phase error signal. In further alternative embodiments, the plurality of partial phase error signals includes M×N partial phase error signals, and wherein each phase of the N phases of the local clock signal is compared to each phase of the M phases of the reference signal.
In some embodiments, each partial phase error signal of the plurality of partial phase error signals has a corresponding weight applied to it. In some embodiments, the weights are selected according to an M×N matrix.
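As one non-limiting illustration of the weighted summation described above, the following Python sketch models each partial phase comparison as an ideal linear detector returning a wrapped phase difference in degrees, and forms the composite phase error from an M×N weight matrix. The function names, the linear detector model, and the numeric values are assumptions made for illustration only and are not taken from any described circuit.

```python
from typing import Sequence


def wrap_deg(x: float) -> float:
    """Wrap a phase difference in degrees into the interval [-180, 180)."""
    return (x + 180.0) % 360.0 - 180.0


def composite_phase_error(
    ref_deg: Sequence[float],
    clk_deg: Sequence[float],
    weights: Sequence[Sequence[float]],
) -> float:
    """Sum weighted partial phase errors over an M x N comparison matrix.

    ref_deg holds the M reference phases, clk_deg the N local clock phases,
    and weights[i][j] the weight applied to the comparison of reference
    phase i with local clock phase j (zero disables that comparison).
    """
    total = 0.0
    for i, ref in enumerate(ref_deg):
        for j, clk in enumerate(clk_deg):
            w = weights[i][j]
            if w != 0.0:
                # Idealized partial phase comparator (an assumption).
                total += w * wrap_deg(ref - clk)
    return total


# M = N = 4 with one comparison per phase pair (diagonal weight matrix).
ref = [0.0, 90.0, 180.0, 270.0]
clk = [p + 5.0 for p in ref]  # local clock leads every reference phase by 5 degrees
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
diagonal_error = composite_phase_error(ref, clk, identity)

# M = 1, N = 2: one reference phase compared against two local clock phases.
single_ref_error = composite_phase_error([0.0], [10.0, -10.0], [[1.0, 1.0]])
print(diagonal_error, single_ref_error)
```

In this model a diagonal weight matrix corresponds to the M=N case, a single-row matrix to the M=1 case, and a fully populated matrix to the M×N case.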
In some embodiments, the M phases of the reference signal are received from a delay-lock loop operating on an input reference signal.
In some embodiments, at least one of the N phases of the local clock signal is generated using a phase interpolator operating on local oscillator signals and a phase offset signal. In some embodiments, generating at least one of the N phases of the local clock signal includes interpolating 4 phases using 4 differential pairs in the phase interpolator, each of the 4 phases being interpolated according to a corresponding differential pair connected to an independently tunable current source.
In some embodiments, at least one partial phase error signal is generated using a pair of flip-flops, wherein a first flip-flop of the pair of flip-flops is clocked using a given phase of the M phases of the reference signal and a second flip-flop is clocked using a given phase of the N phases of the local clock signal.
In some embodiments, each partial phase error signal is an analog signal generated using a respective charge pump, the respective charge pump receiving respective charge pump control signals generated by a respective comparison between the respective phase of the M phases of the reference signal and the respective phase of the N phases of the local clock signal.
Embodiments are described in which the Phase Detection and phase adjustment elements are combined, leading to lower circuit node capacitance and reduced circuit delays, these improvements in turn enabling increased loop stability and improved PLL lock characteristics, including increased loop lock bandwidth leading to lower clock jitter and improved power supply noise rejection.
Embodiments are also described in which a Delay-Locked Loop is used to convert the received reference clock signal into multiple reference clock phases, converting the PLL phase comparison operation into multiple comparisons made between a reference clock phase and a local clock phase. A summation or weighted summation of the multiple comparison results is then used as the error feedback signal for the PLL. A further embodiment is described in which multiple comparisons are made between a single received reference clock phase and multiple local clock phases, with the weighted sum of the multiple comparison results used as the error feedback term for the PLL. In at least one such further embodiment, said weighted sums comprise a two dimensional time domain filter.
As described in [Cronie I], [Cronie II], [Cronie III] and [Cronie IV], vector signaling codes may be used to produce extremely high bandwidth data communications links, such as between two integrated circuit devices in a system. As illustrated by the embodiment of
Individual symbols, e.g. transmissions on any single communications channel, may utilize multiple signal levels, often three or more. Operation at channel rates exceeding 10 Gbps may further complicate receive behavior by requiring deeply pipelined or parallelized signal processing, precluding reception methods in which the previous received value is known as the current value is being received.
Embodiments described herein can also be applied to prior art permutation sorting methods not covered by the vector processing methods of [Cronie II], [Cronie III], [Cronie IV], and/or [Tajalli I]. More generally, embodiments can apply to any communication or storage methods requiring coordination of multiple channels or elements of the channel to produce a coherent aggregate result.
Receiver Data Detection
To provide context for the following examples, one typical high-speed receiver embodiment [Stewart I] is used for illustrative purposes, without limitation.
As illustrated in
As described in [Tajalli I], [Holden I] and [Ulrich I], vector signaling codes may be efficiently detected by linearly combining sets of input signals using Multi-Input comparators or mixers (MIC). For the 5b6w code used by the example receiver, five such mixers acting on weighted subsets of the six received data input signals will detect the five data bits without need of further decoding. One additional mixer acting on combinations of the two received clock signals will similarly detect the clock signal. In
Because of the high data rates involved, multiple parallel phases of receive processing may be used in the example receiver. In one embodiment, the five detected data signals MIC0-MIC4 are processed in four parallel phases of receive data processing, each phase 230 including five data samplers and subsequent buffering, followed by recombination of the four phase outputs into a received data stream, shown in
Clock Recovery circuits (also known in the art as Clock Data Recovery or CDR) support such sampling measurements by extracting timing information, either from the data lines themselves or from dedicated clock signal inputs, and utilize that extracted information to generate clock signals to control the time interval used by the data line sampling device(s). The actual clock extraction may be performed using well known circuits such as a Phase Locked Loop (PLL) or Delay Locked Loop (DLL), which in their operation may also generate higher frequency internal clocks, multiple clock phases, etc. in support of receiver operation. In the embodiment of
PLL Overview
Phase Locked Loops are well represented in the literature. A typical PLL is composed of a phase comparator that compares an external reference signal to an internal clock signal, a low pass filter that smooths the resulting error value to produce a clock control signal, and a variable frequency clock source (typically, a Voltage Controlled Oscillator or VCO) controlled by the smoothed error value, producing the internal clock signal presented to the phase comparator. In a well-known variation, such a PLL design may incorporate a clock frequency divider between the VCO and the phase comparator, allowing a higher-frequency clock output to be phase locked to a lower-frequency reference signal.
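The loop just described may be sketched, purely as an idealized behavioral model, in a few lines of Python. The linear (unwrapped) phase comparator, the first-order low pass filter, and every parameter value below (`f0`, `kvco`, `alpha`, `dt`, the divide ratio of 4) are assumptions chosen so that the discrete-time loop is stable; they do not represent any particular embodiment.

```python
def simulate_pll(
    f_ref: float = 1.0,   # reference frequency, cycles per unit time
    divide: int = 4,      # feedback divider between VCO and phase comparator
    f0: float = 3.9,      # VCO free-running frequency
    kvco: float = 20.0,   # VCO gain, frequency change per unit of control
    alpha: float = 0.2,   # low pass filter coefficient (0 < alpha <= 1)
    dt: float = 0.01,     # simulation time step
    steps: int = 500,
):
    """Idealized PLL: phase comparator -> low pass filter -> VCO -> divider."""
    ref_phase = 0.0   # accumulated reference phase, in cycles
    vco_phase = 0.0   # accumulated VCO phase, in cycles
    ctrl = 0.0        # smoothed error value controlling the VCO
    vco_freq = f0
    err = 0.0
    for _ in range(steps):
        ref_phase += f_ref * dt
        vco_phase += vco_freq * dt
        # Phase comparator: reference versus divided VCO phase.
        err = ref_phase - vco_phase / divide
        # First-order low pass filter smooths the error value.
        ctrl += alpha * (err - ctrl)
        # The smoothed error sets the VCO frequency.
        vco_freq = f0 + kvco * ctrl
    return vco_freq, err


final_freq, final_err = simulate_pll()
print(final_freq, final_err)
```

With these assumed values the VCO settles at four times the reference frequency, which is the divider behavior described above, while a small static phase error remains because the modeled filter has no integrating term.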
In an alternative embodiment, the variable frequency clock source is replaced by a variable delay element, its (optionally multiple tapped) outputs thus representing one or more successive time-delayed versions of the original input signal rather than successive cycles of an oscillator to be phase compared to the reference input signal. For the purposes of this document, such Delay Locked Loops (DLL) are considered functionally equivalent to a PLL in such an application, particularly in regard to comprised elements of phase comparator, phase interpolator, and charge pump.
Numerous forms of phase comparators are known to the art. A simple XOR gate as in
The more complex state machine phase comparator of
As shown in
As will be recognized by those familiar with the art, comparable functional operation may be obtained regardless of the phase comparator type incorporated in a PLL design, thus to first approximation phase comparator choice is not limiting. Secondary design behaviors, including lock time, stability, power consumption, etc. must also be considered as part of the design process.
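As one illustration of a phase comparator transfer characteristic, the following Python sketch samples two ideal 50% duty cycle square waves and averages their exclusive-OR. The sampling approach, the function names, and the restriction to integer-degree offsets are assumptions made for this example; a practical XOR comparator is an analog circuit whose averaging is performed by the loop filter.

```python
def square_wave(sample: int, samples_per_period: int) -> int:
    """Return 1 during the first half of each period and 0 during the second."""
    return 1 if (sample % samples_per_period) < samples_per_period // 2 else 0


def xor_average(offset_deg: int, samples_per_period: int = 360) -> float:
    """Average XOR output for a local clock delayed by offset_deg degrees."""
    shift = offset_deg * samples_per_period // 360
    high = 0
    for k in range(samples_per_period):
        ref = square_wave(k, samples_per_period)
        clk = square_wave(k - shift, samples_per_period)
        high += ref ^ clk
    return high / samples_per_period


for offset in (0, 45, 90, 180):
    print(offset, xor_average(offset))
```

In this model the averaged output rises linearly from 0 at zero offset to 1 at 180 degrees, so a loop that settles at the mid-scale value holds the two inputs 90 degrees apart.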
Receiver Clock Recovery
The example receiver utilizes a PLL embodiment as shown in
In at least one embodiment, a ring oscillator 340 composed of a sequence of identical gates in a closed loop is used as the internal Voltage Controlled Oscillator (VCO) timing source for the PLL. The VCO frequency is varied by analog adjustment of at least one of: gate propagation delay, inter-gate rise and fall time, and gate switching threshold within the ring oscillator. This may be implemented via switched capacitor banks, where a digital control signal is applied to selectively place capacitive elements in parallel and/or series combinations to alter an RC time constant, as one non-limiting example. Still further, a current source that drives a gate of the ring oscillator may be increased or decreased to alter the output switching rise-time/fall-time, and thereby adjust the effective delay. Outputs taken at equal intervals (i.e. separated by equal numbers of ring oscillator gates) along the sequence of gates comprising the ring oscillator provide the four data phase sampling clocks, herein identified as the 0, 90, 180, and 270 degree clocks.
In one embodiment, the ring oscillator is composed of eight identical sets of logic gates (e.g., a set of inverter circuits), thus the phase difference from one such set to the next is 45 degrees. In this embodiment, the 0, 90, 180, and 270 degree outputs may be obtained, as examples, from the second, fourth, sixth, and eighth outputs. As many variations of such designs are known in the art, neither the number of elements in the ring oscillator nor the specific taps at which particular outputs are made should be construed as implying a limitation. As one example, the location of the 0 degree tap is arbitrary, as one familiar with the art will recognize that normal PLL behavior will phase align the ring oscillator with the external phase reference regardless of its initial phase. Similarly, equivalent designs may be obtained in which the output clock phases do not have square wave duty cycles; as one example being produced by the action of AND or OR gates with inputs from different tap locations. In the example receiver, it is desired that the VCO operate at a multiple of the received reference clock frequency, thus Frequency Divider 350 divides the VCO outputs by a comparable amount prior to the phase comparator. In one embodiment, binary (factor of two) dividers are used at 350 to obtain the correct sampling clock rate. In another embodiment, no divider is used, and the VCO outputs are presented to the phase interpolator directly.
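The tap arithmetic of this example may be checked with a short Python sketch. It assumes, as the text does, that each of the identical stages contributes an equal share of the 360 degree cycle, and it reports phases relative to the first selected tap because the location of the 0 degree tap is arbitrary. The function name is hypothetical.

```python
from typing import List, Sequence


def ring_tap_phases(num_stages: int, taps: Sequence[int]) -> List[float]:
    """Phase of each selected tap, in degrees, relative to the first tap.

    Taps are numbered from 1; each stage is assumed to add an equal
    360 / num_stages degrees of phase.
    """
    step = 360.0 / num_stages
    return [((t - taps[0]) * step) % 360.0 for t in taps]


# Eight stages, outputs taken at the second, fourth, sixth, and eighth stage.
phases = ring_tap_phases(8, [2, 4, 6, 8])
print(phases)
```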
Each of the four phases of sampling clocks is appropriately timed to sample received data for one of the four parallel processing phases. In particular, internal clock ph000 is aligned to optimally trigger data samplers in the phase0 phase of processing, clock ph090 in phase1, clock ph180 in phase2, and clock ph270 in phase3.
To allow the overall phase of the locked PLL signals to be offset from the reference clock input phase, the local clock output presented to the phase comparator is obtained from phase interpolator 360, the output phase of which is controllably intermediate between its input clock phases. Thus, the PLL may lock with its fixed phase relationship, while the internal clock signals obtained from ring oscillator 340 will be offset from that fixed phase by the phase delay amount introduced by phase interpolator 360, as controlled by signal Phase offset correction. Phase interpolators are known in the art, examples being provided by [Buchwald] and [Tajalli II].
In one embodiment, phase interpolator 360 receives multiple local clock phases from the ring oscillator 340 having 90 degree phase differences. Said phase interpolator may be controlled to select two adjacent clock input phases and then to interpolate between them so as to produce an output at a chosen phase offset between the two selected values. For purposes of description, it may be assumed that a phase comparator design is used which drives the PLL to lock with a zero phase differential between the two phase comparator inputs. Thus, continuing the example, applying the 0 and 90 degree clock phases as inputs to the phase interpolator allows adjustment such that the PLL leads the reference clock input by between 0 and 90 degrees.
It will be apparent that equivalent results with comparable phase offsets may be obtained using other pairs of degree clocks and/or other phase comparator designs, which as previously described may lock with different phase differentials than that of the present example. Thus, neither the particular phase clocks chosen nor the particular phase comparator design described herein are limiting.
Phase Comparator with Interpolator
As communication channel data rates increase, it becomes increasingly difficult to maintain acceptable PLL lock range and accuracy, as inherent and parasitic circuit node capacitances introduce circuit delays and constrain the effective loop response bandwidth. An embodiment providing improved response characteristics suitable for such high-speed operation is illustrated in
As with conventional designs, the PLL VCO (or a clock divider driven by said VCO) provides the local oscillator inputs to phase interpolator elements 510 and 515, which together set the effective local clock phase. Four local oscillator phases with 90-degree offset are shown, i.e., equivalent to two phases in quadrature relationship and their complementary signals, and thus identified as +I, +Q, and −I, −Q, permitting a full 360 degree or “four quadrant” phase adjustment. Other embodiments may utilize as few as two local oscillator phases, may use oscillator phases having other than 90-degree phase differences, or may select clock phases from an input set of more than four; as one non-limiting example, choosing at least two clock phases to be interpolated between from an input set of eight clock phases.
In a first embodiment, phase interpolator element 510 includes four mixing elements, each mixing element comprising a differential transistor pair and a controlled current source, with a common differential output driven by the four mixing elements in parallel. Thus, configuration of current source IA(i) controls the amount of local oscillator phase +I presented to the common output Ckp; similarly, current source IA(−i) controls the amount of complementary output phase −I in the output, IA(q) controls the amount of +Q, and IA(−q) controls the amount of −Q. It will be readily apparent to one familiar with the art that configuration of the four current sources can produce an output clock at Ckp having any desired phase relationship to the PLL local clock input.
Similarly, phase interpolator element 515 current sources IB(i), IB(−i), IB(q), and IB(−q) may be configured to obtain an output clock at Ckn having any desired phase relationship to the PLL local clock input. In some embodiments, CkPLLp and CkPLLn may be configured to have complementary relationships to provide phase comparator 520 with balanced and complementary positive- and negative-going current amplitudes. However, configuration with non-complementary IA and IB values may be performed to obtain particular results. As one example offered without limitation, an embodiment separately adjusting IA and IB values might obtain higher resolution phase adjustment, compared to an embodiment maintaining perfectly complementary IA and IB values.
The second input to the phase comparator 520 is external reference clock CkRef+/CkRef−, producing the phase error output currents VCOctl+/VCOctl−. In one advanced embodiment, the two external reference clocks are of opposing polarity but not necessarily complementary phase, thus the positive polarity comparison and negative polarity comparison represent different phase comparisons. Such an advanced embodiment may be combined with non-complementary IA and IB bias configurations, providing independent adjustment of local clock phase during those different phase comparisons. That is, in one embodiment, the CkRef input at the top of PD 520 is a first phase selected from the reference clock phases available in the circuit, and the IA currents are adjusted to provide a corresponding interpolated phase offset from the first selected phase, and the CkRef input at the bottom of PD 520 is a second phase selected from the reference clock phases available in the circuit, and the IB currents are adjusted to provide a corresponding interpolated phase offset from the second selected phase, wherein the amounts of the relative phase offsets are the same.
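The relationship between the four current source settings and the resulting output phase may be illustrated with an idealized phasor model in Python. The sketch assumes sinusoidal local oscillator phases and treats each current source setting as a non-negative mixing weight; the function names and the cosine/sine weight rule are hypothetical choices for illustration and are not the described circuit's control law.

```python
import math
from typing import Tuple


def interpolated_phase(i_pos: float, i_neg: float, q_pos: float, q_neg: float) -> float:
    """Output phase in degrees for the given +I, -I, +Q, -Q mixing weights."""
    x = i_pos - i_neg  # net in-phase contribution
    y = q_pos - q_neg  # net quadrature contribution
    return math.degrees(math.atan2(y, x)) % 360.0


def weights_for_phase(target_deg: float) -> Tuple[float, float, float, float]:
    """One possible (assumed) weight choice that yields target_deg."""
    c = math.cos(math.radians(target_deg))
    s = math.sin(math.radians(target_deg))
    # Only one source of each complementary pair is active at a time.
    return (max(c, 0.0), max(-c, 0.0), max(s, 0.0), max(-s, 0.0))


# Equal amounts of +I and +Q land halfway between the 0 and 90 degree phases.
print(interpolated_phase(1.0, 0.0, 1.0, 0.0))

# A third-quadrant target activates only the -I and -Q sources.
print(weights_for_phase(200.0), interpolated_phase(*weights_for_phase(200.0)))
```

Because each complementary pair contributes a signed amount along one axis, the four weights together reach any of the four quadrants, which is the full 360 degree adjustment noted above.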
Configuration of phase interpolator current source values may be performed by external control logic, including without limitation, a hardware configuration register, control processor output register, and hardware CDR adjustment logic.
Phase comparator 520 in the embodiment of
Phase comparator 1220 is also driven by received reference clocks CkRef+ and CkRef−, producing phase comparison results Phase Error (+) and Phase Error (−). In some embodiments, the circuit node labeled Circuit Balance Feedback may be monitored to determine the relative DC component of the interpolated clock signals, which may then be modified by adjustment of the configured current source values in 510 and 515. In some embodiments, each current source IA and IB receives seven control bits. It should be noted that embodiments are not limited to receiving seven control bits, and that any number of control bits may be implemented according to design constraints for PI resolution, for example. In some embodiments, current sources IA and IB are equal (e.g., IA=IB for +/−i, q). In such embodiments, the PIs 510 and 515 have 7 bits of resolution. In alternative embodiments, additional resolution may be implemented by introducing a shift in IB with respect to IA, or vice versa. In an exemplary embodiment, IA=IB+8, where 8 is a decimal shift added to the control bits of each current source IB to obtain the control bits of each current source IA. In such embodiments, the P-side PI 510 and N-side PI 515 are receiving two different VCO phases, and the phase comparator collects information from different phases of the VCO. Since the PIs 510 and 515 combine information from different phases of the VCO, the PLL has more detailed information about the phases of the PLL, and the bandwidth of the PLL is higher than that of a conventional PLL.
Embodiments for which IA=IB+shift are a special case of a matrix phase comparator in which there are two partial phase comparators. The first partial phase comparator (N-side XOR) compares the phase of the reference with one set of VCO feedback phases, and a second partial phase comparator (P-side XOR) compares the reference clock phase with a second set of VCO feedback phases. Matrix phase comparator embodiments are described in further detail below.
In some embodiments, a folded structure as shown in
In some embodiments, the second 180 degrees (4) may be used to provide circuit balance feedback, as shown in
The phase comparator of [Tajalli IV] may alternatively be used at 520 or 1220, providing equivalent phase detection with enhanced signal headroom in embodiments utilizing low power supply voltages. Other phase comparators, including all variations shown in
As one example of such alternative embodiment, the State Machine Phase/Frequency Detector of
Substituting the clocked latch circuit of
It should be noted that in this one embodiment the majority of phase interpolator 715 is functionally disabled and retained only to preserve the same parasitic load characteristics as are presented by active phase interpolator 725, to maximize circuit symmetry and maintain balanced loading characteristics to minimize secondary effects such as detection bias and drift.
Integrated Phase Comparator, Interpolation, and Charge Pump
As previously described, PLL phase comparator outputs are typically used to drive a charge pump circuit (CPC), the output of which is an analog error signal used to control the VCO. The described improvement from reduced capacitance and resulting higher circuit speed in integrating the PLL phase comparator and clock adjustment phase interpolator may be further extended by also integrating elements of the charge pump in the same manner.
In this combined embodiment, the charge pump control signals UPp, UPn, DOWNp, and DOWNn provided by the embodiment shown in
Other embodiments may be obtained by equivalent combination of phase comparator, phase interpolator, and charge pump elements.
Oversampling of Input Reference Signal
The asymmetric use of the phase interpolators in, as one example,
In the known art, [Tan] described a combined DLL/PLL structure, in which the voltage controlled delay line incorporated in the PLL VCO is duplicated as an input delay line acting on the reference clock input, and controlled by a single feedback error signal. [Ng et al.] and [Chang et al.] also describe use of a front-end DLL to serve as a frequency multiplier to facilitate generation of very high frequency clocks.
However, if such a controlled delay line is tapped, and so configured that the differential delay between taps is proportional to the time between received clock edges, a received clock passing through such a delay line produces a resulting set of outputs which take on some of the characteristics of a multiphase clock. As one example offered without limitation, the equal-interval outputs of a four-tap delay line having an overall delay comparable to the reference clock period will provide outputs having similar characteristics to quadrature phased clock signals. Continuing this example, if each such output is phase compared to an appropriately-selected local clock phase, a series of phase error results will be produced which may be combined to produce a more accurate aggregate clock error signal for the PLL VCO. The delayed versions of the receive clock represent additional opportunities for phase comparison with a clock derived from the VCO, thus providing a higher update rate for the controlled loop, and thus improved PLL loop bandwidth leading to reduced jitter and better noise immunity. That is, using this technique, the update rate of the loop will be increased, which in turn enables the circuit to track and correct the effects of noise and jitter at higher frequencies.
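The four-tap example may be worked through numerically. The Python sketch below assumes ideal, equally spaced taps and expresses each tap output as a phase of the reference clock; the function name and the normalized period of 1.0 are hypothetical choices for illustration.

```python
from typing import List


def delay_line_tap_phases(period: float, total_delay: float, num_taps: int) -> List[float]:
    """Phase, in degrees of the reference period, at each equally spaced tap."""
    tap_delay = total_delay / num_taps
    return [((k * tap_delay) / period * 360.0) % 360.0 for k in range(1, num_taps + 1)]


# Overall delay equal to one reference period: quadrature-like outputs.
matched = delay_line_tap_phases(period=1.0, total_delay=1.0, num_taps=4)
print(matched)

# Overall delay 20% short of the period: taps drift away from quadrature.
mismatched = delay_line_tap_phases(period=1.0, total_delay=0.8, num_taps=4)
print(mismatched)
```

The second call suggests why the tap delays need to track the reference period: when the overall delay departs from one period, the tap outputs no longer line up with quadrature local clock phases. With four such taps, four comparison opportunities arise per reference period rather than one.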
For the delayed phase comparisons to provide meaningful information to the PLL, the delay intervals provided by the delay line must be coordinated with the period between local clock phases, with such controls giving the delay element many of the aspects of a Delay-Locked Loop (DLL). As seen in the block diagram of
Within PLL 300, the previous simple phase comparison (320 of
In some system environments, the described multi-phase reference clock may be directly available from the receiver, as one example where the communications protocol incorporates multiple clock signals.
The additional feedback information provided by the multiple comparison operations may also be obtained without the previously-described DLL front end.
In a further embodiment, summation 935 is performed by a weighted summation node comparable to the previously described MIC mixer, the different selected weights of said summation allowing further control of PLL static and dynamic operational characteristics. In particular, such weight adjustments may be used to produce additional closed-loop poles and/or zeroes in the PLL time domain transfer function, providing additional control of loop stability.
XOR(CKREF, VCO'000)
XOR(CKREF, VCO'090)
XOR(CKREF, VCO'180)
XOR(CKREF, VCO'270)
As shown in
It should be noted that in array-XOR embodiments, some comparisons might be done using XNORs. Accordingly, the choice of XOR or XNOR for each phase comparison should be made carefully to ensure system stability.
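As one illustration offered without limitation, the averaged output of an XOR phase comparison between two ideal square waves may be modeled as shown below. The sketch assumes 50% duty cycle signals of equal frequency; the names square, xor_average, and xnor_average are hypothetical and do not correspond to any element of the described embodiments.

```python
def square(t, period, phase_deg=0.0):
    """Ideal 50% duty-cycle square wave: 1 during the first half of each period."""
    frac = (t / period - phase_deg / 360.0) % 1.0
    return 1 if frac < 0.5 else 0


def xor_average(phase_deg, samples=3600):
    """Average of XOR(reference, delayed clock) over one period (0.0 to 1.0)."""
    period = 1.0
    total = 0
    for i in range(samples):
        # Sample at mid-points so no sample lands exactly on a clock edge.
        t = (i + 0.5) * period / samples
        total += square(t, period) ^ square(t, period, phase_deg)
    return total / samples


def xnor_average(phase_deg, samples=3600):
    """XNOR is the complement of XOR, so its average is one minus the XOR average."""
    return 1.0 - xor_average(phase_deg, samples)


# Average XOR output for local clock phases of 0, 90, 180, and 270 degrees.
averages = {p: xor_average(p) for p in (0, 90, 180, 270)}
```

In this idealized model the XOR average rises with phase offset between 0 and 180 degrees and falls between 180 and 360 degrees, while the XNOR average has the opposite slope. Substituting one gate for the other therefore inverts the sign of that comparison's contribution to the loop error, which is why the selection bears on stability.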
In at least one embodiment, the weights of said summation are configured such that they decline in proportion to the timing difference of the comparison clock phase relative to the PLL “normal lock” phase. As one example offered without limitation, if ph090 is the normal lock phase of the PLL, the comparison of ph090 and the received reference signal is weighted 1; comparisons of ph000 and ph180 (i.e., one tap offset from the normal lock phase) are weighted ½; comparison of the received reference signal and ph270 (two tap offsets from the normal lock phase) is weighted ¼; etc. These various weighted comparison results are then summed to produce a composite signal which, when low pass filtered 330, is the Error value controlling PLL VCO 340.
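As one illustration offered without limitation, the weighting of the preceding example may be sketched as follows. The rule of halving the weight for each tap of offset from the normal lock phase is an assumption chosen to reproduce the example weights; the function names and the partial error values are hypothetical.

```python
def tap_weights(phases_deg, lock_phase_deg, tap_step_deg=90):
    """Weight each comparison phase by 1/2 per tap of offset from the lock phase.

    The halving rule is an assumption that matches the example in the text.
    """
    weights = {}
    for phase in phases_deg:
        offset_taps = abs(phase - lock_phase_deg) // tap_step_deg
        weights[phase] = 0.5 ** offset_taps
    return weights


def composite_error(partial_errors, weights):
    """Weighted sum of per-phase comparison results (before low pass filtering)."""
    return sum(weights[phase] * err for phase, err in partial_errors.items())


# ph090 is the normal lock phase in this example.
weights = tap_weights([0, 90, 180, 270], lock_phase_deg=90)

# Hypothetical partial phase errors, one per comparison, in arbitrary units.
errors = {0: 1.0, 90: 1.0, 180: 1.0, 270: 1.0}
total = composite_error(errors, weights)
```

Other weight profiles may equally be used; the dictionary returned by tap_weights is merely one convenient way to associate each local clock phase with its summation weight.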
In at least one embodiment utilizing multiple phase comparators, the deterministic jitter produced by the multiple phase comparisons was seen to occur at a 12.5 GHz rate with equal phase comparator weights. Even though the amount of jitter was very small and the jitter rate was well above the loop filter cutoff frequency, the deterministic jitter was significantly reduced with the described weight adjustments, in which weight magnitudes decline in proportion to their offset distance from the primary reference signal sample. In some embodiments, different weighted values are used in a comparator circuit to construct a discrete time domain filter. This property can be used to simplify the design of analog filter 330. For example, with proper weighting values one might construct a discrete time domain zero in the transfer function that provides conditions to make the loop robust.
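As one illustration offered without limitation, if successive weighted comparison results are modeled as the taps of a finite impulse response filter operating at the comparison rate (a modeling assumption, not a description of any particular embodiment), the location of a discrete time domain zero may be examined as shown below. The function name fir_response and the weight sets are hypothetical.

```python
import cmath
import math


def fir_response(weights, omega):
    """Frequency response of H(z) = sum_k weights[k] * z**-k at z = exp(j*omega).

    omega is in radians per sample, where one sample is one phase comparison.
    """
    return sum(w * cmath.exp(-1j * omega * k) for k, w in enumerate(weights))


# Symmetric declining weights place a zero at half the comparison rate.
declining = [0.5, 1.0, 0.5]
gain_dc = abs(fir_response(declining, 0.0))
gain_half_rate = abs(fir_response(declining, math.pi))

# Equal weights over the same three taps do not null that frequency.
equal_half_rate = abs(fir_response([1.0, 1.0, 1.0], math.pi))
```

In this model the weights ½, 1, ½ give H(z) = ½(1 + z⁻¹)², which has a double zero at z = −1, so a disturbance alternating at half the comparison rate is suppressed before it reaches the analog loop filter. Different weight choices move the zero elsewhere.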
As with previously described examples, other embodiments may be obtained by equivalent combination of phase comparator, phase interpolator, and charge pump elements.
Matrix Phase Comparisons
The multi-phase comparison of multiple phases derived from a received reference signal and multiple phases derived from the local PLL clock may be generalized into a matrix phase comparator, one embodiment of which is shown in
In a full matrix comparison, each of M phases derived from the received reference signal is separately phase compared with each of the N phases derived from the local clock, which may be received from a PLL, or alternatively directly from a VCO or various other clock sources. For purposes of illustration, the N phases of the local clock are received from the PLL. Each resulting phase error signal is weighted by a configured or predetermined amount, with all (M*N) weighted results summed to produce an aggregate error result. An example of one phase comparator is shown in
An embodiment of the complete matrix phase comparator 1120 in
One familiar with the art will observe that the previously-described multi-phase comparator 920 of
In some embodiments, M=1, and N partial phase error signals are summed to generate the composite phase error signal. Alternatively, the plurality of partial phase error signals includes M=N partial phase error signals, and a given phase of the N phases of the local clock signal and a given phase of the M phases of the reference signal are each used to generate a single partial phase error signal. In further alternative embodiments, the plurality of partial phase error signals comprises M×N partial phase error signals, and each phase of the N phases of the local clock signal is compared to each phase of the M phases of the reference signal.
In some embodiments, each partial phase error signal of the plurality of partial phase error signals has a corresponding weight applied to it.
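As one illustration offered without limitation, the weighted aggregation of partial phase error signals may be sketched as follows. Each partial phase error is modeled here as a wrapped phase difference in degrees, which is a simplifying assumption; the names wrap_deg and matrix_phase_error, and all numeric values, are hypothetical. Pairs whose nominal phases differ are given zero weight in these examples so that only aligned pairs contribute.

```python
def wrap_deg(x):
    """Wrap a phase difference into the range [-180, 180) degrees."""
    return (x + 180.0) % 360.0 - 180.0


def matrix_phase_error(ref_phases, local_phases, weights):
    """Sum weights[i][j] * (ref_phases[i] - local_phases[j]) over all M x N pairs."""
    total = 0.0
    for i, ref in enumerate(ref_phases):
        for j, local in enumerate(local_phases):
            total += weights[i][j] * wrap_deg(ref - local)
    return total


local = [0.0, 90.0, 180.0, 270.0]

# M = N = 4 with one comparison per pair of phases (diagonal weights);
# every reference phase leads its local counterpart by 5 degrees.
ref = [5.0, 95.0, 185.0, 275.0]
diagonal = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
err_diag = matrix_phase_error(ref, local, diagonal)

# M = 1: a single reference phase, with only the aligned local phase weighted.
err_single = matrix_phase_error([5.0], local, [[1.0, 0.0, 0.0, 0.0]])
```

A full M×N comparison corresponds to populating the off-diagonal weights as well; in that case each comparison element would ordinarily be arranged around its own nominal phase offset, a detail this sketch omits.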
In some embodiments, the M phases of the reference signal are received from a delay-lock loop operating on an input reference signal.
In some embodiments, at least one of the N phases of the local clock signal is generated using a phase interpolator operating on local oscillator signals and a phase offset signal. In some embodiments, generating at least one of the N phases of the local clock signal comprises interpolating 4 phases using 4 differential pairs in the phase interpolator, each of the 4 phases being interpolated according to a corresponding differential pair connected to an independently tunable current source.
In some embodiments, at least one partial phase error signal is generated using a pair of flip-flops, wherein a first flip-flop of the pair of flip-flops is clocked using a given phase of the M phases of the reference signal and a second flip-flop is clocked using a given phase of the N phases of the local clock signal.
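As one illustration offered without limitation, the behavior of a two flip-flop phase comparison may be modeled as shown below. The sketch assumes the conventional arrangement in which the reference-clocked flip-flop asserts an UP output, the local-clocked flip-flop asserts a DOWN output, and both are cleared as soon as both are asserted; that reset arrangement, the function name, and the edge times are assumptions made for illustration.

```python
def pfd_net_up_time(ref_edges, local_edges):
    """Net time UP is asserted minus time DOWN is asserted, over the given edges.

    Positive result: reference edges lead the local clock edges.
    Negative result: reference edges lag the local clock edges.
    """
    events = sorted(
        [(t, "ref") for t in ref_edges] + [(t, "local") for t in local_edges]
    )
    up = down = False
    up_since = down_since = 0.0
    net = 0.0
    for t, source in events:
        if source == "ref" and not up:
            up, up_since = True, t          # first flip-flop sets UP
        elif source == "local" and not down:
            down, down_since = True, t      # second flip-flop sets DOWN
        if up and down:
            # Both outputs asserted: accumulate the pulse widths, then reset.
            net += (t - up_since) - (t - down_since)
            up = down = False
    return net


# Reference leads the local clock by 1 time unit on each of three cycles.
lead = pfd_net_up_time([0.0, 10.0, 20.0], [1.0, 11.0, 21.0])

# Reference lags the local clock by 1 time unit on each of three cycles.
lag = pfd_net_up_time([1.0, 11.0, 21.0], [0.0, 10.0, 20.0])
```

The sign and magnitude of the returned value correspond to the direction and size of the timing difference, which is the information a partial phase error signal conveys to the summation.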
In some embodiments, each partial phase error signal is an analog signal generated using a respective charge pump, the respective charge pump receiving respective charge pump control signals generated by a respective comparison between the respective phase of the M phases of the reference signal and the respective phase of the N phases of the local clock signal.
The clock signal received from MIC5 in
Similarly, known methods of communicating a clock signal using edge transitions of the data lines may be combined with the PLL and timing control mechanisms described herein. In particular, vector signaling codes with guaranteed transition density over time, such as those taught by [Shokrollahi I], are amenable to such combination.
This Application is a Continuation of application Ser. No. 16/813,526, filed Mar. 9, 2020, entitled “High Performance Phase Locked Loop”, naming Armin Tajalli, which is a Continuation of application Ser. No. 16/533,594, filed Aug. 6, 2019, entitled “High Performance Phase Locked Loop”, naming Armin Tajalli, which is a Continuation of application Ser. No. 16/107,822, filed Aug. 21, 2018, entitled “High Performance Phase Locked Loop”, naming Armin Tajalli, which is a Continuation of application Ser. No. 15/494,439, filed Apr. 21, 2017, entitled “High Performance Phase Locked Loop”, naming Armin Tajalli, which claims the benefit of U.S. Provisional Application No. 62/326,591, filed Apr. 22, 2016, entitled “High Performance Phase Locked Loop”, naming Armin Tajalli, all of which are hereby incorporated by reference in their entirety for all purposes.
Number | Name | Date | Kind |
---|---|---|---|
4839907 | Saneski | Jun 1989 | A |
5266907 | Dacus | Nov 1993 | A |
5302920 | Bitting | Apr 1994 | A |
5528198 | Baba et al. | Jun 1996 | A |
5565817 | Lakshmikumar | Oct 1996 | A |
5602884 | Wieczorkiewicz et al. | Feb 1997 | A |
5629651 | Mizuno | May 1997 | A |
5802356 | Gaskins et al. | Sep 1998 | A |
6026134 | Duffy et al. | Feb 2000 | A |
6037812 | Gaudet | Mar 2000 | A |
6122336 | Anderson | Sep 2000 | A |
6307906 | Tanji et al. | Oct 2001 | B1 |
6316987 | Dally et al. | Nov 2001 | B1 |
6380783 | Chao et al. | Apr 2002 | B1 |
6389091 | Yamaguchi et al. | May 2002 | B1 |
6426660 | Ho et al. | Jul 2002 | B1 |
6507544 | Ma et al. | Jan 2003 | B1 |
6509773 | Buchwald et al. | Jan 2003 | B2 |
6633621 | Bishop et al. | Oct 2003 | B1 |
6650699 | Tierno | Nov 2003 | B1 |
6717478 | Kim et al. | Apr 2004 | B1 |
6838951 | Nieri et al. | Jan 2005 | B1 |
6917762 | Kim | Jul 2005 | B2 |
7078978 | Wakii | Jul 2006 | B2 |
7102449 | Mohan | Sep 2006 | B1 |
7158441 | Okamura | Jan 2007 | B2 |
7199728 | Dally et al. | Apr 2007 | B2 |
7336112 | Sha et al. | Feb 2008 | B1 |
7532697 | Sidiropoulos et al. | May 2009 | B1 |
7535957 | Ozawa et al. | May 2009 | B2 |
7616075 | Kushiyama | Nov 2009 | B2 |
7650525 | Chang et al. | Jan 2010 | B1 |
7688929 | Co | Mar 2010 | B2 |
7697647 | McShea | Apr 2010 | B1 |
7822113 | Tonietto et al. | Oct 2010 | B2 |
7839229 | Nakamura et al. | Nov 2010 | B2 |
7852109 | Chan et al. | Dec 2010 | B1 |
7860190 | Feller | Dec 2010 | B2 |
3036300 | Evans et al. | Oct 2011 | A1 |
8253454 | Lin | Aug 2012 | B2 |
8407511 | Mobin et al. | Mar 2013 | B2 |
8583072 | Ciubotaru et al. | Nov 2013 | B1 |
8593305 | Tajalli et al. | Nov 2013 | B1 |
8791735 | Shibasaki | Jul 2014 | B1 |
8929496 | Lee et al. | Jan 2015 | B2 |
9036764 | Hossain et al. | May 2015 | B1 |
9059816 | Simpson et al. | Jun 2015 | B1 |
9288082 | Ulrich et al. | Mar 2016 | B1 |
9300503 | Holden et al. | Mar 2016 | B1 |
9306621 | Zhang et al. | Apr 2016 | B2 |
9374250 | Musah et al. | Jun 2016 | B1 |
9397868 | Hossain et al. | Jul 2016 | B1 |
9401828 | Cronie et al. | Jul 2016 | B2 |
9438409 | Liao et al. | Sep 2016 | B1 |
9461862 | Holden et al. | Oct 2016 | B2 |
9520883 | Shibasaki | Dec 2016 | B2 |
9565036 | Zerbe et al. | Feb 2017 | B2 |
9577815 | Simpson et al. | Feb 2017 | B1 |
9602111 | Shen et al. | Mar 2017 | B1 |
9906358 | Tajalli | Feb 2018 | B1 |
9960902 | Lin et al. | May 2018 | B1 |
10003315 | Tajalli | Jun 2018 | B2 |
10055372 | Shokrollahi | Aug 2018 | B2 |
10057049 | Tajalli | Aug 2018 | B2 |
10326435 | Arp et al. | Jun 2019 | B2 |
10574487 | Hormati | Feb 2020 | B1 |
10848351 | Hormati | Nov 2020 | B2 |
20030001557 | Pisipaty | Jan 2003 | A1 |
20030146783 | Bandy et al. | Aug 2003 | A1 |
20030212930 | Aung et al. | Nov 2003 | A1 |
20030214977 | Kuo | Nov 2003 | A1 |
20040092240 | Hayashi | May 2004 | A1 |
20040141567 | Yang et al. | Jul 2004 | A1 |
20050024117 | Kubo et al. | Feb 2005 | A1 |
20050084050 | Cheung et al. | Apr 2005 | A1 |
20050117404 | Savoj | Jun 2005 | A1 |
20050128018 | Meltzer | Jun 2005 | A1 |
20050141662 | Sano et al. | Jun 2005 | A1 |
20050201491 | Wei | Sep 2005 | A1 |
20050220182 | Kuwata | Oct 2005 | A1 |
20050275470 | Choi | Dec 2005 | A1 |
20060008041 | Kim | Jan 2006 | A1 |
20060062058 | Lin | Mar 2006 | A1 |
20060140324 | Casper et al. | Jun 2006 | A1 |
20060232461 | Felder | Oct 2006 | A1 |
20060233291 | Garlepp et al. | Oct 2006 | A1 |
20070001713 | Lin | Jan 2007 | A1 |
20070001723 | Lin | Jan 2007 | A1 |
20070047689 | Menolfi et al. | Mar 2007 | A1 |
20070058768 | Werner | Mar 2007 | A1 |
20070086267 | Kwak | Apr 2007 | A1 |
20070127612 | Lee et al. | Jun 2007 | A1 |
20070146088 | Arai et al. | Jun 2007 | A1 |
20070147559 | Lapointe | Jun 2007 | A1 |
20070183552 | Sanders et al. | Aug 2007 | A1 |
20070201597 | He et al. | Aug 2007 | A1 |
20070253475 | Palmer | Nov 2007 | A1 |
20080007367 | Kim | Jan 2008 | A1 |
20080111634 | Min | May 2008 | A1 |
20080136479 | You et al. | Jun 2008 | A1 |
20080165841 | Wall et al. | Jul 2008 | A1 |
20080181289 | Moll | Jul 2008 | A1 |
20080219399 | Nary | Sep 2008 | A1 |
20080317188 | Staszewski et al. | Dec 2008 | A1 |
20090103675 | Yousefi et al. | Apr 2009 | A1 |
20090167389 | Reis | Jul 2009 | A1 |
20090195281 | Tamura et al. | Aug 2009 | A1 |
20090231006 | Jang et al. | Sep 2009 | A1 |
20090243679 | Smith et al. | Oct 2009 | A1 |
20090262876 | Arima et al. | Oct 2009 | A1 |
20090262877 | Shi et al. | Oct 2009 | A1 |
20100033259 | Miyashita | Feb 2010 | A1 |
20100090723 | Nedovic et al. | Apr 2010 | A1 |
20100090735 | Cho | Apr 2010 | A1 |
20100156543 | Dubey | Jun 2010 | A1 |
20100180143 | Ware et al. | Jul 2010 | A1 |
20100220828 | Fuller et al. | Sep 2010 | A1 |
20110002181 | Wang et al. | Jan 2011 | A1 |
20110025392 | Wu et al. | Feb 2011 | A1 |
20110148498 | Mosalikanti et al. | Jun 2011 | A1 |
20110234278 | Seo | Sep 2011 | A1 |
20110268225 | Cronie et al. | Nov 2011 | A1 |
20110302478 | Cronie et al. | Dec 2011 | A1 |
20110311008 | Slezak et al. | Dec 2011 | A1 |
20120051480 | Usugi et al. | Mar 2012 | A1 |
20120170621 | Tracy et al. | Jul 2012 | A1 |
20120200364 | Iizuka et al. | Aug 2012 | A1 |
20120206177 | Colinet et al. | Aug 2012 | A1 |
20120213299 | Cronie et al. | Aug 2012 | A1 |
20120235717 | Hirai et al. | Sep 2012 | A1 |
20120327993 | Palmer | Dec 2012 | A1 |
20130088274 | Gu | Apr 2013 | A1 |
20130091392 | Valliappan et al. | Apr 2013 | A1 |
20130093471 | Cho et al. | Apr 2013 | A1 |
20130107997 | Chen | May 2013 | A1 |
20130108001 | Chang et al. | May 2013 | A1 |
20130207706 | Yanagisawa | Aug 2013 | A1 |
20130243127 | Ito et al. | Sep 2013 | A1 |
20130271194 | Madoglio et al. | Oct 2013 | A1 |
20130285720 | Jibry | Oct 2013 | A1 |
20130314142 | Tamura et al. | Nov 2013 | A1 |
20140286381 | Shibasaki | Sep 2014 | A1 |
20140286457 | Chaivipas | Sep 2014 | A1 |
20150043627 | Kang et al. | Feb 2015 | A1 |
20150078495 | Hossain et al. | Mar 2015 | A1 |
20150117579 | Shibasaki | Apr 2015 | A1 |
20150180642 | Hsieh et al. | Jun 2015 | A1 |
20150220472 | Sengoku | Aug 2015 | A1 |
20150222458 | Hormati et al. | Aug 2015 | A1 |
20150249559 | Shokrollahi et al. | Sep 2015 | A1 |
20150256326 | Simpson et al. | Sep 2015 | A1 |
20160056980 | Wang et al. | Feb 2016 | A1 |
20160087610 | Hata | Mar 2016 | A1 |
20160134267 | Adachi | May 2016 | A1 |
20170228215 | Chatwin et al. | Aug 2017 | A1 |
20170310456 | Tajalli | Oct 2017 | A1 |
20180083763 | Black et al. | Mar 2018 | A1 |
20180083809 | Tajalli et al. | Mar 2018 | A1 |
20180219539 | Arp et al. | Aug 2018 | A1 |
20180227114 | Rahman et al. | Aug 2018 | A1 |
20180343011 | Tajalli et al. | Nov 2018 | A1 |
20180375693 | Zhou et al. | Dec 2018 | A1 |
20190109735 | Norimatsu | Apr 2019 | A1 |
20190377378 | Gharibdoust | Dec 2019 | A1 |
20200162233 | Lee et al. | May 2020 | A1 |
20210248103 | Khashaba et al. | Aug 2021 | A1 |
20210258029 | Cyzs | Aug 2021 | A1 |
Number | Date | Country |
---|---|---|
203675093 | Jun 2014 | CN |
0740423 | Oct 1996 | EP |
3615692 | Nov 2004 | JP |
Entry |
---|
International Search Report and Written Opinion for PCT/US2017/029011, dated Jul. 13, 2017, 1-5 (5 pages). |
Chang, Hong-Yeh, et al., “A Low-Jitter Low-Phase-Noise 10-GHz Sub-Harmonically Injection-Locked PLL With Self-Aligned DLL in 65-nm CMOS Technology”, IEEE Transactions on Microwave Theory and Techniques, vol. 62, No. 3, Mar. 2014, 543-555 (13 pages). |
Ha, J.C., et al., “Unified All-Digital Duty-Cycle and phase correction circuit for QDR I/O interface”, Electronics Letters, The Institution of Engineering and Technology, vol. 44, No. 22, Oct. 23, 2008, 1300-1301 (2 pages). |
Loh, Matthew, et al., “A 3×9 Gb/s Shared, All-Digital CDR for High-Speed, High-Density I/O”, IEEE Journal of Solid-State Circuits, vol. 47, No. 3, Mar. 2012, 641-651 (11 pages). |
Nandwana, Romesh Kumar, et al., “A Calibration-Free Fractional-N Ring PLL Using Hybrid Phase/Current-Mode Phase Interpolation Method”, IEEE Journal of Solid-State Circuits, vol. 50, No. 4, Apr. 2015, 882-895 (14 pages). |
Ng, Herman Jalli, et al., “Low Phase Noise 77-GHz Fractional-N PLL with DLL-based Reference Frequency Multiplier for FMCW Radars”, European Microwave Integrated Circuits Conference, Oct. 10-11, 2011, 196-199 (4 pages). |
Riley, M. W., et al., “Cell Broadband Engine Processor: Design and Implementation”, IBM Journal of Research and Development, vol. 51, No. 5, Sep. 2007, 545-557 (13 pages). |
Ryu, Kyungho, et al., “Process-Variation-Calibrated Multiphase Delay Locked Loop With a Loop-Embedded Duty Cycle Corrector”, IEEE Transactions on Circuits and Systems, vol. 61, No. 1, Jan. 2014, 1-5 (5 pages). |
Tajalli, Armin, “Wideband PLL Using Matrix Phase Comparator”, Journal of Latex Class Files, vol. 14, No. 8, Aug. 2016, 1-8 (8 pages). |
Tan, Han-Yuan, “Design of Noise-Robust Clock and Data Recovery Using an Adaptive-Bandwidth Mixed PLL/DLL”, Harvard University Thesis, Nov. 2006, 1-169 (169 pages). |
Wang, Yi-Ming, et al., “Range Unlimited Delay-Interleaving and -Recycling Clock Skew Compensation and Duty-Cycle Correction Circuit”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 23, No. 5, May 2015, 856-868 (13 pages). |
Cui, Delong, et al., “A Dual-Channel 23-Gbps CMOS Transmitter/Receiver Chipset for 40-Gbps RZ-DQPSK and CS-RZ-DQPSK Optical Transmission”, IEEE Journal of Solid-State Circuits, vol. 47, No. 12, Dec. 2012, 3249-3260 (12 pages). |
Inti, Rajesh, et al., “A 0.5-to-2.5 Gb/s Reference-Less Half-Rate Digital CDR with Unlimited Frequency Acquisition Range and Improved Input Duty-Cycle Error Tolerance”, IEEE Journal of Solid-State Circuits, vol. 46, No. 12, Dec. 2011, 3150-3162 (13 pages). |
Pozzoni, Massimo, et al., “A Multi-Standard 1.5 to 10 Gb/s Latch-Based 3-Tap DFE Receiver with a SSC Tolerant CDR for Serial Backplane Communication”, IEEE Journal of Solid-State Circuits, vol. 44, No. 4, Apr. 2009, 1306-1315 (10 pages). |
Shu, Guanghua, et al., “A 4-to-10.5 Gb/s Continuous-Rate Digital Clock and Data Recovery With Automatic Frequency Acquisition”, IEEE Journal of Solid-State Circuits, vol. 51, No. 2, Feb. 2016, 428-439 (12 pages). |
Yoo, Danny, et al., “A 36-Gb/s Adaptive Baud-Rate CDR with CTLE and 1-Tap DFE in 28-nm CMOS”, IEEE Solid-State Circuits Letters, vol. 2, No. 11, Nov. 2019, 252-255 (4 pages). |
Zaki, Ahmed M., “Adaptive Clock and Data Recovery for Asymmetric Triangular Frequency Modulation Profile”, IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM), Aug. 21, 2019, 1-6 (6 pages). |
Number | Date | Country | |
---|---|---|---|
20220191000 A1 | Jun 2022 | US |
Number | Date | Country | |
---|---|---|---|
62326591 | Apr 2016 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 16813526 | Mar 2020 | US |
Child | 17684273 | US | |
Parent | 16533594 | Aug 2019 | US |
Child | 16813526 | US | |
Parent | 16107822 | Aug 2018 | US |
Child | 16533594 | US | |
Parent | 15494439 | Apr 2017 | US |
Child | 16107822 | US |