The following prior applications are herein incorporated by reference in their entirety for all purposes:
U.S. Patent Publication 2011/0268225 of application Ser. No. 12/784,414, filed May 20, 2010, naming Harm Cronie and Amin Shokrollahi, entitled “Orthogonal Differential Vector Signaling” (hereinafter “Cronie I”).
U.S. Patent Publication 2011/0302478 of application Ser. No. 12/982,777, filed Dec. 30, 2010, naming Harm Cronie and Amin Shokrollahi, entitled “Power and Pin Efficient Chip-to-Chip Communications with Common-Mode Resilience and SSO Resilience” (hereinafter “Cronie II”).
U.S. patent application Ser. No. 13/030,027, filed Feb. 17, 2011, naming Harm Cronie, Amin Shokrollahi and Armin Tajalli, entitled “Methods and Systems for Noise Resilient, Pin-Efficient and Low Power Communications with Sparse Signaling Codes” (hereinafter “Cronie III”).
U.S. patent application Ser. No. 13/176,657, filed Jul. 5, 2011, naming Harm Cronie and Amin Shokrollahi, entitled “Methods and Systems for Low-power and Pin-efficient Communications with Superposition Signaling Codes” (hereinafter “Cronie IV”).
U.S. patent application Ser. No. 13/542,599, filed Jul. 5, 2012, naming Armin Tajalli, Harm Cronie, and Amin Shokrollahi, entitled “Methods and Circuits for Efficient Processing and Detection of Balanced Codes” (hereinafter “Tajalli I”).
U.S. patent application Ser. No. 13/842,740, filed Mar. 15, 2013, naming Brian Holden, Amin Shokrollahi and Anant Singh, entitled “Methods and Systems for Skew Tolerance in and Advanced Detectors for Vector Signaling Codes for Chip-to-Chip Communication”, hereinafter identified as [Holden I].
U.S. Provisional Patent Application No. 61/946,574, filed Feb. 28, 2014, naming Amin Shokrollahi, Brian Holden, and Richard Simpson, entitled “Clock Embedded Vector Signaling Codes”, hereinafter identified as [Shokrollahi I].
U.S. patent application Ser. No. 14/612,241, filed Aug. 4, 2015, naming Amin Shokrollahi, Ali Hormati, and Roger Ulrich, entitled “Method and Apparatus for Low Power Chip-to-Chip Communications with Constrained ISI Ratio”, hereinafter identified as [Shokrollahi II].
U.S. patent application Ser. No. 13/895,206, filed May 15, 2013, naming Roger Ulrich and Peter Hunt, entitled “Circuits for Efficient Detection of Vector Signaling Codes for Chip-to-Chip Communications using Sums of Differences”, hereinafter identified as [Ulrich I].
U.S. patent application Ser. No. 14/816,896, filed Aug. 3, 2015, naming Brian Holden and Amin Shokrollahi, entitled “Orthogonal Differential Vector Signaling Codes with Embedded Clock”, hereinafter identified as [Holden II].
U.S. patent application Ser. No. 14/926,958, filed Oct. 29, 2015, naming Richard Simpson, Andrew Stewart, and Ali Hormati, entitled “Clock Data Alignment System for Vector Signaling Code Communications Link”, hereinafter identified as [Stewart I].
U.S. patent application Ser. No. 14/925,686, filed Oct. 28, 2015, naming Armin Tajalli, entitled “Advanced Phase Interpolator”, hereinafter identified as [Tajalli II].
U.S. Provisional Patent Application No. 62/286,717, filed Jan. 25, 2016, naming Armin Tajalli, entitled “Voltage Sampler Driver with Enhanced High-Frequency Gain”, hereinafter identified as [Tajalli III].
U.S. Provisional Patent Application No. 62/288,717, filed Apr. 22, 2016, naming Armin Tajalli, entitled “High Performance Phase Locked Loop”, hereinafter identified as [Tajalli IV].
U.S. patent application Ser. No. 15/582,545, filed Apr. 28, 2017, naming Ali Hormati and Richard Simpson, entitled “Clock Data Recovery Utilizing Decision Feedback Equalization”, hereinafter identified as [Hormati I].
U.S. patent application Ser. No. 15/602,080, filed May 22, 2017, naming Ali Hormati, entitled “Data-Driven Phase Detector Element for Phase Locked Loops,” hereinafter identified as [Hormati II].
The following additional references to prior art have been cited in this application:
U.S. Pat. No. 6,509,773, filed Apr. 30, 2001 by Buchwald et al., entitled “Phase interpolator device and method” (hereafter called [Buchwald].)
“Linear phase detection using two-phase latch”, A. Tajalli, et al., IEE Electronics Letters, 2003, (hereafter called [Tajalli V].)
“A Low-Jitter Low-Phase-Noise 10-GHz Sub-Harmonically Injection-Locked PLL With Self-Aligned DLL in 65-nm CMOS Technology”, Hong-Yeh Chang, Yen-Liang Yeh, Yu-Cheng Liu, Meng-Han Li, and Kevin Chen, IEEE Transactions on Microwave Theory and Techniques, Vol 62, No. 3, March 2014 pp. 543-555, (hereafter called [Chang et al.])
“Low Phase Noise 77-GHz Fractional-N PLL with DLL-based Reference Frequency Multiplier for FMCW Radars”, Herman Jalli Ng, Rainer Stuhlberger, Linus Maurer, Thomas Sailer, and Andreas Stelzer, Proceedings of the 6th European Microwave Integrated Circuits Conference, 10-11 Oct. 2011, pp. 196-199, (hereafter called [Ng et al.])
“Design of Noise-Robust Clock and Data Recovery using an Adaptive-Bandwidth Mixed PLL/DLL”, Han-Yuan Tan, Doctoral Thesis, Harvard University November 2006, (hereafter called [Tan]).
U.S. Pat. No. 7,492,850, filed Aug. 31, 2005 by Christian Ivo Menolfi and Thomas Helmut Toifl, entitled “Phase locked loop apparatus with adjustable phase shift” (hereafter called [Menolfi].)
“A Calibration-Free Fractional-N Ring PLL Using Hybrid Phase/Current-Mode Phase Interpolation Method”, by Romesh Kumar Nandwana et al., IEEE Journal of Solid-State Circuits Vol. 50, No. 4, April 2015, pp. 882-895, (hereafter called [Nandwana].)
The present embodiments relate to communications systems circuits generally, and more particularly to obtaining a stable, correctly phased receiver clock signal from a high-speed multi-wire interface used for chip-to-chip communication.
In modern digital systems, digital information has to be processed in a reliable and efficient way. In this context, digital information is to be understood as information available in discrete, i.e., discontinuous values. Bits, collections of bits, and numbers from a finite set can all be used to represent digital information.
In most chip-to-chip, or device-to-device communication systems, communication takes place over a plurality of wires to increase the aggregate bandwidth. A single wire or a pair of these wires may be referred to as a data lane, a channel, or a link, and multiple data lanes create a communication bus between the electronic components. At the physical circuitry level, in chip-to-chip communication systems, buses are typically made of electrical conductors in the package between chips and motherboards, on printed circuit boards (“PCBs”), or in cables and connectors between PCBs. In high frequency applications, microstrip or stripline PCB traces may be used.
Common methods for transmitting signals over bus wires include single-ended and differential signaling methods. In applications requiring high-speed communications, those methods can be further optimized in terms of power consumption and pin-efficiency. More recently, vector signaling methods have been proposed to further optimize the trade-offs between power consumption, pin efficiency and noise robustness of chip-to-chip communication systems. In those vector signaling systems, digital information at the transmitter is transformed into a different representation space in the form of a vector codeword that is chosen in order to optimize the power consumption, pin-efficiency and speed trade-offs based on the transmission channel properties and communication system design constraints. Herein, this process is referred to as “encoding”. The encoded codeword is communicated as a group of signals from the transmitter to one or more receivers. At a receiver, the received signals corresponding to the codeword are transformed back into the original digital information representation space. Herein, this process is referred to as “decoding”.
Regardless of the encoding method used, the received signals presented to the receiving device are sampled (or their signal value otherwise recorded) at intervals best representing the original transmitted values, regardless of transmission channel delays, interference, and noise. This Clock and Data Recovery (CDR) process not only determines the appropriate sample timing but continues to do so over time, providing dynamic compensation for varying signal propagation conditions.
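As a minimal illustration of the encoding and decoding steps described above, the following Python sketch maps three bits onto four wires using an order-4 Hadamard matrix, in the manner of the H4 (ENRZ) code named elsewhere in this description. The function names `encode` and `decode`, the bit-to-weight mapping, and the 1/3 normalization are assumptions chosen for illustration and are not taken from the incorporated references.

```python
# Order-4 Hadamard matrix; its rows are mutually orthogonal.
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def encode(bits):
    """Map three bits to a four-wire codeword (illustrative sketch)."""
    # Row 0 (the common-mode row) is left unused, so each codeword sums to zero.
    weights = [0] + [1 if b else -1 for b in bits]
    return [sum(weights[r] * H4[r][w] for r in range(4)) / 3 for w in range(4)]


def decode(wires):
    """Recover the three bits by correlating the wire values with rows 1..3."""
    return [
        1 if sum(H4[r][w] * wires[w] for w in range(4)) > 0 else 0
        for r in range(1, 4)
    ]


codeword = encode([1, 1, 1])
recovered = decode(codeword)
```

Because row 0 is never driven, every codeword in this sketch is balanced, and each decoded bit depends only on the sign of one linear combination of the wire values.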
Many known CDR systems utilize a Phase-Locked Loop (PLL) or Delay-Locked Loop (DLL) to synthesize a local receive clock having an appropriate frequency and phase for accurate receive data sampling.
To reliably detect the data values transmitted over a communications system, a receiver accurately measures the received signal value amplitudes at carefully selected times. Various methods are known to facilitate such receive measurements, including reception of one or more dedicated clock signals associated with the transmitted data stream, extraction of clock signals embedded within the transmitted data stream, and synthesis of a local receive clock from known attributes of the communicated data stream.
In general, the receiver embodiments of such timing methods are described as Clock-Data Recovery (CDR), often based on Phase-Locked Loop (PLL) or Delay-Locked Loop (DLL) synthesis of a local receive clock having the desired frequency and phase characteristics.
In some communications systems, multiple data lanes may be received originating from a single transmitter or multiple transmitters utilizing coordinated transmission clocks. In such isochronous or plesiochronous environments, CDR phase errors detected at one receive data lane input may suggest corrections that are also applicable at other receive data lane inputs derived from the same clock source.
Methods and systems are described for: obtaining, at a phase-error aggregator, a plurality of data-derived phase-error signals for two or more data lanes of a multi-wire bus, each data-derived phase-error signal generated using at least (i) a phase of one or more phases of a local oscillator signal and (ii) a corresponding data signal associated with one of the two or more data lanes; generating a composite phase-error signal representing a combination of the two or more obtained data-derived phase-error signals; receiving the composite phase-error signal at a loop filter and responsively generating an oscillator control signal; and receiving the oscillator control signal at a local oscillator and responsively adjusting a timing of the local oscillator to adjust the one or more phases of the local oscillator signal.
As illustrated by the embodiment of
Without implying limitation, the example communication link 120 in
It should be noted that in some embodiments transmitter 110 will use a single clock source as the time base for generation of each output signal it sends over wires 125. In most chip-to-chip communication environments, the propagation characteristics of communications medium 120 are relatively consistent; thus, in such systems the multiple received signals at receiver 125 will generally remain relatively well correlated in timing, albeit with tractable variations in arrival time (e.g., skew and jitter). In such systems, the CDA component of signal detection within receiver 125 may be considered as having two distinct aspects: first, synthesis of a stable local clock equivalent to the clock source within transmitter 110; and second, derivation of individual sampling times from that local clock to accurately capture the value of each received signal input.
One familiar with the art will recognize that this receive timing model may not apply in environments where communications medium 120 introduces significant and rapidly-varying perturbations into the transmission line characteristics of wires 125. One obvious example is multichannel wireless communication, where the propagation time, signal strength, and noise characteristics of different channels or paths may vary widely and independently change at a rapid rate. In such environments, known art solutions include individual CDA subsystems for each receive signal, comprising voltage-controlled oscillator (VCO), phase detector, and other phase-locked loop (PLL) elements.
Some known art chip-to-chip communications receivers also incorporate individual CDA phase-locked loops per signal input as an implementation convenience, maintaining multiple PLL VCOs operating at different phase offsets to produce the necessary sampling clocks, rather than one PLL VCO clock which then undergoes phase adjustment for each receive input sampler. However, at high clock speeds the power requirement for these duplicated PLLs may become a significant component of overall receiver power consumption.
One typical high-speed receiver embodiment is illustrated in
Representative examples of MIC embodiments detecting the H4 or ENRZ code are shown in
Each received data signal is sampled 230a, 230b, 230c, at a time determined to maximize the quality of the detected data (e.g. at the “center of open eye”) producing data values D0, D1, D2. As these samples occur in successive receive unit intervals, specific instances are identified in
As shown, the receiver of
[Hormati I] teaches that the combination of high speed data samplers and at least one stage of loop-unrolled or predictive DFE can be utilized to efficiently detect both a received data value and a CDR timing phase error sample. In such a so-called baud rate CDR, differences in a sampler output in consecutive sampling intervals can be used as an indicator that the sample timing is earlier or later than optimum. These data-derived phase-error signals (composed in
One such embodiment is shown in
The receiver of
Concurrently, CDA subsystem 300 utilizes data-derived phase-error information T_En0-2(0:n−1) and E/L0-2(0:n−1) to maintain phase lock of local oscillator 250, which through phase interpolators 390 and/or delay elements 235a, 235b, and 235c controls the sample timing of Samplers 230a, 230b, 230c.
Each instance of Phase Interpolator 390 is configured to produce phase-adjusted (according to the data lane-specific delay values) sampling clocks suitable for triggering the samplers for one data lane in each parallel data sampling instance. Each instance of Phase Interpolator 390 may be configured independently by control logic 320, thus allowing the sample timing for one data lane, for example data lane-specific delay value d0, to be adjusted to be earlier or later than the sample timing for another data lane, for example data lane-specific delay value d1. Aggregators (storage elements maintaining a cumulative record over multiple input instances) are used to analyze data-derived phase-error signals (and, in some embodiments, additionally analyze transitions used to verify that data-derived phase-error signals are valid) to determine whether or not the average error is 0 for a given subchannel. If a particular average or aggregated error is non-zero, the subchannel/data lane-specific timing associated with that error result is adjusted accordingly, in the present example by adjusting the value of the Phase Interpolator 390 for that subchannel.
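The per-lane adjustment described above can be sketched in Python as follows. This is a behavioral illustration only: the class name `LaneAggregator`, the vote threshold, the 64-step interpolator range, and the convention that an early vote is +1 and advances the interpolator code by one step are all assumptions, not details of the described embodiments.

```python
class LaneAggregator:
    """Accumulates early/late votes for one data lane (hypothetical helper)."""

    def __init__(self, threshold=8, pi_steps=64):
        self.threshold = threshold  # assumed number of net votes needed before acting
        self.pi_steps = pi_steps    # assumed phase interpolator resolution
        self.total = 0              # cumulative record of early/late votes
        self.pi_code = 0            # current phase interpolator setting for this lane

    def update(self, early_late, transition):
        """Record one vote (+1 early, -1 late) and return the interpolator code."""
        # Votes are counted only when a transition verifies the phase-error sample.
        if transition:
            self.total += early_late
        # A persistent non-zero aggregate shifts this lane's sampling phase one step.
        if abs(self.total) >= self.threshold:
            step = 1 if self.total > 0 else -1
            self.pi_code = (self.pi_code + step) % self.pi_steps
            self.total = 0
        return self.pi_code


lane0 = LaneAggregator(threshold=4)
for _ in range(4):
    code = lane0.update(+1, transition=True)
```

Requiring several net votes before moving the interpolator is one simple way to make the lane-specific correction respond to the average error rather than to individual samples.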
A combination of phase error aggregators 490a/b/c and the three data lane-specific phase interpolators 390 producing data lane-specific adjusted sampling clocks as described above is detailed in
Another embodiment may additionally incorporate a plurality of data lane-specific error aggregators 442, each data lane-specific aggregator 442 configured to receive data-derived phase-error signals for an associated data lane and to responsively determine a respective data lane-specific control signal indicative of the data lane-specific delay value. Other embodiments may be further composed of data lane-specific phase interpolators as shown in
In some embodiments, producing an adjustable digital signal delay suitable for phase adjustment is done by the switched capacitive node loading embodiment of
In general, the relative sample timing or sampling phase for a given data lane will be the same in each processing phase of all parallel processing phases, albeit in consecutive receive unit intervals. Further embodiments may permit incremental phase adjustments to be made between processing phases, as one example to compensate for inherent timing differences caused by clock distribution variations among the various processing instances.
Overall phase lock is maintained by phase-error aggregator 240 configured to obtain a plurality of data-derived phase-error signals for two or more data lanes of a multi-wire bus, each data-derived phase-error signal generated using at least a phase of one or more phases of a local oscillator signal and a corresponding data signal associated with one of the two or more data lanes, the phase-error aggregator 240 configured to responsively generate a composite phase-error signal representing a combination of the two or more obtained data-derived phase-error signals. This phase-error signal is filtered by a loop filter 245 configured to receive the composite phase-error signal and to responsively generate an oscillator control signal for adjusting local oscillator 250. Local oscillator 250 receives the oscillator control signal and responsively adjusts a timing of the local oscillator to adjust the one or more phases of the local oscillator signal.
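The aggregator, loop filter, and local oscillator chain described above can be illustrated with a short behavioral sketch in Python. All names and values here are assumptions: phases are expressed in unit intervals, an early clock produces +1 and a late clock produces −1, and the loop filter is reduced to a single proportional gain as a stand-in for whatever filter a given embodiment uses.

```python
def lane_phase_error(ideal_phase, clock_phase):
    """+1 if the clock is early for this lane, -1 if it is late (assumed convention)."""
    return 1 if clock_phase < ideal_phase else -1


def composite_phase_error(lane_errors):
    """Phase-error aggregator: combine the per-lane errors into one signal."""
    return sum(lane_errors)


def loop_filter(composite_error, gain=0.01):
    """Stand-in loop filter: a single proportional gain (assumption)."""
    return gain * composite_error


def run_loop(ideal_phases, clock_phase=0.0, steps=100):
    """Iterate the loop and return the final local oscillator phase."""
    for _ in range(steps):
        errors = [lane_phase_error(p, clock_phase) for p in ideal_phases]
        control = loop_filter(composite_phase_error(errors))
        clock_phase += control  # oscillator timing adjusted by the control signal
    return clock_phase


# Three lanes whose ideal sampling phases differ slightly (hypothetical values).
final_phase = run_loop([0.28, 0.30, 0.32])
```

In this sketch the shared oscillator phase settles into a small dither around the middle lane's ideal phase, since the two outer lanes vote in opposite directions there. The residual lane-to-lane differences are what the data lane-specific delay values are intended to absorb.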
The aggregation of phase error may be performed in the analog or the digital domain. The embodiment of
As mentioned above, in Table I, each data lane D0-D2 has an early/late (E/L) value of ‘1’ or ‘−1’, which is used in the combination only if there was a verified transition (e.g., using transition indication signals T_En0-2). If there was no transition, then the corresponding E/L value is ‘N/A’. The E/L values of the three lanes having verified transitions are summed, and the counter is incremented or decremented according to the sign of the summation, by the magnitude of the summation. In the first row, data lane 0 is late while data lanes 1 and 2 are early, and thus the counter is incremented by a magnitude of 1. It should be noted that in some embodiments, the counter may increment or decrement in the opposite direction from the example above. It should also be noted that in some embodiments, the E/L signals provided by the samplers may always be combined. In such embodiments, the received information may correspond to a test pattern, or the received information may be designed to have a sufficient transition density such that the erroneous E/L signals are effectively overridden by E/L signals that did in fact have transitions.
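The transition-gated combination described above can be sketched in Python as follows. The function names are hypothetical, and the mapping of “early” to +1 and “late” to −1 is an assumption made for this illustration; as noted, some embodiments may count in the opposite direction.

```python
def gated_sum(early_late, transitions):
    """Sum the E/L values of only those lanes that had a verified transition."""
    return sum(el for el, t in zip(early_late, transitions) if t)


def update_counter(counter, early_late, transitions):
    """Move the counter by the signed magnitude of the gated summation."""
    return counter + gated_sum(early_late, transitions)


# Lane 0 late, lanes 1 and 2 early, all three with verified transitions.
counter = update_counter(0, [-1, +1, +1], [True, True, True])

# A lane without a transition ('N/A') does not contribute to the summation.
counter_gated = update_counter(0, [-1, +1, +1], [False, True, True])
```

With all three lanes contributing, the summation is +1 and the counter moves by a magnitude of 1. When lane 0 is excluded, the two remaining early votes move it by 2.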
Equivalent digital phase aggregator embodiments may implement all or some of the selection or Logic functions 910, 920, 1120 as programmed logical instructions, and all or some of the counter 930, 1030 functions as programmed arithmetic instructions, executed by a computer processor or programmed logic element.
One embodiment of a voltage-controlled oscillator is shown in the ring oscillator of
In a first operational mode, two or more of the signal inputs are elements of a common signal group and clocking domain, as one example utilizing ODVS H4 encoding. In this mode, timing for each input of the common signal group is derived from the same local timing reference. As previously described herein, optional phase offsets may be provided to incrementally adjust individual input samplers to compensate for inherent timing offsets such as differing signal propagation delays.
In a second operational mode, the various signal inputs are members of at least two distinct signal groups, which may derive from different clocking domains. The at least first and second local clock sources enable independent sampling intervals to be maintained, separately locked to those distinct input clocks.
In a third operational mode, two or more of the signal inputs may be derived from a common clocking domain, but with sufficiently intractable propagation time variations to preclude satisfactory reception in the first operational mode. The at least first and second local clock sources are simultaneously used to produce isochronous clocks, equivalent in frequency but differing (potentially variably) in phase, each synchronized to a different one of the two or more signal inputs. In an alternative mode in which the two or more signal inputs are derived from a common clocking domain, a single local clock source e.g., 1450 may be used, and the data lane-specific delay value may be applied to the generated sampling clock via data lane-specific delay elements d0-2 to compensate for propagation time variations.
Each data-derived phase-error signal may be produced in response to a data sample and an edge sample of the data signal obtained according to one or more phases of the local oscillator signal, and a previous data sample. The previous data sample may be obtained from a data history or other storage element, or from a parallel processing phase operating on a previous time interval of the data signal.
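One well-known way to form an early/late indication from a previous data sample, an edge sample, and a current data sample is an Alexander-style (bang-bang) phase detector. The Python sketch below shows that conventional technique as an illustration only; it is not asserted to be the detector used in any particular embodiment, and the function name and the +1/−1 convention are assumptions.

```python
def phase_error(prev_data, edge, data):
    """Return (transition, early_late) for one unit interval.

    prev_data, edge, and data are binary samples taken in time order, with
    the edge sample nominally at the boundary between the two data samples.
    early_late is +1 if the clock is early, -1 if late, 0 if undetermined.
    """
    transition = prev_data != data
    if not transition:
        # Without a data transition the edge sample carries no timing information.
        return False, 0
    # If the edge sample still matches the previous data value, it was taken
    # before the transition (clock early); otherwise it was taken after (late).
    early_late = 1 if edge == prev_data else -1
    return True, early_late


result_early = phase_error(0, 0, 1)
result_late = phase_error(0, 1, 1)
result_none = phase_error(1, 1, 1)
```

The returned transition flag plays the role of a transition indication that can gate whether the early/late value is included in a subsequent combination.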
As previously described relative to
In some embodiments, the phase of the local oscillator signal used for generating each data-derived phase-error signal for a given data lane is delayed by a data lane-specific delay value. In a further embodiment, respective data lane-specific control signals indicative of the data lane-specific delay value are determined, each data lane-specific control signal generated using a plurality of data lane-specific error aggregators operating on respective data-derived phase-error signals from an associated data lane.
In some embodiments, the method of generating the composite phase-error signal comprises receiving, at a plurality of charge-pumps, the plurality of data-derived phase-error signals, and responsively generating a plurality of currents representing the data-derived phase-error signals; and combining, at a common analog summation node, the plurality of currents to generate the composite phase-error signal.
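The analog combination described above can be approximated numerically: each charge pump sources or sinks a current according to its phase-error input, and the currents add at the common node. In the Python sketch below, the unit current, the integration interval, and the node capacitance are arbitrary assumed values, and the function names are hypothetical.

```python
def charge_pump_current(early_late, unit_current=10e-6):
    """Current in amperes sourced (+) or sunk (-) for one phase-error input."""
    return early_late * unit_current


def summed_node_current(votes, unit_current=10e-6):
    """Currents from all charge pumps add at the common summation node."""
    return sum(charge_pump_current(v, unit_current) for v in votes)


def control_voltage_step(votes, dt=100e-12, cap=1e-12, unit_current=10e-6):
    """Voltage change on an assumed node capacitance over one interval dt."""
    return summed_node_current(votes, unit_current) * dt / cap


# One late lane and two early lanes leave a net current of one unit.
delta_v = control_voltage_step([-1, +1, +1])
```

With the assumed values, the net current of one unit (10 µA) applied for 100 ps to 1 pF changes the node voltage by about 1 mV.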
In some embodiments, the method of generating the composite phase-error signal comprises performing a digital combination of the plurality of data-derived phase-error signals.
In some embodiments, the data-derived phase-error signals that are determined to have had a transition in the corresponding data signal are combined to generate the composite phase-error signal.
The wire communication methods disclosed in this application are equally applicable to other communication media including optical and wireless communications. Descriptive terms such as voltage or signal level should be considered to include equivalent metrics such as current and charge. Similarly, specific examples provided herein are for purposes of description, and do not imply a limitation, in particular with regard to numbers of input signals, signal encoding, number of bits detected, etc.
As used herein, “physical signal” includes any suitable behavior and/or attribute of a physical phenomenon capable of conveying information. In accordance with at least one embodiment, physical signals may be tangible and non-transitory.
Number | Name | Date | Kind |
---|---|---|---|
5150384 | Cahill | Sep 1992 | A |
5266907 | Dacus | Nov 1993 | A |
5528198 | Baba | Jun 1996 | A |
5629651 | Mizuno | May 1997 | A |
5802356 | Gaskins | Sep 1998 | A |
6307906 | Tanji | Oct 2001 | B1 |
6316987 | Dally | Nov 2001 | B1 |
6380783 | Chao et al. | Apr 2002 | B1 |
6389091 | Yamaguchi | May 2002 | B1 |
6509773 | Buchwald et al. | Jan 2003 | B2 |
6717478 | Kim | Apr 2004 | B1 |
7199728 | Dally | Apr 2007 | B2 |
7336112 | Sha | Feb 2008 | B1 |
7535957 | Ozawa | May 2009 | B2 |
7616075 | Kushiyama | Nov 2009 | B2 |
7650525 | Chang | Jan 2010 | B1 |
7688929 | Co | Mar 2010 | B2 |
7860190 | Feller | Dec 2010 | B2 |
8036300 | Evans | Oct 2011 | B2 |
8253454 | Lin | Aug 2012 | B2 |
8791735 | Shibasaki | Jul 2014 | B1 |
9036764 | Hossain | May 2015 | B1 |
9059816 | Simpson | Jun 2015 | B1 |
9148198 | Zhang | Sep 2015 | B1 |
9306621 | Zhang | Apr 2016 | B2 |
9374250 | Musah | Jun 2016 | B1 |
9520883 | Shibasaki | Dec 2016 | B2 |
9565036 | Zerbe | Feb 2017 | B2 |
9906358 | Tajalli | Feb 2018 | B1 |
10055372 | Shokrollahi | Aug 2018 | B2 |
20030146783 | Bandy | Aug 2003 | A1 |
20040092240 | Hayashi | May 2004 | A1 |
20050024117 | Kubo | Feb 2005 | A1 |
20050128018 | Meltzer | Jun 2005 | A1 |
20050220182 | Kuwata | Oct 2005 | A1 |
20050275470 | Choi | Dec 2005 | A1 |
20060140324 | Casper | Jun 2006 | A1 |
20060232461 | Felder | Oct 2006 | A1 |
20060233291 | Garlepp | Oct 2006 | A1 |
20070001713 | Lin | Jan 2007 | A1 |
20070001723 | Lin | Jan 2007 | A1 |
20070201597 | He | Aug 2007 | A1 |
20080007367 | Kim | Jan 2008 | A1 |
20080317188 | Staszewski | Dec 2008 | A1 |
20090195281 | Tamura | Aug 2009 | A1 |
20090262876 | Arima | Oct 2009 | A1 |
20100180143 | Ware | Jul 2010 | A1 |
20100220828 | Fuller | Sep 2010 | A1 |
20110156757 | Hayashi | Jun 2011 | A1 |
20120327993 | Palmer | Dec 2012 | A1 |
20130088274 | Gu | Apr 2013 | A1 |
20130271194 | Pellerano | Oct 2013 | A1 |
20130285720 | Jibry | Oct 2013 | A1 |
20130314142 | Tamura | Nov 2013 | A1 |
20140286381 | Shibasaki | Sep 2014 | A1 |
20150078495 | Hossain | Mar 2015 | A1 |
20150117579 | Shibasaki | Apr 2015 | A1 |
20150180642 | Hsieh | Jun 2015 | A1 |
20150236885 | Ling | Aug 2015 | A1 |
20150256326 | Simpson | Sep 2015 | A1 |
20170310456 | Tajalli | Oct 2017 | A1 |
Number | Date | Country |
---|---|---|
203675093 | Jun 2014 | CN |
Entry |
---|
Loh, M., et al., “A 3×9 Gb/s Shared, All-Digital CDR for High-Speed, High-Density I/O”, IEEE Journal of Solid-State Circuits, Vol. 47, No. 3, Mar. 2012. |
Number | Date | Country | |
---|---|---|---|
20190130942 A1 | May 2019 | US |