This application claims the priority under 35 U.S.C. §119 of European patent application no. 10195747.0, filed on Dec. 17, 2010, the contents of which are incorporated by reference herein.
The invention relates to a multi-phase clock and data recovery system.
For very high-speed serial data transmission typically embedded clock signaling is applied, where the transmitter utilizes a certain encoding scheme to include sufficient clock information in the serialized data stream to allow the receiving side to retrieve the originally transmitted data by means of Clock and Data Recovery (CDR) and a complementary decoder. Coding schemes may additionally provide signal conditioning like for example dc-balancing and/or spectral shaping. An often applied coding scheme is 8B10B, where every data byte translates into a 10-bit symbol, and that also provides control symbols, some of them including unique sequences to unambiguously determine the symbol boundaries.
The Clock and Data Recovery (CDR) function in the receiving path can be accomplished by a synchronous solution utilizing a data-tracking PLL that is feedback-controlled to sample the center of the bits, or by an over-sampled solution which samples the input signal more than twice per bit period with a clock derived from a reference clock followed by a digital data & clock recovery algorithm.
Although over-sampled solutions have some benefits, a major disadvantage is that they require more circuit speed in the implementation and therefore typically also consume more power. This is especially critical if an implementation targets the maximum achievable speed in a certain semiconductor process.
For synchronous data-tracking PLLs the double-sampled architecture (half-bit spaced alternating center and edge samples) with early-late phase-detection (also called bang-bang phase detection) is often utilized, as this provides intrinsically good phase alignment because the clock and the data recovery functions utilize the same samplers and have matched signal paths.
In order to achieve very high data rates in a technology with limited circuit speed, it is beneficial to apply parallelism by means of multi-phase oscillators and distributed interleaved samplers, as shown in
It is therefore an object of the invention to improve the speed of the data acquisition and correction of a clock and data recovery circuit.
This object is achieved in a multi-phase clock and data recovery circuit system comprising:
a voltage controlled oscillator including a plurality of identical structural cells coupled in a ring, the voltage controlled oscillator providing a first plurality of phase shifted signals having the same frequency;
a feedback loop including:
a second plurality of data samplers adapted to receive the first plurality of phase shifted signals provided by the voltage controlled oscillator; and
a phase detector coupled to a phase alignment circuit receiving output signals generated by the second plurality of data samplers and generating control signals to the voltage controlled oscillator at a bit rate of the input signal.
In an embodiment of the multi-phase clock and data recovery circuit system, the first plurality of phase shifted signals comprises a first set of signals and a second set of signals, coupled in pairs, each signal of the first set having a corresponding quadrature signal in the second set. In the multi-phase clock and data recovery circuit system, the signals of the first set of signals and of the second set of signals are input to a respective third set of data samplers and fourth set of data samplers.
Preferably, in the multi-phase clock and data recovery circuit system, the data samplers comprise one hot data samplers. The one hot data samplers advantageously comprise T latches or T flip-flops generating T-aligned signals. In the data and clock recovery circuit according to the present invention, the phase detector receives a combination of the T-aligned signals.
The invention is defined by the independent claims. Dependent claims define advantageous embodiments.
The above and other advantages will be apparent from the exemplary description of the accompanying drawings in which
The present invention describes distributed interleaved phase-detectors in a multi-phase CDR architecture, operated directly at the phase-skewed sampler outputs in order to obtain phase-detection feedback at the bit rate, thereby allowing improved tracking bandwidth. The phase alignment between samples will typically still be applied to provide the desired set of samples as a data word on a single-phase clock at its outputs towards the next function in the receive path, but this phase alignment is no longer part of the phase-tracking feedback loop. Furthermore, this invention enables the application of distributed interleaved charge-pumps for improved control linearity.
The present invention describes a solution for bit-wise phase detection and bit-rate phase-feedback in a multi-phase architecture operated with two samples per bit, using sets of three consecutive samples, each set including the last taken sample of the previous set and the next two samples, wherein phase detection is performed on each individual set when all sample results in that set are available, and wherein based on these phase detector decisions frequency control feedback is provided at the bit rate.
It is part of this invention to apply samplers whose output can indicate one of three possible states: a decision that the input was representing a logical “0” at the sampling moment, a decision that the input was representing a logical “1” at the sampling moment, or no decision because the sampler is either in its reset phase or is already sampling but has not reached a decision yet.
The referred type of sampler will be denoted as fractional-T sampler because it provides information about the sampling decision only for a fraction of the total oscillator period. In order to generate full-T pulses from such a sampler output, an additional full-T generating latch or flip-flop can be applied behind it, but these additional full-T generating latches or flip-flops are not part of the phase control feedback loop.
It is also part of the invention to apply samplers with two separate one-hot logical outputs to indicate a “0” or “1” decision, while none of these two outputs is asserted during the reset phase and as long as there is no decision during the sampling phase.
A multi-phase architecture example for 20-phases, using fractional-T samplers, is shown in
Each edge detector monitors three consecutively clocked sampler outputs. The middle of those three samples will become an edge sample and the two surrounding samples therefore data samples. Each edge-detector indicates whether the frequency needs to be increased, decreased or shouldn't change. These correction pulses are fed into charge-pumps, which inject current into or subtract a current from their outputs when they receive an up or down pulse on their inputs. The charge-pump outputs are summed into the loop-filter to adapt the oscillator frequency, thereby also correcting the phases.
Let us consider first the edge-detector logic function, not looking at timing aspects immediately. For comprehensibility of the logic functions, this will be done as if the edge-detectors are operated on forever-present data (di) and edge (xi) sample results. It should be kept in mind that this is only done for comprehensibility, as for an implementation according to this invention the edge-detectors hook-up to the fractional-T sampler outputs, whose results are not forever present. After explanation of the logic functions, the fractional-T sampler will be described in more detail, and finally the edge-detector timing aspects and the actual hook-up of the edge-detectors to the fractional-T samplers will be discussed in more detail.
f shows an embodiment which is a straightforward implementation of the terms in both logic functions.
These are only a few possible example embodiments to implement the desired logic functions, but several alternatives are possible.
Note that in the explanation of the edge-detector it is implicitly assumed that the up and dw correction pulses are indicated by a logic one while no correction corresponds with logic zero. This is an arbitrary choice and could be chosen inverse for either up, dw or both.
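As an illustration of the edge-detector decision on one set of three samples, the following minimal sketch assumes the conventional early-late (bang-bang) rule: a correction is only produced when the two data samples differ, and its direction depends on which data sample the edge sample agrees with. The function name `edge_detect` and the mapping of "early" onto dw and "late" onto up are assumptions made for this example, not taken from the described embodiments; as noted above, the polarity could equally be chosen inverse.

```python
from typing import Tuple


def edge_detect(d_prev: int, x: int, d_next: int) -> Tuple[int, int]:
    """Return (up, dw) for three consecutive samples: data, edge, data.

    Assumed convention (hypothetical, for illustration only):
    - up = 1 requests a frequency increase, dw = 1 a decrease,
    - logic one means "correction pulse", logic zero means "no correction".
    """
    # Without a data transition there is no timing information.
    transition = d_prev != d_next
    # Edge sample already shows the new bit: the clock is late, speed up.
    up = int(transition and x == d_next)
    # Edge sample still shows the old bit: the clock is early, slow down.
    dw = int(transition and x == d_prev)
    return up, dw


if __name__ == "__main__":
    for samples in [(0, 0, 1), (0, 1, 1), (1, 1, 1), (1, 0, 0)]:
        print(samples, edge_detect(*samples))
```

With this rule up and dw are never asserted together, and a set without a transition between its two data samples yields no correction.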
Note that embodiments of the edge-detectors in actual practical designs may additionally include an enabling signal to disable parts of the design when these are not needed.
Furthermore, the fractional-T sampler receives at least one clock phase to sample the data, where one edge triggers the sampling, while the other edge initiates the reset of the sampler to get ready for the next sampling with little or no impact from previous decisions. In the examples it is chosen, for comprehensibility reasons, that the rising edge triggers the sampling and the falling edge starts the reset, but note that this can also be chosen oppositely.
For routing reasons, it is attractive to have each sampler operated from a single clock phase. In that case the sampling and reset phase will each roughly take half of the oscillator period. Note that multiple clock phases can be used for each sampler to modify the ratio between sampling and reset phase, at the cost of some extra clock-phase routing complexity.
Fractional-T samplers make a decision during the sampling phase about the logic value represented by their input signal during the input sensitivity time-window and provide that decision to their outputs. The input sensitivity window is the time around the sampling triggering edge during which the decision is determined, although the decision may become available later at the output due to the time required for regeneration to a logic value. In order to distinguish a logic one and logic zero and no decision or reset from each other, at least three states are needed. A simple and convenient way to accomplish that is with 2 logic one-hot outputs, here denoted by postfixes ‘p’ and ‘n’, that indicate a logic one when the ‘p’ output is asserted, a logic zero if the ‘n’ output is asserted and no decision or reset phase if none of them is asserted.
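The three-state meaning of the two one-hot outputs can be sketched as a small decoding function. This is only an illustration: the name `decode_one_hot` is hypothetical, `None` is used here to stand for "no decision or reset phase", and treating both outputs asserted as an error is an assumption of this sketch.

```python
from typing import Optional


def decode_one_hot(p: int, n: int) -> Optional[int]:
    """Map the 'p' and 'n' sampler outputs onto a decision.

    Returns 1 if 'p' is asserted, 0 if 'n' is asserted, and None when
    neither is asserted (reset phase, or sampling without a decision yet).
    """
    if p and n:
        # Assumption: a one-hot sampler never asserts both outputs.
        raise ValueError("both outputs asserted: not a valid one-hot state")
    if p:
        return 1
    if n:
        return 0
    return None


if __name__ == "__main__":
    for p, n in [(1, 0), (0, 1), (0, 0)]:
        print(f"p={p} n={n} ->", decode_one_hot(p, n))
```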
This figure shows an embodiment with NMOS input pair which is particularly suitable for input signals with a high common-mode level. Obviously, a complementary version having a PMOS input pair can be applied instead for low common-mode input levels.
The one-hot sampler outputs provide a windowing function, in that the edge-detector can only generate up or dw pulses during the periods in which all three samplers have reached a decision and the logic functions discussed before become true. The three samples for one edge-detector are nominally skewed by half a bit each, resulting in a one bit period skew between the first and the third sample. For a 20-phase architecture, this results in up or dw correction pulse lengths of about 4 bits or less. In general for a 2N-phase architecture, where the sampling phase of each sampler takes roughly half of the N-bit oscillator period, the up or dw correction pulse duration will be about (N/2−1) bits or less.
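The 4-bit figure for the 20-phase case can be reproduced with a short calculation. The sketch below assumes each sampler runs from a single clock phase, so that its sampling phase lasts half of the oscillator period, and it ignores the regeneration delay of the samplers, which is one reason the real pulse can be shorter. The function name `correction_window_bits` is hypothetical.

```python
def correction_window_bits(num_phases: int) -> float:
    """Nominal upper bound, in bit periods, of an up or dw pulse.

    Assumptions (idealized): two samples per bit, one clock phase per
    sampler, sampling phase equal to half the oscillator period, and
    decisions available immediately at the sampling edge.
    """
    if num_phases <= 0 or num_phases % 2:
        raise ValueError("the number of phases must be a positive even number")
    bits_per_period = num_phases // 2       # bits sampled per oscillator period
    sampling_phase = bits_per_period / 2    # half the oscillator period, in bits
    skew_first_to_third = 1.0               # two half-bit steps between samples
    # All three samplers hold a decision only while their sampling phases overlap.
    return sampling_phase - skew_first_to_third


if __name__ == "__main__":
    print(correction_window_bits(20))  # 20-phase example from the text
```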
The fact that the up or dw pulse can become shorter is illustrated by an example in
Note that if there is no transition the samplers will reach a decision, but still no up or dw pulse will be generated as the logic functions will not become true.
The behavior of samplers and edge-detectors can therefore be summarized as follows:
The edge detector up and dw outputs drive charge-pumps to pump a current in or out of a loop-filter. An example of a charge-pump and loop-filter implementation containing a pole and a zero, is shown in
Note that the loop-filter is shared for all charge-pumps. Furthermore, the bias generation and unity gain buffer for the dummy paths can potentially be shared for all charge-pumps.
In order to reduce the number of charge-pumps the up and dw pulse of the two edge-detectors driven by samplers with opposite phase clocks can be combined with an OR function as they never overlap. This is illustrated in
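That pulses from opposite-phase edge-detectors cannot overlap can be checked with an idealized timing model. The sketch below is an assumption-laden illustration: time is counted in ticks of half a bit period, each sampler's sampling phase is taken as exactly half the oscillator period, decision delay is ignored, and edge-detector j is assumed to use samplers 2j, 2j+1 and 2j+2. All function names are hypothetical.

```python
from typing import Set


def sampling_ticks(sampler: int, num_phases: int) -> Set[int]:
    """Half-bit ticks (modulo one oscillator period) in which a sampler samples."""
    half_period = num_phases // 2
    return {(sampler + t) % num_phases for t in range(half_period)}


def detector_window(j: int, num_phases: int) -> Set[int]:
    """Ticks in which all three samplers of edge-detector j hold a decision."""
    window = set(range(num_phases))
    for i in range(3):
        window &= sampling_ticks((2 * j + i) % num_phases, num_phases)
    return window


def opposite_detector(j: int, num_phases: int) -> int:
    """Index of the edge-detector whose samplers use the opposite clock phases."""
    num_detectors = num_phases // 2
    if num_detectors % 2:
        raise ValueError("this pairing assumes an even number of edge-detectors")
    return (j + num_detectors // 2) % num_detectors


if __name__ == "__main__":
    phases = 20
    for j in range(phases // 4):
        k = opposite_detector(j, phases)
        a, b = detector_window(j, phases), detector_window(k, phases)
        print(f"detectors {j} and {k}: overlap = {sorted(a & b)}")
```

Under these assumptions the two windows of each pair are disjoint, so a plain OR of the two up signals (and likewise of the two dw signals) loses no information.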
a shows a possible embodiment of a level-shifter, which can be used in case of low-swing oscillator signals and logical output clock signal levels are desirable. The inputs of level-shifters are coupled to the different phases in the oscillator core. There are 2N single-ended level-shifters needed to cover all phases of a 2N-phase architecture.
The delay of a level-shifter, defined as the delay between the input signal transition that initiates the sampling-triggering output edge and the actual output transition being the sampling-triggering event, should show little variation, as several instantiations of the level-shifter operate in parallel (see
In CMOS inverter implementations, typically, the PMOS device has a larger W/L than the NMOS device in order to balance drive-strengths of both devices. However, for level-shifter performance, it can be beneficial to make the NMOS device Mn stronger than the PMOS device Mp, by increasing the size of Mn and decreasing the size of Mp, while keeping the input capacitance constant. This leads to a certain optimum, where the W/L of the NMOS device Mn may become even larger than the W/L of the PMOS device Mp. This optimization reduces timing variation of the falling edge at node ‘ofe’. Furthermore, this optimization causes the level-shifter output duty cycle to deviate from 50%, which can be advantageously used to lengthen the ‘sampling’ phase of the samplers at the expense of the ‘reset’ phase. A side effect of this optimization is that the rising edge on node ‘ofe’ will show increased variation, which does not have to be a limiting factor as a CDR can be implemented such that it relies on the accuracy of one clock edge type only. Between node ‘ofe’ with a timing-optimized falling edge, and a data sampler, an odd number of inverters should be inserted if the data sampler samples the input data on a rising clock edge. Otherwise, if the data sampler samples the input data on a falling clock edge, an even number of inverters should be inserted there.
b shows an example of a level-shifter embodiment with differential input, consisting of a differential buffer stage followed by two single-ended structures which are each similar to
In addition to the previously described set of edge-detectors, an additional set of edge-detectors may optionally be applied, driven by an equidistantly phase-spaced sub-set of all samplers. This extra set of edge-detectors is connected to the sub-set of used samplers as if only that sub-set of samplers were present. For example, for the 20-phase oscillator, this can be accomplished by using only every other sampler and ignoring the samplers in between, which allows the unused samplers to be disabled. The additional set of edge-detectors is then connected as if only these 10 samplers were present, which is equivalent to a 10-phase oscillator architecture. Note that the position of center and/or edge samples will change compared to the case using all 20 phases. It may also be advantageous to intentionally shift center and edge positions between the full set and a sub-set in order to balance the load on the samplers. The up/dw outputs of an additional set of edge-detectors may drive a selected sub-set of the already present charge-pumps or an additional set of charge-pumps. This principle is not limited to one extra set of edge-detectors, nor to a sub-sampling factor of two. For example, in case of a 24-phase oscillator, it would be possible to use sub-sets of 12 (every second sample), 8 (every third sample), 6 (every fourth sample), or 4 (every sixth sample) samples. Using one or more sub-sets is advantageous when a large range of input data rates needs to be supported and the tuning range of the oscillator is limited.
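The sub-set sizes listed for the 24-phase example can be enumerated with a few lines of code. In this sketch the requirement that a sub-set contains an even number of samplers follows from operating with two samples per bit; the lower bound of 4 samplers (`min_phases`) is an assumption chosen only so that the result matches the sizes named above. The function names are hypothetical.

```python
from typing import List, Tuple


def usable_subsets(num_phases: int, min_phases: int = 4) -> List[Tuple[int, int]]:
    """Return (step, subset_size) pairs for equidistant sampler sub-sets.

    'step' is the sub-sampling factor (use every step-th sampler) and
    'subset_size' the number of samplers that remain in use.
    """
    result = []
    for step in range(2, num_phases):
        if num_phases % step:
            continue  # spacing would not be equidistant around the ring
        size = num_phases // step
        # Two samples per bit require an even number of remaining phases.
        if size % 2 == 0 and size >= min_phases:
            result.append((step, size))
    return result


def subset_samplers(num_phases: int, step: int) -> List[int]:
    """Indices of the samplers kept when only every step-th sampler is used."""
    return list(range(0, num_phases, step))


if __name__ == "__main__":
    print(usable_subsets(24))
    print(subset_samplers(20, 2))
```

For the 20-phase oscillator, `subset_samplers(20, 2)` lists the 10 samplers that stay active when every other sampler is ignored.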
Although it is most convenient to apply single clock samplers, usage of two or more phases per sampler would provide some potentially interesting degrees of freedom:
Note that in this invention many examples have been given using a multi-phase architecture with 20 phases. A 20-phase architecture has the benefit that it fits the 10-bit granularity of the often applied 8B10B coding. However, it should be clear to the skilled reader that the invention can be applied for any even number of phases, 2N, where N corresponds with the number of parallel sampled bits.
Note that for any of the example circuit embodiments, it may sometimes be advantageous in practice to apply complementary implementations, or different implementations with similar functionality.
Note that specific choices for logic high or low levels for certain signals in this document are all examples to illustrate the principle, and are not limiting the scope. Note that although circuit embodiment examples have been shown using CMOS device technology which has well-known advantages for the implementation of logic functions, this invention is not limited to application in CMOS technology, but can also be realized in other technologies, like for example bipolar or BiCMOS.
Note that although this invention is particularly suitable for application in integrated circuits, it may also be applied for systems where the part according to this invention includes multiple components.
It is remarked that the scope of protection of the invention is not restricted to the embodiments described herein. Neither is the scope of protection of the invention restricted by the reference numerals in the claims. The word “comprising” does not exclude other parts than those mentioned in the claims. The word “a(n)” preceding an element does not exclude a plurality of those elements. Means forming part of the invention may both be implemented in the form of dedicated hardware or in the form of a programmed purpose processor. The invention resides in each new feature or combination of features.
Number | Date | Country | Kind |
---|---|---|---|
10195747.0 | Dec 2010 | EP | regional |