Communications links such as high-speed input-output (I/O) communications links typically include a transmitter (TX) module that transmits a serial data bit stream across a channel to a receiver (RX) module. Usually, the serial data bit stream does not include an associated clock signal. To ensure faithful transmission of data from the TX module to the RX module, the RX module may include clock data recovery (CDR) circuitry to reconstruct the clock signal used during transmission. The recovered clock signal allows the RX module to accurately sample incoming serial data during data recovery.
Many communications links use non-return-to-zero (NRZ) signaling in which a binary signal is encoded in a two-state pulse amplitude modulation with no rest condition (i.e., 0V does not encode a logic state). For example, logic “1” may be represented by +1V, while logic “0” may be represented by −1V (rather than 0V). In transceivers using two-state NRZ signaling, typical CDR circuitry includes a single data sampler and a single edge sampler to recover the frequency and phase of the TX clock signal used to transmit a serial data bit stream.
In designing high-speed communications links, it may be desirable to encode data using higher order pulse amplitude modulation (PAM) signaling, such as PAM4. As opposed to typical two-level NRZ signaling, PAM4 signals have four distinct levels (e.g., −3, −1, +1, and +3). While PAM4 signals may allow for a higher communication bandwidth, the presence of higher order modulations in PAM4 significantly complicates implementation of conventional clock recovery algorithms in transceivers using PAM4 signaling.
In many cases, oversampled clock recovery is an effective method to handle a variety of waveform shapes and clock patterns, but is very costly in terms of hardware and algorithmic efficiency for PAM4 signals. On the other hand, Baud rate clock recovery is much more resource efficient in PAM4, but produces noticeably off-centered data sampling when the received signal does not have a sharp peak and is less effective in handling clock patterns.
It is in this context that the embodiments described herein arise.
The following embodiments describe methods and circuitry for implementing hybrid clock data recovery (CDR) in high-speed serial communications links adapted to transmit higher-order pulse amplitude modulated (PAM) signals.
It will be recognized by one skilled in the art that the present exemplary embodiments may be practiced without some or all of these specific details. In other instances, well-known operations have not been described in detail in order not to unnecessarily obscure the present embodiments.
Communications links are commonly used to transport data between separate integrated circuit packages, printed circuit boards, components, systems, etc. Such communications links may be used to connect integrated circuits that include communications capabilities, such as memory chips, digital signal processing circuits, microprocessors, application specific integrated circuits, programmable logic device integrated circuits, field-programmable gate arrays, application specific standard products, or any other suitable integrated circuit.
A high-speed link might, as an example, carry data at 10 gigabits per second or more. A high-speed communications system 10 is shown in
TX circuitry 62 may be formed on a first integrated circuit while RX circuitry 64 may be formed on a second integrated circuit (as an example). Integrated circuit devices 62 and 64 may be mounted on a printed circuit board (PCB). Transmitter block 62 may convey data to RX block 64 through channel 66. If desired, more than one channel may be used to link TX block 62 to RX block 64.
In general, channel 66 may be formed from any suitable physical transmission medium. Examples of transmission paths that may be used in channel 66 include traces on printed circuit boards, differential signaling paths made up of pairs of conductive wires, coaxial cable paths (e.g., a CAT 5 cable), fiber optic cable paths, combinations of such paths, backplane connectors, or other suitable communications link paths. In a typical system, integrated circuits 62 and 64 may be mounted on one or more circuit boards and channel 66 may involve transmission line structures fabricated on the circuit board or boards.
This example is merely illustrative. Communications links of the type described in connection with
Receiver circuitry 64 may include an RX buffer such as buffer 80, an RX equalizer such as equalizer 82, and a data latching circuit 84. In the example of
Buffer 80 may output the received data to equalization circuitry 82. Equalization circuitry 82 may provide further high-frequency boosting or direct signal level boosting to compensate for any undesired frequency-dependent signal loss commonly seen in high-speed serial links (e.g., losses in copper-based channels that exhibit undesired low-pass transfer characteristics that result in signal degradation at high data rates). Equalization circuitry 82 may implement linear equalization schemes such as finite impulse response (FIR) and feed forward equalization (FFE) schemes or nonlinear adaptive equalization schemes such as infinite impulse response (IIR) or decision feedback equalization (DFE) schemes (as examples).
Equalizer 82 may output the received data that has been equalized to RX data latch 84. Data latch 84 may be a serial-in parallel-out (SIPO) or a de-serializer data circuit (as an example). In this example, data latch 84 may convert the serial data bit stream to parallel data for subsequent processing.
Continuous time linear equalizer 200 may receive a serial data bit stream DataIn. The serial data bit stream received from channel 66, however, may contain significant high frequency amplitude distortions that cause intersymbol interference on the receiver side. CTLE circuit 200 may therefore provide a desired peaking gain (e.g., an adjustable gain difference between a high frequency AC gain and a low frequency DC gain). Alternatively, CTLE 200 may restore the transmitter/channel characteristic towards a reference characteristic with little to no intersymbol interference in order to reconstruct the serial data bitstream in a usable form.
Clock data recovery (CDR) circuitry 204 may receive an equalized data signal from CTLE circuit 200. In one embodiment, hybrid phase detector 206 of CDR circuitry 204 may contain data samplers and edge samplers to recover the phase of the clock used to transmit the serial bit stream over channel 66. Phase, frequency, period, or other relevant parameters of the recovered clock signal may be adjusted based on the edge and data samples and output to oscillator 208 as control signals. Oscillator 208 may then generate the physical waveform of the recovered clock and output RecClk back to hybrid phase detector 206 in a continuous feedback process. Phase detector circuit 206 may then use the recovered clock to sample the equalized serial bit stream to generate recovered data DataOut that may be output from CDR circuit 204.
Conventional TX/RX modules encode data serially using a two-level, non-return-to-zero (NRZ) signaling in which logic “1” is represented by a positive voltage state (e.g., +1V) and logic “0” is represented by a negative voltage state (e.g., −1V). In other words, typical NRZ signals have no logic state represented by 0V, sometimes called a rest condition.
The waveform of an illustrative NRZ serial data signal is shown in
An eye diagram of an illustrative two-state NRZ signal is shown in
While NRZ signaling is conventional in many high-speed communications links, higher bandwidths may be achieved using signals with higher order modulations, such as four-state pulse amplitude modulation (PAM4). A diagram of an illustrative PAM4 waveform is shown in
Half a clock period later, at time t=kT+T/2−φ, the PAM4 signal may transition to another one of the four possible signal values as shown by the signal traces on the right hand side of the eye diagram. Thus, a PAM4 signal has twelve possible transitions (four possible values times three possible transition destinations) compared to the two possible transitions of a conventional two-state NRZ signal. As shown in
In general, PAM4 and other higher order pulse amplitude modulation schemes such as PAM8 and PAM16 have a greater number of allowed signal states, and thus a greater number of possible transitions and transition thresholds. As a result, clock data recovery techniques that employ edge sampling (e.g., oversampled CDR) may require significantly more hardware resources and computation time in PAM4 than in two-state signal environments. Oversampling PAM4, PAM8 and higher order bitstreams may require a larger number of data and edge slicers compared to CDR methods that sample only once per signaling event (e.g., Baud rate CDR).
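The growth in transitions can be made concrete with a short sketch. It counts the ordered level-to-level transitions of an N-level PAM signal as N times (N−1), which reproduces the two transitions of NRZ and the twelve transitions of PAM4 stated above. The count of N−1 decision thresholds between adjacent levels is an assumption of this sketch, and the function name is hypothetical.

```python
def pam_transition_counts(num_levels: int) -> tuple[int, int]:
    """Return (transitions, thresholds) for an N-level PAM signal.

    transitions: each level may move to any of the other N-1 levels.
    thresholds:  assumed here to be one decision threshold between each
                 pair of adjacent levels (N-1 in total).
    """
    if num_levels < 2:
        raise ValueError("a PAM signal needs at least two levels")
    transitions = num_levels * (num_levels - 1)
    thresholds = num_levels - 1
    return transitions, thresholds


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        transitions, thresholds = pam_transition_counts(n)
        print(f"PAM{n}: {transitions} transitions, {thresholds} thresholds")
```

The quadratic growth in transitions is one way to see why edge sampling every transition becomes costly as the modulation order rises.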
Additionally, conventional oversampled CDR techniques require the data signal RX to be sampled several times per data cycle. Oversampling in this manner may require generating an appropriate number of additional sampling clock signals with a predetermined phase offset. For example, sampling the data signal at the rising edge, falling edge, and data portions of a clock cycle may require one clock signal of a given frequency and three additional clock signals of the same frequency with phase offsets of 90, 180 and 270 degrees. However, generating additional high speed clock signals may adversely affect power consumption and routing congestion in the transceiver device. CDR algorithms that require multiple samples per clock cycle, while complex enough to handle a variety of waveform shapes and clock patterns, also demand significantly more computation time and/or processing power in comparison to Baud rate algorithms.
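As a rough illustration of the phase-offset clocks mentioned above, the following sketch converts a phase offset in degrees into a time delay for a given clock frequency. The 5 GHz clock used in the example is a hypothetical value chosen only for the arithmetic, and the function name is not taken from the embodiments.

```python
def clock_phase_delays(clock_hz: float,
                       phases_deg=(0, 90, 180, 270)) -> dict[int, float]:
    """Map each phase offset (degrees) to its delay in seconds.

    A full 360-degree offset corresponds to one clock period, so a
    90-degree offset is a quarter period and 180 degrees is a half period.
    """
    if clock_hz <= 0:
        raise ValueError("clock frequency must be positive")
    period = 1.0 / clock_hz
    return {phase: period * phase / 360.0 for phase in phases_deg}


if __name__ == "__main__":
    # Hypothetical 5 GHz sampling clock (period of 200 ps).
    for phase, delay in clock_phase_delays(5e9).items():
        print(f"Clk({phase}): delayed by {delay * 1e12:.1f} ps")
```

Each additional phase is a separate high-speed clock that must be generated and routed, which is the power and congestion cost described above.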
Baud rate CDR techniques, which sample the RX bitstream only once per signal change, are more cost effective than oversampling for higher order modulations such as PAM4. Because Baud rate algorithms rely on a smaller number of data samples, however, these algorithms may have difficulty handling various clock patterns, sometimes leading to bit errors in the recovered RX data. Additionally, if the received data signal does not have a sharp peak, Baud rate CDR techniques may result in noticeably off-center sampling in comparison to oversampled CDR.
A clock data recovery scheme for signals with higher order modulations that balances the advantages of oversampled CDR and Baud rate CDR while minimizing their respective disadvantages may therefore be desirable. This hybrid CDR scheme would be more cost-effective than oversampled CDR, but have more complexity than Baud rate CDR, allowing it to handle clock patterns and a wider variety of PAM waveforms.
Hybrid phase detector 206 may also include Baud rate sampling circuit 500. Baud rate sampler circuit 500 may be used to sample incoming bitstream DataIn during the data portions of the signal. For the exemplary PAM4 signal shown in
Adaptation circuit 504 may monitor edge data signal de, Baud rate data signal d, and Baud rate error signal err and adjust the frequency and/or phase of the sampling clock accordingly. Phase and frequency information may be output as control signal Ctrl to an oscillator such as oscillator 208 of
Similarly, even Baud rate sampling portion 602-2 may include data slicers 606-2 and error slicer 608-2, each clocked by signal Clk(180). Clk(180) may be a clock signal with the same frequency as Clk(0), but with a phase offset of 180 degrees (e.g., the rising edge of Clk(180) may occur half a clock period later than the rising edge of Clk(0)). For example, if odd data slicers 606-1 sample DataIn at the rising edge of Clk(0), even data slicers 606-2 would sample DataIn at the falling edge of Clk(0). Like odd portion 602-1, even Baud rate sampling portion 602-2 may include even summation node circuit 604-2.
Partial edge oversampling circuit 502 may include edge slicer 502 clocked by signal Clk(90). Clk(90) may have the same frequency as Clk(0), but with a phase offset of 90 degrees (e.g., the rising edge of Clk(90) may occur a quarter clock period later than the rising edge of Clk(0)). Edge sampler 502 may receive a control signal corresponding to the threshold voltage of the transition edge being sampled. In the example of
It should be appreciated that phase detectors implementing pure oversampled CDR typically have more than one edge sampler. For example, phase detectors for pure oversampled CDR in PAM4 environments may include three or more edge samplers, while PAM8 phase detectors may include seven or more edge samplers (related to the number of possible transition thresholds). In contrast, partial edge oversampling circuit 502 used in hybrid phase detector 206 may include fewer edge detectors (e.g., only one slicer in the embodiment of
During normal operation of the hybrid phase detector shown in
In one embodiment, DataIn may be the output of a linear equalizer such as continuous time linear equalizer (CTLE) 200 shown in
Similarly, even summation node circuit 604-2 may have an input that also receives DataIn and a second input that receives even scaling coefficients DFE_even from DFE scaling block 610. Even scaling coefficients DFE_even may or may not be the same as odd scaling coefficients DFE_odd. Like odd data samplers 606-1, even data samplers 606-2 may be high gain flip-flops, comparators or other circuitry that converts the analog signal received at even summation node circuit 604-2 to a corresponding digital signal. DFE scaling coefficients DFE_even may be dynamically computed based on the data latched at the output of even data samplers 606-2 (i.e., d_even) and input to DFE feedback block 610 during a given cycle of Clk(180). Even data sampler 606-2 and even error sampler 608-2 may output signals d_even and err_even to adaptation logic 504.
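The role of a summation node that combines the incoming signal with feedback derived from previously latched data can be sketched with a generic one-tap decision feedback equalizer. This is a simplified illustration, not the circuit of the embodiments: it assumes two-level (±1) decisions, a single fixed tap, and subtraction of the feedback term, none of which are specified above. The function name and tap value are hypothetical.

```python
def dfe_one_tap(samples, tap):
    """Slice each sample at zero after removing one post-cursor tap.

    corrected_k = x_k - tap * d_{k-1}, where d_{k-1} is the previous
    decision (+1 or -1). No feedback is applied before the first decision.
    """
    decisions = []
    previous = 0  # no prior decision yet
    for x in samples:
        corrected = x - tap * previous  # the summation node
        decision = 1 if corrected >= 0 else -1  # the data slicer
        decisions.append(decision)
        previous = decision
    return decisions


if __name__ == "__main__":
    sent = [1, -1, 1, 1, -1]
    post_cursor = 1.2  # hypothetical, deliberately severe ISI
    received = [s + post_cursor * (sent[i - 1] if i else 0)
                for i, s in enumerate(sent)]
    print(dfe_one_tap(received, tap=post_cursor))
```

In this example a plain zero-threshold slicer would misread the second, third, and fifth samples, while the feedback term restores the transmitted sequence.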
Adaptation logic 504 may contain error minimization adaptation circuitry that attempts to minimize the phase difference between the TX clock used to transmit DataIn and the recovered sampling clock (e.g., RecClk of
If a transition occurred (i.e., if dk≠dk+1) and if the value of DataIn at an edge is equal to the value of DataIn during the data portion of the same cycle (i.e., if dek=dk), the sampling clock may be too early. This follows from the fact that, in general, the edge values of DataIn are not equal to the signal values of DataIn. For example, the PAM4 signal shown in
Similarly, if a transition occurred (i.e., if dk≠dk+1) and if the edge value of DataIn is equal to the negative of the signal value, the threshold zero transition has already occurred and the sampling clock is too late. In this case, the oversampling phase error P_Err_OS is assigned a normalized value of −1.
If, however, the signal value dk is equal to the signal value during the subsequent data cycle dk+1, a transition did not occur. In this case, there is insufficient information to detect phase error in the sampling clock only using data dk and threshold zero edge data dek. P_Err_OS may therefore be assigned a value of 0.
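The three cases above can be summarized in a minimal sketch. It assumes that the data and edge samples are compared by sign only, as a threshold-zero edge slicer would report them, and that the early case is assigned +1; both are inferences from the description rather than stated details. The function names are hypothetical.

```python
def sign(x: float) -> int:
    """Two-valued sign, as a zero-threshold slicer would report it."""
    return 1 if x >= 0 else -1


def p_err_os(d_k: float, d_k_next: float, de_k: float) -> int:
    """Oversampling phase error from one data pair and its edge sample.

    d_k, d_k_next: consecutive data samples.
    de_k:          edge sample taken between them.
    Returns +1 (clock early), -1 (clock late), or 0 (no information).
    """
    s_k, s_next, s_edge = sign(d_k), sign(d_k_next), sign(de_k)
    if s_k == s_next:
        return 0   # no transition through the zero threshold
    if s_edge == s_k:
        return 1   # edge sample still matches d_k: sampling too early
    return -1      # edge sample already has the opposite sign: too late


if __name__ == "__main__":
    print(p_err_os(3, -3, 0.5))   # early
    print(p_err_os(3, -3, -0.5))  # late
    print(p_err_os(1, 3, 2))      # no zero crossing
```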
It should be appreciated that the oversampled phase error P_Err_OS may be assigned any suitable values to designate early sampling, late sampling, and no information cases; the use of +1, −1, and 0 in the above cases is merely illustrative. The cases of on-time, early, and late edge sampling are illustrated further in the simplified eye diagrams of
Referring back to
$$\mathrm{P\_Err\_Baud} = -\operatorname{sgn}(e_k)\cdot\operatorname{sgn}(a_{k+1} - a_{k-1}) \qquad (2)$$
where ak+1 and ak−1 are either +1 or −1 depending on the sign of the (k+1)th and (k−1)th data samples, respectively. For example, ak−1 may be −1 if dk−1 is negative. Error signals ek may be computed based on discrepancies between sampled data signals dk and an expected Baud rate locking condition (e.g., by comparing dk to a fixed reference voltage). The algorithm for computing ek and Baud rate phase error P_Err_Baud is described in further detail with respect to
$$e_k = d_k - (a_k \cdot V_{\mathrm{ref}}) \qquad (3)$$
where ak is 1 if the signal value dk is positive and −1 if the sampled signal value dk is negative. Vref may be a reference voltage that depends on one or more of the allowed signal values. Thus, ek may be the voltage difference between an expected signal value Vref and the sampled signal value dk.
The Baud rate phase detection algorithm may determine whether the sample clock is early or late by comparing the sign of the error with the signs of the signal before and after transition. In particular, the phase error P_Err_Baud may be computed by equation (2) described in connection with
In the example of
On the other hand, a late sample of waveform 900 may yield dk less than the reference voltage Vref. Here, ak is equal to 1 since dk is still positive. Thus, ek is negative, while ak+1 minus ak−1 is again negative. The sign of P_Err_Baud is therefore negative. After computing a negative Baud rate phase error as above, a Baud rate phase detector may adjust the phase of the sampling clock such that the sampling edge occurs earlier.
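Equations (2) and (3) can be exercised with a short sketch. The numeric sample values and the reference voltage of 1.0 are hypothetical, and the early case (a sample above Vref on a falling waveform) is inferred from equation (2) and the late case described above. The function names are not taken from the embodiments.

```python
def level_sign(d: float) -> int:
    """a_k: +1 if the sampled value is positive, otherwise -1."""
    return 1 if d > 0 else -1


def sgn(x: float) -> int:
    """Three-valued sign: -1, 0, or +1."""
    return (x > 0) - (x < 0)


def baud_error(d_k: float, v_ref: float) -> float:
    """Equation (3): e_k = d_k - a_k * Vref."""
    return d_k - level_sign(d_k) * v_ref


def p_err_baud(d_prev: float, d_k: float, d_next: float,
               v_ref: float) -> int:
    """Equation (2): -sgn(e_k) * sgn(a_{k+1} - a_{k-1})."""
    e_k = baud_error(d_k, v_ref)
    slope = level_sign(d_next) - level_sign(d_prev)
    return -sgn(e_k) * sgn(slope)


if __name__ == "__main__":
    # Falling waveform: previous sample positive, next sample negative.
    print(p_err_baud(1.0, 1.2, -1.0, v_ref=1.0))  # sample above Vref
    print(p_err_baud(1.0, 0.8, -1.0, v_ref=1.0))  # sample below Vref
```

When the neighboring samples have the same sign, the second factor is zero and the sketch reports no phase information for that cycle.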
Referring back to
P_Err_Mix={w*P_Err_OS}+{(1−w)*P_Err_Baud} (4)
where w is a numerical weighting factor that quantifies the relative proportion of oversampling to Baud rate phase detection implemented by the hybrid phase detector (e.g., the number of edge samplers in partial edge oversampling circuit 502). In this case, the weighting parameter w may be a real number between 0 and 1, inclusive. For example, w=0 may describe a phase detector implementing pure Baud rate CDR, and w=1 may describe a phase detector implementing pure oversampled CDR. For a hybrid phase detector implementing a hybrid CDR algorithm, w may be a fractional value between 0 and 1.
Computed in this way, P_Err_Mix may represent the overall phase error in the recovered clock signal. Thus, at step 706, the value of P_Err_Mix may be used to adjust an oscillator such as oscillator 208 such that the oscillator outputs a corrected recovered clock. In particular, adaptation logic may change the value of the phase parameter output to the oscillator based on sign and magnitude of P_Err_Mix. By continuously computing P_Err_Mix over a number of data cycles and adjusting the oscillator in this way, adaptation logic within the hybrid phase detector may reduce or eliminate any phase difference between the TX clock used to transmit a PAM bitstream and the recovered clock generated by the RX oscillator.
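A minimal sketch of equation (4) and of one possible correction step follows. The weighted sum is taken directly from equation (4); the first-order phase update, its gain, and the convention that a larger phase value means later sampling are assumptions made only for illustration. The function names are hypothetical.

```python
def p_err_mix(p_err_os: float, p_err_baud: float, w: float) -> float:
    """Equation (4): w * P_Err_OS + (1 - w) * P_Err_Baud, with 0 <= w <= 1."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1, inclusive")
    return w * p_err_os + (1.0 - w) * p_err_baud


def update_phase(phase: float, p_err: float, gain: float = 0.01) -> float:
    """Assumed first-order update of the sampling phase.

    A positive error (clock early) moves the phase later; a negative
    error (clock late) moves it earlier.
    """
    return phase + gain * p_err


if __name__ == "__main__":
    phase = 0.0
    # Hypothetical per-cycle (P_Err_OS, P_Err_Baud) observations.
    observations = [(1, 1), (0, 1), (-1, 1), (0, -1)]
    for os_err, baud_err in observations:
        mixed = p_err_mix(os_err, baud_err, w=0.25)
        phase = update_phase(phase, mixed)
        print(f"P_Err_Mix={mixed:+.2f}  phase={phase:+.4f}")
```

Setting w to 0 or 1 in this sketch reduces it to the pure Baud rate or pure oversampled case, respectively.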
The embodiments thus far have been described with respect to integrated circuits. The methods and apparatuses described herein may be incorporated into any suitable circuit. For example, they may be incorporated into numerous types of devices such as programmable logic devices, application specific standard products (ASSPs), application specific integrated circuits (ASICs), microcontrollers, microprocessors, central processing units, graphics processing units (GPUs), etc. Examples of programmable logic devices include programmable array logic (PALs), programmable logic arrays (PLAs), field programmable logic arrays (FPLAs), electrically programmable logic devices (EPLDs), electrically erasable programmable logic devices (EEPLDs), logic cell arrays (LCAs), complex programmable logic devices (CPLDs), and field programmable gate arrays (FPGAs), just to name a few.
The programmable logic device described in one or more embodiments herein may be part of a data processing system that includes one or more of the following components: a processor; memory; IO circuitry; and peripheral devices. The data processing system can be used in a wide variety of applications, such as computer networking, data networking, instrumentation, video processing, digital signal processing, or any other suitable application where the advantage of using programmable or re-programmable logic is desirable. The programmable logic device can be used to perform a variety of different logic functions. For example, the programmable logic device can be configured as a processor or controller that works in cooperation with a system processor. The programmable logic device may also be used as an arbiter for arbitrating access to a shared resource in the data processing system. In yet another example, the programmable logic device can be configured as an interface between a processor and one of the other components in the system. In one embodiment, the programmable logic device may be one of the family of devices owned by INTEL®/ALTERA® Corporation.
Although the methods of operations were described in a specific order, it should be understood that other operations may be performed in between described operations, described operations may be adjusted so that they occur at slightly different times, or described operations may be distributed in a system which allows occurrence of the processing operations at various intervals associated with the processing, as long as the processing of the overlay operations is performed in a desired way.
The following examples pertain to further embodiments.
Example 1 is an integrated circuit comprising: a first sampling circuit that receives a data signal and obtains a corresponding edge sample; a second sampling circuit that receives the data signal and obtains a corresponding error sample; and adaptation logic circuitry that adjusts the first and second sampling circuits based on the edge sample and the error sample.
Example 2 is the integrated circuit of Example 1, where the first sampling circuit optionally comprises an oversampling circuit.
Example 3 is the integrated circuit of any one of Examples 1-2, where the second sampling circuit optionally comprises a baud rate sampling circuit.
Example 4 is the integrated circuit of any one of Examples 1-3, where the second sampling circuit optionally further obtains a data sample that is fed to the adaptation logic circuitry.
Example 5 is the integrated circuit of any one of Examples 1-4, where the first and second sampling circuits optionally receive a data signal that transitions from a first data value to a second data value, and the edge sample is optionally sampled while the data signal is transitioning from the first data value to the second data value.
Example 6 is the integrated circuit of any one of Examples 1-5, where the data signal is optionally modulated using at least a fourth order pulse amplitude modulation (PAM) scheme.
Example 7 is the integrated circuit of any one of Examples 1-6, where the data signal has a predetermined number of transition thresholds, and the first sampling circuit is optionally configured to sample only a subset of the predetermined number of transition thresholds.
Example 8 is the integrated circuit of any one of Examples 1-7, where the adaptation logic circuitry optionally uses the edge sample to determine whether a first sampling clock associated with the first sampling circuit is early or late and to generate a first error signal.
Example 9 is the integrated circuit of any one of Examples 1-8, where the adaptation logic circuitry optionally uses the error sample to determine whether a second sampling clock associated with the second sampling circuit is early or late and to generate a second error signal.
Example 10 is the integrated circuit of any one of Examples 1-9, where the adaptation logic circuitry optionally combines the first and second error signals using a predetermined weighting scheme.
Example 11 is a method of operating an integrated circuit, the method comprising: receiving a data signal; obtaining a data sample on the data signal; obtaining an error sample on the data signal; obtaining an edge sample on the data signal; and adjusting a clock signal based on the obtained data sample, error sample, and the edge sample.
Example 12 is the method of Example 11, wherein obtaining the data sample and the error sample optionally comprises sampling the data signal with a baud rate sampling circuit.
Example 13 is the method of any one of Examples 11-12, wherein obtaining the edge sample optionally comprises sampling the data signal with a partial oversampling circuit.
Example 14 is the method of any one of Examples 11-13, optionally further comprising: generating a first error value by comparing the data sample and the edge sample; and generating a second error value based on the data sample and the error sample.
Example 15 is the method of any one of Examples 11-14, optionally further comprising: scaling the first error value by a first weighting factor; scaling the second error value by a second weighting factor; and computing a third error value by combining the scaled first error value and the scaled second error value, wherein adjusting the clock signal comprises dynamically adjusting the clock signal based on the third error value.
Example 16 is a communications system comprising: a transmitter that transmits a signal; and a receiver that receives the transmitted signal, the receiver comprising: a baud rate sampling circuit that obtains first samples; an oversampling circuit that obtains second samples; and an adaptation circuit that adjusts a clock signal based on the first and second samples.
Example 17 is the communications system of Example 16, where the baud rate sampling circuit optionally supports a modulation scheme selected from the group consisting of: pulse amplitude modulation (PAM) 4, PAM 8, and PAM 16.
Example 18 is the communications system of any one of Examples 16-17, where the signal has a predetermined number of allowed data transitions, and the oversampling circuit optionally evaluates only a subset of the predetermined number of allowed data transitions.
Example 19 is the communications system of any one of Examples 16-18, where the oversampling circuit optionally includes only one edge sampler.
Example 20 is the communications system of any one of Examples 16-19, where the edge sampler optionally has a threshold value of zero.
Example 21 is an integrated circuit comprising means for receiving a data signal; means for obtaining a data sample on the data signal; means for obtaining an error sample on the data signal; means for obtaining an edge sample on the data signal; and means for adjusting a clock signal based on the obtained data sample, error sample, and the edge sample.
Example 22 is the integrated circuit of Example 21, where the means for obtaining the data sample and the error sample optionally comprises means for sampling the data signal with a baud rate sampling circuit.
Example 23 is the integrated circuit of any one of Examples 21-22, where the means for obtaining the edge sample optionally comprises means for sampling the data signal with a partial oversampling circuit.
Example 24 is the integrated circuit of any one of Examples 21-23, optionally further comprising means for generating a first error value by comparing the data sample and the edge sample; and means for generating a second error value based on the data sample and the error sample.
Example 25 is the integrated circuit of any one of Examples 21-24, optionally further comprising: means for scaling the first error value by a first weighting factor; means for scaling the second error value by a second weighting factor; and means for computing a third error value by combining the scaled first error value and the scaled second error value, wherein the means for adjusting the clock signal comprises means for dynamically adjusting the clock signal based on the third error value.
For instance, all optional features of the apparatus described above may also be implemented with respect to the method or process described herein. The foregoing is merely illustrative of the principles of this disclosure and various modifications can be made by those skilled in the art.