The present application is related to U.S. patent application Ser. No. 11/540,946, filed Sep. 29, 2006, entitled “Method and Apparatus for Determining Latch Position for Decision-Feedback Equalization Using Single-Sided Eye,” U.S. patent application Ser. No. 11/686,148, filed Mar. 14, 2007, entitled “Method and Apparatus for Decision-Feedback Equalization Using Single-Sided Eye with Global Minimum Convergence,” and U.S. patent application Ser. No. 11/864,110, filed Sep. 28, 2007, entitled “Method and Apparatus for Determining Threshold of One or More DFE Transition Latches Based on Incoming Data Eye,” each incorporated by reference herein.
The present invention relates generally to decision-feedback equalization techniques, and more particularly, to decision-feedback equalization techniques that employ oversampled phase detectors.
Digital communication receivers must sample an analog waveform and then reliably detect the sampled data. Signals arriving at a receiver are typically corrupted by intersymbol interference (ISI), crosstalk, echo, and other noise. Thus, receivers must jointly equalize the channel, to compensate for such distortions, and decode the encoded signals at increasingly high clock rates. Decision-feedback equalization (DFE) is a widely-used technique for removing intersymbol interference and other noise. For a detailed discussion of decision feedback equalizers, see, for example, R. Gitlin et al., Digital Communication Principles, (Plenum Press 1992) and E. A. Lee and D. G. Messerschmitt, Digital Communications, (Kluwer Academic Press, 1988), each incorporated by reference herein. Generally, decision-feedback equalization utilizes a nonlinear equalizer to equalize the channel using a feedback loop based on previously decided symbols.
In one typical DFE implementation, a received analog signal is sampled and compared to one or more thresholds to generate the detected data. A DFE correction, v(t), is subtracted in a feedback fashion to produce a DFE corrected signal w(t). The same clock, generated from the received signal by a clock and data recovery (CDR) circuit, is generally used to sample the incoming signal and for the DFE operation.
A number of techniques exist for clock and data recovery for DFE. Nonetheless, a need still exists for methods and apparatus for generating additional samples/decisions (other than the main data sample/decision) to allow CDR for DFE equalized signals, as well as adaptation of the parameters that generate these additional samples. Yet another need exists for methods and apparatus for properly sampling a multi-tap DFE equalized signal with a bang-bang phase detector or a quasi-linear phase detector-based CDR loop.
Generally, methods and apparatus are provided for decision-feedback equalization with an oversampled phase detector. According to one aspect of the invention, a method is provided for detecting data in a receiver employing decision-feedback equalization. A received signal is sampled using a data clock and a transition clock to generate a data sample signal and a transition sample signal. A DFE correction is obtained for each of the data sample and transition sample signals to generate DFE detected data and DFE transition data. One or more coefficients used for the DFE correction for the transition sample signals are adapted using the DFE transition data.
The adaptation may optionally comprise a precomputation for the DFE correction for the transition sample signals. Further, the adaptation may optionally be conditioned upon a data transition. The coefficients used for the DFE correction for the transition sample signals can be computed, for example, as a function of a prior value of the transition sample signals, for example, using one or more digital state machines. One or more parameters of the adapting step, such as an adaptation rate, may optionally be altered upon detection of a predefined pattern.
A more complete understanding of the present invention, as well as further features and advantages of the present invention, will be obtained by reference to the following detailed description and drawings.
The present invention provides methods and apparatus for generating additional samples/decisions (other than the main data sample/decision) to allow CDR for DFE equalized signals, as well as adaptation of parameters which generate these additional samples. According to one aspect of the present invention, loop unrolling is employed for the data samples, as well as for additional samples used by oversampled phase detectors, such as transition, early and late samples. Raw samples clocked by the transition, early and late clock are processed using decision feedback with unrolling to produce the corresponding DFE equalized transition, early and late samples (and optionally additional samples).
According to another aspect of the present invention, an algorithm is provided for updating the transition, early, late or other coefficients that are employed for DFE equalization. While the exemplary embodiments focus on DFE architectures where the data portion is not a standard analog feedback loop but a hybrid loop employing analog feedback as well as “pre-computation” or “unrolling” techniques, the present invention could be used with fully unrolled, fully analog feedback, or hybrid architectures, as would be apparent to a person of ordinary skill in the art.
The phase of the analog waveform is typically unknown and there may be a frequency offset between the frequency at which the original data was transmitted and the nominal receiver sampling clock frequency. The function of the CDR 150 is to properly sample the analog waveform such that when the sampled waveform is passed through a data detector 160, the data is recovered properly despite the fact that the phase and frequency of the transmitted signal are not known. The CDR 150 is often an adaptive feedback circuit and the feedback loop must adjust the phase and frequency of the nominal clock to produce a modified recovered clock that can sample the analog waveform to allow proper data detection.
As previously indicated, the data detector 160 can be implemented as a slicer (i.e., a decision device based on an amplitude threshold) or a more complicated detector such as a sequence detector. For high speed applications, the data detector 160 is often implemented as a slicer that is clocked by the CDR clock. In addition to sampling the data signal, the slicer 160 essentially quantizes the signal to a binary “1” or “0” based on the sampled analog value and a slicer threshold, st. If the input to the slicer 160 at time n is w(n), then the output, ŷ(n), of the slicer 160 is given as follows:
ŷ(n)=1 if w(n)>st; ŷ(n)=0 otherwise (1)
In general, the CDR 150 may be composed of several components, such as a phase detector (PD), a loop filter, and a clock generation circuit. As shown in
The BBPD 154 processes several quantities to compute an estimate of the timing adjustment needed to properly sample the signal, in a known manner. The timing adjustment is filtered by the loop 152 before adjusting the phase of the sampling clocks. For the BBPD 154, there need to be two sampling clocks: a data sampling clock which samples the recovered data and a transition sampling clock that is offset from the data clock by half a baud period, T/2, and which samples the “transition” data. The transition sample data is denoted as ŷ(n−½) to indicate that it is sampled relative to ŷ(n) by a phase offset of T/2.
In addition, the BBPD 154 makes use of a one baud period delayed version of the recovered data. The delayed data is ŷ(n−1) (not shown explicitly in
As data rates increase for serializer/deserializer applications, the channel quality degrades and the use of decision feedback equalization (DFE) in conjunction with transmit finite impulse response (TXFIR) and receive equalization (RXEQ) filtering will be required to achieve the bit error rate (BER) performance required by more and more demanding applications. Note that the FIR function of the transmitter (TX) might be moved from the transmitter to the receiver (RX) and incorporated into the RXEQ function.
As discussed hereinafter, a DFE correction, v(t), generated by a DFE filter 370 and converted to an analog signal by a digital-to-analog converter (DAC) 380, is subtracted by an analog summer 335 from the output, z(t), of the RXEQ 330 to produce a DFE corrected signal w(t).
w(t)=z(t)−v(t) (2)
Then, the signal w(t) is sampled by a switch 340:
w(n)=w(nT) (3)
with T being the baud period. The sampled signal w(n) is then sliced by a slicer 360 to produce the detected data ŷ(n). The slicer output in turn is used to produce the filtered DFE output v(n), which is converted by the DAC 380 to the continuous time signal v(t). The output of the DFE filter 370 is given by:
v(n)=Σ_{l=1}^{L} b(l)ŷ(n−l) (4)
where b(l) represents the coefficients of the L tap DFE.
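The direct-feedback loop described above can be sketched in a few lines of Python. This is only an illustrative model, not the circuit itself: the function names `slicer` and `dfe_detect` are hypothetical, symbols are assumed to be +1/−1 rather than 1/0, the slicer threshold is assumed to be zero, and decisions before the first sample are assumed to contribute no correction.

```python
from typing import List, Sequence


def slicer(w: float, threshold: float = 0.0) -> int:
    """Quantize a sample to +1 or -1 around the given threshold (assumed 0)."""
    return 1 if w > threshold else -1


def dfe_detect(z: Sequence[float], b: Sequence[float]) -> List[int]:
    """Direct-feedback DFE: w(n) = z(n) - v(n), v(n) = sum_l b(l) * y(n - l)."""
    decisions: List[int] = []
    for n, z_n in enumerate(z):
        # The correction uses only past decisions, so the loop stays causal.
        # Decisions before the start of the record are treated as absent.
        v_n = sum(
            b[l - 1] * decisions[n - l]
            for l in range(1, len(b) + 1)
            if n - l >= 0
        )
        decisions.append(slicer(z_n - v_n))
    return decisions
```

For example, with a single postcursor of 1.2 (large enough to close the eye for a plain slicer), `dfe_detect(z, [1.2])` recovers a short +1/−1 test pattern that slicing `z` directly does not.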
As discussed above in conjunction with
and which samples the “transition” data. The analog signal out of the RXEQ 330 is sampled at the baud rate by a switch 342 using the transition clock. The sampled signal w(n) is also sliced by a second slicer 362 to produce the detected data ŷ(n−½). The transition sample data is denoted as ŷ(n−½) to indicate that it is sampled relative to ŷ(n) by a phase offset of T/2.
It is noted that the DFE filter 370 uses as its input past data decisions starting at ŷ(n−1) and earlier. The DFE filter 370 does not use the current decision ŷ(n). This guarantees that the operation is causal. Since an analog representation, w(t), of the DFE signal exists, it can be sampled directly by both the data clock using switch 340 (to produce w(n)) and the transition clock using switch 342, and these sampled latched signals can drive a traditional BBPD 354. For this circuit 300 to work, the entire DFE loop correction must be performed within one baud period T before the next correction is needed. At very high data rates, it is difficult to design circuits that operate this fast or to make them very accurate.
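As a rough illustration of how the data and transition decisions can drive a bang-bang phase detector, the sketch below uses the common Alexander-style early/late rule. The text does not spell out the rule used by the BBPD 354, so the rule itself, the function names `bbpd` and `phase_votes`, and the sign convention (+1 meaning the sampling clocks are early) are all assumptions made for the example.

```python
from typing import List, Sequence


def bbpd(y_prev: int, y_trans: int, y_curr: int) -> int:
    """Bang-bang vote from y(n-1), y(n-1/2) and y(n).

    Returns 0 when there is no data transition, +1 when the transition
    sample still matches the previous bit (clock assumed early), and -1
    when it already matches the current bit (clock assumed late).
    """
    if y_prev == y_curr:
        return 0  # no transition, so no timing information
    return 1 if y_trans == y_prev else -1


def phase_votes(data: Sequence[int], trans: Sequence[int]) -> List[int]:
    """Votes for each bit n >= 1; trans[n] lies between data[n-1] and data[n]."""
    return [bbpd(data[n - 1], trans[n], data[n]) for n in range(1, len(data))]
```

In a CDR loop these votes would be passed through the loop filter before adjusting the clock phase.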
Consequently, a well known technique may be employed whereby the DFE terms are “precomputed” and chosen based upon the amplitude value of y(n). Since there is no DFE feedback loop, the process of generating the DFE “corrected” decisions can be pipelined.
As shown in
As shown in
For the case when ŷd(n−1)=1,
For the case when ŷd(n−1)=−1,
The outputs of the latches 460 are applied to DFE logic 470 to generate the DFE corrected decision ŷd(n).
The DFE and non-DFE decision operations may have different optimal sampling points. Therefore, the DFE latches should be sampled with a correct sampling phase that may be offset from the normal CDR data clock sampling phase by some offset pd in units of baud interval T. Thus, the switch 440 in the DFE path is controlled by a clock that is offset from the CDR data clock by an amount equal to pd(T). A number of techniques have been proposed or suggested for manually establishing the offset pd(T). The optimal sampling phase, however, is dependent on the channel or other equalizer settings. Thus, the sampling phase can be adaptively determined using the techniques described in United States Patent Application entitled “Method and Apparatus for Adaptively Establishing a Sampling Phase for Decision-Feedback Equalization,” filed Feb. 17, 2006, and incorporated by reference herein.
It is noted that the DFE can be extended to more than one tap at the expense of additional area and computation time. The exemplary DFE phase placement circuit presented herein can be extended to a system with multiple DFE taps without changing the DFE phase placement circuit. For additional taps, the number of latches and the DFE logic block would be modified, as would be apparent to a person of ordinary skill in the art.
In the DFE precomputation embodiment shown in
Hybrid DFE Architectures: Analog Feedback And Loop Unrolling
As discussed hereinafter, a DFE correction, φ(t), generated by a DFE filter 570 and converted to an analog signal by a digital-to-analog converter (DAC) 580, is subtracted by an analog summer 535 from the output, y(t), of the AEQ 530 to produce a DFE corrected signal z(t).
z(t)=y(t)−φ(t).
Here, taps 2 and higher (b(2) . . . ) are implemented using the analog feedback and the first tap, b(1), is realized using unrolled or precomputation based DFE.
Then, the signal z(t) is sampled by a switch 540:
z(k)=z(kT)
with T being the baud period. The sampled signal z(k) is then sliced by a slicer 560 and by a second slicer 562. The slicers 560, 562 digitize the sample and compare the digitized sample to thresholds of b(1) and −b(1), respectively, using the CDR recovered clock. The outputs of slicers 560, 562 are applied to respective inputs of a multiplexer 565 that produces the filtered DFE output v(k), which is converted by the DAC 580 to the continuous time signal φ(t). The output of multiplexer 565 is also sliced by a slicer 568, delayed and then fed back to provide the input selection for the multiplexer 565.
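The two-slicer-plus-multiplexer arrangement for the first tap can be modeled as below. This is a behavioral sketch only: `unrolled_one_tap` and `direct_one_tap` are hypothetical names, decisions are assumed to be +1/−1, the analog feedback for taps 2 and higher is assumed to have been applied to `z` already, and `initial` stands in for the (assumed known) decision preceding the first sample.

```python
from typing import List, Sequence


def unrolled_one_tap(z: Sequence[float], b1: float, initial: int = 1) -> List[int]:
    """First DFE tap by precomputation: two slicers and a multiplexer."""
    decisions: List[int] = []
    prev = initial
    for z_k in z:
        # Both candidates are available without waiting for the feedback.
        d_plus = 1 if z_k > b1 else -1    # valid if the previous decision was +1
        d_minus = 1 if z_k > -b1 else -1  # valid if the previous decision was -1
        prev = d_plus if prev == 1 else d_minus  # multiplexer selection
        decisions.append(prev)
    return decisions


def direct_one_tap(z: Sequence[float], b1: float, initial: int = 1) -> List[int]:
    """Reference model: subtract b1 * previous decision, then slice at zero."""
    decisions: List[int] = []
    prev = initial
    for z_k in z:
        prev = 1 if z_k - b1 * prev > 0 else -1
        decisions.append(prev)
    return decisions
```

Away from exact threshold ties, the two functions produce the same decisions; the unrolled form simply moves the subtraction out of the timing-critical loop and into the slicer thresholds.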
Unrolled One Tap Data and Transition DFE
As shown in
As discussed hereinafter, a DFE correction, φ(t), generated by a DFE filter 670 and converted to an analog signal by a digital-to-analog converter (DAC) 680, is subtracted by an analog summer 635 from the output, y(t), of the AEQ 630 to produce a DFE corrected signal z(t).
z(t)=y(t)−φ(t).
Here, taps 2 and higher (b(2) . . . ) for the data path are implemented using the analog feedback and the first tap, b(1), is realized using unrolled or precomputation based DFE.
Then, the signal z(t) is sampled by the switch 640, which is clocked by the data clock:
z(k)=z(kT)
with T being the baud period. It is noted that z(k) is used as a shorthand notation for z(kT+τ). The sampled signal z(k) is then sliced by slicers 660 and 662 that digitize the sample and compare the digitized sample to thresholds of b(1) and −b(1), respectively, using the CDR recovered data clock. The outputs of slicers 660, 662 are applied to respective inputs of a multiplexer 665 that produces the filtered DFE output v(k), which is converted by the DAC 680 to the continuous time signal φ(t). The output of multiplexer 665 is also sliced by a slicer 668, delayed and then fed back to provide the input selection for the multiplexer 665.
As previously indicated,
The sampled transition signal z(k−½) is then sliced by slicers 660-t and 662-t that digitize the sample and compare the digitized sample to thresholds of bt(1) and −bt(1), respectively, using the CDR recovered transition clock. The outputs of slicers 660-t, 662-t are applied to respective inputs of a multiplexer 665-t that produces the filtered DFE output v(k−½). Thus, one tap bt(1) is used for the transition sample generation. In this case, a DFE transition signal is obtained, derived as
v(k−½)=sgn[z(k−½)−bt(1)v(k−1)] (9)
This can be further simplified as
v(k−½)=sgn[z(k−½)−bt(1)] when v(k−1)=1, and v(k−½)=sgn[z(k−½)+bt(1)] when v(k−1)=−1 (10)
As shown in
which is the signal z(t) sampled by the transition clock. In general, a transition error signal e(k−½) can be derived from the transition signal v(k−½) as
e(k−½)=v(k−½)−vid(k−½) (11)
where vid(k−½) is the “ideal” signal for the transition. When v(k)≠v(k−1), indeed vid(k−½)=0 and then
e(k−½)=v(k−½) (12)
An adaptation “gradient” which adapts a coefficient will typically use an error signal and some other data derived signal and in this case, the transition coefficient bt(1,k) can be adapted as follows:
bt(1,k+1)=bt(1,k)−2μt e(k−½)v(k−1) (13)
which reduces to:
bt(1,k+1)=bt(1,k)−2μt v(k−½)v(k−1) (14)
when e(k−½)=v(k−½), where μt is an adaptation gain factor which can be set independent of any adaptation gain factor μ for the regular DFE taps b(j). The value of the transition tap at time k is bt(1,k) and it is updated per the above equation to obtain its value bt(1,k+1) at time k+1. The use of bt(1) has the impact of performing transition ISI cancellation on the first tap. Additional transition cancellation for higher order taps could be considered by considering more terms at the cost of additional complexity, power and area. For optimal performance, the adaptation may be conditioned upon a data transition, i.e.:
bt(1,k+1)=bt(1,k)−2μt v(k−½)v(k−1) when v(k)≠v(k−1) (15)
The adaptation may also be conditioned on other data patterns, depending on the history of the data bits v(k) over a finite window of past decisions.
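A single step of the transition-tap update conditioned on a data transition, together with the one-tap transition decision of equation (9), might look as follows. The function names are hypothetical and decisions are assumed to be +1/−1. The polarity of the update is copied directly from equation (15); in an actual design it would have to agree with the polarity of the thresholds and error signals so that the adaptation loop has negative feedback.

```python
def sgn(x: float) -> int:
    """Sign function returning +1 or -1 (zero is mapped to -1 by assumption)."""
    return 1 if x > 0 else -1


def transition_sample(z_half: float, bt1: float, v_prev: int) -> int:
    """Equation (9): v(k-1/2) = sgn[z(k-1/2) - bt(1) * v(k-1)]."""
    return sgn(z_half - bt1 * v_prev)


def update_bt1(bt1: float, v_half: int, v_prev: int, v_curr: int, mu_t: float) -> float:
    """Equation (15): update bt(1) only when v(k) differs from v(k-1)."""
    if v_curr == v_prev:
        return bt1  # no data transition, coefficient is held
    return bt1 - 2.0 * mu_t * v_half * v_prev
```

Because the update uses only sign information, each conditioned step moves bt(1) by a fixed amount of 2μt, so μt directly sets the trade-off between adaptation speed and steady-state dither.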
Unrolled One Tap Data and Transition DFE Without Data Conditioning
As shown in
Unrolled One Tap Data DFE and Two Tap Transition DFE (Data Conditioning)
v(k−½)=sgn([z(k−½)−bt(1)v(k−1)−bt(2)v(k−2)]) (17)
The adaptation equations are:
bt(1,k+1)=bt(1,k)−2μtv(k−½)v(k−1) (18)
bt(2,k+1)=bt(2,k)−2μtv(k−½)v(k−2) (19)
Both the transition coefficients bt(1) and bt(2) may be optimally adapted when v(k)≠v(k−1) (data transition) or some other advantageous data condition.
An adaptive loop 885 uses equations 18 and 19 to generate the updated transition thresholds bt(1) and bt(2) based on the data samples v(k−1), v(k−2) and v(k−½).
Extension to More Transition Taps With Data Conditioning
The transition sample generation for M taps can be obtained as follows:
v(k−½)=sgn([z(k−½)−Σ_{m=1}^{M} bt(m)v(k−m)]) (20)
The adaptation equation for the mth tap is
bt(m,k+1)=bt(m,k)−2μtv(k−½)v(k−m) (21)
Implementing the calculation of v(k−½) in an unrolled fashion for an M tap transition DFE requires 2^M latches with the appropriate thresholds derived from the signed combinations of bt(m). Depending on the demands of the application and the corresponding performance required, it may be possible to use fewer transition taps and to power down the hardware that is no longer necessary to support the full number of taps.
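To make the latch count concrete, the sketch below enumerates one threshold per signed combination of the transition coefficients and then selects among them using the decision history. The names `transition_thresholds` and `unrolled_transition_sample` are hypothetical, and decisions are assumed to be +1/−1.

```python
from itertools import product
from typing import Dict, Sequence, Tuple


def transition_thresholds(bt: Sequence[float]) -> Dict[Tuple[int, ...], float]:
    """Map each history (v(k-1), ..., v(k-M)) to the threshold sum_m bt(m) * v(k-m)."""
    return {
        hist: sum(c * v for c, v in zip(bt, hist))
        for hist in product((1, -1), repeat=len(bt))
    }


def unrolled_transition_sample(
    z_half: float,
    history: Sequence[int],
    thresholds: Dict[Tuple[int, ...], float],
) -> int:
    """Pick the precomputed latch output that matches the decision history."""
    return 1 if z_half > thresholds[tuple(history)] else -1
```

With two taps the dictionary holds four thresholds, one per latch; each additional tap doubles the count, which is the motivation for powering down unused taps or moving some terms to an analog summing node.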
Quasi-Linear DFE Phase Detector
Quasi-linear phase detectors (QLPD) (see, e.g., Y. Choi et al., “Jitter Transfer Analysis of Tracked Oversampling Techniques for Multigigabit Clock and Data Recovery,” IEEE Transactions on Circuits and Systems, 775-783 (November 2003)) employ additional clocks and require the use of additional DFE samples. For example, “early” and “late” clocks and corresponding samples may be required in addition to the data and transition clocks and samples. The transition clock is T/2 earlier than the main data clock. The “early” clock is typically T/4 earlier than the transition clock and the “late” clock is typically T/4 earlier than the data clock.
U.S. patent application Ser. No. 11/356,691, filed Feb. 17, 2006, entitled “Method and Apparatus For Generating One or More Clock Signals for a Decision Feedback Equalizer Using DFE Detected Data,” describes one method for generating additional DFE samples, for example, for “early” and “late” clocks, but does not optimize the DFE coefficient or threshold in the process of generating these samples. Also, no adaptation methodology is provided for updating the “early” and “late” DFE coefficients. One aspect of the present invention provides an improved solution for generating these samples and a method for adapting them. Consider still the hybrid DFE architecture as in
and must obtain the corresponding signed DFE early and late samples. Early and late DFE coefficients be(m) and bl(m) are defined, which can be chosen independently of the data coefficients b(m) and the transition coefficients bt(m). The early/late samples used for the CDR update are:
v(k−¾)=sgn([z(k−¾)−Σ_{m=1}^{M} be(m)v(k−m)]) (22)
v(k−¼)=sgn([z(k−¼)−Σ_{m=1}^{M} bl(m)v(k−m)]) (23)
Considering the ideal values at the early/late samples to be vid(k−¾)=γv(k−1) and vid(k−¼)=γv(k), the errors for the early/late samples are:
e(k−¾)=sgn([z(k−¾)−Σ_{m=1}^{M} be(m)v(k−m)−γv(k−1)]) (24)
e(k−¼)=sgn([z(k−¼)−Σ_{m=1}^{M} bl(m)v(k−m)−γv(k)]) (25)
Note that although be and bl are assumed to consist of M taps each, the number of taps for the data, transition, early, or late coefficients can be independently chosen. The early/late coefficients can be adapted as follows:
be(m,k+1)=be(m,k)−2μe e(k−¾)v(k−m) (26)
bl(m,k+1)=bl(m,k)−2μl e(k−¼)v(k−m) (27)
where only the transition data is conditioned.
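For the one-tap case, the early/late error signals of equations (28) and (29) and the coefficient updates of equations (26) and (27) can be sketched as follows. The function names are hypothetical, decisions are assumed to be +1/−1, and the late update is assumed to start from the previous late coefficient so that the two updates are symmetric. As with the transition tap, the update polarity is taken from the equations as written.

```python
from typing import Tuple


def sgn(x: float) -> int:
    """Sign function returning +1 or -1 (zero is mapped to -1 by assumption)."""
    return 1 if x > 0 else -1


def early_late_errors(
    z_early: float, z_late: float, v_prev: int, v_curr: int,
    be1: float, bl1: float, gamma: float,
) -> Tuple[int, int]:
    """One-tap error signals e(k-3/4) and e(k-1/4)."""
    e_early = sgn(z_early - be1 * v_prev - gamma * v_prev)  # equation (28)
    e_late = sgn(z_late - bl1 * v_prev - gamma * v_curr)    # equation (29)
    return e_early, e_late


def update_early_late(
    be1: float, bl1: float, e_early: int, e_late: int,
    v_prev: int, mu_e: float, mu_l: float,
) -> Tuple[float, float]:
    """One-tap (m = 1) form of equations (26) and (27)."""
    be1_next = be1 - 2.0 * mu_e * e_early * v_prev
    bl1_next = bl1 - 2.0 * mu_l * e_late * v_prev
    return be1_next, bl1_next
```

The gains `mu_e` and `mu_l` are kept separate here because the text allows the early and late coefficients to be chosen independently of each other and of the data and transition coefficients.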
In
e(k−¾)=sgn([z(k−¾)−be(1)v(k−1)−γv(k−1)]) (28)
e(k−¼)=sgn([z(k−¼)−bl(1)v(k−1)−γv(k)]) (29)
and the error equations can be written more succinctly as:
e(k−¾)=sgn([z(k−¾)−αe]) (30)
e(k−¼)=sgn([z(k−¼)−αl]) (31)
where αe=be(1)v(k−1)+γv(k−1) and αl=bl(1)v(k−1)+γv(k), as shown in
An adaptive loop 985 generates the updated transition thresholds bt(1), be(1) and bl(1) based on the data samples v(k−¼), v(k−½), v(k−¾) and v(k−1). As shown in
The CDR digital loop filter can be decimated or decimated in a parallel sampled fashion, as described in U.S. patent application Ser. No. 10/965,138, entitled “Parallel Sampled Multi Stage Decimated Digital Loop Filter for Clock/Data Recovery,” incorporated by reference herein. Using a parallel sampled approach will mean more complexity but is otherwise a straightforward extension of the present invention. The loop filter may or may not incorporate look ahead techniques, as described in U.S. patent application Ser. No. 11/029,977, entitled “Look-Ahead Digital Loop Filter for Clock and Data Recovery,” incorporated by reference herein. While the examples shown herein are for a one tap unrolling for the data path, the disclosed architectures can be extended with more complexity to additional unrolled DFE taps for the main data path, as would be apparent to a person of ordinary skill in the art. The sample generation and adaptation architectures are shown here with a hybrid DFE architecture employing unrolling for the first data tap. The clock recovery samples can be generated even when a full analog DFE loop is used for the main data path.
In further variations, the architectures used here could optionally be combined with techniques such as pattern qualification, as described in U.S. patent application Ser. No. 11/541,498, entitled, “Method and Apparatus for Generating One or More Clock Signals for a Decision-Feedback Equalizer Using DFE Detected Data In The Presence Of An Adverse Pattern,” incorporated by reference herein. One example of this is shown in
The architectures used here can also be considered with “quasi-non-linear” phase detectors. For example, the “early” and “late” clocks need not be uniformly spaced at
with respect to the data clock.
The above variations can be considered in combination with the non-conditioned data case as well as the conditioned data case.
The invention is described in terms of unrolled generation of the transition, late, early, or additional samples (as opposed to the main data samples). An analog summing node can also be used in conjunction with unrolling for the non-data samples. For example, a local analog summing node for the bt(2)v(k−2) term can be employed to minimize the number of latches.
A plurality of identical die are typically formed in a repeated pattern on a surface of the wafer. Each die includes a device described herein, and may include other structures or circuits. The individual die are cut or diced from the wafer, then packaged as an integrated circuit. One skilled in the art would know how to dice wafers and package die to produce integrated circuits. Integrated circuits so manufactured are considered part of this invention.
While exemplary embodiments of the present invention have been described with respect to digital logic blocks, as would be apparent to one skilled in the art, various functions may be implemented in the digital domain as processing steps in a software program, in hardware by circuit elements or state machines, or in combination of both software and hardware. Such software may be employed in, for example, a digital signal processor, micro-controller, or general-purpose computer. Such hardware and software may be embodied within circuits implemented within an integrated circuit.
Thus, the functions of the present invention can be embodied in the form of methods and apparatuses for practicing those methods. One or more aspects of the present invention can be embodied in the form of program code, for example, whether stored in a storage medium, loaded into and/or executed by a machine, or transmitted over some transmission medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. When implemented on a general-purpose processor, the program code segments combine with the processor to provide a device that operates analogously to specific logic circuits.
It is to be understood that the embodiments and variations shown and described herein are merely illustrative of the principles of this invention and that various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the invention.
Number | Date | Country | |
---|---|---|---|
20100329326 A1 | Dec 2010 | US |