A modern integrated circuit (IC) must meet very stringent design and performance specifications. In many applications for communication devices, transmit and receive signals are exchanged over communication channels. These communication channels include impairments that affect the quality of the signal that traverses them. One type of IC that uses both a transmit element and a receive element is referred to as a serializer/deserializer (SERDES). The transmit element on a SERDES typically sends information to a receiver on a different SERDES over a communication channel. The communication channel is typically located on a different structure from where the SERDES is located. To correct for impairments introduced by the communication channel, a transmitter and/or a receiver on a SERDES or other IC may include circuitry that performs channel equalization. Channel equalization is a broad term that comprises many different technologies for improving the accuracy of communication between a transmitter and a receiver. One typical type of equalization is referred to as decision feedback equalization and is performed by a decision feedback equalizer (DFE). A DFE is typically implemented in a receiver and improves the signal-to-noise ratio (SNR) of the signal, but it can suffer from burst error propagation.
A feed forward equalizer (FFE) does not suffer from burst error propagation, but it also does not provide the SNR improvement that a DFE does.
Additionally, a DFE can be used only for post-cursor equalization, whereas an FFE can be used for pre-cursor equalization, post-cursor equalization, or both.
Further, current FFE implementations rely on a trans-conductance (gm) stage, which makes them inefficient with respect to power consumption and die area.
Moreover, these drawbacks become more pronounced when attempting to design and fabricate a receiver that can operate using both PAM 2 and PAM 4 modalities. The acronym PAM refers to pulse amplitude modulation, which is a form of signal modulation where the message information is encoded into the amplitude of a series of signal pulses. PAM is an analog pulse modulation scheme in which the amplitude of a train of carrier pulses is varied according to the sample value of the message signal. A PAM 2 communication modality refers to a modulator that takes one bit at a time and maps the signal amplitude to one of two possible levels (two symbols), for example −1 volt and 1 volt. A PAM 4 communication modality refers to a modulator that takes two bits at a time and maps the signal amplitude to one of four possible levels (four symbols), for example −3 volts, −1 volt, 1 volt, and 3 volts. For a given baud rate, PAM 4 modulation can transmit up to twice the number of bits as PAM 2 modulation.
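As a minimal illustration of the two mappings just described, the following Python sketch groups bits into symbols and maps each symbol to an amplitude level. The function name and the natural-binary ordering of the PAM 4 levels (00 for the lowest level, 11 for the highest) are assumptions made for this example; the levels themselves are the example voltages given above.

```python
# Illustrative sketch only: the names and the bit-to-level ordering are assumptions.
PAM2_LEVELS = {(0,): -1.0, (1,): 1.0}
PAM4_LEVELS = {(0, 0): -3.0, (0, 1): -1.0, (1, 0): 1.0, (1, 1): 3.0}


def pam_modulate(bits, bits_per_symbol):
    """Map a bit sequence to amplitude levels (1 bit/symbol: PAM 2, 2 bits/symbol: PAM 4)."""
    if bits_per_symbol not in (1, 2):
        raise ValueError("only PAM 2 (1 bit) and PAM 4 (2 bits) are modelled")
    if len(bits) % bits_per_symbol:
        raise ValueError("bit count must be a multiple of bits_per_symbol")
    table = PAM2_LEVELS if bits_per_symbol == 1 else PAM4_LEVELS
    return [
        table[tuple(bits[i:i + bits_per_symbol])]
        for i in range(0, len(bits), bits_per_symbol)
    ]


bits = [0, 1, 1, 1]
pam2_symbols = pam_modulate(bits, 1)  # four symbols, one bit each
pam4_symbols = pam_modulate(bits, 2)  # two symbols, two bits each
```

For the same four bits, the PAM 4 mapping produces half as many symbols as the PAM 2 mapping, which is why PAM 4 can carry up to twice the bits at a given baud rate.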
These drawbacks can be mitigated using forward error correction (FEC). FEC generally comprises techniques used for controlling errors in data transmission over unreliable or noisy communication channels. Generally, the sending device encodes a message in a redundant way by using an error-correcting code (ECC). The redundancy allows the receiver to detect a limited number of errors that may occur anywhere in the message, and often to correct these errors without retransmission. FEC gives the receiver the ability to correct errors without needing a reverse channel to request retransmission of data, but at the cost of a fixed, higher forward channel bandwidth. FEC is therefore applied in situations where retransmissions are costly or impossible, such as one-way communication links.
An amount of FFE and DFE applied to a communication signal can be different based on the presence or absence of FEC in a receiver system. For example, a receiver without FEC may operate better with more DFE relative to FFE, while a receiver with FEC may operate better with more FFE relative to DFE correction.
Therefore, it would be desirable to have a way to adjust an amount of FFE and DFE in a receiver based on whether there is forward error correction (FEC) present and based on a channel performance parameter, such as bit error rate (BER).
In an embodiment, a pipelined receiver comprises a programmable feed forward equalizer (FFE), a programmable decision feedback equalizer (DFE), and logic for controlling a ratio of FFE and DFE to apply to a received signal based on at least one channel parameter.
Other embodiments are also provided. Other systems, methods, features, and advantages of the invention will be or will become apparent to one with skill in the art upon examination of the following figures and detailed description. It is intended that all such additional systems, methods, features, and advantages be included within this description, be within the scope of the invention, and be protected by the accompanying claims.
The invention can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present invention. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.
A modal PAM2/4 pipelined programmable receiver having feed forward equalizer (FFE) and decision feedback equalizer (DFE) optimized for forward error correction (FEC) bit error rate (BER) performance (hereafter referred to as a modal PAM2/PAM4 FFE DFE receiver optimized for FEC) can be implemented in any integrated circuit (IC) that uses a digital direct conversion receiver (DCR). In an embodiment, the modal PAM2/PAM4 FFE DFE receiver optimized for FEC is implemented in a serializer/deserializer (SERDES) receiver operating at a 50 gigabit per second (Gbps) data rate by implementing a pulse amplitude modulation (PAM) 4 modulation methodology operating at 25 GBaud (Gsymbols per second). The 50 Gbps data rate is enabled, at least in part, by the pipelined implementation to be described below, and is backward compatible with PAM 2 modulation methodologies operating at a data rate of 25 Gbps.
As used herein, the term “cursor” refers to a subject bit, the term “pre-cursor” or “pre” refers to a bit that precedes the “cursor” bit and the term “post-cursor” or “post” refers to a bit that is subsequent to the “cursor” bit.
The transceiver 112-1 comprises a logic element 113, which includes the functionality of a central processor unit (CPU), software (SW) and general logic, and will be referred to as “logic” for simplicity. It should be noted that the depiction of the transceiver 112-1 is highly simplified and intended to illustrate only the basic components of a SERDES transceiver.
The transceiver 112-1 also comprises a transmitter 115 and a receiver 118. The transmitter 115 receives an information signal from the logic 113 over connection 114 and provides a transmit signal over connection 116. The receiver 118 receives an information signal over connection 119 and provides a processed information signal over connection 117 to the logic 113.
The system 100 also comprises a SERDES 140 that includes a plurality of transceivers 142. Only one transceiver 142-1 is illustrated in detail, but it is understood that many transceivers 142-n can be included in the SERDES 140.
The transceiver 142-1 comprises a logic element 143, which includes the functionality of a central processor unit (CPU), software (SW) and general logic, and will be referred to as “logic” for simplicity. It should be noted that the depiction of the transceiver 142-1 is highly simplified and intended to illustrate only the basic components of a SERDES transceiver.
The transceiver 142-1 also comprises a transmitter 145 and a receiver 148. The transmitter 145 receives an information signal from the logic 143 over connection 144 and provides a transmit signal over connection 146. The receiver 148 receives an information signal over connection 147 and provides a processed information signal over connection 149 to the logic 143.
The transceiver 112-1 is connected to the transceiver 142-1 over a communication channel 122-1. A similar communication channel 122-n connects the “n” transceiver 112-n to a corresponding “n” transceiver 142-n.
In an embodiment, the communication channel 122-1 can comprise communication paths 123 and 125. The communication path 123 can connect the transmitter 115 to the receiver 148 and the communication path 125 can connect the transmitter 145 to the receiver 118. The communication channel 122-1 can be adapted to a variety of communication methodologies including, but not limited to, single-ended, differential, or others, and can also be adapted to carry a variety of modulation methodologies including, for example, PAM 2, PAM 4 and others. In an embodiment, the receivers and transmitters operate on differential signals. Differential signals are those that are represented by two complementary signals on different conductors, with the term “differential” representing the difference between the two complementary signals. The two complementary signals can be referred to as the “true” or “t” signal and the “complement” or “c” signal. All differential signals also have what is referred to as a “common mode,” which represents the average of the two differential signals. High-speed differential signaling offers many advantages, such as low noise and low power while providing a robust and high-speed data transmission.
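The relationship between the “t” signal, the “c” signal, the differential value and the common mode can be written out as a short Python sketch. The function names and the example voltages are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: function names and voltages are assumptions.
def differential_value(sig_t, sig_c):
    """Difference between the true and complement signals."""
    return sig_t - sig_c


def common_mode(sig_t, sig_c):
    """Average of the true and complement signals."""
    return (sig_t + sig_c) / 2.0


# Example: a pair centred on 0.5 V with the true side high.
v_diff = differential_value(0.75, 0.25)
v_cm = common_mode(0.75, 0.25)
```

Swapping the two inputs flips the sign of the differential value while leaving the common mode unchanged.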
The reference to a “pipelined” processing system refers to the ability of the FFE 220, the DFE 230, the RSA 240 and the QES 214 to process 8 pipelined stages 212 (referred to below as sections D0 through D7) simultaneously.
The DFE 230 receives a threshold voltage input from a digital-to-analog converter (DAC) 272 over connection 273. The RSA 240 receives a threshold voltage input from a digital-to-analog converter (DAC) 274 over connection 275. The DAC 272 and the DAC 274 can be any type of DAC that can supply a threshold voltage input based on system requirements.
In each pipelined stage 212, the FFE 220 and the DFE 230 generate analog outputs, which are summed together at summing node 280, referred to as “sum_t” and “sum_c.” The summing node 280 is also the input to RSA 240, which acts as an analog-to-digital converter. The RSA 240 converts an analog voltage into a complementary digital value.
The output of the RSA comprises sampled data/edge information and is provided over connection 216 to a phase detector (PD) 218. The output of the phase detector 218 comprises an update signal having, for example, an up/down command, and is provided over connection 222 to a clock (CLK) element 224. The clock element 224 provides an in-phase (I) clocking signal over connection 226 and provides a quadrature (Q) clocking signal over connection 228. The in-phase (I) clocking signal is provided to the pipelined FFE 220, the DFE 230, and to the RSA 240; and the quadrature (Q) clocking signal is provided to the QES element 214.
The QES element 214 receives a threshold voltage input from a DAC 276 over connection 277. The DAC 276 can be any type of DAC that can supply a threshold voltage input based on system requirements.
The output of the RSA 240 on connection 232 is a digital representation of the raw, high speed signal prior to extracting any line coding, forward error correction, or demodulation to recover data. In the case of PAM 2, the output is a sequence of ones and zeros. In the case of PAM N, it is a sequence of N binary encoded symbols. For example, for PAM 4, the output comprises a string of four distinct symbols each identified by a different two bit digital word. The output of the RSA 240 is provided over connection 232 to a serial-to-parallel converter 234. The serial-to-parallel converter 234 converts the high speed digital data stream on connection 232 to a lower speed bus of parallel data on connection 236. The output of the serial-to-parallel converter 234 on connection 236 is the parallel data signal and is provided to a forward error correction (FEC) element 242. Although shown as being implemented with an FEC element 242, the receiver 200 need not include forward error correction. The modal PAM2/PAM4 FFE DFE receiver optimized for FEC can be implemented in a receiver with or without FEC, and can be used to optimize receiver performance whether or not an FEC is present.
The output of the serial-to-parallel converter 234 on connection 237 is an error, or test, signal and is provided to an automatic correlation engine (ACE) 246. The error, or test, signal is used to drive system parameters to increase signal-to-noise ratio in the receiver 200, and can be generated in several ways. One way is to use samplers inside the QES element 214 to identify zero crossings (also called edge data, or the transition between data bits). Another method is to use auxiliary samplers inside the RSA element 240 to identify the high amplitude signals (equivalent to the open part of an eye diagram). So, for example, using the edge data method, if a sampler inside the QES element 214 began to detect a positive signal where the zero crossing point should occur, then the ERROR signal on connection 237 would increase, and various system parameters could be driven to reduce that error. The output of the FEC 242 is provided over connection 149 to the CPU 252.
The output of the ACE 246 is provided over connection 248 to the CPU 252. The ACE 246 could be implemented with on-chip hardware, off-chip firmware, or a combination of hardware, firmware and a CPU, in which case the CPU 252 would read from and write to the ACE 246 over connection 248. The ACE 246 compares the received data to a pseudorandom binary sequence (PRBS) pattern and provides a correlation function to support implementation of a least mean square (LMS) algorithm for tuning the receiver 200.
The CPU 252 is connected over a bi-directional link 254 to registers 256. The registers 256 store DFE filter coefficients, FFE controls, CTLE controls, RSA threshold voltage controls, offset correction values for the RSA and QES elements, and controls for the DACs.
An output of the registers 256 on connection 261 is provided to the phase detector 218, an output of the registers 256 on connection 262 is provided to the pipelined DFE 230, an output of the registers 256 on connection 263 is provided to the pipelined FFE 220 and an output of the registers 256 on connection 264 is provided to the QES element 214. Although not shown for simplicity of illustration, the registers 256 also provide control outputs to the CTLE 202 and to all the DACs. In an embodiment, the output of the QES element 214 on connection 238 comprises sampled data/edge information and is provided to the phase detector 218 and the serial-to-parallel converter 234.
In an embodiment, a channel performance parameter, such as bit error rate (BER), can be used as an indicator of channel performance. The BER can then be used to set, adjust or establish receiver parameters, such as a number and gain of FFE taps and DFE taps; and also to determine an optimal ratio of FFE to DFE implementation. In this regard, the receiver 200 also comprises a BER element 282. The BER element 282 can operate in a number of different ways, as known to those having ordinary skill in the art.
For example, in an embodiment in which pseudorandom binary sequence (PRBS) data is being sent, the data stream can be used by the BER element 282 to determine errors. In such an embodiment, the BER element 282 receives the data stream over connection 236, and, if the FEC 242 is implemented, receives the output of the FEC element 242 over connection 149. The BER element 282 uses the data stream over connection 236 and the output of the FEC 242 to determine errors in the data stream, and provides the error information to the CPU 252 over connection 286. If the FEC 242 is not implemented, then the BER element 282 receives only the data on connection 236, and determines errors solely from the data stream.
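One way to picture a PRBS-based error count is the Python sketch below. It is not the implementation of the BER element 282. The choice of a PRBS7 pattern (polynomial x^7 + x^6 + 1), the function names, and the assumption that the received bits are already aligned with a locally regenerated reference are all made only for this example.

```python
# Illustrative sketch only: PRBS7, the names, and perfect alignment are assumptions.
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of a PRBS7 sequence (x^7 + x^6 + 1) from a non-zero 7-bit seed."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    out = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1  # XOR of taps 7 and 6
        out.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F
    return out


def bit_error_rate(received, expected):
    """Fraction of positions at which the received bits differ from the reference."""
    if not expected or len(received) != len(expected):
        raise ValueError("sequences must be non-empty and of equal length")
    errors = sum(r != e for r, e in zip(received, expected))
    return errors / len(expected)


reference = prbs7(1000)
received = list(reference)
received[10] ^= 1   # inject two bit errors
received[500] ^= 1
ber = bit_error_rate(received, reference)
```

In this sketch the resulting value is what would be reported to the controlling logic as the channel performance parameter.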
In an embodiment in which PRBS data is not sent, exclusive OR (XOR) errors can be monitored via appropriately offset (test-data) RSA samplers and normal (good-data) RSA samplers via the ACE element 246, as known to those having ordinary skill in the art. In such an implementation, the XOR errors are provided from the ACE element 246 to the BER element 282 over connection 284. The BER element 282 then determines errors in the data stream, and provides the error information to the CPU 252 over connection 286.
In another implementation, mission FEC encoded data can detect errors internal to the FEC element 242, and provide the errors to the BER element 282 over connection 149. As used herein, the term “mission FEC encoded data” refers to live data (as opposed to PRBS data) that has at least some protocol-level encoding. A common protocol is Reed-Solomon error correction encoding. The BER element 282 then determines errors in the data stream, and provides the error information to the CPU 252 over connection 286. The CPU 252 then uses the BER information to adjust the FFE 220 and the DFE 230 via the registers 256. The adjustment of the FFE 220 and the DFE 230 can comprise one or more of the number of FFE and DFE stages implemented and the gain of each FFE and DFE stage.
The elements in
Generally, a receive signal on connection 204 is applied to an array of FFE/DFE/RSA/QES sections. If an array of N sections is implemented, then each section can process the receive signal at a rate of 1/(UI*N) which significantly relaxes power requirements compared to the standard (un-pipelined) processing.
For example, a 25 Gbaud receive signal could be processed by an array of 8 sections, each section running at 3.125 GHz. The start time for each section is offset by 1 UI from its neighboring section, so that when the outputs from all 8 sections are summed together (signal 236), it is updated at the original 25 Gbaud rate.
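The arithmetic of the pipelined array can be checked with a short Python sketch. The function names are hypothetical, and the round-robin assignment of symbols to sections is an inference from the 1 UI start-time offset described above rather than a statement of the actual circuit behaviour.

```python
# Illustrative sketch only: names and the round-robin model are assumptions.
def section_rate_hz(baud_rate_hz, n_sections):
    """Rate at which each section runs: 1/(UI*N), with UI = 1/baud rate."""
    return baud_rate_hz / n_sections


def section_start_offsets_ui(n_sections):
    """Start time of each section in UI; neighbouring sections differ by 1 UI."""
    return list(range(n_sections))


def section_for_symbol(symbol_index, n_sections):
    """Section that handles a given symbol when sections take turns every UI."""
    return symbol_index % n_sections


rate = section_rate_hz(25e9, 8)        # 3.125e9, i.e. 3.125 GHz per section
offsets = section_start_offsets_ui(8)  # [0, 1, 2, 3, 4, 5, 6, 7]
```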
The FFE unit cell 300 also comprises a capacitor 321 and a capacitor 322. The FFE unit cell 300 is illustrated as operating on a differential signal with an input signal “in_t” provided on connection 332 and an input signal “in_c” provided on connection 334. The “in_t” signal and the “in_c” signal are the “true” and “complement” differential data outputs of the CTLE 202 of
The clock generation logic 302 receives an 8-phase clock input signal on connection 303 and generates appropriate clock signals to allow the FFE unit cell 300 to switch at the appropriate time, and will be described in greater detail below.
The FFE unit cell 402 comprises FFE clock generation logic 412, switches 414 and 416, and a capacitor 418. The capacitor 418 is illustrated as an adjustable capacitance as will be described below. An 8-phase clock signal is provided to the FFE clock generation logic 412 over an 8-phase clock bus 426. In the embodiment shown in
An input signal is provided to the FFE unit cells 402, 404, 406, 408 and 410 over connection 204, which is the “in_t” and “in_c” signals output of the CTLE 202 (
The sum_t signal on connection 419 and the sum_c signal on connection 420 are equivalent to the input signal on connection 204 modified by a programmable coefficient. The coefficient is generated by the FFE clock generation logic 412 selecting a subset of the 8 available clock phases from the 8-phase clock input signal on the 8-phase clock bus 426. The 8-phase clock input signal is provided to the FFE unit cell 402 and is similarly provided to the FFE clock generation logic 440, 450, 460 and 470 in the FFE unit cells 404, 406, 408 and 410, respectively.
The FFE clock generation logic 412 uses a subset of clock phases (generated by using selected combinations) of the 8-phase clock input signal on the 8-phase clock bus 426 to generate the TRK signal on connection 415 and the EVAL signal on connection 417. The FFE clock generation logic 412 also generates a precharge signal, referred to as “PRE”, which is not shown in
The specific phases selected from the 8-phase clock signal on bus 426 define the time that the voltage at the input 204 is sampled onto the capacitor 418 (and the capacitors 431, 432, 433 and 434), through switch 414 (and the switches 444, 454, 464 and 474), and later through the switch 416 (and switches 446, 456, 466 and 476) and applied to the summing node 280.
With particular regard to the FFE unit cell 402, but applicable to the unit cells 404, 406, 408 and 410, the FFE clock generation logic 412 controls the operation of the switches 414 and 416 to control and determine the time that the input voltage on connection 204 is applied to the capacitor 418, thus adjustably controlling, or programming, the value of the capacitor 418, and thus determining the value of the coefficient on connection 419 or connection 420. The time that the input voltage is applied to the capacitors 431, 432, 433 and 434, is similarly controlled by respective FFE clock generation logic 440, 450, 460 and 470, thus determining the total value of the signal on connection 424. Similarly, by adjusting the number of FFE LSB unit cells enabled for each cursor, the FFE 220 provides a widely adjustable coefficient to the input signal on connection 204.
The value of the signal on connection 424 is generated by multiplying the input signal (Vin) on connection 204 by a coefficient (Coeff, corresponding to the value of each capacitance C0 through C4, in this embodiment) to generate the output (Vout), so Vout=Coeff*Vin. In such an example, the value of the “Coeff” is set by the size of the capacitor 418 (and 431, 432, 433 and 434). However, in an alternative embodiment, the value of the coefficient (Coeff) can be determined by enabling or disabling FFE LSB cells (more cells in parallel is equivalent to one cell with a bigger capacitor), or by changing whether an FFE LSB cell provides an output to sum_t, or to sum_c. For example, if an FFE unit cell provides an output to sum_c, it is applying a negative coefficient, and if it provides an output to sum_t, it is applying a positive coefficient. In an embodiment, a combination of these three methodologies is used to generate the overall value on connection 424.
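The relation Vout=Coeff*Vin, with the coefficient built from enabled cells and their routing, can be sketched in Python as follows. This is an idealized model that ignores charge-sharing and parasitic effects. The class name, the field names and the normalized `unit_weight` are hypothetical and introduced only for the example.

```python
# Illustrative sketch only: an idealized model with hypothetical names.
from dataclasses import dataclass


@dataclass
class FfeLsbCell:
    enabled: bool    # disabled cells contribute nothing
    to_sum_t: bool   # True: output to sum_t (positive); False: output to sum_c (negative)


def ffe_coefficient(cells, unit_weight=1.0):
    """Net coefficient: each enabled cell adds or subtracts one unit weight."""
    coeff = 0.0
    for cell in cells:
        if cell.enabled:
            coeff += unit_weight if cell.to_sum_t else -unit_weight
    return coeff


def ffe_output(vin, cells, unit_weight=1.0):
    """Vout = Coeff * Vin for one cursor."""
    return ffe_coefficient(cells, unit_weight) * vin


# Three cells routed to sum_t, one routed to sum_c, one disabled: net coefficient +2.
cells = [FfeLsbCell(True, True)] * 3 + [FfeLsbCell(True, False), FfeLsbCell(False, True)]
vout = ffe_output(0.5, cells)
```

Enabling more cells in parallel in this model plays the role of a larger capacitor, and routing a cell to sum_c plays the role of a negative coefficient.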
In the example of
With regard to the FFE unit cell 402, but applicable to the FFE unit cells 404, 406, 408 and 410, the FFE clock generation logic 412 controls the timing of the switches 414 and 416 and the registers 256 (
A graphical example of the input signal provided to the FFE clock generation logic 412 is shown in the graph 480. The vertical axis 482 of the graph 480 refers to relative amplitude in volts (V), with a normalized value range of between −1V and +1V. The horizontal axis 484 refers to the phase of the signal on connection 426. The signal on connection 426 is sampled at 45 degree intervals to generate the 8 clock phases in one clock cycle represented by the trace 485. The FFE clock generation logic in each FFE unit cell selects the appropriate subset of the 8 clock phases to control the operation of each FFE unit cell 402, 404, 406, 408 and 410 to apply a selectable coefficient to the input via respective capacitors 418, 431, 432, 433 and 434, to generate a widely programmable equalized output voltage on connection 424. In an embodiment, the FFE clock generation logic 412 can be implemented as a 1:8 demultiplexer, where each of the 8 outputs is a signal that is separated in phase from each adjoining output by 45 degrees and having a different voltage value.
The input signal on connection 204 to the FFE cells 402, 404, 406, 408 and 410 will be described in conjunction with the timing diagram of
The traces labeled “D0” through “D7” in
The terms “TRK” or “TRACK” refer to a tracking period during which the capacitor is connected to the input 204 to allow the capacitor to be charged to the input voltage on connection 204. Referring to
The term “HOLD” refers to a hold period during which the capacitor is decoupled from the input node 204, and thus from the charging voltage and is allowed to remain in a charged state.
The term “EVAL” refers to an evaluation period during which the capacitors are coupled to the summing node 280. Referring to
As shown in
By selecting the number of FFE LSB cells to enable for each cursor, and selecting the sign of the EVAL signals in those selected cells, an FFE filter function is implemented. The clock signals determine the time that each FFE LSB unit cell will sample the input on connection 204 thus determining which cursor on which FFE LSB unit cell will sample the input. In addition, the registers 256 provide control signals that enable more/less of each cursor to be applied to the summing node by controlling each FFE LSB cell to use the ck_ev0 or ck_ev1 signals to determine whether the coefficient is positive or negative. The registers 256 control whether the signal ck_ev0 or the signal ck_ev1 will be connected to the capacitor in each unit cell, and the FFE clock generation logic 412 circuit applies the input at the right time, using selected phases of the 8 phase clock.
The track (TRK) periods in each FFE unit cell should be aligned with specific cursors used for the equalizer. In the implementation described herein, there are five UIs (five FFE LSB unit cells in
The DFE cell 600 also comprises a capacitor 621 and a capacitor 622. The DFE cell 600 is illustrated as operating on a differential signal with a “r2r_t” signal provided on connection 632 and a “r2r_c” signal provided on connection 634 from the DAC 272. The switches 612 and 614 receive a clock signal “ck_trk”, the switches 616 and 617 receive a clock signal “ck_ev0_lsb” and the switches 618 and 619 receive a clock signal “ck_ev1_lsb.” The switch 615 receives a clock signal “ck_pre” on connection 633. The “ck_pre” signal precharges the capacitors 621 and 622. The “true” output “sum_t” of the DFE cell 600 is provided over connection 644 and the “complement” output “sum_c” is provided over connection 646. The outputs “sum_t” and “sum_c” are provided to the RSA element 240 (
The clock generation logic 602 receives an 8-phase input signal on connection 603 and receives a PAM 4 feedback word over connection 652. The clock generation logic 602 generates appropriate clock signals to allow the DFE cell 600 to switch at the appropriate time, and will be described in greater detail below.
The DFE cell 650 also comprises a capacitor 671 and a capacitor 672. The DFE cell 650 is illustrated as operating on a differential signal with a “r2r_t” signal provided on connection 682 and a “r2r_c” signal provided on connection 684 from the DAC 272. The switches 662 and 664 receive a clock signal “ck_trk”, the switches 666 and 667 receive a clock signal “ck_ev0_msb” and the switches 668 and 669 receive a clock signal “ck_ev1_msb.” The switch 665 receives a clock signal “ck_pre” on connection 683. The “ck_pre” signal precharges the capacitors 671 and 672. The “true” output “sum_t” of the DFE cell 650 is provided over connection 694 and the “complement” output “sum_c” is provided over connection 696. The outputs “sum_t” and “sum_c” are provided to the RSA element 240 (
The values of the capacitors 621 and 622 in the DFE cell 600 are referred to as “1X” and the values of the capacitors 671 and 672 in the DFE cell 650 are referred to as “2X.” Similarly, the switches 612, 614, 615, 616, 617, 618 and 619 are configured using the nomenclature “1X” to correspond to the 1X of the capacitors 621 and 622. The switches 662, 664, 665, 666, 667, 668 and 669 are configured using the nomenclature “2X” to correspond to the 2X of the capacitors 671 and 672. The components labeled “2X” are twice the value of the components labeled “1X.” By scaling the switch sizes by the same factor as the capacitor sizes, the charge and discharge times of the 1X and 2X cells are the same.
The clock generation logic 602 receives an 8-phase input signal on connection 603 and receives a PAM4 feedback word over connection 652. The clock generation logic 602 generates appropriate clock signals to allow the DFE cell 650 to switch at the appropriate time, and will be described in greater detail below.
The value of Vout on connection 722 is given by:
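$V_{out} = V_{ref} \cdot \mathrm{VAL} / 2^{N}$, where $V_{ref} = V_{DD}$, $N$ is the number of bits, and $\mathrm{VAL}$ is the digital input value.
$V_{out} = \left(0.5 \cdot (\mathrm{8b\_Dac}/255) + 0.5\right) \cdot V_{DD}$
$\mathrm{8b\_Dac} = 0 \rightarrow V_{out} = 0.5 \cdot V_{DD}$
$\mathrm{8b\_Dac} = 127 \rightarrow V_{out} \approx 0.749 \cdot V_{DD}$
$\mathrm{8b\_Dac} = 255 \rightarrow V_{out} = 1.0 \cdot V_{DD}$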
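The three example values can be reproduced numerically with the following Python sketch of the 8-bit transfer function given above. The function name and the normalized supply of 1.0 are assumptions made for the example.

```python
# Illustrative sketch only: function name and the normalised VDD are assumptions.
def dac_output(code, vdd):
    """Vout = (0.5 * (code / 255) + 0.5) * VDD for an 8-bit code."""
    if not 0 <= code <= 255:
        raise ValueError("code must be an 8-bit value (0..255)")
    return (0.5 * (code / 255) + 0.5) * vdd


VDD = 1.0  # normalised, so results read directly as fractions of VDD
for code in (0, 127, 255):
    print(code, round(dac_output(code, VDD), 3))
```

With this transfer function the output spans the upper half of the supply range, from 0.5*VDD at code 0 to VDD at code 255.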
The DFE clock generation logic 602 selects the appropriate subset of the 8 clock phases to control the operation of each DFE unit cell to apply a selectable coefficient to the summing node (1022,
The DAC 272 provides a programmable voltage over connection 273 to the LSB block 600 and the MSB block 650 through the switches 1012 and 1062, respectively. The switches 1012 and 1062 are controlled by the “ck_trk” signal from the DFE clock generation logic 1002 over connection 1026. The embodiment shown in
The switch 1016 is controlled by the “ck_ev_lsb” signal over connection 1028. The “ck_ev_lsb” signal corresponds to the “ck_ev0_lsb” signal and the “ck_ev1_lsb” signal in
The switch 1066 is controlled by the “ck_ev_msb” signal over connection 1029. The “ck_ev_msb” signal corresponds to the “ck_ev0_msb” signal and the “ck_ev1_msb” signal in
Referring to
In the diagram 1100, detail is provided for slice 5, which samples the main cursor at clock phase 4.
The term “PRE” refers to a period during which the capacitors in each unit cell (e.g., the capacitors 621, 622, 671 and 672 in the differential unit cells shown in
The terms “TRK” or “TRACK” refer to a period during which the capacitor is connected to the output of the DAC 272. Referring to
The term “HOLD” refers to a hold period during which the capacitor is decoupled from the output of the DAC 272, and thus from the charging voltage, and is allowed to remain in a charged state.
The term “EVAL” refers to a period during which the capacitors are coupled to the summing node 280. Referring to
The timing for the FFE section (220,
The DFE for slice 5 (shown using 1104) is always operating in parallel with the FFE (shown using 1102), and applying its output to the same summing node (summing node 280,
In this embodiment, there are 10 DFE taps, referred to as DFE coefficients, with each tap corresponding to a particular cursor. The number of taps could be greater or smaller than 10, and depends on the particular application and the amount of equalization expected from the design. There can be more DFE taps (10) than there are pipeline stages (eight (8)), if previous decisions are stored in memory, as will be explained below. The DFE taps and the associated cursors are shown in the section 1104 of the diagram 1100. The diagram 1100 describes the timing associated with the D5 slice. During the track phase “TRK”, the DFE coefficient for each tap is sampled onto a capacitor (1021/1071) by the DAC 272. The DAC setting is equivalent to the value of the coefficient for a given cursor, and could also be referred to as the “tap weight”. In this implementation, there are taps for the cursors POST4 through POST13. The relatively long track phase of six (6) UI allows for complete charging of the DFE sampling caps (1021/1071) by the DAC 272.
The section 1106 shows how previous decisions from the various other DFE slices are used by the D5 slice to evaluate the DFE coefficients. The line 1110 shows the instant that the RSA for slice 5 is clocked, in order to determine the voltage at the summing node 280. Note that slice 5 does not use the most recent decisions, which are from slices 4, 3, and 2, shown as “not used” using reference numeral 1107. This relaxes the power needed to meet timing requirements in high data rate designs. These three decisions correspond to postcursors 1, 2, and 3, which are sampled in the FFE (shown using 1102), and so the entire pipelined receiver can still compensate for distortions at these cursors. Also note, slice 5 uses the decision from its own RSA, from the previous cycle (shown using reference numeral 1115), to apply the coefficient for postcursor 8. For all decisions that occurred previous to this (postcursors 9 through 13), the decision is stored in a memory element, such as a flip flop, so it will not be overwritten before slice 5 uses it. This is shown in the diagram 1100 by the boxes 1121, 1122, 1123, 1124 and 1125 at the outputs of the five decisions prior to postcursor 8. The boxes 1121, 1122, 1123, 1124 and 1125 refer to memory elements.
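The bookkeeping described for slice 5 can be sketched in Python as follows. The modular relation between a postcursor and the slice that produced its decision is an inference from the eight-slice pipeline with a 1 UI offset per slice. The function name and the tuple layout are hypothetical.

```python
# Illustrative sketch only: the modular mapping and the names are assumptions.
N_SLICES = 8


def dfe_decision_sources(slice_index, first_tap=4, last_tap=13):
    """For each DFE postcursor tap, return (source slice, needs stored decision).

    A decision made k UI earlier came from slice (slice_index - k) mod 8.
    Decisions older than 8 UI would be overwritten, so they need a memory element.
    """
    plan = {}
    for post in range(first_tap, last_tap + 1):
        source = (slice_index - post) % N_SLICES
        needs_memory = post > N_SLICES
        plan[post] = (source, needs_memory)
    return plan


plan = dfe_decision_sources(5)
# Postcursors 1 to 3 are not DFE taps; their decisions would come from slices 4, 3, 2.
skipped = [(5 - k) % N_SLICES for k in (1, 2, 3)]
```

In this model, postcursor 8 maps back to slice 5 itself (its decision from the previous cycle), and postcursors 9 through 13 are the five taps that require a stored decision.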
Each of the traces, e.g., “D0”, from
The RSA 240 uses three samplers, each with a different threshold level, to determine which of the four PAM 4 symbols to use to encode the summing node 280 with the correct voltage. The three threshold levels correspond to the three samplers and are illustrated using reference numerals 1203, 1205 and 1207. For example, if the voltage on the summing node 280 is less than the voltage associated with the sampler at level 1205, but more than the voltage associated with the sampler at level 1203, then the RSA 240 will choose PAM 4 symbol 01 (voltage level 1204), which will cause any DFE unit cells that use that decision word to initiate the “ck_ev0_msb” signal and the “ck_ev1_lsb” signal. Since the circuitry associated with the MSB and the LSB is sized at a 2X to 1X ratio, the total charge that the DFE unit cell capacitors contribute to the summing node 280 using the PAM 4 symbol 01 will be proportional to (−2)+(+1)=−1. In other words, the DFE coefficient, which is stored as a DAC-driven voltage on the capacitors 1021 and 1071, would be applied to the summing node 280 in factors of either −3, −1, +1, or +3, depending on the decision symbol. This results in a linear contribution by the DFE decision to the summing node 280, with a constant spacing between each adjacent symbol, as shown by levels 1202, 1204, 1206 and 1208 in
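The three-threshold decision and the 2X/1X feedback weighting can be illustrated with a short Python sketch. The threshold values (−2, 0, +2 in normalized units) and the function names rsa_decide and dfe_weight are assumptions made for illustration. Only the symbol 01 case is stated in the text; the mapping of the other three symbols is inferred from it.

```python
def rsa_decide(voltage, thresholds=(-2.0, 0.0, 2.0)):
    """Return the (msb, lsb) decision for a summing-node voltage.

    The three thresholds stand in for the samplers at levels 1203, 1205
    and 1207; their numeric values here are hypothetical.
    """
    level = sum(voltage > t for t in thresholds)   # 0..3 thresholds exceeded
    return level >> 1, level & 1


def dfe_weight(msb, lsb):
    """Relative feedback factor for a decision, with MSB:LSB sized 2X:1X."""
    msb_term = 2 if msb else -2
    lsb_term = 1 if lsb else -1
    return msb_term + lsb_term


if __name__ == "__main__":
    for v in (-3.0, -1.0, 1.0, 3.0):
        msb, lsb = rsa_decide(v)
        print(v, (msb, lsb), dfe_weight(msb, lsb))
```

A voltage between the lower two thresholds yields the decision (0, 1) and a factor of (−2)+(+1)=−1, consistent with the example for symbol 01.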
Using the same hardware, and changing only the settings of the registers 256, the design can relax from receiving PAM 4 data at a given data rate to receiving PAM 2 data at half that data rate. One simple way to configure PAM 2 operation would be to disable all the LSB cells, so that only −2 and +2 feedback contributions would result from the MSB cells. Another way would be to program the DACs that drive the three RSA thresholds (274 in
The range of time in UI over which the FFE and the DFE operate is shown using bars. Generally, the FFE operates linearly on both pre- and post-cursors (UIs before and after “0”), and the DFE operates non-linearly on post-cursors only. For example, the range over which the FFE may operate comprises two pre-cursors (−2 UI) to five post-cursors (5 UI), for a total in this example of 7 UI, shown using reference numeral 1312. The range over which the DFE may operate comprises nine post-cursors, for a total in this example of 9 UI, shown using reference numeral 1314. In this example, the FFE and the DFE overlap for 3 UI, shown using reference numeral 1315. The term “overlap” as used herein refers to a mode in which at least one tap of each of the FFE and the DFE operates on a subject cursor or bit. The number of UI over which the FFE and the DFE operate is related to the number of “taps” for each of the FFE and the DFE, with each tap corresponding to 1 UI.
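The tap ranges in this example can be checked with a few lines of Python. The sketch treats each tap as one cursor index and excludes the main cursor “0”. The DFE starting position (postcursor 3) is not stated in the text; it is assumed here only because it reproduces the 3 UI overlap. The helper name tap_span is hypothetical.

```python
def tap_span(first_cursor, last_cursor):
    """Set of cursor indices covered by taps, excluding the main cursor 0."""
    return {c for c in range(first_cursor, last_cursor + 1) if c != 0}


ffe = tap_span(-2, 5)    # two pre-cursors and five post-cursors: 7 UI
dfe = tap_span(3, 11)    # nine post-cursors; the start at 3 is an assumption
overlap = sorted(ffe & dfe)

if __name__ == "__main__":
    print(len(ffe), len(dfe), overlap)
```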
Generally, it is desirable to minimize the overlap of the operation of the FFE and the DFE, as the FFE and the DFE are beneficial for different optimization criteria. For example, in a situation in which there is forward error correction (FEC) and latency is not a primary optimization criterion, it is generally desirable to maximize the range over which the FFE operates. This is because the DFE can introduce non-linear burst errors, which can make the FEC coding gain less effective than with no DFE. This situation is illustrated with bar 1322 showing the maximum number of FFE taps (in this example) and bar 1324 showing a minimized number of DFE taps.
In a situation in which there is no FEC, and thus none of its latency effects, or in which the signal-to-noise ratio (SNR) of the signaling medium indicates that the receiver does not need FEC, it is generally desirable to maximize the range over which the DFE operates. This situation is illustrated with bar 1334 showing the maximum number of DFE taps and bar 1332 showing a minimized number of FFE taps. In accordance with an embodiment of the modal PAM2/PAM4 FFE DFE receiver optimized for FEC, the number of FFE taps and the gain of each FFE tap are variable, and the number of DFE taps and the gain of each DFE tap are variable, based on one or more system and channel parameters. Non-limiting examples of channel parameters are the BER and the SNR of the communication channel over which the receiver 200 is communicating. Further, a variable gain element associated with each FFE tap and each DFE tap can be used to adjust, control, and vary the gain of each FFE tap and each DFE tap based at least in part on one or more of the channel parameters.
The selection and implementation of the FFE taps 1412 and the FFE variable gain stages 1414 are controlled by signals from the registers 256 over connection 263 (
The output of the CTLE 202 is provided on connection 204 (in_t and in_c) as input signal r(n) and is provided to a first FFE variable gain stage 1432. The input signal on connection 204 then traverses FFE tap 1442, which creates a one (1) UI delay, so that the input signal r(n−1) can be provided to FFE variable gain stage 1434. The input signal is processed this way until it reaches the Nth FFE tap 1446 after which it is processed by FFE variable gain stage 1438. The output of each FFE variable gain stage 1414 is provided over connection 1425 to the summing node 280.
The output of the summing node 280 is provided over connection 1426 to a quantizer 1427. The quantizer 1427 processes the analog signal on connection 1426 and generates a digital one (1) bit output signal, s(n), on connection 1428.
The digital one (1) bit output signal on connection 1428 is provided to a first DFE variable gain stage 1452. The input signal on connection 1428 then traverses DFE tap 1462, which creates a one (1) UI delay, so that the input signal s(n−1) can be provided to DFE variable gain stage 1454. The input signal is processed this way until it reaches the Nth DFE tap 1466, after which it is processed by DFE variable gain stage 1458. The output of each DFE variable gain stage 1424 is provided over connection 1425 to the summing node 280.
The summing node 280 combines the outputs of the FFE variable gain stages 1414 and the DFE variable gain stages 1424 to generate an equalized signal on connection 1426.
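The signal flow through the FFE delay line, the summing node, the quantizer, and the DFE delay line can be modeled behaviorally in Python. This is a minimal sketch under stated assumptions, not the circuit: the quantizer is taken to be a one-bit slicer with a threshold of zero and outputs of +1 and −1, and the DFE sum is started at s(n−1) so that each decision depends only on earlier decisions. The function name equalize and the gain lists are hypothetical.

```python
def equalize(r, ffe_gains, dfe_gains):
    """Return (y, s): summing-node values and one-bit decisions.

    r          input samples r(n), one per UI (e.g., the CTLE output)
    ffe_gains  gains applied to r(n), r(n-1), ... in that order
    dfe_gains  gains applied to s(n-1), s(n-2), ... in that order (assumed)
    """
    y, s = [], []
    for n in range(len(r)):
        # FFE: weighted sum of the current and delayed input samples.
        ffe = sum(g * r[n - i]
                  for i, g in enumerate(ffe_gains) if n - i >= 0)
        # DFE: weighted sum of previously quantized decisions.
        dfe = sum(g * s[n - j]
                  for j, g in enumerate(dfe_gains, start=1) if n - j >= 0)
        total = ffe + dfe                  # summing node
        y.append(total)
        s.append(1 if total >= 0 else -1)  # one-bit quantizer (assumed slicer)
    return y, s


if __name__ == "__main__":
    sent = [1, -1, -1, 1, 1, -1]
    # Toy channel: each symbol leaks half its amplitude into the next UI.
    received = [sent[n] + (0.5 * sent[n - 1] if n else 0.0)
                for n in range(len(sent))]
    print(equalize(received, ffe_gains=[1.0], dfe_gains=[-0.5]))
```

In the toy case above, a single DFE gain of −0.5 cancels the first postcursor, so the summing-node values equal the transmitted symbols.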
In an embodiment, the amount of FFE and DFE to apply to a received signal can be determined a priori based on known system parameters. When implemented in this manner, a single receiver implementation can be used for multiple communication system applications. For example, for many applications, the communication standard being implemented will either be able to tolerate the latency induced by forward error correction (FEC), or it will not. In other applications, the communication standard will be known to have a worst-case BER or SNR, which is typically worse than what is acceptable without FEC, and will then default to always having FEC enabled. Typically, if FEC is utilized in the communication system, it is generally preferable to minimize the number of DFE taps, and thus maximize the number of FFE taps. This situation is illustrated in
In alternative embodiments, such as when the ratio of FFE to DFE cannot be determined a priori, or where optimal receiver performance may vary based on configuration or varying receiver parameters, one or more of the channel parameters or the receiver parameters may be used as a metric for determining the optimal FFE and DFE settings. For example, the bit error rate (BER) of the receiver can be utilized as a metric for determining the optimal FFE and DFE settings.
In an implementation in which non-overlapping FFE/DFE settings are utilized, a least mean squares (LMS) algorithm can be utilized to optimize each of the FFE and DFE configurations. For example, two configuration cases, A: {FFE=[1:3], DFE=[4:10]} and B: {FFE=[1:4], DFE=[5:10]}, can be optimized separately, and then the system's BER can be measured (with or without FEC, depending on whether FEC is implemented) to determine the optimal FFE and DFE settings. The numbers in the brackets refer to the UIs over which the FFE and the DFE operate.
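As a rough illustration of the LMS step, the following Python sketch adapts a set of FFE gains against a known training sequence. It is the textbook LMS update, not necessarily the adaptation used in the receiver 200; the step size mu, the number of passes, the availability of known training symbols, and the function name lms_ffe are all assumptions.

```python
def lms_ffe(r, d, num_taps, mu=0.05, passes=50):
    """Adapt FFE gains w so that sum_i w[i] * r[n - i] tracks d[n].

    r  received samples, one per UI
    d  known (training) symbols aligned with r  -- assumed to be available
    """
    w = [0.0] * num_taps
    for _ in range(passes):
        for n in range(num_taps - 1, len(r)):
            x = [r[n - i] for i in range(num_taps)]          # tap inputs
            err = d[n] - sum(wi * xi for wi, xi in zip(w, x))
            # LMS update: move each gain along its error gradient.
            w = [wi + mu * err * xi for wi, xi in zip(w, x)]
    return w


if __name__ == "__main__":
    symbols = [1, -1, -1, 1, 1, -1, 1, 1, -1, -1] * 2
    attenuated = [0.5 * v for v in symbols]    # toy channel: gain of 0.5
    print(lms_ffe(attenuated, symbols, num_taps=1))
```

Each candidate configuration, such as cases A and B above, could be adapted in this way and then compared using the measured BER.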
In other embodiments, it may be beneficial to overlap the FFE and the DFE taps so that both the FFE and the DFE operate on at least one cursor. In an embodiment, an overlapped optimal setting of the FFE and the DFE can be determined by utilizing a BER metric to optimize concurrent FFE/DFE tap settings. One way to accomplish this is to sweep both the FFE taps and the DFE taps through their full cross-product of settings and identify an ideal setting by measuring a BER metric. Alternatively, a gradient search by successive approximation along the path of steepest descent can be utilized to reduce the tuning time.
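The two search strategies can be contrasted with a short Python sketch. Both functions take a caller-supplied measure_ber(ffe_setting, dfe_setting) callable, which is hypothetical and stands in for whatever BER measurement the system provides. The descent shown is a simple discrete neighbor search on the settings grid, offered as one possible reading of a steepest-descent search; it can stop at a local minimum, whereas the sweep cannot.

```python
from itertools import product


def exhaustive_sweep(ffe_settings, dfe_settings, measure_ber):
    """Measure every (FFE, DFE) pair; return (best_pair, best_ber)."""
    best_pair, best_ber = None, float("inf")
    for ffe, dfe in product(ffe_settings, dfe_settings):
        ber = measure_ber(ffe, dfe)
        if ber < best_ber:
            best_pair, best_ber = (ffe, dfe), ber
    return best_pair, best_ber


def steepest_descent(ffe_settings, dfe_settings, measure_ber, start=(0, 0)):
    """Step to the best neighboring grid point until none improves the BER.

    Returns (best_pair, best_ber, number_of_measurements).
    """
    i, j = start
    best_ber = measure_ber(ffe_settings[i], dfe_settings[j])
    evaluations = 1
    while True:
        neighbors = [(i + di, j + dj)
                     for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                     if 0 <= i + di < len(ffe_settings)
                     and 0 <= j + dj < len(dfe_settings)]
        scored = [(measure_ber(ffe_settings[a], dfe_settings[b]), a, b)
                  for a, b in neighbors]
        evaluations += len(scored)
        if not scored or min(scored)[0] >= best_ber:
            return (ffe_settings[i], dfe_settings[j]), best_ber, evaluations
        best_ber, i, j = min(scored)


if __name__ == "__main__":
    grid = list(range(8))
    # Toy BER surface with a single minimum at settings (3, 2).
    toy_ber = lambda f, d: 1e-12 + 1e-3 * ((f - 3) ** 2 + (d - 2) ** 2)
    print(exhaustive_sweep(grid, grid, toy_ber))
    print(steepest_descent(grid, grid, toy_ber))
```

On a smooth surface such as the toy example, the descent reaches the same setting as the sweep with fewer BER measurements, which is the tuning-time benefit noted above.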
In block 1502, one or more receiver or system parameters are determined. For example, the bit error rate (BER) of the communication channel can be determined by the receiver using one or more of the methods described above in
In block 1504, these parameters are applied to adjustably control the number and operation of FFE taps and DFE taps in the receiver 200.
In block 1506, it is determined whether it is desirable to have overlapping FFE and DFE.
If it is determined in block 1506 that FFE and DFE overlap is not desired, then in block 1508, the FFE is independently optimized. As an example, the FFE can be optimized using a least mean squares (LMS) or other known methodology for optimizing FFE performance.
In block 1510, the DFE is independently optimized. As an example, the DFE can be optimized using a least mean squares (LMS) or other known methodology for optimizing DFE performance.
In block 1512, a system parameter is measured. For example, the BER of the communication channel and the receiver can be measured.
In block 1514, it is determined whether the system parameter is optimized, which directly reflects whether the settings of the FFE and the DFE are optimized. If it is determined that the system parameter is not optimized, then the process returns to block 1508, and the optimization process repeats. If it is determined that the system parameter is optimized, then the process ends.
If it is determined in block 1506 that FFE and DFE overlap is desired, then in block 1516, the FFE and the DFE are optimized together using a system parameter. In an embodiment, the BER of the communication channel and the receiver can be measured and used as an indicator of DFE and FFE optimization.
In block 1518, it is determined whether the system parameter is optimized, which directly reflects whether the settings of the FFE and the DFE are optimized. If it is determined that the system parameter is not optimized, then the process returns to block 1516, and the optimization process repeats. If it is determined that the system parameter is optimized, then the process ends.
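The control flow of blocks 1506 through 1518 can be summarized in a short Python sketch. The callables optimize_ffe, optimize_dfe, optimize_joint and measure_ber are hypothetical placeholders for the operations described above. Treating “optimized” as “BER at or below a target” and bounding the loop with max_iterations are assumptions added for illustration; the described process does not specify either.

```python
def tune_receiver(overlap_desired, optimize_ffe, optimize_dfe, optimize_joint,
                  measure_ber, target_ber, max_iterations=100):
    """Repeat the optimization until the measured BER meets the target.

    Returns (iterations_used, last_measured_ber).
    """
    ber = None
    for iteration in range(1, max_iterations + 1):
        if overlap_desired:           # decision of block 1506
            optimize_joint()          # block 1516: FFE and DFE together
        else:
            optimize_ffe()            # block 1508: FFE alone (e.g., LMS)
            optimize_dfe()            # block 1510: DFE alone (e.g., LMS)
        ber = measure_ber()           # block 1512 (or the metric of 1516)
        if ber <= target_ber:         # block 1514 or 1518 (assumed criterion)
            return iteration, ber
    return max_iterations, ber


if __name__ == "__main__":
    readings = iter([1e-3, 1e-5, 1e-9])     # stand-in BER measurements
    log = []
    print(tune_receiver(False,
                        lambda: log.append("ffe"),
                        lambda: log.append("dfe"),
                        lambda: log.append("joint"),
                        lambda: next(readings),
                        target_ber=1e-6))
    print(log)
```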
This disclosure describes the invention in detail using illustrative embodiments. However, it is to be understood that the invention defined by the appended claims is not limited to the precise embodiments described.