Embodiments disclosed herein relate generally to integrated circuit (“IC”) devices and in particular to receivers with decision feedback equalization.
Point-to-point parallel links have shown potential for delivering high-bandwidth, low-latency inter-chip communication, and have been widely used in applications such as chip interconnections, networking and communication switches, memory interfaces, and communications in multimedia products. Design considerations for such links may include bandwidth (increasing bit rate), latency (allowing for real-time data processing in the channels and improving phase noise tracking in clock-data recovery), cost/overhead, and I/O complexity (enabling the integration of a large number of I/Os in a system).
Frequency-dependent channel attenuation and signal distortion, which can lead to reduced received signal amplitude and inter-symbol interference (ISI), can make I/O design challenging. For example, a sampled bit in a bit stream can be distorted by precursor bits (precursor distortion) and/or postcursor bits (postcursor distortion). Precursor distortion results from energy that is effectively “projected” ahead into a bit sample by one or more upstream (or precursor) bits, i.e., bits that follow the sampled bit in the stream. Conversely, postcursor distortion is residual energy left in a bit sample by one or more downstream (or postcursor) bits, i.e., bits that precede it. Fortunately, equalization may be used to address channel attenuation and to compensate for either or both of precursor and postcursor distortion.
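The following is a minimal behavioral sketch, in Python, of how precursor and postcursor energy can distort a sampled bit. It is illustrative only: the function name apply_channel and the tap values pre, cursor, and post are assumptions chosen for the example and do not come from any embodiment described herein.

```python
def apply_channel(bits, pre=0.1, cursor=1.0, post=0.3):
    """Model received samples for a bit stream sent as +1/-1 symbols.

    pre, cursor, and post are hypothetical pulse-response samples:
    pre   weights the bit that FOLLOWS the sampled bit (precursor energy),
    post  weights the bit that PRECEDES the sampled bit (postcursor energy).
    """
    symbols = [1.0 if b else -1.0 for b in bits]
    samples = []
    for n, s in enumerate(symbols):
        following = symbols[n + 1] if n + 1 < len(symbols) else 0.0
        preceding = symbols[n - 1] if n > 0 else 0.0
        samples.append(cursor * s + pre * following + post * preceding)
    return samples


# Each sample is its own symbol plus energy leaked from its neighbors.
received = apply_channel([1, 0, 1, 1])
```

In this sketch, the second sample is pulled toward zero by both of its neighbors, which illustrates the reduced margin that equalization is intended to recover.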
Embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like reference numerals refer to similar elements.
The terms “transmitter” and “receiver” are used in their ordinary sense and generally refer to devices for transmitting and receiving, respectively, bit stream signals over a channel. The term “channel” refers to a transmission path through which a signal (x(t) in the depicted figure) propagates from a transmitter output to a receiver input. It may include combinations of electrical, wireless, and/or optical transmission media. For example, it could include combinations of packaging components (e.g., bond wires, solder balls), package traces, sockets, printed-circuit board (PCB) traces, cables (e.g., coaxial, ribbon, twisted pair), wave guides, air (and any other wireless transmission media), optical cable (and other optical transmission components), and so on.
The DFE 131 may be used to reduce inter-symbol interference (ISI) so that data can effectively be recovered from a bit stream signal received from the transmitter 110. The depicted DFE comprises a feed-forward filter portion 132 to reduce precursor distortion and a feedback filter portion 139 to reduce postcursor distortion. It also includes a summer 135 and a decision slicer 137. The decision slicer 137 determines a digital value for a sampled bit based on weighted versions of the sampled bit, one or more previous bits (from feedback path 139), and/or one or more bits to follow (from feed-forward filter 132), summed together at summer 135. The basic idea is to skew the decision threshold of a received bit based on the values of previous bits (for postcursor distortion) and/or subsequent bits (for precursor distortion). Some embodiments may or may not include a feed-forward filter component 132 in the signal path from the input (“in”) to the summer 135. For example, a feed-forward filter might be omitted from a DFE when a filter is included in the transmitter to reduce precursor distortion from the transmitter side. In addition, the summed components may be positive or negative, i.e., they may be additive or subtractive, depending upon the characteristics of the channel. For example, weighted (e.g., reduced) versions of postcursor bit values may be subtracted from a bit sample to adjust for postcursor distortion. Moreover, there may be any number of summed components, depending on design considerations.
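The feedback portion of this idea can be sketched behaviorally as follows. This is a simplified Python model under stated assumptions, not the depicted circuit: the function name dfe_slice, the weight list post_weights, and the sample values are hypothetical, the feed-forward path is omitted, and decisions are represented as +1/-1.

```python
def dfe_slice(samples, post_weights):
    """Slice each sample after subtracting weighted previous decisions.

    post_weights[k] is the (hypothetical) weight applied to the decision
    made k+1 bits earlier. Decisions are returned as +1 or -1.
    """
    decisions = []
    for x in samples:
        correction = 0.0
        for k, w in enumerate(post_weights):
            if k < len(decisions):
                correction += w * decisions[-1 - k]
        # Subtracting the correction is equivalent to skewing the threshold.
        decisions.append(1 if x - correction >= 0.0 else -1)
    return decisions


# Samples of the symbols +1, -1, +1, +1, -1 after an assumed channel whose
# first postcursor (1.2) is larger than its cursor (1.0).
samples = [1.0, 0.2, -0.2, 2.2, 0.2]
without_feedback = dfe_slice(samples, [])
with_feedback = dfe_slice(samples, [1.2])
```

With these assumed values, slicing without feedback misreads three of the five bits, while subtracting the weighted previous decision recovers the transmitted sequence.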
With reference to
In the depicted embodiment, sample and hold (S/H) switches 204, 206 sample data on a clock's falling edge (as indicated in the timing diagram of
The summing circuit 208 may be implemented with any suitable circuit for summing together the inputs, as indicated, in a sufficient amount of time (e.g., to suitably make the resultant sum, Da, available to flip-flop 210 within a given clock cycle). Its inputs and outputs may be voltage or current mode inputs/outputs, and any suitable summing approach (e.g., analog methodology) could be used. For example, an asynchronous and/or single-stage summing amplifier circuit such as the differential, current-mode digital-to-analog converter architecture, as shown in
The depicted summing circuit 208 has four weighted inputs indicated at α1, α0, α-1, and α-2. Each “alpha” term corresponds to a coefficient value for multiplying (or weighting) its corresponding received input signal. The four inputs, α1, α0, α-1, and α-2, receive signals inp/inn, D, Db, and Dc, respectively, as indicated in
Da = α-2*Dc + α-1*Db + α0*D + α1*(inp − inn)
By adding or removing precursor and/or postcursor energy from a sampled bit based on measured and/or determined values of its surrounding bit(s), the expected distorting effects attributable to precursor and/or postcursor energy can be reduced. Accordingly, the coefficient values (α1, α0, α-1, α-2) may be selected based on measured or expected channel parameters so as to suitably reduce distortion resulting from such postcursor and/or precursor bits. The values may be positive or negative, and may vary depending upon desired performance and operating environments. In addition, they may be fixed or adjustable (e.g., through one-time or repeatedly adjustable settings).
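The weighted sum formed by the summing circuit can be illustrated numerically with the short Python sketch below. The function name summer_output, the dictionary of coefficients keyed by tap index, and all numeric values are assumptions made for the example; actual coefficients would depend on the measured or expected channel.

```python
def summer_output(alphas, inp, inn, d, db, dc):
    """Return Da, the weighted sum of the four summer inputs.

    alphas maps tap index to coefficient:
       1 -> differential input (inp - inn), the precursor tap
       0 -> D, the cursor tap
      -1 -> Db, the first postcursor tap
      -2 -> Dc, the second postcursor tap
    """
    return (alphas[-2] * dc
            + alphas[-1] * db
            + alphas[0] * d
            + alphas[1] * (inp - inn))


# Hypothetical coefficients: negative values subtract energy from the cursor.
alphas = {1: -0.125, 0: 1.0, -1: -0.25, -2: -0.0625}
da = summer_output(alphas, inp=0.75, inn=0.25, d=0.5, db=1.0, dc=-1.0)
```

Omitting a tap, as may be done in embodiments that drop the second postcursor or precursor term, corresponds in this sketch to setting the associated coefficient to zero.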
Moreover, while the depicted circuit uses four taps to process a first precursor, a cursor, a first postcursor, and a second postcursor, it should be appreciated that any number of taps to process a desired combination of precursors and/or postcursors could also be employed depending upon design considerations such as performance requirements and operating environment. For example, it has been observed that in some systems, the first postcursor bit (corresponding to Db) may be the most problematic. Thus, in some equalizer embodiments, the α-2 term might be omitted. This may be especially appealing in systems requiring faster summing. Likewise, negligible distortion might arise from incoming precursors, and so the α1 term could also (or alternatively) be omitted. Thus, with different systems and different needs, different combinations of feed-forward and feedback filters can be employed.
Flip-flops 210 and 212 may be implemented with any suitable circuitry in cooperation with the circuits used to implement S/H switches 204, 206 and summing amplifier 208 to receive, determine, store, and pass along bit values, e.g., in response to a clock signal. Such circuits could include but are not limited to switches, latches, flip flops, memory cells and the like. In addition, one or both may serve to “digitize” (or further digitize) its received bit value. That is, the output, Da, at the summing amplifier 208 may not yet be fully digitized (e.g., converted to a suitable signal for a given logic type such as CMOS). Thus, either or both flip-flops 210 and 212 may act as decision slicers (or comparators) to appropriately digitize a received bit value. Along these lines, the depicted blocks could be implemented with any suitable combination of circuits and/or circuit components. For example, the summing amplifier could actually be implemented with one or several amplifiers, e.g., cascaded together. The same general principles apply to the other blocks.
With reference to
The equalizer sections 401A and 401B may be implemented similarly to equalizer 200 from
The evaluated first-summer cursor bit signal (taken at D0b) from the 401A data path is fed into an input of the second summing circuit 408B as its first postcursor bit signal (the second-summer first postcursor bit signal). Likewise, the evaluated second-summer cursor bit signal (taken at D1b) from the 401B data path is fed into an input of the first summing circuit 408A as its first postcursor bit signal (the first-summer first postcursor bit signal). This allows for the first postcursor energies to be reduced. (With some systems, DFE analysis shows that reducing the first postcursor energies may be responsible for half of the distortion reduction within a receiver, e.g., in terms of received signal margin improvement.) In each section, the second latch (412A/412B) holds the value for another half cycle and feeds it back to an additional input of its associated summing circuit as its second postcursor bit value thereby enabling the second postcursor energies to be cancelled.
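The cross-coupled feedback between the two data paths can be sketched behaviorally in Python as shown below. This is an idealized model under stated assumptions rather than a description of the circuit: the function name half_rate_dfe, the weights w1 and w2, and the sample values are hypothetical, timing and latch behavior are not modeled, and decisions are represented as +1/-1.

```python
def half_rate_dfe(samples, w1, w2):
    """Two interleaved paths: path 0 takes even samples, path 1 odd samples.

    w1 weights the OTHER path's most recent decision (first postcursor).
    w2 weights this path's own previous decision (second postcursor).
    """
    last = {0: 0, 1: 0}  # latest decision held by each path (0 = none yet)
    decisions = []
    for n, x in enumerate(samples):
        path = n % 2
        other = 1 - path
        corrected = x - w1 * last[other] - w2 * last[path]
        last[path] = 1 if corrected >= 0.0 else -1
        decisions.append(last[path])
    return decisions


# Samples of the symbols +1, -1, -1, +1, +1, -1 after an assumed channel
# with cursor 1.0, first postcursor 0.5, and second postcursor 0.25.
samples = [1.0, -0.5, -1.25, 0.25, 1.25, -0.25]
recovered = half_rate_dfe(samples, w1=0.5, w2=0.25)
```

In this model, each path only evaluates every other bit, yet the first postcursor correction for any bit always comes from the opposite path's most recent decision, which mirrors the cross-coupling described above.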
If a preamplifier is utilized, a single preamplifier 402 can be used to amplify the received bit stream (IN) and provide the amplified signal to the separate equalizer paths (that is, separate pre-amps, if any, are not required.) In addition, with either equalizer section 401A/B, a single S/H switch 404 can be used to sample the bit stream (IN). This is so because the summing of the input signals at a summing amplifier (408A, 408B) occurs in a half-cycle before the resultant sum (D0a, D1a) is clocked through its first latch (410A, 410B). On the other hand, this implies that each summing amplifier should be capable of summing its signals and suitably providing its result within a relatively shorter time duration (i.e., within a half-cycle, as opposed to the full cycle allowed with the summing circuit of equalizer 200). Using a one-stage current summer (such as that shown in
With reference to
As indicated in
(It should be appreciated that while single phase (SDR) and 2-phase (DDR) systems have been shown and discussed, any suitable multi-phase system (e.g., 4, 8, 16) could also be implemented consistent with the principles disclosed herein.)
In some embodiments, to avoid losing output signal swing at settings when not all the current legs are at their maximum values and have the same sign, the gate bias of the current sources (and hence the currents) may be automatically generated by an amplifier that monitors a half circuit replica (not shown). In this way, the output swing is adjustable, and can be set at maximal values regardless of what the equalizer settings are. In some embodiments, the output swing may be limited to ensure sufficient amplifier gain. The gain of the current summer stage can also be adjustable, which may be desirable since higher gain may be needed when a received input signal is small. In the depicted circuit, the current summer is implemented using PMOS differential pairs because the received input signals are referenced to Vss (the lower supply, or “Ground”) in an implemented signaling system. However, with some embodiments, an NMOS equivalent implementation may deliver equal or higher performance.
Because the current summer receives both digital (or digital like) and analog inputs, care may be taken to enhance the accuracy of an offset calibration procedure. For example, separate offset-trim-enable signals may be used for calibrating the offsets due to the differential pairs driven by analog inputs and those driven by digital inputs.
With reference to
It should be noted that the depicted system could be implemented in different forms. That is, it could be implemented in a single chip module, a circuit board, or a chassis having multiple circuit boards. Similarly, it could constitute one or more complete computers or alternatively, it could constitute a component useful within a computing system.
The invention is not limited to the embodiments described, but can be practiced with modification and alteration within the spirit and scope of the appended claims. For example, it should be appreciated that the present invention is applicable for use with all types of semiconductor integrated circuit (“IC”) chips. Examples of these IC chips include but are not limited to processors, controllers, chip set components, programmable logic arrays (PLA), memory chips, network chips, and the like. Similarly, embodiments of the invention may be implemented in a variety of applications including but not limited to short-distance applications such as multiprocessor interconnections, networking and communication switches, memory interfaces, and consumer products with extensive multimedia applications.
Moreover, it should be appreciated that example sizes/models/values/ranges may have been given, although the present invention is not limited to the same. As manufacturing techniques (e.g., photolithography) mature over time, it is expected that devices of smaller size could be manufactured. In addition, well known power/ground connections to IC chips and other components may or may not be shown within the FIGS. for simplicity of illustration and discussion, and so as not to obscure the invention. Furthermore, arrangements may be shown in block diagram form in order to avoid obscuring the invention, and also in view of the fact that specifics with respect to implementation of such block diagram arrangements are highly dependent upon the platform within which the present invention is to be implemented, i.e., such specifics should be well within purview of one skilled in the art. Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the invention, it should be apparent to one skilled in the art that the invention can be practiced without, or with variation of, these specific details. The description is thus to be regarded as illustrative instead of limiting.