The present invention relates to communication circuits, and more particularly, to a circuit for adaptive equalization of a communication channel.
Communication of signals over backplane 110 may be modeled by transmission line theory. Often, the signaling is based upon differential signaling, whereby a single bit of information is represented by a differential voltage. For example,
For short-haul communication, such as for the computer server in
However, every transmission line has a finite bandwidth, and for signal bandwidths that are comparable to or exceed the transmission line (channel) bandwidth, intersymbol interference may present a problem. Furthermore, actual transmission lines may have dispersion, whereby different spectral portions of a signal travel at different speeds. This may result in pulse spreading, again leading to intersymbol interference. As a practical example, for high data rates such as 10 Gbps (gigabits per second), the transmission lines used with backplanes or motherboards are such that intersymbol interference is present.
Channel equalization is a method in which one or more filters are employed to equalize the channel to help mitigate intersymbol interference. These filters may be sampled-data (discrete-time) filters, where a time index t is a discrete variable, or they may be continuous-time filters, where the time index is a continuous variable. Many channel equalizers are realized by a Finite Impulse Response (FIR) filter employed at the receiver. A FIR is a sampled-data filter.
In many adaptive equalization schemes, the filter vector is updated during a training time interval, and then remains fixed for some period of time. During training, a known sequence is transmitted over a communication channel to the receiver, and the filter vector is synthesized during the training time interval. Many algorithms have been developed to synthesize the filter vector. The well-known LMS (Least Mean Square) algorithm is an iterative technique based upon the method of steepest descent (gradient) to minimize a squared error.
The LMS algorithm may be written as the following iterative procedure performed during the training time interval:
{overscore (h)}(t+1)={overscore (h)}(t)+μ[Kd(t)−z(t)]{overscore (x)}(t),
where {overscore (x)}(t) is an n-dimensional received data vector with components given by [{overscore (x)}(t)]i=x(t−i) for i=0, 1, . . . , n−1; z(t) is the filtered output; μ is a positive weight that determines the filter “memory” or “window size” and may be viewed as the step size in the steepest descent method; d(t) represents the known transmitted data during the training time interval (the training sequence); and K is a positive scale factor. The above iteration is performed during a training time interval t=1, . . . , T, where an initial {overscore (h)}(0) is chosen, and at the end of the training time interval the filter vector {overscore (h)} is set equal to {overscore (h)}(T), i.e., {overscore (h)}={overscore (h)}(T).
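The LMS iteration above may be sketched in a few lines of Python. This is only an illustrative software model, not the disclosed circuit: the function name lms_train, the zero initial vector chosen for h(0), and the use of floating-point weights are assumptions made for the example.

```python
def lms_train(x, d, n, mu, K=1.0):
    """Illustrative LMS training loop (hypothetical helper, not from the source).

    x  : received samples x(t)
    d  : known training sequence d(t)
    n  : number of filter weights
    mu : step size
    K  : positive scale factor
    Returns the filter vector after the last training sample, i.e. h(T).
    """
    h = [0.0] * n  # assumed initial filter vector h(0)
    for t in range(n - 1, len(x)):
        xv = [x[t - i] for i in range(n)]           # [x(t), x(t-1), ..., x(t-n+1)]
        z = sum(hi * xi for hi, xi in zip(h, xv))   # filtered output z(t)
        err = K * d[t] - z                          # error term K d(t) - z(t)
        h = [hi + mu * err * xi for hi, xi in zip(h, xv)]
    return h
```

For an ideal channel in which the received samples equal the training sequence, the weights in this model settle near [K, 0, . . . , 0], which gives a quick sanity check of the update direction.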
In many analog implementations the filter weights assume discrete values limited to some fixed range, and the scale factor K takes into account this finite range of the filter weights as well as practical implementations of the filtering. For example, for bipolar differential signaling the known training sequence d(t) may take on either of the values VCC or −VCC, where VCC is a supply voltage, but in practice the filtered output z(t) is always in magnitude less than the supply voltage. In this case, K<1.
To simplify the computations needed to synthesize the filter vector, the so-called sign-sign-LMS algorithm has been used:
{overscore (h)}(t+1)={overscore (h)}(t)+μsgn{[Kd(t)−z(t)]}sgn{{overscore (x)}(t)},
where sgn{ } is the sign function. The sign-sign-LMS algorithm, although widely used due to its simplicity of implementation, has been found to have some undesirable properties when used in adaptive high-speed equalizers with relatively low to moderate length words (e.g., four to six bits) for the filter weights. Because the probability of filter weight update during adaptation is high, there is a significant amount of residual noise in the filter weights, even after convergence of the algorithm. This residual noise may be reduced by choosing a longer window (smaller μ), but this increases the convergence time, or in other words, the training time interval. Another disadvantage found in many instances is that the converged filter weights are relatively sensitive to the scale factor K, whose optimum value has been found to be difficult to determine.
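A single sign-sign-LMS step may be modeled as follows, using the sign of the error and the sign of each component of the received data vector, as in the standard form of the algorithm. The names sgn and sign_sign_lms_step are hypothetical, and evaluating sgn{0} as +1 is an arbitrary choice made for the sketch.

```python
def sgn(v):
    """Sign function; sgn(0) is taken as +1 here (an arbitrary choice)."""
    return 1.0 if v >= 0 else -1.0


def sign_sign_lms_step(h, xv, d_t, mu, K):
    """One illustrative sign-sign-LMS update (hypothetical helper).

    h   : current filter weights
    xv  : received data vector [x(t), x(t-1), ...]
    d_t : training symbol d(t)
    """
    z = sum(hi * xi for hi, xi in zip(h, xv))   # filtered output z(t)
    e = sgn(K * d_t - z)                        # only the sign of the error is used
    return [hi + mu * e * sgn(xi) for hi, xi in zip(h, xv)]
```

In this model every weight moves by exactly ±μ on every step, whether or not the error is small, which is consistent with the residual weight noise noted above.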
a illustrates differential signaling on two transmission lines.
b shows typical voltage curves representing differential signaling on the transmission lines of
{overscore (h)}(t+1)={overscore (h)}(t)+μ[sgn{d(t)}−sgn{z(t)−Kd(t)}]sgn{{overscore (x)}(t)}, (1)
which may be termed the conditional update sign-sign LMS algorithm. Equalizer 402 is a FIR filter with filter weights [{overscore (h)}(t)]i, i=0, 1, . . . , n−1, and provides the filtered output
Each filter weight is represented by a set of discrete voltages taking on either 0 or VCC (LOW or HIGH), so that each filter weight may be viewed as a discrete variable in the digital domain represented by a finite number of bits. However, the filtering is performed in the analog domain so that the filtered output z(t) is an analog voltage signal. If differential signaling is employed, then both analog voltages x(t) and z(t) are differential signals. An example of FIR 402 at the circuit level will be described later. Not shown (for simplicity) in
Data symbol generator 404 provides a sequence of voltages in the digital domain representing sgn{d(t)}. Because sgn{ } assumes only one of two values, the sequence of voltages representing sgn{d(t)} may be viewed as a sequence of binary values 0 or 1, which may be stored in a memory unit or generated by a finite state machine. Data symbol generator 404 also provides a sequence of voltages in the digital domain representing d(t). In digital communications, the transmitted symbols range over a finite set, so that (remembering that d(t) is the transmitted training sequence) each d(t) may be represented by a finite number of bits. These bits may be stored in memory, or generated by a finite state machine. In the case of bipolar signaling, only one bit of information is needed to represent d(t), so the same sequence used to represent sgn{d(t)} may also represent d(t). Circuits for data symbol generator 404 are relatively straightforward to synthesize. For example, in the bipolar signaling case, it is well known that feedback shift registers may be used to generate the binary bits representing the data d(t).
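As an illustration of generating training bits with a feedback shift register, the following sketch models a 7-bit linear feedback shift register. The polynomial x^7+x^6+1 (commonly called PRBS7), the seed, and the helper names prbs7 and to_bipolar are assumptions chosen for the example; the text above does not specify a particular register.

```python
def prbs7(seed=0x7F, length=127):
    """Illustrative 7-bit feedback shift register (polynomial x^7 + x^6 + 1).

    Returns `length` bits; for any nonzero seed the sequence repeats
    every 127 bits.
    """
    state = seed & 0x7F
    bits = []
    for _ in range(length):
        newbit = ((state >> 6) ^ (state >> 5)) & 1   # XOR of the two tapped stages
        state = ((state << 1) | newbit) & 0x7F       # shift and feed the new bit back in
        bits.append(newbit)
    return bits


def to_bipolar(bits):
    """Map bit 1 to +1 and bit 0 to -1, as for bipolar signaling."""
    return [1 if b else -1 for b in bits]
```

Because only one bit per symbol is needed in the bipolar case, the same bit stream can stand for both sgn{d(t)} and d(t) in such a model.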
Symbol-to-voltage converter 406 converts the digital domain voltages representing d(t) into analog voltages representing d(t). For example, consider the case in which differential bipolar signaling is employed. In this case, only one bit of information, for example a voltage 0 or VCC, is needed to represent d(t) and is provided to the input of symbol-to-voltage converter 406, and two voltages are used to represent the differential output voltage of symbol-to-voltage converter 406. For example, in this case, symbol-to-voltage converter 406 may provide the differential voltage [0, VCC] in response to one of its input voltages, say 0, and the differential voltage [VCC, 0] in response to the input voltage VCC. Such a circuit is of course obvious, and for the more general case, circuits for symbol-to-voltage converter 406 are straightforward to implement.
Multiplier 408 multiplies the analog voltage d(t) by the negative of the scale factor, −K. This multiplication is performed in the analog domain, but just as for the filter {overscore (h)}(t), the scale factor is represented by a set of discrete voltages taking on the values 0 or VCC (LOW or HIGH), so that K is represented by a finite number of bits. Summer 410 adds the analog voltage −Kd(t) to the filtered output z(t), which is provided to the input of comparator 412. The multiplier 408 and summer 410 have the same structure as a filter tap, so that an example circuit for multiplier 408 and summer 410 is described later in connection with FIR 402.
Comparator 412 provides a logic output signal indicative of the difference of its two input signals (each input signal is part of a differential voltage in the differential signaling case), so that it outputs sgn{z(t)−Kd(t)} as a discrete voltage 0 or VCC (LOW or HIGH) in the digital domain. (It is immaterial whether comparator 412 evaluates sgn{0} as a LOW or HIGH voltage.) Comparator 412 also includes a latch for latching its output voltage. (For simplicity, clock signals are not shown in
At the end of the training sequence, the filter weights for equalization are given by {overscore (h)}(T), in which case FIR 402 may be disconnected from the circuit so as to provide the equalized output z(t) for t>T.
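The conditional update of equation (1) may be modeled in software as below. This is a behavioral sketch only: the names sgn and conditional_update_step are hypothetical, weights are floating-point rather than finite-length words, and sgn{0} is taken as +1.

```python
def sgn(v):
    """Sign function; sgn(0) is taken as +1 here (an arbitrary choice)."""
    return 1 if v >= 0 else -1


def conditional_update_step(h, xv, d_t, mu, K):
    """One illustrative step of equation (1) (hypothetical helper).

    The bracketed term sgn{d(t)} - sgn{z(t) - K d(t)} is 0, +2 or -2,
    so the weights change only when z(t) has not yet reached K d(t)
    in the direction of d(t).
    """
    z = sum(hi * xi for hi, xi in zip(h, xv))   # filtered output z(t)
    gate = sgn(d_t) - sgn(z - K * d_t)          # 0 means no update this step
    return [hi + mu * gate * sgn(xi) for hi, xi in zip(h, xv)]
```

When the filtered output already exceeds the scaled target, the bracketed term is zero and the weights are left alone, so in this model updates become less frequent as the filter converges.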
Multipliers 414 and 418, summers 416 and 420, and unit delay shift 422 operate in the digital domain, whereas the multiplications and summation performed within FIR 402, as well as summer 410 and multiplier 408, are performed in the analog domain. It should be remembered that the multiplication weights {overscore (h)}(t) and K in
A circuit to implement multiplication in the analog domain utilizing differential signals is shown in
The voltages developed at nodes 510 and 512 are, respectively, ZLγ[{overscore (h)}]iI− and ZLγ[{overscore (h)}]iI+, where ZL is the impedance of loads 506 and 508. The difference in voltages developed at nodes 512 and 510 is given by αZLγ[{overscore (h)}]i(V+−V−). This difference is seen to be proportional to the desired multiplication [{overscore (h)}]i(V+−V−), where the proportionality constant is the dimensionless scalar αZLγ. This dimensionless scalar is not of theoretical concern because it is taken into account by the scale factor K when performing the filter weight update.
An example of voltage-to-current converter 502 and current steering DAC 504 at the circuit level is shown in
Referring now to voltage-to-current converter 502 in
The betas of pMOSFETs 608 and 610 may be chosen such that the active cascode configuration of pMOSFETs 608 and 610 forces pMOSFET 608 to operate in the triode region when pMOSFET 610 is in its active region. A similar statement applies to the combination of pMOSFETs 612 and 614. This may be observed as follows. Let VS2 denote the source voltage of pMOSFET 610 and VS1 denote the source voltage of pMOSFET 608. With pMOSFET 610 in its active region, VS2>Vg+|VT|, where Vg is the gate voltage and VT is the threshold voltage. (For simplicity, we take the threshold voltage to be the same for pMOSFETs 608 and 610.) Simple manipulation of the previous inequality yields VGT>VSD, where VGT is defined as VS1−Vg−|VT| and VSD is the source-drain voltage VS1−VS2, which indicates that pMOSFET 608 operates in its triode region.
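The manipulation referred to above may be written out step by step. The sketch assumes, as the definition of VGT suggests, that pMOSFET 608 sees the same gate voltage Vg as pMOSFET 610; the subscripted symbols are simply typeset forms of VS1, VS2, Vg, VT, VGT, and VSD.

```latex
\begin{align*}
V_{S2} &> V_g + |V_T| \\
\Longrightarrow\quad -V_g - |V_T| &> -V_{S2} \\
\Longrightarrow\quad V_{S1} - V_g - |V_T| &> V_{S1} - V_{S2} \\
\Longrightarrow\quad V_{GT} &> V_{SD}
\end{align*}
```

The last line is the usual triode condition for a conducting pMOSFET, namely that the source-drain voltage is smaller than the gate overdrive.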
Operating in the triode region, pMOSFETs 608 and 612 act approximately as resistors that degenerate pMOSFETs 610 and 614, respectively. Degeneration provides a relationship between the voltage difference x+(t)−x−(t) and the current difference I+−I− that is linear over a wider range than if degeneration were not present. (I+ is the drain-source current for pMOSFETs 608 and 610, and I− is the drain-source current of pMOSFETs 612 and 614, respectively.) This is seen by considering a simple low-frequency small-signal T-model for the active cascode voltage-to-current converter of
To simplify the discussion of how multiplication is performed, we may without loss of generality normalize the filter weights to integers. This is so because γ[{overscore (h)}]i will always be bounded by one. That is, γ absorbs any normalization constant. With this convention, [{overscore (h)}]i=2b1+b0. Referring now to current steering DAC 504 in
The multiplier circuit structure represented by current source 602, voltage-to-current converter 502, and current steering DAC 504 is repeated for each tap weight in FIR 402 of
The addition operation indicated by summer 410 is implemented by connecting the output ports of each current steering DAC for FIR 402 and multiplier 408 to loads 506 and 508. Which particular loads these output ports are connected to determines the sign of the multiplier. For example, without loss of generality, we may take a positive multiplicative weight to be implemented by connecting the output port in the I+ current path to load 508 and the output port in the I− current path to load 506. Then, to implement multiplication by a negative weight, the output port in the I+ current path is connected to load 506 and the output port in the I− current path to load 508.
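The arithmetic performed by the tap multipliers and the shared loads may be illustrated with a small numeric model. It uses the differential-voltage expression αZLγ[h]i(V+−V−) with a two-bit weight 2b1+b0, and models a negative weight as a swap of the load connections. The function names and the default values alpha=1.0, gamma=0.25, and z_load=1.0 are hypothetical and chosen only to keep the numbers simple.

```python
def tap_differential_voltage(v_plus, v_minus, b1, b0, negative=False,
                             alpha=1.0, gamma=0.25, z_load=1.0):
    """Illustrative differential output contributed by one tap.

    The weight magnitude is 2*b1 + b0. Setting negative=True models
    swapping which load each current path is connected to.
    """
    weight = 2 * b1 + b0
    diff = alpha * z_load * gamma * weight * (v_plus - v_minus)
    return -diff if negative else diff


def summed_output(taps):
    """Sum the contributions of all taps that share the two loads."""
    return sum(tap_differential_voltage(**tap) for tap in taps)
```

For instance, under these assumed constants a tap with b1=1, b0=1 and a differential input of 0.5 contributes 0.375, and the same tap with its load connections swapped contributes −0.375.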
The combination of loads 506 and 508 (
If more current is sourced to input port 702 than is sourced to input port 704, then the output voltage at output port 706 increases, and the output voltage at output port 708 decreases. Cross-coupled pMOSFETs 718 and 720 are connected as a latch, so that the differential voltages developed at output ports 706 and 708 are amplified to complementary logic levels. The resulting complementary voltages may both be used in subsequent digital processing, or only one of the complementary voltages may be used. For example, dual rail logic may be employed in some or all of the subsequent digital processing.
An argument similar to that which was made with respect to the voltage-to-current converter in
As discussed earlier, the scale factor K takes into account various scaling factors due to the communication channel and circuit implementation. The scale factor K affects the available noise margin, so it is important to set its value appropriately. The optimal value for K depends on the communication channel characteristics, which usually are not known a priori. A relatively simple method for calibrating K making use of a received training sequence is shown in the flow diagram of
In block 802, K is initialized to zero. In block 804, the filter is updated over a training sequence. During this update, a count is made of the number of overflows in the components of the filter vector. In block 806, the overflow count is compared to a threshold λ. If the overflow count is greater than λ, control is brought to block 808, in which case the current value for K is decremented by Δ and the calibration ends. If the overflow count is less than λ, control is brought to block 810 where K is incremented by Δ, and control is brought to block 804 to begin another update sequence for the filter weights with the new value for K. Block 806 may be modified to shift control to block 808 if the overflow count is greater than or equal to the threshold λ.
The overflow count threshold λ and scale factor calibration increment Δ are chosen before calibration is performed. These scalars may be determined offline via simulations with expected communication channels. In some embodiments, the overflow count threshold may be set to a value less than 1% of the length of the training sequence. The calibration increment trades off coarseness with the number of times block 804 is performed. In practice, calibration of the scale factor need not be performed very often, and may in some instances be performed only once for a communication channel.
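The calibration flow of blocks 802 through 810 may be sketched as a simple loop. In this sketch, count_overflows is a hypothetical callback standing in for block 804 (it trains the filter with the given K and returns the overflow count), and max_rounds is a safeguard added for the example that is not part of the described flow. An overflow count equal to the threshold is treated here as "not greater", so K is incremented in that case.

```python
def calibrate_scale_factor(count_overflows, delta, threshold, max_rounds=1000):
    """Illustrative model of the K calibration flow (hypothetical helper).

    count_overflows : callable taking K and returning the overflow count
                      observed while updating the filter over one
                      training sequence (stands in for block 804)
    delta           : calibration increment
    threshold       : overflow count threshold
    """
    K = 0.0                                   # block 802: initialize K to zero
    for _ in range(max_rounds):               # max_rounds is an added safeguard
        overflows = count_overflows(K)        # block 804: train and count overflows
        if overflows > threshold:             # block 806: compare with threshold
            return K - delta                  # block 808: back off one step and stop
        K += delta                            # block 810: try a larger K
    raise RuntimeError("calibration did not terminate within max_rounds")
```

A usage example with a stub in place of real training: if overflows first appear once K reaches 0.5 and the increment is 0.125, the loop returns 0.375, the last value tried before the threshold was exceeded.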
A hardware implementation for the calibration method of
Summer 914 is similar to summer 420 in
Functional units 908, 910, and 912 perform either block 808 or block 810. Delay element 912 indicates a delay of L, which denotes the length of a training sequence. If at the end of a training sequence the overflow counter indicates that the overflow count is greater than the threshold λ, FSM 906 sends a control signal to multiplexer 908 so that −Δ is added to the current value of K for use in the next training sequence. Otherwise, Δ is added to the current value of K. As discussed with respect to block 806, this procedure may be modified so that −Δ is added when the overflow count is greater than or equal to the threshold λ.
The method outlined in
Many modifications may be made to the disclosed embodiments without departing from the scope of the invention as claimed below. For example, in the case of single-ended signaling, one of the input voltages to the various voltage-to-current converters is held constant instead of being part of a differential signal. Furthermore, the circuit structure disclosed here to implement weight multiplication in the analog domain merely serves as an example, and is not meant to limit the scope of the invention in any way. For example, it is to be noted that the channel width-to-length ratios for the pMOSFETs used in the current multiplier need not be powers of two. That is, it is not necessary that multiplication be performed in binary arithmetic. Also, the duals to the disclosed circuits may be implemented, where nMOSFETs replace pMOSFETs, and pMOSFETs replace nMOSFETs. Other circuit structures for performing multiplication and summation in the analog domain may be employed. Indeed, the multiplication and summation may also be performed entirely in the digital domain, although such circuit implementations, which would require analog-to-digital converters, may at this time not be practical for high-speed communication channels.
The partitioning of a circuit into simpler, functional units is somewhat arbitrary, and the particular functional units disclosed here are not meant to limit the scope of the invention. Various functional units may be combined into more complicated functional units, and functional units may be partitioned into simpler functional units. An example is the combining of functional units 404 and 406 of
It is also to be understood in the claims below that the various summers and multipliers, whether realized in the digital domain or analog domain, may perform their indicated operations only approximately. For example, in the analog domain, it is not possible to match transistors exactly, so that a multiplication is not exact. Or, for example, in the digital domain there may be a numerical overflow.
Regarding calibration of the scale factor K, training sequences for the calibration may be a repetition of a training sequence, or segments of one long training sequence. Furthermore, the increment used in block 808 of