The present invention relates to data processing and transmission. More particularly, it relates to parallel Tomlinson-Harashima precoding of data and parallel Tomlinson-Harashima precoders.
Tomlinson-Harashima precoding (TH precoding) is an equalization technique performed at the transmitter side, and it has been widely used in many communication systems. It can eliminate error propagation and allows the use of capacity-achieving channel codes, such as low-density parity-check (LDPC) codes, in a natural way.
Recently, TH precoding has been proposed for use in 10 Gigabit Ethernet over copper (10GBASE-T). The symbol rate of 10GBASE-T is 800 Mega Baud. However, a TH precoder contains feedback loops, and it may be impossible to clock a straightforward implementation of the TH precoder at such a high speed. Thus, the high-speed design of TH precoders is of great interest.
Designing a fast TH precoder is a challenging task. The architecture of a TH precoder is similar to that of a DFE (decision feedback equalizer). The only difference is that the quantizer in the DFE is replaced with a modulo device in the TH precoder. In a PAM-M (M-level pulse amplitude modulation) system, the number of different outputs of the quantizer in the DFE is finite and is usually equal to the size of the symbol alphabet, i.e., M. In theory, however, the number of different outputs of the modulo device in the TH precoder is infinite for a floating-point implementation. For a fixed-point implementation, it grows exponentially with the wordlength, and in some applications the wordlength can be very large. Thus, many known techniques that exploit the finite-level outputs of the nonlinear element in the DFE, such as the pre-computation technique (See, e.g., K. K. Parhi, “Pipelining in algorithms with quantizer loops,” IEEE Trans. on Circuits and Systems, vol. 37, no. 7, pp. 745-754, July 1991), cannot be directly applied to pipeline the TH precoder. Furthermore, the use of look-ahead techniques in the TH precoder, such as those for pipelining IIR filters (See, e.g., K. K. Parhi and D. G. Messerschmitt, “Pipeline interleaving and parallelism in recursive digital filters, Part I and Part II,” IEEE Trans. Acoust., Speech, Signal Processing, pp. 1099-1135, July 1989), is not straightforward because the TH precoder contains nonlinear elements in its feedback loops.
What is needed is a fast TH precoder and a method for designing the same, which can fully exploit the properties of a TH precoder.
The present invention provides a fast TH precoder through parallel processing and a method for designing the parallel TH precoder.
In accordance with the present invention, a TH precoder is first converted to an equivalent form in which the TH precoder can be viewed as an IIR filter whose input is the sum of the original input to the TH precoder and a finite-level compensation signal. Next, a parallel IIR filter is obtained by applying classical look-ahead techniques to the equivalent IIR filter. Then, the resulting parallel IIR filter is reformulated as an intermediate parallel Tomlinson-Harashima precoder by removing the compensation signal as an explicit input to the IIR filter. Finally, the precomputation technique is applied to the intermediate design, resulting in a parallel Tomlinson-Harashima precoder.
Further embodiments, features, and advantages of the present invention, as well as the structure and operation of the various embodiments of the present invention are described in detail below with reference to accompanying drawings.
The present invention is described with reference to the accompanying figures. In the figures, like reference numbers indicate identical or functionally similar elements. Additionally, the left-most digit or digits of a reference number identify the figure in which the reference number first appears. The accompanying figures, which are incorporated herein and form part of the specification, illustrate the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the relevant art to make and use the invention.
a illustrates a zero-forcing pre-equalization function.
b illustrates a Tomlinson-Harashima (TH) precoding function.
c illustrates an equivalent form of a TH precoder function.
Consider a discrete-time channel
where LH is the channel memory length. We assume that the model is known at the transmitter side. We also assume that the transmitted symbols are PAM-M symbols, where the symbol set is {±1, ±3, . . . , ±(M−1)}. To remove inter-symbol interference (ISI), we can use zero-forcing pre-equalization, which basically implements the inverse of the channel transfer function at the transmitter side, as illustrated in
Tomlinson and Harashima (See, M. Tomlinson, “New automatic equalizer employing modulo arithmetic,” Electron. Lett., vol. 7, pp. 138-139, March 1971; and H. Harashima and H. Miyakawa, “Matched-transmission technique for channels with intersymbol interference,” IEEE Trans. Commun., vol. 20, pp. 774-780, August 1972) proposed to limit the output dynamic range by using a nonlinear modulo device in the feedforward path of the pre-equalizer, as shown in
The received signal is
and X(z) can be recovered from R(z) by performing a modulo operation. An important property of v(n) is that it only has finite levels since v(n) is a multiple of 2M and |v(n)|≦(1+Σi=1L
TCritical=2Ta+Tm+Tmod, EQ.(4)
where Ta, Tm and Tmod denote the computation times of an addition, a multiplication and a modulo operation, respectively (Note: Tmod=0 when M is a power of 2). From the figure, we can see that the iteration bound, T∞ (for the definition of iteration bound, see K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation, John Wiley & Sons, Inc., New York, 1999), of the architecture is also equal to TCritical, i.e.,
T∞=TCritical=2Ta+Tm+Tmod. EQ.(5)
The achievable minimum clock period of this architecture is limited by T∞, i.e., we cannot operate the precoder at a speed higher than 1/T∞. Classical high-speed design techniques such as retiming and unfolding cannot be used to achieve higher speed since the iteration bound is a fundamental limit. Thus it is important to develop techniques to design a fast TH precoder.
A Method to Design Parallel Tomlinson-Harashima Precoders
As shown in
In the present invention, the first step in designing a parallel TH precoder is to convert the original TH precoder to its equivalent form. Next, the classical clustered look-ahead technique is applied to the equivalent form to obtain a parallel IIR filter. The parallel IIR filter requires the compensation signal as an explicit input. To remove the compensation signal as an explicit input, modulo devices are re-introduced into the parallel IIR filter, resulting in an intermediate parallel TH precoder. The intermediate parallel precoder still has a very long critical path. To shorten the critical path, the precomputation technique is applied. For a 2-parallel design, the resulting final architecture can achieve a speedup of about 2.
Let us look at an example where we want to design a 2-parallel TH precoder. Consider a 2nd-order inter-symbol interference (ISI) channel described by an FIR (finite impulse response) model
H(z) = 1 + h_1 z^{−1} + h_2 z^{−2}. EQ.(6)
The corresponding FIR TH precoder can be described as
t(n) = MOD(−h_1 t(n−1) − h_2 t(n−2) + x(n), 2M), EQ.(7)
where MOD(*, 2M) is a modulo operation by 2M.
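As an illustration, the recursion in EQ. (7) can be sketched in Python. This is a minimal sketch and not the architecture of the invention: the function names, the zero initial state, the channel taps, and the choice of [−M, M) as the output range of the modulo device are assumptions made for the example. Exact rational arithmetic is used so that the modulo operation is free of rounding effects. The sketch also records the amount v(n) added by the modulo device at each step, which is always an integer multiple of 2M.

```python
from fractions import Fraction

def mod2m(u, M):
    """Reduce u into [-M, M) by adding an integer multiple of 2M (assumed range)."""
    return (u + M) % (2 * M) - M

def th_precode(x, h1, h2, M):
    """Return (t, v): outputs of EQ. (7) and the amount v(n) added by the modulo device."""
    t_prev1 = t_prev2 = Fraction(0)       # t(n-1), t(n-2); zero initial state is assumed
    t, v = [], []
    for xn in x:
        u = -h1 * t_prev1 - h2 * t_prev2 + xn   # argument of MOD in EQ. (7)
        tn = mod2m(u, M)
        t.append(tn)
        v.append(tn - u)                  # t(n) = u + v(n), with v(n) a multiple of 2M
        t_prev1, t_prev2 = tn, t_prev1
    return t, v

M = 4                                     # PAM-4
h1, h2 = Fraction(1, 2), Fraction(-1, 4)  # hypothetical channel taps
x = [3, -1, 1, -3, 3, 3, -1, 1]           # PAM-4 symbols from {±1, ±3}
t, v = th_precode(x, h1, h2, M)
```

Every entry of `t` lies in [−M, M), and every entry of `v` is a multiple of 2M, which is the finite-level property that the parallel design relies on.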
The equivalent form of the TH precoder in EQ. (7) can be represented as:
t(n) = −h_1 t(n−1) − h_2 t(n−2) + x(n) + v(n), EQ.(8)
where v(n) is a compensation signal. The 2-stage look-ahead equation of EQ. (8) can be obtained by the clustered look-ahead technique (See, e.g., K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation, John Wiley & Sons, Inc., New York, 1999):
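The look-ahead step can be worked out directly from EQ. (8). The following is a reconstruction obtained by writing EQ. (8) at time n−1 and substituting it into EQ. (8) at time n; the grouping of terms is an assumption and may differ from the arrangement used in the original equation.

```latex
\begin{align*}
t(n-1) &= -h_1\, t(n-2) - h_2\, t(n-3) + x(n-1) + v(n-1), \\
t(n)   &= -h_1\, t(n-1) - h_2\, t(n-2) + x(n) + v(n) \\
       &= (h_1^2 - h_2)\, t(n-2) + h_1 h_2\, t(n-3)
          - h_1\, x(n-1) - h_1\, v(n-1) + x(n) + v(n).
\end{align*}
```

After the substitution, t(n) depends on t(n−2) and t(n−3) rather than on t(n−1), which is what allows two outputs to be computed per iteration.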
The parallel IIR (infinite impulse response) system can be obtained by substituting n=2k+1 and n=2k+2 into EQ. (8) and EQ. (9), respectively, and is described by:
v(2k+1) and v(2k+2) can be removed as explicit inputs to the above parallel IIR filter by re-introducing a modulo operation as follows, resulting in an intermediate parallel TH precoder:
TCritical=2Ta+Tm+Tmod+Tmux, EQ.(12)
where Tmux is the computation time of a multiplexer. The critical path in the parallel design is only one multiplexing operation longer than that in the straightforward architecture in
The present method can also be used to design parallel TH precoders of order higher than 2 and with a level of parallelism higher than 2. It can likewise be used to design parallel IIR TH precoders.
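The 2-parallel FIR design described above can be sketched in Python and checked against the sequential recursion of EQ. (7). This is a minimal sketch under stated assumptions, not the architecture of the invention: the function names, the zero initial state, the [−M, M) modulo range, the channel taps, and the candidate set `levels` are all hypothetical. The look-ahead expression is obtained by substituting EQ. (8) at n=2k+1 into EQ. (8) at n=2k+2. A dictionary lookup stands in for the multiplexer that selects among the precomputed candidates.

```python
from fractions import Fraction
import random

def mod2m(u, M):
    """Reduce u into [-M, M) by adding an integer multiple of 2M (assumed range)."""
    return (u + M) % (2 * M) - M

def th_sequential(x, h1, h2, M):
    """Reference: one output per iteration, as in EQ. (7)."""
    p1 = p2 = Fraction(0)                    # t(n-1), t(n-2)
    out = []
    for xn in x:
        tn = mod2m(-h1 * p1 - h2 * p2 + xn, M)
        out.append(tn)
        p1, p2 = tn, p1
    return out

def th_two_parallel(x, h1, h2, M, levels):
    """Two outputs per iteration; t(2k+2) is precomputed for every candidate v(2k+1)."""
    ta = tb = Fraction(0)                    # t(2k), t(2k-1)
    out = []
    for k in range(0, len(x) - 1, 2):
        x1, x2 = x[k], x[k + 1]
        # Part of the look-ahead sum that does not depend on v(2k+1).
        base = (h1 * h1 - h2) * ta + h1 * h2 * tb - h1 * x1 + x2
        # Precomputation: one candidate for t(2k+2) per possible value of v(2k+1).
        cand = {v: mod2m(base - h1 * v, M) for v in levels}
        # First output and its compensation value.
        u1 = -h1 * ta - h2 * tb + x1
        t1 = mod2m(u1, M)
        v1 = t1 - u1                         # integer multiple of 2M
        t2 = cand[v1]                        # multiplexer: select the matching candidate
        out += [t1, t2]
        ta, tb = t2, t1
    return out

random.seed(0)
M = 4
h1, h2 = Fraction(3, 4), Fraction(-1, 2)     # hypothetical channel taps
levels = [2 * M * j for j in range(-3, 4)]   # assumed to cover every possible v(n)
x = [random.choice([-3, -1, 1, 3]) for _ in range(200)]
same = th_two_parallel(x, h1, h2, M, levels) == th_sequential(x, h1, h2, M)
```

In hardware the candidates in `cand` would be computed concurrently with `t1`, so only the final selection remains on the critical path. The sketch evaluates them one after another and only illustrates that the selected value matches the sequential result.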
Let us look at an example where we want to design a 2-parallel TH precoder for a 2nd-order ISI IIR channel
The corresponding TH precoder can be described as
t(n)=MOD(x(n)−f(n),2M), EQ.(14)
where f(n) is the inverse z-transform of (H(z)−1)T(z). Its straightforward architecture is shown in
TCritical=4Ta+2Tm+Tmod, EQ.(15)
and the iteration bound, T∞, of the architecture is
T∞=3Ta+Tm+Tmod. EQ.(16)
The inherent speed is limited by the iteration bound.
The equivalent form of the IIR TH precoder in EQ. (14) can be represented as:
If we define w(n)≡x(n)+a1x(n−1)+a2x(n−2), then EQ. (17) becomes
The 2-stage look-ahead equation of EQ. (18) can be obtained by substituting t(n−1) into EQ. (18):
The corresponding parallel IIR system can be obtained by substituting n=2k+1 and n=2k+2 into equation EQ. (18) and EQ. (19), respectively, and is described by:
v(2k+1) in EQ. (20) and v(2k+2) in EQ. (21) can be removed by re-introducing modulo operations as follows:
If the compensation signal v(2k+1) in
TCritical=4Ta+Tm+2Tmod+Tmux. EQ.(24)
The parallel design processes two samples and computes two outputs in each clock cycle, so the achievable sample period is half of the critical path:
TSample=2Ta+Tm/2+Tmod+Tmux/2. EQ.(25)
The computation time of a multiplier is usually much longer than that of an adder or a multiplexer, and hence a speedup over the iteration bound in EQ. (16) is achieved.
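The comparison can be made explicit with a short calculation that uses only EQ. (16), EQ. (24) and EQ. (25). It is a worked check of the arithmetic and does not assume specific gate delays.

```latex
\begin{align*}
T_{\mathrm{Sample}} &= \tfrac{1}{2}\, T_{\mathrm{Critical}}
   = \tfrac{1}{2}\left(4T_a + T_m + 2T_{\mathrm{mod}} + T_{\mathrm{mux}}\right)
   = 2T_a + \tfrac{1}{2}T_m + T_{\mathrm{mod}} + \tfrac{1}{2}T_{\mathrm{mux}}, \\
T_\infty - T_{\mathrm{Sample}} &= \left(3T_a + T_m + T_{\mathrm{mod}}\right)
   - \left(2T_a + \tfrac{1}{2}T_m + T_{\mathrm{mod}} + \tfrac{1}{2}T_{\mathrm{mux}}\right)
   = T_a + \tfrac{1}{2}T_m - \tfrac{1}{2}T_{\mathrm{mux}}.
\end{align*}
```

The difference is positive whenever Tmux < 2Ta + Tm, so the parallel design is faster than the iteration bound of the straightforward architecture as long as the multiplexer is not slower than two additions plus one multiplication.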
Complexity and Critical Path Comparison
In this section, we compare the complexity and critical path for a straightforward L-tap FIR TH precoder (Straightforward-THP), its corresponding 2-parallel design (2-Para-THP) and 3-parallel design (3-Para-THP).
Table 1 compares the complexity for the straightforward L-tap FIR THP, 2-Para-THP and 3-Para THP. In this table, we assume that the number of possibilities of the compensation signal is N. The straightforward THP needs L multipliers, 2 adders and one modulo device. The 2-Para-THP needs 2L+1 multipliers. Among the 2L+1 multipliers, 2L multipliers are used for loop update for the two-parallel outputs t(2k−1) and t(2k). In
For a 3-parallel TH precoder, we need 3L+2 multipliers, 3L+2N+2N^2 adders, 1+N+N^2 modulo devices, one W-bit N-to-1 mux and one W-bit N^2-to-1 mux.
Table 1 also lists the critical paths for the straightforward THP, 2-Para-THP and 3-Para-THP, which are 2Ta+Tm+Tmod, 2Ta+Tm+Tmod+Tmux, and 3Ta+Tm+Tmod+2Tmux, respectively.
Table 2 compares the complexity and the critical path for the straightforward L-th order IIR TH precoder (Straightforward-THP), its corresponding 2-parallel design (2-Para-THP) and 3-parallel design (3-Para-THP).
A method is provided for designing parallel Tomlinson-Harashima precoders based on classical look-ahead and precomputation techniques and on the properties of Tomlinson-Harashima precoders. The resulting parallel TH precoders can be used for high-speed communication applications, such as 10 Gigabit Ethernet over copper.
While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example only, and not limitation. It will be understood by those skilled in the art that various changes in form and details can be made therein without departing from the spirit and scope of the invention as defined in the appended claims. Thus, the breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.
This invention was made with Government support under the SBIR grant # DMI-0441632, awarded by the National Science Foundation. The Government has certain rights in this invention.
Number | Date | Country | |
---|---|---|---|
20070014380 A1 | Jan 2007 | US |