The invention relates to computer networks and, more specifically, to precoding data for transmission across computer networks.
Tomlinson-Harashima Precoding (THP) was invented by Tomlinson and Harashima for channel equalization, and has been widely used in DSL systems and in voice-band and cable modems. Unlike decision feedback equalization, where channel equalization takes place at the receive side, THP is a transmitter technique in which equalization is performed at the transmitter side. It may eliminate error propagation and allow the use of current capacity-achieving channel codes, such as low-density parity-check (LDPC) codes, in a natural way.
THP converts an inter-symbol interference (ISI) channel to a near additive white Gaussian noise (AWGN) channel and allows the system to take full advantage of current capacity-achieving error correction codes. Like decision feedback equalizers, Tomlinson-Harashima (TH) precoders contain nonlinear feedback loops, which limit their use for high-speed applications. The speed of TH precoders is limited by the sum of the computation times of two additions and one multiplication. Unlike decision feedback equalization, where the output levels of the nonlinear devices (quantizers) are finite, in TH precoders the output levels of the modulo devices are infinite, or finite but very large. Thus, it is difficult to pipeline TH precoders using the look-ahead and pre-computation techniques that were successfully applied to pipeline Decision Feedback Equalizers (DFEs) in the past.
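The feedback loop described above can be illustrated with a minimal Python sketch. It assumes a monic FIR channel with taps `[1, h1, h2, ...]`, PAM-M symbols, and a modulo interval of [-M, M). The function names `mod_2m`, `th_precode`, and `channel`, and the example tap values, are illustrative assumptions and are not taken from the text.

```python
def mod_2m(u, M):
    """Fold u into [-M, M) by adding an integer multiple of 2M."""
    return (u + M) % (2 * M) - M


def th_precode(x, h, M):
    """Precode symbols x for a channel with taps h = [1, h1, h2, ...]."""
    t = []
    for n, xn in enumerate(x):
        # Feedback term: ISI that earlier precoded symbols will cause at time n.
        isi = sum(h[i] * t[n - i] for i in range(1, len(h)) if n - i >= 0)
        # The subtraction and the modulo close the nonlinear feedback loop.
        t.append(mod_2m(xn - isi, M))
    return t


def channel(t, h):
    """Noise-free FIR channel output r(n) = sum_i h[i] * t(n - i)."""
    return [sum(h[i] * t[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(t))]


M = 4                                   # PAM-4 symbols: -3, -1, 1, 3
h = [1.0, 0.5, -0.25]                   # hypothetical channel taps
x = [3, -1, 1, -3, 3, 3, -1, 1]
t = th_precode(x, h, M)
r = channel(t, h)
recovered = [mod_2m(rn, M) for rn in r]  # receiver folds r back into [-M, M)
```

Because every precoded value `t[n]` depends on `t[n - 1]` through the modulo, the loop cannot be clocked faster than one multiply, the additions, and the modulo allow, which is the bottleneck the pipelined and parallel designs address.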
Recently, TH precoding has been proposed for use in 10 Gigabit Ethernet over copper. The symbol rate of 10GBASE-T is expected to be around 800 Mega Baud. However, TH precoders contain feedback loops, so it is hard to clock them at such a high speed. Thus, the high-speed design of TH precoders is of great interest.
In general, the invention relates to techniques for implementing high-speed precoders, such as Tomlinson-Harashima (TH) precoders. In one aspect of the invention, look-ahead techniques are utilized to pipeline a TH precoder, resulting in a high-speed TH precoder. These techniques may be applied to pipeline various types of TH precoders, such as Finite Impulse Response (FIR) precoders and Infinite Impulse Response (IIR) precoders. In another aspect of the invention, parallel processing of multiple non-pipelined TH precoders results in a high-speed parallel TH precoder design. Utilization of high-speed TH precoders may enable network providers to, for example, operate 10 Gigabit Ethernet with copper cable rather than fiber optic cable.
In one embodiment, a precoder comprises a plurality of pipelined computation units to produce a precoded symbol, wherein one of the computation units performs a modulo operation and feeds back a compensation signal for use as an input to the modulo operation for precoding a subsequent symbol.
In another embodiment, a method comprises performing a modulo operation within one of a plurality of pipelined computational units of a pipelined precoder to produce a precoded symbol, wherein the modulo operation feeds back a compensation signal for use as an input to the modulo operation for precoding a subsequent symbol, and sending a network communication in accordance with the precoded symbol.
In another embodiment, a parallel precoder comprises a plurality of computation units to output signals for at least two precoded symbols in parallel, wherein the computation units include a first modulo operation unit and a second modulo operation unit, and wherein the first modulo operation unit performs a modulo operation for precoding a first one of the symbols and forwards a compensation signal to the second modulo operation unit for use as an input to a modulo operation for precoding a second one of the symbols.
In another embodiment, a method comprises performing, in parallel, at least two modulo operations for outputting at least two precoded symbols, wherein at least one of the modulo operations produces a compensation signal for use in precoding a subsequent symbol, and outputting a network communication in accordance with the precoded symbols.
In another embodiment, a transceiver for network communications comprises a parallel Tomlinson-Harashima precoder that performs at least two modulo operations in parallel for outputting at least two precoded symbols.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
In the example of
In one embodiment, transmitter 6, located within a first network device (not shown), may transmit data to receiver 14, which may be located within a second network device (not shown). The first network device may also include a receiver substantially similar to receiver 14. The second network device may also include a transmitter substantially similar to transmitter 6. In this way, the first and second network devices may achieve two-way communication with each other or other network devices. Examples of network devices that may incorporate transmitter 6 or receiver 14 include desktop computers, laptop computers, network-enabled personal digital assistants (PDAs), digital televisions, or network appliances generally.
Precoder 8 may be a high-speed precoder, such as a pipelined Tomlinson-Harashima (TH) precoder or a parallel TH precoder. Utilization of high-speed TH precoders may enable network providers to operate 10 Gigabit Ethernet with copper cable. For example, network providers may operate existing copper cable networks at higher speeds without having to incur the expense of converting copper cables to more expensive media, such as fiber optic cables. Furthermore, in certain embodiments of the invention, the high-speed TH precoder design may reduce hardware overhead of the precoder. Although the invention will be described with respect to TH precoders, it shall be understood that the present invention is not limited in this respect, and that the techniques described herein may apply to other types of precoders.
As described herein, the next step of the pipelining process is to replace IIR filter
For example, either a clustered look-ahead approach or a scattered look-ahead approach may be utilized. In both of the approaches, the pipelined filter Hp(z) is obtained by multiplying an appropriate polynomial
to both the numerator and the denominator of the transfer function of the original IIR filter:
The pipelined filter Hp(z) consists of two parts, a FIR filter N(z) and an all-pole pipelined IIR filter
In the case of the clustered look-ahead approach, D(z) may be expressed in the form of
and, for the scattered look-ahead approach D(z) may be represented as
where K is the pipelining level.
In order to implement the pipelined design process illustrated in
Next, signal e(z) 66 (
The modulo operation may be described by
t(z)=e(z)+v(z) (7)
thus,
e(z)=t(z)−v(z) (8)
Substituting equation (8) into equation (6) results in
$$t(z)-v(z)=N(z)x(z)+(N(z)-1)v(z)+(1-H(z)N(z))\,t(z). \tag{9}$$
From equation (9), t(z) is expressed as
$$t(z)=\frac{x(z)+v(z)}{H(z)}.$$
At the receiver side, the received signal r(z) is given by
$$r(z)=H(z)t(z)=x(z)+v(z).$$
As in a straightforward application of a TH precoder, the received signal consists of the PAM-M signal x(z) and the compensation signal v(z), which is a multiple of 2M. Thus, the design in
Thus, the error probability performance is the same as a straightforward implementation.
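The relation between the modulo input e, the compensation signal v, and the modulo output t in equations (7) and (8) can be illustrated with a short Python sketch. It assumes the modulo device folds its input into the interval [-M, M); the helper name `modulo_with_compensation` is hypothetical.

```python
def modulo_with_compensation(e, M):
    """Return (t, v) with t = e + v, t in [-M, M), and v a multiple of 2M."""
    t = (e + M) % (2 * M) - M   # modulo-2M operation, equation (7): t = e + v
    v = t - e                   # compensation signal implied by equation (8)
    return t, v


# 5.5 lies outside [-4, 4), so one multiple of 2M = 8 is subtracted.
t, v = modulo_with_compensation(5.5, 4)
```

Since v is always an integer multiple of 2M, a receiver that applies the same modulo operation to x + v recovers x, which is why the pipelined design keeps the error probability of the straightforward precoder.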
As illustrated in
In a first example of a pipelined precoder, the channel transfer function is $H(z)=1+h_1z^{-1}+h_2z^{-2}$. The transfer function He(z) of a zero-forcing pre-equalizer is
$$H_e(z)=\frac{1}{H(z)}=\frac{1}{1+h_1z^{-1}+h_2z^{-2}}.$$
A 2-level scattered look-ahead pipelined design of the IIR filter He(z) may be obtained by multiplying both the numerator and the denominator of He(z) by $N(z)=1-h_1z^{-1}+h_2z^{-2}$:
$$H_p(z)=\frac{1-h_1z^{-1}+h_2z^{-2}}{1+(2h_2-h_1^2)z^{-2}+h_2^2z^{-4}}.$$
Apply the technique in
where Tmux is the operation time of a multiplexer. Although described with respect to a multiplexer, other types of selection elements may be used. Assuming Tm dominates the computation time, pipelined precoder 80 may achieve a speedup of 2. In general, for a K-level scattered pipelined design, the iteration bound is given by
For a large-enough pipelining level K, the speed is limited by the term Ta+Tmod+Tmux.
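The first example can be checked numerically with the following Python sketch. It assumes that the input to the modulo device is e(n) = N x + (N - 1) v - D t, where 1 + D(z) = N(z)H(z), and that signals are zero before n = 0. The tap values and the helper names (`poly_mul`, `tap`, `pipelined_precode`, `channel`) are illustrative assumptions, and the sketch models only the arithmetic, not the hardware of pipelined precoder 80.

```python
def mod_2m(u, M):
    """Fold u into [-M, M) by adding an integer multiple of 2M."""
    return (u + M) % (2 * M) - M


def poly_mul(a, b):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def tap(seq, n, i):
    """Return seq(n - i), treating samples before time 0 as zero."""
    return seq[n - i] if n - i >= 0 else 0.0


def pipelined_precode(x, h, N, M):
    """h = [1, h1, ...] is the channel, N = [1, n1, ...] the look-ahead polynomial."""
    D = poly_mul(N, h)
    D[0] = 0.0                        # N(z)H(z) = 1 + D(z); keep only D(z)
    t, v = [], []
    for n in range(len(x)):
        e = sum(N[i] * tap(x, n, i) for i in range(len(N)))      # N(z) x(z)
        e += sum(N[i] * tap(v, n, i) for i in range(1, len(N)))  # (N(z) - 1) v(z)
        e -= sum(D[i] * tap(t, n, i) for i in range(1, len(D)))  # -D(z) t(z)
        tn = mod_2m(e, M)
        t.append(tn)
        v.append(tn - e)              # compensation signal, a multiple of 2M
    return t, v


def channel(t, h):
    """Noise-free FIR channel output r(n) = sum_i h[i] * t(n - i)."""
    return [sum(h[i] * tap(t, n, i) for i in range(len(h))) for n in range(len(t))]


M = 4
h1, h2 = 0.5, -0.25                   # hypothetical channel taps
h = [1.0, h1, h2]
N = [1.0, -h1, h2]                    # 2-level scattered look-ahead polynomial
NH = poly_mul(N, h)                   # only even powers of z^-1 remain
x = [3, -1, 1, -3, 3, 3, -1, 1]
t, v = pipelined_precode(x, h, N, M)
r = channel(t, h)
```

In this sketch the coefficients of z^-1 and z^-3 in `NH` vanish, so the feedback from t reaches back at least two samples, and the channel output equals x(n) + v(n) as in the straightforward precoder.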
In a second example of a pipelined precoder, the iteration bound of pipelined precoder 80 is improved by reformulating the design of pipelined precoder 80.
For a large K, the speed of pipelined precoder 90 is limited by the term Tmod+Tmux.
For a clustered K-level pipelined design, the iteration bounds in equations (16) and (17) become
respectively.
One drawback associated with precoder pipelined architecture 70 (
Then, the precoder pipelined architecture 70 (
Often, it is more compact to describe a channel with an IIR model
Utilization of the IIR model may be another method of reducing the hardware complexity. However, the corresponding IIR precoder also suffers from a timing problem. The iteration bound of the precoder is
T∞=2Ta+Tm. (22)
Thus, it is desirable to develop pipelined designs to reduce the iteration bound of IIR precoders.
As described above, pipelining techniques, such as the clustered and the scattered look-ahead approaches, are applied to remove this bound, resulting in a pipelined equivalent form.
is a pipelining polynomial. Then, the same techniques described in
As an alternative to pipelining precoders, a parallel precoder design may be implemented to generate a high-speed precoder. The parallel precoder design may be one of two designs: either (1) a parallel precoder where the parallelism level L is less than or equal to the order of the channel, or (2) a parallel precoder where the parallelism level L is larger than the order of the channel.
For the parallel precoder design where the parallelism level L is less than or equal to the order of the channel, consider a fourth-order inter-symbol interference (ISI) channel
$$H(z)=1+h_1z^{-1}+h_2z^{-2}+h_3z^{-3}+h_4z^{-4} \tag{23}$$
and its corresponding zero-forcing pre-equalizer
$$t(n)=-h_1t(n-1)-h_2t(n-2)-h_3t(n-3)-h_4t(n-4)+x(n). \tag{24}$$
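Look-ahead forms of the pre-equalizer recursion can be generated mechanically by substituting the recursion into itself. The Python sketch below is one way to do this under the assumption that the recursion is the linear one of equation (24), without the modulo. The function name `look_ahead` and the dictionary representation of the coefficients are hypothetical choices for illustration.

```python
def look_ahead(h, stages):
    """Return the `stages`-stage look-ahead form of
        t(n) = -h[0] t(n-1) - ... - h[p-1] t(n-p) + x(n).

    The result (a, b) represents
        t(n) = sum_i a[i] * t(n - i) + sum_j b[j] * x(n - j),
    where the smallest lag i appearing in `a` equals `stages`.
    """
    p = len(h)
    a = {i + 1: -h[i] for i in range(p)}   # 1-stage form: the recursion itself
    b = {0: 1.0}
    for m in range(1, stages):
        c = a.pop(m)                       # coefficient of t(n - m), to be eliminated
        b[m] = b.get(m, 0.0) + c           # t(n - m) contributes c * x(n - m)
        for i in range(p):                 # and -c * h[i] * t(n - m - 1 - i)
            a[m + i + 1] = a.get(m + i + 1, 0.0) - c * h[i]
    return a, b


# Hypothetical tap values h1..h4 for a fourth-order channel.
a2, b2 = look_ahead([0.5, -0.25, 0.125, 0.0625], 2)
```

For the 2-stage case this reproduces the hand substitution t(n) = (h1^2 - h2) t(n-2) + (h1 h2 - h3) t(n-3) + (h1 h3 - h4) t(n-4) + h1 h4 t(n-5) - h1 x(n-1) + x(n).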
From equation (24), the 2-stage and 3-stage look-ahead equations are derived as
Substituting n=3k+3, n=3k+4, and n=3k+5 into equations (25), (26), and (27), respectively, yields the following three loop update equations
From equations (24), (31), (32), and (33), the received signals at times n=3k, n=3k+1, and n=3k+2 are
$$r(3k)=x(3k)+v(3k) \tag{34}$$
$$r(3k+1)=x(3k+1)+v(3k+1)+h_1v(3k) \tag{35}$$
$$r(3k+2)=x(3k+2)+v(3k+2)+h_1v(3k+1)+h_2v(3k). \tag{36}$$
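The structure of the received signals in equations (34) through (36) can be reproduced with a small Python simulation. The sketch assumes that, within each block of L = 3 symbols, the pre-modulo values are computed from the previous block's outputs and the current inputs only (which is what the look-ahead equations express) and that the three modulo operations then act independently. The names `parallel_precode` and `parallel_receive`, the tap values, and the receiver procedure are illustrative assumptions rather than the described architecture.

```python
def mod_2m(u, M):
    """Fold u into [-M, M) by adding an integer multiple of 2M."""
    return (u + M) % (2 * M) - M


def parallel_precode(x, h, M, L=3):
    """h = [1, h1, ..., hp]; len(x) is assumed to be a multiple of L."""
    p = len(h) - 1
    t, v = [], []
    for k in range(0, len(x), L):
        s = []                                # pre-modulo (look-ahead) values
        for j in range(L):
            n = k + j
            acc = float(x[n])
            for i in range(1, p + 1):
                m = n - i
                if m >= k:
                    acc -= h[i] * s[m - k]    # same block: value before the modulo
                elif m >= 0:
                    acc -= h[i] * t[m]        # earlier block: precoded output
            s.append(acc)
        for sj in s:                          # L modulo units working in parallel
            tj = mod_2m(sj, M)
            t.append(tj)
            v.append(tj - sj)
    return t, v


def channel(t, h):
    """Noise-free FIR channel output r(n) = sum_i h[i] * t(n - i)."""
    return [sum(h[i] * t[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(t))]


def parallel_receive(r, h, M, L=3):
    """Recover x by cancelling residual compensation terms within each block."""
    x_hat = []
    for k in range(0, len(r), L):
        v_hat = []
        for j in range(L):
            # Subtract h_i * v(n - i) contributed by earlier symbols of this block.
            y = r[k + j] - sum(h[i] * v_hat[j - i] for i in range(1, j + 1))
            xj = mod_2m(y, M)
            x_hat.append(xj)
            v_hat.append(y - xj)
    return x_hat


M = 4
h = [1.0, 0.5, -0.25, 0.125, 0.0625]          # hypothetical fourth-order channel
x = [3, -1, 1, -3, 3, 3, -1, 1, -3]
t, v = parallel_precode(x, h, M)
r = channel(t, h)
x_hat = parallel_receive(r, h, M)
```

The second and third symbols of each block carry extra terms in v because their look-ahead values were formed before the earlier modulo outputs of the same block were known. The receiver therefore needs decisions on the compensation signal as well as on the symbol.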
In the presence of additive noise, the received signals become
$$r(3k)=x(3k)+v(3k)+n(3k) \tag{37}$$
$$r(3k+1)=x(3k+1)+v(3k+1)+h_1v(3k)+n(3k+1) \tag{38}$$
$$r(3k+2)=x(3k+2)+v(3k+2)+h_1v(3k+1)+h_2v(3k)+n(3k+2) \tag{39}$$
The error probability of
where f is the error probability function for PAM-M modulation. For $\hat{x}(3k+4)$, there are two causes for a decision error. One is due to the noise n(3k+4), and the corresponding error rate is
The other is due to the decision error on $\hat{v}(3k+3)$. Since the minimum distance between different levels of the compensation signal is M times that between the transmitted symbols x, the error rate of v(3k+3) may be roughly calculated as
Thus, the error due to n(3k+4) dominates. Furthermore, the error rate of $\hat{x}(3k+4)$ may be approximated by
Similarly, for a large enough M, the error rate of $\hat{x}(3k+5)$ may be approximated by
Hence, the performance of the parallel precoder is close to that of a straightforward TH precoder.
For the parallel precoder design where the parallelism level L is larger than the order of the channel, consider a second-order ISI channel
$$H(z)=1+h_1z^{-1}+h_2z^{-2} \tag{40}$$
and its corresponding zero-forcing pre-equalizer
$$t(n)=-h_1t(n-1)-h_2t(n-2)+x(n). \tag{41}$$
Its 2-stage, 3-stage, and 4-stage look-ahead equations may be derived as
Substituting n=4k+4 and n=4k+5 into equations (44) and (45) results in the following loop update equations
The outputs t(4k+2) and t(4k+3) are computed incrementally as follows
t(4k+2)=−h1t(4k+1)−h2t(4k)+x(4k+2) (48)
t(4k+3)=−h1t(4k+2)−h2t(4k+1)+x(4k+3) (49)
The outputs of TH precoder architecture 220 are
At the receiver side, the received signals are derived as:
$$r(4k+2)=x(4k+2)+v(4k+2) \tag{54}$$
$$r(4k+3)=x(4k+3)+v(4k+3) \tag{55}$$
$$r(4k+4)=x(4k+4)+v(4k+4)+h_1v(4k+3)+(h_2-h_1^2)v(4k+2) \tag{56}$$
$$r(4k+5)=x(4k+5)+v(4k+5)+h_1v(4k+4)+h_2v(4k+3)-h_1h_2v(4k+2) \tag{57}$$
In the presence of additive noise, the received signals become
respectively.
The error probability of $\hat{x}(4k+2)$ and
The error probability of $\hat{x}(4k+4)$ and $\hat{x}(4k+5)$ may also be approximated by
Hence, the performance of the parallel precoder is close to that of a straightforward TH precoder.
Various embodiments of the invention have been described. These and other embodiments are within the scope of the following claims.
This application claims the benefit of U.S. Provisional Application No. 60/609,289 to Parhi et al., entitled “PIPELINED AND PARALLEL TOMLINSON-HARASHIMA PRECODERS,” filed Sep. 13, 2004, and U.S. Provisional Application No. ______, to Parhi et al., entitled “PIPELINED AND PARALLEL TOMLINSON-HARASHIMA PRECODERS,” having attorney docket no. 1008-031USP2, filed Sep. 9, 2005, the entire contents of each being incorporated herein by reference.
The invention was made with Government support from the National Science Foundation No. CCF-0429979. The Government may have certain rights in this invention.
Number | Date | Country
---|---|---
60609289 | Sep 2004 | US
60715672 | Sep 2005 | US