Method and apparatus for the efficient implementation of a totally general convolutional interleaver in DMT-based xDSL systems

Description

FIELD OF THE INVENTION

The present invention relates generally to the telecommunications and networking fields. More specifically, the present invention relates to a method and apparatus for the efficient implementation of a totally general convolutional interleaver in a discrete multi-tone (DMT)-based digital subscriber line (xDSL) system, such as a modem or the like, that uses forward error correction (FEC) and convolutional interleaving to combat the effects of impulse noise and the like.

BACKGROUND OF THE INVENTION

Conventional high-speed communications on copper media (e.g. standard telephone lines) and the like utilize DMT technology and are bundled under the umbrella of xDSL. Several variants of this technology are currently deployed, namely asymmetric digital subscriber line (ADSL), asymmetric digital subscriber line 2 (ADSL2), asymmetric digital subscriber line 2 plus (ADSL2plus), and very high-speed digital subscriber line (VDSL). Some of these technologies are standardized by the International Telecommunication Union (ITU), Geneva, as follows: “ITU-T Recommendation G.992.1, Asymmetric Digital Subscriber Line (ADSL),” “ITU-T Recommendation G.992.3, Asymmetric Digital Subscriber Line Transceivers 2 (ADSL2),” “ITU-T Recommendation G.992.5, Asymmetric Digital Subscriber Line (ADSL) Transceivers: Extended Bandwidth ADSL2 (ADSL2plus),” and “ITU-T Recommendation G.993.1, Very High-Speed Digital Subscriber Line (VDSL) Transceivers.” Future technologies are the subject of ongoing standardization efforts.

One key feature of such xDSL systems is the use of FEC to combat the effects of impulse noise and the like. To enhance the effectiveness of FEC, a convolutional interleaver is utilized to spread error patterns over a plurality of DMT symbols, thus allowing for the correction of errors without introducing excessive redundancy, and hence overhead. The convolutional interleaver is defined by the following relationship:

Δj=(D−1)j, j=1, . . . ,I−1,

where Δj is the distance between two interleaved bytes, D is the interleaver depth in bytes, and I is the interleaver block size in bytes.
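The relationship above can be sketched as a direct, memory-hungry reference model. This is a minimal illustration only: it assumes that Δj is the delay, in bytes, applied to byte j of each I-byte block, and the function name `interleave_by_delay` and the `fill` value used for not-yet-written output positions are hypothetical.

```python
def interleave_by_delay(x, D, I, fill=0):
    """Reference model: byte j of each I-byte block is delayed by (D - 1) * j bytes."""
    y = [fill] * len(x)
    for n, byte in enumerate(x):
        j = n % I                  # position of the byte inside its block
        m = n + (D - 1) * j        # output position after the delay
        if m < len(y):             # bytes delayed past the end are dropped in this sketch
            y[m] = byte
    return y


x = [f"x{n}" for n in range(21)]
y = interleave_by_delay(x, D=4, I=7)
print(y[0:7])    # ['x0', 0, 0, 0, 'x1', 0, 0]
print(y[7:14])   # ['x7', 'x2', 0, 0, 'x8', 'x3', 0]
```

Under this reading, the output for D=4 and I=7 agrees with the worked example given in the detailed description, which makes the model a convenient cross-check for more memory-efficient implementations.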

A necessary condition of such a convolutional interleaver is that D and I must be co-prime (i.e. have no common divisor other than 1). This is enforced in several different ways:

in ADSL D=2ⁿ, I=N=odd integer, and
in VDSL D=M·I+1, with N=q·I,

where q is an integer. A generalized form of the above VDSL convolutional interleaver has also been considered where:

in any DSL D=M·I+x, with N=q·I, x=1, . . . ,I−1,

with the constraint that x is chosen such that D and I are co-prime.
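The co-primality constraint for the three forms above can be checked with a greatest-common-divisor test. The sketch below is illustrative; the helper names `is_coprime` and `valid_offsets` are hypothetical. It relies on the fact that gcd(M·I+x, I) equals gcd(x, I), so the admissible values of x do not depend on M.

```python
from math import gcd


def is_coprime(D, I):
    """True when D and I share no divisor other than 1."""
    return gcd(D, I) == 1


def valid_offsets(I):
    """Offsets x in 1..I-1 for which D = M*I + x is co-prime with I, for any integer M."""
    return [x for x in range(1, I) if gcd(x, I) == 1]


print(is_coprime(2**6, 255))      # ADSL form: power-of-two depth, odd block size -> True
print(is_coprime(3 * 7 + 1, 7))   # VDSL form: D = M*I + 1 -> True
print(valid_offsets(6))           # generalized form with I = 6 -> [1, 5]
```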

The VDSL form of the convolutional interleaver wherein:

D=M·I+1, with N=q·I

has been referred to as “triangular” due to an implementation known to those of ordinary skill in the art utilizing shift registers of varying sizes in a triangular pattern. Such a convolutional interleaver needs only (D−1)*(I−1)/2 memory locations. However, in all other cases, and in the most general case where there is no structural relationship between N and D (for example, when N and D are co-prime, or when N is prime and is greater than D), this method cannot be applied.

Thus, what is needed is an improved method and apparatus for implementing a general convolutional interleaver, with no constraints, in an efficient manner, using (D−1)*(I−1)/2 memory locations for the interleaved data in all cases.

BRIEF SUMMARY OF THE INVENTION

In various exemplary embodiments, the present invention provides an improved method and apparatus for implementing a general convolutional interleaver, with no constraints, in an efficient manner, using (D−1)*(I−1)/2 memory locations for the interleaved data in all cases.

In one exemplary embodiment of the present invention, a method for implementing a general convolutional interleaver, with no constraints, in an efficient manner, using (D−1)*(I−1)/2 memory locations for the interleaved data in all cases, includes: dividing an incoming data stream into blocks of I bytes; mapping each member of a block into a set of first-in, first-out shift registers (FIFOs) arranged in rows, wherein the number of elements in a row j is given by:

nd(j)=int(j·D/I), j=0, . . . ,I−1

wherein int(x) is an integer part of x; wherein, as each element is entered, a FIFO is shifted to the right and a last element is read out to an output stream; and wherein the order in which the elements are read is different from the order in which they are written.

In another specific embodiment of the present invention, an apparatus for implementing a general convolutional interleaver, with no constraints, in an efficient manner, using (D−1)*(I−1)/2 memory locations for the interleaved data in all cases, includes: means for dividing an incoming data stream into blocks of I bytes; means for mapping each member of a block into a set of first-in, first-out shift registers (FIFOs) arranged in rows, wherein the number of elements in a row j is given by:

nd(j)=int(j·D/I), j=0, . . . ,I−1,

wherein int(x) is an integer part of x; wherein, as each element is entered, a FIFO is shifted to the right and a last element is read out to an output stream; and wherein the order in which the elements are read is different from the order in which they are written.

Preferably, the apparatus of the present invention is an xDSL modem or the like, and the method of the present invention is implemented thereon.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides an improved method and apparatus for implementing a general convolutional interleaver, with no constraints, in an efficient manner, using (D−1)*(I−1)/2 memory locations for the interleaved data in all cases.

Considering the general case where I=N, it is assumed that D and I are given and that they are co-prime. The method starts by dividing an incoming data stream into blocks of I bytes. Each member of a block is mapped into a set of first-in, first-out shift registers (FIFOs) arranged in rows, where the number of elements in row j is given by:

nd(j)=int(j·D/I),j=0, . . . ,I−1,

where int(x) is the integer part of x.
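As a small illustration of the row-length formula, the following sketch computes nd(j) with integer division. The function name `row_lengths` is hypothetical, and the values D=4, I=7 are the ones used in the worked example of this description.

```python
def row_lengths(D, I):
    """Number of FIFO elements in each row j, nd(j) = int(j*D/I)."""
    return [(j * D) // I for j in range(I)]


nd = row_lengths(4, 7)
print(nd)        # [0, 0, 1, 1, 2, 2, 3]
print(sum(nd))   # 9, which equals (4 - 1) * (7 - 1) / 2
```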

As the next element is entered, the FIFO is shifted to the right and the last element is read out to the output stream. However, the order in which the elements are read is different from the order in which they are written. The indices of the rows read are given by id(j), which is defined through the remainder r(j):

r(j)=rem(j·D/I)=j·D−nd(j)·I, j=0, . . . ,I−1, and
id(r(j))=j, j=0, . . . ,I−1.

For those rows where nd(j)=0, no data is stored, but the input data is directly passed to the output. This process is illustrated in the following simple example. Let D=4 and N=I=7. In this case:

nd=0 0 1 1 2 2 3,
id=0 2 4 6 1 3 5.
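The read order can be sketched by inverting the remainder mapping: row j is read at position j·D modulo I of each cycle. The function name `read_order` is hypothetical.

```python
def read_order(D, I):
    """Return id, where id[k] is the row read at step k of a cycle (id(r(j)) = j)."""
    order = [0] * I
    for j in range(I):
        order[(j * D) % I] = j   # r(j) is the remainder of j*D divided by I
    return order


print(read_order(4, 7))   # [0, 2, 4, 6, 1, 3, 5]
```

Because D and I are co-prime, the remainders r(j) take every value from 0 to I−1 exactly once, so every entry of the table is filled.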

Let the input data stream be x₀, x₁, . . . , and the output data stream be y₀, y₁, . . . . A read-before-write strategy is implemented, where the FIFO output is read before the next element is input and the FIFO is shifted. Assuming that the FIFO is empty at the beginning, rows 0, 2, 4, 6, 1, 3, and 5 are read, in that order. Since nd(0)=0, the input data is directly passed to the output, so the first output sample is y₀=x₀. Rows 2, 4, and 6 have nothing in the last element of the FIFO, so y is zero for these. Row 1 is read next and nd(1)=0, so again the input is passed to the output for this case. After one cycle of seven samples:

y₀:y₆=[x₀ 0 0 0 x₁ 0 0].

The next seven samples of x are then input to the FIFO, where the first and second rows contain zero elements. Thus, these are not stored as they have already been passed to the output. After this cycle, the FIFO looks like this:

row
0    -
1    -
2    x₂
3    x₃
4    x₄   0
5    x₅   0
6    x₆   0    0,

where the numbering of rows includes the zero-length FIFOs. Reading out the next set of samples provides:

y₇:y₁₃=[x₇ x₂ 0 0 x₈ x₃ 0]

which corresponds to reading the last elements in rows 0, 2, 4, 6, 1, 3, and 5 and passing the next input for rows 0 and 1 directly to the output. This is followed by a write cycle of seven elements, resulting in the following FIFO contents:

row
0    -
1    -
2    x₉
3    x₁₀
4    x₁₁  x₄
5    x₁₂  x₅
6    x₁₃  x₆   0.

The next read cycle would then give the following output:

y₁₄:y₂₀=[x₁₄ x₉ x₄ 0 x₁₅ x₁₀ x₅].

Note that the total number of non-zero FIFO locations is (D−1)*(N−1)/2=9, as expected.
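The shift-register walk-through above can be reproduced with a short simulation. This is a sketch under stated assumptions rather than a definitive implementation: the function name `fifo_interleave` is hypothetical, each row is modelled as a bounded `collections.deque`, empty FIFO stages are represented by the value 0, and the input length is assumed to be a multiple of I.

```python
from collections import deque


def fifo_interleave(x, D, I, fill=0):
    """Read-before-write interleaver built from I rows of FIFOs of length nd(j)."""
    nd = [(j * D) // I for j in range(I)]
    order = [0] * I
    for j in range(I):
        order[(j * D) % I] = j
    # A deque with maxlen drops its right-most element when a new one is pushed on the left.
    fifos = [deque([fill] * n, maxlen=n) for n in nd]
    y = []
    for start in range(0, len(x) - I + 1, I):
        block = x[start:start + I]
        for row in order:                       # read cycle
            y.append(block[row] if nd[row] == 0 else fifos[row][-1])
        for row in range(I):                    # write cycle
            if nd[row]:
                fifos[row].appendleft(block[row])
    return y, fifos


x = [f"x{n}" for n in range(21)]
y, fifos = fifo_interleave(x, D=4, I=7)
print(y[14:21])                      # ['x14', 'x9', 'x4', 0, 'x15', 'x10', 'x5']
print(sum(len(f) for f in fifos))    # 9 storage locations
```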

It will be apparent to those of ordinary skill in the art that the above method could be implemented directly in an integrated circuit device using shift registers, as defined above. In such an implementation, the shift registers have to be defined for the worst case of D and I, and if smaller values are used, the extra stages are not used. This leads to a complicated control mechanism for controlling the size of the individual shift registers used as the convolutional interleaver is reconfigured. A more flexible implementation is obtained if the shift registers are mapped to a general memory structure, as described below.

To map the contents of the FIFOs to a linear memory array, two pointers are formed—a write pointer offset to write the data to the memory and a read pointer offset to read the data. For each block, the pointers cycle through I values. The write pointer offset is defined simply as the number of elements in each row of the FIFOs:

dwp(j)=int(j·D/I), j=0, . . . ,I−1,

and the read pointer offset is defined as:

drp(k)=Σ_{j=0}^{id(k)−1} dwp(j) if id(k)≥1, and drp(k)=0 if id(k)=0.

In addition, a flag is defined to indicate if the target row to be read has zero elements, as follows:

fl(k)=1 if dwp(id(k))≠0, and fl(k)=0 if dwp(id(k))=0.
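The three tables defined above can be sketched as follows. The function name `pointer_tables` is hypothetical, and the flag is computed from the write pointer offset of the target row, dwp(id(k)), which is the reading that reproduces the example values given later in this description.

```python
def pointer_tables(D, I):
    """Return (dwp, drp, fl) for an interleaver of depth D and block size I."""
    dwp = [(j * D) // I for j in range(I)]
    order = [0] * I
    for j in range(I):
        order[(j * D) % I] = j
    prefix = [0]                                  # prefix[n] = dwp[0] + ... + dwp[n-1]
    for n in dwp:
        prefix.append(prefix[-1] + n)
    drp = [prefix[order[k]] for k in range(I)]    # offset of the row read at step k
    fl = [1 if dwp[order[k]] else 0 for k in range(I)]
    return dwp, drp, fl


dwp, drp, fl = pointer_tables(4, 7)
print(dwp)   # [0, 0, 1, 1, 2, 2, 3]
print(drp)   # [0, 0, 2, 6, 0, 1, 4]
print(fl)    # [0, 1, 1, 1, 0, 1, 1]
```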

The process starts by setting wp to zero. I bytes are then read from the memory at the locations specified by the read pointer, except that reads corresponding to rows with zero bytes (dwp=0) are taken directly from the input stream.

Designating the next input from the input stream as “in” and the next output to the output stream as “out”, the read operation becomes:

for j = 0 : I − 1
    if (fl(j) = 0)
        out = in
    else
        rp = b + (wp + drp(j))_ml
        out = mem(rp)
    endif
endfor,

where ml is the size of the memory (D−1)*(I−1)/2, b is the first location of the memory, and (x)_m stands for the modulo operation, i.e. the remainder after x is divided by m.

I bytes are next written to the memory at locations specified by a write pointer, with the exception that no data is written for rows corresponding to dwp=0. Thus, the write operation becomes:

for j = 0 : I − 1
    if (dwp(j) ≠ 0)
        wp = b + (wp + dwp(j))_ml
        mem(wp) = in
    endif
endfor.

Note that, at the end of the write cycle, wp returns to its original value because:

summation(j=0 to I−1) int(j·D/I)=ml.

At this point, wp is incremented by 1 modulo ml and the cycle is repeated.
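The wrap-around of the write pointer can be checked numerically. The sketch below is illustrative: the names `memory_size` and `write_cycle` are hypothetical, the base address b is taken as 0, and D and I are assumed to be co-prime and greater than 1 so that ml is a positive integer.

```python
def memory_size(D, I):
    """ml = (D - 1) * (I - 1) / 2, an integer whenever D and I are co-prime."""
    return (D - 1) * (I - 1) // 2


def write_cycle(wp, D, I):
    """Return the addresses written in one cycle and the value of wp afterwards."""
    ml = memory_size(D, I)
    addresses = []
    for j in range(I):
        step = (j * D) // I          # dwp(j)
        if step:                     # rows with dwp(j) = 0 store nothing
            wp = (wp + step) % ml
            addresses.append(wp)
    return addresses, wp


print(write_cycle(0, 4, 7))   # ([1, 2, 4, 6, 0], 0)
print(write_cycle(1, 4, 7))   # ([2, 3, 5, 7, 1], 1)
```

Since the offsets of one cycle add up to ml, the pointer always ends a cycle where it started, and the explicit increment by 1 is what moves the whole pattern through the memory.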

Illustrating this process with the above example:

D=4
I=7
ml=9
dwp=0 0 1 1 2 2 3
drp=0 0 2 6 0 1 4
fl=0 1 1 1 0 1 1
b=0.

During the first read cycle, wp=0 and the read pointers and flags are:

pr=[0 0 2 6 0 1 4]
fl=[0 1 1 1 0 1 1].

Using the same input and output streams as above, the first read cycle passes the input to the output for the first read pointer value of zero (fl=0), reads locations 0, 2, and 6 from the memory, then passes the next input value to the output (fl=0) and reads locations 1 and 4. The first seven samples of the output are:

y₀:y₆=[x₀ 0 0 0 x₁ 0 0].

The write pointer for the first write cycle is:

pw=[0 0 1 2 4 6 0],

and the memory contains:

index    0    1    2    3    4    5    6    7    8
content  x₆   x₂   x₃   0    x₄   0    x₅   0    0.

During the second read cycle, wp=1 and the read pointer and flags are:

pr=[1 1 3 7 1 2 5]
fl=[0 1 1 1 0 1 1],

which provides the next seven output samples:

y₇:y₁₃=[x₇ x₂ 0 0 x₈ x₃ 0].

The write pointer for the second write cycle is:

pw=[1 1 2 3 5 7 1],

and the memory contains:

index    0    1    2    3    4    5    6    7    8
content  x₆   x₁₃  x₉   x₁₀  x₄   x₁₁  x₅   x₁₂  0.

During the third read cycle, wp=2 and the read pointer and flags are:

pr=[2 2 4 8 2 3 6]
fl=[0 1 1 1 0 1 1],

which provides the next seven output samples:

y₁₄:y₂₀=[x₁₄ x₉ x₄ 0 x₁₅ x₁₀ x₅].

This is the same result as obtained above for the shift register implementation. It should be noted that every cycle I bytes are read, followed by a write of I bytes, and the memory is reused in such a manner that more than (D−1)*(I−1)/2 memory locations are never needed.
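The linear-memory walk-through above can be reproduced with the following sketch. It is a minimal model under stated assumptions, not a definitive implementation: the function name `memory_interleave` is hypothetical, the base address b is taken as 0, unwritten memory holds 0, the input length is assumed to be a multiple of I, and a row with flag 0 passes the input straight through instead of reading the memory.

```python
def memory_interleave(x, D, I, fill=0):
    """Interleave x using one linear memory of (D - 1) * (I - 1) / 2 locations."""
    dwp = [(j * D) // I for j in range(I)]
    order = [0] * I
    for j in range(I):
        order[(j * D) % I] = j
    prefix = [0]
    for n in dwp:
        prefix.append(prefix[-1] + n)
    drp = [prefix[order[k]] for k in range(I)]
    fl = [1 if dwp[order[k]] else 0 for k in range(I)]

    ml = (D - 1) * (I - 1) // 2
    mem = [fill] * ml
    wp = 0
    src = iter(x)                     # "in": the next input byte
    y = []                            # "out": the output stream
    for _ in range(len(x) // I):
        for k in range(I):            # read cycle
            if fl[k] == 0:
                y.append(next(src))   # zero-length row: pass input to output
            else:
                y.append(mem[(wp + drp[k]) % ml])
        for j in range(I):            # write cycle
            if dwp[j]:
                wp = (wp + dwp[j]) % ml
                mem[wp] = next(src)
        if ml:                        # guard for the trivial case D = 1
            wp = (wp + 1) % ml
    return y, mem


x = [f"x{n}" for n in range(21)]
y, _ = memory_interleave(x, D=4, I=7)
print(y[14:21])   # ['x14', 'x9', 'x4', 0, 'x15', 'x10', 'x5']
```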

It should also be noted that the pointers for read and write, and the flag, can be computed in line. Optionally, the read pointer offsets and the flags are pre-computed and stored in an array of maximum size I by 2, where each array address contains two values—the read pointer offset and the flag. An efficient way of doing this is by attaching the flag bit (the flag only having a value of 0 or 1) to the read pointer offset as an extra bit, separating the two before use. Another implementation inverts the read pointer offset values when the flag is zero, testing for such negative values in the loop as these offsets are actually never used.
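The option of attaching the flag bit to the read pointer offset can be sketched as below. The placement of the flag in the least-significant bit is an assumption made for illustration, as are the names `pack_entries` and `unpack_entry`; the description above does not fix a bit position.

```python
def pack_entries(drp, fl):
    """Store each (offset, flag) pair in one word, with the flag as the lowest bit."""
    return [(offset << 1) | flag for offset, flag in zip(drp, fl)]


def unpack_entry(word):
    """Separate a packed word back into (offset, flag) before use."""
    return word >> 1, word & 1


packed = pack_entries([0, 0, 2, 6, 0, 1, 4], [0, 1, 1, 1, 0, 1, 1])
print(packed)                    # [0, 1, 5, 13, 0, 3, 9]
print(unpack_entry(packed[3]))   # (6, 1)
```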

The complete loop for both the read and write cycles, as well as the pointer update, is as follows:

ml = (D − 1) * (I − 1) / 2
wp = 0
b = start of memory
do forever
    for j = 0 : I − 1
        if (fl(j) = 0)
            out = in
        else
            rp = b + (wp + drp(j))_ml
            out = mem(rp)
        endif
    endfor
    for j = 0 : I − 1
        if (dwp(j) ≠ 0)
            wp = b + (wp + dwp(j))_ml
            mem(wp) = in
        endif
    endfor
    wp = (wp + 1)_ml
enddo.

The read pointer is computed using the following procedure:

ml = (D − 1) * (I − 1) / 2;
for i = 0 : I − 1
    rowindx = 0;
    Dsum = 0;
    for j = 0 : I − 1
        dw = int(Dsum / I)
        rd = Dsum − I * dw
        dr = (rowindx)_ml
        rowindx = rowindx + dw
        if (rd = i)
            dpr(i, 0 : 1) = [dr, (int(Dsum / I) ≠ 0)]
            break
        else
            Dsum = Dsum + D;
        end
    end
end.

The write pointer for an index n can be computed in line using:

Dsum = 0;
for i = 0 : n − 1
    dw = fix(Dsum / I)
    Dsum = Dsum + D
end.

The final step is the implementation of this method in an xDSL modem. Typically, the memory of such devices is implemented as a rectangular array of n rows by m columns. Thus, the memory addresses in the read and write pointers have to be translated to these coordinates. This is readily accomplished by methods well known to those of ordinary skill in the art. Once the number of rows (or columns) of the array are determined as nrows (or ncolumns), the indices are computed as:

row address=int(pointer/ncolumns), and
column address=(pointer)_ncolumns.

In the example above, a memory of nine locations is used. This can be mapped to a square memory of three rows by three columns. Thus, address 4 maps to memory location (1,1), while address 8 maps to memory location (2,2), and so on. Mapping the pointer addresses to the address memory locations provides the following array:

        column
row     0    1    2
0       0    1    2
1       3    4    5
2       6    7    8.
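The translation from a linear pointer to array coordinates can be sketched with a single division. The sketch assumes a row-major layout, which matches the 3 by 3 example above; the function name `to_row_col` is hypothetical.

```python
def to_row_col(pointer, ncolumns):
    """Map a linear address to (row, column) in a row-major array."""
    return divmod(pointer, ncolumns)


print(to_row_col(4, 3))   # (1, 1)
print(to_row_col(8, 3))   # (2, 2)

# Rebuild the 3 by 3 layout of the nine-location example memory.
layout = [[row * 3 + col for col in range(3)] for row in range(3)]
print(layout)             # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
```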

Although the present invention has been illustrated and described herein with reference to specific examples and preferred embodiments thereof, it will be readily apparent to those of ordinary skill in the art that other examples and embodiments may perform similar functions and/or achieve similar results. All such equivalent examples and embodiments are within the spirit and scope of the present invention, are contemplated thereby, and are intended to be covered by the following claims.

Claims

1. A method for implementing a general convolutional interleaver, with no constraints, in an efficient manner, using (D−1)*(I−1)/2 memory locations for the interleaved data in all cases, the method comprising: dividing an incoming data stream into blocks of I bytes; mapping each member of a block into a set of first-in, first-out shift registers (FIFOs) arranged in rows, wherein the number of elements in a row j is given by: nd(j)=int(j·D/I), j=0, . . . ,I−1, wherein int(x) is an integer part of x; wherein, as each element is entered, a FIFO is shifted to the right and a last element is read out to an output stream; and wherein the order in which the elements are read is different from the order in which they are written.
2. The method of claim 1, wherein indices of rows read is given by id(j):
3. The method of claim 1, wherein, for those rows where nd(j)=0, no data is stored and input data is directly passed to an output.
4. The method of claim 1, wherein the method is implemented in an integrated circuit device.
5. The method of claim 1, wherein the shift registers are mapped to a general memory structure.
6. The method of claim 1, wherein the shift registers are mapped to a linear memory array comprising two pointers, a write pointer offset to write data to the linear memory array and a read pointer offset to read the data.
7. The method of claim 6, wherein, for each block, the pointers cycle through I values.
8. The method of claim 6, wherein the write pointer offset is defined as the number of elements in each row of the FIFOs:
9. The method of claim 8, wherein a flag is defined to indicate if a target row to be read has zero elements, as follows:
10. The method of claim 9, wherein the process is started by setting wp to zero and reading I bytes from the memory at locations specified by the read pointer, except that reads corresponding to rows with zero bytes (dwp=0) are taken directly from an input stream.
11. The method of claim 10, further comprising designating a next input from the input stream as “in” and a next output to the output stream as “out”, wherein a read operation is:
12. The method of claim 11, further comprising writing I bytes to the memory at locations specified by the write pointer, with the exception that no data is written for rows corresponding to dwp=0, wherein a write is:
13. The method of claim 12, further comprising incrementing wp by 1 modulo ml and repeating a cycle.
14. The method of claim 6, wherein the read and write pointers are computed in line.
15. The method of claim 6, wherein the read pointer and a flag are pre-computed and stored in an array of maximum size I by 2, and wherein an array address comprises two values: the read pointer and the flag.
16. The method of claim 1, wherein the method is implemented in a digital subscriber line (xDSL) modem.
17. An apparatus for implementing a general convolutional interleaver, with no constraints, in an efficient manner, using (D−1)*(I−1)/2 memory locations for the interleaved data in all cases, the apparatus comprising: means for dividing an incoming data stream into blocks of I bytes; means for mapping each member of a block into a set of first-in, first-out shift registers (FIFOs) arranged in rows, wherein the number of elements in a row j is given by: nd(j)=int(j·D/I), j=0, . . . ,I−1, wherein int(x) is an integer part of x; wherein, as each element is entered, a FIFO is shifted to the right and a last element is read out to an output stream; and wherein the order in which the elements are read is different from the order in which they are written.
18. The apparatus of claim 17, wherein indices of rows read is given by id(j):
19. The apparatus of claim 17, wherein, for those rows where nd(j)=0, no data is stored and input data is directly passed to an output.
20. The apparatus of claim 17, wherein the apparatus is an integrated circuit device.
21. The apparatus of claim 17, wherein the shift registers are mapped to a general memory structure.
22. The apparatus of claim 17, wherein the shift registers are mapped to a linear memory array comprising two pointers, a write pointer offset to write data to the linear memory array and a read pointer offset to read the data.
23. The apparatus of claim 17, wherein the apparatus is a digital subscriber line (xDSL) modem.

CROSS-REFERENCE TO RELATED APPLICATION(S)

The present non-provisional patent application/patent claims the benefit of priority of U.S. Provisional Patent Application No. 60/631,775, filed on Nov. 30, 2004, and entitled “METHOD AND APPARATUS FOR THE EFFICIENT IMPLEMENTATION OF A TOTALLY GENERAL CONVOLUTIONAL INTERLEAVER IN DMT-BASED xDSL SYSTEMS,” which is incorporated in-full by reference herein.

Provisional Applications (1)

	Number	Date	Country
	60631775	Nov 2004	US
