Digital data transmitted over communication channels with impairments such as noise, distortions, and fading is inevitably delivered to the user with some errors. A similar situation occurs when digital data is stored on devices such as magnetic or optical media or solid-state memories that contain imperfections. The rate at which errors occur, referred to as the bit-error rate (BER), is a very important design criterion for digital communication links and for data storage. The BER is usually defined to be the ratio of the number of bit errors introduced to the total number of bits. Usually the BER must be kept smaller than a given preassigned value, which depends on the application. Error correction techniques based on the addition of redundancy to the original message can be used to control the error rate.
The amount of redundancy inserted by the code employed by the encoder is usually expressed in terms of the code rate R, the ratio of the number of information symbols (e.g., bits) l in a block to the total number of transmitted symbols n in the codeword. That is, n = l + (number of redundant symbols), so n > l and, equivalently, R = l/n < 1. For example, a code that maps 4 information bits into 7-bit codewords has rate R = 4/7.
The most obvious form of redundancy is simple repetition of the bits in a message. This technique, however, is typically impractical because repetition drives the code rate very low. Accordingly, more efficient coding mechanisms for introducing redundancy have been developed, including block codes and convolutional codes. With block codes, the encoder breaks the continuous sequence of information bits into l-bit sections or blocks, and then operates on these blocks independently according to the particular code used. In contrast, convolutional codes operate on the information sequence without breaking it up into independent blocks; the encoder processes the information continuously and associates each long (perhaps semi-infinite) information sequence with a code sequence containing more symbols.
Block codes are characterized by three parameters: the block length n, the information length l, and the minimum distance d. The minimum distance is the number of positions in which the two most similar codewords differ. Ideally, the minimum distance d is relatively large.
Conceptually, for block codes the encoder 12 multiplies the l-bit message word m by an l×n generator matrix G to produce the n-symbol codeword c; that is, m×G = c.
There are several known techniques for constructing the generator matrix G. These include Hamming codes, BCH codes, and Reed-Solomon codes. Another known code is the low density parity check (LDPC) code, developed by Gallager in the early 1960's. With block codes, a parity check matrix H of size (n−l)×n exists such that the transpose of H (i.e., H^T), when multiplied by G, produces a null set; that is, G×H^T = 0. The decoder multiplies the received codeword c by the transpose of H, i.e., computes c×H^T. The result, often referred to as the "syndrome," is a 1×(n−l) matrix of all 0's if c is a valid codeword.
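For illustration, the following sketch checks these relationships numerically using a standard textbook (7,4) Hamming code in systematic form (this particular G and H are illustrative only, not a code from this disclosure):

    import numpy as np

    # Systematic (7,4) Hamming code: G = [I | P] and H = [P^T | I], a
    # standard textbook construction used here only to show the algebra.
    P = np.array([[1, 1, 0],
                  [0, 1, 1],
                  [1, 1, 1],
                  [1, 0, 1]])
    G = np.hstack([np.eye(4, dtype=int), P])    # l x n = 4 x 7
    H = np.hstack([P.T, np.eye(3, dtype=int)])  # (n-l) x n = 3 x 7

    assert not (G @ H.T % 2).any()              # G x H^T = 0 (mod 2)

    m = np.array([1, 0, 1, 1])                  # 1 x l message word
    c = m @ G % 2                               # codeword c = m x G
    syndrome = c @ H.T % 2                      # 1 x (n-l) syndrome
    print(syndrome)                             # all zeros: c is valid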
For LDPC codes, the parity check matrix H has very few 1's. The term "column weight," often denoted as j, refers to the number of 1's in a column of H, whereas the term "row weight," denoted as k, refers to the number of 1's in a row. An LDPC code can be represented by a bipartite graph, called a Tanner graph, that has as many branches as the number of non-zero elements in the parity check matrix. Gallager showed that with a column weight j ≥ 3 (i.e., three or more 1's in each column of matrix H), the minimum distance d increases linearly with n for a given column weight j and row weight k, whereas for a column weight of j=2 the minimum distance d can increase at most logarithmically with the block length.
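The correspondence between H and its Tanner graph can be sketched as follows (the small H here is an arbitrary illustration, not a matrix from this disclosure):

    import numpy as np

    # An arbitrary small parity check matrix, for illustration only.
    H = np.array([[1, 1, 0, 1, 0, 0],
                  [0, 1, 1, 0, 1, 0],
                  [1, 0, 1, 0, 0, 1]])

    col_weights = H.sum(axis=0)  # j for each bit node (column)
    row_weights = H.sum(axis=1)  # k for each check node (row)

    # One Tanner-graph branch per non-zero entry of H: branch (r, c)
    # joins check node r to bit node c.
    branches = [(r, c) for r, c in zip(*np.nonzero(H))]
    print(len(branches) == H.sum())  # True: one branch per 1 in H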
For data storage applications, the corrected bit-error rate (BER) (i.e., the BER after error correction) is preferably on the order of 10⁻¹² to 10⁻¹⁵. Bit errors can be introduced in data storage applications by mistracking, fly-height variation of the read head relative to the recording medium, high bit density, and low signal-to-noise ratio (SNR). Today, the goal of data storage applications is to realize storage densities of 1 Tbit/in² and higher. Such a high bit density generates greater intersymbol interference (ISI), which complicates the task of realizing such low BERs. Further, at such high bit densities, the physical space each bit occupies on the recording medium becomes increasingly small, resulting in low signal strengths and thereby decreasing the SNR. In addition, computationally complex encoding schemes make the associated decoding operation computationally complex, making it difficult for the decoder to keep up with desired high data rates (such as 1 Gbit/s).
Accordingly, there exists a need for a code that can achieve corrected BERs of 10⁻¹² to 10⁻¹⁵ despite the complications of large ISI and low SNR associated with higher bit densities, such as 1 Tbit/in². Further, there exists a need for such a coding scheme to permit encoding and decoding at high data rates.
In one general respect, the present invention is directed to a method for encoding binary data. The encoding may be part of, for example, a data storage system or a data communications system. According to one embodiment, the method includes multiplying a message word with a generator matrix, wherein the generator matrix multiplied by the transpose of a parity check matrix for a low density parity check code yields a null set, and wherein the parity check matrix has a column weight of two. Further, the parity check matrix may be quasi-cyclic. The quasi-cyclic nature of the parity check matrix can simplify and thus speed up the encoder and decoder hardware. Such a quasi-cyclic parity check matrix, with a column weight of two, permits high rate codes of moderate codeword lengths and associated graphs that are free of 4-cycles and 6-cycles. In addition, utilizing such a quasi-cyclic parity check matrix with a column weight of two seems to offer more compatibility with, for example, outer Reed-Solomon codes. According to one embodiment, the parity check matrix may have a girth of twelve, where “girth” refers to the number of branches in the shortest cycle in the Tanner graph representing the code.
In another general respect, the present invention is directed to a coded data system. According to one embodiment, the system includes an encoder for encoding a message word by multiplying the message word with a generator matrix, wherein the generator matrix multiplied by the transpose of a parity check matrix for a low density parity check code yields a null set, and wherein the parity check matrix has a column weight of two. The parity check matrix may be quasi-cyclic. In addition, the system may further include a decoder in communication with the encoder via a channel. According to one embodiment, the parity check matrix may have a girth of twelve.
In another general respect, the present invention is directed to a method of encoding binary data including, according to one embodiment, receiving a message word and adding a plurality of redundancy bits to the message word to thereby generate a codeword. The redundancy bits are added based on a three-tier Tanner graph having a girth of twelve. Such an encoding scheme facilitates pipelined processing.
Embodiments of the present invention will be described in conjunction with the following figures, wherein:
FIGS. 6a–c are histograms from simulations showing the number of blocks having different numbers of errors using an LDPC code with a column weight of j=2, as a function of signal-to-noise ratio, bit error rate, and the total number of blocks simulated;
FIGS. 7a–c are histograms from simulations showing the number of blocks having different numbers of errors using an LDPC code with a column weight of j=3, as a function of signal-to-noise ratio, bit error rate, and the total number of blocks simulated;
The input binary data may be a message word m of length l; that is, m is a 1×l matrix. The LDPC encoder 22 multiplies m by a generator matrix G to produce the codeword c. The generator matrix G is an l×n matrix, where n > l. For certain applications, n may be on the order of several thousand, such as on the order of 4000. The code rate is R = l/n. According to one embodiment, the LDPC encoder 22 may be implemented with a series of shift registers to perform the encoding.
The codeword c is transmitted over the channel 24, which can include, for example, a digital communication link (such as a microwave link or a coaxial cable) or a data storage system (such as a magnetic or optical disk drive). The sampler 26 may periodically sample the analog signal received over the channel 24, based on a clock signal received from the clock 28, to generate digital samples of the received signal. The digital samples are provided to the LDPC decoder 30, which decodes them to, ideally, recover the exact data bit sequence m provided to the LDPC encoder 22. The LDPC decoder 30 decodes the received codeword c based on preexisting knowledge of the parity check matrix H. According to one embodiment, the LDPC decoder 30 may be implemented with a digital signal processor (DSP) employing soft iterative decoding according to, for example, a sum-product (sometimes referred to as message passing) algorithm, as described in, for example, Kschischang et al., "Factor Graphs and the Sum-Product Algorithm," IEEE Transactions on Information Theory, 2001, which is incorporated herein by reference.
For LDPC systems, G×H^T = 0, where H is the parity check matrix; this is the case for all linear block codes. According to an embodiment of the present invention, H is an (n−l)×n matrix having a column weight of two (i.e., j=2). That is, the parity check matrix H has two, and only two, 1's per column. In addition, the 1's may be placed in the parity check matrix H according to a predetermined distribution, rather than being randomly located.
Consider a parity check matrix H having v rows (0 to v−1) and n columns (0 to n−1), where n = rv and r is an integer greater than zero. That is, H may be considered to comprise r v×v sub-matrices, as illustrated in the accompanying figure. Define the set
s = {a1, a2, …, ar}, 0 < a1 < a2 < … < ar < v,
where the elements of s are the locations of the second 1's in the 0th column of each sub-matrix Mi, 1 ≤ i ≤ r, as illustrated in the accompanying figure.
At block 240, i is set equal to 3. Next, at step 250, a3 is chosen using the two constraints above from step 230, with the additional constraint that:
Without loss of generality, choose a1 = 1 at step 210. Then a v×v square sub-matrix M1 is obtained according to the process illustrated in the accompanying figure.
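A minimal sketch of this construction follows, assuming each sub-matrix Mi is a v×v circulant with one 1 on its main diagonal and its second 1 on the diagonal cyclically offset by ai (the particular v and offsets below are illustrative assumptions):

    import numpy as np

    def quasi_cyclic_h(v, offsets):
        """Column-weight-2 quasi-cyclic H = [M1 | M2 | ... | Mr], where
        each Mi is I plus a cyclic shift of I by a_i rows.  A sketch of
        one plausible reading of the construction; the offsets must be
        distinct and lie strictly between 0 and v."""
        blocks = []
        for a in offsets:
            m = np.eye(v, dtype=int) + np.roll(np.eye(v, dtype=int), a, axis=0)
            blocks.append(m % 2)    # second 1 in column 0 sits at row a_i
        return np.hstack(blocks)

    H = quasi_cyclic_h(v=7, offsets=[1, 2, 3])  # illustrative v and s
    assert (H.sum(axis=0) == 2).all()           # every column has weight 2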
Step 1. Calculate a vector p using, for example, a linear shift register.
Step 2. Compute the parity bits x using sub-matrix M1 and vector p as follows:
where ⊕ denotes the XOR operation.
The above calculation of the parity bits x may be readily implemented using, for example, a flip-flop circuit, by initializing the register with the information bit m1 and applying the input sequence p.
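The flip-flop description suggests a running-XOR recursion; the following is a minimal sketch assuming parity bits of the form xk = xk−1 ⊕ pk with the register initialized to m1 (the exact recurrence is an assumption, since the Step 2 equation is not reproduced above):

    def parity_bits(m1, p):
        """Running-XOR sketch: a register starts at information bit m1
        and is XORed with each element of the input sequence p in turn,
        mimicking a single flip-flop.  The recurrence is an assumption."""
        x, reg = [], m1
        for bit in p:
            reg ^= bit      # flip-flop update: reg <- reg XOR p_k
            x.append(reg)
        return x

    print(parity_bits(1, [0, 1, 1, 0]))  # [1, 0, 1, 1]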
Using a parity check matrix H with a column weight of j=2, constructed as above, has the advantage of eliminating 4-cycles and 6-cycles in the associated Tanner graph. Typically, the larger the girth the better, because the iterative decoder can perform more iterations before short cycles cause the messages it passes to become correlated.
In addition, because of the quasi-cyclic nature of the parity check matrix H, the present invention may permit the matrix H to be completely described by a small set of numbers, which may greatly reduce the memory and bandwidth requirements of a hardware implementation of the encoder/decoder. Further, utilizing a column weight of two potentially results in less computation and fewer memory accesses by the encoder 22 and the decoder 30 than in systems where j ≥ 3. Additionally, simulation has indicated that using a parity check matrix H with a column weight of j=2 provides acceptable bit-error-rate (BER) performance at low signal-to-noise ratios (SNRs), at higher storage densities for digital recording channels, and at higher transmission rates for digital communication channels.
According to one embodiment, the outer encoder 40 may be a Reed-Solomon encoder, i.e., an encoder that employs a Reed-Solomon error correction code. Reed-Solomon codes are described in Wicker et al., eds., Reed-Solomon Codes and Their Applications, IEEE Press, 1994, which is incorporated herein by reference. In addition, the outer decoder 42 may be a Reed-Solomon decoder that is provisioned to decode the redundancy introduced by the Reed-Solomon outer encoder 40.
According to another embodiment, the outer encoder 40 may be an LDPC encoder where the column weight j ≥ 3. For such an embodiment, the outer decoder 42 may be an LDPC decoder provisioned to decode the redundancy introduced by the outer LDPC encoder 40.
FIGS. 6a–c and 7a–c illustrate the compatibility of utilizing an LDPC encoder 22 with a column weight of j=2 in conjunction with an outer Reed-Solomon code.
FIGS. 7a–c illustrate similar block statistics for an LDPC code with a column weight of j=3. These figures illustrate that some blocks have more than 100 errors.
As another aspect of the present invention, consider a p-tier Tanner graph for any (n, j, k) LDPC code, as shown in the accompanying figure. To construct a graph of girth g = 4p, all the bit nodes on the p-tier graph must be distinct, which gives the following lower bound on the codeword length:
n ≥ k(k−1)^(p−1)(j−1)^(p−1) + … + k(k−1)(j−1) + k    (1)
Similarly, to construct a graph of girth g = 4p+2, all the check nodes on the p-tier graph must be distinct, which gives the following lower bound on the codeword length:
n ≥ [k²(k−1)^(p−1)(j−1)^p + … + k²(j−1) + k]/j    (2)
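As a numerical check of these bounds as reconstructed above, the following sketch evaluates both; for j=2, k=4, p=3, bound (1) gives n ≥ 52:

    def bound_girth_4p(j, k, p):
        """Bound (1): n >= k(k-1)^(p-1)(j-1)^(p-1) + ... + k(k-1)(j-1) + k."""
        return sum(k * ((k - 1) * (j - 1)) ** t for t in range(p))

    def bound_girth_4p_plus_2(j, k, p):
        """Bound (2): n >= [k^2(k-1)^(p-1)(j-1)^p + ... + k^2(j-1) + k] / j."""
        total = k + sum(k * k * (k - 1) ** (t - 1) * (j - 1) ** t
                        for t in range(1, p + 1))
        return total / j

    print(bound_girth_4p(j=2, k=4, p=3))         # 52 (girth 12)
    print(bound_girth_4p_plus_2(j=2, k=4, p=2))  # 34.0 (girth 10)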
To construct graphs having girth g = 12, all the bit nodes on the 3-tier graph must be distinct, as shown in the accompanying figure.
If k−1 is a prime number, square matrices Qi, i = 1, 2, …, k, of size (k−1)×(k−1), constructed following the steps described below, for example, can be used to establish the connections while avoiding short cycles of length 10 or less.
Step 1. Find a primitive element α for the Galois Field GF(k−1). Primitive elements can be found in references such as Error Control Coding, by S. Lin and D. Costello, Prentice-Hall, 1983, which is incorporated herein by reference.
Step 2. Let Q1 be the (k−1)×(k−1) matrix containing the integers 1 through (k−1)² arranged column by column, and let Q2 be the (k−1)×(k−1) matrix containing the same integers arranged row by row.
Step 3. Form column vectors ω̄i, i = 3, 4, …, k, of size (k−1)×1.
Step 4. Construct matrices Qi, i = 3, 4, …, k, as
Qi = Q2 Θ ω̄i,
where Θ denotes the left circular shift operation; i.e., the first row in Qi is obtained by ω̄i,1 left circular shifts of the first row in Q2, the second row in Qi is obtained by ω̄i,2 left circular shifts of the second row in Q2, and so on.
Step 5. Connections between the bit nodes in the ith group and the check nodes on the third tier are established according to the mapping matrices Qi, i = 1, 2, …, k. Without loss of generality, the positions of the check nodes in the bottom tier can be ordered as 1, 2, …, (k−1)² from left to right. We read out the (k−1)² numbers in matrix Qi column by column to get a 1×(k−1)² vector [q1 q2 … q(k−1)²], and connect the mth bit node in the ith group to the qmth check node.
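Steps 1 through 5 might be sketched as follows, under assumptions inferred from the worked example later in this section rather than stated explicitly above: Q2 holds the integers 1 through (k−1)² row by row, Q1 is its transpose, and the shift vector for Qi is α^(i−3)·[0, 1, …, k−2] mod (k−1):

    import numpy as np

    def mapping_matrices(k, alpha):
        """Sketch of Steps 1-5 for prime k-1.  The exact forms of Q1, Q2
        and the shift vectors are assumptions inferred from the worked
        example: Q2 holds 1..(k-1)^2 row by row, Q1 is its transpose, and
        the shift vector for Qi is alpha**(i-3) * [0,...,k-2] mod (k-1)."""
        q = k - 1
        q2 = np.arange(1, q * q + 1).reshape(q, q)      # row by row
        mats = [q2.T.copy(), q2.copy()]                 # Q1, Q2
        for i in range(3, k + 1):
            w = (alpha ** (i - 3) * np.arange(q)) % q   # left-shift counts
            mats.append(np.array([np.roll(q2[r], -int(w[r]))
                                  for r in range(q)]))  # Qi = Q2 shifted
        return mats

    def readout(qmat):
        """Read a mapping matrix column by column: the m-th bit node of
        the group connects to the readout[m]-th check node."""
        return list(qmat.T.flatten())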
Starting with an arbitrary bit node, the Tanner graph can be arranged into the three tiers described above. Assume the (k−1)³ bit nodes on the third tier are information bits, such as, for example, from a received message word. Suppose the parity bit pi on the second tier shares a check node with the bit nodes xi,1, xi,2, …, xi,k−1 on the third tier. Then

pi = xi,1 ⊕ xi,2 ⊕ … ⊕ xi,k−1,

where ⊕ denotes the XOR operation. Similarly, suppose the parity bit qi′ on the first tier shares a check node with the parity bits pi′,1, pi′,2, …, pi′,k−1 on the second tier. Then

qi′ = pi′,1 ⊕ pi′,2 ⊕ … ⊕ pi′,k−1.

In this way, all of the parity bits can be computed tier by tier directly from the graph connections.
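A minimal sketch of this tier-by-tier parity computation, using a hypothetical grouping in which consecutive information bits share a check node (in the actual code the grouping is given by the mapping matrices Qi):

    from functools import reduce

    def xor_all(bits):
        """XOR-accumulate a list of 0/1 bits."""
        return reduce(lambda a, b: a ^ b, bits)

    # Hypothetical tiers for k = 4: each second-tier parity covers
    # k-1 = 3 information bits; the first-tier parity covers the k-1
    # second-tier parities.
    info = [1, 0, 1, 0, 0, 1, 1, 1, 0]                    # third-tier bits
    p = [xor_all(info[i:i + 3]) for i in range(0, 9, 3)]  # second tier
    q = xor_all(p)                                        # first tier
    print(p, q)                                           # [0, 1, 0] 1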
As described above, the encoding of cycle codes is based on the parity check matrix. This is particularly important for iterative soft decoding, where the decoding process is also based on the parity check matrix. Thus, the encoding and decoding can be unified and performed more efficiently in a hardware implementation, without allocating additional resources to compute the generator matrix that is often used for encoding.
Consider the following example, described with reference to the accompanying figures, in which k = 4 (so that k−1 = 3 is prime).
Step 1. Find a primitive element α of GF(k−1) = GF(3). It is easy to check that α = 2 is a primitive element of GF(3).
Step 2. Construct 3×3 matrices Q1 and Q2 by arranging the integers 1 through 9 column by column and row by row, respectively (rows separated by semicolons):

Q1 = [1 4 7; 2 5 8; 3 6 9], Q2 = [1 2 3; 4 5 6; 7 8 9].
Step 3. Form column vectors ω̄i of size 3×1 for i = 3, 4, with entries computed mod (k−1). Therefore, ω̄3 = [0 1 2]^T and ω̄4 = [0 2 1]^T, the entry ω̄i,r giving the number of left circular shifts applied to the rth row of Q2 in Step 4.
Step 4. Construct Q3 = Q2 Θ ω̄3 and Q4 = Q2 Θ ω̄4:
Q3 = [1 2 3; 5 6 4; 9 7 8]; i.e., [5 6 4] is obtained by 1 left circular shift of [4 5 6], and [9 7 8] is obtained by 2 left circular shifts of [7 8 9].
Q4 = [1 2 3; 6 4 5; 8 9 7]; i.e., [6 4 5] is obtained by 2 left circular shifts of [4 5 6], and [8 9 7] is obtained by 1 left circular shift of [7 8 9].
Step 5. Make the connections according to the mapping matrices.
i=1: connect the bit nodes in the 1st group to the check nodes.
Read out the (k−1)² = 9 numbers in matrix Q1 column by column, resulting in [1 2 3 4 5 6 7 8 9], and connect the 1st check node with the 1st bit node, the 2nd check node with the 2nd bit node, …, the 9th check node with the 9th bit node.
i=2: connect the bit nodes in the 2nd group to the check nodes.
Read out the (k−1)² = 9 numbers in matrix Q2 column by column, resulting in [1 4 7 2 5 8 3 6 9], and connect the 1st check node with the 1st bit node, the 4th check node with the 2nd bit node, the 7th check node with the 3rd bit node, …, the 9th check node with the 9th bit node.
i=3: connect the bit nodes in the 3rd group to the check nodes.
Read out the (k−1)² = 9 numbers in matrix Q3 column by column, resulting in [1 5 9 2 6 7 3 4 8], and connect the 1st check node with the 1st bit node, the 5th check node with the 2nd bit node, the 9th check node with the 3rd bit node, …, the 8th check node with the 9th bit node.
Finally, for i=4, connect the bit nodes in the 4th group to the check nodes according to Q4, i.e., using the vector [1 6 8 2 4 9 3 5 7].
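As a check, the following sketch hard-codes Q2 and the shift counts recited in Step 4 above and mechanically reproduces the four connection vectors of this example:

    import numpy as np

    q2 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    mats = {1: q2.T, 2: q2}
    shifts = {3: [0, 1, 2], 4: [0, 2, 1]}   # left-shift counts per row

    for i, w in shifts.items():
        mats[i] = np.array([np.roll(q2[r], -w[r]) for r in range(3)])

    for i in (1, 2, 3, 4):                  # read out column by column
        print(i, list(mats[i].T.flatten()))
    # 1 [1, 2, 3, 4, 5, 6, 7, 8, 9]
    # 2 [1, 4, 7, 2, 5, 8, 3, 6, 9]
    # 3 [1, 5, 9, 2, 6, 7, 3, 4, 8]
    # 4 [1, 6, 8, 2, 4, 9, 3, 5, 7]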
Once the connections are established, we may label the check nodes and bit nodes as shown, for example, in the accompanying figure.
As is evident from the above example, an LDPC encoder can add redundancy bits to a received message word based on such a three-tier Tanner graph with a girth g = 12. Moreover, the three-tier Tanner graph encoding scheme may facilitate pipelined processing by the encoder. That is, the encoder may operate on a first received message word at the lowest (third) tier of the Tanner graph while the parity bits of a previously received message word are computed at the higher tiers, so that successive message words are processed concurrently at different tiers.
Although the present invention has been described herein with respect to certain embodiments, those of ordinary skill in the art will recognize that many modifications and variations of the present invention may be implemented. The foregoing description and the following claims are intended to cover all such modifications and variations.