This application claims priority from European patent application No. 03425172.8, filed Mar. 19, 2003, which is incorporated herein by reference.
In its more general aspect, embodiments of the present invention relate to methods and systems for applying the self-corrector code theory to digital information coded as symbol sequences, for example in the Boolean logic, stored in electronic memory systems or transmitted from and to these systems.
More particularly, an embodiment of the invention relates to a method as above that provides for transmitting sequences incorporating an error corrector code portion, so that, when a sequence is received, the sequence that is most probably the one originally transmitted can be restored by calculating an error syndrome using a parity matrix.
In the specific technical field of communication systems, such as communication system 100 shown in
In substance, a sequence x of Boolean symbols transmitted by a transmitter 102 through a communication channel 104 undergoing noise can be received at a receiver 106 as a different sequence y from which it is necessary to go back to the initial sequence x.
Traditionally, the sequence x of symbols to be transmitted comprises an additional or redundant portion including an error corrector code, which allows the message that is most probably the original one to be restored when received, even in the presence of errors.
These error corrector codes are based on well known mathematical theories, such as for example the Hamming code theory, which are presently applied in several contexts wherein it is necessary to remedy noise in communication channels.
For a better understanding of all aspects of the present invention, a detailed description of the most used methods for correcting errors in digital information coded as symbol sequences in the Boolean logic is illustrated hereinafter.
0.1 Basic Definitions
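Definition 1 Given m·n real numbers, a table like the following one is called a matrix of the type [m×n]:
$$M = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$
Definition 2 The transpose of the above matrix, indicated with $M^T$, is the matrix:
$$M^T = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{pmatrix}$$
obtained from M by exchanging, in order, rows with columns.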
Definition 3 A square matrix M of order n×n is considered. Fixing an element $a_{ik}$ of the matrix M and deleting the row and the column that cross at that element (the i-th row and the k-th column), a square matrix of order (n−1)×(n−1) is obtained, whose determinant is called the complementary minor of $a_{ik}$ and will be indicated with $M_{ik}$.
Definition 4 The determinant of the second order matrix is the number:
$$a_{11}a_{22} - a_{12}a_{21}$$
Definition 5 The determinant of a matrix of order n is:
$$\det M = \sum_{k=1}^{n} (-1)^{i+k}\, a_{ik}\, M_{ik}$$
for any fixed row i.
Definition 6 The square matrix having 1 as elements $a_{ii}$ and 0 elsewhere is called the identity matrix and is indicated with I.
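By way of non-limiting illustration, Definitions 3 to 5 can be sketched in Python as follows. The sketch assumes an expansion along the first row; the function names `minor` and `determinant` are hypothetical and are not part of the original description.

```python
def minor(matrix, i, k):
    """Sub-matrix obtained by deleting row i and column k (0-based indices)."""
    return [row[:k] + row[k + 1:] for r, row in enumerate(matrix) if r != i]


def determinant(matrix):
    """Determinant computed by expansion along the first row (assumed choice)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    if n == 2:
        # Definition 4: a11*a22 - a12*a21
        return matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]
    total = 0
    for k in range(n):
        sign = -1 if k % 2 else 1  # (-1)^(i+k) with i fixed to the first row
        total += sign * matrix[0][k] * determinant(minor(matrix, 0, k))
    return total


identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(determinant(identity3))         # 1
print(determinant([[1, 2], [3, 4]]))  # -2
```

The recursive form mirrors the definitions directly and is intended for small matrices only.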
Definition 7 A group G is a set in which an operation * is defined, for which G is closed for *, i.e. if g ∈ G and h ∈ G, then g*h ∈ G.
Definition 8 If the operation * is the sum the group is called additive
Definition 9 A group is called abelian if the operation * is commutative
Definition 10 The set {0,1,2, . . . , p−1} is called the remainder class (mod p) and is indicated with $Z_p$; its characteristic property is that in these classes p coincides with the identity, i.e. p ≡ 0.
Definition 11 A Boolean group is a binary group, i.e. a group containing only the numbers 0 and 1 and 1+1=0.
Definition 12 A set of vectors $v_1, \ldots, v_k$ is linearly dependent if and only if there are scalars $c_1, \ldots, c_k$, not all equal to 0, so that $c_1v_1 + c_2v_2 + \ldots + c_kv_k = 0$.
Definition 13 A family of vectors is called a basis of the space if it is a generating family, i.e. any other vector of the space is a linear combination of these vectors, and it is composed of linearly independent vectors.
0.1.1 Codes
The self-corrector code theory, a branch of information theory, originally arose to solve some practical problems in the communication of coded digital information. A message is considered as a block of symbols of a finite alphabet; it is usually a sequence of 0 and 1, but it can also be any number, a letter or a complete sentence. The message is transmitted through a communication channel subject to noise. The aim of the self-corrector code theory is to add redundant terms to the message so that the original message can be recovered if the transmitted message has been damaged. First of all, a distinction must be made between diagnosing and correcting errors: diagnostics detects the presence of an error, while correction both detects and corrects the error.
Each message, called c, consists of k information digits. The coding turns, according to certain rules, each input message c into a binary n-tuple x with n>k.
This binary n-tuple x is the code word of the message c. During the transmission some errors can occur, so that a binary n-tuple y is received:
c→x→channel→y
The space V of all n-tuples of 0 and 1 is now considered, with vectors added component by component modulo 2.
Definition 14 A linear binary code [n,k] is the set of all linear combinations of k(≠0) independent vectors in V. Linear means that if two or more vectors are in the code, also their sum is therein.
Definition 15 A generating matrix G for a linear code is a matrix k×n whose rows are a base for C.
Definition 16 A parity matrix H of a linear code is an (n−k)×n matrix such that $G \cdot H^T = 0$.
Definition 17 H is the parity matrix of a code C: $w \in C$ if and only if $wH^T = 0$.
Definition 18 G is said to be in standard form if $G = (I_k \mid P)$, where $I_k$ is the k×k identity matrix and P is a k×(n−k) matrix. If G is in the systematic or standard form, then the first k symbols of a word are called information symbols.
Theorem 19 If a code C [n,k] has a matrix $G = (I_k \mid P)$ in the standard form, then a parity matrix of C is $H = (-P^T \mid I_{n-k})$, where $P^T$ is the transpose of P and is an (n−k)×k matrix and $I_{n-k}$ is the (n−k)×(n−k) identity matrix.
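By way of non-limiting illustration, Theorem 19 can be checked numerically for a binary code with the following Python sketch. The [5,2] generating matrix `G_EXAMPLE` and the function names are hypothetical and chosen only for the example; over the Boolean group the sign of $-P^T$ is irrelevant, since −1 = 1 (mod 2).

```python
def parity_from_standard(G, k):
    """Build H = (-P^T | I_{n-k}) from G = (I_k | P), with arithmetic mod 2."""
    n = len(G[0])
    r = n - k
    P = [row[k:] for row in G]
    H = []
    for i in range(r):
        p_t_row = [P[j][i] for j in range(k)]  # i-th row of P^T
        identity_row = [1 if c == i else 0 for c in range(r)]
        H.append(p_t_row + identity_row)
    return H


def product_with_transpose_mod2(A, B):
    """Return A * B^T with every entry reduced modulo 2."""
    return [[sum(a * b for a, b in zip(row_a, row_b)) % 2 for row_b in B]
            for row_a in A]


# Hypothetical [5,2] generating matrix in standard form, for illustration only.
G_EXAMPLE = [
    [1, 0, 1, 1, 0],
    [0, 1, 0, 1, 1],
]

H_EXAMPLE = parity_from_standard(G_EXAMPLE, k=2)
print(H_EXAMPLE)
print(product_with_transpose_mod2(G_EXAMPLE, H_EXAMPLE))  # all entries are 0
```

Every row of G gives a zero product with every row of H, which is the condition a parity matrix has to satisfy.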
Systematic codes have the advantage that the data message is contained in the code word and can be read before decoding. For codes in the non-systematic form the message is no longer recognizable in the coded sequence and an inverter is needed to recover the data sequence.
Definition 20 Let C be a linear code with parity matrix H. Given a binary n-tuple x, $xH^T$ is called the syndrome of x.
Definition 21 The weight of a vector u is the number of its components that are different from 0.
Definition 22 The code minimum weight d is the weight of the vector different from 0 having the lowest weight in the code.
d is thus a measure of the “quality” of a code.
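By way of non-limiting illustration, the weight of Definition 21 and the minimum weight of Definition 22 can be computed by enumerating all code words, as in the following Python sketch. The [5,2] generating matrix `G_EXAMPLE` and the function names are hypothetical; exhaustive enumeration is practical only for small k.

```python
from itertools import product


def weight(u):
    """Definition 21: number of components of u different from 0."""
    return sum(1 for component in u if component != 0)


def codewords(G):
    """Yield every linear combination (mod 2) of the rows of G."""
    k = len(G)
    for coefficients in product([0, 1], repeat=k):
        yield tuple(
            sum(c * g for c, g in zip(coefficients, column)) % 2
            for column in zip(*G)
        )


def minimum_weight(G):
    """Definition 22: lowest weight among the non-zero code words."""
    return min(weight(word) for word in codewords(G) if any(word))


# Hypothetical [5,2] generating matrix, for illustration only.
G_EXAMPLE = [
    [1, 0, 1, 1, 0],
    [0, 1, 0, 1, 1],
]

print(sorted(codewords(G_EXAMPLE)))
print(minimum_weight(G_EXAMPLE))  # 3
```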
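A sphere $S_r(u)$ with radius r around a vector u is defined as $S_r(u) = \{v \in V \mid d(u,v) \le r\}$, where d(u,v) is the number of positions in which u and v differ.
Theorem 23 If d is the minimum weight of a code C, then C can correct at most
$$t = \left\lfloor \frac{d-1}{2} \right\rfloor$$
errors and vice versa.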
Corollary 24 C has a minimum weight d if d is the highest number such that every set of d−1 columns of the parity matrix H is linearly independent.
Suppose, for example, that a code in the systematic form correcting 2 errors is to be produced. The matrix H will be composed of the identity matrix and of a matrix $P^T$ having 4 linearly independent columns, i.e. such that the determinant of the sub-matrix composed of these four columns is ≠0. Therefore, according to the number of errors to be corrected, a matrix H with d−1 linearly independent columns is searched for. Given n and k, a code with the largest possible d is thus searched for, in order to correct more errors.
It is however possible to have vectors in V which are not comprised in any of these spheres.
Definition 25 A code C of minimum weight d is called perfect if all vectors in V are comprised in the spheres of radius $t = \lfloor (d-1)/2 \rfloor$ around the code words. In this case it can be said that the spheres cover the space.
For the given n and k they are the best codes.
Theorem 26 For a perfect binary code [n,k] to exist, n, k and t must satisfy the following equation:
$$2^{n-k} = \sum_{i=0}^{t} \binom{n}{i}$$
Generally,
Theorem 27 For a code [n,k] to exist, n, k and t must satisfy the following inequality, known as the Hamming inequality:
$$2^{n-k} \ge \sum_{i=0}^{t} \binom{n}{i}$$
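By way of non-limiting illustration, the counting condition behind Theorems 26 and 27 can be evaluated with the following Python sketch. It assumes the usual sphere-counting form of the condition: each sphere of radius t contains the sum of the binomial coefficients C(n,i) for i from 0 to t, and the $2^k$ disjoint spheres must fit in the $2^n$ words of V. The function names are hypothetical.

```python
from math import comb


def sphere_volume(n, t):
    """Number of binary n-tuples at distance at most t from a given word."""
    return sum(comb(n, i) for i in range(t + 1))


def satisfies_hamming_inequality(n, k, t):
    """Necessary condition for a binary [n,k] code correcting t errors."""
    return 2 ** (n - k) >= sphere_volume(n, t)


def is_perfect(n, k, t):
    """Equality case: the spheres of radius t exactly cover the space."""
    return 2 ** (n - k) == sphere_volume(n, t)


print(is_perfect(7, 4, 1))                    # True
print(satisfies_hamming_inequality(6, 1, 1))  # True
print(is_perfect(6, 1, 1))                    # False
print(satisfies_hamming_inequality(5, 4, 1))  # False
```

The inequality is only a necessary condition: parameters that satisfy it are not guaranteed to correspond to an existing code.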
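When the word y is received, the word x that was sent, and afterwards the data message c, are to be found. This is done with the following formula: $y = x + \xi_t \Rightarrow H(m + \xi_t) = H\xi_t$, where $\xi_t$ is a particular error class and m denotes the code word sent, for which Hm=0. If $H\xi_t \in H$, i.e. if it is one of the columns of H, then the wrong position can be identified.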
Supposing that an error occurs:
$m + \xi_i \Rightarrow H(m + \xi_i) = H\xi_i$
$H\xi_i \in H$? → wrong position: i
Supposing now that two errors occur:
$m + \xi_i + \xi_j \Rightarrow H(m + \xi_i + \xi_j) = H\xi_i + H\xi_j = s$
$\forall \xi_i \rightarrow H\xi_i + H\xi_j \in H$? → wrong positions: i and j
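The following practical example of single-error corrector codes (Hamming codes) is now examined. The Hamming code [7,4] described by the following generating matrix is considered:
$$G = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{pmatrix}$$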
The first 4 positions are considered as the information positions and the last 3 positions as redundancy positions. Therefore the first row is the message 1 0 0 0 and so on. All words are obtained by adding (mod 2) those rows. For example the message u=(1011) is coded as x=(1011010). The parity matrix H is considered:
$$H = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{pmatrix}$$
It must be noted that the matrix columns have been written so that the i-th column consists of the coefficients of the binary expansion of i, padded with 0 where necessary.
Suppose that the word x above is sent and that an error occurs. The word y=(1010010) is thus received. The syndrome is calculated:
$Hy^T = (1\,0\,0)$
(1 0 0) is the binary representation of 4; the wrong bit is therefore the fourth.
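By way of non-limiting illustration, the worked example above can be reproduced with the following Python sketch. The matrix `G` used here is an assumption: it is the systematic generating matrix consistent with the coded word x=(1011010) and with the parity matrix whose i-th column is the binary expansion of i. The function names are hypothetical.

```python
# Assumed systematic generating matrix, consistent with u=(1011) -> x=(1011010).
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]

# Parity matrix: the i-th column (i = 1..7) is the binary expansion of i,
# most significant coefficient in the first row.
H = [[(i >> shift) & 1 for i in range(1, 8)] for shift in (2, 1, 0)]


def encode(u):
    """Code word obtained by adding (mod 2) the rows of G selected by u."""
    return [sum(b * g for b, g in zip(u, column)) % 2 for column in zip(*G)]


def syndrome(y):
    """Syndrome H * y^T computed modulo 2."""
    return [sum(h * b for h, b in zip(row, y)) % 2 for row in H]


def correct_single_error(y):
    """Read the syndrome as a binary number and flip the bit in that position."""
    position = int("".join(str(bit) for bit in syndrome(y)), 2)
    corrected = list(y)
    if position:
        corrected[position - 1] ^= 1
    return corrected, position


x = encode([1, 0, 1, 1])
y = [1, 0, 1, 0, 0, 1, 0]
print(x)                        # [1, 0, 1, 1, 0, 1, 0]
print(syndrome(y))              # [1, 0, 0]
print(correct_single_error(y))  # ([1, 0, 1, 1, 0, 1, 0], 4)
```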
The ideal is thus to search for perfect codes, but they cannot always be found; moreover, codes able to distinguish an error of the 0→1 type from an error of the 1→0 type are desirable.
Although advantageous under many aspects, the methods presently used require adding a redundancy information portion which, the size of the single message to be coded being fixed, cannot be lower than a minimum indicated. A technical problem underlying embodiments of the present invention is to provide a linear code protecting digital information coded like binary symbol sequences and overcoming the limits of the solutions presently provided by the prior art.
According to one aspect of the invention, a coding is identified for a binary alphabet in non Boolean groups, i.e. in non binary groups.
More particularly, a method according to one embodiment of the invention allows error corrections to be performed on digital information coded as symbol sequences x, for example digital information stored in electronic memory systems or transmitted from and to these systems. The method provides for transmitting sequences x incorporating an error corrector code portion, so that, when a sequence is received, the sequence x that is most probably the one originally transmitted can be restored by calculating an error syndrome using a parity matrix.
Advantageously, the method provides that the error code incorporated in the original sequence x belongs to a non Boolean group.
The error code used is a linear code, as it will be apparent from the following detailed description of the method embodiments.
0.2 Codes on Different Groups
Additive groups are considered. The group in which the previous codes operate is Boolean, i.e. for any element x of the group, x+x equals the identity with respect to the sum. Additive groups (mod p) with p ∈ N are now considered.
Similar codes to the above-described codes are searched, i.e. codes for which, being H the code parity matrix and y the received word it results:
$y \cdot H^T = 0$
if y is a code word. Linear codes are thus searched. Moreover if y is affected by one or more errors, it results:
$(y + \xi_i + \xi_j) \cdot H^T = \xi_i \cdot H^T + \xi_j \cdot H^T = s_i + s_j$
where $s_i$ and $s_j$ are the i-th and j-th columns of the matrix $H^T$. The code being searched for must therefore belong to an Abelian group to have this property.
Codes in a systematic form are searched for, and the method for forming the identity matrix is analyzed. Columns are considered as numbers written in base 10. The matrix thus becomes a vector of numbers, and the product of the matrix by the received message becomes a scalar product. Operating in a group (mod p), the numbers composing the identity matrix must be such that the matrix composed of their binary representations has a determinant ≠0. The number of parity bits n−k being fixed, p is chosen so that:
2n−k+1≦p≦2n−k+1−1
The identity matrix is composed of the numbers $p-1, p-2, \ldots, p-2^{n-k-1}$. Considering a code C [7,4] with p=8, the identity matrix will be composed of the numbers 7, 6 and 4. Written in binary, this matrix has a form opposite to that of the usual identity matrix, which is represented by the base-10 numbers 1, 2 and 4.
It must be noted that any matrix having a “determinant” ≠0 could be chosen, i.e. a matrix in which no number is a linear combination of the other numbers belonging to it. The choice suggested above is nevertheless particularly effective, as can be seen with an example.
Suppose that the product of a data vector by a certain matrix P (H=(P,I)) has given the result 1, which, written in binary as 100, will compose the code part to be added to the word. m is seen as a vector of weights $c_i$; thus, the $x_i$ being the numbers composing the matrix H (seen as a vector):
where the sum is done (mod p). When the message is received, the multiplication $m \cdot H^T$ must be carried out, i.e. $(m_k, m_{n-k}) \cdot (P, I) = m_k \cdot P + m_{n-k} \cdot I$. In this case the first value is 1, and for the message to be correct it must be:
$1 + m_{n-k} \cdot I = 0$ (mod p)
The usual matrix i.e. [1,2,4] is chosen as identity matrix. It results:
$[1,2,4] \cdot (c_1, c_2, c_3) + 1 = 0$
Working in $Z_8$, instead of having 0 as the second member, 8k can be obtained with k ∈ N. The solution is $(c_1, c_2, c_3) = (111)$.
The suggested matrix, i.e. [7,6,4], is now chosen as the identity matrix. It results:
$[7,6,4] \cdot (c_1, c_2, c_3) + 1 = 0$
The solution is $(c_1, c_2, c_3) = (100)$, i.e. the same value as the calculated code. This is no coincidence: with the suggested identity matrix, the calculated code is always equal to the received code if no errors have occurred.
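By way of non-limiting illustration, the property just described can be verified with the following Python sketch. It assumes, as in the example above, that the code part is written with the least significant binary digit first (so that 1 is written 100) and that the numbers of the identity matrix are p−1, p−2, p−4 and so on. The function names are hypothetical.

```python
def identity_numbers(p, r):
    """The r numbers p-1, p-2, p-4, ... used in place of 1, 2, 4, ..."""
    return [p - 2 ** i for i in range(r)]


def check_bits(value, r):
    """Binary digits of value, least significant first (1 is written 1 0 0)."""
    return [(value >> i) & 1 for i in range(r)]


def total_mod_p(value, bits, numbers, p):
    """value + bits . numbers, reduced modulo p (0 means a valid word)."""
    return (value + sum(b * x for b, x in zip(bits, numbers))) % p


p, r = 8, 3
numbers = identity_numbers(p, r)
print(numbers)  # [7, 6, 4]

# With [7, 6, 4] the code part is simply the binary writing of the value.
for value in range(p):
    assert total_mod_p(value, check_bits(value, r), numbers, p) == 0

# With the usual [1, 2, 4] the value 1 needs the code part (1 1 1) instead.
print(total_mod_p(1, [1, 1, 1], [1, 2, 4], p))  # 0
print(total_mod_p(1, [1, 0, 0], [1, 2, 4], p))  # 2
```

The check succeeds because each selected number p−2^i contributes −2^i (mod p), so the code part cancels the value it represents.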
The numbers composing the columns of the parity matrix P must be chosen according to criteria similar to those of the Boolean group.
With codes in these groups the error 1→0 is distinguished from 0→1, thus the channel is no longer symmetrical. In fact:
if the syndrome returns a value x with x ∈ H, the error occurred is 0→1;
if the syndrome returns a value x with x ∉ H, but p−x ∈ H, then the error occurred is 1→0.
An error +1 is allocated to the first case and an error −1 to the second case.
A code [6,1] with p=22 is considered.
H=(11|21 20 18 14 6)
In binary this matrix will be:
The code words will then be:
0|00000
1|11010
The second code word is sent, but 111110 is received, i.e. an error +1 has occurred in the fourth position. Calculating: (111110)·H=1·11+1·21+1·20+1·18+1·14=84 which, in the group considered, is 18. 18 is in the matrix H and thus the error occurred is 0→1, moreover 18 is in the fourth position of the matrix, which is the wrong message position.
Suppose now that 101010 is received, i.e. an error −1 has occurred in the second position. It must be calculated: (101010)·H=1·11+1·20+1·14=45 which, in the group considered, is 1. 1 is not in the matrix H, but 22−1=21 is therein and therefore the error occurred is 1→0; moreover 21 is in the second position of the matrix, which is the wrong position in the message.
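By way of non-limiting illustration, the two decodings of the code [6,1] with p=22 can be reproduced with the following Python sketch. The function names are hypothetical, and the sketch returns the first matching position without checking whether the detected error is compatible with the received bit.

```python
P_MOD = 22
H = [11, 21, 20, 18, 14, 6]


def syndrome(word):
    """Scalar product of the received word with H, reduced modulo p."""
    return sum(bit * h for bit, h in zip(word, H)) % P_MOD


def locate_single_error(word):
    """Return (position, error): position is 1-based, error is +1 or -1.

    +1 means a 0->1 error, -1 means a 1->0 error, (None, 0) means no error.
    """
    s = syndrome(word)
    if s == 0:
        return None, 0
    if s in H:
        return H.index(s) + 1, +1
    if (P_MOD - s) in H:
        return H.index(P_MOD - s) + 1, -1
    raise ValueError("syndrome does not correspond to a single error")


print(syndrome([1, 1, 1, 1, 1, 0]))             # 18
print(locate_single_error([1, 1, 1, 1, 1, 0]))  # (4, 1)
print(syndrome([1, 0, 1, 0, 1, 0]))             # 1
print(locate_single_error([1, 0, 1, 0, 1, 0]))  # (2, -1)
```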
It must be observed that the error in each position of the received message can be of one type only: +1 if the corresponding received bit is 1, or −1 if the corresponding received bit is 0. If an impossible error is detected, it means that the code could diagnose but not correct the errors.
A counterexample is now described.
A code [3,1] in a group (mod 4) is considered, in which the matrix is H=(1|3 2). The code words will be:
0|00 1|10
The message 000 is sent and 010 is received.
(010)·H=3
3 is in the matrix and this would indicate an error +1 in the second position. 4−3=1 is also in the matrix and this would indicate an error −1 in the first position. In fact 010 can also be obtained from 110 with an error in the first position. Therefore a code cannot be found on $Z_4$.
Sometimes, in order to correct the errors, it is necessary not only to calculate the syndrome but also to compare the bits received. A code [3,1] is considered on $Z_5$ with matrix H=(3|4 3). The code words will be:
0|00 1|11
The word 000 is sent; all errors which may occur and the decoding are considered.
001 syndrome=3. Possible errors:
1) +1 in the first position;
2) +1 in the third position;
Given that a 0 is received in the first position, the case 1 is not possible.
010 syndrome=4. Possible error: +1 in the second position.
100 syndrome=3. Possible errors:
1) +1 in the first position;
2) +1 in the third position;
Given that a 0 is received in the third position, the case 2 is not possible.
The word 111 is now sent; all errors which may occur and the decoding are considered.
011 syndrome=2. Possible errors:
1) −1 in the first position;
2) −1 in the third position;
Given that a 1 is received in the third position, the case 2 is not possible.
101 syndrome=1. Possible error: −1 in the second position.
110 syndrome=2. Possible errors:
1) −1 in the first position;
2) −1 in the third position;
Given that a 1 is received in the first position, the case 1 is not possible.
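By way of non-limiting illustration, the comparison between the syndrome and the received bits can be sketched in Python as follows. The function name `candidate_errors` is hypothetical; the sketch lists every single error that is compatible both with the syndrome and with the bit actually received in that position.

```python
def candidate_errors(word, H, p):
    """List the single errors (position, +1 or -1) compatible with the word.

    An error +1 (0->1) is possible only where a 1 is received;
    an error -1 (1->0) is possible only where a 0 is received.
    """
    s = sum(bit * h for bit, h in zip(word, H)) % p
    candidates = []
    if s == 0:
        return candidates
    for position, (bit, h) in enumerate(zip(word, H), start=1):
        if h % p == s and bit == 1:
            candidates.append((position, +1))
        if (p - h) % p == s and bit == 0:
            candidates.append((position, -1))
    return candidates


# Code on Z4 with H=(1|3 2): the received word 010 remains ambiguous.
print(candidate_errors([0, 1, 0], [1, 3, 2], 4))  # [(1, -1), (2, 1)]

# Code on Z5 with H=(3|4 3): every single error is located uniquely.
H5 = [3, 4, 3]
for received in ([0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]):
    print(received, candidate_errors(received, H5, 5))
```

On $Z_4$ the two candidates correspond to the two code words from which 010 can be reached, whereas on $Z_5$ exactly one candidate survives for each received word.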
Therefore the type of error occurred is distinguished by comparing the syndrome with the values actually received. The manufacture of a circuit describing this method involves the creation of non-binary adders as shown in
The error correcting code methodology described herein may be utilized in a variety of different types of electronic systems, such as communications, digital video, memory and computer systems, as will be appreciated by those skilled in the art.
From the foregoing it will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
03425172 | Mar 2003 | EP | regional |
Number | Date | Country | |
---|---|---|---|
20050050434 A1 | Mar 2005 | US |