Embodiments of the present disclosure are directed to the calculation of inverses and logarithms in error correction codes (ECCs).
Bose-Chaudhuri-Hocquenghem (BCH), Reed-Solomon (RS) and other algebraic ECC decoders perform a relatively small number of discrete division and log operations. Because these operations are computationally expensive, they are currently executed with the help of two straightforward permanent lookup tables (LUTs) of size $|F|\times r$ bits each, where $F=GF(2^r)$ is the associated field and $|F|=2^r$.
Let $F=GF(2^r)$ be a Galois field of $2^r$ elements, where r is an integer, and fix α, a primitive element of F. A primitive element of a finite field GF(q), where $q=2^r$, is a generator of the multiplicative group of the field, so that each non-zero element of GF(q) can be written as $\alpha^i$ for some integer i. For each $\beta\in F^*=F\setminus\{0\}$, define log(β)=j to be the unique integer in {0, . . . , q−2} such that $\alpha^j=\beta$. Note that since log(0) is undefined, there are q−1 values of the log for the q−1 non-zero elements of F. The standard decoders of RS and BCH perform this operation following the Berlekamp-Massey (BM) algorithm. If there are t errors, then t log operations are performed. These decoders also perform an inversion (i.e., division) operation in some implementations of the BM algorithm and in the application of Forney's algorithm to the RS decoder. These are notable examples where log and inversion are used in algebraic ECCs. The overall number of times these operations are performed is typically proportional to the number of errors and erasures. The common practice is to perform these operations with two permanent straightforward lookup tables (LUTs), one for the log and the other for inversion. Each of these tables is of size $r\times 2^r$ bits.
Large LUTs are currently being used for a relatively small number of log and inversion operations performed for useful algebraic ECCs. The hardware cost is excessive, as a sizable amount of hardware is currently dedicated to these operations. In addition, each inversion or log operation requires an access to a large LUT.
Exemplary embodiments of the present disclosure are directed to systems and methods for performing division and log operations with a few small tables whose overall size is $3.5\times|F|^{1/2}\times r$ bits, for even r.
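According to an embodiment of the disclosure, there is provided a computer implemented method of performing division operations in an error correction code that includes the steps of: receiving an output ω of an error correction code (ECC) algorithm, wherein $\omega\in F^*=F\setminus\{0\}$, wherein $F=GF(2^r)$ is a Galois field of $2^r$ elements, $r=2s$, r and s are integers, $\omega=\sum_{0\le i\le r-1}\beta_i\times\alpha^i$ wherein α is a fixed primitive element of F, and $\beta_i\in GF(2)$, wherein $K=GF(2^s)$ is a subfield of F, and {1, α} is a basis of F as a linear space over K; choosing a primitive element δ∈K, wherein $\omega=\omega_1+\alpha\times\omega_2$, $\omega_1=\sum_{0\le i\le s-1}\gamma_i\times\delta^i\in K$, $\omega_2=\sum_{0\le i\le s-1}\gamma_{i+s}\times\delta^i\in K$, and $\gamma=[\gamma_0,\ldots,\gamma_{r-1}]^T\in GF(2)^r$; accessing a first table with $\omega_1$ to obtain $\omega_3=\omega_1^{-1}$ where $\omega_3\in K$, and computing $\omega_2\times\omega_3$ in field K, when $\omega_1\ne 0$; accessing a second table with $\omega_2\times\omega_3$ to obtain $(1+\alpha\times\omega_2\times\omega_3)^{-1}=\omega_4+\alpha\times\omega_5$, wherein $\omega_4,\omega_5\in K$, and wherein $\omega^{-1}=(\omega_1\times(1+\alpha\times\omega_2\times\omega_3))^{-1}=\omega_3\times(\omega_4+\alpha\times\omega_5)=\omega_3\times\omega_4+\alpha\times\omega_3\times\omega_5$; and computing products $\omega_3\times\omega_4$ and $\omega_3\times\omega_5$ in field K to obtain $\omega^{-1}=\sum_{0\le i\le s-1}\theta_i\times\delta^i+\alpha\cdot\sum_{0\le i\le s-1}\theta_{i+s}\times\delta^i$ where $\theta_i\in GF(2)$.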
According to a further embodiment of the disclosure, α is of minimal hamming weight.
According to a further embodiment of the disclosure, δ is of minimal hamming weight.
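According to a further embodiment of the disclosure, the method includes computing a linear transformation A that transforms every $\beta=[\beta_0,\ldots,\beta_{r-1}]^T$ of $GF(2)^r$ to $\gamma=A\times\beta=[\gamma_0,\ldots,\gamma_{r-1}]^T$ of $GF(2)^r$, wherein $\beta^*=\sum_{0\le i\le s-1}\gamma_i\times\delta^i+\alpha\cdot\sum_{0\le i\le s-1}\gamma_{i+s}\times\delta^i=\sum_{0\le i\le r-1}\beta_i\times\alpha^i$, wherein $\omega=\omega_1+\alpha\times\omega_2$, $\omega_1=\sum_{0\le i\le s-1}\gamma_i\times\delta^i\in K$ and $\omega_2=\sum_{0\le i\le s-1}\gamma_{i+s}\times\delta^i\in K$, and applying an inverse linear transformation $A^{-1}$ on $\theta=[\theta_0,\ldots,\theta_{r-1}]^T$, wherein $\lambda=A^{-1}\times\theta=[\lambda_0,\ldots,\lambda_{r-1}]^T\in GF(2)^r$ and $\omega^{-1}=\sum_{0\le i\le r-1}\lambda_i\times\alpha^i$.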
According to a further embodiment of the disclosure, the linear transformation A is computed offline.
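According to a further embodiment of the disclosure, the first table $T_1$ is $T_1=\{(\beta,\gamma):\ \beta=(\beta_0,\ldots,\beta_{s-1})\in V,\ \gamma=(\gamma_0,\ldots,\gamma_{s-1})\in V:\ \sum_{0\le i\le s-1}\gamma_i\times\delta^i=(\sum_{0\le i\le s-1}\beta_i\times\delta^i)^{-1}\}$, wherein $V=GF(2)^s$, $W=GF(2)^r$ and β is an index of an entry that contains γ, wherein a size of the first table is $s\times 2^s$.
According to a further embodiment of the disclosure, the second table $T_2$ is $T_2=\{(\beta,\gamma):\ \beta=(\beta_0,\ldots,\beta_{s-1})\in V,\ \gamma=(\gamma_0,\ldots,\gamma_{r-1})\in W:\ \sum_{0\le i\le s-1}\gamma_i\times\delta^i+\alpha\cdot\sum_{0\le i\le s-1}\gamma_{i+s}\times\delta^i=(1+\alpha\cdot\sum_{0\le i\le s-1}\beta_i\times\delta^i)^{-1}\}$, wherein $V=GF(2)^s$, $W=GF(2)^r$ and β is an index of an entry that contains γ, wherein a size of the second table is $r\times 2^s$.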
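According to a further embodiment of the disclosure, the method includes, when $\omega_1=0$, accessing the first table with $\omega_2=\sum_{0\le i\le s-1}\gamma_{i+s}\times\delta^i$ to obtain $\theta=(\theta_0,\ldots,\theta_{s-1})\in V$ such that $\omega_2^{-1}=\sum_{0\le i\le s-1}\theta_i\times\delta^i$; and calculating $\omega^{-1}=\alpha^{-1}\times\omega_2^{-1}$ from $\beta\cdot\alpha^{-1}=\sum_{0\le i\le r-1}\beta_i\cdot\alpha^{i-1}=\sum_{1\le i\le r-1}\beta_i\cdot\alpha^{i-1}+\beta_0\cdot\sum_{0\le i\le r-1}a_{i+1}\cdot\alpha^i$, wherein the $a_i$ are coefficients of a minimal polynomial of α with a minimal Hamming weight, and β is an arbitrary element of $F^*=F\setminus\{0\}$.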
According to another embodiment of the disclosure, there is provided a computer implemented method of performing log operations in an error correction code that includes the steps of: receiving an output ω of an error correction code (ECC) algorithm, wherein $\omega\in F^*=F\setminus\{0\}$, wherein $F=GF(2^r)$ is a Galois field of $2^r$ elements, $r=2s$, r and s are integers, $\omega=\sum_{0\le i\le r-1}\beta_i\times\alpha^i$ wherein α is a fixed primitive element of F, and $\beta_i\in GF(2)$, wherein $K=GF(2^s)$ is a subfield of F, and {1, α} is a basis of F as a linear space over K; choosing a primitive element δ∈K, wherein $\omega=\omega_1+\alpha\times\omega_2$, $\omega_1=\sum_{0\le i\le s-1}\gamma_i\times\delta^i\in K$, $\omega_2=\sum_{0\le i\le s-1}\gamma_{i+s}\times\delta^i\in K$, and $\gamma=[\gamma_0,\ldots,\gamma_{r-1}]^T\in GF(2)^r$; accessing a first table with $\omega_1$ to obtain $\omega_3=\omega_1^{-1}$ where $\omega_3\in K$, and computing $\omega_2\times\omega_3$ in field K, when $\omega_1\ne 0$; accessing a third table with $\omega_1$ to obtain $\log(\omega_1)$; accessing a fourth table with $\omega_2\times\omega_3$ to obtain $\log(1+\alpha\times\omega_2\times\omega_3)$; and computing $\log(\omega)=\log(\omega_1\times(1+\alpha\times\omega_2\times\omega_1^{-1}))=\mathrm{mod}(q-1,\ \log(\omega_1)+\log(1+\alpha\times\omega_2\times\omega_1^{-1}))$.
According to a further embodiment of the disclosure, α is of minimal hamming weight.
According to a further embodiment of the disclosure, δ is of minimal Hamming weight.
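According to a further embodiment of the disclosure, the method includes computing a linear transformation A that transforms every $\beta=[\beta_0,\ldots,\beta_{r-1}]^T$ of $GF(2)^r$ to $\gamma=A\times\beta=[\gamma_0,\ldots,\gamma_{r-1}]^T$ of $GF(2)^r$, wherein $\beta^*=\sum_{0\le i\le s-1}\gamma_i\times\delta^i+\alpha\cdot\sum_{0\le i\le s-1}\gamma_{i+s}\times\delta^i=\sum_{0\le i\le r-1}\beta_i\times\alpha^i$, wherein $\omega=\omega_1+\alpha\times\omega_2$, $\omega_1=\sum_{0\le i\le s-1}\gamma_i\times\delta^i\in K$ and $\omega_2=\sum_{0\le i\le s-1}\gamma_{i+s}\times\delta^i\in K$.
According to a further embodiment of the disclosure, the third table $T_3$ is $T_3=\{(\beta,j):\ \beta=(\beta_0,\ldots,\beta_{s-1})\in V,\ 0\le j\le q-2:\ j=\log(\sum_{0\le i\le s-1}\beta_i\times\delta^i)\}$, wherein $V=GF(2)^s$, and β is an index of an entry that contains j, wherein a size of the third table is $r\times 2^s$.
According to a further embodiment of the disclosure, the fourth table $T_4$ is $T_4=\{(\beta,j):\ \beta=(\beta_0,\ldots,\beta_{s-1})\in V,\ 0\le j\le q-2:\ j=\log(1+\alpha\cdot\sum_{0\le i\le s-1}\beta_i\times\delta^i)\}$, wherein $V=GF(2)^s$, and β is an index of an entry that contains j, wherein a size of the fourth table is $r\times 2^s$.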
According to a further embodiment of the disclosure, the method includes, when ω1=0, and ω=α×ω2, accessing the third table with ω2 to obtain j=log(ω2), and calculating log(ω)=log(α×ω2)=mod(q−1, 1+j), wherein log(α)=1.
According to another embodiment of the disclosure, there is provided a computer implemented method of performing division and log operations in an error correction code that includes the steps of: receiving an output ω of an error correction code (ECC) algorithm, wherein ω∈F*, wherein $F=GF(2^r)$ is a Galois field of $2^r$ elements, r is an integer, $\omega=\sum_{0\le i\le r-1}\beta_i\times\alpha^i$ wherein α is a primitive element of F and $\beta_i\in GF(2)$; and when μ(β)=0, wherein $\mu(\beta)=\min\{0\le i\le r-1:\ \beta_i=1\}$, reading 1/β from a first table and reading log(β) from a second table.
According to a further embodiment of the disclosure, the first table is T′1={1/β: β∈F* and μ(β)=0}, wherein a size of the first table is |F|/2.
According to a further embodiment of the disclosure, the second table $T'_2$ is $T'_2=\{\log(\beta):\ \beta\in F^*\ \text{and}\ \mu(\beta)=0\}$, wherein a size of the second table is |F|/2.
According to a further embodiment of the disclosure, the method includes, when μ(β)=s≥1, reading 1/γ from the first table, wherein $\gamma=\beta/\alpha^s$; and computing 1/β by applying the equation $1/\beta=(1/\gamma)\cdot\alpha^{-s}$, wherein each multiplication by $\alpha^{-1}$ is computed from $\beta\cdot\alpha^{-1}=\sum_{0\le i\le r-1}\beta_i\cdot\alpha^{i-1}=\sum_{1\le i\le r-1}\beta_i\cdot\alpha^{i-1}+\beta_0\cdot\sum_{0\le i\le r-1}a_{i+1}\cdot\alpha^i$, wherein the $a_i$ are coefficients of a minimal polynomial of α with a minimal Hamming weight, and β is an arbitrary element of $F^*=F\setminus\{0\}$.
According to a further embodiment of the disclosure, the method includes, when μ(β)=s≥1, reading log(γ) from the second table, wherein $\gamma=\beta/\alpha^s$, and calculating log(β)=mod(q−1, s+log(γ)).
Exemplary embodiments of the disclosure as described herein generally provide systems and methods for performing division and log operations with a few small tables. While embodiments are susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that there is no intent to limit the disclosure to the particular forms disclosed, but on the contrary, the disclosure is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the disclosure.
According to an embodiment, a discrete inverse and log operation can be computed with the aid of relatively small tables and relatively small computational complexity, wherein the overall cost of these operations is considerably reduced in comparison to current practice. According to one embodiment, the overall size of the tables is $3.5\times|F|^{1/2}$ field symbols, with a moderate increase of computational complexity per operation. In another embodiment, the overall size of the tables is $|F|=2^r$ field symbols with little added computational complexity. Here F is the associated finite field. In current practice of algebraic ECC implementation, the overall size of both tables is 2·|F| field symbols. Operations according to embodiments of the disclosure are suitable for low-power 6-bit cell or mobile ECCs.
Embodiments of the present disclosure provide smaller tables for the inverse and log operations at the cost of increased arithmetic complexity per operation. One embodiment is for even r. In this case the overall size of the tables is $3.5\times r\times 2^{r/2}$ bits, as opposed to $2\times r\times 2^r$ bits in current practice. The resulting computational cost of a single division or a single log operation is not much greater than the cost of a single product operation in the field. Denote by n the code length and by t the number of errors introduced by the channel. In RS and BCH, there are $O(n^2)$ products and O(t) division and log operations. In another embodiment, the tables for the inverse and log are each of half the field size, and almost no additional computation is needed. In this embodiment the added computational complexity is negligible. This embodiment is suitable for every r.
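The size comparison above can be checked numerically. The following is a minimal sketch that only evaluates the three size formulas stated in the text; the function name `table_bits` is hypothetical, and every table entry is assumed to be r bits wide, as in the conventional tables.

```python
def table_bits(r):
    """Return (conventional, subfield_scheme, half_table) total LUT sizes in bits.

    Assumption: every entry is r bits wide, as for the conventional tables.
    """
    assert r % 2 == 0, "the subfield scheme assumes even r"
    conventional = 2 * r * 2**r             # two full tables: 2 x r x 2^r
    subfield = (7 * r * 2**(r // 2)) // 2   # 3.5 x r x 2^(r/2), exact for even r
    half = 2 * r * 2**(r - 1)               # two tables of |F|/2 entries each
    return conventional, subfield, half


for r in (8, 12, 16):
    conv, sub, half = table_bits(r)
    print(f"r={r}: conventional={conv} bits, subfield={sub} bits, half-table={half} bits")
```

For r=16 this gives 2,097,152 bits for the conventional tables against 14,336 bits for the subfield scheme, which illustrates why the extra arithmetic per operation may be an acceptable price.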
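Suppose that r=2×s, for a positive integer s. Let $F=GF(2^r)$, and let α∈F be a primitive element whose minimal polynomial has minimal Hamming weight, where the Hamming weight is the number of ones in the coefficient vector of the polynomial. That is,
$$p(x)=\sum_{0\le i\le r}a_i\times x^i,\quad a_i\in GF(2), \tag{1}$$
is the minimal polynomial of α, with minimal Hamming weight.
Note that by EQ. (1), since $p(\alpha)=0$ and $a_0=a_r=1$:
$$\alpha^r=\sum_{0\le i\le r-1}a_i\times\alpha^i \quad\text{and also}\quad \alpha^{-1}=\sum_{0\le i\le r-1}a_{i+1}\times\alpha^i.$$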
Let r′=ham(p(x))−1. It appears that r′=2 would be available in many real world applications. Note that $a_0=a_r=1$. Then there are r′ nonzero coefficients in the right side of both equations. The fact that this is typically a small number enables us to achieve further complexity reduction. Take now an arbitrary element $\beta\in F^*=F\setminus\{0\}$ and consider its binary representation with respect to the basis $\{1,\alpha,\ldots,\alpha^{r-1}\}$ of F over GF(2):
$$\beta=\sum_{0\le i\le r-1}\beta_i\cdot\alpha^i$$
where $\beta_i\in GF(2)$. Define $\mathrm{ham}_\alpha(\beta)=\mathrm{ham}(\beta_0,\ldots,\beta_{r-1})$. Note that
$$\beta\cdot\alpha=\sum_{0\le i\le r-1}\beta_i\cdot\alpha^{i+1}=\sum_{0\le i\le r-2}\beta_i\cdot\alpha^{i+1}+\beta_{r-1}\sum_{0\le i\le r-1}a_i\cdot\alpha^i. \tag{2}$$
Thus a cyclic shift and r′ GF(2) additions are required for the computation of β·α when $\beta_{r-1}=1$; when $\beta_{r-1}=0$, which holds for half of the field's elements, no additions are needed.
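As an illustration of EQ. (2), the following minimal sketch multiplies by α using one shift and a conditional XOR. The field GF(2^4) with p(x)=x^4+x+1 (so r′=2) is an assumed example, not taken from the disclosure, and the name `mul_alpha` is hypothetical. An element is stored as an integer whose bit i is the coefficient of α^i.

```python
R = 4
POLY = 0b10011          # assumed example: p(x) = x^4 + x + 1, a_0 = a_4 = 1, r' = 2
LOW = POLY & ((1 << R) - 1)   # coefficients a_0 .. a_{r-1}


def mul_alpha(beta):
    """Return beta * alpha per EQ. (2).

    Shift the coefficients up by one position; if beta_{r-1} was 1, replace
    alpha^r by sum a_i * alpha^i, which costs r' XORs in hardware.
    """
    top = (beta >> (R - 1)) & 1
    beta = (beta << 1) & ((1 << R) - 1)
    if top:
        beta ^= LOW
    return beta


# Successive powers of alpha, starting from alpha^0 = 1.
x, powers = 1, []
for _ in range(2**R - 1):
    powers.append(x)
    x = mul_alpha(x)
```

Because α is primitive for this polynomial, the loop visits all 15 nonzero elements before returning to 1.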
Similarly
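$$\beta\cdot\alpha^{-1}=\sum_{0\le i\le r-1}\beta_i\cdot\alpha^{i-1}=\sum_{1\le i\le r-1}\beta_i\cdot\alpha^{i-1}+\beta_0\cdot\sum_{0\le i\le r-1}a_{i+1}\cdot\alpha^i. \tag{3}$$
Hence, here too, a cyclic shift and r′ GF(2) additions are required for the computation of $\beta\cdot\alpha^{-1}$ when $\beta_0=1$; when $\beta_0=0$, which holds for half of the field's elements, no additions are needed.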
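EQ. (3) can be sketched in the same way. The example below assumes the field GF(2^4) with p(x)=x^4+x+1, which is an illustrative choice and not taken from the disclosure; the names `mul_alpha` and `mul_alpha_inv` are hypothetical. Bit i of an integer holds the coefficient of α^i.

```python
R = 4
POLY = 0b10011          # assumed example: p(x) = x^4 + x + 1


def mul_alpha(beta):
    """beta * alpha: shift up, then reduce by p(x) if alpha^r appeared."""
    beta <<= 1
    if beta >> R:
        beta ^= POLY
    return beta


def mul_alpha_inv(beta):
    """beta * alpha^{-1} per EQ. (3).

    If beta_0 = 1, XOR in p(x) first: this clears bit 0 and adds the
    coefficients a_1..a_r, so that the following right shift yields
    sum_{i>=1} beta_i alpha^{i-1} + sum_i a_{i+1} alpha^i.
    """
    if beta & 1:
        beta ^= POLY
    return beta >> 1
```

The two functions are mutual inverses on all field elements, which gives a quick sanity check of the derivation.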
For each β∈F*, define log(β)=j to be the unique integer in {0, . . . , q−2} such that $\alpha^j=\beta$. The following scheme according to an embodiment uses reduced size LUTs for inverse and log operations at the cost of increased computational complexity per inverse or log operation. It embodies a profitable tradeoff of computational complexity vs. hardware.
With the reduced tables according to an embodiment, the complexity cost of a single division or single log operation does not greatly exceed the complexity of a single product operation in the field F. In RS and BCH, there are $O(n^2)$ products and O(t) division and log operations, where t is the number of errors caused by the channel. The overall size of the tables according to an embodiment is $3.5\times 2^s$ F-symbols, as opposed to $2^{r+1}$ in current practice. Consider the field $K=GF(2^s)$ as a subfield of F. Note that [F:K]=2, i.e., $|F|=|K|^2$, and that {1, α} is a basis of F over K.
Let δ∈K be a primitive element of minimal Hamming weight. Then, according to an embodiment, an arbitrary element τ of K can be written with the representation:
$$\tau=\sum_{0\le i\le s-1}\tau_i\times\delta^i,\quad\tau_i\in GF(2). \tag{4}$$
Given this representation, note that the product of two elements from K has ¼ of the complexity of a product in F, and is far easier to implement.
According to an embodiment, an element β* in F can be written uniquely by:
$$\beta^*=\sum_{0\le i\le r-1}\beta_i\times\alpha^i,\quad\beta_i\in GF(2), \tag{5}$$
where r=2×s, as defined above. That is, β* is represented by the basis $\{\alpha^i:\ 0\le i\le r-1\}$ of F as a linear space over GF(2).
Alternatively, according to an embodiment, similar to EQS. (2) and (3), above, β* can be written uniquely in the form:
$$\beta^*=\sum_{0\le i\le s-1}\gamma_i\times\delta^i+\alpha\cdot\sum_{0\le i\le s-1}\gamma_{i+s}\times\delta^i,\quad\gamma_i\in GF(2). \tag{6}$$
That is, β* is represented by the basis $\{\delta^i,\ \alpha\times\delta^i:\ 0\le i\le s-1\}$ of F as a linear space over GF(2). Note that the first term on the right hand side is purely in the field K, while the second term is α multiplied by an element in field K.
Hence, there is an invertible linear transformation over GF(2) that takes $\beta=[\beta_0,\ldots,\beta_{r-1}]^T$ to $\gamma=[\gamma_0,\ldots,\gamma_{r-1}]^T$, when EQS. (5) and (6) hold. This linear transformation can be represented by an r×r matrix $A\in GF(2)^{r\times r}$ such that A×β=γ. According to an embodiment, matrices A and $B=A^{-1}$ are stored in permanent memory, allocating for this purpose $2\times r^2$ bits. Note that the matrices are easy to compute, can be precomputed offline and stored in memory, and require relatively little storage as compared to the inverse and log tables that will be introduced below.
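A minimal sketch of how the two change-of-basis matrices might be precomputed offline is given below. It assumes the small example field GF(2^4) with p(x)=x^4+x+1, s=2, and δ=α^5; these choices, and the names `gf_mul`, `apply`, `B_cols`, and `A_cols`, are illustrative assumptions. Matrices are stored column-wise as integers, and A is obtained here by brute-force enumeration, which is only practical for small r; Gaussian elimination over GF(2) would be the usual choice for larger fields.

```python
R, S, POLY = 4, 2, 0b10011     # assumed example: F = GF(2^4), K = GF(2^2)
ALPHA = 2                      # alpha in the power basis (bit i <-> alpha^i)


def gf_mul(a, b):
    """Reference multiplication in F, shift-and-add with reduction by p(x)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a >> R:
            a ^= POLY
        b >>= 1
    return p


# delta = alpha^((2^r - 1)/(2^s - 1)) generates the multiplicative group of K.
DELTA = 1
for _ in range((2**R - 1) // (2**S - 1)):
    DELTA = gf_mul(DELTA, ALPHA)

# Columns of B = A^{-1}: the basis {delta^i, alpha * delta^i} written in the power basis.
B_cols, d = [], 1
for _ in range(S):
    B_cols.append(d)
    d = gf_mul(d, DELTA)
B_cols += [gf_mul(ALPHA, c) for c in B_cols[:S]]


def apply(cols, v):
    """Matrix-vector product over GF(2): XOR the columns selected by the bits of v."""
    out = 0
    for j, c in enumerate(cols):
        if (v >> j) & 1:
            out ^= c
    return out


# Brute-force inverse (illustration only): gamma -> beta is a bijection.
inverse_map = {apply(B_cols, g): g for g in range(2**R)}
A_cols = [inverse_map[1 << j] for j in range(R)]
```

With these columns, `apply(A_cols, beta)` plays the role of γ=A×β and `apply(B_cols, gamma)` the role of β=B×γ.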
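According to an embodiment, let $V=GF(2)^s$ and $W=GF(2)^r$. In addition to A and B, the tables stored in memory include the following LUTs: $\{T_i\}_{1\le i\le 4}$. The LUTs are structured so that β is the input and γ or j are the output. According to an embodiment, β is not stored explicitly, but is rather the index of the entry that contains γ or j, and is written in the form: (β,γ) or (β,j).
$$T_1=\{(\beta,\gamma):\ \beta=(\beta_0,\ldots,\beta_{s-1})\in V,\ \gamma=(\gamma_0,\ldots,\gamma_{s-1})\in V:\ \sum_{0\le i\le s-1}\gamma_i\times\delta^i=(\sum_{0\le i\le s-1}\beta_i\times\delta^i)^{-1}\},$$
$$T_2=\{(\beta,\gamma):\ \beta=(\beta_0,\ldots,\beta_{s-1})\in V,\ \gamma=(\gamma_0,\ldots,\gamma_{r-1})\in W:\ \sum_{0\le i\le s-1}\gamma_i\times\delta^i+\alpha\cdot\sum_{0\le i\le s-1}\gamma_{i+s}\times\delta^i=(1+\alpha\cdot\sum_{0\le i\le s-1}\beta_i\times\delta^i)^{-1}\},$$
$$T_3=\{(\beta,j):\ \beta=(\beta_0,\ldots,\beta_{s-1})\in V,\ 0\le j\le q-2:\ j=\log(\sum_{0\le i\le s-1}\beta_i\times\delta^i)\},$$
$$T_4=\{(\beta,j):\ \beta=(\beta_0,\ldots,\beta_{s-1})\in V,\ 0\le j\le q-2:\ j=\log(1+\alpha\cdot\sum_{0\le i\le s-1}\beta_i\times\delta^i)\}.$$
For a table T, denote by |T| the number of bits in T. It holds that $|T_1|=s\times 2^s$ and $|T_2|=|T_3|=|T_4|=r\times 2^s$.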
In the following exposition, $\omega_i$, 1≤i≤5, is a notation for an element in the small field K. All standard arithmetic operations can be performed in K. A first step 112, according to an embodiment, is to apply the linear transformation A to $\beta=[\beta_0,\ldots,\beta_{r-1}]^T$, the coordinate vector of the input ω. The result is $\gamma=[\gamma_0,\ldots,\gamma_{r-1}]^T$, such that γ=A×β, which implies:
$$\omega=\omega_1+\alpha\times\omega_2, \tag{7}$$
which follows from EQ. (6), above, where $\omega_1=\sum_{0\le i\le s-1}\gamma_i\times\delta^i\in K$ and $\omega_2=\sum_{0\le i\le s-1}\gamma_{i+s}\times\delta^i\in K$.
At step 114, test for whether $\omega_1\ne 0$. If $\omega_1\ne 0$, then, at step 116, by accessing the table $T_1$, a procedure according to an embodiment finds $\omega_3=\omega_1^{-1}$, where $\omega_3\in K$ is represented in the form of EQ. (4). The procedure then computes the product $\omega_2\times\omega_3$ in the small field K, based on the representation of EQ. (4). Note that
$$\omega=\omega_1\times(1+\alpha\times\omega_2\times\omega_3). \tag{8}$$
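According to an embodiment, a next step 118 is to access $T_2$ to find the inverse of $1+\alpha\times\omega_2\times\omega_1^{-1}$, written as:
$$\omega_4+\alpha\times\omega_5=(1+\alpha\times\omega_2\times\omega_3)^{-1},$$
where $\omega_2\times\omega_3\in K$ and $\omega_4,\omega_5\in K$ are represented in the form of EQ. (4). Observe that:
$$\omega^{-1}=(\omega_1\times(1+\alpha\times\omega_2\times\omega_3))^{-1}=\omega_3\times(\omega_4+\alpha\times\omega_5)=\omega_3\times\omega_4+\alpha\times\omega_3\times\omega_5.$$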
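At step 120, an inversion procedure according to an embodiment computes the products $\omega_3\times\omega_4$ and $\omega_3\times\omega_5$ in the small field K. Thus $\omega^{-1}$ has been found in the form of EQ. (6):
$$\omega^{-1}=\sum_{0\le i\le s-1}\theta_i\times\delta^i+\alpha\cdot\sum_{0\le i\le s-1}\theta_{i+s}\times\delta^i,\quad\theta_i\in GF(2).$$
Applying the linear transformation B to $\theta=[\theta_0,\ldots,\theta_{r-1}]^T$ at step 122, a result $\lambda=[\lambda_0,\ldots,\lambda_{r-1}]^T\in GF(2)^r$ is obtained, such that λ=B×θ and hence:
$$\omega^{-1}=\sum_{0\le i\le r-1}\lambda_i\times\alpha^i.$$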
The overall complexity of this inversion procedure is 2 products of a fixed r×r-bit-matrix by r-bit-vectors and 2 table reads and 3 products in the small field K.
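According to an embodiment, if, at step 114, $\omega_1=0$, then $\omega=\alpha\times\omega_2$, and hence $\omega^{-1}=\alpha^{-1}\times\omega_2^{-1}$. Recall that EQ. (7) provides the following representation for $\omega_2$:
$$\omega_2=\sum_{0\le i\le s-1}\gamma_{i+s}\times\delta^i.$$
At step 130, accessing table $T_1$ yields $\theta=(\theta_0,\ldots,\theta_{s-1})\in V$ such that
$$\omega_2^{-1}=\sum_{0\le i\le s-1}\theta_i\times\delta^i.$$
Next, at step 132, the product with $\alpha^{-1}$ can be calculated from EQ. (3), $\beta\cdot\alpha^{-1}=\sum_{0\le i\le r-1}\beta_i\cdot\alpha^{i-1}=\sum_{1\le i\le r-1}\beta_i\cdot\alpha^{i-1}+\beta_0\cdot\sum_{0\le i\le r-1}a_{i+1}\cdot\alpha^i$, so that $\omega^{-1}=\alpha^{-1}\times\omega_2^{-1}$.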
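The inversion procedure of steps 112 through 132 can be sketched end to end on a toy field. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the field is GF(2^4) with p(x)=x^4+x+1, s=2, δ=α^5; the helpers `embed`, `from_gamma`, `TO_GAMMA`, and `k_mul` are hypothetical names; the dictionary `TO_GAMMA` stands in for the matrix A; and products in K are computed by embedding into F, whereas hardware would use a dedicated s-bit multiplier.

```python
R, S, POLY = 4, 2, 0b10011          # assumed example: F = GF(2^4), K = GF(2^2)
Q, MASK, ALPHA = 1 << R, (1 << S) - 1, 2

def gf_mul(a, b):
    """Reference multiplication in F (bit i of an integer <-> alpha^i)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a >> R:
            a ^= POLY
        b >>= 1
    return p

def gf_pow(a, n):
    out = 1
    for _ in range(n):
        out = gf_mul(out, a)
    return out

DELTA = gf_pow(ALPHA, (Q - 1) // ((1 << S) - 1))   # generator of K*

def embed(v):
    """K-coordinates (s bits, basis delta^i) -> the same element written in F."""
    x, d = 0, 1
    for i in range(S):
        if (v >> i) & 1:
            x ^= d
        d = gf_mul(d, DELTA)
    return x

def from_gamma(g):
    """EQ. (6): gamma -> omega_1 + alpha * omega_2 (the role of B)."""
    return embed(g & MASK) ^ gf_mul(ALPHA, embed(g >> S))

TO_GAMMA = {from_gamma(g): g for g in range(Q)}    # the role of A (brute force)

def k_mul(a, b):
    """Product in K on K-coordinates (stand-in for an s-bit multiplier)."""
    return TO_GAMMA[gf_mul(embed(a), embed(b))] & MASK

# T1: inverses in K (entry 0 is an unused placeholder); T2: (1 + alpha*k)^{-1}.
T1 = [0] + [TO_GAMMA[gf_pow(embed(v), Q - 2)] for v in range(1, 1 << S)]
T2 = [TO_GAMMA[gf_pow(1 ^ gf_mul(ALPHA, embed(v)), Q - 2)] for v in range(1 << S)]

def mul_alpha_inv(b):
    """EQ. (3): multiply by alpha^{-1} with a shift and a conditional XOR."""
    if b & 1:
        b ^= POLY
    return b >> 1

def inverse(omega):
    g = TO_GAMMA[omega]                       # step 112
    w1, w2 = g & MASK, g >> S
    if w1 == 0:                               # steps 130-132
        return mul_alpha_inv(embed(T1[w2]))
    w3 = T1[w1]                               # step 116
    g45 = T2[k_mul(w2, w3)]                   # step 118
    w4, w5 = g45 & MASK, g45 >> S
    theta = k_mul(w3, w4) | (k_mul(w3, w5) << S)   # step 120
    return from_gamma(theta)                  # step 122
```

In this toy field the tables hold only four entries each, so the saving is not visible; the point of the sketch is the data flow of two table reads and three small-field products.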
Suppose again that EQ. (7) is computed, so that $\{\omega_i\}_{1\le i\le 3}$ are already computed, wherein $\omega_3=\omega_1^{-1}$ and $\omega_2\times\omega_3$ is computed, and $\omega=\omega_1\times(1+\alpha\times\omega_2\times\omega_3)$. A log algorithm according to an embodiment proceeds by transitioning to step 124 after step 116, instead of transitioning to step 118. At step 124, table $T_3$ is accessed to find $\log(\omega_1)$, and table $T_4$ is accessed at step 126 to find $\log(1+\alpha\times\omega_2\times\omega_3)$. Finally, at step 128, the log of ω is found by computing
$$\log(\omega)=\log(\omega_1\times(1+\alpha\times\omega_2\times\omega_1^{-1}))=\mathrm{mod}(q-1,\ \log(\omega_1)+\log(1+\alpha\times\omega_2\times\omega_1^{-1})),$$
where the mod(q−1, ·) reflects the q−1 non-zero elements in the field.
According to an embodiment, if, at step 114, $\omega_1=0$, then $\omega=\alpha\times\omega_2$, and a simplified version of the above can be performed. In particular, at step 134, table $T_3$ can be accessed to find $\omega_3=\log(\omega_2)$, after which a procedure according to an embodiment jumps to step 128 to compute $\log(\omega)=\log(\alpha\times\omega_2)=\mathrm{mod}(q-1,\ 1+\omega_3)$, where log(α)=1.
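The log procedure of steps 124 through 134 can be sketched in the same toy setting. The assumptions are the same illustrative ones: GF(2^4) with p(x)=x^4+x+1, s=2, δ=α^5, hypothetical helper names, a dictionary `TO_GAMMA` standing in for the matrix A, and a brute-force `LOG` dictionary used only to fill the small tables.

```python
R, S, POLY = 4, 2, 0b10011          # assumed example: F = GF(2^4), K = GF(2^2)
Q, MASK, ALPHA = 1 << R, (1 << S) - 1, 2

def gf_mul(a, b):
    """Reference multiplication in F (bit i of an integer <-> alpha^i)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a >> R:
            a ^= POLY
        b >>= 1
    return p

def gf_pow(a, n):
    out = 1
    for _ in range(n):
        out = gf_mul(out, a)
    return out

DELTA = gf_pow(ALPHA, (Q - 1) // ((1 << S) - 1))   # generator of K*

def embed(v):
    """K-coordinates (basis delta^i) -> the same element written in F."""
    x, d = 0, 1
    for i in range(S):
        if (v >> i) & 1:
            x ^= d
        d = gf_mul(d, DELTA)
    return x

def from_gamma(g):
    return embed(g & MASK) ^ gf_mul(ALPHA, embed(g >> S))

TO_GAMMA = {from_gamma(g): g for g in range(Q)}    # the role of A (brute force)

def k_mul(a, b):
    return TO_GAMMA[gf_mul(embed(a), embed(b))] & MASK

# Brute-force discrete logs, used only offline to fill T3 and T4.
LOG, x = {}, 1
for j in range(Q - 1):
    LOG[x] = j
    x = gf_mul(x, ALPHA)

T1 = [0] + [TO_GAMMA[gf_pow(embed(v), Q - 2)] for v in range(1, 1 << S)]
T3 = [None] + [LOG[embed(v)] for v in range(1, 1 << S)]          # log of K elements
T4 = [LOG[1 ^ gf_mul(ALPHA, embed(v))] for v in range(1 << S)]   # log(1 + alpha*k)

def field_log(omega):
    g = TO_GAMMA[omega]                          # step 112
    w1, w2 = g & MASK, g >> S
    if w1 == 0:                                  # step 134: omega = alpha * omega_2
        return (1 + T3[w2]) % (Q - 1)
    w3 = T1[w1]                                  # step 116
    return (T3[w1] + T4[k_mul(w2, w3)]) % (Q - 1)   # steps 124-128
```

A round trip through `gf_pow(ALPHA, field_log(w))` returns w for every nonzero w, which checks both branches of the procedure.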
Embodiments of the disclosure also provide a scheme for inverse and log operations with tables of half the field size and very small computational cost per log or inverse operation. These operations are suitable for any finite field of characteristic 2, i.e., a finite field $GF(2^r)$. Let β∈F* be written as
$$\beta=\sum_{0\le i\le r-1}\beta_i\times\alpha^i,\quad\beta_i\in GF(2).$$
Define
$$\mu(\beta)=\min\{0\le i\le r-1:\ \beta_i=1\},$$
which is the index of the first 1-bit in β.
According to an embodiment, the following table is stored:
$$T'_1=\{1/\beta:\ \beta\in F^*\ \text{and}\ \mu(\beta)=0\}.$$
Since the first bit of β is 1, i.e., $\beta_0=1$, the size of the table is reduced by a factor of 2, so it holds that $|T'_1|=|F|/2$. Note that when $\beta_0=1$, the address of 1/β in the table is given by $(\beta_1,\ldots,\beta_{r-1})$. Let β∈F*. If, at step 212, μ(β)=0, then 1/β can be read from $T'_1$ at step 214, and hence in this case no computations are needed. Otherwise, μ(β)=s≥1. Let $\gamma=\beta/\alpha^s$ and note that μ(γ)=0. Thus 1/γ is in $T'_1$. Now, $1/\beta=\alpha^{-s}\times\gamma^{-1}$, and 1/γ can be read from $T'_1$ at step 216.
Note that a repetitive application of EQ. (2) provides a computation of the scalars $\{\beta\cdot\alpha^i\}_{1\le i\le N}$, where N≥1, with at most N·r′ GF(2) additions and an average over all β∈F of N·r′/2 additions. Likewise it follows from EQ. (3) that the scalars $\{\beta\cdot\alpha^{-i}\}_{1\le i\le N}$ can be computed with at most N·r′ GF(2) additions and an average over all β∈F of N·r′/2 additions. Thus, 1/β can be computed at step 218 with an average of s×r′/2 XORs, in a straightforward manner, by applying the aforementioned recursion s times to 1/γ. Noting that for r>s≥0, $\Pr(\mu(\beta)=s)\approx 2^{-(s+1)}$, it holds that on average there are fewer than r′/2 XORs to find 1/β for random β∈F*.
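The half-size inverse table can be sketched as follows. The field GF(2^8) with p(x)=x^8+x^4+x^3+x^2+1 (0x11D), for which α=0x02 is primitive, is an assumed example with r′=4, not a choice made in the disclosure; the names `T1P`, `mu`, and `inverse` are hypothetical. Note that dividing β by α^s is a plain right shift when the low s bits of β are zero.

```python
R, POLY = 8, 0x11D      # assumed example: x^8 + x^4 + x^3 + x^2 + 1, alpha = 0x02
Q = 1 << R

def gf_mul(a, b):
    """Reference multiplication in F (bit i of an integer <-> alpha^i)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a >> R:
            a ^= POLY
        b >>= 1
    return p

def gf_pow(a, n):
    out = 1
    for _ in range(n):
        out = gf_mul(out, a)
    return out

def mul_alpha_inv(b):
    """EQ. (3): one shift plus a conditional XOR with the coefficients of p(x)."""
    if b & 1:
        b ^= POLY
    return b >> 1

# T'_1 holds 1/beta only for beta with beta_0 = 1; the address is (beta_1..beta_{r-1}).
T1P = [gf_pow((idx << 1) | 1, Q - 2) for idx in range(Q // 2)]

def mu(beta):
    """Index of the lowest set bit of a nonzero beta."""
    s = 0
    while not (beta >> s) & 1:
        s += 1
    return s

def inverse(beta):
    s = mu(beta)                 # step 212
    gamma = beta >> s            # gamma = beta / alpha^s, exact since low s bits are 0
    out = T1P[gamma >> 1]        # steps 214 / 216: read 1/gamma
    for _ in range(s):           # step 218: multiply by alpha^{-1}, s times
        out = mul_alpha_inv(out)
    return out
```

Half of the nonzero inputs take the s=0 path and need no arithmetic at all, a quarter need one application of EQ. (3), and so on.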
Similarly, for the log operation, the following table is stored:
$$T'_2=\{\log(\beta):\ \beta\in F^*\ \text{and}\ \mu(\beta)=0\}.$$
Similar to $T'_1$, it then holds that $|T'_2|=|F|/2$. Let β∈F*. If μ(β)=0 at step 212, log(β) can be read from $T'_2$ at step 220. Otherwise, μ(β)=s≥1. Then μ(γ)=0 for $\gamma=\beta/\alpha^s$, and hence log(γ) can be read from $T'_2$ at step 222, from which log(β) can be calculated at step 224:
$$\log(\beta)=\mathrm{mod}(q-1,\ s+\log(\gamma)).$$
Observe that log(β)=log(γ)+s when log(γ)+s≤q−2, and log(β)=log(γ)+s−(q−1) when log(γ)+s>q−2.
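A corresponding sketch for the half-size log table is given below, again assuming the example field GF(2^8) with p(x)=0x11D and α=0x02; the names `EXP`, `T2P`, `mu`, and `field_log` are hypothetical, and the full `EXP` list is built only to fill the table offline and to check the result.

```python
R, POLY = 8, 0x11D      # assumed example: x^8 + x^4 + x^3 + x^2 + 1, alpha = 0x02
Q = 1 << R

def mul_alpha(b):
    """EQ. (2): multiply by alpha with a shift and a conditional XOR."""
    b <<= 1
    if b >> R:
        b ^= POLY
    return b

# EXP[j] = alpha^j for 0 <= j <= q-2, and the matching full log map (offline only).
EXP, x = [], 1
for _ in range(Q - 1):
    EXP.append(x)
    x = mul_alpha(x)
FULL_LOG = {v: j for j, v in enumerate(EXP)}

# T'_2 holds log(beta) only for beta with beta_0 = 1; the address is (beta_1..beta_{r-1}).
T2P = [FULL_LOG[(idx << 1) | 1] for idx in range(Q // 2)]

def mu(beta):
    """Index of the lowest set bit of a nonzero beta."""
    s = 0
    while not (beta >> s) & 1:
        s += 1
    return s

def field_log(beta):
    s = mu(beta)                            # step 212
    gamma = beta >> s                       # gamma = beta / alpha^s
    return (s + T2P[gamma >> 1]) % (Q - 1)  # steps 220-224
```

The only arithmetic per lookup is one small addition and a conditional subtraction of q−1, which matches the observation on the modular reduction above.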
It is to be understood that embodiments of the present disclosure can be implemented in various forms of hardware, software, firmware, special purpose processes, or a combination thereof. In one embodiment, the present disclosure can be implemented in hardware as an application-specific integrated circuit (ASIC), or as a field programmable gate array (FPGA). In another embodiment, the present disclosure can be implemented in software as an application program tangibly embodied on a computer readable program storage device. The application program can be uploaded to, and executed by, a machine comprising any suitable architecture.
The computer system 41 also includes an operating system and micro instruction code. The various processes and functions described herein can either be part of the micro instruction code or part of the application program (or combination thereof) which is executed via the operating system. In addition, various other peripheral devices can be connected to the computer platform such as an additional data storage device and a printing device.
It is to be further understood that, because some of the constituent system components and method steps depicted in the accompanying figures can be implemented in software, the actual connections between the systems components (or the process steps) may differ depending upon the manner in which the present invention is programmed. Given the teachings of the present invention provided herein, one of ordinary skill in the related art will be able to contemplate these and similar implementations or configurations of the present invention.
While the present invention has been described in detail with reference to exemplary embodiments, those skilled in the art will appreciate that various modifications and substitutions can be made thereto without departing from the spirit and scope of the invention as set forth in the appended claims.