The invention relates generally to error correction and encryption coding of data in digital communications using finite fields, and particularly to a method and apparatus for efficient multiplication in finite fields and a method for construction of arbitrarily large finite fields.
A multiplier for complex numbers may be implemented by combining the outputs of smaller multipliers operating over the subfield of real numbers. A complex number, A, may be represented as a two-component vector {a1, a0} in a hypothetical computer, with the understanding that complex A may be regarded as a polynomial over the real numbers,
A(j)=a1j+a0=Im[A]j+Re[A]
where a0 and a1 are real. Recall that the complex product C=AB is given by
C(j)=c1j+c0={a1b0+a0b1}j+{a0b0−a1b1}.
The relationship may be expressed as
C(j)=A(j)B(j) modulo p(j),
where p(x) is an irreducible polynomial of degree two over the real numbers,
p(x)=x^2+1,
and j is assumed to be a root of p(x).
A first method of determining the complex product determines four real products {a1b0, a0b1, a0b0, and a1b1} and combines the products using a real addition and a real subtraction. In the hypothetical computer, m binary bits represent a real number, and the space-time complexity of a real m-bit multiplier is approximately m^2, whereas the complexity of real addition, km, is relatively small. The space-time complexity of the complex 2m-bit multiplier by this first method is approximately 4m^2 for larger m.
Methods of determining a complex product using only three real multiplications have been known since the 1950s. A discussion is in Fast Algorithms for Digital Signal Processing, Richard E. Blahut, pp. 1-19, ISBN 0-201-10155-6, Addison-Wesley, Reading Mass. (1985). A second method of determining the complex product computes two real additions, three real multiplications, and three real subtractions: s0=a1+a0, s1=b1+b0, m1=s0s1, m2=a1b1, m3=a0b0, c0=m3−m2, and c1=m1−m2−m3. The space-time complexity using this second method is approximately 3m^2 for larger m.
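As an illustration only, the three-multiplication complex product can be sketched in Python as follows. The function name complex_mul_3 is hypothetical; the intermediate names follow the text, and the imaginary part is recovered from the product of sums by subtracting both m2 and m3.

```python
def complex_mul_3(a1, a0, b1, b0):
    """Return (c1, c0) such that c1*j + c0 = (a1*j + a0) * (b1*j + b0)."""
    s0 = a1 + a0           # first real addition
    s1 = b1 + b0           # second real addition
    m1 = s0 * s1           # = a1*b1 + a1*b0 + a0*b1 + a0*b0
    m2 = a1 * b1
    m3 = a0 * b0
    c0 = m3 - m2           # real part
    c1 = m1 - m2 - m3      # imaginary part: a1*b0 + a0*b1
    return c1, c0


if __name__ == "__main__":
    # (3 + 2j) * (7 + 5j)
    print(complex_mul_3(2, 3, 5, 7))
```

Only three real multiplications appear; the saving is paid for with extra additions and subtractions, which the text treats as comparatively cheap.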
A similar algorithm may be used to reduce the complexity of multipliers for finite fields, which are also known as Galois fields, in honor of the mathematician Evariste Galois. Early references include Sur la theorie des nombres, Bull Sci. Math. de M. Ferussac 13, 428-435 (1830), J. Math. Pures Appl. 11, 398-407 (1846), and Oeuvres math., pp. 15-23, Gauthier-Villars, Paris, 1987.
A field with q elements is denoted GF(q); the smallest finite field is the field GF(2). The finite fields constructed here are extension fields of GF(2) with m-bit symbols, denoted GF(2^m). These fields are known as fields of characteristic two, defined as a field where A+A=0 for any field symbol A. In these fields, addition is the same as subtraction.
It turns out that a minimal complexity multiplier for a finite field with a small number of bits per symbol, i.e. m<6, typically uses a standard field representation, sometimes referred to in the literature as an “alpha-basis” or “canonical” representation. In a canonical representation for GF(2^m), a symbol B is represented by m bits, denoted b0 to bm-1 here, and a distinguished element alpha (α) is defined with the understanding that
B=b0+b1α+b2α^2+ . . . +bm-1α^(m-1).
A small canonical multiplier for m-bit symbols requires (4m^2−3) gate-area units as counted here. For example, a one-bit multiplier for GF(2) is implemented as a logical AND gate, whose complexity is counted as one gate-area unit here. A one-bit adder for GF(2) is assumed to have greater complexity; it is implemented as a logical exclusive-or (XOR) gate,
a+b=a XOR b=(a AND b) NOR (a NOR b),
and counted as three gate-area equivalent units here. Prior art implementations for subfields with m=2, 3, 4 or 5 are detailed further below and their complexity is summarized in Table 1.
A non-standard “split-field” multiplier may become a less complex alternative when the number of bits per symbol is even and at least six. A lower bound on the complexity of split-field multipliers is the combined complexity of three subfield multipliers and four subfield adders. If six bit symbols for GF(64) are split into two three-bit symbols over the subfield GF(8), for example, the lower bound using three GF(8) multipliers and four GF(8) adders is 135 gate-area units. A canonical multiplier for GF(64) is larger, using 141 gate-area units. In order to achieve the potential savings, an improved split-field multiplier whose complexity meets the lower bound is desired.
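The gate-area figures quoted above can be reproduced with a short Python sketch. The function names are hypothetical, and the sketch assumes only the counting conventions stated in the text: one unit per AND gate, three units per XOR gate, and (4m^2−3) units for a canonical m-bit multiplier.

```python
XOR_AREA = 3  # gate-area units for a one-bit GF(2) adder, as counted in the text


def canonical_multiplier_area(m):
    """Gate-area units of a canonical m-bit multiplier, using the 4m^2 - 3 count."""
    return 4 * m * m - 3


def adder_area(m):
    """Gate-area units of an m-bit adder: m parallel XOR gates."""
    return XOR_AREA * m


def split_field_lower_bound(m):
    """Three m-bit subfield multipliers plus four m-bit subfield adders."""
    return 3 * canonical_multiplier_area(m) + 4 * adder_area(m)


if __name__ == "__main__":
    print(split_field_lower_bound(3))    # GF(64) split over GF(8)
    print(canonical_multiplier_area(6))  # canonical GF(64)
```

With m=3 the bound evaluates to the 135 units stated for GF(64) over GF(8), against 141 units for the canonical GF(64) multiplier.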
A prior art split-field multiplier is used to develop the lower bound and compared with an improved multiplier below. The prior art multiplier is shown as FIG. 8c in U.S. Pat. No. 4,958,348, Hypersystolic Reed-Solomon Decoder, Berlekamp et al. (1988), and discussed on pp. 4-5 of U.S. Pat. No. 5,689,452, Method and apparatus for performing arithmetic in large Galois field GF(2^n), Cameron (1994). The multiplier uses a split-field representation, where an element (or “symbol”) in a finite field G with 2m-bit symbols has each symbol represented as a polynomial over a subfield F with m-bit symbols. It is known that if a quadratic polynomial
p(x)=p2x^2+p1x+p0
is irreducible over the field F, i.e. it has no roots in F, an irreducible polynomial of the form
q(x)=x^2+x+β
may be derived from p(x), where β is an element of F. The prior art multiplier uses an irreducible polynomial of the q(x) form. According to the teaching of the '452 patent, the limitation of form is not significant because an arbitrary primitive polynomial of degree two may be converted to the desired form through an algebraic transformation.
Let ω be a root of q(x). Symbols A and B from G are represented as
A(ω)=a1ω+a0
B(ω)=b1ω+b0
where a1, a0, b1, and b0 are elements of F. The polynomial product
A(ω)B(ω)=a1b1ω^2+{a1b0+a0b1}ω+a0b0
is reduced modulo q(ω) to a polynomial of degree one or less. Because ω is a root of q(x), ω^2+ω+β=0, and it follows that C(ω)=c1ω+c0, where
c1=a1b0+a0b1+a1b1, and
c0=a0b0+βa1b1.
The desired product may be determined as follows:
t0=a1+a0,
t1=b1+b0,
m1=t0t1,
m2=a1b1,
m3=a0b0,
c0=m3+βm2, and
c1=m1+m3.
The multiplier for the field G using this prior art method has the complexity of three full multipliers and four adders for the field F plus the additional complexity, if any, of the constant multiplier used to multiply by β.
Field construction is discussed in “A New Architecture for a Parallel Finite Field Multiplier with Low Complexity Based on Composite Fields,” C. Paar, IEEE Trans. Computers, pp. 856-861, Vol. 45, No. 7, July 1996. Paar attributes the prior art method discussed above to V. Afanasyev, “On the Complexity of Finite Field Arithmetic,” Proc. Fifth Joint Soviet-Swedish Int'l. Workshop Information Theory, pp. 9-12, Moscow, USSR, January 1991.
The prior art method may be applied repeatedly to produce large finite fields as discussed further below. As a simple example, consider an m-bit symbol field F which has been extended to a 2m-bit symbol field G using a first irreducible polynomial q(x) of degree two over F. A second, 4m-bit symbol extension field H is to be constructed using a second application of the method. Paar teaches that the field G is exhaustively searched to determine those primitive polynomials q(x) with a minimum complexity with respect to constant multiplication by β (see p. 859).
Repeated application of the prior art method requires an ability to repeatedly search and identify a next member in a sequence of successive irreducible quadratic polynomials over larger and larger fields. To select the next sequence member, Paar further requires that all primitive polynomials in the set of possible irreducible quadratic polynomials be identified and that these polynomials are sorted for minimum multiplier complexity. He does not teach or suggest a method of repeatedly constructing extension fields without a plurality of searches for suitable polynomials. The search process becomes exponentially time consuming for large finite fields, limiting the size of finite fields which can be practically constructed using this prior art method. Instead, a general method to provide a sequence of extension polynomials facilitating minimal complexity multiplication without repeated searching is desired.
The invention incorporates an improved method of representing a finite field as an extension field, facilitating minimally complex multipliers for GF(2^(2m)). The improved methods are implemented in improved integrated circuits with low gate-area and are suitable for efficient implementations in software on a general-purpose computer. A “split-optimal” multiplier meets a lower bound on the gate-area complexity, constructed with the gate area of three full subfield multipliers and four subfield adders, and no additional gates. An improved method and apparatus for multiplying provide improved support for split-optimal multipliers and efficient multiplication. The method of multiplication facilitates efficient multiplicative inversion.
A related method of repeatedly extending a small finite field to construct an arbitrarily large finite field is also disclosed. Split-optimal and nearly split-optimal solutions are disclosed for a wide variety of finite fields, in the range of four to 512 bits per symbol. The improved method facilitates construction of minimally complex multipliers for large finite fields by explicitly providing improved resource sharing to implement constant multipliers, and by utilizing particular polynomials with almost all-zero constants. The use of these constants facilitates efficient software implementations. Other desirable properties are incorporated in the constructed finite fields.
An example first (or bottom) level of hierarchy for a finite field multiplier is shown in
An example last (or top) level of hierarchy for a finite field multiplier is shown in
An example middle level of hierarchy for a finite field multiplier containing three or more levels of hierarchy is shown in
A.1. Improved Split-Field Multiplication
Assume that finite field G has a split-field representation where each 2m-bit symbol is represented as a polynomial over a subfield F with m-bit symbols. In the field F, select an irreducible polynomial of the form
r(x)=x^2+γx+γ=x^2+γ(x+1)
where γ is an element of F. Preferably, the polynomial r(x) is selected so that the coefficient γ facilitates low complexity constant multiplication, as shown further below.
Let ω be a root of r(x). Symbols A and B from G are represented as
A(ω)=a1ω+a0
B(ω)=b1ω+b0
where a1, a0, b1, and b0 are elements of F. The polynomial product
A(ω)B(ω)=a1b1ω^2+{a1b0+a0b1}ω+a0b0
is reduced modulo r(ω) to obtain C(ω)=c1ω+c0, where
c1=a1b0+a0b1+γa1b1, and
c0=a0b0+γa1b1.
The desired product may be determined as follows:
m1=a0b1,
t0=γb1+b0,
t1=a1+a0,
m2=a1t0,
m3=b0t1,
c0=m3+m2, and
c1=m1+m2.
These equations incorporate the complexity of three full subfield multipliers and four subfield adders plus the additional complexity, if any, of a constant multiplier for γ. All operations are performed over the subfield F.
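A minimal Python sketch of these equations is given below for GF(16) split over GF(4), as an illustration rather than a description of any circuit in the drawings. It assumes GF(4) symbols are stored as two-bit integers with α^2=α+1, and represents a symbol of G as a pair (c1, c0); the function names are hypothetical. A schoolbook reference product is included so that the three-multiplier form can be checked against the reduced coefficients c1 and c0 given above.

```python
def gf4_mul(a, b):
    """Multiply in GF(4) = GF(2)[alpha]/(alpha^2 + alpha + 1); symbols are 2-bit ints."""
    a1, a0 = a >> 1, a & 1
    b1, b0 = b >> 1, b & 1
    hi = (a1 & b0) ^ (a0 & b1) ^ (a1 & b1)  # coefficient of alpha
    lo = (a0 & b0) ^ (a1 & b1)              # constant term
    return (hi << 1) | lo


def split_mul(a, b, gamma):
    """Product in G modulo r(x) = x^2 + gamma*(x + 1), using three GF(4) multipliers.

    a and b are pairs (a1, a0) and (b1, b0) of GF(4) symbols.
    """
    a1, a0 = a
    b1, b0 = b
    m1 = gf4_mul(a0, b1)
    t0 = gf4_mul(gamma, b1) ^ b0   # constant multiplication by gamma, then an adder
    t1 = a1 ^ a0
    m2 = gf4_mul(a1, t0)
    m3 = gf4_mul(b0, t1)
    return (m1 ^ m2, m3 ^ m2)      # (c1, c0)


def split_mul_reference(a, b, gamma):
    """Schoolbook form: c1 = a1b0 + a0b1 + gamma*a1b1, c0 = a0b0 + gamma*a1b1."""
    a1, a0 = a
    b1, b0 = b
    g = gf4_mul(gamma, gf4_mul(a1, b1))
    return (gf4_mul(a1, b0) ^ gf4_mul(a0, b1) ^ g, gf4_mul(a0, b0) ^ g)


if __name__ == "__main__":
    pairs = [(h, l) for h in range(4) for l in range(4)]
    ok = all(split_mul(a, b, 2) == split_mul_reference(a, b, 2)
             for a in pairs for b in pairs)
    print("three-multiplier form matches schoolbook form:", ok)
```

The agreement between the two forms is a polynomial identity and holds for any γ; irreducibility of r(x) is needed only for G to be a field.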
In
Note that the first subfield multiplier 209 has an input operand, b1 207, and a first subfield adder 210 has the same input operand b1 207, but scaled by γn−1 in signal 208. Often, an auxiliary output 208 of the first subfield multiplier can be used as a source for the scaled operand with negligible additional cost, as demonstrated in the following sections.
A.2. Resource Sharing with a Canonical Subfield
Let us first consider a finite field G in a split-field representation where the subfield F is an m-bit subfield in a canonical representation, with m=2, 3, 4, or 5. Each symbol A in the field F is represented by m binary coefficients {am-1, . . . , a1, a0} and associated with a polynomial
A(α)=a0+a1α+ . . . +am-1α^(m-1),
where α is a root of p(x), an irreducible polynomial of degree m over GF(2). Lists of suitable binary irreducible polynomials may be found in W. Wesley Peterson and E. J. Weldon, Jr., Error-Correcting Codes, Second Edition, Appendix C, pp. 472-492, ISBN 0-262-16-039-0, The MIT Press, Cambridge, Mass. (1980).
Preferably, the polynomial p(x) has a minimum number of nonzero coefficients, resulting in simpler reduction modulo p(x). Preferred trinomials of the form
p(x)=x^m+x+1
are irreducible over GF(2) and result in minimal complexity multipliers with minimal delay for the field F when m=2, 3, or 4. When m=5, a preferred trinomial, p(x)=x^5+x^3+1, may be used instead.
In some applications, it is preferred that the polynomial p(x) is a primitive polynomial, defined as follows. Let polynomial p(x) be irreducible over a field F, and let ω be a root of p(x). The polynomial is used to generate a field G, each element of G representing an equivalence class of polynomials modulo p(ω) over F. Suppose that G has N distinct symbols. The polynomial p(x) is considered primitive over F if the powers of ω modulo p(ω), i.e. ω^1 modulo p(ω), ω^2 modulo p(ω), ω^3 modulo p(ω), and so on, are the N−1 distinct nonzero elements of the field G. In this case, the polynomial root, ω, is known as a primitive element of the field G and can be used as a base for logarithm and antilog tables. Each of the example polynomials above, for m in the range of two to five, is primitive over the field GF(2).
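The definition of a primitive polynomial can be checked directly for small binary polynomials. The following Python sketch is illustrative only: it assumes a polynomial over GF(2) is stored as an integer bitmask (bit k holds the coefficient of x^k), requires m of at least two and a nonzero constant term, and uses hypothetical function names.

```python
def times_alpha(v, p, m):
    """Multiply the m-bit symbol v by alpha and reduce modulo the degree-m polynomial p."""
    v <<= 1
    if (v >> m) & 1:   # a degree-m term appeared: subtract (XOR) p
        v ^= p
    return v


def order_of_alpha(p, m):
    """Smallest k > 0 with alpha^k = 1 modulo p (assumes m >= 2 and p has constant term 1)."""
    v, k = 0b10, 1     # v = alpha^1
    while v != 1:
        v = times_alpha(v, p, m)
        k += 1
    return k


def is_primitive(p, m):
    """True if the powers of alpha run through all 2^m - 1 nonzero symbols."""
    return order_of_alpha(p, m) == (1 << m) - 1


if __name__ == "__main__":
    examples = {2: 0b111, 3: 0b1011, 4: 0b10011, 5: 0b101001}
    for m, p in examples.items():
        print(m, bin(p), is_primitive(p, m))
```

The four bitmasks encode x^2+x+1, x^3+x+1, x^4+x+1, and x^5+x^3+1. An irreducible polynomial that is not primitive, such as x^4+x^3+x^2+x+1, yields a shorter cycle of powers.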
A minimal complexity subfield multiplier for a canonical subfield F is modified to be suitable for the purposes here in building larger fields. An example modified subfield multiplier 100 is shown in
U(α)=u0+αu1+ . . . +α^(m-1)um-1,
then it follows that the product of U and T,
U(α)T(α) modulo p(α)=u0T(α)
+u1{αT(α) modulo p(α)}
+ . . .
+um-1{α^(m-1)T(α) modulo p(α)}.
The coefficients of the term [α^k T(α) modulo p(α)] may be determined from the coefficients of the previous term, [α^(k-1) T(α) modulo p(α)], by multiplying by α and reducing modulo p(α). For example, if the binary m-tuple
{vm-1, . . . ,v1,v0}
represents an element V of F with m=2, 3, or 4, the element {αV modulo p(α)} is represented by
{vm-2, . . . ,v1,vm-1+v0,vm-1}
The scaled element can be implemented using one XOR gate and a rearrangement of bits. Each circled “α” represents an α-multiplier 103 in
Each sub-product symbol, {ukα^kT(α) modulo p(α)}, can then be implemented as a one-by-m product using m parallel AND gates with a common input uk and an m-bit input {α^kT(α) modulo p(α)}. In
For example,
When a canonical multiplier is used as a subfield multiplier in a larger field multiplier, the subfield multiplier is explicitly modified to support resource sharing in the larger multiplier by providing useful auxiliary outputs, such as those shown in
In various examples below, one or more auxiliary outputs may be left unused, or there may be additional auxiliary outputs referred to but not shown in
αT(α) modulo p(α),
—the vector {t1+t0, t1}—is an internally available scaled input that can be explicitly provided as a first auxiliary output, AUX1={t1+t0, t1}. In addition, another low α-power scaling of the input T,
α^2T=α^2T(α) modulo p(α)=t0α+(t1+t0),
can be provided as a second auxiliary output, AUX2={t0, t1+t0}, at negligible gate-area cost by reusing the output of the (t1+t0) XOR gate and arranging output bits accordingly.
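The structure of the modified canonical multiplier can be sketched in Python as follows, as an illustration under stated assumptions. The sketch assumes p(x)=x^m+x+1 with m=2, 3, or 4, stores a symbol as an m-bit integer with bit k holding the coefficient of α^k, and returns the successive scalings of T in a list named aux; these names are hypothetical and do not correspond to reference numerals in the drawings.

```python
def alpha_times(v, m):
    """alpha*V modulo p(alpha) for p(x) = x^m + x + 1.

    The bits move up one place, v_{m-1} wraps around to bit 0, and a single
    XOR adds v_{m-1} into bit 1, giving {v_{m-2}, ..., v_1, v_{m-1}+v_0, v_{m-1}}.
    """
    top = (v >> (m - 1)) & 1
    shifted = (v << 1) & ((1 << m) - 1)   # drop v_{m-1}; bit 0 is now zero
    return shifted ^ (top * 0b11)         # bit 1 ^= top (the XOR gate), bit 0 = top


def canonical_mul_with_aux(u, t, m):
    """Return (U*T modulo p, aux) where aux[k] = alpha^k * T modulo p(alpha)."""
    aux = [t]
    for _ in range(m - 1):
        aux.append(alpha_times(aux[-1], m))
    product = 0
    for k in range(m):
        if (u >> k) & 1:        # u_k gates the k-th scaled copy of T (m AND gates)
            product ^= aux[k]   # partial products are summed with XOR gates
    return product, aux


if __name__ == "__main__":
    # GF(4): the scalings alpha*T and alpha^2*T for every input T
    for t in range(4):
        _, aux = canonical_mul_with_aux(1, t, 2)
        aux1 = aux[1]
        aux2 = alpha_times(aux1, 2)
        print(f"T={t:02b}  AUX1={aux1:02b}  AUX2={aux2:02b}")
```

For m=2 the list holds T and αT; the second scaling α^2T is obtained here by one more call to alpha_times, which mirrors the observation that it reuses the same (t1+t0) term with the bits rearranged.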
To continue with this example, suppose that a GF(16) multiplier is then constructed using the split-field representation over GF(4). An irreducible polynomial r(x) over GF(4) of the form
r(x)=x^2+γx+γ
is chosen to generate G as an extension field of F, preferably with multiplication by γ facilitated by one or more auxiliary outputs of the subfield multiplier. Here, the selection of a polynomial r(x) with either {γ0=α} or {γ0=α^2} provides a primitive polynomial for constructing G. By using a modified canonical subfield multiplier for GF(4) 100 with two corresponding auxiliary outputs as multiplier 209 in
Note that, as a first approximation of complexity, only additional gates are counted here. Additional complexity costs of buffering signals, of providing additional outputs, and of routing additional signals are mostly ignored here.
This example split-optimal multiplier is considered the best design here for a split-field representation of GF(16), meeting the lower bound by using only three GF(4) multipliers and four GF(4) adders to implement the GF(16) multiplier. The complexity of the improved split-field design is 63 gate-area units.
As a final complexity check, the best split-field design for GF(16) is compared to other multipliers for GF(16), such as a smaller canonical GF(16) multiplier using 61 gate-area units. When the gate area is equal or nearly equal, other issues may arise. In some applications, implementations using only primitive polynomials may be preferred or required. A circuit for a low complexity multiplicative inverter may be required as well. The suitability of the multiplier for G as a building block in a split-field multiplier for a larger field in a hierarchical design may also be considered. The hierarchical approach is explored further in the following section, and inversion is in the section after that.
A.3. Resource Sharing with a Split-Field Subfield
In the previous section, a first extension field G is constructed as a split-field representation over a canonical field F. In this section, let us denote the first field F as G0, and the first extension field, G, as G1. The approach advocated here provides optimal and near-optimal split-field multipliers for fields further extended from G1, providing a sequence of fields, G2, G3, and so on, each with a successive doubling of the field symbol size. In a multi-layer hierarchical design,
For example, G1 may be constructed with a split-field multiplier as in the previous section with 4, 6, 8, or 10 bit symbols, as an extension field of G0, a canonical subfield F. In this case, a first extension polynomial r0(x)=x^2+γ0x+γ0 with root ω0 is assumed to generate G1. The G1 multiplier is modified to explicitly support a G2 multiplier with 8, 12, 16, or 20 bit symbols. In this case, the G2 hierarchical design would have three layers.
The 2m-bit split-field multiplier of
A(ωn−1)=a0+ωn−1a1,
where ωn−1 is a root of rn−1(x)=x^2+γn−1x+γn−1, an irreducible polynomial of degree two over a subfield Gn−1.
A polynomial of the form
rn(x)=x^2+γnx+γn
is irreducible over Gn and is used to generate Gn+1. Generally, the polynomial rn(x) is selected so that the constant multiplication by γn is easily implemented.
In preferred embodiments, the constant γn has a minimum number of nonzero coefficients. The constant γn is an element of Gn, with components {f0,f1} and associated polynomial representation
γn(ωn−1)=f0+ωn−1f1
where f0 and f1 are symbols of Gn−1. A constant γn with f0=0 is preferably selected, simplifying multiplication. It turns out that a constant of this form is always available for the fields of interest here.
For example, if n=1, a preferred γ1 is of the form
γ1(ω0)=s1ω0
where s1 is a scalar in the field G0. To explicitly support a G2 multiplier, the G1 multiplier is augmented to provide an auxiliary output corresponding to γ1B,
If an auxiliary output AUX is given by
AUX(ω0)=aux0+ω0aux1
then the two components of AUX are
aux1=s1(γ0b1+b0), and
aux0=s1γ0b1.
These components are often available without adding gates to the G1 multiplier, providing a split-optimal G2 multiplier. As one example, let G0 be a canonical representation of the five bit symbol field GF(32), generated by the polynomial
p(x)=x^5+x^3+1,
a primitive polynomial over GF(2). Let α be a root of p(x). Let G1 be a split-field representation of the 10-bit symbol field GF(1024), generated by the polynomial
r0(x)=x^2+α^3x+α^3,
a primitive polynomial over GF(32). A split-optimal multiplier for the field GF(1024) is constructed as shown in
r1(x)=x^2+γ1x+γ1
where s1=1 and γ1=s1ω0=ω0. The polynomial
r1(x)=x^2+ω0x+ω0
is primitive over the split-field GF(1024) and can be used to generate GF(2^20) with a doubly split-optimal multiplier. The first component
aux0=γ0b1
is available at auxiliary output 208 of
aux1=s1(γ0b1+b0)=γ0b1+b0
is available at the output t0 of the first adder 210, equal to the sum of auxiliary output 208 and b0 206. The two components in this case can be combined in an auxiliary output (not shown in
Another special case (not shown in
aux0=s1γ0b1=b1
is available as signal 206, one component of input B 202. The other component
aux1=s1(γ0b1+b0)
may be available as an auxiliary output of the second subfield multiplier 209 of
A third split-optimal case (not shown in
In general, the split-field multiplier for Gn provides resources for multiplication by the constant γn by supplying one or more auxiliary outputs. An augmented split-field multiplier circuit 300 is shown in
In
γn−1=sn−1Πn−1
where sn−1 is a scalar from G0, and the product symbol Πi is defined by Π0=1 and
Πi=ωi−1Πi−1
for i>0. The multiplier for Gn is modified to provide an auxiliary output
In a preferred embodiment, the two components of γnB,
aux0=snΠn−1sn−1Πn−1b1 and
aux1=snΠn−1(sn−1Πn−1b1+b0),
are available without adding additional gates to the multiplier for Gn, providing an auxiliary output to support a split-optimal multiplier for Gn+1. Alternatively, one or more auxiliary outputs of the multiplier Gn are modified or combined to facilitate easy multiplication by γn in the multiplier for Gn+1.
When the field extension method is applied repeatedly, the potential gate area savings of providing multiple auxiliary outputs may be outweighed by the need to accommodate additional bus area and routing for each additional auxiliary output, and the assumption that additional auxiliary outputs can be added without additional cost becomes less valid.
The auxiliary output 303 of multiplier 209 of
γn−1t0=sn−1Πn−1t0=sn−1Πn−1(sn−1Πn−1b1+b0)=sn−1aux1/sn.
Let vn=sn/sn−1.
If vn is not one, the component aux1 can be obtained by re-scaling signal 303 by vn in a constant multiplier. Similarly, auxiliary output 302 is a scaling of the T input of the third multiplier 209,
γn−1b0=sn−1Πn−1b0.
The sum of auxiliary output 302 and auxiliary output 303 in a fifth adder 210 of
sn−1aux0/sn.
The component aux0 can be obtained by re-scaling the output of the fifth adder 210 by vn in a constant multiplier. The two pre-scaled components of the auxiliary output are combined in bus 304, re-scaled in constant multiplier 305, and output on AUX 306.
As discussed above, a few first layers in a hierarchical design can be split-optimally crafted by appropriately selecting values for γ1, γ2, and so on to use available resources, and, if necessary, a plurality of auxiliary outputs may be added to explicitly provide resource sharing for one or more additional layers in a similar manner. However, as the number of hierarchical layers increases and the constructed field grows exponentially, so does the additional bus area for additional auxiliary output. For higher levels of hierarchy, using a relatively small number of extra gates to facilitate a chain of constant multiplications from a single auxiliary output, as in
A.4. Matching Inverter for a Split-Field Multiplier
When G is in a split-field representation as described here, a low complexity inverter for the field G is available. Let A be a nonzero symbol in a field G with 2m-bit split-field symbols, generated by an irreducible polynomial r(x)=x^2+γx+γ over an m-bit subfield F. Let ω be a root of r(x), and let A be such that
A(ω)=a1ω+a0.
Let B be the element associated with
B(ω)=a1ω+(a0+γa1)
Note that d=AB is given by
If A is nonzero, then d is nonzero, and d is a member of the subfield F. Let e be the multiplicative inverse of d in the subfield F,
e=1F/d.
It follows that C=eB is the multiplicative inverse of A in G. The following equations can be used to determine C(ω), the multiplicative inverse of A(ω):
s=a0+γa1,
d=a0s+γa1^2,
e=1/d,
c0=es,
c1=ea1,
where
C(ω)=c1ω+c0.
In these equations, all operations are performed over the subfield F. In particular, the formulas express the inverse for field G in terms of the simpler inverse for subfield F. If G is GF(16) implemented as a split-field over GF(4), for example, nonzero d is an element of GF(4), and d has two binary components {d1, d0}. The inverse of d has components
{e1,e0}={d1,d1+d0}.
In comparing the inverter for a split-field representation to the inverter for a canonical representation, the equations for a multiplicative inverse for the latter tend to contain a larger number of terms in a large finite field and are not easily simplified.
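The inversion equations can be exercised with a short Python sketch for GF(16) split over GF(4). This is an illustration only. It assumes GF(4) symbols are two-bit integers with α^2=α+1, a symbol of G is a pair (c1, c0), and γ is α (the integer 2) or α^2 (the integer 3), the two values for which r(x) is irreducible over GF(4); the function names are hypothetical.

```python
def gf4_mul(a, b):
    """Multiply in GF(4) = GF(2)[alpha]/(alpha^2 + alpha + 1)."""
    a1, a0 = a >> 1, a & 1
    b1, b0 = b >> 1, b & 1
    return (((a1 & b0) ^ (a0 & b1) ^ (a1 & b1)) << 1) | ((a0 & b0) ^ (a1 & b1))


def gf4_inv(d):
    """Inverse of a nonzero GF(4) symbol: {e1, e0} = {d1, d1 + d0}."""
    d1, d0 = d >> 1, d & 1
    return (d1 << 1) | (d1 ^ d0)


def split_mul(a, b, gamma):
    """Product in G modulo r(x) = x^2 + gamma*(x + 1), three-multiplier form."""
    (a1, a0), (b1, b0) = a, b
    m1 = gf4_mul(a0, b1)
    t0 = gf4_mul(gamma, b1) ^ b0
    m2 = gf4_mul(a1, t0)
    m3 = gf4_mul(b0, a1 ^ a0)
    return (m1 ^ m2, m3 ^ m2)


def split_inv(a, gamma):
    """Inverse of a nonzero symbol A = a1*omega + a0, using one subfield inversion."""
    a1, a0 = a
    s = a0 ^ gf4_mul(gamma, a1)                            # s = a0 + gamma*a1
    d = gf4_mul(a0, s) ^ gf4_mul(gamma, gf4_mul(a1, a1))   # d = a0*s + gamma*a1^2
    e = gf4_inv(d)                                         # e = 1/d in the subfield
    return (gf4_mul(e, a1), gf4_mul(e, s))                 # (c1, c0)


if __name__ == "__main__":
    gamma = 2  # alpha
    nonzero = [(h, l) for h in range(4) for l in range(4) if (h, l) != (0, 0)]
    print(all(split_mul(a, split_inv(a, gamma), gamma) == (0, 1) for a in nonzero))
```

The only inversion performed is the two-gate GF(4) inverse; everything else is subfield multiplication and addition.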
B.1. Construction of Arbitrarily Large Finite Fields
Consider the problem of constructing multipliers for a fairly large finite field G, such as one with 512 bit symbols. A problem with prior art methods is that the identification of one or more irreducible polynomials needed for construction of very large finite fields may be impractically difficult. For example, a prior art construction method for a field with 512 bit symbols as a canonical representation over GF(2) requires finding an irreducible polynomial of degree 512 over GF(2). Because tabulated polynomials are limited, the field constructor must typically conduct one or more polynomial searches. To check if an arbitrary binary polynomial of degree 512 is irreducible, a searcher determines if the arbitrary polynomial has any binary polynomial factors of degree 256 or less. A search of this magnitude is impractically time-consuming.
An improved method for constructing arbitrarily large finite fields is as shown in a Field Construction flowchart of
In step 401, various initializations occur. The index i in Gi is initialized to zero, the variable symbits is initialized to km, and an initial product Π0 is initialized to 1. The fields constructed here are extension fields of a field F, represented as a canonical GF(2^m), with m an integer greater than zero. An extension field of F is selected as an initial “search” field G0. Typically, a relatively small field, such as GF(16), is selected as the search field. The field G0 may be the same as F, or may be constructed as an extension field of F by any known method, such as by selecting an irreducible polynomial of degree k over F to generate G0. The number of bits used to represent an element in the field G0 is km, where k is an integer greater than zero. Thereafter, each successive field in the sequence of finite fields doubles the symbol size.
The only search in the field construction method occurs once in step 402. The field G0 is searched to find a set of elements S. An element s of G0 becomes a member of S if and only if the polynomial
r(x)=x^2+s(x+1)
is irreducible over G0. The results of example searches are shown below.
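A brute-force version of the single search in step 402 can be sketched in Python for a canonical search field. The sketch is illustrative and makes several assumptions: G0 is GF(2^m) generated by a binary polynomial p stored as a bitmask, symbols are m-bit integers, and a quadratic is declared irreducible when no element of G0 is a root. The function names gf_mul and find_S are hypothetical.

```python
def gf_mul(a, b, p, m):
    """Multiply in GF(2^m) = GF(2)[x]/p(x); a, b and p are integer bitmasks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:   # reduce as soon as a degree-m term appears
            a ^= p
    return r


def find_S(p, m):
    """Return all s in G0 for which r(x) = x^2 + s*(x + 1) has no root in G0."""
    size = 1 << m
    members = []
    for s in range(1, size):   # s = 0 gives x^2, which is reducible
        has_root = any(
            (gf_mul(x, x, p, m) ^ gf_mul(s, x ^ 1, p, m)) == 0
            for x in range(size)
        )
        if not has_root:
            members.append(s)
    return members


if __name__ == "__main__":
    print(find_S(0b111, 2))     # search field GF(4), p(x) = x^2 + x + 1
    print(find_S(0b10011, 4))   # search field GF(16), p(x) = x^4 + x + 1
```

Because the search field is small, the exhaustive root test is inexpensive, and the resulting set is reused for every later extension.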
A sequence of extension fields is then constructed from G0, each successor subfield constructed using an irreducible polynomial of degree two, ri(x), over the predecessor subfield. Determination of a successor field begins in step 403. In step 403, a particular preferred irreducible polynomial is selected by choosing a particular value si in S. The coefficients of the preferred irreducible polynomial have a deterministic product term and a scaling by the chosen member of S. Preferred polynomials help to minimize multiplier complexity by having only one non-zero search field component. The constructed finite fields may incorporate other preferred characteristics, such as being generated solely from primitive polynomials. If so, the choice of a particular value si may depend in whole or in part on the desired characteristics. For example, if only primitive polynomials are desired, each potential polynomial ri(x) corresponding to a choice for si in S may be tested to check if it is a primitive polynomial.
When a suitable irreducible polynomial has been selected, successor field construction is completed in step 404. The variable ωi is an assumed root of the selected polynomial ri(x). An element C of Gi+1 is represented as a two-component vector
C=[c0,c1]
where c0 and c1 are elements of Gi. The element C is associated with the polynomial
C(ωi)=c0+c1ωi.
Also in step 404, the running product
Πi+1=ωiΠi
is updated, the constructed field index i is incremented, and the variable symbits is doubled.
Step 405 checks if the most recent successor field is sufficiently large for the purposes at hand. For example, the largest field generated may be used for error correction coding to protect data. In the case of error correction coding using Reed Solomon codes, the amount of data that may be protected by a given codeword is limited by the size of the constructed finite field, and step 405 may check to see if a sufficient amount of data can be protected.
If the constructed field is sufficiently large, the field construction method is complete and step 405 proceeds to termination of the Field Construction method in step 406. Otherwise, the method returns to step 403 to select a polynomial for a next successor field. Note that a successor polynomial is selected by choosing a value si in the previously found set S, without the need for a successive search. The flowchart loop of steps 403 to 405 continues until the constructed field present at step 405 is sufficiently large.
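The loop of steps 401 to 406 can be sketched in Python for the simplest case, a search field G0=GF(2) with S={1}. This is a sketch under stated assumptions, not the flowchart itself: a symbol of Gi is stored as an integer of symbits bits whose upper half is the ωi−1 component, so the running product Πi+1=ωiΠi is the integer Πi shifted up by symbits places. The function name construct and the returned list of constants are hypothetical.

```python
def construct(target_bits):
    """Return the constants gamma_i = s_i * Pi_i, one per extension, for G0 = GF(2)."""
    # Step 401: initializations (k = m = 1, so symbits starts at one bit).
    i, symbits, product = 0, 1, 1
    # Step 402: the only search. Over GF(2), x^2 + x + 1 is irreducible, so S = {1}.
    S = [1]
    gammas = []
    while True:
        # Step 403: choose s_i in S; the polynomial is r_i(x) = x^2 + gamma_i*(x + 1).
        s_i = S[0]
        gamma_i = product if s_i == 1 else None  # scaling by s_i = 1 is trivial
        gammas.append(gamma_i)
        # Step 404: Pi_{i+1} = omega_i * Pi_i lives in G_{i+1}; only its upper half is nonzero.
        product <<= symbits
        i += 1
        symbits *= 2
        # Step 405: stop when the constructed field is large enough.
        if symbits >= target_bits:
            break
    # Step 406: termination.
    return gammas


if __name__ == "__main__":
    print([bin(g) for g in construct(8)])
    print(len(construct(512)))
```

Each constant has a single nonzero bit, and no search is repeated inside the loop; reaching 512-bit symbols from GF(2) takes nine doublings.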
The method is demonstrated with various examples. In the examples, two preferred forms of search fields F are a field GF(2^m) represented with a canonical basis, or a field GF(2^m) in a split-field representation. The examples demonstrate efficient multipliers with symbol sizes up to 512 bits, some generated exclusively from primitive polynomials. The examples were all found on my low-horsepower home computer, demonstrating the practicality of the improved field generation method.
B.2. Proof of the Validity of the Method
Proposition: The polynomial
rn(x)=x^2+γnx+γn
is irreducible over Gn and can therefore be used to extend field Gn to successor field Gn+1.
Proof: The proof proceeds by induction on n. A first field, G0, is searched to find a subset of field elements, S, such that
p(x)=x^2+sx+s
is irreducible over G0 if and only if s is a member of S. An arbitrary first member of S, s0, is selected to generate an extension field G1 using a first irreducible polynomial
p0(x)=x^2+s0x+s0.
Let ω0 be a root of p0(x). The extension field G1 is in a split-field representation, where an arbitrary element R of G1 is represented as a two-component vector with
R=r1ω0+r0,
where r0 and r1 are elements of G0. Consider a second polynomial
p1(x)=x^2+s1ω0x+s1ω0=x^2+s1Π1(x+1)
where s1 is an element of G0. The polynomial p1(x) is irreducible over G1 if and only if p1(x) has no root R in G1. It may be observed that
It follows that p1(R)=0 if and only if the two components of p1(R) in G0 are both zero. If the two components are zero, it follows that the sum of the components is zero, i.e.
r0^2+s1(r0+1)=0.
This equation cannot be satisfied in the first field G0 if s1 is an element of S. Therefore, with s1 an element of S, p1(x) has no roots and is irreducible.
By inductive hypothesis, assume that an arbitrary sequence of members of S,
{s0,s1, . . . ,sn−1},
has been selected as scalars to produce a sequence of irreducible polynomials
{p0(x),p1(x), . . . ,pn−1(x)},
where the polynomial
pk(x)=x^2+skΠk(x+1)
is irreducible over the field Gk and is used to generate a split-field Gk+1.
Let ωn−1 be a root of pn−1(x). The extension field Gn is in a split-field representation, where an arbitrary element R of Gn is represented as a two-component vector with
R=r1ωn−1+r0,
where r0 and r1 are elements of Gn−1. Consider an nth polynomial
pn(x)=x^2+snΠn(x+1)
where sn is an element of G0. The polynomial pn(x) is irreducible over Gn if and only if pn(x) has no root R in Gn. It may be observed that
It follows that pn(R)=0 if and only if the two components of pn(R) in Gn−1 are both zero. If both components are zero, the sum of the components is zero, i.e.
r0^2+snΠn−1(r0+1)=0.
By inductive hypothesis, this equation cannot be satisfied in the field Gn−1 if sn is an element of S. Therefore, pn(x) has no roots and is irreducible.
B.3. Examples of Application of the Method
If the search field is GF(2), the set S={1}. By definition, the constants {sn} are all members of S, with sn=1 for all n. Extension fields of search field GF(2) are then constructed as shown in Table 2.
The first line in Table 2 indicates that the first extension, with n=0, uses the polynomial r0(x)=x2+x+1 to generate G1=GF(4) as an extension field of G0=GF(2). Let ω0 be a root of r0(x). The second line indicates that the polynomial
r_1(x) = x^2 + 10_2 x + 10_2
is irreducible over G1 and is used to generate G2=GF(16). Here, the notation 10_2 is shorthand used to indicate that γ1, as a member of GF(4), is a two component vector,
[a1,a0]=[1,0]
over GF(2), with the understanding that γ1=a1ω0+a0=ω0. The third line indicates that the polynomial
r_2(x) = x^2 + 1000_2 x + 1000_2
is irreducible over G2 and is used to generate G3=GF(256). Here, the notation 1000_2 indicates that γ2, as a member of GF(16), is a two component vector,
[b1,b0] = [10_2, 00_2]
over GF(4), with the understanding that γ2=b1ω1+b0=ω1ω0.
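The first lines of this construction can be checked by brute force. The following is a minimal sketch, not the disclosed implementation: it assumes that an element of Gn is stored as a 2^n-bit integer whose upper half is the ω coefficient, so that each γn is the integer with only its top bit set. The names `mul` and `has_root` are hypothetical.

```python
def mul(a, b, n):
    """Multiply a and b in G_n, whose elements are 2**n-bit integers."""
    if n == 0:
        return a & b                      # G_0 = GF(2)
    half = 1 << (n - 1)                   # bits per component
    mask = (1 << half) - 1
    a1, a0 = a >> half, a & mask          # a = a1*omega + a0
    b1, b0 = b >> half, b & mask
    gamma = 1 << (half - 1)               # gamma_{n-1}: only the top bit is set
    # omega^2 = gamma*omega + gamma, so a1*b1*omega^2 feeds both components.
    t = mul(gamma, mul(a1, b1, n - 1), n - 1)
    hi = t ^ mul(a1, b0, n - 1) ^ mul(a0, b1, n - 1)
    lo = t ^ mul(a0, b0, n - 1)
    return (hi << half) | lo


def has_root(n):
    """Return True if x^2 + gamma_n*x + gamma_n has a root in G_n."""
    bits = 1 << n
    gamma = 1 << (bits - 1)
    return any(
        (mul(x, x, n) ^ mul(gamma, x, n) ^ gamma) == 0
        for x in range(1 << bits)
    )


# Stages n = 0, 1, 2, 3 correspond to extending GF(2), GF(4), GF(16), GF(256).
roots_found = [has_root(n) for n in range(4)]
```

A quadratic with no root in Gn is irreducible over Gn, so every entry of `roots_found` is expected to be `False` under these assumptions.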
According to the proposition, an arbitrarily large finite field can be constructed by proceeding in a similar manner. Because each γn has only one nonzero component, multiplication by the coefficient γn is relatively easy, and scaling by the search field scalar, sn=1, is trivial. The schematics of
As discussed in the previous section, there are disadvantages for this construction over GF(2). The constructed multiplier for GF(16) with 63 gate-area units is 3% larger than a canonical multiplier for GF(16) with 61 gate-area units, and successor fields stem from the constructed GF(16) multiplier. On the other hand, successive multipliers may be made split-optimal with a minimal number of auxiliary outputs.
Another potential disadvantage of this example is that the third extension polynomial and successive polynomials are not primitive polynomials. In the fourth column of Table 2, a preferred primitive element αn for the field Gn+1 is listed. When ωn is the preferred primitive element of Gn+1, the polynomial rn(x) is primitive. In some applications, such as Reed Solomon coding over finite fields, a simple constant multiplier for a primitive element of the field is desired, implying a preference for primitive polynomials.
If the polynomial is not primitive, a primitive element of the field must typically be found and provided as in column 4 of Table 2. If the goal is to exclusively provide primitive polynomials at each construction stage, the choice of GF(2) as the search field is too constraining.
As another example, let the search field F=GF(4), an extension field of GF(2) using the primitive polynomial p(x)=x2+x+1. Let α0 be a root of p(x). The set S is the set of all suitable search field values for γ in GF(4), so that
r(x) = x^2 + γx + γ
is irreducible if and only if γ is a member of S. Let us denote each of the four members of GF(4) as a duobinary digit, {0_4 = 00_2, 1_4 = 01_2, 2_4 = 10_2, 3_4 = 11_2}. In this notation, the set
S = {2_4, 3_4} = {α_0, α_0^2}.
It turns out that either of the two choices for γ0 provides a primitive polynomial over GF(4). In Table 3, large fields are constructed using GF(4) as the search field. Each is constructed using only primitive polynomials.
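The search over GF(4) and the primitivity of both choices can be illustrated with a short sketch. It assumes GF(4) elements are stored as two-bit integers with α0 = 2, and it tests primitivity by computing the multiplicative order of a root ω0 in the sixteen-element extension. The names `mul4`, `mul_ext`, and `order_of_root` are hypothetical.

```python
def mul4(a, b):
    """Multiply in GF(4) = GF(2)[x]/(x^2 + x + 1); elements are 2-bit ints."""
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i
    if r & 0b100:                 # reduce the x^2 term
        r ^= 0b111
    return r


# s is suitable when x^2 + s*x + s has no root in GF(4).
S = [s for s in range(4)
     if all(mul4(x, x) ^ mul4(s, x) ^ s for x in range(4))]


def mul_ext(a, b, g):
    """Multiply (a1, a0) by (b1, b0) in GF(4)[w] with w^2 = g*w + g."""
    a1, a0 = a
    b1, b0 = b
    t = mul4(g, mul4(a1, b1))
    return (t ^ mul4(a1, b0) ^ mul4(a0, b1), t ^ mul4(a0, b0))


def order_of_root(g):
    """Multiplicative order of w, a root of x^2 + g*x + g."""
    w = (1, 0)
    p, k = w, 1
    while p != (0, 1):            # (0, 1) is the field element one
        p = mul_ext(p, w, g)
        k += 1
    return k


orders = {g: order_of_root(g) for g in S}
```

An order of 15 for ω0 means that ω0 generates every nonzero element of the sixteen-element field, which is the sense in which the polynomial is primitive.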
Note that, in the example of Table 3, an arbitrary member s0 of S may be selected as the value for γ0. Thereafter, a preference for primitive polynomials requires that the sequence of selected scalar values alternates between the two members of S. This may be expressed as s_0 = α_0^k, where k is one or two, and s_{i+1} = s_i^2 for all i.
The construction can continue in this manner to produce arbitrarily large finite fields. The constructed polynomials have been verified to be primitive with symbol sizes up to 512 bits. I conjecture that the alternating selection of scalar values in this example provides primitive polynomials for all values of n.
For more examples, let the search field F=GF(16), a canonical extension field of GF(2) using the primitive polynomial p(x)=x4+x+1. Let α be a root of p(x). Here, an element B of GF(16) is denoted as a 4-tuple {b3b2b1b0}2 with the understanding that
B(α) = b_3 α^3 + b_2 α^2 + b_1 α + b_0.
Interpreting the 4-tuple as a hexadecimal digit, the powers of α in GF(16) are given by
AntilogTable={1,2,4,8,3,6,C,B,5,A,7,E,F,D,9,1},
where the ith entry of AntilogTable is αi, starting with i=0. The field F is searched to find the set S, where
S = {2,3,4,5,8,A,C,F}_16.
Note that S provides eight choices at each construction stage for s. Several low powers of α, including α = 2_16, α^2 = 4_16, and α^3 = 8_16, are members of S and are available as auxiliary outputs of a modified canonical GF(16) multiplier.
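The antilog table and the set S for this search field can be reproduced with a few lines of code. This is an illustrative sketch that assumes each GF(16) symbol is stored as a four-bit integer in the canonical representation described above. The name `mul16` is hypothetical.

```python
def mul16(a, b):
    """Multiply in GF(16) = GF(2)[x]/(x^4 + x + 1); elements are 4-bit ints."""
    r = 0
    for i in range(4):
        if (b >> i) & 1:
            r ^= a << i
    for i in (6, 5, 4):                   # clear degrees 6, 5, 4 in turn
        if (r >> i) & 1:
            r ^= 0b10011 << (i - 4)
    return r


# Powers of alpha = 2, from alpha^0 through alpha^15.
antilog = [1]
for _ in range(15):
    antilog.append(mul16(antilog[-1], 2))

# s is suitable when x^2 + s*x + s has no root in GF(16).
S = sorted(
    s for s in range(16)
    if all(mul16(x, x) ^ mul16(s, x) ^ s for x in range(16))
)

print("".join("%X" % v for v in antilog))   # 124836CB5A7EFD91
print(" ".join("%X" % s for s in S))        # 2 3 4 5 8 A C F
```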
One method of constructing arbitrarily large fields is to select members of S to provide a minimal complexity constant multiplication at each construction stage.
For example, one sequence of selections that simplifies implementation is to use a single constant as in Example 1 above, but with a sole value such as si=α for all i in this example for the search field GF(16). A disadvantage of this sequence is that the second extension field, GF(65536), and subsequent extension fields use polynomials that are not primitive.
In Table 4, two preferred sequences of selections are listed to provide examples with primitive polynomials at all construction stages. The first sequence of selections is listed as column sn in Table 4, whereas an alternative second sequence of selections is listed as column tn. The sequences were found using a computer program implementing the flowchart of
B.4. The Improved Construction Method with Prior Art Polynomials
As discussed in the introduction A.1, a prior art split-field construction method may be used to extend a finite field F to a field G using a quadratic irreducible polynomial of the form
q(x) = x^2 + x + β.
A prior art finite field multiplier for the extension field G may be implemented using three full multipliers for the field F, four adders for the field F, and a constant multiplier, multiplying by the constant β. Given a plurality of possible choices for β, a polynomial q(x) that facilitates simple constant multiplication is preferably selected. To minimize complexity, the field F is typically searched for all suitable values for β, and a polynomial q0(x) with a particular value β0 that minimizes complexity is selected.
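The three-multiplier, four-adder structure can be sketched in software. The example below is illustrative only. It assumes F is the canonical GF(16) generated by x^4 + x + 1, takes β = D_16 (the value used in the example later in this section), and packs an element of G as an eight-bit integer whose upper nibble is the ω coefficient. The names `mul16`, `mul256`, and `BETA` are hypothetical.

```python
def mul16(a, b):
    """Multiply in GF(16) = GF(2)[x]/(x^4 + x + 1); elements are 4-bit ints."""
    r = 0
    for i in range(4):
        if (b >> i) & 1:
            r ^= a << i
    for i in (6, 5, 4):                   # clear degrees 6, 5, 4 in turn
        if (r >> i) & 1:
            r ^= 0b10011 << (i - 4)
    return r


BETA = 0xD   # assumed constant term of q(x) = x^2 + x + beta


def mul256(a, b):
    """Multiply in G = F[w] with w^2 = w + BETA, using three F-multiplications."""
    a1, a0 = a >> 4, a & 0xF
    b1, b0 = b >> 4, b & 0xF
    m1 = mul16(a1 ^ a0, b1 ^ b0)          # adders 1 and 2, multiplier 1
    m2 = mul16(a1, b1)                    # multiplier 2
    m3 = mul16(a0, b0)                    # multiplier 3
    c1 = m1 ^ m3                          # adder 3: a1*b1 + a1*b0 + a0*b1
    c0 = m3 ^ mul16(BETA, m2)             # adder 4 and the constant multiplier
    return (c1 << 4) | c0
```

For instance, `mul256(0x10, 0x10)` squares ω and yields ω + β, packed as `0x1D`. In hardware the constant multiplication would be a fixed XOR network rather than a full multiplier; a general multiplication is used here only for brevity.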
It is known in the art that this extension method may be applied repeatedly. If an extension field H doubling the symbol size of G is desired, the field G is searched for a new set of suitable values for β, and a polynomial q1(x) with a particular value β1 that minimizes complexity is selected. A disadvantage of this approach is that it requires a new search at each stage of construction.
Instead, a method of selecting a sequence of irreducible polynomials for extending the field G without additional searches, as in the previous section, is desired. The flowchart of
Step 403 is replaced by a new step 503 (not shown in
Step 503:
Select a scalar si in S.
Let r_i(x) = x^2 + x + Π_i / s_i.
Note that step 503 defines polynomial ri(x) differently than in step 403.
Step 404 is replaced by a new step 504 (not shown in
Step 504:
Let ωi be a root of ri(x).
Construct field Gi+1 as a split-field using a {1, ωi} basis and ri(x).
Let Π_{i+1} = (ω_i + 1) Π_i.
Increment i and double symbits.
Note that step 504 also redefines the running product Π.
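The first two passes through steps 503 and 504 can be traced numerically. The sketch below makes several assumptions: F is the canonical GF(16) generated by x^4 + x + 1, the running product starts at Π0 = 1 (inferred from the form of the first polynomial in the example that follows), and the selections are s0 = α^2 and s1 = α as in that example. Elements of G1 are written as pairs (ω0 coefficient, constant term). The names `mul16` and `inv16` are hypothetical.

```python
def mul16(a, b):
    """Multiply in GF(16) = GF(2)[x]/(x^4 + x + 1); elements are 4-bit ints."""
    r = 0
    for i in range(4):
        if (b >> i) & 1:
            r ^= a << i
    for i in (6, 5, 4):                   # clear degrees 6, 5, 4 in turn
        if (r >> i) & 1:
            r ^= 0b10011 << (i - 4)
    return r


def inv16(a):
    """Multiplicative inverse in GF(16), found by exhaustive search."""
    return next(x for x in range(1, 16) if mul16(a, x) == 1)


s0, s1 = 0x4, 0x2                 # alpha^2 and alpha

# Step 503, i = 0: constant term of r_0(x) is Pi_0 / s_0 with Pi_0 = 1.
beta0 = inv16(s0)

# Step 504, i = 0: Pi_1 = (omega_0 + 1) * Pi_0 = omega_0 + 1.
pi1 = (1, 1)

# Step 503, i = 1: dividing by a scalar of F scales both components.
beta1 = tuple(mul16(inv16(s1), c) for c in pi1)

print("%X" % beta0, ["%X" % c for c in beta1])   # D ['9', '9']
```

Here `9` is α^14 in hexadecimal, so `beta1` corresponds to α^14(ω0+1).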
As a simple example, suppose that a multiplier for GF(65536) is to be constructed using the improved method with prior art polynomials over F=GF(16). The field F is in a canonical representation and is generated by the primitive binary polynomial,
p(x) = x^4 + x + 1,
as above. Let α be a root of p(x). The field F is searched to find the set S, where
S = {α, α^2, α^3, α^4, α^6, α^8, α^9, α^12} = {2,4,8,3,C,5,A,F}_16.
A first selection from S, s_0 = α^2 = 4_16, is used to form a primitive quadratic polynomial over F,
q_0(x) = x^2 + x + s_0^{-1} = x^2 + x + α^13 = x^2 + x + D_16.
A binary vector
{b3,b2,b1,b0}_2
representing a symbol in a canonical GF(16) may be multiplied by the choice β_0 = D_16 using two XOR gates and a rearrangement to obtain
{b0+b1, b0, b3, b0+b1+b2}_2.
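The two-XOR wiring can be compared against a general GF(16) multiplication for all sixteen inputs. This is a small sketch under the canonical representation above. The names `mul16` and `times_beta0` are hypothetical.

```python
def mul16(a, b):
    """Multiply in GF(16) = GF(2)[x]/(x^4 + x + 1); elements are 4-bit ints."""
    r = 0
    for i in range(4):
        if (b >> i) & 1:
            r ^= a << i
    for i in (6, 5, 4):                   # clear degrees 6, 5, 4 in turn
        if (r >> i) & 1:
            r ^= 0b10011 << (i - 4)
    return r


def times_beta0(b):
    """Multiply {b3,b2,b1,b0} by D_16 using two XORs and a rearrangement."""
    b3, b2, b1, b0 = (b >> 3) & 1, (b >> 2) & 1, (b >> 1) & 1, b & 1
    t = b0 ^ b1                           # first XOR
    return (t << 3) | (b0 << 2) | (b3 << 1) | (t ^ b2)   # second XOR is t ^ b2


wiring_matches = all(times_beta0(b) == mul16(0xD, b) for b in range(16))
```

The second XOR reuses the first sum, which is why the output {b0+b1, b0, b3, b0+b1+b2} costs only two gates.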
A multiplier for GF(256) using this selection is implemented using three GF(16) multipliers, four GF(16) adders, and a β-multiplier, with a total of 48 AND gates and 63 XOR gates. Let ω0 be a root of q0(x). A second selection from S, s1=α, is used to form a primitive quadratic polynomial over GF(256),
q_1(x) = x^2 + x + α^14 (ω_0 + 1).
Multiplication by the choice β_1 = α^14 (ω_0 + 1) in the sixteen-bit multiplier may be performed in two steps. Given that an eight-bit multiplier contains a constant multiplier providing α^13 b_1, a split-field vector
B = b_1 ω_0 + b_0
may be multiplied by (ω_0 + 1) to form
(ω_0 + 1) B = b_0 ω_0 + (b_0 + α^13 b_1),
using four XOR gates, and each of two components of this sub-product may be scaled by α^14 using a single XOR gate. These six XOR gates may be added to one of three eight-bit multipliers in a sixteen-bit multiplier to provide an auxiliary output multiplying one eight-bit input by β1. The total number of gates for a sixteen bit multiplier using these selections and resource sharing through an auxiliary output is 144 AND gates and 227 XOR gates, or 825 gate-area units. The doubly split-optimal multipliers for GF(65536) disclosed in the previous section are more efficient, using 144 AND gates and 215 XOR gates, or 789 gate-area units.
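The two-step multiplication by β1 can be checked against a general GF(256) multiplication. The sketch below assumes the split-field GF(256) of this example, with ω0^2 = ω0 + D_16, stored as an eight-bit integer whose upper nibble is the ω0 coefficient, so that β1 is the byte 0x99. The names `mul16`, `mul256`, `times_a14`, and `times_beta1` are hypothetical.

```python
def mul16(a, b):
    """Multiply in GF(16) = GF(2)[x]/(x^4 + x + 1); elements are 4-bit ints."""
    r = 0
    for i in range(4):
        if (b >> i) & 1:
            r ^= a << i
    for i in (6, 5, 4):                   # clear degrees 6, 5, 4 in turn
        if (r >> i) & 1:
            r ^= 0b10011 << (i - 4)
    return r


def mul256(a, b):
    """Multiply in GF(256) = GF(16)[w] with w^2 = w + D_16."""
    a1, a0, b1, b0 = a >> 4, a & 0xF, b >> 4, b & 0xF
    m1, m2, m3 = mul16(a1 ^ a0, b1 ^ b0), mul16(a1, b1), mul16(a0, b0)
    return ((m1 ^ m3) << 4) | (m3 ^ mul16(0xD, m2))


def times_a14(b):
    """Scale a GF(16) symbol by alpha^14: a shift with b0 fed into bits 3 and 0."""
    return (b >> 1) ^ ((b & 1) * 0b1001)


def times_beta1(x):
    """Multiply b1*w + b0 by beta_1 = alpha^14 (w + 1) in two steps."""
    b1, b0 = x >> 4, x & 0xF
    hi = b0                               # step 1: (w + 1) * B, w coefficient
    lo = b0 ^ mul16(0xD, b1)              # step 1: constant term, four XORs
    return (times_a14(hi) << 4) | times_a14(lo)   # step 2: scale by alpha^14


two_step_matches = all(times_beta1(x) == mul256(0x99, x) for x in range(256))
```

In `times_a14` only the low output bit needs an XOR (b1 + b0); the other three output bits are rewired inputs, matching the single XOR gate per component stated above.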
By way of comparison, a prior art best example multiplier for GF(65536) is listed in Table 1 and shown in FIG. 1 of Paar, supra, p. 860. The prior art sixteen-bit multiplier uses 144 AND gates and 258 XOR gates, or 918 gate-area units. It is about 11% larger than the example above, and about 16% larger than the optimal multiplier for GF(65536).
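The gate-area figures and percentages quoted above can be reproduced with simple arithmetic. The weights used below (one unit per AND gate, three units per XOR gate) are an inference from the quoted totals, not a definition taken from this section, and should be treated as an assumption. The name `area` is hypothetical.

```python
AND_AREA, XOR_AREA = 1, 3      # assumed weights, inferred from the quoted totals


def area(and_gates, xor_gates):
    """Gate-area units for a given gate count under the assumed weights."""
    return AND_AREA * and_gates + XOR_AREA * xor_gates


example = area(144, 227)       # multiplier with prior art polynomials
optimal = area(144, 215)       # doubly split-optimal multiplier
prior = area(144, 258)         # prior art multiplier cited from Paar

overhead_vs_example = 100 * (prior / example - 1)
overhead_vs_optimal = 100 * (prior / optimal - 1)
```

Under these weights the three totals come to 825, 789, and 918 units, and the two overheads round to 11% and 16%.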
A second advantage of the method disclosed here is that it allows for scalable implementations in software. Suppose, for example, that the sixteen-bit multiplier described in this section is to be implemented in software using known techniques for multiplication involving log and antilog tables. With the new construction, a software implementer may elect to use one of the three following alternatives. The first alternative allocates a storage space of 32 four-bit entries for log and antilog tables over GF(16), providing that a GF(65536) multiplication may be accomplished using 27 GF(16) log table lookups and relatively simple operations. The second alternative allocates a storage space of 512 eight-bit entries for log and antilog tables over GF(256), so that a GF(65536) multiplication may be accomplished using nine GF(256) log table lookups and simple operations. This second alternative provides a good compromise between throughput performance and storage requirements. The third alternative uses a storage space of 131,072 sixteen-bit entries for log and antilog tables for GF(65536), providing that a GF(65536) multiplication may be accomplished using three log table lookups and simple operations. Throughput may be flexibly traded off against required storage space to accommodate various needs. With the prior art construction, a best multiplier for GF(65536) is constructed directly as an extension field of GF(16), without the same alternative of supporting operations implemented over GF(256) with intermediate sized tables.
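The first alternative can be sketched as follows. This is an illustrative sketch only. It assumes the field tower of the example above (ω0^2 = ω0 + D_16 over GF(16), and ω1^2 = ω1 + β1 over GF(256) with β1 = α^14(ω0+1)) and packs the higher-order component in the upper bits of each integer. All function and table names are hypothetical.

```python
# GF(16) log and antilog tables for p(x) = x^4 + x + 1 (32 four-bit entries).
ANTILOG = [1]
for _ in range(14):
    v = ANTILOG[-1] << 1
    ANTILOG.append(v ^ 0b10011 if v & 0b10000 else v)
LOG = [0] * 16                            # LOG[0] is unused
for i, v in enumerate(ANTILOG):
    LOG[v] = i


def mul16(a, b):
    """GF(16) product: two log lookups and one antilog lookup."""
    if a == 0 or b == 0:
        return 0
    return ANTILOG[(LOG[a] + LOG[b]) % 15]


def times_d(b):
    """Multiply a GF(16) symbol by beta_0 = D_16 with two XORs."""
    b3, b2, b1, b0 = (b >> 3) & 1, (b >> 2) & 1, (b >> 1) & 1, b & 1
    t = b0 ^ b1
    return (t << 3) | (b0 << 2) | (b3 << 1) | (t ^ b2)


def times_a14(b):
    """Multiply a GF(16) symbol by alpha^14 with one XOR."""
    return (b >> 1) ^ ((b & 1) * 0b1001)


def mul256(a, b):
    """GF(256) product over GF(16): three table products."""
    a1, a0, b1, b0 = a >> 4, a & 0xF, b >> 4, b & 0xF
    m1, m2, m3 = mul16(a1 ^ a0, b1 ^ b0), mul16(a1, b1), mul16(a0, b0)
    return ((m1 ^ m3) << 4) | (m3 ^ times_d(m2))


def times_beta1(x):
    """Multiply a GF(256) symbol by beta_1 = alpha^14 (omega_0 + 1)."""
    b1, b0 = x >> 4, x & 0xF
    return (times_a14(b0) << 4) | times_a14(b0 ^ times_d(b1))


def mul65536(a, b):
    """GF(65536) product over GF(256): nine GF(16) table products in total."""
    a1, a0, b1, b0 = a >> 8, a & 0xFF, b >> 8, b & 0xFF
    m1, m2, m3 = mul256(a1 ^ a0, b1 ^ b0), mul256(a1, b1), mul256(a0, b0)
    return ((m1 ^ m3) << 8) | (m3 ^ times_beta1(m2))
```

Each call to `mul65536` makes nine calls to `mul16`, each with at most three table lookups, which gives the figure of 27 lookups. The constant multiplications are done with shifts and XORs and need no table access. The second alternative would replace `mul256` with a single lookup-based product over 256-entry tables.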
A further advantage of the improved construction method is that it provides for construction of a plurality of successor fields without requiring additional searches, using a preferred form of the constant βi for each successor field. If extension polynomials using the form of q(x) are preferred, the modified construction method can be used to produce arbitrarily large fields using this preferred form without consuming the additional time and resources of additional polynomial searches.
The embodiments shown and discussed here are for purposes of illustration and are not for purposes of limitation. As is well known in the art, various features of the methods discussed here may be implemented in other equivalent ways, and other combinations and permutations of the methods discussed herein may be utilized without departing from the true spirit of the invention, which is limited only by the claims.