This application claims the priority of Korean Patent Application No. 2003-72140, filed on Oct. 16, 2003, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein in its entirety by reference.
1. Field of the Invention
The present invention relates to a method and apparatus for performing multiplication in a finite field.
2. Description of the Related Art
A finite field GF(2^n) is a number system containing 2^n elements. Because each element of the finite field GF(2^n) can be represented by n bits, the finite field lends itself to practical applications. Such applications, for example the hardware implementation of an error correction code or an elliptic curve cryptosystem, frequently perform calculations in GF(2^n). An apparatus for encoding/decoding Reed-Solomon codes performs calculations in GF(2^n), and an encryption/decryption apparatus of an elliptic curve cryptosystem performs calculations in GF(2^n) where "n" is a large value.
The addition and multiplication rules of GF(2), which contains only the binary numbers 0 and 1, are defined by Equation (1).
0+0=1+1=0
0+1=1+0=1
0×0=1×0=0×1=0
1×1=1 (1)
Here, addition is a bitwise exclusive OR (referred to as XOR hereinafter) operation, and multiplication is a bitwise AND (referred to as AND hereinafter) operation.
Since the finite field GF(2^n) (n>1) is a number system containing 2^n elements, addition and multiplication correspond to arithmetic modulo an irreducible polynomial of degree n having coefficients in GF(2). The irreducible polynomial of degree n is referred to as a defining polynomial of the finite field. When a root of the defining polynomial is α, an element of the finite field has a standard representation given by Equation (2).
a_0+a_1α+a_2α^2+ ... +a_{n-1}α^{n-1}=(a_0,a_1,a_2, ... ,a_{n-1}), a_i∈GF(2) (2)
Multiplication of two elements in GF(2^n) is performed by multiplying the two elements as polynomials in α and then reducing the result modulo the defining polynomial. Addition of two elements in GF(2^n) is performed by adding the two elements as polynomials in α.
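As a non-limiting illustration, the two-stage multiplication described above (polynomial multiplication in α followed by reduction modulo the defining polynomial) may be sketched as follows. The function name gf2n_mul_reference, the list-of-bits representation, and the example field GF(2^3) with f(x)=x^3+x+1 are assumptions made for illustration only.

```python
def gf2n_mul_reference(a, b, f):
    """Multiply a and b in GF(2^n): polynomial product, then reduction mod f.

    a, b: lists of n bits; index i holds the coefficient of alpha^i.
    f:    list of n+1 bits for the defining polynomial, with f[n] == 1.
    """
    n = len(f) - 1
    # Polynomial multiplication over GF(2): AND for products, XOR for sums.
    prod = [0] * (2 * n - 1)
    for i in range(n):
        for j in range(n):
            prod[i + j] ^= a[i] & b[j]
    # Reduce from the highest power down: adding alpha^(k-n) * f(alpha),
    # which is zero, cancels the alpha^k term.
    for k in range(2 * n - 2, n - 1, -1):
        if prod[k]:
            for m in range(n + 1):
                prod[k - n + m] ^= f[m]
    return prod[:n]


# Example field (an assumption): GF(2^3) with f(x) = x^3 + x + 1.
F = [1, 1, 0, 1]
# alpha * alpha^2 = alpha^3 = alpha + 1 under this f.
assert gf2n_mul_reference([0, 1, 0], [0, 0, 1], F) == [1, 1, 0]
```

This direct form needs a double-width intermediate product, which is one reason hardware multipliers interleave the reduction with the accumulation instead.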
Multipliers, which perform multiplication in the finite field, can include a serial multiplier, a parallel multiplier, and a systolic multiplier. The serial multiplier has low area complexity, and the parallel multiplier performs multiplication using only a gate delay without latency. Accordingly, the parallel multiplier has high area complexity compared to the serial multiplier, but can reduce time complexity considerably. The systolic multiplier is designed to increase throughput, and thus, has relatively high area and time complexity.
Accordingly, there is a demand for a method and apparatus for performing multiplication in a finite field by means of a serial multiplier that can reduce computational time while minimizing the increase in area complexity.
The present invention provides a method and apparatus for performing multiplication through parallel processing in units of d bits when the highest-order coefficients of a defining polynomial below its leading term, f_{n-1} through f_{n-d+1}, are assumed to be "0".
According to an aspect of the present invention, there is provided a method of obtaining C=(c_0, ... , c_{n-1}), the product of two elements A and B of a finite field GF(2^n), when a defining polynomial f(x) of degree n of the finite field GF(2^n) is defined by f(x)=x^n+h(x)=x^n+(f_{n-1}x^{n-1}+ ... +f_1x+f_0), f_i∈{0,1}, where f_{n-1}= ... =f_{n-d+1}=0, d≥2 is an integer, α is a root of the defining polynomial, A and B of the finite field are expressed as
A=a_0+a_1α+a_2α^2+ ... +a_{n-1}α^{n-1}=(a_0,a_1,a_2, ... ,a_{n-1}),
B=b_0+b_1α+b_2α^2+ ... +b_{n-1}α^{n-1}=(b_0,b_1,b_2, ... ,b_{n-1}) with respect to the root α, and the product C of A and B can be written as C=A×B mod f(α), the method comprising: permuting the last d coefficients (a_{n-1}, ... , a_{n-d}) of a multiplier A with predetermined variables (s_{n-1}, ... , s_{n-d}); operating C:=C⊕(b_{i+j}●A) for the (i+j)th coefficient of a multiplicand B to update coefficients of C, where i and j are integers, and A:=(s_{n-1-j},a_0, ... ,a_{n-2})⊕(0,s_{n-1-j}●f_1, ... ,s_{n-1-j}●f_{n-d},0, ... ,0) to update coefficients of A, repeatedly for j=0 to (d−1), where ⊕ represents an XOR operation and ● represents an AND operation; and repeatedly performing the permuting and operating by increasing i from 0 to (n-1) by d.
According to another aspect of the present invention, there is provided an apparatus for obtaining C=(c_0, ... , c_{n-1}), the product of two elements A and B of a finite field GF(2^n), when a defining polynomial f(x) of degree n of GF(2^n) is defined by f(x)=x^n+h(x)=x^n+(f_{n-1}x^{n-1}+ ... +f_1x+f_0), f_i∈{0,1}, where f_{n-1}= ... =f_{n-d+1}=0, d≥2 is an integer, α is a root of the defining polynomial, A and B of the finite field are expressed as
A=a_0+a_1α+a_2α^2+ ... +a_{n-1}α^{n-1}=(a_0,a_1,a_2, ... ,a_{n-1}),
B=b_0+b_1α+b_2α^2+ ... +b_{n-1}α^{n-1}=(b_0,b_1,b_2, ... ,b_{n-1}) with respect to the root α, and the product C of A and B can be written as C=A×B mod f(α), the apparatus comprising: a multiplier storage unit, which stores coefficients of a multiplier A; a multiplicand storage unit, which stores coefficients of a multiplicand B; a product storage unit, which stores the product C of A and B; a multiplication unit, which performs the operation C:=C⊕(b_{i+j}●A) repeatedly for j=0 to (d−1), where i and j are integers, ⊕ represents an XOR operation, and ● represents an AND operation, repeats the operation while increasing the variable i from 0 to (n-1) by d to obtain updated coefficients of C, and outputs the updated coefficients of C to the product storage unit; and a multiplier updating unit, which performs the operation A:=(s_{n-1-j},a_0, ... ,a_{n-2})⊕(0,s_{n-1-j}●f_1, ... ,s_{n-1-j}●f_{n-d},0, ... ,0) to update the coefficients of A, and outputs the updated coefficients of A to the multiplier storage unit.
According to still another aspect of the present invention, there is provided a method of obtaining C=(c_0, ... , c_{n-1}), the product of two elements A and B of a finite field GF(2^n), when a defining polynomial f(x) of degree n of GF(2^n) is defined as
f(x)=x^n+h(x)=x^n+(f_{n-1}x^{n-1}+ ... +f_1x+f_0), f_i∈{0,1},
where f_{n-1}= ... =f_{n-d+1}=0, d≥2 is an integer, α is a root of the defining polynomial, A and B of the finite field have a standard representation with respect to the root α as shown in
A=a_0+a_1α+a_2α^2+ ... +a_{n-1}α^{n-1}=(a_0,a_1,a_2, ... ,a_{n-1}),
B=b_0+b_1α+b_2α^2+ ... +b_{n-1}α^{n-1}=(b_0,b_1,b_2, ... ,b_{n-1}), A′, a dual representation of A, is expressed as A′=(a_0′,a_1′,a_2′, ... ,a_{n-1}′), and C can be written as C=A×B mod f(α), the method comprising: converting A into A′; operating the following formulae
s_j:=a_j′⊕(f_1●a_{j+1}′)⊕ ... ⊕(f_{n-d}●a_{n-d+j}′)
c_{i+j}′:=(b_0●a_j′)⊕ ... ⊕(b_{n-1-j}●a_{n-1}′)⊕(b_{n-j}●s_0)⊕ ... ⊕(b_{n-1}●s_{j−1}) repeatedly for j=0 to d−1 to update coefficients of C′, which is a dual representation of C, where i and j are integers and c_{i+j}′ is a dual representation of c_{i+j}; shifting A′ left d times to update A′; updating the last d coefficients of A′ with s_j; repeatedly performing the operating, shifting, and updating by increasing the variable i from 0 to (n-1) by d to obtain updated coefficients of C′; and performing basis conversion on the updated C′.
According to yet another aspect of the present invention, there is provided an apparatus for obtaining C=(c_0, ... , c_{n-1}), the product of two elements A and B of a finite field GF(2^n), when a defining polynomial f(x) of degree n of GF(2^n) is defined as f(x)=x^n+h(x)=x^n+(f_{n-1}x^{n-1}+ ... +f_1x+f_0), f_i∈{0,1}, where f_{n-1}= ... =f_{n-d+1}=0, d≥2 is an integer, α is a root of the defining polynomial, the two elements A and B of the finite field have a standard representation with respect to the root α as shown in
A=a_0+a_1α+a_2α^2+ ... +a_{n-1}α^{n-1}=(a_0,a_1,a_2, ... ,a_{n-1}),
B=b_0+b_1α+b_2α^2+ ... +b_{n-1}α^{n-1}=(b_0,b_1,b_2, ... ,b_{n-1}), A′, a dual representation of A, is expressed as A′=(a_0′,a_1′,a_2′, ... ,a_{n-1}′), and the product C of A and B is written as C=A×B mod f(α), the apparatus comprising: a basis converting unit, which converts the standard representation into a dual representation, or converts the dual representation into the standard representation; a multiplicand storage unit, which stores coefficients of a multiplicand B; a multiplier storage unit, which stores coefficients of A′ obtained by converting the basis of a multiplier A by means of the basis converting unit; a multiplier updating unit, which updates the coefficients of A′ according to a predetermined equation and outputs the updated coefficients to the multiplier storage unit; and an operation unit, which includes a plurality of multipliers that multiply each mth coefficient from the multiplicand storage unit by each (m+j)th coefficient from the multiplier storage unit, where j varies from 0 to (d−1), and multiply the last j coefficients from the multiplicand storage unit by a part of the outputs from the multiplier updating unit, and a plurality of logic operation members for performing XOR operations on only the outputs containing the (m+j)th coefficients from the plurality of multipliers and outputting the last d c_i′ values, wherein after C′ is obtained by the operation unit, the basis converting unit converts the basis of C′ to obtain C.
The above and other features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
The present invention will now be described more fully with reference to the accompanying drawings, in which exemplary embodiments of the invention are shown.
A defining polynomial f(x) of a finite field GF(2^n) is represented by Equation 3.
f(x)=x^n+h(x)=x^n+(f_{n-1}x^{n-1}+ ... +f_1x+f_0), f_i∈{0,1} (3)
If α is a root of the defining polynomial, h(α) is defined by Equation 4.
h(α)=(f_0,f_1,f_2, ... ,f_{n-1}) (4)
Assume that ⊕ represents a bitwise XOR operation and ● represents a bitwise AND operation. An operation ● between a bit and a vector is defined by Equation 5.
a●(c_0, ... , c_{n-1})=(a●c_0, ... , a●c_{n-1}), where a, c_i∈{0,1} (5)
Shift operations designated by >> and << are defined as follows. (a_0, ... ,a_{n-1})>>1 means that each coefficient is shifted right once, as shown in Equation 6.
For [i=n-1 to 1]
  a_i:=a_{i-1}
a_0:=0 (6)
(a_0, ... ,a_{n-1})<<1 means that each coefficient is shifted left once, as shown in Equation 7.
For [i=0 to n-2]
  a_i:=a_{i+1}
a_{n-1}:=0 (7)
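By way of illustration only, the operations of Equations 5 to 7 may be sketched on coefficient vectors held as lists, where index i holds the coefficient of the ith-power term. The function names bit_and_vec, shift_right, shift_left, and xor_vec are hypothetical and are not taken from the embodiments.

```python
def bit_and_vec(bit, vec):
    """Equation 5: AND a single bit with every coefficient of a vector."""
    return [bit & c for c in vec]


def shift_right(vec):
    """Equation 6: a_i := a_(i-1) for i = n-1 down to 1, then a_0 := 0."""
    return [0] + vec[:-1]


def shift_left(vec):
    """Equation 7: a_i := a_(i+1) for i = 0 up to n-2, then a_(n-1) := 0."""
    return vec[1:] + [0]


def xor_vec(u, v):
    """Coefficient-wise XOR of two vectors of equal length."""
    return [x ^ y for x, y in zip(u, v)]


assert shift_right([1, 0, 1, 1]) == [0, 1, 0, 1]
assert shift_left([1, 0, 1, 1]) == [0, 1, 1, 0]
assert bit_and_vec(1, [1, 0, 1]) == [1, 0, 1]
```

Note that a right shift in this notation moves each coefficient to the next higher power of α, so it corresponds to multiplication by α before reduction.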
Two bases, i.e., a standard basis and a dual basis, are used for multiplication of two elements of the finite field according to an embodiment of the present invention.
First, multiplication using the standard basis will now be explained.
When the standard basis is used, two elements A and B of GF(2^n) may be defined by Equation 8.
A=a_0+a_1α+a_2α^2+ ... +a_{n-1}α^{n-1}=(a_0,a_1,a_2, ... ,a_{n-1}),
B=b_0+b_1α+b_2α^2+ ... +b_{n-1}α^{n-1}=(b_0,b_1,b_2, ... ,b_{n-1}) (8)
A product C of A and B is defined by Equation 9.
C=A×B mod f(α) (9)
Here, × represents polynomial multiplication.
Equation 9 can be expanded as a code expression in Equation 10.
C:=(0, ... ,0)
For [i=0 to n-1]
  C:=C⊕(b_i●A)
  A:=(A>>1)⊕(a_{n-1}●h(α))
  Rename coefficients of the element A as a_0, ... ,a_{n-1} (10)
Multiplication according to Equation 10 will now be explained in detail. The ith coefficient of a multiplicand B is multiplied by each coefficient of a multiplier A, and an exclusive OR (XOR) operation is performed on the results of the multiplication and the previous coefficients of the product C, thereby updating the respective coefficients of the product C. Also, the coefficient of the highest-power term of A is multiplied by each coefficient of the terms other than the highest-power term in the defining polynomial of the finite field. The multiplication results are XORed with the coefficients of A shifted right once, thereby updating the respective coefficients of A. The final C is obtained by repeating the above-described process n times.
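As a non-limiting illustration, the bit-serial procedure of Equation 10 may be sketched as follows. The function name serial_mul_standard, the list representation (index i holds the coefficient of α^i), and the example field GF(2^3) with f(x)=x^3+x+1 are assumptions made for illustration only.

```python
def serial_mul_standard(a, b, f):
    """Bit-serial standard-basis multiplication following Equation 10.

    a, b: lists of n bits; index i holds the coefficient of alpha^i.
    f:    list of n+1 bits for the defining polynomial, with f[n] == 1.
    """
    n = len(a)
    h = f[:n]                      # h(alpha) = (f_0, ..., f_{n-1})
    a = list(a)                    # work on a copy of the multiplier
    c = [0] * n
    for i in range(n):
        # C := C xor (b_i AND A)
        c = [ck ^ (b[i] & ak) for ck, ak in zip(c, a)]
        # A := (A >> 1) xor (a_{n-1} AND h(alpha))
        top = a[n - 1]
        shifted = [0] + a[:-1]
        a = [sk ^ (top & hk) for sk, hk in zip(shifted, h)]
    return c


# Example field (an assumption): GF(2^3) with f(x) = x^3 + x + 1.
F = [1, 1, 0, 1]
# alpha * alpha^2 = alpha^3 = alpha + 1.
assert serial_mul_standard([0, 1, 0], [0, 0, 1], F) == [1, 1, 0]
```

In each iteration the working copy of A holds A multiplied by the ith power of α, already reduced, so no double-width intermediate product is needed.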
Changes of the multiplier A within the for loop in the multiplication algorithm of Equation 10 will now be explained. When i=k, let A=(a_0, ... ,a_{n-1}) denote the multiplier appearing in b_k●A, and assume that
f_{n-1}= ... =f_{n-d+1}=0, and s_{n-1}:=a_{n-1}, ... ,s_{n-d}:=a_{n-d} (d≥2).
When i=k+1, A may be defined by Equation 11, based on Equation 10.
A=(s_{n-1},a_0, ... ,a_{n-2})⊕(0,s_{n-1}●f_1, ... ,s_{n-1}●f_{n-d},0, ... ,0) (11)
When i=k+2, A may be expressed by Equation 12, based on Equation 10.
A=(s_{n-2},s_{n-1},a_0, ... ,a_{n-3})⊕(0,0,s_{n-1}●f_1, ... ,s_{n-1}●f_{n-d},0, ... ,0)⊕(0,s_{n-2}●f_1, ... ,s_{n-2}●f_{n-d},0, ... ,0) (12)
In the same manner, when i=k+d, A may be defined by Equation 13, based on Equation 10.
A=(s_{n-d}, ... ,s_{n-1},a_0, ... ,a_{n-d−1})⊕(0, ... ,0,s_{n-1}●f_1, ... ,s_{n-1}●f_{n-d})⊕ ... ⊕(0,s_{n-d}●f_1, ... ,s_{n-d}●f_{n-d},0, ... ,0) (13)
Accordingly, under the above assumption, A at i=k+d can be computed directly from A at i=k, so that d-bit parallel processing can be performed. Further, a sufficiently small d covers most practical cases, for example the parameters of elliptic curve cryptosystems according to the SEC and ANSI X9.62 standards, so the assumption does not impair practicability.
Multiplication using the standard basis on which the d-bit parallel processing can be performed can be represented using code expressions. That is, C, the product of A and B, can be represented by Equation 14 when f_{n-1}= ... =f_{n-d+1}=0.
C:=(0, ... ,0)
For [i=0 to n-1, i=i+d]
  Let s_{n-1}:=a_{n-1}, ... ,s_{n-d}:=a_{n-d}
  For [j=0 to d−1, j++]
    C:=C⊕(b_{i+j}●A)
    A:=(s_{n-1-j},a_0, ... ,a_{n-2})⊕(0,s_{n-1-j}●f_1, ... ,s_{n-1-j}●f_{n-d},0, ... ,0)
    Rename the coefficients of A as a_0, ... ,a_{n-1} (14)
In multiplication according to Equation 14, the process described with reference to Equation 10 is performed in units of d bits. Consequently, time complexity can be improved by a factor of d while the increase in area complexity is minimized.
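By way of illustration only, the procedure of Equation 14 may be sketched in software as follows. The sketch executes the d inner steps sequentially, whereas hardware would evaluate them within one cycle. The function name dbit_mul_standard and the example field GF(2^5) with f(x)=x^5+x^2+1 are assumptions, as is the handling of the case where n is not a multiple of d (coefficients b_{i+j} with i+j≥n are simply skipped). The sketch also relies on f_0=1, which holds for any irreducible defining polynomial of degree n>1.

```python
def dbit_mul_standard(a, b, f, d):
    """Standard-basis multiplication in blocks of d bits, after Equation 14.

    a, b: lists of n bits; index i holds the coefficient of alpha^i.
    f:    list of n+1 bits with f[n] == 1 and f[n-1] = ... = f[n-d+1] = 0.
    """
    n = len(a)
    if any(f[k] for k in range(n - d + 1, n)):
        raise ValueError("requires f_{n-1} = ... = f_{n-d+1} = 0")
    a = list(a)
    c = [0] * n
    for i in range(0, n, d):
        # Latch the top d coefficients: s[k] holds s_{n-d+k}.
        s = a[n - d:]
        for j in range(d):
            if i + j >= n:          # assumption: skip missing b coefficients
                break
            # C := C xor (b_{i+j} AND A)
            c = [ck ^ (b[i + j] & ak) for ck, ak in zip(c, a)]
            sj = s[d - 1 - j]       # s_{n-1-j}
            # A := (s_{n-1-j}, a_0, ..., a_{n-2})
            #      xor (0, s_{n-1-j} f_1, ..., s_{n-1-j} f_{n-d}, 0, ..., 0)
            shifted = [sj] + a[:-1]
            feedback = [0] + [sj & f[k] for k in range(1, n - d + 1)] + [0] * (d - 1)
            a = [x ^ y for x, y in zip(shifted, feedback)]
    return c


# Example field (an assumption): GF(2^5) with f(x) = x^5 + x^2 + 1, d = 2.
F5 = [1, 0, 1, 0, 0, 1]
# alpha^4 * alpha = alpha^5 = alpha^2 + 1.
assert dbit_mul_standard([0, 0, 0, 0, 1], [0, 1, 0, 0, 0], F5, 2) == [1, 0, 1, 0, 0]
```

Because the feedback bits s_{n-1}, ... , s_{n-d} are all available at the start of a block, none of the d updates has to wait for the result of the previous one.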
Here, each of the multiplicand storage units 21 and 22 includes d partial storage units. For a kth coefficient of the multiplicand B, when k is modulo operated by d, coefficients corresponding to the same modulo operation results are sorted out and stored into each partial storage unit. Since the shown multiplicand storage units 21 and 22 correspond to a case of d=2, each of them includes a first partial storage unit 21, which stores only coefficients of odd terms, and a second partial storage unit 22, which stores only coefficients of even terms.
The multiplication unit 4 includes a plurality of multipliers and XOR operators. The multipliers form d sets corresponding to multiplicand coefficients stored in the partial storage units 21 and 22. An mth multiplier of the respective d sets of multipliers multiplies an mth multiplicand coefficient output from the corresponding partial storage unit by an mth multiplier coefficient output from the multiplier storage unit 1.
Results of the multiplication are XORed by an mth XOR operator and then added to pertinent coefficients in the product storage unit 3. That is, the multiplication and XOR operation are performed in units of d bits according to C:=C⊕(b_{i+j}●A) of Equation 14, and results of the multiplication and XOR operation are stored in the product storage unit 3. The multiplier updating unit 5 updates the multiplier coefficients according to A:=(s_{n-1-j},a_0, ... ,a_{n-2})⊕(0,s_{n-1-j}●f_1, ... ,s_{n-1-j}●f_{n-d},0, ... ,0) of Equation 14, and the updated coefficients are stored in the multiplier storage unit 1.
After 1 cycle, A becomes A=(a_3,a_4,a_0,a_1,a_2)⊕(0,0,0,a_4,0)⊕(0,0,a_3,0,0)=(a_3,a_4,a_0⊕a_3,a_1⊕a_4,a_2) according to Equation 14.
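The one-cycle update above can be checked numerically. The following sketch assumes n=5, d=2 and the defining polynomial f(x)=x^5+x^2+1; these values are inferred from the feedback pattern of the expression above and are not stated in this passage. The names step, closed_form, and ALL_MATCH are hypothetical.

```python
from itertools import product

# Assumed defining polynomial: f(x) = x^5 + x^2 + 1, coefficients f_0..f_5.
F = [1, 0, 1, 0, 0, 1]


def step(a, f):
    """One multiplier update of Equation 10: A := (A >> 1) xor (a_{n-1} AND h)."""
    n = len(a)
    top = a[n - 1]
    shifted = [0] + a[:-1]
    return [s ^ (top & f[k]) for k, s in enumerate(shifted)]


def closed_form(a):
    """The d = 2 one-cycle result (a3, a4, a0^a3, a1^a4, a2)."""
    a0, a1, a2, a3, a4 = a
    return [a3, a4, a0 ^ a3, a1 ^ a4, a2]


# Two single-bit updates agree with the closed form for all 32 values of A.
ALL_MATCH = all(
    step(step(list(a), F), F) == closed_form(a)
    for a in product([0, 1], repeat=5)
)
assert ALL_MATCH
```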
Multiplication using a dual basis according to an embodiment of the present invention will now be explained.
In the multiplication using the dual basis, a product is obtained by performing multiplication of a multiplier in a dual basis and a multiplicand in a standard basis.
Consider two elements A and B in GF(2^n) represented by Equation 15.
A=a_0+a_1α+a_2α^2+ ... +a_{n-1}α^{n-1}=(a_0,a_1,a_2, ... ,a_{n-1}),
B=b_0+b_1α+b_2α^2+ ... +b_{n-1}α^{n-1}=(b_0,b_1,b_2, ... ,b_{n-1}) (15)
Assuming that A′, a dual representation of A, is expressed as A′=(a_0′,a_1′,a_2′, ... ,a_{n-1}′), C, the product of A and B, is expressed as C=(c_0, ... ,c_{n-1}), and C′, a dual representation of C, is expressed as C′=(c_0′, ... ,c_{n-1}′), C can be represented using a code expression as shown in Equation 16.
A′←A (basis conversion)
For [i=0 to n-1]
  c_i′:=(b_0●a_0′)⊕ ... ⊕(b_{n-1}●a_{n-1}′)
  k:=(f_0●a_0′)⊕ ... ⊕(f_{n-1}●a_{n-1}′)
  A′:=A′<<1
  Rename coefficients of A′ as a_0′, ... ,a_{n-1}′
  a_{n-1}′:=k
C←C′ (basis conversion) (16)
Multiplication according to Equation 16 will now be explained in detail. First, the multiplier A is converted from the standard basis into the dual basis. The ith coefficient c_i′ of the product in the dual basis is obtained by multiplying the coefficients of A′ by the corresponding coefficients of the multiplicand B and performing an XOR operation on the results of the multiplication. Next, a value k is obtained by multiplying the coefficients of the terms other than the highest-power term in the defining polynomial of the finite field by the corresponding coefficients of A′ and performing XOR operations on the results of the multiplication. The coefficients of A′ are then shifted left once, and the last coefficient a_{n-1}′ of A′ is set to k. After these steps are repeated n times, C′ is converted into the standard basis.
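As a non-limiting illustration, the loop of Equation 16 may be sketched as follows. The text does not state which dual basis is used, so the sketch assumes dual coordinates a_i′=φ(A·α^i), where φ takes the coefficient of α^0 of an element; any other nonzero linear functional would serve equally well. The final basis conversion is omitted, and the result is instead compared in dual coordinates. All function names and the example field GF(2^3) with f(x)=x^3+x+1 are hypothetical.

```python
def mul_alpha(a, f):
    """Multiply a standard-basis element by alpha and reduce modulo f."""
    top = a[-1]
    return [s ^ (top & f[k]) for k, s in enumerate([0] + a[:-1])]


def to_dual(a, f):
    """Assumed dual coordinates: a'_i = coefficient of alpha^0 in A * alpha^i."""
    out, cur = [], list(a)
    for _ in range(len(a)):
        out.append(cur[0])
        cur = mul_alpha(cur, f)
    return out


def std_mul(a, b, f):
    """Reference product in the standard basis (serial form of Equation 10)."""
    c, cur = [0] * len(a), list(a)
    for bit in b:
        c = [x ^ (bit & y) for x, y in zip(c, cur)]
        cur = mul_alpha(cur, f)
    return c


def serial_mul_dual(a_dual, b, f):
    """Loop of Equation 16 without the basis conversions; returns C'."""
    n = len(b)
    ad = list(a_dual)
    c_dual = []
    for _ in range(n):
        ci = 0
        k = 0
        for m in range(n):
            ci ^= b[m] & ad[m]      # c_i' := xor of (b_m AND a_m')
            k ^= f[m] & ad[m]       # k   := xor of (f_m AND a_m')
        c_dual.append(ci)
        ad = ad[1:] + [k]           # A' := A' << 1, then a_{n-1}' := k
    return c_dual


# Example field (an assumption): GF(2^3) with f(x) = x^3 + x + 1.
F = [1, 1, 0, 1]
A, B = [0, 1, 0], [0, 0, 1]
assert serial_mul_dual(to_dual(A, F), B, F) == to_dual(std_mul(A, B, F), F)
```

The check passes because shifting A′ left and appending k yields the dual coordinates of A·α, so the ith pass of the loop produces the ith dual coordinate of the product.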
Changes of A′ within the for loop of Equation 16 will now be explained. If A′ is expressed as A′=(a_0′,a_1′,a_2′, ... ,a_{n-1}′) when i=k, A′ may be represented by Equation 17 with f_{n-1}= ... =f_{n-d+1}=0 when i=k+1.
A′=(a_1′,a_2′, ... ,a_{n-1}′,(a_0′⊕(f_1●a_1′)⊕ ... ⊕(f_{n-d}●a_{n-d}′))) (17)
When i=k+d, A′ may be defined by Equation 18.
A′=(a_d′, ... ,a_{n-1}′,(a_0′⊕(f_1●a_1′)⊕ ... ⊕(f_{n-d}●a_{n-d}′)), ... ,(a_{d−1}′⊕(f_1●a_d′)⊕ ... ⊕(f_{n-d}●a_{n-1}′))) (18)
Accordingly, under the above assumption, A′ at i=k+d can be computed directly from A′ at i=k, so that d-bit parallel processing can be performed on A′. Further, a sufficiently small integer d covers most practical cases, for example the parameters of elliptic curve cryptosystems in the SEC and ANSI X9.62 standards, so the assumption does not impair practicability.
Multiplication using the dual basis on which the d-bit parallel processing can be performed can be represented using a code expression. That is, C, the product of A and B, may be expressed by Equation 19 when f_{n-1}= ... =f_{n-d+1}=0.
A′←A (basis conversion)
For [i=0 to n-1, i=i+d]
  For [j=0 to d−1, j++]
    s_j:=a_j′⊕(f_1●a_{j+1}′)⊕ ... ⊕(f_{n-d}●a_{n-d+j}′)
    c_{i+j}′:=(b_0●a_j′)⊕ ... ⊕(b_{n-1-j}●a_{n-1}′)⊕(b_{n-j}●s_0)⊕ ... ⊕(b_{n-1}●s_{j−1})
  A′:=A′<<d
  Rename the coefficients of A′ as a_0′, ... ,a_{n-1}′
  For [j=0 to d−1, j++]
    a_{n-d+j}′:=s_j
C←C′ (basis conversion) (19)
In multiplication according to Equation 19, the process described with reference to Equation 16 is performed in units of d bits. Consequently, time complexity can be improved by a factor of d while the increase in area complexity is minimized.
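By way of illustration only, the loop of Equation 19 may be sketched as follows. As the dual basis is not specified in the text, the sketch assumes dual coordinates a_i′=φ(A·α^i) with φ taking the coefficient of α^0; it omits the final basis conversion and compares results in dual coordinates. It also assumes f_0=1 and skips coefficients c_{i+j}′ with i+j≥n when n is not a multiple of d. All function names and the example field GF(2^5) with f(x)=x^5+x^2+1 are hypothetical.

```python
def mul_alpha(a, f):
    """Multiply a standard-basis element by alpha and reduce modulo f."""
    top = a[-1]
    return [s ^ (top & f[k]) for k, s in enumerate([0] + a[:-1])]


def to_dual(a, f):
    """Assumed dual coordinates: a'_i = coefficient of alpha^0 in A * alpha^i."""
    out, cur = [], list(a)
    for _ in range(len(a)):
        out.append(cur[0])
        cur = mul_alpha(cur, f)
    return out


def std_mul(a, b, f):
    """Reference product in the standard basis (serial form of Equation 10)."""
    c, cur = [0] * len(a), list(a)
    for bit in b:
        c = [x ^ (bit & y) for x, y in zip(c, cur)]
        cur = mul_alpha(cur, f)
    return c


def dbit_mul_dual(a_dual, b, f, d):
    """Loop of Equation 19 without the basis conversions; returns C'."""
    n = len(b)
    if any(f[k] for k in range(n - d + 1, n)):
        raise ValueError("requires f_{n-1} = ... = f_{n-d+1} = 0")
    ad = list(a_dual)
    c_dual = [0] * n
    for i in range(0, n, d):
        s = []
        for j in range(d):
            # s_j := a_j' xor (f_1 a_{j+1}') xor ... xor (f_{n-d} a_{n-d+j}')
            sj = ad[j]
            for m in range(1, n - d + 1):
                sj ^= f[m] & ad[j + m]
            s.append(sj)
            if i + j < n:
                # Operands (a_j', ..., a_{n-1}', s_0, ..., s_{j-1}).
                window = ad[j:] + s[:j]
                acc = 0
                for m in range(n):
                    acc ^= b[m] & window[m]
                c_dual[i + j] = acc
        # A' := A' << d, then a_{n-d+j}' := s_j.
        ad = ad[d:] + s
    return c_dual


# Example field (an assumption): GF(2^5) with f(x) = x^5 + x^2 + 1, d = 2.
F5 = [1, 0, 1, 0, 0, 1]
A, B = [0, 0, 0, 0, 1], [0, 1, 0, 0, 0]
assert dbit_mul_dual(to_dual(A, F5), B, F5, 2) == to_dual(std_mul(A, B, F5), F5)
```

Each s_j depends only on the coefficients of A′ held at the start of the block, which is what allows the d output coefficients to be formed side by side.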
The multiplier storing and updating unit 31 performs the portion of Equation 19 reproduced as Equation 20 to obtain the updated coefficients in the dual basis.
A′:=A′<<d
Rename the coefficients of A′ as a_0′, ... ,a_{n-1}′
For [j=0 to d−1, j++]
  a_{n-d+j}′:=s_j (20)
The operation unit 33 performs the operation corresponding to c_{i+j}′:=(b_0●a_j′)⊕ ... ⊕(b_{n-1-j}●a_{n-1}′)⊕(b_{n-j}●s_0)⊕ ... ⊕(b_{n-1}●s_{j−1}) in Equation 19 on the multiplier coefficients a′=(a_0′, ... ,a_{n-1}′) output from the multiplier storing and updating unit 31 and the multiplicand coefficients b=(b_0, ... ,b_{n-1}) output from the multiplicand storage unit 32. That is, the operation unit 33 multiplies the mth multiplicand coefficients by the (m+j)th multiplier coefficients and performs XOR operations on the results of the multiplication. The final j multiplicand coefficients are multiplied by the coefficients s_0, ... , s_{j−1}, which are obtained by s_j:=a_j′⊕(f_1●a_{j+1}′)⊕ ... ⊕(f_{n-d}●a_{n-d+j}′) and are therefore determined by a′, which is updated using Equation 20.
Accordingly, after 1 cycle, A′ becomes A′=(a_2′,a_3′,a_4′,a_0′⊕a_2′,a_1′⊕a_3′). Further, c_i′ becomes (b_0●a_0′)⊕(b_1●a_1′)⊕(b_2●a_2′)⊕(b_3●a_3′)⊕(b_4●a_4′), and c_{i+1}′ becomes (b_0●a_1′)⊕(b_1●a_2′)⊕(b_2●a_3′)⊕(b_3●a_4′)⊕(b_4●(a_0′⊕a_2′)).
The reference numerals t0-t4 and D0-D4 have been used in
Table 1 shows the performance of the apparatus for performing multiplication using the standard basis.
Here, A represents a two input AND gate, X represents a two input XOR gate, R represents a register, TA represents an AND gate delay, TX represents an XOR gate delay, n represents a dimension, and d represents the number of bits in parallel processing.
Table 2 shows the performance of the apparatus for performing multiplication using the dual basis.
Performance values of the basis converting means are excluded from Table 2.
Here, A represents a two input AND gate, X represents a two input XOR gate, R represents a register, TA represents an AND gate delay, TX represents an XOR gate delay, n represents a dimension, and d represents the number of bits in parallel processing.
Table 3 shows the performance of the apparatus for performing multiplication according to an embodiment of the present invention, implemented in 0.18 μm process technology of Samsung Electronics Co., Ltd. and evaluated with the performance values described above.
Here, the apparatus for performing multiplication based on the dual basis includes the basis converting means.
According to Table 3, when computational speed doubled, area complexity increased approximately 1.43 to 1.61 times. Therefore, the area complexity does not rise rapidly.
As described above, since the apparatus according to an embodiment of the present invention performs serial multiplication with d-bit parallel processing, the apparatus performs arithmetic operations faster than the conventional serial multiplier while minimizing the increase in area complexity. Furthermore, the expected maximum delay is within one clock cycle at 100 MHz. Accordingly, the apparatus can be effectively applied to terminals having a low clock speed.
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.
Number | Date | Country | Kind |
---|---|---|---|
2003-72140 | Oct 2003 | KR | national |