This application claims the priority of Korean Patent Application No. 10-2004-0087044, filed on Oct. 29, 2004 in the Korean Intellectual Property Office, the disclosure of which is incorporated herein in its entirety by reference.
1. Field of the Invention
The present invention relates to an apparatus for hybrid multiplication, and more particularly, to an apparatus for hybrid multiplication in the finite field GF(2^m) capable of achieving a trade-off between the area and the performance of the multiplication apparatus.
2. Description of the Related Art
A variety of operations in GF(2^m) are widely used in communication systems and public-key cryptosystems. In communication systems, GF(2^m) operations are used to enhance the reliability of information, and m is determined by the amount of data whose reliability is to be guaranteed. The exponent m is closely related to the size of the hardware required for calculation. For communication systems, m in the range from 8 to 32 is used, and the basic calculators for this range, such as an adder, a multiplier, and an inverse multiplier, are relatively easy to implement.
Meanwhile, in public-key cryptosystems, m is determined according to the security level to be guaranteed, and in the case of an elliptic curve cryptosystem (ECC), m of 160 or more is recommended in order to guarantee sufficient security. Thus, for large m, the area as well as the performance of the hardware should be considered. In particular, for the multiplier, which accounts for a major part of public-key cryptosystem calculations, performance and area can differ greatly depending on the implementation method, and consequently the performance of the entire system can differ greatly as well.
An apparatus for multiplication in GF(2^m) can be designed by a bit-serial method or a bit-parallel method. The bit-serial method has the advantage that the hardware can be implemented in a small area, but the operation must be repeated m or more times, so the operation time increases and the performance of the system can be lowered. Meanwhile, the bit-parallel method can be expected to provide high-speed operation, but as m increases, the hardware area increases in proportion to the square of m, so that implementation is difficult for a large system.
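As a point of reference for the bit-serial method described above, the following is a minimal software sketch, not a description of any particular hardware. It assumes the trinomial f(x) = x^m + x^n + 1 used later in this description and stores a polynomial as a Python integer whose bit i is the coefficient of x^i. The function name `gf2m_mul_bit_serial` is hypothetical.

```python
def gf2m_mul_bit_serial(a: int, b: int, m: int, n: int) -> int:
    """Multiply a(x) * b(x) mod f(x) = x^m + x^n + 1, one bit of b per iteration.

    Polynomials are stored as integers: bit i holds the coefficient of x^i.
    Both a and b are assumed to have degree below m.
    """
    f = (1 << m) | (1 << n) | 1
    c = 0
    # Scan b from its highest-order bit down (MSB-first scheme): m iterations.
    for i in reversed(range(m)):
        c <<= 1                 # multiply the accumulator by x
        if (c >> m) & 1:        # degree reached m, so add (XOR) f(x)
            c ^= f
        if (b >> i) & 1:        # add a(x) when the current bit of b is 1
            c ^= a
    return c


# Example in GF(2^7) with f(x) = x^7 + x + 1: x * x^6 = x^7 = x + 1.
print(gf2m_mul_bit_serial(0b10, 0b1000000, 7, 1))  # prints 3, i.e. x + 1
```

The loop runs m times regardless of the operands, which is the repetition cost mentioned above. A bit-parallel design would instead evaluate all m iterations in combinational logic at once.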
The present invention provides an apparatus and method for hybrid multiplication in the finite field GF(2^m) capable of achieving a trade-off between the area and the performance of a multiplication apparatus while optimizing the operation in the finite field GF(2^m).
According to an aspect of the present invention, there is provided an apparatus for hybrid multiplication including: a matrix Z generation unit generating an [m×k] matrix Z for performing a partial multiplication of a(x) and b(x), by dividing b(x) by k bits (k ≤ ⌈m/2⌉), from an [(m+k−1)×k] coefficient matrix of a(x), when multiplication of an m-bit multiplier a(x) and an m-bit multiplicand b(x) is performed in GF(2^m); a partial multiplication unit performing the partial multiplication ⌈m/k⌉k−1 times in units of rows of the matrix Z to calculate an (⌈m/k⌉k−1)-th partial multiplication value and a final result value of the multiplication; and a reduction unit receiving the (⌈m/k⌉k−1)-th partial multiplication value fed back from the partial multiplication unit and performing reduction of the value in order to obtain a partial multiplication value next to the (⌈m/k⌉k−1)-th partial multiplication value.
According to another aspect of the present invention, there is provided a hybrid multiplication method for multiplication of an m-bit multiplier a(x) and an m-bit multiplicand b(x) in GF(2^m), including: generating an [m×k] matrix Z for performing a partial multiplication of a(x) and b(x), by dividing b(x) by k bits (k ≤ ⌈m/2⌉); by performing the partial multiplication ⌈m/k⌉k−1 times in units of rows of the matrix Z, calculating an (⌈m/k⌉k−1)-th partial multiplication value and a final result value of the multiplication; and reducing the obtained (⌈m/k⌉k−1)-th partial multiplication value in order to obtain a partial multiplication value next to the (⌈m/k⌉k−1)-th partial multiplication value.
The above and other features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings in which:
The attached drawings for illustrating preferred embodiments of the present invention are referred to in order to gain a sufficient understanding of the present invention, the merits thereof, and the objectives accomplished by the implementation of the present invention. Hereinafter, the present invention will be described in detail by explaining preferred embodiments of the invention with reference to the attached drawings. In the drawings, whenever the same element reappears in subsequent drawings, it is denoted by the same reference numeral.
The present invention is designed to solve the problems of the multiplication methods described above. The present invention provides an apparatus and method for hybrid multiplication in the finite field GF(2^m) capable of achieving a trade-off between the area and the performance of a multiplication apparatus while optimizing the operation in the finite field GF(2^m).
For convenience of understanding, the multiplication according to the present invention will first be explained in terms of its mathematical formulation.
Assuming that $f(x) = x^m + x^n + 1$ ($1 \le n \le m/2$) is an irreducible polynomial defining GF(2^m), an arbitrary element a(x) in GF(2^m) can be expressed as

$$a(x) = \sum_{i=0}^{m-1} a_i x^i, \qquad a_i \in GF(2).$$
Here, an irreducible polynomial satisfying the condition $n \le m/2$ is a form of irreducible polynomial recommended in standards such as SEC, WTLS, ISO, NIST, and FIPS.
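Assuming that

$$b(x) = \sum_{i=0}^{k-1} b_i x^i \qquad (k < m),$$

if the partial multiplication of an m-bit multiplier a(x) and an arbitrary k-bit multiplicand b(x) is considered, the partial product d(x) of a(x) and b(x) can be expressed as the following equation 1, and can also be expressed by using an (m+k−1)-row coefficient matrix M:

$$d(x) = a(x)\,b(x) = \sum_{i=0}^{m+k-2} d_i x^i \qquad (1)$$

$$\begin{bmatrix} d_0 \\ d_1 \\ \vdots \\ d_{m+k-2} \end{bmatrix} = M \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_{k-1} \end{bmatrix}, \qquad M = \begin{bmatrix} a_0 & 0 & \cdots & 0 \\ a_1 & a_0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_{m-1} & a_{m-2} & \cdots & a_{m-k} \\ 0 & a_{m-1} & \cdots & a_{m-k+1} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{m-1} \end{bmatrix}$$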
Since d(x) includes terms of order higher than m−1, it can be reduced to order m−1 by using $x^m = x^n + 1$. Assuming $k \le \lceil m/2 \rceil$, the highest-order term $x^{m+k-2}$ of d(x) satisfies the following equation 2:

$$x^{m+k-2} = x^{k-2}(x^n + 1) = x^{n+k-2} + x^{k-2} \qquad (n + k - 2 \le m/2 + \lceil m/2 \rceil - 2 < m) \qquad (2)$$

That is, if $k \le \lceil m/2 \rceil$, reduction is performed only once for each term of d(x) of order m or higher.
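The single-pass property stated above can be checked numerically. The sketch below is an illustration under assumptions: polynomials are stored as integers (bit i is the coefficient of x^i), and the function names `reduce_once` and `reduce_reference` are hypothetical. `reduce_once` folds every coefficient of order m or higher back exactly once, while `reduce_reference` performs ordinary long division by f(x) for comparison.

```python
def reduce_once(d: int, m: int, n: int) -> int:
    """Fold the coefficients of d at positions >= m back once, using x^m = x^n + 1."""
    high = d >> m                    # coefficients d_m, d_{m+1}, ...
    low = d & ((1 << m) - 1)         # coefficients d_0 .. d_{m-1}
    return low ^ high ^ (high << n)  # d_{m+j} is added to d_j and to d_{n+j}


def reduce_reference(d: int, m: int, n: int) -> int:
    """Plain polynomial long division by f(x) = x^m + x^n + 1, for comparison."""
    f = (1 << m) | (1 << n) | 1
    while d.bit_length() > m:
        d ^= f << (d.bit_length() - 1 - m)
    return d


# Example values (assumed for illustration): m = 7, n = 1, k = 4 = ceil(7/2).
m, n, k = 7, 1, 4
d = 1 << (m + k - 2)               # highest-order term x^9
print(bin(reduce_once(d, m, n)))   # x^9 = x^2 (x + 1) = x^3 + x^2, prints 0b1100
```

For every d of order at most m+k−2 the two functions agree when n ≤ m/2 and k ≤ ⌈m/2⌉, because the folded term x^(n+j) never reaches order m.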
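By this reduction, each row element $d_{m+j}$ ($0 \le j \le k-2$) of the matrix M is added once to $d_{n+j}$ and once to $d_j$. Assuming that Z denotes the [m×k] matrix obtained from matrix M after the reduction, matrix Z is formed as the sum of three matrices X, T, and U, that is, Z = X + T + U. Here, matrix X is formed from the 1st through m-th rows of matrix M, and matrix T is formed from the (m+1)-th and higher rows of matrix M, with the remaining rows filled with 0's. Matrix U is formed by shifting matrix T down by n rows and filling the upper n rows with 0's. With rows i = 0, ..., m−1 and columns j = 0, ..., k−1 counted from zero, the elements of the three matrices are as follows:

$$X_{i,j} = \begin{cases} a_{i-j} & i \ge j \\ 0 & \text{otherwise} \end{cases}, \qquad T_{i,j} = \begin{cases} a_{m+i-j} & i < j \\ 0 & \text{otherwise} \end{cases}, \qquad U_{i,j} = \begin{cases} T_{i-n,j} & i \ge n \\ 0 & i < n \end{cases}$$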
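The construction Z = X + T + U can be sketched in software directly from the verbal definitions given above. This is a minimal illustration, not the hardware of the invention. Bit vectors are Python lists with index i holding the coefficient of x^i, addition is XOR, and the helper names `to_bits`, `build_M`, `build_Z`, and `mat_vec` are hypothetical.

```python
from typing import List


def to_bits(value: int, width: int) -> List[int]:
    """Coefficient list of a polynomial stored as an integer (index i = x^i)."""
    return [(value >> i) & 1 for i in range(width)]


def build_M(a_bits: List[int], m: int, k: int) -> List[List[int]]:
    """(m+k-1) x k coefficient matrix of a(x): M[i][j] = a_{i-j}, or 0 if out of range."""
    return [[a_bits[i - j] if 0 <= i - j < m else 0 for j in range(k)]
            for i in range(m + k - 1)]


def build_Z(a_bits: List[int], m: int, n: int, k: int) -> List[List[int]]:
    """m x k matrix Z = X + T + U for f(x) = x^m + x^n + 1 (addition is XOR)."""
    M = build_M(a_bits, m, k)
    X = M[:m]                                              # rows 1..m of M
    T = M[m:] + [[0] * k for _ in range(m - (k - 1))]      # rows m+1.. of M, zero-extended
    U = [[0] * k for _ in range(n)] + T[:m - n]            # T shifted down by n rows
    return [[X[i][j] ^ T[i][j] ^ U[i][j] for j in range(k)] for i in range(m)]


def mat_vec(Z: List[List[int]], b_bits: List[int]) -> List[int]:
    """Matrix-vector product over GF(2): AND for multiplication, XOR for addition."""
    out = []
    for row in Z:
        acc = 0
        for z, bj in zip(row, b_bits):
            acc ^= z & bj
        out.append(acc)
    return out


# Example values (assumed): m = 7, n = 1, k = 4, a(x) = x^6, b(x) = x^3.
m, n, k = 7, 1, 4
Z = build_Z(to_bits(0b1000000, m), m, n, k)
print(mat_vec(Z, to_bits(0b1000, k)))  # x^9 = x^3 + x^2, prints [0, 0, 1, 1, 0, 0, 0]
```

Under the conditions n ≤ m/2 and k ≤ ⌈m/2⌉, the rows of T that fall off the bottom when forming U are all zero, so no information is lost by the shift.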
Owing to this regularity, an arbitrary i-th row $Z_i$ of matrix Z can be obtained by signal line mapping without any additional logic gate. The n-th row $Z_n$ of matrix Z is calculated as in the following equation 3:

$$Z_n = \begin{cases} (a_n\, a_{n-1} \cdots a_0\, a_{m-1}\, a_{m-2} \cdots a_{m-k+n+1}) + (0\, a_{m-1} \cdots a_{m-k+1}) & \text{if } n < k \\ (a_n\, a_{n-1} \cdots a_{n-k+1}) + (0\, a_{m-1} \cdots a_{m-k+1}) & \text{if } n \ge k \end{cases} \qquad (3)$$
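Equation 3 and the signal line mapping can be illustrated as follows. The sketch is an interpretation derived from the definitions of X, T, and U rather than a statement of the actual wiring: it assumes that rows above row n are plain (cyclic) selections of the bits of a(x), that row n is computed by equation 3, and that each row below row n is the previous row shifted by one column with $a_i$ entering in the first column. All function names are hypothetical.

```python
from typing import List


def z_entry(a_bits: List[int], m: int, n: int, i: int, j: int) -> int:
    """Element (i, j) of Z = X + T + U, taken directly from the matrix definitions."""
    x = a_bits[i - j] if i >= j else 0
    t = a_bits[m + i - j] if i < j else 0
    u = a_bits[m + i - n - j] if 0 <= i - n < j else 0
    return x ^ t ^ u


def z_row_n(a_bits: List[int], m: int, n: int, k: int) -> List[int]:
    """Row n of Z according to equation 3 (column 0 first)."""
    if n < k:
        first = ([a_bits[n - j] for j in range(n + 1)]
                 + [a_bits[m - 1 - t] for t in range(k - n - 1)])
    else:
        first = [a_bits[n - j] for j in range(k)]
    second = [0] + [a_bits[m - 1 - t] for t in range(k - 1)]
    return [p ^ q for p, q in zip(first, second)]


def z_rows_by_wiring(a_bits: List[int], m: int, n: int, k: int) -> List[List[int]]:
    """All rows of Z using XOR gates only for row n (assumed wiring, see lead-in)."""
    rows = []
    for i in range(n):                        # rows above n: selection of bits of a
        rows.append([a_bits[(i - j) % m] for j in range(k)])
    rows.append(z_row_n(a_bits, m, n, k))     # row n: the only row needing XORs
    for i in range(n + 1, m):                 # rows below n: shift the previous row
        rows.append([a_bits[i]] + rows[-1][:k - 1])
    return rows


# Example values (assumed): m = 7, n = 1, k = 4, a(x) = x^6.
print(z_row_n([0, 0, 0, 0, 0, 0, 1], 7, 1, 4))  # prints [0, 1, 1, 0]
```

In this reading, only the k−1 nonzero positions of the second operand in equation 3 require XOR gates, which is consistent with the statement that the remaining rows need no additional logic.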
The partial multiplication of a(x) and b(x) is calculated as the matrix multiplication of matrix Z and b(x). To extend this partial multiplication to the multiplication of two m-bit operands, it is assumed that

$$a(x) = \sum_{i=0}^{m-1} a_i x^i, \qquad b(x) = \sum_{i=0}^{m-1} b_i x^i.$$
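In order to calculate c(x) = a(x)b(x) mod f(x), which is the multiplication result of a(x) and b(x), the reduction matrix Z is generated and then b(x) is divided into k-bit blocks according to the following equation 4:

$$b(x) = \sum_{j=0}^{\lceil m/k \rceil - 1} x^{jk} \left( \sum_{i=0}^{k-1} b_{jk+i}\, x^i \right) \qquad (4)$$

Here, $b_i = 0$ ($m \le i < \lceil m/k \rceil k$).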
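The division of b(x) into k-bit blocks with zero padding can be sketched as follows. This is an illustration only. It stores b(x) as an integer whose bit i is $b_i$, and the function name `split_into_k_bit_blocks` is hypothetical.

```python
from typing import List


def split_into_k_bit_blocks(b: int, m: int, k: int) -> List[int]:
    """Split an m-bit value into ceil(m/k) blocks of k bits, lowest-order block first.

    Bits b_i with m <= i < ceil(m/k)*k are treated as 0, which happens
    automatically here because b is assumed to be smaller than 2^m.
    """
    num_blocks = -(-m // k)            # ceil(m / k)
    mask = (1 << k) - 1
    return [(b >> (j * k)) & mask for j in range(num_blocks)]


# Example values (assumed): m = 7, k = 4, b = 1011010 in binary.
blocks = split_into_k_bit_blocks(0b1011010, 7, 4)
print(blocks)  # prints [10, 5], i.e. low block 1010 and zero-padded high block 0101
```

When the blocks are consumed from the highest-order block down to the lowest, the list above would be read in reverse order.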
Assuming $s = \lceil m/k \rceil - 1$, by using matrix Z, it can be seen that
As for $a(x)T_{s-1}$, k-bit reduction is performed as in the following equation 5, and the result is added when $a(x)T_{s-1}$ is calculated:
Since this reduction process can be performed in parallel while the partial multiplication of $a(x)T_{s-1}$ is performed, the calculation time is not delayed. As described above, the multiplication in the finite field GF(2^m) in the present invention is performed by repeating this k-bit partial multiplication and reduction from j=s to j=0.
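The repeated k-bit partial multiplication and reduction from j=s to j=0 can be modelled in software as shown below. Because the equations for this step are not reproduced here, the recurrence used is an assumption: in each iteration the running result c(x) is multiplied by x^k and reduced, and the reduced partial product of a(x) with the j-th k-bit block of b(x) is added. The reduced partial product is computed with a carry-less multiply followed by a single fold, which gives the same value as multiplying by matrix Z. All function names are hypothetical.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less (polynomial) multiplication over GF(2)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def mulmod_reference(a: int, b: int, m: int, n: int) -> int:
    """Full product followed by long division by f(x) = x^m + x^n + 1."""
    f = (1 << m) | (1 << n) | 1
    d = clmul(a, b)
    while d.bit_length() > m:
        d ^= f << (d.bit_length() - 1 - m)
    return d


def hybrid_mul(a: int, b: int, m: int, n: int, k: int) -> int:
    """k bits of b per iteration; requires 1 <= n <= m/2 and 1 <= k <= ceil(m/2)."""
    if not (1 <= n <= m / 2 and 1 <= k <= -(-m // 2)):
        raise ValueError("parameters outside the range assumed in the description")
    s = -(-m // k) - 1                  # s = ceil(m/k) - 1
    mask_m = (1 << m) - 1
    mask_k = (1 << k) - 1
    c = 0
    for j in range(s, -1, -1):
        # Reduction of c(x) * x^k: the top k coefficients wrap to x^i and x^(n+i).
        high = c >> (m - k)
        c_reduct = ((c << k) & mask_m) ^ high ^ (high << n)
        # Partial multiplication of a(x) with the j-th k-bit block of b(x).
        block = (b >> (j * k)) & mask_k
        d = clmul(a, block)
        partial = (d & mask_m) ^ (d >> m) ^ ((d >> m) << n)
        c = partial ^ c_reduct
    return c


# Example in GF(2^7) with f(x) = x^7 + x + 1 and k = 4: x * x^6 = x + 1.
print(hybrid_mul(0b10, 0b1000000, 7, 1, 4))  # prints 3
```

In this model, `c_reduct` depends only on the previous result and `partial` only on a(x) and the current block, so the two could be evaluated side by side, in line with the remark that the reduction does not delay the calculation. Choosing k = 1 makes the loop behave like a bit-serial multiplier, and larger k trades area for fewer iterations.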
The method for performing the multiplication described above will now be explained with reference to attached drawings.
As shown in
The details of the multiplication method shown in
The input multiplier a(x) and multiplicand b(x) are stored in register A 10 and register B 11, respectively. Register A 10 stores the m-bit input and outputs it as is, and register B 11 stores the m-bit input and outputs the high-order k bits each time.
When multiplier a(x) and multiplicand b(x) are stored in register A 10 and register B 11, $Z_n$, the n-th row of matrix Z that is repeatedly used in the multiplication, is calculated in the Zn calculation unit 12 and stored in register Zn 13.
When necessary, the n-th row of matrix Z may be taken directly from the output of the Zn calculation unit 12, without being stored in register Zn 13, and the multiplication can be performed with it. This has the advantage of reducing the hardware required, because register Zn 13 is not needed.
Matrix Z is generated by using the calculation result value, Zn, of the n-th row of matrix Z, and multiplier a(x) stored in register A 10 in operation S11. Due to predetermined regularity as described above, matrix Z, as shown in
Once each element value of matrix Z is generated, the partial multiplication of matrix Z and the output value bk of register B 11 is performed in the partial multiplication unit 15 in operation S12.
The partial multiplication unit 15 is formed with a bit multiplication unit 151 performing bit multiplication of the [m×k] bit output of the matrix Z generation unit 14 and k-bit output bk of register B 11, and a bit addition unit 152 calculating partial multiplication result value partial_mul by performing bit addition of row elements of the bit multiplication result value and bit addition of reduced values of the pre-calculated partial multiplication results.
The bit addition unit 152 is implemented by [m×k] XOR operation units 1521. In order to add each row element of the output value bit_mul of the bit multiplication unit 151, the bit addition unit 152 performs bit addition in units of k rows in operation S122, and at the same time, performs further addition of reduction calculation value c_reduct corresponding to each row. Reduction is performed in the reduction unit 17.
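The data flow of the partial multiplication unit described above can be modelled as an array of AND operations followed by XOR accumulation. The sketch below is a behavioural illustration, not a gate-level design. The signal names bit_mul, partial_mul, and c_reduct follow the description, while the function name `partial_multiply` and the list-based representation are assumptions.

```python
from typing import List


def partial_multiply(Z: List[List[int]], b_k: List[int], c_reduct: List[int]) -> List[int]:
    """Model of the bit multiplication unit and the bit addition unit.

    Z        : m rows of k bits (output of the matrix Z generation unit)
    b_k      : k bits of the multiplicand currently output by register B
    c_reduct : m bits, the reduced value of the previous partial result
    """
    # Bit multiplication: every element of Z is ANDed with the matching bit of b_k.
    bit_mul = [[z & bj for z, bj in zip(row, b_k)] for row in Z]
    # Bit addition: XOR the k products of each row, then XOR in c_reduct for that row.
    partial_mul = []
    for row, r in zip(bit_mul, c_reduct):
        acc = r
        for bit in row:
            acc ^= bit
        partial_mul.append(acc)
    return partial_mul


# Small made-up example with m = 3 rows and k = 2 columns.
print(partial_multiply([[1, 0], [1, 1], [0, 1]], [1, 1], [0, 1, 1]))  # prints [1, 1, 0]
```

Each row uses k XOR operations in this model (k−1 to combine the k products and one more for c_reduct), which matches the count of [m×k] XOR operation units given above.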
The reduction unit 17 is implemented by k XOR operation units 171 and signal line mapping as shown in
If the present invention is used, in GF(2^m) defined by the irreducible polynomial $x^m + x^n + 1$ ($1 \le n \le m/2$), a trade-off between the area and the operation speed of the multiplication apparatus can be achieved, so that the multiplication can be performed efficiently. In particular, this enables an efficient implementation of an elliptic curve cryptosystem using a large value of m.
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims. The preferred embodiments should be considered in descriptive sense only and not for purposes of limitation. Therefore, the scope of the invention is defined not by the detailed description of the invention but by the appended claims, and all differences within the scope will be construed as being included in the present invention.
Number | Date | Country | Kind |
---|---|---|---|
10-2004-0087044 | Oct 2004 | KR | national |