BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention relates to Elliptic Curve Cryptography (ECC), and in particular, to arithmetic circuits for EC operations.
2. Description of the Related Art
Elliptic Curve Cryptography (ECC) is an approach to public-key cryptography based on the algebraic structure of elliptic curves over finite fields. The use of elliptic curves in cryptography was suggested independently by Neal Koblitz and Victor S. Miller in 1985. Elliptic curves are also used in several integer factorization algorithms that have applications in cryptography, such as, for instance, Lenstra elliptic curve factorization, but this use of elliptic curves is not usually referred to as “elliptic curve cryptography.”
In ECC, a finite field, also referred to as a Galois field (GF), is a field that contains only finitely many elements. Galois fields are typically categorized into two types, a prime field GF(p) and a binary field GF($2^m$). The prime field GF(p) is a finite field with p elements, usually labelled 0, 1, 2, ..., p−1, in which arithmetic is performed modulo p. Most ECC schemes operate over the prime field GF(p). Common examples are the Elliptic Curve Diffie-Hellman (ECDH) key agreement scheme based on the Diffie-Hellman algorithm, the Elliptic Curve Digital Signature Algorithm (ECDSA) based on the Digital Signature Algorithm, and the ECMQV key agreement scheme based on the MQV key agreement scheme.
Conventionally, in a software-based system, ECC schemes are executed by a CPU in cooperation with a memory. The memory is accessed frequently, so a costly wide bus is required. Specially designed circuits have been proposed to accelerate EC operations. For example, U.S. Pat. No. 6,963,644, U.S. Pat. No. 6,820,105, and U.S. Pat. No. 6,691,143 disclose hardware implementations for various ECC calculations, in which a plurality of multipliers and adders are utilized. The circuits in these disclosures, however, are designed for particular operations, and their components cannot be reused or shared by other algorithms. Redundant components are therefore used at considerable cost, and an improvement is desirable.
BRIEF SUMMARY OF THE INVENTION
An exemplary embodiment of a cryptographic system is disclosed to implement an Elliptic Curve operation method. A memory stores a program and data. A central processing unit (CPU) dispatches requests to the program. The program is converted, by the CPU or by an ASIC flow controller, into an equivalent substitution sequence comprising only arithmetic addition, subtraction and shift operations. A register pool stores program data associated with the substitution sequence. An arithmetic logic unit (ALU) is controlled by the ASIC flow controller or the CPU to execute the substitution sequence and output an execution result.
In the ALU, an adder adds or subtracts two input numbers based on an adder trigger signal to generate the execution result. Two selectors controlled by a selection signal, pass values from the register pool to the adder as the input numbers. The adder trigger signal and selection signal are delivered from the ASIC flow controller based on the substitution sequence.
In the register pool, a plurality of registers store the program data associated with the substitution sequence. A dispatcher selectively stores the execution result or program data to one of the registers based on a storage signal. The storage signal is delivered from the ASIC flow controller based on the substitution sequence.
The shift operation may be performed by the register pool. The ASIC flow controller delivers a shift signal to one of the registers when a shift operation is requested, and the register shifts its stored data leftwards or rightwards accordingly. Each selector is coupled to outputs of the registers, selecting one of them to pass an input number to the adder. The registers may be at least 160 bits wide, the adder may be a 32-bit full adder, and each input number is a 32-bit value obtained from the registers based on the selection signal.
Specifically, the program is an Elliptic Curve (EC) related application comprising point multiplication and addition operations, and prime field multiplication, inversion, addition, and subtraction operations.
The ASIC flow controller converts the point multiplication operations to a sequence comprising only prime field operations and shift operations. Furthermore, the ASIC flow controller converts prime field multiplication and inversion operations to an equivalent sequence comprising only arithmetic addition, subtraction and shift operations, such that the substitution sequence equivalent to the program is generated. The conversion of the prime field multiplication and inversion operations is a Montgomery domain transfer.
Another embodiment is an Elliptic Curve operation method, for use in an apparatus capable of performing only arithmetic addition, subtraction and shift operations. A program to be executed is first provided. The program is converted into an equivalent substitution sequence comprising only arithmetic addition, subtraction and shift operations. The substitution sequence is then executed and an execution result is output. A detailed description is given in the following embodiments with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention can be more fully understood by reading the subsequent detailed description and examples with references made to the accompanying drawings, wherein:
FIG. 1 shows an embodiment of a cryptographic system 100 according to the invention;
FIG. 2 shows an embodiment of a state machine for Elliptic Curve (EC) operations;
FIG. 3 shows an embodiment of a register pool 210 and an ALU 220 according to FIG. 1;
FIG. 4 is an exemplary flowchart of an EC point multiplication procedure;
FIG. 5 is an exemplary flowchart of a point addition operation; and
FIG. 6 is a flowchart of a Montgomery multiplication algorithm.
DETAILED DESCRIPTION OF THE INVENTION
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
FIG. 1 shows an embodiment of a cryptographic system 100 according to the invention. The cryptographic system 100 may be an embedded system comprising a CPU 102, a memory 104 and a specifically designed accelerator 110. The memory 104 may store programs and associated data intended to provide cryptographic services. The accelerator 110 is a supportive unit for accelerating EC related operations needed in the Elliptic Curve Diffie-Hellman (ECDH) key agreement scheme, the Elliptic Curve Digital Signature Algorithm (ECDSA), and the ECMQV key agreement scheme. The accelerator 110 is controlled by the CPU 102 and comprises an ASIC flow controller 120, a register pool 210 and an ALU 220. When a program of EC operation is executed, the CPU 102 controls the accelerator 110 via an input interface 115 to accomplish the task. The program may be directly converted by the CPU 102 or the ASIC flow controller 120 (activated by the CPU 102) into an equivalent substitution sequence comprising only arithmetic addition, subtraction and shift operations, with program data #DATA simultaneously extracted therefrom. The ALU 220 then executes the substitution sequence and outputs an execution result #SUM. The register pool 210 stores the program data #DATA associated with the substitution sequence. The execution result #SUM may also be fed back to the register pool 210 for iterative calculations. Specifically, the ASIC flow controller 120 serves as a flow controller while the ALU 220 executes the substitution sequence, so that instructions such as loop, jump and compare are supported.
FIG. 2 shows an embodiment of a state machine for EC operations. The EC operations calculate coordinates of elliptic curve points (x,y) on a two-dimensional plane, such as addition of two points, doubling a point and finding the multiple of a point. The EC operations can be decomposed into four fundamental operations in the prime field GF(p), namely addition, subtraction, multiplication and inversion. All of the operations can be further converted into a simplified form by transferring into the Montgomery domain. In state 201, instructions of a program are sequentially executed. When different operations are required, corresponding state blocks are requested as a function call. As an example, an EC point multiplication (kG) is processed in state 203. An integer k and a point G are input, and their product, kG, is output. EC point multiplication is equivalent to a sequence of EC point additions (also applicable for subtractions). State 205 serves the EC point addition, by which a point P+Q is obtained from two input points P and Q. If the points P and Q are identical, the output is referred to as a point double 2P. State 205 is thus a sub-function for states 201 and 203.
Furthermore, EC point addition is convertible to a sequence of operations in the prime field GF(p), such as multiplication/inversion and addition/subtraction. Thus, multiplication (as well as inversion) in the prime field GF(p) is performed in state 207, serving as a sub-function for the aforementioned state blocks 201, 203 and 205. Moreover, multiplication in the prime field GF(p) is also convertible to a sequence of arithmetic addition/subtraction operations. For example, by transferring into the Montgomery domain, multiplication/inversion in the prime field GF(p) can be accomplished using only adders and bit shifters, as associated with state 209. In view of this classification of states, generalized hardware is provided in the embodiment to perform all EC operations and operations over the prime field GF(p).
FIG. 3 shows an embodiment of a register pool 210 and an ALU 220 according to FIG. 1. The register pool 210 and ALU 220 are cooperatively controlled by the ASIC flow controller 120 via control signals #store, #shift, #select and #addsub to perform the arithmetic addition operations described in state 209 of FIG. 2. In the register pool 210, a plurality of registers 304 are provided to buffer data to be calculated. For example, ECDSA may utilize 160-bit keys for signatures and verifications, so the registers 304 are implemented to have at least 160 bits. Arithmetic shift operations may be performed in the registers 304 under control of a shift signal #shift. When a shift operation is requested during execution of the program, the ASIC flow controller 120 delivers the shift signal #shift to a corresponding register 304, moving its data leftwards or rightwards accordingly. The dispatcher 302 serves as an allocation manager, controlled by a storage signal #store to store the execution result #SUM or program data #DATA to the assigned register 304. Alternatively, the shift operation may be performed by an adder 308 itself, in which case the shift signal #shift is delivered to the adder 308.
The ALU 220 comprises the adder 308, which adds or subtracts two input numbers based on an adder trigger signal #addsub to generate the execution result. The two numbers are selected from the registers 304 by two selectors 306 according to a selection signal #select. The adder trigger signal #addsub and selection signal #select are delivered from the ASIC flow controller 120 or the CPU 102 when required. In the embodiment, the registers 304 are 160 bits wide, and the adder 308 may be a 32-bit full adder. Each input number is 32 bits wide, with an extra bit indicating carry or borrow. The output of the adder 308 is coupled to the dispatcher 302, so the execution result #SUM can be fed back to the registers 304. If a 160-bit addition is requested, the adder 308 loops for five cycles with 32 bits processed per cycle. The execution result #SUM also comprises an extra bit to indicate carry or borrow. Through the control signals, the register pool 210 and ALU 220 flexibly carry out all EC related operations using only addition, subtraction and shift operations.
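As a minimal software sketch of the datapath described above, a 160-bit addition or subtraction can be modeled as five passes through a 32-bit adder, with a one-bit carry or borrow propagated between passes. The function name add_sub_160 and the constants are illustrative assumptions and do not appear in the embodiment.

```python
LIMB_BITS = 32                      # width of the adder (assumed name)
LIMB_MASK = (1 << LIMB_BITS) - 1
NUM_LIMBS = 5                       # 5 x 32 bits = 160 bits


def add_sub_160(a, b, subtract=False):
    """Add or subtract two 160-bit values, 32 bits per cycle.

    Returns (result, carry_or_borrow), where the second item models the
    extra bit that accompanies the execution result.
    """
    result = 0
    carry = 0                       # carry when adding, borrow when subtracting
    for cycle in range(NUM_LIMBS):
        shift = cycle * LIMB_BITS
        a_limb = (a >> shift) & LIMB_MASK   # selector output 1
        b_limb = (b >> shift) & LIMB_MASK   # selector output 2
        if subtract:
            s = a_limb - b_limb - carry
            carry = 1 if s < 0 else 0
        else:
            s = a_limb + b_limb + carry
            carry = s >> LIMB_BITS
        result |= (s & LIMB_MASK) << shift  # write the 32-bit slice back
    return result, carry


# Adding 1 to the all-ones value wraps to zero and raises the carry bit.
print(add_sub_160((1 << 160) - 1, 1))       # (0, 1)
```

A final carry of 1 on addition, or a final borrow of 1 on subtraction, is the kind of condition a flow controller could test when deciding whether a modular correction is needed.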
FIG. 4 is an exemplary flowchart of an EC point multiplication procedure. According to the ANSI X9.62 standard, the ECDSA signature/verification process requires multiplication of a point G on an elliptic curve by a constant k. EC multiplication, as represented in state 203 of FIG. 2, is accomplished by a sequence of EC addition/subtraction and arithmetic operations. In step 401, the constant k and the point G are given. In step 403, arithmetic multiplication is used to calculate h=3k, and variables are initialized, such as e=k and R=G. In step 405, a loop is initialized for i=r−1 down to 1, where r is the index of the most significant bit of h (bit 0 being the least significant), and the point R is doubled by EC addition, i.e. R=2R. In step 407, it is determined whether the ith bits of the variables h and e satisfy the conditions $h_i=1$ and $e_i=0$. If so, point addition is performed in step 409 to calculate R=R+G. Otherwise, step 411 is processed, determining whether the ith bits of the variables h and e satisfy the conditions $h_i=0$ and $e_i=1$. If so, EC subtraction is performed to calculate R=R−G in step 413. Thereafter, in step 415, it is checked whether the index i equals 1. If not, the index i is decreased in step 417, and the process returns to step 405. Otherwise, the loop is deemed finished, and the result R=kG is output in step 419.
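The flow of FIG. 4 can be sketched in software as follows. This is an illustrative sketch, not the embodiment's implementation: the function name scalar_mul and the callbacks add and neg are hypothetical, and the sketch assumes that r denotes the index of the most significant bit of h, so that the scan starts one bit below the top bit. To keep the block self-contained, the usage lines exercise the routine on a stand-in group (integers modulo 1009 under addition) rather than on real curve points.

```python
def scalar_mul(k, g, add, neg):
    """Compute k*g in an additive group using the h = 3k signed-digit scan.

    add(a, b) and neg(a) are the group operations (hypothetical callbacks);
    k is a positive integer.
    """
    h = (k << 1) + k                # h = 3k, formed with one shift and one add
    e = k
    r = h.bit_length() - 1          # index of the most significant bit of h
    result = g                      # R = G
    minus_g = neg(g)
    for i in range(r - 1, 0, -1):   # i = r-1 down to 1
        result = add(result, result)        # R = 2R
        h_i = (h >> i) & 1
        e_i = (e >> i) & 1
        if h_i == 1 and e_i == 0:
            result = add(result, g)         # R = R + G
        elif h_i == 0 and e_i == 1:
            result = add(result, minus_g)   # R = R - G
    return result


# Stand-in group for illustration only: integers modulo 1009 under addition.
MODULUS = 1009


def int_add(a, b):
    return (a + b) % MODULUS


def int_neg(a):
    return (-a) % MODULUS


print(scalar_mul(13, 7, int_add, int_neg))  # 91, i.e. 13 * 7 mod 1009
```

Passing the group operations as parameters means the same scan applies unchanged once add and neg are replaced by EC point addition and point negation.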
FIG. 5 is an exemplary flowchart of an EC addition operation. The EC addition/subtraction, as described in state 205 of FIG. 2, is further convertible to a sequence of operations in the prime field GF(p). In step 501, two addends are given as $P(x_1, y_1)$ and $Q(x_2, y_2)$, where the coordinates $x_1$, $y_1$, $x_2$, and $y_2$ are elements of the prime field GF(p). In step 503, it is determined whether P and Q are the identical point, because the slope is derived differently in the two cases. If the points differ, the process goes to step 505; if they are identical, it goes to step 507. In step 505, the slope $\lambda=(y_2-y_1)/(x_2-x_1)$ is calculated using subtraction, inversion and multiplication in the prime field GF(p). In step 507, the slope $\lambda=(3x_1^2+a)/(2y_1)$ is also calculated by operations in the prime field GF(p), where a is a parameter of the elliptic curve $y^2=x^3+ax+b$. Then, coordinates of the result R=P+Q are calculated based on the slope. In step 509, $x_3=\lambda^2-x_1-x_2$. In step 511, $y_3=\lambda(x_1-x_3)-y_1$. In step 513, the result $R(x_3,y_3)$ is output. Addition and subtraction are mutually substitutable operations, thus P−Q can be calculated by giving P and −Q in step 501 for this example.
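A minimal sketch of steps 501 to 513 is given below. The toy curve (a=2, b=2 over GF(17)), the base point (5, 1), and the names ec_add, ec_sub and inv_mod are assumptions chosen for illustration. Field inversion uses Python's built-in pow with exponent −1 (Python 3.8 or later) as a stand-in for the Montgomery inversion the embodiment would use, and the point at infinity is not handled, as in the flowchart.

```python
P_MOD = 17          # toy prime p (assumption for illustration)
A, B = 2, 2         # toy curve y^2 = x^3 + 2x + 2 over GF(17)


def inv_mod(v):
    """Prime field inversion; pow() stands in for Montgomery inversion."""
    return pow(v % P_MOD, -1, P_MOD)


def ec_add(p, q):
    """Affine addition R = P + Q following steps 501 to 513.

    The point at infinity and the case Q = -P are not handled.
    """
    x1, y1 = p
    x2, y2 = q
    if p == q:
        lam = (3 * x1 * x1 + A) * inv_mod(2 * y1) % P_MOD   # step 507
    else:
        lam = (y2 - y1) * inv_mod(x2 - x1) % P_MOD          # step 505
    x3 = (lam * lam - x1 - x2) % P_MOD                      # step 509
    y3 = (lam * (x1 - x3) - y1) % P_MOD                     # step 511
    return (x3, y3)                                         # step 513


def ec_sub(p, q):
    """P - Q computed as P + (-Q), where -Q = (x2, -y2 mod p)."""
    return ec_add(p, (q[0], (-q[1]) % P_MOD))


G = (5, 1)
print(ec_add(G, G))               # (6, 3)
print(ec_add(ec_add(G, G), G))    # (10, 6)
```

Subtracting G from the second result with ec_sub returns the first result, which illustrates that subtraction reuses the addition path with a negated y coordinate.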
FIG. 6 is a flowchart of a Montgomery multiplication algorithm. Multiplication/inversion operations in the Montgomery domain are further simplified to arithmetic addition and shift operations. In step 601, multiplicands x and y, and an n-bit prime modulus p are input. The value $z=(xy/2^n) \bmod p$ is the result to be derived. In step 603, variables are initialized, e.g. z=0, i=0. A loop is started in step 605 for i=0 to n−1, and z is updated by adding $x_i y$ to itself: $z=z+x_i y$, where $x_i$ is the ith bit of x. In step 607, z is updated by adding $z_0 p$: $z=z+z_0 p$, where $z_0$ is the rightmost bit of z. In step 609, z is shifted rightward by 1 bit, equivalently rendering z=z/2. In step 611, it is determined whether the loop is finished. If not, the index i is incremented and the process returns to step 605. Otherwise, z is reduced modulo p in steps 613 and 615 to ensure a result less than p. Thereafter, in step 617, the result z is output. In summary, only arithmetic addition and shift operations are used, thus, through conversion by the ASIC flow controller 120, the EC related programs can be executed by the register pool 210 and the ALU 220 under control of the ASIC flow controller 120. The Montgomery algorithm has many variations depending on different conditions, and the embodiment is specifically adapted for the prime field GF(p). The Montgomery inversion algorithm is also a sequence of only arithmetic addition operations, thus its detailed steps are not introduced in this embodiment.
While the invention has been described by way of example and in terms of preferred embodiment, it is to be understood that the invention is not limited thereto. To the contrary, it is intended to cover various modifications and similar arrangements (as would be apparent to those skilled in the art). Therefore, the scope of the appended claims should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements.
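For illustration, the Montgomery multiplication flow of FIG. 6 can be sketched bit-serially as below. The function name mont_mul, the small modulus 97 and the precomputed constant R2 are assumptions made for this sketch. The inputs are assumed to satisfy 0 ≤ x, y < p with p odd, so the running value stays below 2p and a single conditional subtraction suffices at the end.

```python
def mont_mul(x, y, p, n):
    """Return x * y * 2^(-n) mod p for an odd n-bit modulus p.

    Assumes 0 <= x, y < p. Only addition, subtraction and a 1-bit right
    shift are applied to the running value z.
    """
    z = 0                               # step 603
    for i in range(n):                  # steps 605 to 611
        if (x >> i) & 1:
            z = z + y                   # z = z + x_i * y
        if z & 1:
            z = z + p                   # z = z + z_0 * p, which makes z even
        z >>= 1                         # z = z / 2
    if z >= p:                          # steps 613 and 615
        z -= p
    return z                            # step 617


# Usage with a toy modulus (assumption): multiply 45 by 76 modulo 97.
p, n = 97, 7                            # 97 fits in 7 bits
R2 = pow(2, 2 * n, p)                   # precomputed constant 2^(2n) mod p
a, b = 45, 76
a_m = mont_mul(a, R2, p, n)             # transfer a into the Montgomery domain
b_m = mont_mul(b, R2, p, n)             # transfer b into the Montgomery domain
c_m = mont_mul(a_m, b_m, p, n)          # product, still in the Montgomery domain
c = mont_mul(c_m, 1, p, n)              # transfer back to the ordinary domain
assert c == (a * b) % p
```

The transfers into and out of the Montgomery domain are themselves Montgomery multiplications, by the precomputed constant and by 1 respectively, so they need no operation beyond those already used in the loop.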