The invention will be more fully understood by reference to the following drawings which are for illustrative purposes only:
Referring more specifically to the drawings, for illustrative purposes the present invention is embodied in the apparatus generally shown in
A recoding method is described which is well-suited for use in public-key cryptosystems, particularly elliptic curve cryptography. Integers are converted to signed-digit representations over {0, 1, -1}, with digit replacement performed in response to joint weight. A scalar multiplication, such as that forming part of elliptic curve cryptography, can be performed during the scanning. The method utilizes left-to-right scans (most significant bit to least significant bit) of integers $k_i$ when computing
and combining the scan with the multiplication to reduce memory requirements.
It will be appreciated that, in this computation, the integer values $k_i$ are scanned from left (most significant bit) to right (least significant bit).
Therefore, new representations are obtained during a left-to-right recoding process which can be combined with computing
thereby reducing memory requirements since the $k_i$ values need not be stored. It should be appreciated that since the $k_i$ integer values are large numbers, memory storage can be an important consideration in resource-constrained environments, such as ECC implementations within smart cards and the like.
The left-to-right order of the process makes it compatible with the direction by which gP+hQ is computed, thereby reducing memory requirements and even allowing the cryptography solution to be performed in hardware.
A preferred method of computing gP+hQ according to an embodiment of the present invention involves a computation which will be referred to herein as the “Shamir method”, as described in the paper by T. ElGamal, “A public-key cryptosystem and signature scheme based on discrete logarithms,” IEEE Transactions on Information Theory, Vol. 31, pp. 469-472, 1985.
Table 1 illustrates an example of how the computation is performed using the Shamir method. The process of recoding the integers according to the present invention is compatible with the Shamir method of performing scalar multiplication.
A number of mechanisms exist by which weight encoding may be performed in preparation for the scalar multiplication. One article describing these mechanisms is by M. Joye and S. Yen, entitled “Optimal Left-to-Right Binary Signed-Digit Recoding,” IEEE Transactions on Computers 49(7):740-748, 2000. Of these mechanisms, the joint sparse form (JSF) optimizes the joint weight for two integers, wherein the joint weight, that is, the number of bit positions at which at least one of the two digits is non-zero, is minimized. More recently the JSF method has been extended for use with more than two integers. However, one of the major drawbacks to the JSF method is that it utilizes considerable memory resources because it is performed in a right-to-left order. A JSF representation of two integers g and h has the following properties: JS1: of any three consecutive bit positions, at least one is zero in both g and h; JS2: adjacent non-zero digits within g (or within h) do not have opposite signs; JS3: if two consecutive digits in positions j+1 and j of g (or h) are non-zero, then the digit in position j+1 of h (or g) is +1 or -1 and the digit in position j of h (or g) is 0. The joint sparse form can thus be considered proper if at least one of its left-most bits is non-zero.
A couple of theorems are put forth with regard to the joint sparse form; the proofs are omitted for brevity. According to a first theorem, a pair of positive integers has at most one proper joint sparse form. According to a second theorem, the average joint weight among JSF representations of L bits is L/2.
Joint sparse form (JSF) recoding of g and h is shown in the first two rows of Table 1. In order to compute 53P+102Q, according to a prior example, we must start from the left-most column and proceed toward the right-most column. The first entry in the row labeled “Double” is 1 (this is the entry in the third row of the left-most non-zero column). The entry in the third row doubles the bottom-most entry in the column to the left. The entries in the other rows of a column are formed based on the bits of g and h in that column. If the bits of g and h in a column are both -1 then the entry on the row labeled “Double” of that column is added to “-(P+Q)” of that column.
The bottom-most entry in the right-most column contains the desired computation “gP+hQ”. Accordingly, the number of additions required is dependent on the joint weight of g and h and the number of doublings required is one less than the number of bits in g or h. It should be noted that obtaining -P from P in the elliptic curve group can be done at negligible cost.
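The column-by-column procedure described above can be sketched in Python as follows. This is a minimal illustration only: the function name `shamir_multiply` and its parameters are hypothetical, and ordinary integers under addition stand in for elliptic curve points so that the result can be checked by plain arithmetic. The digit strings are the signed-digit rows for 53 and 102 given elsewhere in this description.

```python
def shamir_multiply(g_digits, h_digits, P, Q, add, neg, zero):
    """Compute gP + hQ from two equal-length signed-digit rows (MSB first).

    add/neg/zero describe the group; all names here are illustrative
    assumptions, not part of the original description.
    """
    if len(g_digits) != len(h_digits):
        raise ValueError("digit rows must have the same length")
    # Precompute the element selected by every possible column (a, b).
    table = {}
    for a in (-1, 0, 1):
        for b in (-1, 0, 1):
            point = zero
            if a:
                point = add(point, P if a == 1 else neg(P))
            if b:
                point = add(point, Q if b == 1 else neg(Q))
            table[(a, b)] = point
    acc = zero
    additions = 0
    for a, b in zip(g_digits, h_digits):
        acc = add(acc, acc)            # the "Double" row (trivial while acc is zero)
        if (a, b) != (0, 0):           # one addition per non-zero column
            acc = add(acc, table[(a, b)])
            additions += 1
    return acc, additions


# Stand-in group: integers under addition (NOT an elliptic curve group).
G53 = [0, 1, 0, 0, -1, 0, -1, -1]
H102 = [0, 1, 1, 0, 1, 0, -1, 0]
P, Q = 7, 11
result, additions = shamir_multiply(
    G53, H102, P, Q, lambda x, y: x + y, lambda x: -x, 0
)
print(result == 53 * P + 102 * Q, additions)
```

In this sketch the number of additions equals the number of non-zero columns, which is the joint weight of the two rows.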
For computing gP+hQ the integers g and h are scanned from left (most significant bit (MSB)) to right (least significant bit (LSB)). Obtaining new representations for g and h by scanning them from left to right is therefore advantageous, because it can be readily combined with the computing of gP+hQ. The directional compatibility between the recoding and computation stages reduces the amount of memory required to perform the computation because the new representations of g and h do not have to be stored. Since g and h are large numbers (greater than 160 bits) this is a very important consideration in resource-constrained environments, such as smart cards, cell phones, personal digital assistants and the like.
Recoding two integers according to the JSF method results in an optimal average joint weight of 0.5n, where n is the number of bits needed to express the integers in binary form. However, the JSF method involves substantial memory overhead because it is a right-to-left method, wherein two integers g and h are fully scanned from right to left and their JSF stored in memory prior to computing gP+hQ. Shamir's method involves scanning the JSF representations of g and h from left to right.
The following describes an algorithm according to the joint sparse form (JSF) as applied to two integers g and h.
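A JSF computation of this kind can be sketched in Python as follows. The sketch follows a common textbook formulation of the JSF algorithm attributed to Solinas; the carry bookkeeping and the names `mods` and `jsf` are assumptions of this illustration rather than text taken from this description. Note that the digits are produced least significant first and must be stored and reversed before a left-to-right multiplication can start.

```python
def mods(x, m):
    """Residue of x modulo m that is smallest in absolute value."""
    r = x % m
    return r - m if r > m // 2 else r


def jsf(g, h):
    """Joint sparse form of non-negative g and h (two digit rows, MSB first)."""
    k = [g, h]
    d = [0, 0]              # carries, one per integer
    rows = [[], []]         # digits are produced right to left (LSB first)
    while k[0] + d[0] > 0 or k[1] + d[1] > 0:
        l = [d[0] + k[0], d[1] + k[1]]
        u = [0, 0]
        for i in (0, 1):
            if l[i] % 2 == 1:
                u[i] = mods(l[i], 4)
                # Flip the digit when l[i] = +/-3 (mod 8) and the other
                # value is 2 (mod 4); this keeps zero columns frequent.
                if l[i] % 8 in (3, 5) and l[1 - i] % 4 == 2:
                    u[i] = -u[i]
        for i in (0, 1):
            rows[i].append(u[i])
            if 2 * d[i] == 1 + u[i]:
                d[i] = 1 - d[i]
            k[i] //= 2
    # The whole result has to be held in memory before it can be reversed.
    return rows[0][::-1], rows[1][::-1]


g_row, h_row = jsf(53, 102)
print(g_row)
print(h_row)
```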
By contrast, the present invention provides a recoding method which is performed in a left-to-right order and which computes a signed-binary representation of several non-negative integers with a minimum joint weight.
The left-to-right order of the recoding and computation steps allows the recoding and computations to be combined into a single left-to-right set of operations, resulting in significant reductions in memory utilization. The invention provides optimization of joint weights comparable to the JSF method, while it is readily implemented and utilizes a left-to-right order more compatible with the computation of gP+hQ which reduces memory requirements. In addition, the combination of recoding and computation within the present invention allows the method to be fully or partially implemented in hardware.
The recoding of the present invention is based on the following observation which is commonly used in Booth multiplication of two's complement binary numbers.
Let d be an L-bit number $(d_{L-1}, d_{L-2}, \ldots, d_1, d_0)$. Then d can be written as

$$d = d_{L-1}2^{L-1} + d_{L-2}2^{L-2} + \cdots + d_1 2 + d_0.$$

Since $2^x = 2^{x+1} - 2^x$, we can express d as follows:

$$d = (d_{L-1} - 0)2^{L} + (d_{L-2} - d_{L-1})2^{L-1} + \cdots + (d_1 - d_2)2^{2} + (d_0 - d_1)2 + (0 - d_0).$$

d can now be expressed as an (L+1)-bit number,

$$X = ((d_{L-1} - 0), (d_{L-2} - d_{L-1}), \ldots, (d_0 - d_1), (0 - d_0)).$$

Each digit can be either 0, 1 or -1. Using the above observation we obtain our new recoding for g and h. Let g and h be two L-bit binary numbers expressed as follows:

$$g = (g_{L-1}\, g_{L-2} \ldots g_0), \qquad h = (h_{L-1}\, h_{L-2} \ldots h_0).$$

Algorithm 1 below outlines an embodiment of the present method for computing the new binary signed-digit representation.
Algorithm 1.
1. Convert the binary representation of g and h into:

$$X_1 = ((g_{L-1} - 0), (g_{L-2} - g_{L-1}), \ldots, (g_0 - g_1), (0 - g_0))$$

$$X_2 = ((h_{L-1} - 0), (h_{L-2} - h_{L-1}), \ldots, (h_0 - h_1), (0 - h_0))$$

2. Convert $X_1$ and $X_2$ into $Y_1$ and $Y_2$ by going from left to right and performing any of the following replacements only if it results in a decrease in the joint weight:
replace 1, -1 by 0, 1;
replace -1, 1 by 0, -1;
replace 1, 0, -1 by 0, 1, 1;
replace -1, 0, 1 by 0, -1, -1;
replace 0, -1, -1 by -1, 0, 1;
replace 0, 1, 1 by 1, 0, -1.
Algorithm 2.
Note that in Algorithm 2 we scan g and h only once from left to right, three bits at a time. Algorithm 2 can be combined with Shamir's method for computing gP+hQ, thereby eliminating the necessity for storing $Y_1$ and $Y_2$ as the outputs of Algorithm 1.
Therefore, considering the case in which g and h are 53 and 102, Step 1 in Algorithm 1 results in the following:

$$X_1 = (0, 1, 0, -1, 1, -1, 1, -1)$$

$$X_2 = (1, 0, -1, 0, 1, 0, -1, 0)$$

The joint weight is now 8. Executing Step 2 results in the following:

$$Y_1 = (0, 1, 0, 0, -1, 0, -1, -1)$$

$$Y_2 = (0, 1, 1, 0, 1, 0, -1, 0)$$

After execution the joint weight is now five (5). Converting g and h to the joint sparse form (JSF) also achieves a joint weight of five (5). The JSF of g = 53 and h = 102 is

$$53 = (0, 1, 0, 0, -1, 0, -1, -1), \qquad 102 = (0, 1, 1, 0, 1, 0, -1, 0).$$
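The numbers in this example can be checked with a short Python sketch. It implements only Step 1 of Algorithm 1 (the digit differences) plus two helpers; the function names are hypothetical, and the rows `Y1` and `Y2` are copied from the example rather than computed, since Step 2 is not implemented here.

```python
def step1_digits(value, nbits):
    """Step 1: digit at position j is d[j-1] - d[j]; returns nbits + 1 digits, MSB first."""
    bits = [(value >> i) & 1 for i in range(nbits - 1, -1, -1)]  # MSB first
    minuend = bits + [0]        # d_{L-1}, ..., d_0, 0
    subtrahend = [0] + bits     # 0, d_{L-1}, ..., d_0
    return [m - s for m, s in zip(minuend, subtrahend)]


def joint_weight(*rows):
    """Number of columns in which at least one row has a non-zero digit."""
    return sum(1 for column in zip(*rows) if any(column))


def value_of(digits):
    """Integer represented by a signed-digit row (MSB first)."""
    total = 0
    for digit in digits:
        total = 2 * total + digit
    return total


X1 = step1_digits(53, 7)     # 53 and 102 written as 7-bit numbers
X2 = step1_digits(102, 7)
Y1 = [0, 1, 0, 0, -1, 0, -1, -1]   # rows taken from the example text
Y2 = [0, 1, 1, 0, 1, 0, -1, 0]

print(X1, X2, joint_weight(X1, X2))
print(value_of(Y1), value_of(Y2), joint_weight(Y1, Y2))
```

Under these assumptions the sketch reproduces the Step 1 rows with joint weight 8, and confirms that the Step 2 rows still represent 53 and 102 with joint weight 5.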
As described later, Algorithm 2 results in an average joint weight of L/2. This is known to be an optimal value according to the JSF. The simplicity of the inventive method as embodied herein can be appreciated by comparing the results from the invention with those produced according to the JSF method. An algorithm for computing the JSF of g and h is detailed below. In the algorithm below the JSF of g and h is given by:

$$(u_{0,L-1}, u_{0,L-2}, \ldots, u_{0,1}, u_{0,0}) \quad \text{and} \quad (u_{1,L-1}, u_{1,L-2}, \ldots, u_{1,1}, u_{1,0})$$

The function “mods” indicates that the modular reduction is to return the smallest residue in terms of absolute value. The following theorem states the optimum nature of our method, although no proof is given due to space limitations. According to a first theorem of this method, the average joint weight among the signed binary representations of L bits from Algorithm 2 is L/2.
Comparing Methods.
To properly compare the method of the present invention with the JSF method, the algorithms associated with each of these two methods have been simulated, with the results provided in Table 2. Each row of the table was obtained by randomly generating one million L-bit binary numbers, g and h, and computing the average joint weight from the JSF algorithm and from Algorithm 2, according to an embodiment of the present invention. The first column in Table 2 lists the number of bits (L) in g and h. The second and third columns give the joint weight and execution time obtained for the JSF algorithm. The fourth and fifth columns give the joint weight and execution time obtained from Algorithm 2. The algorithms were executed on the same processing platform for these tests, by way of example a Pentium IV Mobile processor operating at 1.8 GHz. The last five rows in the table correspond to the sizes of field elements for the elliptic curves defined by the National Institute of Standards and Technology (NIST) in the publication “Digital Signature Standard”, FIPS publication 186-2, Feb. 2000.
From Table 2 it can be seen that the joint weights obtained from the JSF algorithm and Algorithm 2 approximate L/2. Therefore, we can conclude that the joint weight resulting from Algorithm 2 is optimal, as shown in the technical report by J.A. Solinas, “Low-weight Binary Representations for Pairs of Integers”, Technical Report CORR 2001-41, Center for Applied Cryptographic Research, University of Waterloo, Canada, 2001. Since the method according to an embodiment of the present invention scans g and h from left to right, using only 3 signed bits of memory for each integer, it can be considered superior to the JSF-based process due to a substantially decreased memory requirement. Furthermore, the present invention can also be embodied in hardware, or in a combination of hardware and software, as a sequential circuit with the bits of g and h as input (the most significant bits are input first), whereas the JSF algorithm is not readily amenable to implementation in a sequential logic circuit. Even when both methods are implemented in software, the present invention provides somewhat faster execution times than an implementation of the JSF-based approach (referring to columns 3 and 5 of Table 2).
The present invention provides a method of obtaining a signed binary representation of two integers that results in optimal joint weight. The algorithm on which the method is based has a lower complexity than the best known algorithm, namely the JSF algorithm. One of the major advantages of the method and algorithm of the present invention, is that it scans from left-to-right, using only three signed-bits of memory for each integer, thus making it compatible with Shamir's method for computing gP+hQ . The method according to the present invention can be readily extended to find the signed binary representations of more than two integers.
The process of determining a signed-binary representation of any number of non-negative integers is outlined in greater detail as Algorithm 3.
Algorithm 3.
Step 1: Convert the binary representation of each $k_i$ ($0 \le i \le N-1$) into an (L+1)-digit representation over {0, 1, -1} using the following rule:

$$k_i = ((k_{i,L-1} - 0), (k_{i,L-2} - k_{i,L-1}), \ldots, (k_{i,0} - k_{i,1}), (0 - k_{i,0}))$$

Step 2: Scan all the (L+1) columns in the array from the left-most column (L) to the right-most column (0). Note that there are N entries in each column.
Step 3a: If the column being scanned is non-zero, that is, at least one of its N entries is non-zero, then perform Step 4.
Step 3b: If the column being scanned is a zero column, then skip it and continue to scan the next column to its right.
Step 4: Mark the rows which have a non-zero bit in the column being scanned. The non-zero bit is called a “reducible bit”.
Step 5: Scan the marked rows from the reducible bit and go rightwards, looking at N bits at the most.
Step 6a: If the next rightward non-zero bit for at least one marked row is not within the next N bits, for example when the next N bits to the right of the reducible bit are all zero, then skip that column and continue to scan the next column to its right.
Step 6b: If the next rightward non-zero bit for all marked rows is within the next N bits, then among all marked rows let the maximum distance between the reducible bit and the next rightward non-zero bit be (C-1), that is, there are (C-1) zeros between the reducible bit and the next rightward non-zero bit. Continue to Step 7.
Step 7: Scan the columns from the column with the farthest non-zero bit found in Step 6b to the column with the reducible bits. Note that, unlike the preceding scans, this is a right-to-left sweep of (C+1) bits.
Step 8: Determine if there exists at least one non-zero entry in each of the (C+1) columns being scanned in Step 7. Note that, except for the left-most column within the $N \times (C+1)$ table, at least one of the non-zero values in every non-zero column must be the right-most non-zero value in its row.
Step 9a: If at least one column among the (C+1) columns is zero, then skip the column being scanned and continue to scan the next column to its right.
Step 9b: If all the (C+1) columns are non-zero and satisfy the condition of Step 8, then perform Step 10.
Step 10: Suppose the reducible bit in one marked row is x ($x \in \{1, -1\}$). First replace x by 0. Then replace the bits to its right by x. The second replacement is performed until the next non-zero bit $\bar{x}$. Note that $\bar{x}$ is also replaced by x, for example replacing $x\,0 \ldots 0\,\bar{x}$ with $0\,x \ldots x\,x$.
Step 11: Skip C columns and continue to scan until arriving at the right-most column. Note that the C columns are the columns that have already been replaced.
Algorithm 3 can be described in greater detail as recited in the following description and pseudo-code listing. Algorithm 3 generally consists of two steps:
Step 1: Converting the unsigned-binary input to the alternating greedy expansions.
Step 2: Making replacements on the alternating greedy expansions.
In Step 2, three conditions must be satisfied before a replacement takes place. These three conditions are:
C1: LeftmostIsNonzero $\neq \emptyset$.
C2: For each $k \in$ LeftmostIsNonzero there is an i with j > i > EndComputingAlternatingGreedy for which the digit of row k in column i is non-zero.
C3: $\{\, i : j > i \ge \text{MinNextNonzeroLocation} \,\} = \{\, \text{RightmostNonzeroLocation}[k] : 1 \le k \le d \,\}$
If all three conditions are satisfied then the leftmost column of the d+1 columns being scanned will be converted from nonzero to zero. The policy is to replace $x\,0 \ldots 0\,\bar{x}$ with $0\,x \ldots x\,x$ ($x \in \{-1, 1\}$) in each row k with $k \in$ LeftmostIsNonzero. The algorithm then skips the columns involved in the replacement and restarts the scanning.
If one or more of the three conditions are not satisfied, then Algorithm 3 moves rightward by one column and restarts the scanning.
The properties of Algorithm 3 are stated in the following lemmas and theorems.
Theorem 1. The output of the algorithm has minimal joint Hamming weight among any signed-binary expansions of the d given integers.
Theorem 2. Let J > d be the index of a column such that we have j = J at some stage of the algorithm. Then at least one of the columns J, ..., J-d will be a zero column in the output of the algorithm.
Theorem 3. Among 2d +1 consecutive columns of the algorithm output, there is at least one 0.
When implemented in hardware the algorithm leads to a significant reduction in hardware overhead. This is because the binary input of row k is never used again after the corresponding signed digits of that row have been calculated. Therefore, the input array and the output array can share the same memory space.
During the computation, the number of active columns (i.e. columns that are being scanned) is at most d+1. If the output of the algorithm is input to a real-time processor for further operation, then the amount of required memory could be reduced to as low as $d \times (d+1)$ signed-binary bits.
The pseudo-code for the above algorithm follows:
Algorithm 3 pseudo code example for computing a minimal joint expansion from the unsigned binary expansion from left to right.
For the case of a single integer, Algorithm 3 reduces to that of Algorithm 4.
Algorithm 4.
Let k be an L -bit unsigned binary number:
$$k = (k_{L-1}\, k_{L-2} \ldots k_1\, k_0)$$

Step 1: Convert the unsigned binary expansion of k into:

$$k = ((k_{L-1} - 0), (k_{L-2} - k_{L-1}), \ldots, (k_0 - k_1), (0 - k_0))$$

Step 2: Make replacements on the alternating greedy expansion of k by going from left to right and replacing $x\,\bar{x}$ by $0\,x$, where $x \in \{1, -1\}$. The left-to-right scanning is preferably executed bit-by-bit. However, if a replacement is applied, then the replaced bits are skipped and the scan continues rightwards.
Consider the following example.
Let k = 155. Its unsigned binary expansion is (0 1 0 0 1 1 0 1 1), with a Hamming weight of five (5). Step 1 of this algorithm results in the signed binary expansion $(0\ 1\ \bar{1}\ 0\ 1\ 0\ \bar{1}\ 1\ 0\ \bar{1})$, where $\bar{1}$ denotes -1. Step 2 outputs $(0\ 0\ 1\ 0\ 1\ 0\ 0\ \bar{1}\ 0\ \bar{1})$, wherein the Hamming weight is thus reduced to four (4).
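The two steps of Algorithm 4 can be sketched in Python as follows. The function name `recode_single` and its parameters are hypothetical; the sketch assumes the input is given as a non-negative integer together with the number of bits L of its unsigned binary expansion.

```python
def recode_single(k, nbits):
    """Return (step1_digits, step2_digits) for an nbits-bit integer k, MSB first."""
    bits = [(k >> i) & 1 for i in range(nbits - 1, -1, -1)]
    # Step 1: digit at position j is k[j-1] - k[j] (alternating greedy expansion).
    step1 = [m - s for m, s in zip(bits + [0], [0] + bits)]
    # Step 2: left-to-right scan, replacing (x, -x) by (0, x) and
    # skipping the two digits that were just replaced.
    out = list(step1)
    i = 0
    while i < len(out) - 1:
        x = out[i]
        if x != 0 and out[i + 1] == -x:
            out[i], out[i + 1] = 0, x
            i += 2
        else:
            i += 1
    return step1, out


step1, step2 = recode_single(155, 9)
print(step1)
print(step2, sum(1 for digit in step2 if digit))
```

For k = 155 with L = 9 this reproduces the expansions of the example, with the Hamming weight of the output equal to four.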
For the case of two integers a and b, Algorithm 3 reduces to Algorithm 5.
Algorithm 5.
Step 1: Convert the unsigned binary expansions of a and b into $P_1$ and $P_2$:

$$P_1 = ((a_{L-1} - 0), (a_{L-2} - a_{L-1}), \ldots, (a_0 - a_1), (0 - a_0))$$

$$P_2 = ((b_{L-1} - 0), (b_{L-2} - b_{L-1}), \ldots, (b_0 - b_1), (0 - b_0))$$

Step 2: Convert $P_1$ and $P_2$ into $Q_1$ and $Q_2$ by going from left to right and executing any of the following replacements which are applicable. In the replacements shown below the top row of digits belongs to $P_1$ and the bottom row of digits belongs to $P_2$, while $x, y \in \{1, -1\}$. If a replacement is applied, then the columns which have been replaced are discarded and the next two or three columns are considered for replacement. If no replacement is possible, then discard one column and consider the next two or three columns for replacement.
Note that if x = 1 then $\bar{x}$ = -1.
It should also be noted that if is a replacement then so is
This is because it is inconsequential whether we write $P_1$ or $P_2$ on the top.
Consider the following example having two integers a and b.
Suppose that a =6699 and b=4846. The binary expansion of a and b is given by:
The joint weight of the above is ten (10). Applying Step 1 of Algorithm 5 (two integer case) we arrive at:
The result of step 1 has a joint weight of nine (9).
The left-most three columns have been replaced using replacement A3, the two columns after that have used replacement A2, and so on.
Finally we discuss a more complicated case. The number of integers N is three (3). Considering the case where $k_1 = 23$, $k_2 = 15$ and $k_3 = 7$, the binary expansion is given by:
Step 1 of Algorithm 3 results in:
Step 2 suggests starting with the left-most column, $[1\ 0\ 0]^T$. As this is a non-zero column we move to Step 4. The first row is marked and the “1” in the first row is a reducible bit. Step 5 asks us to look rightwards over a distance of not more than N, which in this case is three (3), for the next non-zero bit. We find that the bit immediately to the right of the 1 is $\bar{1}$ (-1). Here we have C = 1 in Step 6b. When we do the scanning in Step 7, we find that the condition in Step 8 is satisfied. Now we perform Step 10 and replace $1\,\bar{1}$ in the first row by $0\,1$.
According to Step 11, we discard the left-most two columns. Now we look at the third column from the left, which is a non-zero column $[1\ 0\ 1]^T$. The first and third rows are marked. A rightward scan is performed to find the next rightward non-zero bits of the first and third rows (the first and third -1s in the right-most column of the array), herein giving C = 3. Step 8 asks us to determine if there exists at least one non-zero entry in each of the (C+1) columns. Unfortunately this is not true, since there are two zero columns between the columns $[1\ 0\ 1]^T$ and $[-1\ -1\ -1]^T$. So we do nothing to the column being scanned, which in this case is the third column from the left. Then we move to the fourth column, which in this case is a zero column. According to Step 3b we discard this column and move rightwards again. The fifth column of this example has the same situation. The right-most column does not need to be scanned because it is impossible for it to be reduced. We thus arrive at the final output, which is given by:
The joint weight is three (3), which is the minimum possible joint weight among all signed-binary combinations of the integers 23, 15 and 7.
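The three-integer example can be checked with a short Python sketch. The three digit rows below are an assumption of this illustration: they were obtained by following the steps of the walk-through above by hand (Step 1 on the 5-bit inputs, then the single replacement in the first row). The exhaustive search is restricted to 6-digit signed-binary rows, which is a further assumption of the sketch.

```python
from itertools import product


def value_of(digits):
    """Integer represented by a signed-digit row (MSB first)."""
    total = 0
    for digit in digits:
        total = 2 * total + digit
    return total


def joint_weight(rows):
    """Number of columns containing at least one non-zero digit."""
    return sum(1 for column in zip(*rows) if any(column))


# Assumed output rows for 23, 15 and 7 (reconstructed from the walk-through).
final_rows = [
    [0, 1, 1, 0, 0, -1],   # 16 + 8 - 1 = 23
    [0, 1, 0, 0, 0, -1],   # 16 - 1     = 15
    [0, 0, 1, 0, 0, -1],   # 8 - 1      = 7
]
values = [value_of(row) for row in final_rows]

# Exhaustive check over all 6-digit signed-binary rows of each target value.
targets = (23, 15, 7)
candidates = {t: [] for t in targets}
for digits in product((-1, 0, 1), repeat=6):
    v = value_of(digits)
    if v in candidates:
        candidates[v].append(digits)
best = min(
    joint_weight(rows) for rows in product(*(candidates[t] for t in targets))
)

print(values, joint_weight(final_rows), best)
```

Within these assumptions the search finds no combination of rows with a joint weight below three, in agreement with the statement above.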
In order to understand an embodiment of the hardware the relationships between the input, intermediate, and signed-binary representation are discussed and an algorithm presented.
Table 3 presents the relationship among the binary input, the intermediate signed-binary representation (ISBR) and the optimal signed binary output sequence. Two bits of output can be determined once three bits of input are received. For three consecutive bits of input $(b_i, b_{i-1}, b_{i-2})$, the notation $\mathrm{ISBR}(b_i, b_{i-1}, b_{i-2})$ denotes the corresponding bits of the alternating greedy expansion $(a_i, a_{i-1})$, and $\mathrm{OUT}(b_i, b_{i-1}, b_{i-2})$ denotes the corresponding output bits of the optimal signed binary representation $(s_i, s_{i-1})$. The algorithm below presents how this operates in hardware.
A means for generating a signed-binary intermediate (ISBR) representation can be implemented as a logic circuit or gate array, or similar, configured for converting a received unsigned binary bit pattern into a signed-binary bit pattern. This generating means receives bits $(b_i, b_{i-1}, b_{i-2})$ as input and is referred to as ISBR generator 138. A means for generating a signed-binary output (OUT) representation can be similarly implemented as a logic circuit or gate array, and so forth, configured for converting a received unsigned binary bit pattern into a signed-binary bit pattern. This generating means also receives bits $(b_i, b_{i-1}, b_{i-2})$ as input and is referred to as OUT generator 140.
The output of ISBR generator 138 and OUT generator 140 follows according to the lookup table (Table 3). Let the output of ISBR generator 138 be $(a_i, a_{i-1})$ and that of OUT generator 140 be $(s_i, s_{i-1})$. Then the final output could be either $(s_i, s_{i-1})$ or $(a_i, a_{i-1})$, depending on whether $a_i$ and $s_i$ are equal or not. A means for selecting either ISBR bits or OUT bits is shown comprising a multiplexer, such as MUX 142, which is used as a “switch” to control which bits are output. The control signal should be “0” if $a_i = s_{i-1}$ and “1” otherwise.
Since $a_i$ and $s_{i-1}$ do not have the same index, they cannot be compared directly. A means for comparing ISBR bits to previous OUT bits is needed for controlling the signal selection means. The OUT signal can be delayed, such as by using a latch, to allow a proper comparison between ISBR and OUT signals, wherein latch 144 is utilized to delay $s_{i-1}$ for one clock cycle. Thus $s_{i-1}$ becomes $s_i$. The comparison between $a_i$ and $s_i$ is performed by a comparator 146, which generates a “0” to MUX 142 if $a_i = s_{i-1}$, and “1” otherwise.
Therefore, it will be appreciated that the scope of the present invention fully encompasses other embodiments which may become obvious to those skilled in the art, and that the scope of the present invention is accordingly to be limited by nothing other than the appended claims, in which reference to an element in the singular is not intended to mean “one and only one” unless explicitly so stated, but rather “one or more.” All structural and functional equivalents to the elements of the above-described preferred embodiment that are known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the present claims. Moreover, it is not necessary for a device or method to address each and every problem sought to be solved by the present invention, for it to be encompassed by the present claims. Furthermore, no element, component, or method step in the present disclosure is intended to be dedicated to the public regardless of whether the element, component, or method step is explicitly recited in the claims. No claim element herein is to be construed under the provisions of 35 U.S.C. 112, sixth paragraph, unless the element is expressly recited using the phrase “means for.”
This application claims priority from, and is a 35 U.S.C. § 111 (a) continuation of, co-pending PCT international application serial number PCT/US2005/011235, filed on Apr. 4, 2005, incorporated herein by reference in its entirety, which designates the U.S. and which claims priority from U.S. provisional application serial number 60/572,073, filed on May 17, 2004, incorporated herein by reference in its entirety, and from U.S. provisional application serial number 60/570,255, filed on May 11, 2004, incorporated herein by reference in its entirety. Priority is claimed to each of the foregoing applications.
| Number | Date | Country |
|---|---|---|
| 60572073 | May 2004 | US |
| 60570255 | May 2004 | US |

| | Number | Date | Country |
|---|---|---|---|
| Parent | PCT/US05/11235 | Apr 2005 | US |
| Child | 11558762 | | US |