The present invention relates generally to the field of data encryption and decryption, and more specifically to a method and system for hiding a secret key. The security of information poses challenges for businesses and other organizations that transmit, store and distribute information. Cryptographic systems are intended to transform data into a form that is only readable by authorized users.
There has been a dramatic increase in the need for cryptographic systems that can protect digital data from potential hackers, eavesdroppers, forgers, and other adversaries. This is largely because an ever-increasing portion of business and communications involves software and digital electronics. In addition, the sophistication of potential adversaries and the equipment at their disposal continue to improve, which makes the problem of protecting digital data even more pressing.
Cryptographic systems are adapted to utilize and transfer information without it being compromised by adversaries. Such systems utilize keys for encoding and decoding information. Information to be protected (M) is encoded using an encoding key to produce an encoded version of the message, or ciphertext (C). The ciphertext C can then be decoded using a decoding key to recover the original message M.
In some instances the software and information to be protected are expected to execute in potentially hostile environments, that is, execution environments in which the adversary who is running the software has complete control over the machine on which it executes. In these situations, the keys used to decode the protected information must be protected even from the user.
In situations where the software and information to be protected are expected to execute in a potentially hostile environment, it is desirable that the private key used in the decryption of information never “shows its face” even as the software uses it to decrypt. This keeps the private key hidden from both static analysis and dynamic analysis. It is desirable that the private key's bits never actually appear, even in non-contiguous memory locations, at any point during the execution of the software. Instead, a number of apparently random keys are used, whose combined net effect is as if the private key had been used. This can be achieved through algebraic decompositions of the private key.
It is also desirable that the number of above-mentioned algebraic decompositions be very large. Making the number of algebraic decompositions of the private key very large, preferably more than the number of executions of the software during its lifetime, makes it very difficult for the adversary to determine the private key. At the same time, it is desirable that the algebraic decompositions take little storage space and execution time.
It is preferable that the different decompositions are used randomly. In each execution of the software, one of the decompositions is randomly selected and used. Using a different decomposition each time makes it much more difficult for an adversary to break the encryption. An adversary may carry out a differential analysis of various execution traces, thinking that the decryption routine executes in all of these traces and can therefore be pinpointed. By using a different randomly selected decomposition each time, the adversary is frustrated. This “raises the bar” and forces the adversary to do detailed semantic analyses of the code and its behavior.
It is also preferable that the decompositions be “evanescent.” Each of the random numbers that make up the decomposition is preferably generated at run-time and is evanescent in the sense that it briefly appears and then disappears.
It is also desirable to use obfuscation and tamperproofing techniques to disguise the elements of the decompositions. The different random numbers that make up a decomposition of the private key should be used in seemingly different ways, even if they are in fact functionally equivalent. This makes it difficult for the adversary to logically link the random numbers together. The security of the scheme is enhanced when an execution trace shows a huge number (call it N) of values, and even if the adversary knows that k of those values are a decomposition of the private key, he does not know which k they are, especially if there are too many combinations of k values out of N for him to test all of them.
It is also desirable that the above-mentioned algebraic decompositions be dynamically evolving. In other words, it is desirable that the decompositions mutate after every use without changing the secret key. This makes the adversary's task very difficult: even if the adversary determines part of a decomposition, that decomposition can change, making the adversary's prior work meaningless.
Additional features and advantages of the invention will become apparent to those skilled in the art upon consideration of the following detailed description of illustrated embodiments.
Aspects of the present invention are more particularly described below with reference to the following figures, which illustrate exemplary embodiments of the present invention.
For the purposes of promoting an understanding of the principles of the invention, reference will now be made to the embodiments illustrated in the drawings and specific language will be used to describe the same. It will nevertheless be understood that no limitation of the scope of the invention is thereby intended, such alterations and further modifications in the illustrated device, and such further applications of the principles of the invention as illustrated therein being contemplated as would normally occur to one skilled in the art to which the invention relates.
The following description provides examples of the techniques for RSA and elliptic curve cryptosystems (ECC), but it will be clear to those skilled in the art that this technique can easily be modified for El Gamal and other public-key cryptosystems.
In the following discussion, the standard RSA notation is used. Two different large prime numbers p and q are chosen, and the value n is determined as n=pq. The public key e and the private key d have the relationship:
ed mod(p−1)(q−1)=1.
Encryption of a message M is done using the public key e by computing the ciphertext
C=M^e mod n
where “^” denotes exponentiation. Decryption recovers the message M using the private key d by computing
M=C^d mod n.
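By way of illustration only, the RSA relationships above can be checked with a small worked example. The values p=61, q=53, e=17, and M=65 are assumptions chosen for readability and are far too small for real use; Python's built-in three-argument `pow` performs the modular exponentiation and, with exponent −1, the modular inverse.

```python
# Toy RSA parameters (assumed for illustration; not from the description).
p, q = 61, 53
n = p * q                    # modulus n = pq
phi = (p - 1) * (q - 1)      # (p-1)(q-1)
e = 17                       # public key, relatively prime to phi
d = pow(e, -1, phi)          # private key, so that e*d mod (p-1)(q-1) == 1

M = 65                       # message, 0 <= M < n
C = pow(M, e, n)             # encryption: C = M^e mod n
recovered = pow(C, d, n)     # decryption: M = C^d mod n
```

The private key d appears explicitly in this baseline, which is exactly what the key-hiding methods described below are intended to avoid.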
One method for hiding d consists of replacing the computation of M=C^d mod n by the sequence of computations
C0=C
C1=C0^d1 mod n
C2=C1^d2 mod n
. . .
Ck=C_{k−1}^dk mod n
Combining the above equations results in
Ck=C^(d1d2d3 . . . dk)mod n
and therefore, if we choose the dj's such that
d1d2d3 . . . dk mod(p−1)(q−1)=d
we will have Ck=M.
The main benefit of doing the above is that it forces the attacker to piece together the roles of the di's, which can be made harder by obfuscating the Ci computations so they do not look similar to each other.
The di's are selected to be relatively prime to (p−1)(q−1), so that their product is invertible modulo (p−1)(q−1), and all but dk are selected randomly. The value of dk is computed using the Extended Euclid algorithm for computing multiplicative inverses modulo (p−1)(q−1):
dk=(d1d2d3 . . . d_{k−1})^(−1) d mod (p−1)(q−1)
where (d1d2d3 . . . d_{k−1})^(−1) is the multiplicative inverse of (d1d2d3 . . . d_{k−1}) modulo (p−1)(q−1). The Extended Euclid algorithm for computing multiplicative inverses can be found in many cryptography textbooks and handbooks. The algorithm for modular exponentiation (computing a^b mod c for integers a, b, c) is also found in many textbooks on cryptography and computational number theory, often under the name “repeated squaring.”
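The decomposition just described can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names `decompose_private_key` and `decrypt_with_factors`, the toy parameters, and the seeded random generator are all hypothetical, and the modular inverse is obtained from Python's built-in `pow` rather than a hand-written Extended Euclid routine.

```python
import math
import random

def decompose_private_key(d, phi, k, rng):
    """Return [d1, ..., dk] such that d1*d2*...*dk mod phi == d."""
    factors = []
    product = 1
    while len(factors) < k - 1:
        candidate = rng.randrange(2, phi)
        if math.gcd(candidate, phi) == 1:      # must be invertible mod phi
            factors.append(candidate)
            product = (product * candidate) % phi
    # Last factor: dk = (d1*...*d_{k-1})^(-1) * d mod phi.
    factors.append((pow(product, -1, phi) * d) % phi)
    return factors

def decrypt_with_factors(C, factors, n):
    """Apply C_j = C_{j-1}^dj mod n for each factor in turn."""
    for dj in factors:
        C = pow(C, dj, n)
    return C

# Toy parameters (assumed): p=61, q=53, e=17.
p, q, e = 61, 53, 17
n, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)          # used here only to build the decomposition

factors = decompose_private_key(d, phi, k=4, rng=random.Random(2005))
M = 65
C = pow(M, e, n)
recovered = decrypt_with_factors(C, factors, n)
```

In this sketch the decryption loop only ever raises to the individual dj's; d itself is needed only when the factors are prepared.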
That the di's are random is an advantage, because it is then easy to generate them at run-time so they are not revealed to a static analysis of the software. There are many ways of doing this; for example, they could be evanescent values that appear briefly during the execution of some complicated routine (whose main purpose is in fact to generate them, although it appears to be doing something else).
In the above technique, the same set of di's is used in every execution of the software, and this can be revealed through differential analysis of different execution traces. It would be desirable if every execution involved different sets of (possibly overlapping) di's. This is achieved by the method described next.
The following method uses different sets of (possibly overlapping) di's for different executions of the program. In this method, a layered directed acyclic graph is generated at software-creation-time. A layered directed graph is a graph such as that shown in
The graph will be used to help illustrate the process of hiding the key, in the following way:
For each such source-to-sink path, the computations that are done by the software correspond to the integers on the edges of that path. The ciphertext entering the tail of a path edge (v,w) from a predecessor vertex is raised, modulo n, to the power d(v,w), thereby generating the ciphertext output to a successor vertex through the head of the path edge. What “enters” the first path edge (call it edge (v,w)) is C0=M^e mod n, and what leaves it is M^(e r(v)^(−1) r(w)) mod n. What “leaves” the last edge is
Ck=M^(e*(product of integers on path's edges))mod n
which equals M^(ed)mod n, which equals the plaintext message M.
Using the acyclic layered graph of
C0=M^e mod n
and what leaves the head of the first path edge (A,B) is
C1=C0^d(A,B) mod n
=M^(e r(A)^(−1) r(B)) mod n
However, r(v) for all the vertices of the first layer is r0, thus r(A)=r0, and
C1=M^(e r0^(−1) r(B)) mod n
The second path edge has its tail connected to predecessor vertex B and its head connected to successor vertex C. C1 enters the tail of the second path edge (B,C) and the following computation is performed:
C2=C1^d(B,C) mod n
=M^(e r0^(−1) r(B) r(B)^(−1) r(C)) mod n
The third and last path edge has its tail connected to predecessor vertex C and its head connected to successor vertex D. C2 enters the third path edge (C,D) and the following computation is performed:
C3=C2^d(C,D) mod n
=M^(e r0^(−1) r(B) r(B)^(−1) r(C) r(C)^(−1) r(D)) mod n
However, r(v) for all the vertices of the last layer is r0 d, and
C3=M^(e r0^(−1) r(B) r(B)^(−1) r(C) r(C)^(−1) r0 d) mod n
which simplifies to:
C3=M^(ed)mod n
which equals the plaintext message M, which is what is output at the head of the last path edge (C,D).
Note that only the d(v, w) values appear during the execution of the program, which are, in this example:
d(A,B)=r0^(−1) r(B)
d(B,C)=r(B)^(−1) r(C)
d(C,D)=r(C)^(−1) r0 d
The individual r(v) values do not appear separately, and most importantly, the private key, d, is never exposed during the decryption.
By having many layers, and many edges from each layer to the next, there are exponentially many source-to-sink paths, all of which achieve the same result of implicitly decrypting with the private key without that key ever explicitly appearing during execution. For example, if there are 21 layers and every vertex in the first 20 layers has out-degree 10, then the number of different source-to-sink paths is 10^20. In that case, there are almost certainly more different source-to-sink paths than the total number of times the software will execute in its lifetime. Randomization can be used to select which source-to-sink path is used in a particular execution of the software. An adversary who carries out a differential analysis of various execution traces, thinking that the decryption routine executes in all of these traces and can therefore be pinpointed, is thereby foiled because different paths are used in different executions. This “raises the bar” and forces the adversary to do detailed semantic analyses of the code and its behavior.
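The layered-graph method for RSA can be sketched as below. This is an illustrative sketch only: it assumes every vertex of one layer is connected to every vertex of the next, that each layer has the same number of vertices (`width`), and that toy parameters are used; the names `build_layered_graph` and `decrypt_along_random_path` are hypothetical. The vertex values r(v) exist only while the edge integers d(v,w)=r(v)^(−1) r(w) are being prepared and are then discarded.

```python
import math
import random

def random_unit(phi, rng):
    """Pick a random integer that is invertible modulo phi."""
    while True:
        r = rng.randrange(2, phi)
        if math.gcd(r, phi) == 1:
            return r

def build_layered_graph(d, phi, layers, width, rng):
    """Return a list of dicts; entry j maps (v, w) to d(v,w) = r(v)^(-1) r(w) mod phi
    for edges from layer j to layer j+1."""
    r0 = random_unit(phi, rng)
    labels = [[r0] * width]                               # first layer: r0
    for _ in range(layers - 2):
        labels.append([random_unit(phi, rng) for _ in range(width)])
    labels.append([(r0 * d) % phi] * width)               # last layer: r0*d
    edge_layers = []
    for j in range(layers - 1):
        edges = {}
        for v in range(width):
            inv = pow(labels[j][v], -1, phi)
            for w in range(width):
                edges[(v, w)] = (inv * labels[j + 1][w]) % phi
        edge_layers.append(edges)
    return edge_layers                                    # r(v) labels are dropped

def decrypt_along_random_path(C, edge_layers, width, n, rng):
    """Follow a randomly chosen source-to-sink path, exponentiating per edge."""
    v = rng.randrange(width)
    for edges in edge_layers:
        w = rng.randrange(width)
        C = pow(C, edges[(v, w)], n)
        v = w
    return C

p, q, e = 61, 53, 17                                      # toy values (assumed)
n, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)
edge_layers = build_layered_graph(d, phi, layers=5, width=3, rng=random.Random(1))
M = 65
C = pow(M, e, n)
recovered = decrypt_along_random_path(C, edge_layers, 3, n, random.Random(42))
```

With 5 layers of width 3 this toy graph has 3^5 = 243 source-to-sink paths; the product of the edge integers telescopes to d modulo (p−1)(q−1) along each of them.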
Recall that, in an elliptic curve cryptosystem (ECC), the algebra involves the group of points on an elliptic curve. There is a public key P, a private key x, and a public key Q=xP. Encryption of a message M is done by selecting a random r and computing the pair:
rP and M+rQ
which together are the encryption of M. Decryption recovers M by computing x(rP) and subtracting the result from M+rQ:
M+rQ−x(rP)=M+r(xP)−x(rP)=M
thereby recovering M.
A simple method for hiding x consists of replacing the decryption computation, i.e., M+rQ−x(rP), by the sequence of computations
C0=(M+rQ)
C1=C0−x1(rP)
C2=C1−x2(rP)
. . .
Ck=C_{k−1}−xk(rP)
Combining the above equations gives:
Ck=(M+rQ)−(x1+x2+x3+ . . . +xk)(rP)
and therefore, if we choose the xj's such that
x1+x2+x3+ . . . +xk=x
then we will have Ck=M.
An algorithm for carrying out the addition of points on the elliptic curve can be found in many cryptography textbooks and handbooks. The algorithm for modular exponentiation (computing a^b mod c for integers a, b, c) is also found in many textbooks on cryptography and computational number theory, often under the name “repeated squaring.” The ECC algorithm for computing rP, which is equivalent to P added to itself r times, is very similar to the repeated-squaring algorithm except that squaring is replaced by doubling. In the repeated-squaring algorithm the values of a^2 mod c, a^4 mod c, a^8 mod c, a^16 mod c, etc. are calculated, whereas in the algorithm for computing rP, the values of 2P, 4P, 8P, 16P, etc. are calculated.
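The parallel between repeated squaring and repeated doubling can be made concrete with the following sketch. The function names are hypothetical, and the additive group used in the usage lines (integers modulo 101 under addition) is only a stand-in assumed for brevity; an elliptic-curve implementation would pass its own point-addition routine and point at infinity instead.

```python
def square_and_multiply(a, b, c):
    """Compute a^b mod c by repeated squaring (b >= 0)."""
    result = 1 % c
    power = a % c                 # holds a^1, a^2, a^4, a^8, ... mod c
    while b > 0:
        if b & 1:                 # this bit of b is set: fold the power in
            result = (result * power) % c
        power = (power * power) % c
        b >>= 1
    return result

def double_and_add(r, point, add, identity):
    """Compute r*point in an additive group by repeated doubling (r >= 0)."""
    result = identity
    multiple = point              # holds P, 2P, 4P, 8P, ...
    while r > 0:
        if r & 1:                 # this bit of r is set: add the multiple in
            result = add(result, multiple)
        multiple = add(multiple, multiple)
        r >>= 1
    return result

# Usage: compare against Python's built-in pow, and use a stand-in group.
modexp = square_and_multiply(4, 13, 497)
scaled = double_and_add(13, 5, lambda u, v: (u + v) % 101, 0)
```

Both loops scan the bits of the exponent or scalar once, so the work grows with the bit length rather than with the value itself.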
The main benefit of separating the decryption into a sequence of computations is that it forces the attacker to piece together the roles of the xi's, which can be made harder by obfuscating the Ci computations so they do not look similar to each other. The xi's can be positive or negative, and all but xk are selected randomly, whereas xk is computed as:
xk=x−(x1+x2+x3+ . . . +x_{k−1}).
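The additive decomposition for ECC can be sketched on a toy curve. Everything specific here is an assumption made for illustration: the curve y^2=x^3+2x+2 over the field of 17 elements with base point (5,1) of order 19 is a common textbook example and is far too small for real use, the message is taken to be already encoded as a curve point, and the helper names are hypothetical.

```python
import random

P_MOD, A = 17, 2        # toy curve y^2 = x^3 + 2x + 2 over F_17 (assumed)
BASE = (5, 1)           # base point P; the curve group has prime order 19
ORDER = 19
INF = None              # point at infinity (group identity)

def add(p1, p2):
    """Add two curve points."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF                                    # p2 is the negative of p1
    if p1 == p2:
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    return (x3, (s * (x1 - x3) - y1) % P_MOD)

def neg(pt):
    return INF if pt is INF else (pt[0], (-pt[1]) % P_MOD)

def mul(k, pt):
    """Compute k*pt by repeated doubling; k may be negative."""
    k %= ORDER
    result = INF
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

def split_private_key(x, k, rng):
    """Return [x1, ..., xk] with x1 + ... + xk == x; all but xk are random."""
    parts = [rng.randrange(-50, 51) for _ in range(k - 1)]
    parts.append(x - sum(parts))
    return parts

def decrypt_with_parts(rP, masked, parts):
    """Apply C_j = C_{j-1} - xj(rP) for each part in turn."""
    C = masked
    for xj in parts:
        C = add(C, neg(mul(xj, rP)))
    return C

x = 7                                   # private key (toy value)
Q = mul(x, BASE)                        # public key Q = xP
M = mul(11, BASE)                       # message, encoded as a curve point
r = 5
rP, masked = mul(r, BASE), add(M, mul(r, Q))   # ciphertext pair (rP, M + rQ)
parts = split_private_key(x, 4, random.Random(1))
recovered = decrypt_with_parts(rP, masked, parts)
```

The decryption loop subtracts only the individual xj(rP) terms; the full scalar x is needed only when the parts are prepared.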
That the xi's are random is an advantage, because it is then easy to generate them at run-time so they are not revealed to a static analysis of the software. There are many ways of doing this; for example, they could be evanescent values that appear briefly during the execution of some complicated routine whose main purpose is in fact to generate them, although it appears to be doing something else.
One disadvantage of the above is that the same xi's are used in every execution of the software, and this can be revealed through differential analysis of different execution traces. It would be desirable if every execution involved different sets of (possibly overlapping) xi's. This is achieved by the method described below.
The following method uses different sets of (possibly overlapping) xi's for different executions of the program. In this method, a layered directed acyclic graph is generated at protect-time, such as the layered directed graph shown in
The graph is used to guide the process of hiding the key, in the following way:
The vertices of the first layer all get the same r(v) associated with them, call it r0. An r(v) equal to x+r0, where x is the private key, is associated with every vertex v of the last layer (layer k). Of course the r(v)'s are not explicitly stored in the software, but are introduced here for the sake of describing the key-hiding method.
For each such source-to-sink path, the computations that are done by the software correspond to the integers on the edges of that path: The ciphertext entering from a predecessor vertex through the tail of an edge (v,w) is modified by subtracting from it (r(w)−r(v)) (rP), thereby generating the ciphertext to be output to a successor vertex through the head of the edge. What “enters” the first path edge (call it edge (v,w)) is
C0=(M+rQ),
and what leaves it is
C1=(M+rQ)−(r(w)−r(v))(rP)
What leaves the last edge is
Ck=(M+rQ)−(sum of integers on path's edges)(rP)
which equals M+rQ−x(rP), which equals the plaintext message M.
Using the acyclic layered graph of
C0=M+rQ
and what leaves the first path edge (A,B) is
C1=(M+rQ)−(r(B)−r(A))(rP)
However, r(v) for all the vertices of the first layer is r0, thus r(A)=r0, and
C1=(M+rQ)−(r(B)−r0)(rP)
The second path edge has its tail connected to predecessor vertex B and its head connected to successor vertex C. C1 enters the second path edge (B,C) and the following computation is performed:
C2=C1−(r(C)−r(B))(rP)
=(M+rQ)−(r(C)−r0)(rP)
The third and last path edge has its tail connected to predecessor vertex C and its head connected to successor vertex D. C2 enters the third path edge (C,D) and the following computation is performed:
C3=C2−(r(D)−r(C))(rP)
=(M+rQ)−(r(D)−r0)(rP)
However, since r(v) for the last layer is x+r0,
C3=(M+rQ)−(x+r0−r0)(rP)
=M+rQ−x(rP)
which equals the plaintext message M, which is what leaves the last path edge (C,D).
Note in this example as well that only the combined edge values appear during the execution of the program, which are, in this example:
r(B)−r0,
r(C)−r(B), and
x+r0−r(C)
The individual r(v) values do not appear separately, and most importantly, the private key, x, is never exposed during the decryption.
If there are many layers, and many edges from each layer to the next, then there are exponentially many source-to-sink paths, all of which achieve the same effect of implicitly decrypting with the private key without that key ever explicitly appearing during execution. For example, if there are 21 layers and every vertex in the first 20 layers has out-degree 10, then the number of different source-to-sink paths is 10^20. In that case, there are almost certainly more different source-to-sink paths than the total number of times the software will execute in its lifetime. Randomization can be used to select which source-to-sink path is used in a particular execution of the software. An adversary who carries out a differential analysis of various execution traces, thinking that the decryption routine executes in all of these traces and can therefore be pinpointed, is thereby foiled because different paths are used in different executions. This “raises the bar” and forces the adversary to do detailed semantic analyses of the code and its behavior.
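The property that every source-to-sink path carries edge integers summing to x can be checked with plain integers, without any curve arithmetic. The sketch below assumes a complete connection between consecutive layers, equal layer widths, and hypothetical names and values; in the actual scheme each edge integer would be multiplied into rP and subtracted from the running ciphertext.

```python
import itertools
import random

def build_additive_graph(x, layers, width, rng):
    """Return a list of dicts; entry j maps (v, w) to r(w) - r(v)
    for edges from layer j to layer j+1."""
    r0 = rng.randrange(1, 10**6)
    labels = [[r0] * width]                               # first layer: r0
    for _ in range(layers - 2):
        labels.append([rng.randrange(1, 10**6) for _ in range(width)])
    labels.append([x + r0] * width)                       # last layer: x + r0
    return [
        {(v, w): labels[j + 1][w] - labels[j][v]
         for v in range(width) for w in range(width)}
        for j in range(layers - 1)
    ]                                                     # r(v) labels are dropped

def path_sums(edge_layers, width):
    """Yield the sum of edge integers along every source-to-sink path."""
    layers = len(edge_layers) + 1
    for path in itertools.product(range(width), repeat=layers):
        yield sum(edge_layers[j][(path[j], path[j + 1])]
                  for j in range(layers - 1))

x = 123456789                                             # private key (toy value)
edge_layers = build_additive_graph(x, layers=4, width=3, rng=random.Random(7))
all_sums = set(path_sums(edge_layers, width=3))
```

Here all 3^4 = 81 paths yield the same sum, so `all_sums` contains the single value x even though the intermediate edge integers differ from path to path.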
The above descriptions used a layered graph to help explain the method. However, there is no need to use a layered graph in either the RSA or the ECC example above. A directed acyclic graph that is not layered can also be used for guiding the process of hiding the key. The main reason a layered graph was used in the above examples is to make the description easier to follow.
Hiding a Key in Hardware
Encryption is often used within tamper-resistant hardware, where the key is presumed to be safe from exposure. However, attacks on tamper-resistant hardware are feasible, and using the hiding technique described in the previous section within tamper-resistant hardware provides a second line of defense in case the adversary manages to defeat the existing hardware protection.
The ideas of the previous section can be used to amplify the security of moderately secure hardware, to achieve a much higher level of security. Specifically, if p is the probability that the moderately secure hardware is compromised, then the probability that the key is compromised can be brought down to p^k for any integer k of our choice, by using k copies of the hardware.
This can be achieved using the technique of the previous section in the following manner. A layered graph having k layers is generated, and k copies of the moderately secure hardware devices, call them Hw1, . . . , Hwk are used. An example using three hardware copies is shown in
Note that compromising only one hardware copy Hwj gives the adversary only the integers for the edges between layer j−1 and layer j, which are essentially random data. To get the secret key, the adversary must compromise all k copies of the hardware, and then correlate the random-looking data in each. If the probability of compromise of the data in a single copy of the hardware is p, then the probability of compromise of the proposed system of k layers, which has k copies of the hardware, is p^k. In practice, a value of k=3 (or even 2) may well be enough. For example, having k=3 would bring an unacceptable 0.001 probability of compromise down to a much better one-in-a-billion probability of compromise.
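The arithmetic in the example above can be reproduced exactly with rational numbers. This sketch assumes, as the p^k figure implicitly does, that the k hardware copies are compromised independently; the function name is hypothetical.

```python
from fractions import Fraction

def system_compromise_probability(p, k):
    """Probability that all k independent hardware copies are compromised."""
    return p ** k

single = Fraction(1, 1000)                       # 0.001 per hardware copy
with_three = system_compromise_probability(single, 3)
with_two = system_compromise_probability(single, 2)
```

With k=3 the result is exactly one in a billion, and with k=2 it is one in a million.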
The fact that the “active” key or edge within each Hwj device changes for each decryption makes an attack on even a single hardware device more difficult than if the same key had been used each time. This implies a lowered probability of compromise for a given Hwj. Moreover, different systems (of k copies each) that implement the same private key will each have their own random layered graph and hence different sets of random values within their respective copies of the hardware.
This technique makes it possible to use massively produced commercial off-the-shelf (COTS) decryption hardware that is only moderately secure to build a hardware decryption system that has dramatically higher security. The advantage of using massively produced COTS hardware is its low cost: the high production volumes for commercial use make such hardware essentially “commoditized” and of much lower cost than non-COTS (i.e., custom-built) hardware. The usual disadvantage of low-cost COTS hardware is that it has a higher probability of being compromised than special-purpose (more expensive) hardware. This disadvantage is dramatically reduced by the proposed technique.
Dynamically Evolving the Layered Graph
It is possible to modify the integers of the edges such that the graph mutates after every use without changing the hidden key. Moreover, each modification can be localized to a vertex and its incident edges (hence no at-once wholesale modification of the whole graph is necessary, as the graph can be modified little by little as it is being used). The advantages of modifying the graph include the following: (i) if the adversary has spent considerable time figuring out some of the bits in an edge's integer (e.g., by using non-destructive probing attacks on tamper-resistant hardware), then a modification to that integer nullifies the adversary's progress; (ii) it mitigates the security drawbacks of residual data properties of the memory material following a clearing event, because the memory cells are frequently overwritten with random-looking numbers and even the most sophisticated data-recovery techniques from cleared memory can look back only a limited number of write cycles. In other words, the integers on the edges of the graph become “moving targets” that must all be determined simultaneously prior to the next mutation.
A method for dynamically modifying the layered graph is the following. Let v be a vertex of the graph that is not in the first or last layers of the graph (neither a source nor a sink), let y1, y2, . . . be the integers on the edges that have their heads connected to vertex v, and let z1, z2, . . . be the integers on the edges that have their tails connected to vertex v. The modification then consists of
The key observation is that the net effect of the above change, on any source-to-sink path that goes through vertex v, is nil. This is because such a path uses exactly one yi and exactly one zj, and hence the r that modified yi is cancelled out by its inverse, r′, that modified zj.
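The cancellation argument can be illustrated for the RSA case with the sketch below. The exact modification steps are an assumption here: the sketch multiplies every incoming edge integer yi by a random r and every outgoing edge integer zj by r′=r^(−1), both modulo (p−1)(q−1), which is one reading consistent with the cancellation described above. The function name and the sample edge values are hypothetical; an additive analogue for ECC would add r to each yi and subtract it from each zj.

```python
import math
import random

def mutate_vertex(incoming, outgoing, phi, rng):
    """Return new (incoming, outgoing) edge integers for one interior vertex.

    Assumed rule: yi -> yi * r and zj -> zj * r^(-1), both modulo phi.
    """
    while True:
        r = rng.randrange(2, phi)
        if math.gcd(r, phi) == 1:          # r must be invertible mod phi
            break
    r_inv = pow(r, -1, phi)
    new_incoming = [(y * r) % phi for y in incoming]
    new_outgoing = [(z * r_inv) % phi for z in outgoing]
    return new_incoming, new_outgoing

phi = 60 * 52                              # (p-1)(q-1) for toy p=61, q=53
incoming = [7, 11, 17]                     # y1, y2, y3 (sample values)
outgoing = [19, 23]                        # z1, z2 (sample values)
new_incoming, new_outgoing = mutate_vertex(incoming, outgoing, phi,
                                           random.Random(3))

# Contribution of vertex v to any path through it, before and after.
before = [(y * z) % phi for y in incoming for z in outgoing]
after = [(y * z) % phi for y in new_incoming for z in new_outgoing]
```

Every yi·zj product is unchanged modulo (p−1)(q−1), so the product along any path through the vertex, and hence the hidden key, is unaffected while the stored integers themselves change.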
The present invention has been described with reference to certain exemplary embodiments, variations, and applications. However, the present invention is defined by the appended claims and therefore should not be limited by the described embodiments, variations, and applications.
This application claims the benefit of U.S. Provisional Application Ser. No. 60/735,906, filed on Nov. 10, 2005, entitled “Method and Apparatus for Hiding a Private Key,” which is incorporated herein by reference.
| Number | Date | Country |
|---|---|---|
| 20070127721 A1 | Jun 2007 | US |

| Number | Date | Country |
|---|---|---|
| 60735906 | Nov 2005 | US |