The present application is the United States national stage of International Application No. PCT/EP2014/056421, filed Mar. 31, 2014, the entire content of which is incorporated herein by reference.
The present invention relates to a method of obfuscated performance of a predetermined function and a method of configuring a processor to implement a predetermined function in an obfuscated manner, and apparatus and computer programs for carrying out such methods.
A “white box environment” is an execution environment in which a person can execute an amount of computer code (or software)—where the code implements a function F—and the person may inspect and modify the code (or be assumed to know the underlying algorithm that is being implemented) and/or, during execution of the code, the person may inspect and modify the values of data being used (i.e. the contents of the memory being used), the data flow and the process flow (or order of execution of instructions in the code).
Various techniques are known that enable provision or generation of code (that implements the function F) such that, even if the code is executed in a white-box environment, the person executing the code cannot determine the values of inputs to the function F and/or outputs of the function F and/or secret information used by the function F (or, at the very least, such a determination is rendered impractical or infeasible within a given amount of time).
It would be desirable to be able to provide an improved technique for providing or generating code that is suitable for deployment or execution within a white-box environment.
According to a first aspect of the invention, there is provided a method of obfuscated performance of a predetermined function, wherein for the predetermined function there is one or more corresponding first functions so that, for a set of inputs for the function, a corresponding set of outputs may be generated by (a) representing the set of inputs as a corresponding set of values, wherein each value comprises at least part of each input of a corresponding plurality of the inputs, (b) generating a set of one or more results from the set of values, where each result is generated by applying a corresponding first function to a corresponding set of one or more values in the set of values, and (c) forming each output as either a part of a corresponding one of the results or as a combination of at least part of each result of a corresponding plurality of the results; wherein the method comprises: obtaining, for each value in the set of values, one or more corresponding transformed versions of said value, wherein each transformed version of said value is the result of applying a respective bijection, that corresponds to said transformed version, to said value; and generating a set of transformed results corresponding to the set of results, wherein each transformed result corresponds to a respective result and is generated by applying a respective second function, that corresponds to the first function that corresponds to the respective result, to a transformed version of the one or more values in the set of one or more values corresponding to the first function, wherein, for the respective second function, there is a corresponding bijection for obtaining the respective result from said transformed result.
In some embodiments, said obtaining comprises: obtaining the set of values; and generating, for each value in the obtained set of values, said one or more corresponding transformed versions of said value, wherein each of said one or more corresponding transformed versions of said value is generated by applying said respective bijection, that corresponds to said transformed version, to said value. Obtaining the set of values may then comprise: obtaining the set of inputs; and generating the set of values from the set of inputs.
In some embodiments, said obtaining comprises receiving, at a first module that performs said obtaining and said generating, said one or more transformed versions of each value in said set of values from a second module.
In some embodiments, the method comprises: generating the set of results from the set of transformed results by applying, to each transformed result, the bijection that corresponds to the second function for that transformed result. The method may then comprise obtaining the set of outputs from the set of results.
In some embodiments, the method comprises outputting the set of transformed results, from a first module that performs said obtaining and said generating to a second module.
In some embodiments, for each value in the set of values, the at least part of each input of a corresponding plurality of the inputs comprises the whole of each input of the corresponding plurality of inputs.
In some embodiments, for each value in the set of values, the at least part of each input of a corresponding plurality of the inputs comprises a predetermined number of bits of each input of the corresponding plurality of inputs. The predetermined number may be 1.
In some embodiments, the predetermined function corresponds to a lookup table that maps an input in the set of inputs to a corresponding output in the set of outputs.
According to a second aspect of the invention, there is provided a method of configuring a processor to implement a predetermined function in an obfuscated manner, wherein for the predetermined function there is one or more corresponding first functions so that, for a set of inputs for the function, a corresponding set of outputs may be generated by (a) representing the set of inputs as a corresponding set of values, wherein each value comprises at least part of each input of a corresponding plurality of the inputs, (b) generating a set of one or more results from the set of values, where each result is generated by applying a corresponding first function to a corresponding set of one or more values in the set of values, and (c) forming each output as either a part of a corresponding one of the results or as a combination of at least part of each result of a corresponding plurality of the results; wherein the method comprises: for each first function: specifying, for each value in the corresponding set of one or more values for the first function, a corresponding bijection; specifying a bijection for the first function; and based on the specified bijections, determining a second function that corresponds to the first function, wherein the second function, upon application to the one or more values of the respective set of one or more values for the first function when transformed under their corresponding bijections, outputs a transformed version, under the bijection for the first function, of the result corresponding to the first function; and configuring the processor to carry out the method of the above first aspect of the invention, using the determined second functions.
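By way of illustration only, the determination of a second function from a first function and the specified bijections might be sketched as follows. The sketch assumes, purely for the purposes of the example, that values and results are 4-bit quantities, that the first function operates on a single value, and that each bijection is a randomly chosen permutation; the names `random_bijection`, `first_function` and `determine_second_function` are hypothetical and do not appear elsewhere in this description.

```python
import random

SIZE = 16  # assumed: values and results are 4-bit quantities


def random_bijection(rng):
    """Return a random permutation of range(SIZE) and its inverse."""
    forward = list(range(SIZE))
    rng.shuffle(forward)
    inverse = [0] * SIZE
    for plain, transformed in enumerate(forward):
        inverse[transformed] = plain
    return forward, inverse


def first_function(p):
    """Hypothetical first function acting on a single 4-bit value."""
    return (3 * p + 5) % SIZE


def determine_second_function(t_in_inverse, t_out):
    """Tabulate G so that G[T_in(p)] == T_out(first_function(p)) for all p."""
    return [t_out[first_function(t_in_inverse[t])] for t in range(SIZE)]


rng = random.Random(0)  # fixed seed so that the sketch is repeatable
t_in, t_in_inverse = random_bijection(rng)     # bijection for the value
t_out, t_out_inverse = random_bijection(rng)   # bijection for the first function
second_function = determine_second_function(t_in_inverse, t_out)

# Obfuscated performance: only transformed data is handled by the table.
value = 9
transformed_result = second_function[t_in[value]]
result = t_out_inverse[transformed_result]     # equals first_function(value)
```

In this sketch the table `second_function` never exposes an untransformed value or result; the result is recovered only by applying the inverse of the bijection specified for the first function.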
According to a third aspect of the invention, there is provided an apparatus arranged to carry out any one of the above methods.
According to a fourth aspect of the invention, there is provided a computer program which, when executed by a processor, causes the processor to carry out any one of the above methods. The computer program may be stored on a computer-readable medium.
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
In the description that follows and in the figures, certain embodiments of the invention are described. However, it will be appreciated that the invention is not limited to the embodiments that are described and that some embodiments may not include all of the features that are described below. It will be evident, however, that various modifications and changes may be made herein without departing from the broader spirit and scope of the invention as set forth in the appended claims.
The storage medium 104 may be any form of non-volatile data storage device such as one or more of a hard disk drive, a magnetic disc, an optical disc, a ROM, etc. The storage medium 104 may store an operating system for the processor 108 to execute in order for the computer 102 to function. The storage medium 104 may also store one or more computer programs (or software or instructions or code).
The memory 106 may be any random access memory (storage unit or volatile storage medium) suitable for storing data and/or computer programs (or software or instructions or code).
The processor 108 may be any data processing unit suitable for executing one or more computer programs (such as those stored on the storage medium 104 and/or in the memory 106), some of which may be computer programs according to embodiments of the invention or computer programs that, when executed by the processor 108, cause the processor 108 to carry out a method according to an embodiment of the invention and configure the system 100 to be a system according to an embodiment of the invention. The processor 108 may comprise a single data processing unit or multiple data processing units operating in parallel or in cooperation with each other. The processor 108, in carrying out data processing operations for embodiments of the invention, may store data to and/or read data from the storage medium 104 and/or the memory 106.
The interface 110 may be any unit for providing an interface to a device 122 external to, or removable from, the computer 102. The device 122 may be a data storage device, for example, one or more of an optical disc, a magnetic disc, a solid-state-storage device, etc. The device 122 may have processing capabilities—for example, the device may be a smart card. The interface 110 may therefore access data from, or provide data to, or interface with, the device 122 in accordance with one or more commands that it receives from the processor 108.
The user input interface 114 is arranged to receive input from a user, or operator, of the system 100. The user may provide this input via one or more input devices of the system 100, such as a mouse (or other pointing device) 126 and/or a keyboard 124, that are connected to, or in communication with, the user input interface 114. However, it will be appreciated that the user may provide input to the computer 102 via one or more additional or alternative input devices (such as a touch screen). The computer 102 may store the input received from the input devices via the user input interface 114 in the memory 106 for the processor 108 to subsequently access and process, or may pass it straight to the processor 108, so that the processor 108 can respond to the user input accordingly.
The user output interface 112 is arranged to provide a graphical/visual and/or audio output to a user, or operator, of the system 100. As such, the processor 108 may be arranged to instruct the user output interface 112 to form an image/video signal representing a desired graphical output, and to provide this signal to a monitor (or screen or display unit) 120 of the system 100 that is connected to the user output interface 112. Additionally or alternatively, the processor 108 may be arranged to instruct the user output interface 112 to form an audio signal representing a desired audio output, and to provide this signal to one or more speakers 121 of the system 100 that are connected to the user output interface 112.
Finally, the network interface 116 provides functionality for the computer 102 to download data from and/or upload data to one or more data communication networks.
It will be appreciated that the architecture of the system 100 illustrated in
As will be described in more detail below, embodiments of the invention involve, or relate to, a predetermined function F. The function F may be any function (or operation or procedure or mapping or calculation or algorithm) that is arranged to operate on (or process) one or more inputs to generate a corresponding output. Some specific examples of the function F are provided later, but it will be appreciated that embodiments of the invention are not limited to the specific examples discussed below.
As shall be described in more detail below, the function F is a function for which one or more corresponding functions (referred to herein as “sub-functions” for ease of reference) F1, . . . , FN can be defined so that, for a set of inputs {x1, . . . , xW} for the function F, a set of outputs {y1, . . . , yV} from the function F that corresponds to the set of inputs {x1, . . . , xW} may be generated by:
(a) representing the set of inputs {x1, . . . , xW} as a corresponding set of values {p1, . . . , pM}, wherein each value pj (j=1, . . . , M) comprises at least part of each input of a corresponding plurality of the inputs;
(b) generating a set of one or more results {q1, . . . , qN} from the set of values {p1, . . . , pM}, by applying each sub-function Fj (j=1, . . . , N) to a corresponding set of one or more values in the set of values {p1, . . . , pM} to generate a respective result qj; and
(c) forming each output yi as either a part of a corresponding one of the results or as a combination of at least part of each result of a corresponding plurality of the results.
Thus, in the description below, the following terminology is used:
In some embodiments, V=W, and yi=F(xi) for i=1, . . . , W. In other embodiments, the function F is arranged to process a number of inputs together to form a single output. As one example, the function F may have three input parameters (or operands), so that y1=F(x1, x2, x3), y2=F(x4, x5, x6), y3=F(x7, x8, x9), . . . , in which case V=W/3; as another example, the function F may have two input parameters (or operands), so that y1=F(x1, x2), y2=F(x2, x3), y3=F(x3, x4), . . . , in which case V=W−1. It will be appreciated that other relationships between the outputs yi and the inputs xi (and, therefore, other relationships between W and V) exist and embodiments of the invention are not limited to any specific relationships.
In some embodiments, the operation/processing performed for the function F is the same regardless of which input(s) xi is provided to the function F. For example, in the exemplary function above in which yi=F(xi) for i=1, . . . , W, the output yi may be calculated based on xi independent of the index i—i.e. for 1≤i<j≤W, yi=F(xi)=F(xj)=yj if xi=xj. Similarly, in the exemplary function above in which y1=F(x1, x2), y2=F(x2, x3), y3=F(x3, x4), . . . , the output yi may be calculated based on xi and xi+1 independent of the index i—i.e. for 1≤i<j<W, yi=F(xi, xi+1)=F(xj, xj+1)=yj if xi=xj and xi+1=xj+1. However, in other embodiments, the operation/processing performed for the function F may be different depending on which input(s) xi is provided to the function F. This may be viewed as the operation/processing performed for the function F being dependent on the index i. For example, the function F may be defined as yi=F(xi)=xi+i for i=1, . . . , W. Similarly, the function F may be defined as yi=F(xi, xi+1)=ixi−3+(xi+1+i)/2. Similarly, the function F may be defined as yi=F(xi, xi+1)=Mixi+Mi+1xi+1 where Mi and Mi+1 are matrices for multiplication with input vectors xi and xi+1. Other examples of such functions are, of course, possible.
As shall be described later, all computer-implemented functions F can be implemented in the manner set out above, so that embodiments of the invention are applicable to all predetermined computer-implemented functions F.
Embodiments of the invention are described below now with reference to
The method 300 begins at a step 302, at which a set of inputs {x1, x2, . . . , xW} is obtained. The set of inputs {x1, x2, . . . , xW} may be obtained, at least in part, by receiving one or more of the inputs for the set of inputs {x1, x2, . . . , xW}—for example, a first module that is implementing the step 302 may have one or more of the inputs xi for the set of inputs {x1, x2, . . . , xW} provided to it from a second module as inputs to the first module. Additionally or alternatively, the set of inputs {x1, x2, . . . , xW} may be obtained, at least in part, by accessing or retrieving one or more of the inputs for the set of inputs {x1, x2, . . . , xW}—for example, a module that is implementing the step 302 may access or read one or more of the inputs xi from a memory (such as the memory 106). Thus, the term “obtain” as used herein shall be taken to mean “receive” (or “have provided” in a “passive” way) or “access” (or “retrieve” or “read” in more of an “active” way) or a combination of both receiving and accessing.
For the set of inputs {x1, x2, . . . , xW}, W is an integer greater than 1—thus, a plurality of inputs x1, x2, . . . , xW is obtained.
One or more of the inputs x1, x2, . . . , xW may be obtained separately (for example, the inputs x1, x2, . . . , xW may be obtained one at a time, so that, for example, input xi+1 is obtained after input xi for i=1, . . . , W−1). Additionally or alternatively, two or more of the inputs x1, x2, . . . , xW may be obtained together as a group (for example, the inputs x1, x2, . . . , xW may be obtained as a single amount of data comprising the whole set {x1, x2, . . . , xW}, for example by accessing or reading a block of memory that is storing the plurality of inputs x1, x2, . . . , xW). It will be appreciated that the inputs x1, x2, . . . , xW may be obtained in other ways/groupings.
Each input x1, x2, . . . , xW is a value (or quantity or data element or operand) that is a suitable operand or parameter for the function F. For example, if the function F is to process K-bit integers, then each of the inputs x1, x2, . . . , xW is a K-bit integer.
The set of inputs {x1, x2, . . . , xW} will be processed together, so that a corresponding set of outputs {y1, y2, . . . , yV} can be obtained (or determined or calculated), as will become apparent from the description below. Here, the number of outputs, V, generated from the set of inputs {x1, x2, . . . , xW} is an integer greater than or equal to 1. In some embodiments, V=W; in other embodiments, V≠W.
As used herein, the term “set” means a group or collection of elements in a particular order; for example, the set {x1, x2, x3, x4, . . . , xW} is different from the set {x2, x1, x3, x4, . . . , xW} if x1 is different from x2. Thus, a set (as used herein) may be viewed as a vector or sequence or list or array of elements, i.e. the elements of the set are in a particular order (the order being represented by the index/subscript).
Next, at a step 304, a set of values {p1, . . . , pM} is generated from the set of inputs {x1, . . . , xW}. Here, M is an integer greater than 1, so that a plurality of values p1, . . . , pM is generated from the set of inputs {x1, . . . , xW}.
At the step 304, the set of values {p1, . . . , pM} is generated according to {p1, . . . , pM}=D({x1, . . . , xW}), where D is an invertible function that maps a set of W inputs xi to a set of M values. Herein, the function D shall be referred to as the “distribution function”. In particular, the function D is a predetermined function that has the property that each value pj (j=1, . . . , M) comprises at least part of each input of a corresponding plurality of the inputs. Put another way, for each value pj (j=1, . . . , M), there is a corresponding set of mj distinct indices $\{\alpha_{j,1}, \ldots, \alpha_{j,m_j}\}$ (where $1 \le \alpha_{j,k} \le W$ for $k=1, \ldots, m_j$, and $m_j > 1$) such that pj comprises at least part of each of the inputs $x_{\alpha_{j,1}}, \ldots, x_{\alpha_{j,m_j}}$.
Thus,
$$p_j = D_j\left(x_{\alpha_{j,1}}, \ldots, x_{\alpha_{j,m_j}}\right)$$
for a function Dj that corresponds to (or defines, at least in part) the distribution function D. For value pj (j=1, . . . , M), the at least part of the kth input that belongs to the corresponding set of mj inputs (i.e. input $x_{\alpha_{j,k}}$) is denoted herein by Sj,k.
For each value pj (j=1, . . . , M), the corresponding plurality of inputs $x_{\alpha_{j,1}}, \ldots, x_{\alpha_{j,m_j}}$ may be any 2 or more of the inputs x1, . . . , xW and, in particular, the corresponding plurality of the inputs may be all of the set of inputs {x1, . . . , xW} (so that mj=W) or a proper subset of the set of inputs {x1, . . . , xW} (so that mj<W). The number mj of inputs in the corresponding plurality of inputs may differ between values pj. Thus, in some embodiments, $m_{j_1} \ne m_{j_2}$ for some $j_1 \ne j_2$.
For each value pj (j=1, . . . , M), the parts Sj,k (k=1, . . . , mj) of the inputs $x_{\alpha_{j,k}}$ may be defined as the same parts of their respective inputs, or may be different parts of their respective inputs.
Each part Sj,k is some or all of one of the inputs xi. If the part Sj,k is all of the input xi, then Sj,k=xi. Alternatively, if the part Sj,k is only some of the input xi, then this means that, given a representation (e.g. binary, decimal, hexadecimal, etc.) of the input xi, then Sj,k comprises some of the symbols in that representation (e.g. some of the bits, or some of the decimal or hexadecimal values/symbols, of the representation). For example, with reference to
Whilst
Each value pj (j=1, . . . , M) is formed from its respective parts Sj,k (k=1, . . . , mj) by combining those parts, for example by concatenation/merging/mixing/etc. As will be described later (with reference to the examples shown in
Some specific examples of the distribution function D (and therefore the functions Di) shall be provided later, and it will be appreciated that embodiments of the invention are not limited to the specific examples discussed below. However, to help the understanding at this stage, an example of the distribution function D is as follows:
It will be appreciated that, at the step 304, the set of values {p1, . . . , pM} is a representation of the set of inputs {x1, . . . , xW}, the representation being according to the distribution function D. A value pj may be generated as an amount of data (e.g. in the memory 106) distinct from the set of inputs {x1, . . . , xW} (so that the value pj is stored in addition to the set of inputs {x1, . . . , xW} and at a distinct address in memory from the set of inputs {x1, . . . , xW}). However, it will be appreciated that the module implementing the step 304 may determine (or generate) a value pj from the existing amounts of data already being stored to represent the set of inputs {x1, . . . , xW}, for example by specifying that value pj is made up from amounts of data at specific memory addresses, where those memory addresses store the respective parts of the relevant inputs xi (in which case these specific memory addresses implicitly define or represent the distribution function D). Thus, when the value pj is subsequently used, the module making use of the value pj could simply refer to the specific memory addresses that store the respective parts of the relevant inputs xi—in this case, the step of explicitly generating the values pj may be omitted, as that module implicitly uses values pj by virtue of using the correct memory addresses.
Thus, the step 304 may be viewed as a step (either implicit or explicit) of representing the set of inputs {x1, . . . , xW} as the set of values {p1, . . . , pM}.
Next, at a step 306, a set of results {q1, . . . , qN} is generated from the set of values {p1, . . . , pM}. Here, N is an integer greater than or equal to 1.
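In particular, for j=1, . . . , N, there is a corresponding set of nj distinct indices $\{\beta_{j,1}, \ldots, \beta_{j,n_j}\}$ (where $1 \le \beta_{j,k} \le M$ for $k=1, \ldots, n_j$, and $n_j \ge 1$) such that the result qj is generated from the values $p_{\beta_{j,1}}, \ldots, p_{\beta_{j,n_j}}$ according to a predetermined function Fj, i.e.
$$q_j = F_j\left(p_{\beta_{j,1}}, \ldots, p_{\beta_{j,n_j}}\right)$$
Here, the function Fj corresponds to (or defines, at least in part) the function F—thus, the function Fj may be viewed as a sub-function corresponding to the function F.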
Examples of how the functions F1, . . . , FN are defined and used shall be given later.
Next, at a step 308, the set of outputs {y1, . . . , yV} that corresponds to the set of inputs {x1, . . . , xW} is generated from the set of results {q1, . . . , qN}. In particular, the set of outputs {y1, . . . , yV} is generated according to {y1, . . . , yV}=E({q1, . . . , qN}), where E is an invertible function that maps a set of N results qj to a set of V outputs. Herein, the function E shall be referred to as the “separation function”. In particular, the function E is a predetermined function that has the property that each output yi either is a part of a corresponding one of the results or is a combination of at least part of each result of a corresponding plurality of the results. Put another way, for each output yi (i=1, . . . , V), there is a corresponding set of vi distinct indices $\{\gamma_{i,1}, \ldots, \gamma_{i,v_i}\}$ (where $1 \le \gamma_{i,k} \le N$ for $k=1, \ldots, v_i$, and $v_i \ge 1$) such that yi is formed from at least part of each of the results $q_{\gamma_{i,1}}, \ldots, q_{\gamma_{i,v_i}}$.
Thus,
$$y_i = E_i\left(q_{\gamma_{i,1}}, \ldots, q_{\gamma_{i,v_i}}\right)$$
for a function Ei that corresponds to (or defines, at least in part) the separation function E. The production of the set of outputs {y1, . . . , yV} from the set of results {q1, . . . , qN} via the separation function E (defined by its corresponding functions E1, . . . , EV) operates in an analogous way to the production of the set of values {p1, . . . , pM} from the set of inputs {x1, . . . , xW} via the distribution function D (defined by its corresponding functions D1, . . . , DM), so that the description above for the step 304 applies analogously to the step 308.
In some embodiments, when N=M and W=V, the separation function E is the inverse of the distribution function D.
In essence, then, instead of processing a single input x to generate a corresponding output y (where this processing is independent of any of the other inputs), a plurality of inputs x1, . . . , xW are processed together to generate a corresponding plurality of outputs y1, . . . , yV by: (a) generating values p1, . . . , pM, where each pj is dependent on multiple ones of the inputs; (b) processing the values p1, . . . , pM using functions F1, . . . , FN (that are based on the function F) to generate results q1, . . . , qN; and (c) separating out the outputs y1, . . . , yV from the generated results q1, . . . , qN.
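The three stages summarised above might be sketched as follows. This is a minimal illustration only: it assumes a hypothetical function F operating on 4-bit inputs, a distribution function that packs the whole of two consecutive inputs into each value, sub-functions that apply F to both halves of a value, and a separation function that is the inverse of the distribution function. None of these specific choices is mandated by the description.

```python
BITS = 4                   # assumed width of each input and each output
MASK = (1 << BITS) - 1


def F(x):
    """Hypothetical predetermined function on 4-bit inputs."""
    return (x * 7 + 3) % 16


def distribute(inputs):
    """D: each value p_j comprises the whole of two consecutive inputs."""
    return [(inputs[i] << BITS) | inputs[i + 1] for i in range(0, len(inputs), 2)]


def sub_function(value):
    """F_j: apply F to both halves of a value, giving one packed result q_j."""
    high, low = value >> BITS, value & MASK
    return (F(high) << BITS) | F(low)


def separate(results):
    """E: each output is a part of a corresponding one of the results."""
    outputs = []
    for q in results:
        outputs += [q >> BITS, q & MASK]
    return outputs


inputs = [1, 2, 3, 4]                         # W = 4
values = distribute(inputs)                   # step 304, M = 2
results = [sub_function(p) for p in values]   # step 306, N = 2
outputs = separate(results)                   # step 308, V = 4
```

Here each output yi equals F(xi), but no stage after the step 304 operates on a single input in isolation.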
An initial example will aid understanding of the method 300 as described above. Consider a predetermined lookup table that is implemented by the function F. An example of how the method 300 can be applied to such a function F is described below with reference to
Assume that the input to the lookup table F is an input x that is an M-bit value, where the kth bit of x is bk so that the binary representation of x is bMbM-1 . . . b2b1, (so that bk is 0 or 1 for k=1, . . . , M). In the example described below, M=8, so that the binary representation of x is b8b7b6b5b4b3b2b1. Assume also that the output from the lookup table F (i.e. the value looked-up in response to receiving the input x) is an output y that is an N-bit value, where the kth bit of y is ck so that the binary representation of y is cNcN-1 . . . c2c1, (so that ck is 0 or 1 for k=1, . . . , N). In the example described below, N=8, so that the binary representation of y is c8c7c6c5c4c3c2c1. This is schematically illustrated in
It will be appreciated that each of the output bits ck can be calculated or expressed as a respective logical expression Bk applied to the input bits bi i.e. ck=Bk(b1, b2, . . . , bM). This is schematically illustrated in
The function Bk can be expressed using one or more logical AND's (an AND is represented herein by ∧), zero or more logical OR's (an OR is represented herein by ∨) and zero or more logical NOT's (a NOT is represented herein by ¬). In particular, suppose that the lookup table results in output bit ck assuming the value 1 for n input values X1, . . . , Xn (i.e. when x assumes the value of any one of X1, . . . , Xn), and that for all other possible values for input x, ck assumes the value 0. For each i=1, . . . , n, let Ri be a corresponding logical expression defined by Ri(b1, b2, . . . , bM)=b′M∧b′M-1∧b′M-2∧ . . . ∧b′3∧b′2∧b′1 (i.e. an AND of expressions b′j for j=1, . . . , M), where, for j=1, . . . , M, b′j=bj if the jth bit of the input value Xi is a 1 and b′j=¬bj if the jth bit of the input value Xi is a 0. For example, for the 8-bit input value Xi=53 in decimal, or (00110101) in binary, Ri(b1, b2, . . . , bM)=¬b8∧¬b7∧b6∧b5∧¬b4∧b3∧¬b2∧b1. Thus, Ri(b1, b2, . . . , bM) only evaluates to the value 1 for an input value of 53. Then, Bk can be defined as R1∨R2∨ . . . ∨Rn, i.e. by OR-ing the expressions Ri (i=1, . . . , n) together (if n=1, then no OR's are necessary). Then Bk only evaluates to the value 1 for an input that assumes the value of one of X1, X2, . . . , Xn. For example, suppose n=3 and c4 only assumes the value of 1 if the input x takes the value 31 (=(00011111) in binary), 53 (=(00110101) in binary) or 149 (=(10010101) in binary). Then:
R1(b1,b2, . . . ,bM)=¬b8∧¬b7∧¬b6∧b5∧b4∧b3∧b2∧b1
R2(b1,b2, . . . ,bM)=¬b8∧¬b7∧b6∧b5∧¬b4∧b3∧¬b2∧b1
R3(b1,b2, . . . ,bM)=b8∧¬b7∧¬b6∧b5∧¬b4∧b3∧¬b2∧b1
so that B4 can be expressed as
B4(b1,b2, . . . ,bM)=(¬b8∧¬b7∧¬b6∧b5∧b4∧b3∧b2∧b1)∨(¬b8∧¬b7∧b6∧b5∧¬b4∧b3∧¬b2∧b1)∨(b8∧¬b7∧¬b6∧b5∧¬b4∧b3∧¬b2∧b1)
There are, of course, more efficient or optimized ways of expressing Bk, i.e. with fewer logical operations. For example, one could express B4 above as follows:
B4(b1,b2, . . . , bM)=b1∧b3∧b5∧¬b7∧((¬b8∧¬b6∧b4∧b2)∨(¬b8∧b6∧¬b4∧¬b2)∨(b8∧¬b6∧¬b4∧¬b2))
and further optimized expressions are possible. Indeed, in general, it is expected that an optimized expression may be between 10% and 20% of the size of the above “naïve” logical expression generated by simply OR-ing together the sub-expressions Ri.
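The construction of the “naïve” expression, and its equivalence to the factored form of B4, might be checked with a sketch such as the following. It assumes the example given above, in which c4 is 1 only for the inputs 31, 53 and 149; the helper names `bit`, `minterm`, `naive_expression` and `optimized_b4` are hypothetical.

```python
M = 8  # number of input bits, as in the example above


def bit(x, j):
    """Return bit b_j of x, where b_1 is the least significant bit."""
    return (x >> (j - 1)) & 1


def minterm(target):
    """R_i: an expression that evaluates to 1 only when the input equals target."""
    def r(x):
        out = 1
        for j in range(1, M + 1):
            b = bit(x, j)
            out &= b if bit(target, j) else (1 - b)   # b_j or NOT b_j
        return out
    return r


def naive_expression(targets):
    """B_k formed by OR-ing together the expressions R_i for the given inputs."""
    terms = [minterm(t) for t in targets]

    def b_k(x):
        out = 0
        for r in terms:
            out |= r(x)
        return out
    return b_k


def optimized_b4(x):
    """Factored B4: the common factor b1 AND b3 AND b5 AND (NOT b7) is pulled out."""
    b = [None] + [bit(x, j) for j in range(1, M + 1)]   # b[1] .. b[8]

    def n(v):
        return 1 - v                                    # NOT on a single bit

    common = b[1] & b[3] & b[5] & n(b[7])
    rest = ((n(b[8]) & n(b[6]) & b[4] & b[2])
            | (n(b[8]) & b[6] & n(b[4]) & n(b[2]))
            | (b[8] & n(b[6]) & n(b[4]) & n(b[2])))
    return common & rest


naive_b4 = naive_expression([31, 53, 149])
ones = [x for x in range(2 ** M) if naive_b4(x)]
equivalent = all(naive_b4(x) == optimized_b4(x) for x in range(2 ** M))
```

An exhaustive comparison over all 256 inputs is practical here only because M is small; it is used in this sketch simply as a check on the factoring.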
Thus, the lookup table F may be considered to be implemented by the functions B1, . . . , BN, so that given an input x=bMbM-1 . . . b2b1, the corresponding output y=cNcN-1 . . . c2c1 is defined by ck=Bk(b1, b2, . . . , bM) for k=1, . . . , N.
Although implementing the lookup table F by using N separate Boolean expressions B1, . . . , BN may introduce a performance penalty, this can be largely mitigated by performing data-level parallelism. In particular, given a set of inputs {x1, x2, . . . , xW}, a set of M values {p1, . . . , pM} may be obtained or generated, where each value pk is a W-bit value, where the wth bit of value pk is the kth bit of input xw (for w=1, . . . , W and k=1, . . . , M). In other words, the set of inputs {x1, x2, . . . , xW} may be expressed or represented as a set of M values {p1, . . . , pM}. This is illustrated schematically in
Thus, in this example, the distribution function D used at the step 304 of
In
B4(b1,b2, . . . , bM)=b1∧b3∧b5∧¬b7∧((¬b8∧¬b6∧b4∧b2)∨(¬b8∧b6∧¬b4∧¬b2)∨(b8∧¬b6∧¬b4∧¬b2))
the corresponding function F4 may be expressed as
F4(p1,p2, . . . ,pM)=p1∧p3∧p5∧¬p7∧((¬p8∧¬p6∧p4∧p2)∨(¬p8∧p6∧¬p4∧¬p2)∨(p8∧¬p6∧¬p4∧¬p2))
(where ∧, ∨ and ¬ in F4 are bit-wise AND, OR and NOT operations on multi-bit operands). For example, the kth bit of the intermediate value q4 is the result of applying the function B4 to the bits of the kth input xk, and this bit is obtained (in the kth bit position) by applying the function F4 to the values p1, . . . , pM.
As processors are often arranged to perform logical operations on multi-bit operands, depending on the word-size of the processor (e.g. by having 32-bit or 64-bit AND, OR and NOT operators), a whole set of outputs {y1, y2, . . . , yW} can be obtained from the set of inputs {x1, x2, . . . , xW} at the same time. This helps mitigate the performance penalty incurred by implementing the lookup table using Boolean expressions.
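The data-level parallelism described above might be sketched as follows, assuming (as in the earlier example) that output bit c4 is 1 only for the inputs 31, 53 and 149. The function names `bitslice` and `f4`, and the particular inputs chosen, are hypothetical. Python integers stand in for the W-bit machine words, so a mask is applied after each bit-wise NOT.

```python
def bitslice(inputs, num_bits):
    """Distribution: bit w of value p_k is bit k of input x_w (1-based k and w)."""
    values = []
    for k in range(num_bits):
        p = 0
        for w, x in enumerate(inputs):
            p |= ((x >> k) & 1) << w
        values.append(p)
    return values                     # values[k - 1] is p_k


def f4(p, width):
    """Bit-wise counterpart of B4, evaluated for all `width` inputs at once."""
    mask = (1 << width) - 1

    def n(v):
        return ~v & mask              # bit-wise NOT restricted to `width` bits

    common = p[1] & p[3] & p[5] & n(p[7])
    rest = ((n(p[8]) & n(p[6]) & p[4] & p[2])
            | (n(p[8]) & p[6] & n(p[4]) & n(p[2]))
            | (p[8] & n(p[6]) & n(p[4]) & n(p[2])))
    return common & rest


inputs = [31, 0, 53, 149, 255, 21]    # x1 .. x6, so W = 6
W = len(inputs)
p = [None] + bitslice(inputs, 8)      # p[1] .. p[8]
q4 = f4(p, W)                         # one pass yields bit c4 of every output
c4_bits = [(q4 >> w) & 1 for w in range(W)]
```

A single evaluation of `f4` therefore replaces W separate evaluations of B4, which is the source of the mitigation referred to above.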
A particular example of the “general lookup table” concept set out above in section 2.1 is provided below for a specific example lookup table. This specific lookup table is quite small (for ease of explanation), but, as set out in section 2.1 above, it will be appreciated that the “general lookup table” concept can be applied to other, potentially larger, lookup tables.
Consider the following lookup table that defines the function F:
Here, the input to the function F is a 3-bit number and the output is a 4-bit number. Thus, M=3 and N=4. The above table defines functions B1, . . . , B4 as follows:
B4(b1,b2,b3)=(b3b2b1)(b3b2b1)
B3(b1,b2,b3)=(b3b2b1)(b3b2b1)
B2(b1,b2,b3)=(b3b2b1)(b3b2b1)(b3b2b1)(b3b2b1)
B1(b1,b2,b3)=(b3b2b1)(b3b2b1)(b3b2b1)(b3b2b1)(b3b2b1)
so that functions F1, . . . , F4 (i.e. the sub-functions for function F) are defined by
F4(p1,p2,p3)=(p3p2p1)(p3p2p1)
F3(p1,p2,p3)=(p3p2p1)(p3p2p1)
F2(p1,p2,p3)=(p3p2p1)(p3p2p1)(p3p2p1)(p3p2p1)
F1(p1,p2,p3)=(p3p2p1)(p3p2p1)(p3p2p1)(p3p2p1)(p3p2p1)
Consider the set of five inputs {x1, x2, x3, x4, x5}, where x5=5=(101), x4=7=(111), x3=2=(010), x2=0=(000) and x1=4=(100)—i.e. W=5. Then the set of values {p1, p2, p3} is formed from the set of five inputs {x1, x2, x3, x4, x5} as discussed above, where each value pi is a W-bit value. Thus, p3=(11001), p2=(01100) and p1=(11000).
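Then we note that the set of results {q1, q2, q3, q4} is formed from the set of values {p1, p2, p3} according to: q4=F4(p1,p2,p3)=(10000), q3=F3(p1,p2,p3)=(00001), q2=F2(p1,p2,p3)=(11000) and q1=F1(p1,p2,p3)=(10001).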
The set of five outputs {y1, y2, y3, y4, y5} are formed from the set of four results {q1, q2, q3, q4} as discussed above, which results in y5=(1011), y4=(0010), y3=(0000), y2=(0000) and y1=(0101).
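The bit-sliced evaluation in this worked example can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the method: the table entries, the minterm form of F1 to F4 and the helper names (`distribute`, `separate`, `bit_not`) are assumptions of the sketch, chosen so that they agree with the input and output values quoted above.

```python
# Lookup table F (3-bit input -> 4-bit output). The entries are an assumption
# of this sketch, chosen to agree with the worked example in the text.
TABLE = [0b0000, 0b0011, 0b0000, 0b1111, 0b0101, 0b1011, 0b0001, 0b0010]

W = 5                 # number of inputs processed together
MASK = (1 << W) - 1   # keeps bit-wise NOT within W bits


def distribute(xs, m_bits=3):
    """Return [p1, ..., pM], where bit w of p_k is bit k of input x_w."""
    ps = []
    for k in range(m_bits):
        p = 0
        for w, x in enumerate(xs):
            p |= ((x >> k) & 1) << w
        ps.append(p)
    return ps


def bit_not(v):
    return ~v & MASK


# Sub-functions written as bit-wise minterm expansions of the table above.
def F4(p1, p2, p3):
    return (bit_not(p3) & p2 & p1) | (p3 & bit_not(p2) & p1)


def F3(p1, p2, p3):
    return (bit_not(p3) & p2 & p1) | (p3 & bit_not(p2) & bit_not(p1))


def F2(p1, p2, p3):
    return ((bit_not(p3) & bit_not(p2) & p1) | (bit_not(p3) & p2 & p1)
            | (p3 & bit_not(p2) & p1) | (p3 & p2 & p1))


def F1(p1, p2, p3):
    return ((bit_not(p3) & bit_not(p2) & p1) | (bit_not(p3) & p2 & p1)
            | (p3 & bit_not(p2) & bit_not(p1)) | (p3 & bit_not(p2) & p1)
            | (p3 & p2 & bit_not(p1)))


def separate(qs, count):
    """Return [y1, ..., yW], where bit k of y_w is bit w of result q_k."""
    ys = []
    for w in range(count):
        y = 0
        for k, q in enumerate(qs):
            y |= ((q >> w) & 1) << k
        ys.append(y)
    return ys


xs = [4, 0, 2, 7, 5]                       # x1, ..., x5
p1, p2, p3 = distribute(xs)
qs = [F(p1, p2, p3) for F in (F1, F2, F3, F4)]
ys = separate(qs, W)
print([format(y, "04b") for y in ys])      # ['0101', '0000', '0000', '0010', '1011']
```

All five lookups are carried out by the same handful of W-bit logical operations, which is the source of the performance benefit noted in section 2.1.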
The lookup table defined above in section 2.2 need not be implemented using the “general lookup table” concept set out above in section 2.1. There are many ways in which the method 300 of
Given a set of five inputs {x1, x2, x3, x4, x5}, where each input xi is a 3-bit value, define the distribution function D by:
p1=(bit 3 of x1)(bit 2 of x1)(bit 3 of x2)(bit 2 of x2)
p2=(bit 1 of x1)(bit 1 of x2)x3x4x5
Thus, given the set of five inputs {x1, x2, x3, x4, x5} that assume values x5=5=(101), x4=7=(111), x3=2=(010), x2=0=(000) and x1=4=(100), the set of values {p1, p2} is formed from the set of five inputs {x1, x2, x3, x4, x5} according to the distribution function D, so that p1=(1000) and p2=(00010111101).
Define the functions F1A, F1B, F1C, F2A, F2B, F3A, F3B, F3C, F3D and F3E as follows:
F1A(p1,p2)=(00) if (a) (p1∧(1100)=(0000)) or (b) (p1∧(1100)=(0100)) and (p2∧(10000000000)=(00000000000)) or (c) (p1∧(1100)=(1100))
F1A(p1,p2)=(01) if (p1∧(1100)=(1000)) and (p2∧(10000000000)=(00000000000))
F1A(p1,p2)=(10) if (p1∧(1100)=(1000)) and (p2∧(10000000000)=(10000000000))
F1A(p1,p2)=(11) if (p1∧(1100)=(0100)) and (p2∧(10000000000)=(10000000000))
F1B(p1,p2)=(00) if (a) (p1∧(0011)=(0000)) or (b) (p1∧(0011)=(0001)) and (p2∧(01000000000)=(00000000000)) or (c) (p1∧(0011)=(0011))
F1B(p1,p2)=(01) if (p1∧(0011)=(0010)) and (p2∧(01000000000)=(00000000000))
F1B(p1,p2)=(10) if (p1∧(0011)=(0010)) and (p2∧(01000000000)=(01000000000))
F1B(p1,p2)=(11) if (p1∧(0011)=(0001)) and (p2∧(01000000000)=(01000000000))
F1C(p2)=(00) if (a) (p2∧(00110000000)=(00000000000)) or (b) (p2∧(00111000000)=(00010000000)) or (c) (p2∧(00110000000)=(00110000000))
F1C(p2)=(01) if (p2∧(00111000000)=(00100000000))
F1C(p2)=(10) if (p2∧(00111000000)=(00101000000))
F1C(p2)=(11) if (p2∧(00111000000)=(00011000000))
F2A(p2)=(00) if (a) (p2∧(00000110000)=(00000000000)) or (b) (p2∧(00000111000)=(00000010000)) or (c) (p2∧(00000110000)=(00000110000))
F2A(p2)=(01) if (p2∧(00000111000)=(00000100000))
F2A(p2)=(10) if (p2∧(00000111000)=(00000101000))
F2A(p2)=(11) if (p2∧(00000111000)=(00000011000))
F2B(p2)=(00) if (a) (p2∧(00000000110)=(00000000000)) or (b) (p2∧(00000000111)=(00000000010)) or (c) (p2∧(00000000110)=(00000000110))
F2B(p2)=(01) if (p2∧(00000000111)=(00000000100))
F2B(p2)=(10) if (p2∧(00000000111)=(00000000101))
F2B(p2)=(11) if (p2∧(00000000111)=(00000000011))
F3A(p1,p2)=(00) if (a) (p1∧(1100)=(0000)) and (p2∧(10000000000)=(00000000000)) or (b) (p1∧(1100)=(0100)) and (p2∧(10000000000)=(00000000000))
F3A(p1,p2)=(01) if (a) (p1∧(1100)=(1000)) and (p2∧(10000000000)=(00000000000)) or (b) (p1∧(1100)=(1100)) and (p2∧(10000000000)=(00000000000))
F3A(p1,p2)=(10) if (a) (p1∧(1100)=(1100)) and (p2∧(10000000000)=(10000000000))
F3A(p1,p2)=(11) if (a) (p1∧(1100)=(0000)) and (p2∧(10000000000)=(10000000000)) or (b) (p1∧(1100)=(0100)) and (p2∧(10000000000)=(10000000000)) or (c) (p1∧(1100)=(1000)) and (p2∧(10000000000)=(10000000000))
F3B(p1,p2)=(00) if (a) (p1∧(0011)=(0000)) and (p2∧(01000000000)=(00000000000)) or (b) (p1∧(0011)=(0001)) and (p2∧(01000000000)=(00000000000))
F3B(p1,p2)=(01) if (a) (p1∧(0011)=(0010)) and (p2∧(01000000000)=(00000000000)) or (b) (p1∧(0011)=(0011)) and (p2∧(01000000000)=(00000000000))
F3B(p1,p2)=(10) if (a) (p1∧(0011)=(0011)) and (p2∧(01000000000)=(01000000000))
F3B(p1,p2)=(11) if (a) (p1∧(0011)=(0000)) and (p2∧(01000000000)=(01000000000)) or (b) (p1∧(0011)=(0001)) and (p2∧(01000000000)=(01000000000)) or (c) (p1∧(0011)=(0010)) and (p2∧(01000000000)=(01000000000))
F3C(p2)=(00) if (a) (p2∧(00111000000)=(00000000000)) or (b) (p2∧(00111000000)=(00010000000))
F3C(p2)=(01) if (a) (p2∧(00111000000)=(00100000000)) or (b) (p2∧(00111000000)=(00110000000))
F3C(p2)=(10) if (a) (p2∧(00111000000)=(00111000000))
F3C(p2)=(11) if (a) (p2∧(00111000000)=(00001000000)) or (b) (p2∧(00111000000)=(00011000000)) or (c) (p2∧(00111000000)=(00101000000))
F3D(p2)=(00) if (a) (p2∧(00000111000)=(00000000000)) or (b) (p2∧(00000111000)=(00000010000))
F3D(p2)=(01) if (a) (p2∧(00000111000)=(00000100000)) or (b) (p2∧(00000111000)=(00000110000))
F3D(p2)=(10) if (a) (p2∧(00000111000)=(00000111000))
F3D(p2)=(11) if (a) (p2∧(00000111000)=(00000001000)) or (b) (p2∧(00000111000)=(00000011000)) or (c) (p2∧(00000111000)=(00000101000))
F3E(p2)=(00) if (a) (p2∧(00000000111)=(00000000000)) or (b) (p2∧(00000000111)=(00000000010))
F3E(p2)=(01) if (a) (p2∧(00000000111)=(00000000100)) or (b) (p2∧(00000000111)=(00000000110))
F3E(p2)=(10) if (a) (p2∧(00000000111)=(00000000111))
F3E(p2)=(11) if (a) (p2∧(00000000111)=(00000000001)) or (b) (p2∧(00000000111)=(00000000011)) or (c) (p2∧(00000000111)=(00000000101))
Then define the sub-functions F1, F2 and F3 by:
F1(p1, p2)=F1A(p1, p2)F1B(p1, p2)F1C(p2), i.e. the concatenation of the outputs of F1A(p1, p2), F1B(p1, p2) and F1C(p2);
F2(p2)=F2A(p2)F2B(p2), i.e. the concatenation of the outputs of F2A(p2) and F2B(p2); and
F3(p1, p2)=F3A(p1, p2)F3B(p1, p2)F3C(p2)F3D(p2)F3E(p2), i.e. the concatenation of the outputs of F3A(p1, p2), F3B(p1, p2), F3C(p2), F3D(p2) and F3E(p2).
Then if q1=F1(p1, p2), q2=F2(p2) and q3=F3(p1, p2), then for p1=(1000) and p2=(00010111101) as derived above, q1=(010000), q2=(0010) and q3=(0100001011). The roles of the functions F1A, F1B, F1C, F2A, F2B, F3A, F3B, F3C, F3D and F3E relative to the results q1, q2 and q3 are illustrated schematically in
Define the separation function E as follows:
y1=(bit 6 of q1)(bit 5 of q1)(bit 10 of q3)(bit 9 of q3)
y2=(bit 4 of q1)(bit 3 of q1)(bit 8 of q3)(bit 7 of q3)
y3=(bit 2 of q1)(bit 1 of q1)(bit 6 of q3)(bit 5 of q3)
y4=(bit 4 of q2)(bit 3 of q2)(bit 4 of q3)(bit 3 of q3)
y5=(bit 2 of q2)(bit 1 of q2)(bit 2 of q3)(bit 1 of q3)
Then, based on q1=(010000), q2=(0010) and q3=(0100001011) as derived above, y1=(0101), y2=(0000), y3=(0000), y4=(0010) and y5=(1011).
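The alternative distribution function D, the sub-functions F1, F2 and F3, and the separation function E can be traced with a short Python sketch. For brevity, the case listings of F1A to F3E are condensed here into two small arrays, `HI` and `LO` (hypothetical names), holding the upper and lower two output bits for each 3-bit input; this condensed form is an assumption of the sketch and is intended to be equivalent to the listed cases, not a replacement for them.

```python
# Upper and lower two output bits for each 3-bit input, read off from the
# case listings of F1A (upper bits) and F3A (lower bits).
HI = [0b00, 0b00, 0b00, 0b11, 0b01, 0b10, 0b00, 0b00]
LO = [0b00, 0b11, 0b00, 0b11, 0b01, 0b11, 0b01, 0b10]


def D(x1, x2, x3, x4, x5):
    """Distribution function: p1 is 4 bits wide, p2 is 11 bits wide."""
    p1 = ((x1 >> 1) << 2) | (x2 >> 1)
    p2 = ((x1 & 1) << 10) | ((x2 & 1) << 9) | (x3 << 6) | (x4 << 3) | x5
    return p1, p2


# Each helper rebuilds one 3-bit input from the masked fields of p1 and p2.
def in1(p1, p2):
    return ((p1 & 0b1100) >> 1) | ((p2 >> 10) & 1)


def in2(p1, p2):
    return ((p1 & 0b0011) << 1) | ((p2 >> 9) & 1)


def in3(p2):
    return (p2 >> 6) & 0b111


def in4(p2):
    return (p2 >> 3) & 0b111


def in5(p2):
    return p2 & 0b111


def F1(p1, p2):   # F1A | F1B | F1C, 6 bits
    return (HI[in1(p1, p2)] << 4) | (HI[in2(p1, p2)] << 2) | HI[in3(p2)]


def F2(p2):       # F2A | F2B, 4 bits
    return (HI[in4(p2)] << 2) | HI[in5(p2)]


def F3(p1, p2):   # F3A | F3B | F3C | F3D | F3E, 10 bits
    return ((LO[in1(p1, p2)] << 8) | (LO[in2(p1, p2)] << 6)
            | (LO[in3(p2)] << 4) | (LO[in4(p2)] << 2) | LO[in5(p2)])


def E(q1, q2, q3):
    """Separation function: reassemble the five 4-bit outputs."""
    y1 = (((q1 >> 4) & 3) << 2) | ((q3 >> 8) & 3)
    y2 = (((q1 >> 2) & 3) << 2) | ((q3 >> 6) & 3)
    y3 = ((q1 & 3) << 2) | ((q3 >> 4) & 3)
    y4 = (((q2 >> 2) & 3) << 2) | ((q3 >> 2) & 3)
    y5 = ((q2 & 3) << 2) | (q3 & 3)
    return [y1, y2, y3, y4, y5]


p1, p2 = D(4, 0, 2, 7, 5)
q1, q2, q3 = F1(p1, p2), F2(p2), F3(p1, p2)
ys = E(q1, q2, q3)
print(format(p1, "04b"), format(p2, "011b"))   # 1000 00010111101
print([format(y, "04b") for y in ys])          # ['0101', '0000', '0000', '0010', '1011']
```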
Any computer-implemented function can be implemented as a lookup table (albeit a potentially large lookup table). Even if the output of the function is dependent on time, then the time value can be considered to be an input to the lookup table too. Thus, using the principles of section 2.1 above, any computer-implemented function F can be implemented using the method 300 of
A further example of the method 300 will be described below, where this example does not rely on the “lookup table principles” described in section 2.1 above. Thus, this example serves to show that, whilst any computer-implemented function can be implemented as a lookup table so that the “lookup table principles” described in section 2.1 above can be used to create a corresponding implementation in the form of the method 300, other (potentially more efficient) implementations of the function F in the form of the method 300 can be achieved via other routes.
Consider a finite impulse response (FIR) filter that is implemented by the function F. In particular, given a sequence (or set) of inputs x1, x2, . . . , the FIR filter generates a corresponding sequence (or set) of outputs yL, yL+1, . . . according to:
$$y_n=\sum_{i=0}^{L-1}\delta_i\,x_{n-i},\qquad n=L,\,L+1,\,\ldots$$
(where L is the length of the filter and δ0, δ1, . . . , δL−1 are the filter weights/taps). A specific example is used below, where L=3 and δ0=δ1=δ2=⅓, so that
$$y_n=\tfrac{1}{3}\left(x_n+x_{n-1}+x_{n-2}\right),\qquad n=3,\,4,\,\ldots$$
although it will be appreciated that other example FIR filters could be implemented analogously.
Given a set of inputs {x1, x2, x3, x4, x5} (so W=5), a set of values {p1, p2, p3} is formed (so M=3). In particular, value pj is formed by concatenating: (a) one or more first spacer 0-bits; (b) input xj; (c) one or more second spacer 0-bits; (d) input xj+1; (e) one or more third spacer 0-bits; and (f) input xj+2. This is illustrated schematically in
From the set of values {p1, p2, p3}, a set of results {q1} is formed (so N=1). In particular, q1=F1(p1, p2, p3)=(p1+p2+p3)/3.
From the set of results {q1}, a set of outputs {y3, y4, y5} is formed (so V=3). In particular, the result q1 comprises outputs y3, y4, y5, where y3 occupies the space/part in q1 that corresponds to the space/part in p1 that was occupied by x1, y4 occupies the space/part in q1 that corresponds to the space/part in p1 that was occupied by x2, and y5 occupies the space/part in q1 that corresponds to the space/part in p1 that was occupied by x3. This is illustrated schematically in
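The packing of several inputs into one value, so that a single addition computes all three window sums at once, can be sketched as below. Several details are assumptions of this sketch rather than part of the description: the inputs are taken to be 8-bit, each lane is 12 bits wide (so the spacer 0-bits absorb carries), and the division by 3 is applied to each lane after unpacking so that a remainder in one lane cannot spill into its neighbour. The names `LANE`, `pack` and `unpack` are hypothetical.

```python
LANE = 12                      # lane width in bits: 8 data bits plus spacer 0-bits
LANE_MASK = (1 << LANE) - 1


def pack(a, b, c):
    """Concatenate three inputs, each preceded by spacer 0-bits."""
    return (a << (2 * LANE)) | (b << LANE) | c


def unpack(v):
    """Split a packed value back into its three lanes (highest lane first)."""
    return [(v >> (2 * LANE)) & LANE_MASK, (v >> LANE) & LANE_MASK, v & LANE_MASK]


xs = [30, 60, 90, 120, 150]    # x1, ..., x5 (W=5)

# p_j holds x_j, x_{j+1}, x_{j+2} in its three lanes (M=3).
p1, p2, p3 = (pack(xs[j], xs[j + 1], xs[j + 2]) for j in range(3))

# One addition of packed values yields all three window sums.
lane_sums = unpack(p1 + p2 + p3)

# Per-lane division (an assumption of this sketch) gives y3, y4, y5.
ys = [s // 3 for s in lane_sums]
print(ys)                      # [60, 90, 120]
```

Because three 8-bit inputs sum to at most 765, which fits in 12 bits, no lane overflows into the next one in this sketch.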
As a slight modification of this example, given a set of inputs {x1, x2, . . . , x8} (so W=8), a set of values {p1, p2, p3, p4} is formed (so M=4). In particular, value pj is formed by concatenating: (a) one or more first spacer 0-bits; (b) input xj; (c) one or more second spacer 0-bits; (d) input xj+2; (e) one or more third spacer 0-bits; and (f) input xj+4. This is illustrated schematically in
From the set of values {p1, p2, p3, p4}, a set of results {q1, q2} is formed (so N=2). In particular, q1=F1(p1, p2, p3)=(p1+p2+p3)/3 and q2=F2(p2, p3, p4)=(p2+p3+p4)/3.
From the set of results {q1, q2}, a set of outputs {y3, y4, . . . , y8} is formed (so V=6). In particular, the result q1 comprises outputs y3, y5, y7, where y3 occupies the space/part in q1 that corresponds to the space/part in p1 that was occupied by x1, y5 occupies the space/part in q1 that corresponds to the space/part in p1 that was occupied by x3, and y7 occupies the space/part in q1 that corresponds to the space/part in p1 that was occupied by x5; the result q2 comprises outputs y4, y6, y8, where y4 occupies the space/part in q2 that corresponds to the space/part in p1 that was occupied by x1, y6 occupies the space/part in q2 that corresponds to the space/part in p1 that was occupied by x3, and y8 occupies the space/part in q2 that corresponds to the space/part in p1 that was occupied by x5. This is illustrated schematically in
Example 1, as set out in section 2.1 above, illustrated a function F that implements a single look-up table. In other words, for each input xi i=1, . . . , W, an output yi is generated by setting yi to be the result of looking-up xi in a lookup table. In Example 1, the same lookup table was used regardless of the value of the index i. Suppose, instead, that for each input xi i=1, . . . , W, an output yi is generated by setting yi to be the result of looking-up xi in a corresponding lookup table LTi, where the lookup tables LTi may vary based on the index i, i.e. there may be indices i and j for which LTi≠LTj. Thus yi=F(xi)=LTi(xi).
Again, one can assume that the input to a lookup table LTi is an input xi that is an M-bit value, where the kth bit of xi is bi,k so that the binary representation of xi is bi,Mbi,M-1 . . . bi,2bi,1, (so that bi,k is 0 or 1 for k=1, . . . , M). In the example described below, M=8, so that the binary representation of xi is bi,8bi,7bi,6bi,5bi,4bi,3bi,2bi,1. Assume also that the output from the lookup table LTi (i.e. the value looked-up in response to receiving the input xi) is an output yi that is an N-bit value, where the kth bit of yi is ci,k so that the binary representation of yi is ci,Nci,N-1 . . . ci,2ci,1, (so that ci,k is 0 or 1 for k=1, . . . , N). In the example described below, N=8, so that the binary representation of yi is ci,8ci,7ci,6ci,5ci,4ci,3ci,2ci,1. As with Example 1, this is schematically illustrated in
It will be appreciated that each of the output bits ci,k can be calculated or expressed as a respective logical expression Bi,k applied to the bits of the input xi i.e. ci,k=Bi,k(bi,1, bi,2, . . . , bi,M). Again, as with Example 1, this is schematically illustrated in
Thus, the lookup table LTi may be considered to be implemented by the functions Bi,1, . . . , Bi,N, so that given an input xi=bi,Mbi,M-1 . . . bi,2bi,1, the corresponding output yi=ci,Nci,N-1 . . . ci,2ci,1 is defined by ci,k=Bi,k(bi,1, bi,2, . . . , bi,M) for k=1, . . . , N and i=1, . . . , W.
As with Example 1, given a set of inputs {x1, x2, . . . , xW}, a set of M values {p1, . . . , pM} may be obtained or generated, where each value pk is a W-bit value, where the wth bit of value pk is the kth bit of input xw (for w=1, . . . , W and k=1, . . . , M). In other words, the set of inputs {x1, x2, . . . , xW} may be expressed or represented as a set of M values {p1, . . . , pM}. This is illustrated schematically in
Thus, in this example, the distribution function D used at the step 304 of
Functions Fk (k=1, . . . , N) that calculate, respectively the results qk (k=1, . . . , N) may be defined as follows. For k=1, . . . , N, and for v=1, . . . , V, the vth bit of result qk=Fk(p1, . . . , pM) is defined as Bv,k(p1,v, p2,v, . . . , pM,v), where pi,j is the jth bit of value pi for i=1, . . . , M and j=1, . . . , W.
Consider the situation in which W=4, M=2 and the function F is defined as follows: yi=F(xi)=xi+i mod(4). Then the function F can be considered as implementing 4 lookup tables LTi i=1, . . . , 4, namely:
F(x1)=x1+1 mod(4)=LT1(x1), so that LT1 is defined as the table 0→1, 1→2, 2→3, 3→0;
F(x2)=x2+2 mod(4)=LT2(x2), so that LT2 is defined as the table 0→2, 1→3, 2→0, 3→1;
F(x3)=x3+3 mod(4)=LT3(x3), so that LT3 is defined as the table 0→3, 1→0, 2→1, 3→2;
F(x4)=x4+4 mod(4)=LT4(x4), so that LT4 is defined as the table 0→0, 1→1, 2→2, 3→3.
One can then use the procedure set out in Example 6 to define values {p1, p2} and functions F1 and F2 that will determine results {q1, q2}.
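A minimal Python sketch of this per-index construction is given below for W=4 and M=N=2. It evaluates each bit function Bv,k by consulting the table LTv directly instead of writing it out as a Boolean expression, which is a simplification made for illustration; the helper names `B`, `distribute`, `F` and `separate` are hypothetical.

```python
W, M, N = 4, 2, 2

# One lookup table per input index i: LT_i(x) = (x + i) mod 4.
LT = {i: [(x + i) % 4 for x in range(4)] for i in range(1, W + 1)}


def B(i, k, bits):
    """Bit k (1-based) of LT_i applied to the input whose bits are bits[0..M-1]."""
    x = sum(b << pos for pos, b in enumerate(bits))
    return (LT[i][x] >> (k - 1)) & 1


def distribute(xs):
    """Return [p1, ..., pM], where bit w of p_k is bit k of input x_w."""
    return [sum(((x >> k) & 1) << w for w, x in enumerate(xs)) for k in range(M)]


def F(k, ps):
    """Result q_k: its v-th bit is B_{v,k} applied to the v-th bits of p1..pM."""
    q = 0
    for v in range(1, W + 1):
        bits = [(p >> (v - 1)) & 1 for p in ps]
        q |= B(v, k, bits) << (v - 1)
    return q


def separate(qs):
    """Return [y1, ..., yW], where bit k of y_w is bit w of q_k."""
    return [sum(((q >> w) & 1) << k for k, q in enumerate(qs)) for w in range(W)]


xs = [3, 1, 0, 2]                          # x1, ..., x4
ps = distribute(xs)
qs = [F(k, ps) for k in range(1, N + 1)]
ys = separate(qs)
print(ys)                                  # [0, 3, 3, 2]
```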
Consider the situation in which W=V=2, M=N=3 and the function F operates on inputs x1 and x2 (whose binary representations are (x1,3x1,2x1,1) and (x2,3x2,2x2,1) respectively) to yield outputs y1 and y2 (whose binary representations are (y1,3y1,2y1,1) and (y2,3y2,2y2,1) respectively) according to the affine transformation Y=F(x1, x2)=MX+B, where
M is a 6×6 binary matrix, and B is a 6×1 binary matrix, where addition is modulo 2. As an example, let
so that
y1,1=x1,1⊕x2,1⊕x2,2
y1,2=x2,2⊕x2,3
y1,3=x1,1⊕x1,3⊕1
y2,1=x1,2⊕x1,3
y2,2=x2,2⊕1
y2,3=x1,1⊕x2,3⊕1
This is another example of where the predetermined function F applies different processing to obtain y1 from one or more of the inputs (here, x1 and x2) than it applies to obtain y2 from those inputs.
Define values p1, p2 and p3 as values whose binary representations are: p3=(p3,2p3,1)=(x1,3x2,2), p2=(p2,3p2,2p2,1)=(x1,2x2,1x1,3), p1=(p1,2p1,1)=(x1,1x2,3). This defines the distribution function D.
Define results q1, . . . , q6 as q1=y1,1, q2=y1,2, q3=y1,3, q4=y2,1, q5=y2,2, q6=y2,3. This defines the separation function E.
Then functions F1, . . . , F6 can be defined as follows:
q1=F1(p1,p2,p3)=p1,2⊕p2,2⊕p3,1
q2=F2(p1,p3)=p3,1⊕p1,1
q3=F3(p1,p3)=p1,2⊕p3,2⊕1
q4=F4(p2)=p2,3⊕p2,1
q5=F5(p2,p3)=p3,1⊕1 or p3,2⊕p2,1⊕p3,1⊕1
q6=F6(p1)=p1,2⊕p1,1⊕1
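The consistency of the sub-functions F1, . . . , F6 with the six output equations for y1 and y2 can be checked exhaustively over all 64 input combinations. The following Python sketch does so; representing each multi-bit quantity as a dictionary keyed by bit position is an assumption made purely for readability.

```python
from itertools import product


def direct(x1, x2):
    """Outputs computed straight from the six equations for y1 and y2."""
    y1 = {1: x1[1] ^ x2[1] ^ x2[2], 2: x2[2] ^ x2[3], 3: x1[1] ^ x1[3] ^ 1}
    y2 = {1: x1[2] ^ x1[3], 2: x2[2] ^ 1, 3: x1[1] ^ x2[3] ^ 1}
    return y1, y2


def D(x1, x2):
    """Distribution function: bits of x1 and x2 are spread over p1, p2, p3."""
    p1 = {2: x1[1], 1: x2[3]}
    p2 = {3: x1[2], 2: x2[1], 1: x1[3]}
    p3 = {2: x1[3], 1: x2[2]}
    return p1, p2, p3


def sub_functions(p1, p2, p3):
    """Results q1..q6, plus the alternative form of F5 given in the text."""
    q1 = p1[2] ^ p2[2] ^ p3[1]
    q2 = p3[1] ^ p1[1]
    q3 = p1[2] ^ p3[2] ^ 1
    q4 = p2[3] ^ p2[1]
    q5 = p3[1] ^ 1
    q5_alt = p3[2] ^ p2[1] ^ p3[1] ^ 1
    q6 = p1[2] ^ p1[1] ^ 1
    return (q1, q2, q3, q4, q5, q6), q5_alt


def E(qs):
    """Separation function: q1..q3 form y1 and q4..q6 form y2."""
    q1, q2, q3, q4, q5, q6 = qs
    return {1: q1, 2: q2, 3: q3}, {1: q4, 2: q5, 3: q6}


def check_all():
    for bits in product((0, 1), repeat=6):
        x1 = {1: bits[0], 2: bits[1], 3: bits[2]}
        x2 = {1: bits[3], 2: bits[4], 3: bits[5]}
        qs, q5_alt = sub_functions(*D(x1, x2))
        if E(qs) != direct(x1, x2) or q5_alt != qs[4]:
            return False
    return True


print(check_all())   # True
```

The two forms of F5 agree because p3,2 and p2,1 both carry the bit x1,3, so their XOR cancels.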
Embodiments of the invention aim to be able to execute code, that implements the function F, securely in a so-called white-box environment. A “white box environment” is an execution environment in which a person can execute an amount of computer code (or software)—where the code implements the function F—and the person may inspect and modify the code (or be assumed to know the underlying algorithm that is being implemented) and/or, during execution of the code, the person may inspect and modify the values of data being used (i.e. the contents of the memory being used), the data flow and the process flow (or order of execution of instructions in the code). Embodiments of the invention therefore aim to be able to provide or generate code (that implements the function F) such that, even if the code is executed in a white-box environment, the person executing the code cannot determine the values of inputs to the function F and/or outputs of the function F and/or secret information used by the function F.
In the following, one or more bijective functions (or transformations or transforms) will be used. A bijective function is a function that is injective (i.e. is a 1-to-1 mapping) and that is surjective (i.e. maps onto the whole of a particular range of values). If the domain of possible input values for the function T is domain Dom, and if the function T is an injective function (so that T(a)=T(b) if and only if a=b), then T is a bijective function from Dom onto the range T(Dom)={T(a): aϵDom}.
An initial simple example will help understand how the use of bijective functions T can help achieve the above aim. In this example, the bijective functions T are linear transformations in a Galois field GF(Ψ^n) for some prime number Ψ and positive integer n, i.e. T: GF(Ψ^n)→GF(Ψ^n). For example, if the processor executing the code uses Z-bit registers for its data (e.g. Z=32), then a Z-bit number may be viewed as an element of the Galois field GF(2^Z), so that Ψ=2 and n=Z.
Consider a predetermined function G that operates on elements s1 and s2 in the Galois field GF(Ψ^n) according to r=G(s1, s2)=s1+s2, where + is addition in the Galois field GF(Ψ^n). When Ψ=2, as in the register example above and as assumed in the remainder of this example, the addition s1+s2 is the same as an XOR operation, so that r=G(s1, s2)=s1⊕s2, and subtraction coincides with addition. Let s1*, s2* and r* be transformed versions of s1, s2 and r according to respective linear transformations T1, T2 and T3 in the Galois field GF(Ψ^n), so that s1*=T1(s1)=a·s1+b, s2*=T2(s2)=c·s2+d and r*=T3(r)=e·r+f for arbitrary non-zero constants a, c, and e in the Galois field GF(Ψ^n), and arbitrary constants b, d and f in the Galois field GF(Ψ^n) (so that constants a, c, and e may be randomly chosen from GF(Ψ^n)\{0} and constants b, d, and f may be randomly chosen from GF(Ψ^n)). Then, since s1=a^−1·(s1*+b) and s2=c^−1·(s2*+d), r*=e·(s1+s2)+f=e·(a^−1·(s1*+b)+c^−1·(s2*+d))+f=g·s1*+h·s2*+i, where g=e·a^−1, h=e·c^−1 and i=e·(a^−1·b+c^−1·d)+f.
Thus, given the transformed versions s1*=T1(s1) and s2*=T2(s2) of the inputs s1 and s2, it is possible to calculate the transformed version r*=T3(r) of the result r without having to remove any of the transformations (i.e. without having to derive s1 and/or s2 from the versions s1* and s2*). In particular, having defined the transformations T1, T2 and T3 by their respective parameters (a and b for T1, c and d for T2, e and f for T3), a transformed version G* of the function G can be implemented according to G*(s1*, s2*)=g·s1*+h·s2*+i, where g=e·a^−1, h=e·c^−1 and i=e·(a^−1·b+c^−1·d)+f, so that r*=G*(s1*, s2*) can be calculated without determining/revealing s1 or s2 as an intermediate step in the processing. The result r can then be obtained from the transformed version r*=G*(s1*, s2*) of the result r, as r=e^−1·(r*+f); thus, a linear transformation T4 (which is the inverse of T3) can be used to obtain the result r from the transformed version r*, where r=T4(r*)=e^−1·r*+e^−1·f. Alternatively, the transformed version r* of the result r could be an input to a subsequent function. In other words, given the function G that operates on inputs s1 and s2 to produce a result r, if transformations T1, T2 and T3 are specified (e.g. randomly, by choosing the parameters for the transformations randomly, or based on some other parameters/data), then a transformed version G* of the function G can be generated/implemented, where the function G* operates on transformed inputs s1*=T1(s1) and s2*=T2(s2) to produce a transformed result r*=T3(r) according to r*=g·s1*+h·s2*+i. If a person implements the function G* in a white-box environment, then that person cannot identify what operation the underlying function G is performing, nor can the person determine the actual result r nor the inputs s1 and s2 (since these values are never revealed when performing the function G*).
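The relationship between G and its transformed version G* can be illustrated with a small Python sketch. The sketch works in GF(2^8) rather than GF(2^32) to keep it short; the field size, the reduction polynomial 0x11B and the particular constants a to f are all assumptions chosen for illustration, and the helper names `gf_mul` and `gf_inv` are hypothetical.

```python
def gf_mul(x, y, poly=0x11B):
    """Multiply two elements of GF(2^8); the reduction polynomial is an assumption."""
    result = 0
    while y:
        if y & 1:
            result ^= x
        x <<= 1
        if x & 0x100:
            x ^= poly
        y >>= 1
    return result


def gf_inv(x):
    """Multiplicative inverse of a non-zero element, computed as x^254."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, x)
    return result


# Transformation parameters (arbitrary for this sketch; a, c and e are non-zero).
a, b = 0x53, 0x1F
c, d = 0xCA, 0x07
e, f = 0x3D, 0xA4


def T1(s):
    return gf_mul(a, s) ^ b


def T2(s):
    return gf_mul(c, s) ^ d


def T3(r):
    return gf_mul(e, r) ^ f


# Constants of G*, derived once from the transformation parameters.
g = gf_mul(e, gf_inv(a))
h = gf_mul(e, gf_inv(c))
i = gf_mul(e, gf_mul(gf_inv(a), b) ^ gf_mul(gf_inv(c), d)) ^ f
E_INV = gf_inv(e)


def G_star(s1_t, s2_t):
    """Transformed version of G(s1, s2) = s1 XOR s2; never recovers s1 or s2."""
    return gf_mul(g, s1_t) ^ gf_mul(h, s2_t) ^ i


def T4(r_t):
    """Inverse of T3, used only where the plain result is needed."""
    return gf_mul(E_INV, r_t ^ f)


s1, s2 = 0x6B, 0xE2
r_t = G_star(T1(s1), T2(s2))
print(r_t == T3(s1 ^ s2))    # True
print(T4(r_t) == s1 ^ s2)    # True
```

Only g, h and i need to appear in the deployed code; the individual parameters a to f can be discarded once those constants have been computed.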
Note that it is possible for one or both of T1 and T2 to be the identity transformation (i.e. T1 is the identity transformation if T1(s1)=s1 for all values of s1, so that a=1 and b=0 in the above example, and T2 is the identity transformation if T2(s2)=s2, so that c=1 and d=0 in the above example). If this is the case, then the person implementing the function G* can identify the value assumed by the input s1 (if T1 is the identity transformation) and/or the value assumed by the input s2 (if T2 is the identity transformation). However, so long as T3 is not the identity transformation, then that person cannot identify what operation the underlying function G is performing, nor can the person determine the actual result r.
Similarly, it is possible for T3 to be the identity transformation (i.e. T3 is the identity transformation if T3(r)=r for all values of r, so that e=1 and f=0 in the above example). If this is the case, then the person implementing the function G* can identify the value assumed by the output r. However, so long as one or both of T1 and T2 are not the identity transformation, then that person cannot identify what operation the underlying function G is performing, nor can the person determine one or both of the initial inputs s1 and s2.
It will be appreciated that other functions G could be implemented as a corresponding “transformed version” G*, where the input(s) to the function G* are transformed versions of the input(s) to the function G according to respective injective (1-to-1) transformations and the output(s) of the function G* are transformed versions of the output(s) of the function G according to respective injective transformations. The transformations need not necessarily be linear transformations as set out above, but could be any other kind of injective transformation. Thus, given a function G that has u inputs s1, . . . , su and v outputs r1, . . . , rv, a transformed version G* of the function G can be implemented, where G* has transformed versions s1*, . . . , su* of the inputs s1, . . . , su as its input and outputs transformed versions r1*, . . . , rv* of the outputs r1, . . . , rv, where si*=Ti(si) and ri*=Ti+u(ri) for injective functions T1, . . . , Tu+v. It is possible that two or more of the functions Ti might be the same as each other. The fact that this can be done for any function G is discussed below.
As set out below, the XOR operation, along with conditional branching on constants, forms a system which is Turing complete. This means that any mathematical function can be implemented using only (a) zero or more XOR operations and (b) zero or more conditional branchings on constants.
A Turing machine is a notional device that manipulates symbols on a strip of tape according to a table of rules. Despite its simplicity, a Turing machine can be adapted to simulate the logic of any computer algorithm. The Turing machine mathematically models a machine that mechanically operates on a tape. On this tape are symbols which the machine can read and write, one at a time, using a tape head. Operation is fully determined by a finite set of elementary instructions such as “in state 42, if the symbol seen is 0, write a 1; if the symbol seen is 1, change into state 17; in state 17, if the symbol seen is 0, write a 1 and change to state 6” etc. More precisely, a Turing machine consists of:
Turing machines are very well-known and shall, therefore, not be described in more detail herein.
If it can be shown that any possible 5-tuple in the action table can be implemented using the XOR operation and conditional branching on constants, then we know that a processing system based on the XOR operation and conditional branching on constants is Turing complete (since any function or computer program can be implemented or modelled as a Turing machine, and all of the 5-tuples in the action table of that Turing machine can be implemented using the XOR operation and conditional branching on constants).
Consider the following mappings between the elements in the Turing machine and those in a system that uses only XORs and conditional branching on constants:
The following pseudo-code shows a typical state implementation (for the state with identifier “i”); where values X1, X2, . . . , Xq are constants and “Addr” is the pointer to a memory location. The example shown below illustrates the three possibilities of incrementing, decrementing and not-changing the address “Addr” variable.
Thus, any possible 5-tuple in the action table can be implemented using the XOR operation and conditional branching. Hence, a system based on the XOR operation and conditional branching is Turing complete, i.e. any Turing machine can be implemented using only XORs (for point (e) above) and conditional jumps (for point (b) above).
As shown above, it is possible to perform an operation in the transformed domain (via the function G*) that is equivalent to r=s1⊕s2 without ever removing the transformations on r*, s1* or s2*. A conditional jump is implemented using the capabilities of the programming language. This means that it is possible to implement any mathematical operation in the transformed domain without ever removing the transformations on the data elements being processed. In other words, given any function G that has u inputs s1, . . . , su (u≥1) and v outputs r1, . . . , rv (v≥1), a transformed version G* of the function G can be implemented, where G* is a function that has transformed versions s1*, . . . , su* of the inputs s1, . . . , su as its input(s) and that outputs transformed versions r1*, . . . , rv* of the output(s) r1, . . . , rv, where si*=Ti(si) and ri*=Ti+u(ri) for injective functions T1, . . . , Tu+v. It is possible that two or more of the functions Ti might be the same as each other. As set out above, the injective functions T1, . . . , Tu+v may be defined (e.g. randomly generated injective functions), and, given the particular injective functions T1, . . . , Tu+v that are defined, a particular transformed version G* of the function G results (or is defined/obtained/implemented).
The use of bijective functions T to obfuscate the implementation of a predetermined function, and the various methods of such use, are well-known in this field of technology—see, for example: “White-Box Cryptography and an AES Implementation”, by Stanley Chow, Philip Eisen, Harold Johnson, and Paul C. Van Oorschot, in Selected Areas in Cryptography: 9th Annual International Workshop, SAC 2002, St. John's, Newfoundland, Canada, Aug. 15-16, 2002; “A White-Box DES Implementation for DRM Applications”, by Stanley Chow, Phil Eisen, Harold Johnson, and Paul C. van Oorschot, in Digital Rights Management: ACM CCS-9 Workshop, DRM 2002, Washington, D.C., USA, Nov. 18, 2002; U.S. 61/055,694; WO2009/140774; U.S. Pat. Nos. 6,779,114; 7,350,085; 7,397,916; 6,594,761; and 6,842,862, the entire disclosures of which are incorporated herein by reference.
As will be described in more detail below, embodiments of the invention relate to a predetermined function F. Some embodiments relate to obfuscated performance (or execution or running) of the function F. The function F may be performed, for example, by the system 100 (for example by the processor 108 executing a computer program that implements, amongst other things, the function F). Other embodiments relate to configuring a processor to implement the function F in an obfuscated manner (such as arranging the processor 108 to execute a suitable computer program). Embodiments of the invention aim to be able to execute code, that implements the function F, securely in a white-box environment.
The function F is a function as described in section 2 above, namely one for which one or more corresponding functions (referred to herein as “sub-functions”) F1, . . . , FN can be defined so that, for a set of inputs {x1, . . . , xW} for the function F, a set of outputs {y1, . . . , yV} from the function F that corresponds to the set of inputs {x1, . . . , xW} may be generated by:
(a) representing the set of inputs {x1, . . . , xW} as a corresponding set of values {p1, . . . , pM}, wherein each value pj (j=1, . . . , M) comprises at least part of each input of a corresponding plurality of the inputs;
(b) generating a set of one or more results {q1, . . . , qN} from the set of values {p1, . . . , pM}, by applying each sub-function Fj (j=1, . . . , N) to a corresponding set of one or more values in the set of values {p1, . . . , pM} to generate a respective result qj; and
(c) forming each output yi as either a part of a corresponding one of the results or as a combination of at least part of each result of a corresponding plurality of the results.
Thus, the function F is a function that can be implemented according to the method 300 of
To recap, at the step 306 of the method 300, the set of values {p1, . . . , pM} was processed, using the sub-functions F1, . . . , FN, so as to generate the set of results {q1, . . . , qN}. In embodiments of the invention, transformed versions of the functions F1, . . . , FN are used instead of the functions F1, . . . , FN.
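In particular, as discussed above, for each j=1, . . . , N, there is a corresponding set of $n_j$ distinct indices $\{\beta_{j,1},\ldots,\beta_{j,n_j}\}$ (with $1\le\beta_{j,k}\le M$) such that the result $q_j$ is calculated from the $n_j$ values $p_{\beta_{j,1}},\ldots,p_{\beta_{j,n_j}}$ according to a predetermined function $F_j$, i.e.
$$q_j=F_j\left(p_{\beta_{j,1}},\ldots,p_{\beta_{j,n_j}}\right).$$
Therefore, at the step 802, for each j=1, . . . , N, transformed versions $p^*_{\beta_{j,1}},\ldots,p^*_{\beta_{j,n_j}}$ of the $n_j$ values $p_{\beta_{j,1}},\ldots,p_{\beta_{j,n_j}}$ are obtained, using respective injective transforms $T_{j,1},\ldots,T_{j,n_j}$, so that $p^*_{\beta_{j,k}}=T_{j,k}(p_{\beta_{j,k}})$ for $k=1,\ldots,n_j$.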
It will be appreciated that all of the transforms Tj,k may be different from each other for j=1, . . . , N and k=1, . . . , nj. However, it will be appreciated that some or all of the transforms Tj,k may be the same as each other for j=1, . . . , N and k=1, . . . , nj. For example, there may be a single transform T such that Tj,k=T for j=1, . . . , N and k=1, . . . , nj. Similarly, there may be M different transforms T′1, . . . , T′M, so that Tj,k=T′i if βj,k=i, i.e. each value pi is only transformed by one transform T′i. In some embodiments, at least one of the values pi is transformed by two or more different transforms to obtain two or more corresponding different transformed versions of the value pi. Other embodiments may make use of different mixes/combinations of transforms.
As discussed above in section 3, the transforms Tj,k (j=1, . . . , N and k=1, . . . , nj) may be any injective functions (so that Tj,k is a 1-1 function over the domain of the possible values that $p_{\beta_{j,k}}$ may assume). In some embodiments, some or all of the transforms Tj,k (j=1, . . . , N and k=1, . . . , nj) are linear transformations (such as those set out in section 3 above), but it will be appreciated that this need not be the case. Each transform Tj,k (j=1, . . . , N and k=1, . . . , nj) is predetermined, and may be defined, for example, by randomly selecting one or more parameters that define the transform or based on other data/parameters; for example, if transform Tj,k is a linear transform so that pβ
At the step 804, transformed versions F1*, . . . , FN* of the sub-functions F1, . . . , FN are used to generate a set of transformed results {q1*, . . . , qN*}. In particular,
$$q_j^*=F_j^*\left(p^*_{\beta_{j,1}},\ldots,p^*_{\beta_{j,n_j}}\right)$$
for j=1, . . . , N.
At the step 806, the set of results {q1, . . . , qN} is obtained from the set of transformed results {q1*, . . . , qN*}. In particular, for j=1, . . . , N, the result qj is calculated by applying a transform $\tilde{T}_j$ to the transformed result qj*, so that $q_j=\tilde{T}_j(q_j^*)$. It will be appreciated that all of the transforms $\tilde{T}_j$ may be different from each other, for j=1, . . . , N. However, it will be appreciated that some or all of the transforms $\tilde{T}_j$ may be the same as each other, for j=1, . . . , N.
As discussed above in section 3, the transforms $\tilde{T}_j$ (j=1, . . . , N) may be any injective functions (so that $\tilde{T}_j$ is a 1-1 function over the domain of the possible values that qj* may assume). In some embodiments, one or more of the transforms $\tilde{T}_j$ (j=1, . . . , N) are linear transformations (such as those set out in section 3 above), but it will be appreciated that this need not be the case. Each transform $\tilde{T}_j$ (j=1, . . . , N) is predetermined, and may be defined, for example, by randomly selecting one or more parameters that define the transform or based on other data/parameters; for example, if transform $\tilde{T}_j$ is a linear transform so that $q_j=\tilde{T}_j(q_j^*)=a_jq_j^*+b_j$ for a non-zero constant aj and a constant bj, then aj and bj may be randomly chosen (prior to performing the method 800).
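The steps 802, 804 and 806 can be sketched in Python for a single sub-function acting on two 4-bit values. Everything specific in this sketch is an assumption made for illustration: the sub-function `F` is hypothetical, the transforms are random permutations (one simple kind of bijection), and the transformed sub-function is precomputed as a table at the time the transforms are chosen.

```python
import random

SIZE = 16                      # values and results are 4-bit in this sketch
rng = random.Random(0)         # fixed seed so the sketch is repeatable


def random_bijection():
    """Return a random permutation of range(SIZE), used as a bijective transform."""
    perm = list(range(SIZE))
    rng.shuffle(perm)
    return perm


def invert(perm):
    inverse = [0] * SIZE
    for source, image in enumerate(perm):
        inverse[image] = source
    return inverse


def F(u, v):
    """Hypothetical sub-function on two 4-bit values (not from the description)."""
    return (u & ~v) & 0xF


# Input transforms T1 and T2, and the transform T_out that maps q* back to q.
T1, T2, T_out = random_bijection(), random_bijection(), random_bijection()
T1_inv, T2_inv, T_out_inv = invert(T1), invert(T2), invert(T_out)

# Transformed sub-function F*, built once when the transforms are fixed, so that
# T_out[F_star[T1[u]][T2[v]]] == F(u, v) for every pair of inputs.
F_star = [[T_out_inv[F(T1_inv[a], T2_inv[b])] for b in range(SIZE)]
          for a in range(SIZE)]


def method_800(p1, p2):
    p1_t, p2_t = T1[p1], T2[p2]     # step 802: obtain transformed values
    q_t = F_star[p1_t][p2_t]        # step 804: apply the transformed sub-function
    return T_out[q_t]               # step 806: recover the result


print(method_800(0b1101, 0b0110) == F(0b1101, 0b0110))   # True
```

At run time, step 804 touches only transformed data; the plain values appear only before step 802 and after step 806, either of which may be carried out by a different entity or module.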
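The initial sub-function $F_j$ (j=1, . . . , N) has $n_j$ inputs $p_{\beta_{j,1}},\ldots,p_{\beta_{j,n_j}}$ and an output $q_j$. The transformed version $F_j^*$ correspondingly has $n_j$ inputs $p^*_{\beta_{j,1}},\ldots,p^*_{\beta_{j,n_j}}$ and an output $q_j^*$, where $p^*_{\beta_{j,k}}=T_{j,k}(p_{\beta_{j,k}})$ for $k=1,\ldots,n_j$ and $q_j=\tilde{T}_j(q_j^*)$.
Thus, for j=1, . . . , N, the result qj is effectively calculated as:
$$q_j=\tilde{T}_j\left(F_j^*\left(T_{j,1}(p_{\beta_{j,1}}),\ldots,T_{j,n_j}(p_{\beta_{j,n_j}})\right)\right).$$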
The transformed version Fj* of the function Fj may be determined at the time that the transforms Tj,1, . . . , Tj,n
Note that in some embodiments of the invention, the step 302 is optional in the method 800. In particular, when an entity or a module performs or implements the function F, that entity or module may receive or obtain the set of values {p1, . . . , pM} instead of receiving or obtaining the set of inputs {x1, . . . , xW} at the step 302. For example, a first entity or module may perform or implement the function F in an obfuscated manner as set out above, and a second entity or module may be arranged to determine the set of values {p1, . . . , pM} from the set of inputs {x1, . . . , xW} and then provide the set of values {p1, . . . , pM} to the first entity or module. In such cases, the first entity or module performs the method 800 without the optional step 302.
Additionally, the step 304 is optional in the method 800. In particular, when an entity or a module performs or implements the function F, that entity or module may receive or obtain the set of transformed values (for example, from a second entity or module).
This can arise in a number of ways. For example, the second entity or module may obtain the set of inputs {x1, . . . , xW} and derive the set of values {p1, . . . , pM} (or may simply obtain the set of values {p1, . . . , pM}) and may then determine the set of transformed values.
Similarly, in some embodiments of the invention, the step 308 is optional in the method 800. In particular, when an entity or a module performs or implements the function F, that entity or module may simply output the set of results {q1, . . . , qN} instead of outputting the set of outputs {y1, . . . , yW} at the step 308. For example, a first entity or module may perform or implement the function F in an obfuscated manner as set out above, and a second entity or module may be arranged to determine the set of outputs {y1, . . . , yW} from the set of results {q1, . . . , qN} provided by the first entity or module. In such cases, the first entity or module performs the method 800 without the optional step 308.
Additionally, the step 806 is optional in the method 800. In particular, when an entity or a module performs or implements the function F, that entity or module may output the set of transformed results {q1*, . . . , qN*}. For example, a first entity or module may perform or implement the function F in an obfuscated manner as set out above and provide the set of transformed results {q1*, . . . , qN*} to a second entity or module that may be arranged to determine the set of outputs {y1, . . . , yW} from the set of transformed results {q1*, . . . , qN*} itself. In such cases, the first entity or module performs the method 800 without the optional steps 806 and 308.
Indeed, neither the set of outputs {y1, . . . , yW} nor the set of results {q1, . . . , qN} need necessarily be derived or obtained. For example, suppose that the intention is to perform the function H=G∘F, so that H(x)=G(F(x)). Then both the function G and the function F can be implemented using embodiments of the invention. When implementing the function F, a set of transformed results will be produced (at the step 804 for the function F). One or more of these transformed results could then be used in the set of transformed values for the function G. Thus, the set of transformed results generated when implementing the function F may be used directly as inputs to the function G without deriving the corresponding set of results and/or set of outputs for the function F; that is, the step 804 for the function F may form part of the step 802 for the function G (without needing to perform the steps 806 and 308 for the function F).
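The chaining of two obfuscated functions can be sketched as follows. This is only an illustration under stated assumptions: the prime modulus `P`, the toy functions `F` and `G`, the affine transforms, and the table-based realisation are all hypothetical. The point shown is that the transformed version of G is built to accept F's transformed result directly, so the intermediate value F(x) is never recovered.

```python
import random

P = 257  # hypothetical prime modulus


def affine_pair(rng):
    """Return (forward, inverse) for a random affine map v -> a*v + b mod P, a != 0."""
    a, b = rng.randrange(1, P), rng.randrange(P)
    a_inv = pow(a, -1, P)  # modular inverse (Python 3.8+)
    return (lambda v: (a * v + b) % P), (lambda v: (a_inv * (v - b)) % P)


def F(x):
    """Toy inner function."""
    return (x * x + 7) % P


def G(x):
    """Toy outer function."""
    return (5 * x + 11) % P


rng = random.Random(802)
T_F, T_F_inv = affine_pair(rng)    # input transform for F
Tt_F, Tt_F_inv = affine_pair(rng)  # Tt_F maps F's transformed result to F's result
Tt_G, Tt_G_inv = affine_pair(rng)  # Tt_G maps G's transformed result to G's result

# F* = Tt_F^{-1} o F o T_F^{-1}
F_star = [Tt_F_inv(F(T_F_inv(v))) for v in range(P)]
# G* consumes F's transformed result as-is, so it decodes its input with Tt_F.
G_star = [Tt_G_inv(G(Tt_F(v))) for v in range(P)]


def H_obfuscated(x):
    x_star = T_F(x)
    mid_star = F_star[x_star]    # step 804 for F, doubling as step 802 for G
    out_star = G_star[mid_star]  # step 804 for G
    return Tt_G(out_star)        # step 806, performed for G only


print(all(H_obfuscated(x) == G(F(x)) for x in range(P)))
```

Note that no call to `Tt_F` occurs at run time in `H_obfuscated`; it is used only when the table `G_star` is prepared.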
As can be seen from the above, embodiments of the invention make use of the synergy between (a) having the values pj (j=1, . . . , M) dependent on multiple inputs xi and (b) using transformed versions Fj* of the sub-functions Fj (j=1, . . . , N). In particular, suppose that the function F were a function that generates output yi based on input xi, so that yi=F(xi). It would have been possible to implement the function F as a transformed function F* as set out above. However, in doing so, each input xi in the set of inputs {x1, . . . , xW} would have been processed separately (e.g. a transformed output yi* would have been generated as yi*=F*(xi*), i.e. as a function of a transformed input xi* for i=1, . . . , W). Such separate processing makes the task of an attacker easier, where the attacker wishes to determine the input xi and/or the output yi and/or one or more secret values used by the function F (e.g. a cryptographic key). For example, calculation of the values xi* as set out above could be implemented via a loop that iterates over i=1, . . . , W and computes one transformed input xi* per iteration.
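A loop of this kind might look like the following minimal Python sketch. The affine map, the constants `a` and `b`, the modulus 256 and the function name are illustrative assumptions and are not taken from the description above.

```python
def transform_inputs_one_by_one(xs, a, b, modulus=256):
    """Compute one transformed input per loop iteration (a should be odd so the
    map x -> a*x + b is invertible modulo 256)."""
    xs_star = []
    for x in xs:  # the same instruction pattern repeats once per input
        xs_star.append((a * x + b) % modulus)
    return xs_star


print(transform_inputs_one_by_one([0, 1, 2], 3, 5))
```

Each iteration touches exactly one input with the same sequence of operations, which is the regular structure that an observer of the execution can look for.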
The presence of such loops is detectable and exploitable by attackers. In contrast, as embodiments of the invention are based around operations on values pj (that inherently are each dependent on multiple inputs xi), the task of the attacker is made harder. For example, looped processing (as set out above) can be avoided or minimized, making it harder for an attacker to be successful in their attack.
What is more, efficiency gains can be achieved, as multiple outputs are effectively determined at the same time. In particular, having the values pj (j=1, . . . , M) dependent on multiple inputs xi, the full bit-width of the processing system can be leveraged. For example, if the inputs xi are 8-bit values and the processor is a 32-bit processor, then each value pj could be made up of 4 different inputs, thereby making better use of the processor's capabilities. This helps mitigate the performance loss sometimes experienced when performing obfuscation using transforms as set out in section 3 above.
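As a rough illustration of the bit-width point, the sketch below packs four 8-bit inputs into one 32-bit value and applies a toy operation to all four at once. The byte layout, the helper names `pack4` and `unpack4`, the constant `KEY_BYTE` and the choice of XOR as the toy function are assumptions made purely for illustration.

```python
KEY_BYTE = 0x5A               # hypothetical secret byte
MASK = KEY_BYTE * 0x01010101  # the same byte replicated into all four byte lanes


def pack4(xs):
    """Pack four 8-bit inputs into one 32-bit value (first input in the low byte)."""
    assert len(xs) == 4 and all(0 <= x < 256 for x in xs)
    return xs[0] | (xs[1] << 8) | (xs[2] << 16) | (xs[3] << 24)


def unpack4(q):
    """Split a 32-bit value back into its four 8-bit parts."""
    return [(q >> (8 * i)) & 0xFF for i in range(4)]


def f_single(x):
    """Toy function applied to one 8-bit input."""
    return x ^ KEY_BYTE


def f_packed(p):
    """The same toy function applied to four packed inputs with a single operation."""
    return p ^ MASK


inputs = [1, 2, 3, 4]
print(unpack4(f_packed(pack4(inputs))) == [f_single(x) for x in inputs])
```

A single 32-bit operation here replaces four per-input operations, and no per-input loop is needed in the function itself.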
It will be appreciated that the methods described have been shown as individual steps carried out in a specific order. However, the skilled person will appreciate that these steps may be combined or carried out in a different order whilst still achieving the desired result.
It will be appreciated that embodiments of the invention may be implemented using a variety of different information processing systems. In particular, although the figures and the discussion thereof provide an exemplary computing system and methods, these are presented merely to provide a useful reference in discussing various aspects of the invention. Embodiments of the invention may be carried out on any suitable data processing device, such as a personal computer, laptop, personal digital assistant, mobile telephone, set top box, smartcard, television, server computer, etc. Of course, the description of the systems and methods has been simplified for purposes of discussion, and they are just one of many different types of system and method that may be used for embodiments of the invention. It will be appreciated that the boundaries between logic blocks are merely illustrative and that alternative embodiments may merge logic blocks or elements, or may impose an alternate decomposition of functionality upon various logic blocks or elements.
It will be appreciated that the above-mentioned functionality may be implemented as one or more corresponding modules as hardware and/or software. For example, the above-mentioned functionality may be implemented as one or more software components for execution by a processor of the system. Alternatively, the above-mentioned functionality may be implemented as hardware, such as on one or more field-programmable-gate-arrays (FPGAs), and/or one or more application-specific-integrated-circuits (ASICs), and/or one or more digital-signal-processors (DSPs), and/or other hardware arrangements. Method steps implemented in flowcharts contained herein, or as described above, may each be implemented by corresponding respective modules; multiple method steps implemented in flowcharts contained herein, or as described above, may be implemented together by a single module.
It will be appreciated that, insofar as embodiments of the invention are implemented by a computer program, then a storage medium and a transmission medium carrying the computer program form aspects of the invention. The computer program may have one or more program instructions, or program code, which, when executed by a computer, carries out an embodiment of the invention. The term “program”, as used herein, may be a sequence of instructions designed for execution on a computer system, and may include a subroutine, a function, a procedure, a module, an object method, an object implementation, an executable application, an applet, a servlet, source code, object code, a shared library, a dynamic linked library, and/or other sequences of instructions designed for execution on a computer system. The storage medium may be a magnetic disc (such as a hard drive or a floppy disc), an optical disc (such as a CD-ROM, a DVD-ROM or a BluRay disc), or a memory (such as a ROM, a RAM, EEPROM, EPROM, Flash memory or a portable/removable memory device), etc. The transmission medium may be a communications signal, a data broadcast, a communications link between two or more computers, etc.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2014/056421 | 3/31/2014 | WO | 00 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2015/149827 | 10/8/2015 | WO | A |
Number | Date | Country | |
---|---|---|---|
20170126398 A1 | May 2017 | US |