PRIVACY-PRESERVING EVALUATION OF DECISION TREES

Information

  • Patent Application
  • 20190190714
  • Publication Number
    20190190714
  • Date Filed
    December 20, 2017
  • Date Published
    June 20, 2019
Abstract
A method for performing a secure evaluation of a decision tree, including: receiving, by a processor of a server, an encrypted feature vector $[\![x]\!] = ([\![x_1]\!], \ldots, [\![x_n]\!])$ from a client; choosing a random mask $\mu_0$; calculating $[\![m_0]\!]$ and sending $[\![m_0]\!]$ to the client, wherein $m_0 = x_{i_0^{(0)}} - t_0^{(0)} + \mu_0$ and $t_0^{(0)}$ is a threshold value in the first node in the first level of a decision tree $\mathcal{T}'$; performing a comparison protocol on $m_0$ and $\mu_0$, wherein the server produces a comparison bit $b_0$ and the client produces a comparison bit $b_0'$; choosing a random bit $s_0 \in \{0,1\}$ and, when $s_0 = 1$, switching the left and right subtrees of $\mathcal{T}'$; sending $b_0 \oplus s_0$ to the client; and for each level $\ell = 1, 2, \ldots, d-1$ of the decision tree $\mathcal{T}'$, where $d$ is the number of levels in the decision tree $\mathcal{T}'$, performing the following steps: receiving from the client $[\![y_k]\!]$ where $k = 0, 1, \ldots, 2^\ell - 1$; performing a comparison protocol on $m_\ell$ and $\mu_\ell$, wherein $\mu_\ell$ is a random mask and $m_\ell$ is based upon $[\![x]\!]$, $\mathcal{T}'$, $[\![y_k]\!]$, and $\mu_\ell$, and the server produces a comparison bit $b_\ell$ and the client produces a comparison bit $b_\ell'$; choosing a random bit $s_\ell \in \{0,1\}$ and, when $s_\ell = 1$, switching all left and right subtrees at level $\ell$ of $\mathcal{T}'$; and sending $b_\ell \oplus s_\ell$ to the client.
Description
TECHNICAL FIELD

Various exemplary embodiments disclosed herein relate generally to a method and apparatus for performing a privacy preserving evaluation of decision trees.


BACKGROUND

Protocols have been developed for comparing private values using homomorphic encryption. These protocols may be used in the evaluation of decision trees. Embodiments improving upon the state of the art will be described below.


SUMMARY

A brief summary of various exemplary embodiments is presented below. Some simplifications and omissions may be made in the following summary, which is intended to highlight and introduce some aspects of the various exemplary embodiments, but not to limit the scope of the invention. Detailed descriptions of an exemplary embodiment adequate to allow those of ordinary skill in the art to make and use the inventive concepts will follow in later sections.


Various embodiments relate to a method for performing a secure evaluation of a decision tree, including: receiving, by a processor of a server, an encrypted feature vector $[\![x]\!] = ([\![x_1]\!], \ldots, [\![x_n]\!])$ from a client, where $n$ is an integer, wherein $[\![x]\!]$ denotes an additively homomorphic encryption of $x$ using a cryptographic key of the second party; choosing a random mask $\mu_0$; calculating $[\![m_0]\!]$ and sending $[\![m_0]\!]$ to the client, wherein $[\![m_0]\!] = [\![x_{i_0^{(0)}} - t_0^{(0)} + \mu_0]\!]$ and $x_{i_0^{(0)}}$ is the value of entry $i_0^{(0)}$ of the feature vector $x$ ($1 \le i_0^{(0)} \le n$), and $t_0^{(0)}$ is a threshold value in the first node in the first level of a decision tree $\mathcal{T}'$; performing a comparison protocol on $m_0$ and $\mu_0$, wherein the server produces a comparison bit $b_0$ and the client produces a comparison bit $b_0'$; choosing a random bit $s_0 \in \{0,1\}$ and, when $s_0 = 1$, switching the left and right subtrees of $\mathcal{T}'$; sending $b_0 \oplus s_0$ to the client; and for each level $\ell = 1, 2, \ldots, d-1$ of the decision tree $\mathcal{T}'$, where $d$ is the number of levels in the decision tree $\mathcal{T}'$, performing the following steps: receiving from the client $[\![y_k]\!]$ where $k = 0, 1, \ldots, 2^\ell - 1$; performing a comparison protocol on $m_\ell$ and $\mu_\ell$, wherein $\mu_\ell$ is a random mask and $m_\ell$ is based upon $[\![x]\!]$, $\mathcal{T}'$, $[\![y_k]\!]$, and $\mu_\ell$, and the server produces a comparison bit $b_\ell$ and the client produces a comparison bit $b_\ell'$; choosing a random bit $s_\ell \in \{0,1\}$ and, when $s_\ell = 1$, switching all left and right subtrees at level $\ell$ of $\mathcal{T}'$; and sending $b_\ell \oplus s_\ell$ to the client.


Further various embodiments relate to a non-transitory machine-readable storage medium encoded with instructions for performing a secure evaluation of a decision tree, including: instructions for receiving, by a processor of a server, an encrypted feature vector $[\![x]\!] = ([\![x_1]\!], \ldots, [\![x_n]\!])$ from a client, where $n$ is an integer, wherein $[\![x]\!]$ denotes an additively homomorphic encryption of $x$ using a cryptographic key of the second party; instructions for choosing a random mask $\mu_0$; instructions for calculating $[\![m_0]\!]$ and sending $[\![m_0]\!]$ to the client, wherein $[\![m_0]\!] = [\![x_{i_0^{(0)}} - t_0^{(0)} + \mu_0]\!]$ and $x_{i_0^{(0)}}$ is the value of entry $i_0^{(0)}$ of the feature vector $x$ ($1 \le i_0^{(0)} \le n$), and $t_0^{(0)}$ is a threshold value in the first node in the first level of a decision tree $\mathcal{T}'$; instructions for performing a comparison protocol on $m_0$ and $\mu_0$, wherein the server produces a comparison bit $b_0$ and the client produces a comparison bit $b_0'$; instructions for choosing a random bit $s_0 \in \{0,1\}$ and, when $s_0 = 1$, switching the left and right subtrees of $\mathcal{T}'$; instructions for sending $b_0 \oplus s_0$ to the client; and for each level $\ell = 1, 2, \ldots, d-1$ of the decision tree $\mathcal{T}'$, where $d$ is the number of levels in the decision tree $\mathcal{T}'$, performing the following instructions: instructions for receiving from the client $[\![y_k]\!]$ where $k = 0, 1, \ldots, 2^\ell - 1$; instructions for performing a comparison protocol on $m_\ell$ and $\mu_\ell$, wherein $\mu_\ell$ is a random mask and $m_\ell$ is based upon $[\![x]\!]$, $\mathcal{T}'$, $[\![y_k]\!]$, and $\mu_\ell$, and the server produces a comparison bit $b_\ell$ and the client produces a comparison bit $b_\ell'$; instructions for choosing a random bit $s_\ell \in \{0,1\}$ and, when $s_\ell = 1$, switching all left and right subtrees at level $\ell$ of $\mathcal{T}'$; and instructions for sending $b_\ell \oplus s_\ell$ to the client.


Various embodiments are described, wherein the server engages in a 1-out-of-$2^d$ oblivious transfer with the client, through which the client learns the value of $\mathcal{T}'(x)$.


Various embodiments are described, wherein the client computes $r = (\beta_0, \beta_1, \ldots, \beta_{d-1})_2$, which is the index of the leaf node in $\mathcal{T}'$ indicating the output of the decision tree, and where $\beta_\ell = b_\ell' \oplus b_\ell \oplus s_\ell$.


Various embodiments are described, wherein the first encryption uses the Paillier cryptosystem.


Various embodiments are described, wherein performing a comparison protocol on $m_\ell$ and $\mu_\ell$ further includes: choosing, by the server, $2^\ell$ random masks $r_k^{(\ell)}$ for $0 \le k \le 2^\ell - 1$, and a random mask $\mu_\ell$; computing $[\![M_\ell]\!] = [\![\sum_{k=0}^{2^\ell-1} y_k r_k^{(\ell)} - \mu_\ell]\!]$; computing for $0 \le k \le 2^\ell - 1$ the ciphertext $[\![z_k^{(\ell)}]\!]$ where $z_k^{(\ell)} = x_{i_k^{(\ell)}} - t_k^{(\ell)} + r_k^{(\ell)}$; and sending $[\![M_\ell]\!]$ and $[\![z_0^{(\ell)}]\!], \ldots, [\![z_{2^\ell-1}^{(\ell)}]\!]$ to the client.


Various embodiments are described, wherein performing a comparison protocol on $m_\ell$ and $\mu_\ell$ further includes: decrypting, by the client, $[\![M_\ell]\!]$ and the $[\![z_k^{(\ell)}]\!]$ to get $M_\ell$ and the $z_k^{(\ell)}$; and computing, by the client, $m_\ell = \sum_{k=0}^{2^\ell-1} y_k z_k^{(\ell)} - M_\ell$.


Various embodiments are described, wherein performing a comparison protocol on $m_\ell$ and $\mu_\ell$ further includes: choosing, by the server, a random mask $\mu_\ell$; computing $\langle\!\langle m_\ell \rangle\!\rangle$ where $m_\ell = \mu_\ell + \sum_{k=0}^{2^\ell-1} y_k \bigl(x_{i_k^{(\ell)}} - t_k^{(\ell)}\bigr)$ and $\langle\!\langle w \rangle\!\rangle$ denotes a second homomorphic encryption of $w$ using a cryptographic key of the second party; and sending $\langle\!\langle m_\ell \rangle\!\rangle$ to the client.


Various embodiments are described, wherein performing a comparison protocol on $m_\ell$ and $\mu_\ell$ further comprises decrypting, by the client, $\langle\!\langle m_\ell \rangle\!\rangle$ to get $m_\ell$.


Various embodiments are described, wherein the second encryption uses the Boneh-Goh-Nissim (BGN) cryptosystem.


Various embodiments are described, wherein the second encryption uses a somewhat homomorphic cryptosystem.


Various embodiments are described, wherein for each level $\ell = 1, 2, \ldots, d-1$ of the decision tree $\mathcal{T}'$ the client computes $y_k = \mathbf{1}\{\beta^{(\ell)} = k\}$, where $\beta^{(\ell)} = \sum_{j=0}^{\ell-1} \beta_j 2^{\ell-1-j}$, encrypts $y_k$ resulting in $[\![y_k]\!]$, and sends $[\![y_k]\!]$ to the server.





BRIEF DESCRIPTION OF THE DRAWINGS

In order to better understand various exemplary embodiments, reference is made to the accompanying drawings, wherein:



FIG. 1 illustrates an exemplary protocol using additively homomorphic encryption (e.g., Paillier cryptosystem) for performing step 5b; and



FIG. 2 illustrates an exemplary protocol using somewhat homomorphic encryption (e.g., Boneh-Goh-Nissim cryptosystem) for performing step 5b.





To facilitate understanding, identical reference numerals have been used to designate elements having substantially the same or similar structure and/or substantially the same or similar function.


DETAILED DESCRIPTION

The description and drawings illustrate the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its scope. Furthermore, all examples recited herein are principally intended expressly to be for pedagogical purposes to aid the reader in understanding the principles of the invention and the concepts contributed by the inventor(s) to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Additionally, the term, “or,” as used herein, refers to a non-exclusive or (i.e., and/or), unless otherwise indicated (e.g., “or else” or “or in the alternative”). Also, the various embodiments described herein are not necessarily mutually exclusive, as some embodiments can be combined with one or more other embodiments to form new embodiments.


Privacy-preserving data mining has gained a lot of attention in the last decade. The main goal of these methods is to analyze and extract useful information from the data of the users without accessing the individuals' private data. One commonly used machine learning technique is the decision tree. Decision trees are simple classifiers that consist of a collection of decision nodes in a tree structure. They are particularly useful because they can be easily implemented in query-based systems and databases.


In this disclosure, an embodiment of a new protocol to evaluate decision trees in a secure way is disclosed. It is assumed that there is a client with input data $x$, an $n$-dimensional feature vector, who wants to evaluate the decision tree $\mathcal{T}$ on $x$. The goal is for the client to learn the output of the decision tree while the server learns nothing about the client's private input.


An embodiment of a privacy preserving decision tree protocol will be described herein that decreases the computational complexity as well as the required bandwidth as compared to the current state of the art. In previous works in the literature, in order to keep the user's data private, the server has to run the comparison protocol for all the internal nodes. The embodiment described here is a new method for decision tree evaluation which requires only one comparison at each level of the decision tree. For a complete binary tree of depth $d$, this lowers the number of comparisons from $2^d - 1$ (one per internal node) to $d$, which is logarithmic in the number of nodes. This is a worthwhile improvement as performing the comparison protocol usually requires large amounts of computation and bandwidth.


The new embodiment of the decision tree protocol utilizes an additively homomorphic encryption scheme. Let $[\![m]\!]$ denote the encryption of a message $m$. The homomorphic property implies that for any two messages $m$ and $m'$, the encryption of $m + m'$ can be obtained from the encryptions of $m$ and $m'$ as $[\![m + m']\!] = [\![m]\!] \cdot [\![m']\!]$. Likewise, for a known constant $c$, the encryption of $c \cdot m$ can be obtained from the encryption of $m$ as $[\![c\,m]\!] = [\![m]\!]^{c}$.


An efficient additively homomorphic encryption scheme is provided by the Paillier cryptosystem. The Paillier scheme is defined as follows. On input some security parameter, two large primes $p$ and $q$ are generated. Let $N = pq$ and $\lambda = \mathrm{lcm}(p-1, q-1)$. The public key is $pk = N$ and the secret key is $sk = \lambda$. The message space is $\mathcal{M} = \{0, \ldots, N-1\}$.


The encryption of a message $m \in \mathcal{M}$ is given by $C = (1 + mN)\, r^{N} \bmod N^2$ for some random element $r \in \{1, \ldots, N-1\}$. Writing $C = [\![m]\!]$, the decryption of $C$ is obtained as

$$m = \frac{L}{\lambda} \bmod N \quad\text{where}\quad L = \frac{C^{\lambda} - 1 \bmod N^2}{N}.$$


Given $C_1 = [\![m_1]\!]$ and $C_2 = [\![m_2]\!]$, one can obtain the encryption of $m_1 + m_2$ as $[\![m_1 + m_2]\!] = C_1 C_2 (r')^{N} \pmod{N^2}$ for any non-zero integer $r'$.
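The Paillier operations above can be tried out with a few lines of Python. The following is only a toy sketch under stated assumptions: the primes 1009 and 1013, the function names, and the use of Python's `random` module are illustrative choices and are not part of the disclosure; a real deployment would use primes of cryptographic size and a cryptographically secure random source.

```python
import math
import random

def keygen(p, q):
    """Return (N, lam) for primes p and q, with lam = lcm(p-1, q-1)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    return n, lam

def encrypt(n, m, rng=random):
    """C = (1 + m*N) * r^N mod N^2 for a random r coprime to N."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (1 + (m % n) * n) * pow(r, n, n2) % n2

def decrypt(n, lam, c):
    """m = L / lam mod N, with L = (C^lam - 1 mod N^2) / N."""
    n2 = n * n
    l_val = (pow(c, lam, n2) - 1) // n
    return l_val * pow(lam, -1, n) % n

# Toy key (assumption): far too small to be secure.
N, LAM = keygen(1009, 1013)
N2 = N * N
c1, c2 = encrypt(N, 5), encrypt(N, 7)

# Multiplying ciphertexts adds the plaintexts.
sum_plain = decrypt(N, LAM, c1 * c2 % N2)
# Raising a ciphertext to a constant multiplies the plaintext.
scaled_plain = decrypt(N, LAM, pow(c1, 3, N2))
# Inverting a ciphertext negates the plaintext (modulo N).
diff_plain = decrypt(N, LAM, c1 * pow(c2, -1, N2) % N2)
```

Here `sum_plain` is 12, `scaled_plain` is 15, and `diff_plain` is the representative of 5 − 7 modulo N. The modular inverse via `pow(x, -1, n)` requires Python 3.8 or later.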


Suppose the server possesses a database $x_1, x_2, \ldots, x_n$ and the client has a selection index $i \in \{1, 2, \ldots, n\}$. In a 1-out-of-$n$ oblivious transfer protocol, the client learns only $x_i$ and the server learns nothing.


In a secure comparison protocol there are two parties who want to compare their private data. It is assumed that one party has integer x and the other party has integer y in the clear and they want to compare these numbers (without revealing the value) to check whether x≤y or not. The final result of the comparison will be secretly shared between the two parties.


The DGK+ comparison protocol, which was proposed by Damgård, Geisler, and Krøigaard, will be used as an example. The setting is as follows. Alice possesses a private $t$-bit integer $x = \sum_{i=0}^{t-1} x_i 2^i$ and Bob possesses another private $t$-bit integer $y = \sum_{i=0}^{t-1} y_i 2^i$. The goal for Alice and Bob is to respectively obtain bits $\delta_A$ and $\delta_B$ such that $\delta_A \oplus \delta_B = [x \le y]$. The protocol proceeds in four steps:

    • 1. Bob encrypts the bits of $y = \sum_{i=0}^{t-1} y_i 2^i$ under his public key and sends $[\![y_i]\!]$, $0 \le i \le t-1$, to Alice.
    • 2. Alice chooses uniformly at random a bit $\delta_A \in \{0,1\}$ and defines $s = 1 - 2\delta_A$. Alice also selects $t+1$ random invertible scalars $r_i$, $-1 \le i \le t-1$.
    • 3. Next, for $t-1 \ge i \ge 0$, Alice computes

$$[\![c_i^*]\!] = \Bigl([\![s]\!] \cdot [\![x_i]\!] \cdot [\![y_i]\!]^{-1} \cdot \Bigl(\prod_{j=i+1}^{t-1} [\![x_j \oplus y_j]\!]\Bigr)^{3}\Bigr)^{r_i}.$$

      Finally, Alice computes

$$[\![c_{-1}^*]\!] = \Bigl([\![\delta_A]\!] \cdot \prod_{j=0}^{t-1} [\![x_j \oplus y_j]\!]\Bigr)^{r_{-1}}.$$

      Alice sends the $t+1$ ciphertexts $[\![c_i^*]\!]$ in a random order to Bob.
    • 4. Using his private key, Bob decrypts the received $[\![c_i^*]\!]$'s. If one is decrypted to zero, Bob sets $\delta_B = 1$. Otherwise, he sets $\delta_B = 0$.
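The logic behind the four steps can be checked without any encryption. The sketch below is a plaintext model only: it computes the values that the ciphertexts would carry, namely $c_i = s + x_i - y_i + 3\sum_{j>i}(x_j \oplus y_j)$ and $c_{-1} = \delta_A + \sum_j (x_j \oplus y_j)$, and omits the blinding exponents $r_i$ and the shuffling, which affect what Bob learns but not whether some value is zero. The function name `dgk_plus_clear` is hypothetical.

```python
def dgk_plus_clear(x, y, t, delta_a):
    """Plaintext model of the comparison: return Bob's bit delta_B.

    x, y are t-bit non-negative integers; delta_a is Alice's random bit.
    """
    x_bits = [(x >> i) & 1 for i in range(t)]
    y_bits = [(y >> i) & 1 for i in range(t)]
    s = 1 - 2 * delta_a

    values = []
    for i in range(t):
        # Number of differing bits strictly above position i.
        higher = sum(x_bits[j] ^ y_bits[j] for j in range(i + 1, t))
        values.append(s + x_bits[i] - y_bits[i] + 3 * higher)

    # The extra value c_{-1} handles the case x == y.
    values.append(delta_a + sum(a ^ b for a, b in zip(x_bits, y_bits)))

    # Bob sets delta_B = 1 exactly when some value decrypts to zero.
    return int(any(v == 0 for v in values))
```

For every pair of 4-bit inputs and either choice of `delta_a`, the XOR of `delta_a` and the returned bit equals `int(x <= y)`, which is the sharing $\delta_A \oplus \delta_B = [x \le y]$ stated above.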


This DGK+ comparison protocol has been improved upon by introducing a new method that reduces both the computational complexity and the communication bandwidth by a factor of two. This comparison protocol is described in copending patent application entitled “PRIVACY PRESERVING COMPARISON” filed Dec. 20, 2017, attorney docket number 82096582US01. This improved version along with other privacy preserving comparison protocols may be used in the embodiment described in this disclosure.


Decision tree evaluation is a well-known method in the machine learning community to evaluate the model's output corresponding to a user's input data. In order to evaluate the decision tree $\mathcal{T}$ on the $n$-dimensional input vector $x$, when everything is available in the clear, one should traverse $\mathcal{T}$ by doing one comparison for each of the levels of the tree, and finding the leaf node corresponding to $x$. The value of this leaf node is the output of the model. As one can see, the total number of comparisons in this case is bounded by the depth of the tree $\mathcal{T}$, i.e., the length of the longest path from the root to the leaves.


In David J Wu, Tony Feng, Michael Naehrig, and Kristin Lauter, “Privately evaluating decision trees and random forests,” Proceedings on Privacy Enhancing Technologies, 2016(4):335-355, 2016, Wu et al. introduced a method for privately evaluating decision trees and random forests. They have proposed a method in which a client can learn the output corresponding to her input value while the server learns nothing. They also showed that by using their protocol the client's data and the server's model remain private. They compared the performance of their secure protocol with the previous works in the literature and showed a 10 times improvement in terms of computations and bandwidth.


Although the method proposed by Wu et al. is secure and has much better performance than the prior art, it still needs to perform the comparison protocol for all the internal nodes of the tree. In other words, when the decision tree $\mathcal{T}$ is a complete binary tree of depth $d$, their protocol needs to perform $2^d - 1$ comparisons to securely compute the output of the model. However, as discussed earlier, when the input and the model are available in the clear one should perform only $d$ comparisons to evaluate the decision tree.


In this disclosure, an embodiment is described for privately evaluating the decision tree with only d comparisons. This embodiment uses the same idea as Wu et al. to permute the tree nodes, whereas it needs to do only one comparison for each level of the decision tree. This is done by letting the client learn the indices of the corresponding internal nodes which appear while traversing the permuted tree. Having the knowledge of the index of the internal node at each level will result in doing one comparison in each level. Note that knowing the indices of the internal nodes in the permuted tree does not reveal any information about the actual decision tree because the nodes at each level are permuted using a random permutation.


Because the comparison protocol requires a large amount of computation and bandwidth, the described protocol greatly improves on previous results in terms of computation and speed by reducing the total number of comparisons from $2^d - 1$ to $d$, that is, to a number logarithmic in the size of the tree.


The embodiment described herein is fully generic in terms of the encryption scheme and the comparison protocol. Indeed, one can use the embodiment for evaluating a decision tree with the freedom of choosing any encryption scheme that is additively homomorphic, as well as choosing any comparison protocol for comparing the private data of the client and the server.


It is assumed that the server has a decision tree $\mathcal{T}$ that takes an $n$-dimensional feature vector as input, and that the client has such a feature vector $x$. It is assumed that $\mathcal{T}$ is a complete binary tree of depth $d$. The depth of a tree is the length of the longest path from the root to a leaf. In general, a binary decision tree may not be complete, but one can transform any decision tree into a complete tree by introducing dummy internal nodes.


The following rule is used for indexing the nodes of the decision tree $\mathcal{T}$ with depth $d$:

    • The nodes at level $\ell$ in the tree are the nodes that have distance $\ell$ from the root of the tree. Therefore, all the internal nodes in $\mathcal{T}$ have level between 0 and $d-1$;
    • The $2^\ell$ nodes at level $\ell$ (for $\ell = 0, 1, \ldots, d-1$) are indexed from left to right by $k$, with $0 \le k \le 2^\ell - 1$, where $k = 0$ denotes the leftmost node at level $\ell$ and $k = 2^\ell - 1$ denotes the rightmost node; and
    • An index from 0 to $2^d - 1$ is defined for the leaf nodes. With this indexing scheme, the leaves of the tree, when read from left to right, correspond with the ordering $z_0, \ldots, z_{2^d-1}$.


Each internal node, identified by its level $\ell$ and its index $k$ within that level, is associated with a Boolean function of $x$ given by

$$\mathbf{1}\bigl\{x_{i_k^{(\ell)}} \le t_k^{(\ell)}\bigr\},$$

where $i_k^{(\ell)}$ is an index in the feature vector $x$, and $t_k^{(\ell)}$ is a threshold. To evaluate the output of the decision tree, one should start from the root and at each level, depending on the result of this Boolean function, take either the left branch (when for example the result is 0) or the right branch (when for example the result is 1) of the tree, and repeat the process until a leaf node is reached. The output $\mathcal{T}(x)$ is $z_r$, the value of the so-obtained leaf node.
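Evaluation in the clear, using the level-and-index scheme just described, can be sketched as follows. The data layout is an assumption made for illustration: `levels[l][k]` holds the pair (feature index, threshold) of the node with index k at level l, `leaves` holds the leaf values from left to right, and feature indices are 0-based here whereas the text counts them from 1. A comparison result of 1 selects the right child, following the example convention in the text.

```python
def evaluate_tree(levels, leaves, x):
    """Traverse a complete binary decision tree with one comparison per level."""
    k = 0  # index of the current node within its level
    for row in levels:
        feature, threshold = row[k]
        bit = int(x[feature] <= threshold)
        # Children of node k sit at indices 2k (left) and 2k + 1 (right).
        k = 2 * k + bit
    return leaves[k]

# A depth-2 example tree (hypothetical values).
levels = [
    [(0, 5)],            # level 0: compare x[0] with 5
    [(1, 2), (1, 8)],    # level 1: compare x[1] with 2 or with 8
]
leaves = ["z0", "z1", "z2", "z3"]

result_a = evaluate_tree(levels, leaves, [3, 9])
result_b = evaluate_tree(levels, leaves, [7, 1])
```

For `[3, 9]` the root comparison yields 1 and the second yields 0, so the traversal ends at leaf index 2; for `[7, 1]` the bits are 0 then 1, giving leaf index 1. Exactly `len(levels)` comparisons are made in each case.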


In order to find the index of the leaf node, r, for the feature vector x, it is necessary and sufficient to know the result of the Boolean comparison at each level of the tree. The embodiment described herein is therefore optimal in the number of comparisons.


It is assumed that the client has a matching public/private key pair $(pk, sk)$ which is used for encryption and decryption of the messages under an additively homomorphic encryption scheme as described above. Also, as described above, $[\![m]\!]$ is used to denote an encryption of the message $m$ under the client's public key $pk$. It is also assumed that the server uses a copy of the decision tree $\mathcal{T}$, denoted by $\mathcal{T}'$, and performs the permutations on that copy. An embodiment of the decision tree protocol proceeds as follows:

    • 1. The client encrypts the entries of the feature vector $x = (x_1, \ldots, x_n)$ and sends $[\![x]\!] = ([\![x_1]\!], \ldots, [\![x_n]\!])$ to the server.
    • 2. The server defines $\mathcal{T}' \leftarrow \mathcal{T}$. It chooses a random mask $\mu_0$ in the message space and sends $[\![m_0]\!] = [\![x_{i_0^{(0)}} - t_0^{(0)} + \mu_0]\!]$ to the client. The client recovers $m_0$ using the private key $sk$.
    • 3. The client and the server perform a comparison protocol on $m_0$ and $\mu_0$ and share the result. At the end of the protocol, the client possesses $b_0' \in \{0,1\}$ and the server has $b_0 \in \{0,1\}$ such that:






$$b_0 \oplus b_0' = \mathbf{1}\{m_0 \le \mu_0\} = \mathbf{1}\bigl\{x_{i_0^{(0)}} \le t_0^{(0)}\bigr\}.$$

    • 4. The server chooses a bit $s_0 \in \{0,1\}$ uniformly at random. If $s_0 = 1$, the server switches the left and right subtrees of $\mathcal{T}'$, and calls $\mathcal{T}'$ the resulting tree. The server then sends $b_0 \oplus s_0$ to the client, which in turn recovers $\beta_0 = b_0' \oplus b_0 \oplus s_0$.
    • 5. For $\ell = 1, 2, \ldots, d-1$:
      • (a) The client defines $\beta^{(\ell)} = (\beta_0, \beta_1, \ldots, \beta_{\ell-1})_2 := \sum_{j=0}^{\ell-1} \beta_j 2^{\ell-1-j}$. For $k = 0, 1, \ldots, 2^\ell - 1$, it sets $y_k = \mathbf{1}\{\beta^{(\ell)} = k\}$ and sends $[\![y_k]\!]$ to the server. Note that this definition implies that $y_{\beta^{(\ell)}} = 1$ and $y_k = 0$ for $k \ne \beta^{(\ell)}$.
      • (b) The server and the client engage in a multi-party computation protocol and secret share the result of the comparison at level $\ell$. At the end of the protocol, the client possesses $b_\ell' \in \{0,1\}$ and the server has $b_\ell \in \{0,1\}$ such that:








$$b_\ell \oplus b_\ell' = \mathbf{1}\Bigl\{\sum_{k=0}^{2^\ell-1} y_k \bigl(x_{i_k^{(\ell)}} - t_k^{(\ell)}\bigr) \le 0\Bigr\}.$$










      • (c) The server chooses a bit $s_\ell \in \{0,1\}$ uniformly at random. If $s_\ell = 1$, the server switches all the left and right subtrees at level $\ell$ of $\mathcal{T}'$, and calls $\mathcal{T}'$ the resulting tree. The server sends $b_\ell \oplus s_\ell$ to the client, which in turn recovers $\beta_\ell = b_\ell' \oplus b_\ell \oplus s_\ell$.



    • 6. The client computes $r = (\beta_0, \beta_1, \ldots, \beta_{d-1})_2$, which is the index of the leaf node in $\mathcal{T}'$. Next, the client engages in a 1-out-of-$2^d$ oblivious transfer with the server to learn the value of $\mathcal{T}(x)$.
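The bookkeeping in steps 1 through 6 can be simulated in the clear to see why the leaf reached in the permuted tree carries the right value. The sketch below is not a secure implementation: encryption, the comparison protocol, and the oblivious transfer are replaced by plain arithmetic, level 0 is handled by the same loop as the other levels, and the names (`swap_subtrees`, `evaluate_protocol`, the list-of-levels layout with 0-based feature indices) are assumptions made for illustration.

```python
import random

def evaluate_clear(levels, leaves, x):
    """Reference evaluation of the unpermuted tree."""
    k = 0
    for row in levels:
        feature, threshold = row[k]
        k = 2 * k + int(x[feature] <= threshold)
    return leaves[k]

def swap_subtrees(levels, leaves, l):
    """Switch the left and right subtrees of every node at level l, in place."""
    d = len(levels)
    for j in range(l + 1, d + 1):
        row = leaves if j == d else levels[j]
        half = 2 ** (j - l - 1)  # size of one child's block at level j
        for start in range(0, len(row), 2 * half):
            mid, end = start + half, start + 2 * half
            row[start:end] = row[mid:end] + row[start:mid]

def evaluate_protocol(levels, leaves, x, rng):
    """Simulate the message flow; returns the leaf value the client obtains."""
    levels = [list(row) for row in levels]  # server's working copy T'
    leaves = list(leaves)
    beta_index = 0                          # beta^(l), known to the client
    for l in range(len(levels)):
        # Step 5a (client): one-hot selector of the current node.
        y = [int(k == beta_index) for k in range(2 ** l)]
        # Step 5b: masked selected difference, compared with the mask.
        mu = rng.randrange(10 ** 6)
        m = mu + sum(yk * (x[f] - t) for yk, (f, t) in zip(y, levels[l]))
        outcome = int(m <= mu)
        b = rng.randrange(2)                # server's share of the outcome
        b_prime = b ^ outcome               # client's share of the outcome
        # Step 5c (server): random switch of all subtrees at level l.
        s = rng.randrange(2)
        if s:
            swap_subtrees(levels, leaves, l)
        beta = b_prime ^ (b ^ s)            # client recovers beta_l
        beta_index = 2 * beta_index + beta
    # Step 6: stands in for the 1-out-of-2^d oblivious transfer.
    return leaves[beta_index]
```

On random trees and inputs, `evaluate_protocol` returns the same value as `evaluate_clear`, while the bits the client sees are the true comparison results XORed with the server's fresh random bits.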





In step 5a in the embodiment described above, letting $y = (y_0, \ldots, y_{2^\ell-1})$, it turns out that

$$y = (0, \ldots, 0, \underbrace{1}_{\beta^{(\ell)}\text{-th position}}, 0, \ldots, 0),$$

namely, $y$ has a single bit set to 1.


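As a result, the sum appearing in step 5b reduces to a single term:

$$\sum_{k=0}^{2^\ell-1} y_k \bigl(x_{i_k^{(\ell)}} - t_k^{(\ell)}\bigr) = x_{i_{k^*}^{(\ell)}} - t_{k^*}^{(\ell)} \quad\text{where}\quad k^* = \beta^{(\ell)},$$

and thus

$$b_\ell \oplus b_\ell' = \begin{cases} 1 & \text{if } x_{i_{k^*}^{(\ell)}} \le t_{k^*}^{(\ell)}, \\ 0 & \text{otherwise.} \end{cases}$$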






The advantage of this approach is that only a single comparison is needed, as opposed to the approach of Wu et al. where $2^\ell$ comparisons are performed at level $\ell$. Indeed, at level $\ell$, their proposed method requires the evaluation of $\mathbf{1}\bigl\{x_{i_k^{(\ell)}} \le t_k^{(\ell)}\bigr\}$ for $0 \le k \le 2^\ell - 1$.


Now two detailed implementations of the multi-party protocol for step 5b will be presented. The first embodiment makes use of additively homomorphic encryption, and the second embodiment makes use of somewhat homomorphic encryption.



FIG. 1 illustrates an exemplary protocol using additively homomorphic encryption (e.g., Paillier cryptosystem) for performing step 5b.


The steps of this protocol are as follows: After the client computes $y_k$, for $0 \le k \le 2^\ell - 1$, and the server receives inputs $[\![x]\!]$, $\mathcal{T}'$, and $[\![y_k]\!]$, for $0 \le k \le 2^\ell - 1$, in step 1, the server chooses $2^\ell$ random masks $r_k^{(\ell)}$ for $0 \le k \le 2^\ell - 1$, and a random mask $\mu_\ell$. In step 2, the server defines and computes $[\![M_\ell]\!] = [\![\sum_{k=0}^{2^\ell-1} y_k r_k^{(\ell)} - \mu_\ell]\!]$. In step 3, the server computes for $0 \le k \le 2^\ell - 1$ the ciphertext $[\![z_k^{(\ell)}]\!]$ where $z_k^{(\ell)} = x_{i_k^{(\ell)}} - t_k^{(\ell)} + r_k^{(\ell)}$. In step 4, the server sends $[\![M_\ell]\!]$ and $[\![z_0^{(\ell)}]\!], \ldots, [\![z_{2^\ell-1}^{(\ell)}]\!]$ to the client.


In step 5, the client decrypts $[\![M_\ell]\!]$ and the $[\![z_k^{(\ell)}]\!]$ to get $M_\ell$ and the $z_k^{(\ell)}$. In step 6, the client sets $m_\ell = \sum_{k=0}^{2^\ell-1} y_k z_k^{(\ell)} - M_\ell$. In step 7, the client and server engage in a comparison protocol (e.g., the DGK+ protocol) on input $m_\ell$ for the client and $\mu_\ell$ for the server. The client then outputs $b_\ell'$ and the server outputs $b_\ell$. These output values can then be used as described above in steps 5c and 6 in the decision tree protocol.
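One round of this additively homomorphic variant can be sketched with a toy Paillier instance. Everything below is illustrative: the primes, the mask ranges, and the function names `server_round` and `client_round` are assumptions, the client-side combination $m_\ell = \sum_k y_k z_k^{(\ell)} - M_\ell$ is the reading adopted here, and the final comparison is done in the clear instead of through a comparison protocol. The mask $\mu_\ell$ is drawn larger than any difference $|x - t|$ in the example so that $m_\ell$ does not wrap around modulo $N$.

```python
import math
import random

P, Q = 1009, 1013                      # toy primes (assumption, insecure)
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)

def enc(m, rng):
    """Paillier encryption of m (reduced modulo N)."""
    while True:
        r = rng.randrange(1, N)
        if math.gcd(r, N) == 1:
            return (1 + (m % N) * N) * pow(r, N, N2) % N2

def dec(c):
    """Paillier decryption."""
    return (pow(c, LAM, N2) - 1) // N * pow(LAM, -1, N) % N

def server_round(enc_x, enc_y, nodes, rng):
    """Steps 1-4: build [[M]] and the [[z_k]] from encrypted inputs only."""
    masks = [rng.randrange(1, 1000) for _ in nodes]   # r_k
    mu = rng.randrange(1000, 2000)                    # mu, kept by the server
    enc_m_cap = pow(enc(mu, rng), -1, N2)             # [[mu]]^-1
    for c_y, r in zip(enc_y, masks):
        enc_m_cap = enc_m_cap * pow(c_y, r, N2) % N2  # times [[y_k]]^(r_k)
    enc_z = [enc_x[i] * pow(enc(t, rng), -1, N2) * enc(r, rng) % N2
             for (i, t), r in zip(nodes, masks)]      # [[x_i - t + r_k]]
    return enc_m_cap, enc_z, mu

def client_round(enc_m_cap, enc_z, y):
    """Steps 5-6: decrypt and form m = sum_k y_k z_k - M (mod N)."""
    m_cap = dec(enc_m_cap)
    z = [dec(c) for c in enc_z]
    return (sum(yk * zk for yk, zk in zip(y, z)) - m_cap) % N

rng = random.Random(1)
x = [30, 70]
nodes = [(0, 50), (1, 50)]   # (feature index, threshold) at the current level
y = [0, 1]                   # the client sits at node k* = 1
enc_x = [enc(v, rng) for v in x]
enc_y = [enc(v, rng) for v in y]
enc_m_cap, enc_z, mu = server_round(enc_x, enc_y, nodes, rng)
m = client_round(enc_m_cap, enc_z, y)
```

In this run `m - mu` equals `x[1] - 50`, so comparing `m` with `mu` gives the same answer as comparing the selected feature with the selected threshold, while the client sees only masked values.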


Note that $[\![M_\ell]\!]$ and $[\![z_k^{(\ell)}]\!]$ may be computed by the server from $[\![x]\!]$, $\mathcal{T}'$, and $[\![y_k]\!]$ as they satisfy

$$[\![M_\ell]\!] = \Bigl(\prod_{k=0}^{2^\ell-1} [\![y_k]\!]^{\,r_k^{(\ell)}}\Bigr) \cdot [\![\mu_\ell]\!]^{-1} \quad\text{and}\quad [\![z_k^{(\ell)}]\!] = [\![x_{i_k^{(\ell)}}]\!] \cdot \bigl([\![t_k^{(\ell)}]\!]\bigr)^{-1} \cdot [\![r_k^{(\ell)}]\!].$$
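The reason the client's value can be compared against $\mu_\ell$ is worth spelling out. Taking $M_\ell = \sum_k y_k r_k^{(\ell)} - \mu_\ell$ and $z_k^{(\ell)} = x_{i_k^{(\ell)}} - t_k^{(\ell)} + r_k^{(\ell)}$ as implied by the two ciphertext relations, and assuming the client combines the decrypted values as $m_\ell = \sum_k y_k z_k^{(\ell)} - M_\ell$, the masks $r_k^{(\ell)}$ cancel:

```latex
\begin{align*}
m_\ell
  &= \sum_{k=0}^{2^\ell-1} y_k \bigl(x_{i_k^{(\ell)}} - t_k^{(\ell)} + r_k^{(\ell)}\bigr)
     - \Bigl(\sum_{k=0}^{2^\ell-1} y_k r_k^{(\ell)} - \mu_\ell\Bigr) \\
  &= \sum_{k=0}^{2^\ell-1} y_k \bigl(x_{i_k^{(\ell)}} - t_k^{(\ell)}\bigr) + \mu_\ell \\
  &= x_{i_{k^*}^{(\ell)}} - t_{k^*}^{(\ell)} + \mu_\ell,
     \qquad k^* = \beta^{(\ell)} .
\end{align*}
```

Hence $m_\ell \le \mu_\ell$ holds exactly when $x_{i_{k^*}^{(\ell)}} \le t_{k^*}^{(\ell)}$, provided the values are interpreted as integers without wrap-around in the message space, which mirrors the level-0 relation between $m_0$ and $\mu_0$.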






The second embodiment of step 5b that makes use of somewhat homomorphic encryption will now be described. Somewhat homomorphic encryption allows anyone to add encrypted messages as in additively homomorphic encryption. Advantageously it also allows anyone to multiply encrypted messages but just once. An example of such a scheme is the BGN (Boneh-Goh-Nissim) cryptosystem, whose description follows.


On input some security parameter, let $\mathbb{G}$ and $\mathbb{G}_T$ be two cyclic groups of order $n = q_1 q_2$ where $q_1$ and $q_2$ are prime, equipped with a bilinear map $e: \mathbb{G} \times \mathbb{G} \to \mathbb{G}_T$. Let also $g, u \leftarrow \mathbb{G}$ be two random elements in $\mathbb{G}$ and $h = u^{q_2}$. The public key is $pk = (n, \mathbb{G}, \mathbb{G}_T, e, g, h)$ and the secret key is $sk = q_1$. The message space $\mathcal{M}$ is the set $\{0, 1, \ldots, T\}$ with $T < q_2$.


The encryption of a message $m \in \mathcal{M}$ is given by $C = g^m h^r \in \mathbb{G}$ for some random element $r \in \{0, \ldots, n-1\}$. Define $C = [\![m]\!]$. Noting that $C^{q_1} = (g^m h^r)^{q_1} = (g^{q_1})^m$, the decryption of $C$ is obtained as the discrete logarithm of $C^{q_1}$ with respect to base $g^{q_1}$.


Next, set $G = e(g, g)$ and $H = e(g, h)$. There is another way to define the encryption of a message $m \in \mathcal{M}$. Choose some random element $r \in \{0, \ldots, n-1\}$ and define the encryption of $m$ as $\hat{C} = G^m H^r \in \mathbb{G}_T$. Define $\hat{C} = \langle\!\langle m \rangle\!\rangle$. The plaintext message $m$ can then be recovered as the discrete logarithm of $\hat{C}^{q_1}$ with respect to base $G^{q_1}$ using the secret key $q_1$.


For the first encryption scheme, encrypted messages may be added as follows: Given $C_1 = [\![m_1]\!]$ and $C_2 = [\![m_2]\!]$, anyone can obtain the encryption of $m_1 + m_2$ as $[\![m_1 + m_2]\!] = C_1 C_2 h^{r'}$ for any integer $r'$.


For the second encryption scheme, encrypted messages may be added as follows: Given $\hat{C}_1 = \langle\!\langle m_1 \rangle\!\rangle$ and $\hat{C}_2 = \langle\!\langle m_2 \rangle\!\rangle$, anyone can obtain the encryption of $m_1 + m_2$ as $\langle\!\langle m_1 + m_2 \rangle\!\rangle = \hat{C}_1 \hat{C}_2 H^{r'}$ for any integer $r'$.


Encrypted messages may be multiplied as follows: Given $C_1 = [\![m_1]\!]$ and $C_2 = [\![m_2]\!]$, anyone can obtain the encryption of $m_1 \cdot m_2$ as $\langle\!\langle m_1 \cdot m_2 \rangle\!\rangle = e(C_1, C_2) H^{r'}$ (for any integer $r'$). It can be verified that $e(C_1, C_2) H^{r'} = G^{m_1 m_2} H^{\tilde{r}}$ for some $\tilde{r}$.
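The verification can be written out under one extra assumption that is not stated in the text: $g$ generates $\mathbb{G}$, so that $u = g^{\alpha}$ for some integer $\alpha$ and therefore $h = g^{\alpha q_2}$. With $C_1 = g^{m_1} h^{r_1}$ and $C_2 = g^{m_2} h^{r_2}$, bilinearity gives $e(h, g) = e(g, h) = H$ and $e(h, h) = H^{\alpha q_2}$, so that:

```latex
\begin{align*}
e(C_1, C_2)\, H^{r'}
  &= e\bigl(g^{m_1} h^{r_1},\; g^{m_2} h^{r_2}\bigr)\, H^{r'} \\
  &= e(g,g)^{m_1 m_2}\; e(g,h)^{m_1 r_2}\; e(h,g)^{r_1 m_2}\; e(h,h)^{r_1 r_2}\; H^{r'} \\
  &= G^{m_1 m_2}\; H^{\,m_1 r_2 + m_2 r_1 + \alpha q_2 r_1 r_2 + r'} .
\end{align*}
```

This has the form $G^{m_1 m_2} H^{\tilde{r}}$ with $\tilde{r} = m_1 r_2 + m_2 r_1 + \alpha q_2 r_1 r_2 + r'$, i.e., a second-scheme encryption of the product $m_1 m_2$. Because the result lives in $\mathbb{G}_T$, where no further pairing is available, only one multiplication is possible.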


Assuming that a somewhat homomorphic encryption scheme is used, the multi-party computation protocol for step 5b may proceed as depicted in FIG. 2.


After the client computes $y_k$, for $0 \le k \le 2^\ell - 1$, and the server receives inputs $[\![x]\!]$, $\mathcal{T}'$, and $[\![y_k]\!]$, for $0 \le k \le 2^\ell - 1$, in step 1, the server chooses a random mask $\mu_\ell$. In step 2, the server computes $\langle\!\langle m_\ell \rangle\!\rangle$ where $m_\ell = \mu_\ell + \sum_{k=0}^{2^\ell-1} y_k \bigl(x_{i_k^{(\ell)}} - t_k^{(\ell)}\bigr)$. In step 3, the server sends $\langle\!\langle m_\ell \rangle\!\rangle$ to the client.


In step 4, the client decrypts $\langle\!\langle m_\ell \rangle\!\rangle$ to get $m_\ell$. In step 5, the client and server engage in a comparison protocol (e.g., the DGK+ protocol) on input $m_\ell$ for the client and $\mu_\ell$ for the server. The client then outputs $b_\ell'$ and the server outputs $b_\ell$. These output values can then be used as described above in steps 5c and 6 in the decision tree protocol.


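Note that $\langle\!\langle m_\ell \rangle\!\rangle$ may be computed by the server from $[\![x]\!]$, $\mathcal{T}'$, and $[\![y_k]\!]$ as it satisfies

$$\langle\!\langle m_\ell \rangle\!\rangle = \langle\!\langle \mu_\ell \rangle\!\rangle \cdot \prod_{k=0}^{2^\ell-1} e\Bigl([\![y_k]\!],\; [\![x_{i_k^{(\ell)}}]\!] \cdot [\![t_k^{(\ell)}]\!]^{-1}\Bigr).$$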







Decision trees have various applications in different fields including object recognition, molecular biology, and financial analysis. The embodiments described herein provide a new protocol for decision-tree evaluation that guarantees the privacy of both the user's information and the server's model.


These embodiments have potential applications in the emerging field of cloud computing. In a cloud-based query system, the service provider possesses a model which is developed by integrating the data of thousands of users and the client wants to learn the output of the model for her input data. Currently, such services require having access to the user's information in the clear. However, this information may be very sensitive in certain cases (such as medical data). The embodiments described herein can be a good alternative to be implemented for these applications in a privacy-preserving way.


The embodiments described herein represent an improvement in the technology of the secure evaluation of a decision tree by a party who does not have access to the underlying secure data and another party who does not have access to the specifics of the decision tree. These embodiments reduce the number of computations needed to evaluate the decision tree as well as the amount of data that must be exchanged between the parties engaged in evaluating the decision tree. As a result, the embodiments also lead to an improvement in the operation of a computer that may be used to carry out such secure decision tree evaluations.


The methods described above may be implemented in software which includes instructions, stored on a non-transitory machine-readable storage medium, for execution by a processor. The processor may include a memory that stores the instructions for execution by the processor.


Any combination of specific software running on a processor to implement the embodiments of the invention constitutes a specific dedicated machine.


As used herein, the term “non-transitory machine-readable storage medium” will be understood to exclude a transitory propagation signal but to include all forms of volatile and non-volatile memory. Further, as used herein, the term “processor” will be understood to encompass a variety of devices such as microprocessors, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and other similar processing devices. When software is implemented on the processor, the combination becomes a single specific machine.


It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the invention.


Although the various exemplary embodiments have been described in detail with particular reference to certain exemplary aspects thereof, it should be understood that the invention is capable of other embodiments and its details are capable of modifications in various obvious respects. As is readily apparent to those skilled in the art, variations and modifications can be effected while remaining within the spirit and scope of the invention. Accordingly, the foregoing disclosure, description, and figures are for illustrative purposes only and do not in any way limit the invention, which is defined only by the claims.

Claims
  • 1. A method for performing a secure evaluation of a decision tree, comprising: receiving, by a processor of a server, an encrypted feature vector $[\![x]\!]=([\![x_1]\!], \ldots, [\![x_n]\!])$ from a client, where n is an integer, wherein $[\![x]\!]$ denotes an additively homomorphic encryption of x using a cryptographic key of the second party; choosing a random mask $\mu_0$; calculating $m_0$ and sending $m_0$ to the client, wherein $m_0=x_{i_0^{(0)}}-t_0^{(0)}+\mu_0$ and $x_{i_0^{(0)}}$ is the value of entry $i_0^{(0)}$ of the feature vector x ($1\le i_0^{(0)}\le n$), and $t_0^{(0)}$ is a threshold value in the first node in the first level of a decision tree $\mathcal{T}'$; performing a comparison protocol on $m_0$ and $\mu_0$, wherein the server produces a comparison bit $b_0$ and the client produces a comparison bit $b_0'$; choosing a random bit $s_0\in\{0,1\}$ and when $s_0=1$ switching the left and right subtrees of $\mathcal{T}'$; sending $b_0\oplus s_0$ to the client; and for each level $\ell=1, 2, \ldots, d-1$ of the decision tree $\mathcal{T}'$, where d is the number of levels in the decision tree $\mathcal{T}'$, performing the following steps: receiving from the client $[\![y_k]\!]$ where $k=0, 1, \ldots, 2^\ell-1$; performing a comparison protocol on $m_\ell$ and $\mu_\ell$, wherein $\mu_\ell$ is a random mask and $m_\ell$ is based upon $[\![x]\!]$, $t_k^{(\ell)}$, $[\![y_k]\!]$, and $\mu_\ell$, and the server produces a comparison bit $b_\ell$ and the client produces a comparison bit $b_\ell'$; choosing a random bit $s_\ell\in\{0,1\}$ and when $s_\ell=1$ switching all left and right subtrees at level $\ell$ of $\mathcal{T}'$; and sending $b_\ell\oplus s_\ell$ to the client.
  • 2. The method of claim 1, wherein the server engages in a 1-out-of-$2^d$ oblivious transfer with the client to learn the value of $\mathcal{T}'(x)$.
  • 3. The method of claim 2, wherein the client computes $r=(\beta_0, \beta_1, \ldots, \beta_{d-1})_2$, which is the index of the leaf node in $\mathcal{T}'$ indicating the output of the decision tree, and where $\beta_\ell=b_\ell'\oplus b_\ell\oplus s_\ell$.
  • 4. The method of claim 1, wherein the first encryption uses the Paillier cryptosystem.
  • 5. The method of claim 1, wherein performing a comparison protocol on and , further comprises: choosing, by the server, random masks for 0≤k≤−1, and a random mask ;computing =;computing for 0≤k≤− where =−+; andsending and , . . . , to the client.
  • 6. The method of claim 5, wherein performing a comparison protocol on and , further comprises: decrypting, by the client, to get ; andcomputing, by the client, =−.
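  • 7. The method of claim 1, wherein performing a comparison protocol on $m_\ell$ and $\mu_\ell$ further comprises: choosing, by the server, a random mask $\mu_\ell$; computing $\langle\langle m_\ell\rangle\rangle$ where $m_\ell=\mu_\ell+\sum_{k=0}^{2^\ell-1} y_k\,(x_{i_k^{(\ell)}}-t_k^{(\ell)})$, wherein $\langle\langle w\rangle\rangle$ denotes a second homomorphic encryption of w using a cryptographic key of the second party; and sending $\langle\langle m_\ell\rangle\rangle$ to the client.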
  • 8. The method of claim 7, wherein performing a comparison protocol on $m_\ell$ and $\mu_\ell$ further comprises decrypting, by the client, $\langle\langle m_\ell\rangle\rangle$ to get $m_\ell$.
  • 9. The method of claim 7, wherein the second encryption uses the Boneh-Goh-Nissim (BGN) cryptosystem.
  • 10. The method of claim 7, wherein the second encryption uses a somewhat homomorphic cryptosystem.
  • 11. The method of claim 1, wherein for each level =1, 2, . . . , d−1 of the decision tree ′ the client computes yk=1{=k}, where =,encrypts yk resulting in yk, andsending yk to the server.
  • 12. A non-transitory machine-readable storage medium encoded with instructions for performing a secure evaluation of a decision tree, comprising: instructions for receiving, by a processor of a server, an encrypted feature vector $[\![x]\!]=([\![x_1]\!], \ldots, [\![x_n]\!])$ from a client, where n is an integer, wherein $[\![x]\!]$ denotes an additively homomorphic encryption of x using a cryptographic key of the second party; instructions for choosing a random mask $\mu_0$; instructions for calculating $m_0$ and sending $m_0$ to the client, wherein $m_0=x_{i_0^{(0)}}-t_0^{(0)}+\mu_0$ and $x_{i_0^{(0)}}$ is the value of entry $i_0^{(0)}$ of the feature vector x ($1\le i_0^{(0)}\le n$), and $t_0^{(0)}$ is a threshold value in the first node in the first level of a decision tree $\mathcal{T}'$; instructions for performing a comparison protocol on $m_0$ and $\mu_0$, wherein the server produces a comparison bit $b_0$ and the client produces a comparison bit $b_0'$; instructions for choosing a random bit $s_0\in\{0,1\}$ and when $s_0=1$ switching the left and right subtrees of $\mathcal{T}'$; instructions for sending $b_0\oplus s_0$ to the client; and for each level $\ell=1, 2, \ldots, d-1$ of the decision tree $\mathcal{T}'$, where d is the number of levels in the decision tree $\mathcal{T}'$, performing the following instructions: instructions for receiving from the client $[\![y_k]\!]$ where $k=0, 1, \ldots, 2^\ell-1$; instructions for performing a comparison protocol on $m_\ell$ and $\mu_\ell$, wherein $\mu_\ell$ is a random mask and $m_\ell$ is based upon $[\![x]\!]$, $t_k^{(\ell)}$, $[\![y_k]\!]$, and $\mu_\ell$, and the server produces a comparison bit $b_\ell$ and the client produces a comparison bit $b_\ell'$; instructions for choosing a random bit $s_\ell\in\{0,1\}$ and when $s_\ell=1$ switching all left and right subtrees at level $\ell$ of $\mathcal{T}'$; and instructions for sending $b_\ell\oplus s_\ell$ to the client.
  • 13. The non-transitory machine-readable storage medium of claim 12, wherein the server engages in a 1-out-of-$2^d$ oblivious transfer with the client to learn the value of $\mathcal{T}'(x)$.
  • 14. The non-transitory machine-readable storage medium of claim 13, wherein the client computes $r=(\beta_0, \beta_1, \ldots, \beta_{d-1})_2$, which is the index of the leaf node in $\mathcal{T}'$ indicating the output of the decision tree, and where $\beta_\ell=b_\ell'\oplus b_\ell\oplus s_\ell$.
  • 15. The non-transitory machine-readable storage medium of claim 12, wherein the first encryption uses the Paillier cryptosystem.
  • 16. The non-transitory machine-readable storage medium of claim 12, wherein instructions for performing a comparison protocol on and , further comprises: instructions for choosing, by the server, random masks for 0≤k≤−1, and a random mask ;instructions for computing =−;instructions for computing for 0≤k≤1 where =−+; andinstructions for sending and , . . . , to the client.
  • 17. The non-transitory machine-readable storage medium of claim 16, wherein instructions for performing a comparison protocol on and , further comprises: instructions for decrypting, by the client, to get ; andinstructions for computing, by the client, =−.
  • 18. The non-transitory machine-readable storage medium of claim 12, wherein instructions for performing a comparison protocol on $m_\ell$ and $\mu_\ell$ further comprise: instructions for choosing, by the server, a random mask $\mu_\ell$; instructions for computing $\langle\langle m_\ell\rangle\rangle$ where $m_\ell=\mu_\ell+\sum_{k=0}^{2^\ell-1} y_k\,(x_{i_k^{(\ell)}}-t_k^{(\ell)})$, wherein $\langle\langle w\rangle\rangle$ denotes a second homomorphic encryption of w using a cryptographic key of the second party; and instructions for sending $\langle\langle m_\ell\rangle\rangle$ to the client.
  • 19. The non-transitory machine-readable storage medium of claim 18, wherein instructions for performing a comparison protocol on $m_\ell$ and $\mu_\ell$ further comprise instructions for decrypting, by the client, $\langle\langle m_\ell\rangle\rangle$ to get $m_\ell$.
  • 20. The non-transitory machine-readable storage medium of claim 18, wherein the second encryption uses the Boneh-Goh-Nissim (BGN) cryptosystem.
  • 21. The non-transitory machine-readable storage medium of claim 18, wherein the second encryption uses a somewhat homomorphic cryptosystem.
  • 22. The non-transitory machine-readable storage medium of claim 12, wherein for each level =1, 2, . . . , d−1 of the decision tree ′ the client performs instructions for computes yk=1{=k}, where =,instructions for encrypts yk resulting in , andinstructions for sending to the server.