The present principles relate to privacy-preserving recommendation systems and secure multi-party computation, and in particular, to providing recommendations to rating contributing users based on matrix factorization, in a privacy-preserving and blind fashion.
A great deal of research and commercial activity in the last decade has led to the wide-spread use of recommendation systems. Such systems offer users personalized recommendations for many kinds of items, such as movies, TV shows, music, books, hotels, restaurants, and more.
Nevertheless, earlier studies, such as those by B. Mobasher, R. Burke, R. Bhaumik, and C. Williams: “Toward trustworthy recommender systems: An analysis of attack models and algorithm robustness.”, ACM Trans. Internet Techn., 7(4), 2007, and by E. Aimeur, G. Brassard, J. M. Fernandez, and F. S. M. Onana: “ALAMBIC: A privacy-preserving recommender system for electronic commerce”, Int. Journal Inf. Sec., 7(5), 2008, have identified multiple ways in which recommenders can abuse such information or expose the user to privacy threats. Recommenders are often motivated to resell data for a profit, but also to extract information beyond what is intentionally revealed by the user. For example, even records of user preferences typically not perceived as sensitive, such as movie ratings or a person's TV viewing history, can be used to infer a user's political affiliation, gender, etc. The private information that can be inferred from the data in a recommendation system is constantly evolving as new data mining and inference methods are developed, for either malicious or benign purposes. In the extreme, records of user preferences can even be used to uniquely identify a user: A. Narayanan and V. Shmatikov strikingly demonstrated this by de-anonymizing the Netflix dataset in “Robust de-anonymization of large sparse datasets”, in IEEE S&P, 2008. As such, even if the recommender is not malicious, an unintentional leakage of such data makes users susceptible to linkage attacks, that is, attacks which use one database as auxiliary information to compromise privacy in a different database.
Because one cannot always foresee future inference threats, accidental information leakage, or insider threats (purposeful leakage), it is of interest to build a recommendation system in which users do not reveal their personal data in the clear. A co-pending application by the inventors filed on the same date as this application and titled “A METHOD AND SYSTEM FOR PRIVACY PRESERVING MATRIX FACTORIZATION” describes a privacy-preserving recommendation system based on matrix factorization. It operates on ratings submitted by users to a recommender system, which profiles the items rated without learning the ratings of individual users or the items they rated. This presumes that the users consent to the recommender learning the item profiles.
The present principles propose a stronger privacy-preserving recommendation system in which the recommender system does not learn any information about the users' ratings and the items that the users have rated, and does not learn any information about the item profiles, or any statistical information extracted from user data, including the recommendations. Hence, the recommendation system provides recommendations to users who contributed ratings while being completely blind to the recommendations it provides.
The present principles propose a method for providing recommendations based on a collaborative filtering technique known as matrix factorization securely, in a privacy-preserving fashion. In particular, the method receives as inputs the ratings users gave to items (e.g., movies, books) and creates a profile for each item and each user that can be subsequently used to predict what rating a user would give to each item. The present principles allow a recommender system based on matrix factorization to perform this task without ever learning the ratings of a user, which items the user has rated, the item profiles or any statistical information extracted from user data. In particular, the recommendation system provides recommendations to users who contributed ratings, in the form of predictions on how they would rate items that they have not already rated, while being completely blind to the recommendations it provides.
According to one aspect of the present principles, a method for securely generating recommendations through matrix factorization is provided, the method including: receiving a set of records (220), wherein each record is received from a respective user and comprises a set of tokens and a set of items, and wherein each record is kept secret from parties other than its respective user; receiving a request from a requesting user for at least one particular item (330); evaluating the set of records in a Recommender (RecSys) (230) by using a garbled circuit (355) based on matrix factorization, wherein the output of the garbled circuit comprises masked item profiles for the at least one particular item and a masked user profile for the one requesting user; and jointly evaluating the masked item profiles and the masked user profile, among the requesting user, the RecSys and a Crypto-Service Provider (CSP) in order to generate recommendations to the requesting user about the at least one particular item (360-385), wherein each recommendation and user profile for the requesting user is kept secret from parties other than the requesting user and the item profiles for the at least one particular item are kept secret from all parties, and wherein a user profile and an item profile are unmasked versions of the respective masked user profile and masked item profile. The method can further include: designing the garbled circuit in the CSP to perform matrix factorization on the set of records (340), wherein the garbled circuit output comprises masked item profiles for the at least one particular item and a masked user profile for the one requesting user; and transferring the garbled circuit to the RecSys (345). The step of designing in the method can include: designing a matrix factorization operation as a Boolean circuit (3402). 
The step of designing a matrix factorization circuit can include: constructing an array of the set of records; and performing the operations of sorting (420, 440, 470, 490), copying (430, 450), updating (470, 480), comparing (480) and computing gradient contributions (460) on the array. The method can further include: receiving a set of parameters for the design of the garbled circuit by the CSP, wherein the parameters were sent by the RecSys (335).
According to one aspect of the present principles, the method can further include: encrypting the set of records to create encrypted records (315), wherein the step of encrypting is performed prior to the step of receiving a set of records. The method can further include: generating public encryption keys in the CSP; and sending the keys to the respective users (310). The encryption scheme can be a partially homomorphic encryption (310), and the method can include: masking the encrypted records in the RecSys to create masked records (320); and decrypting the masked records in the CSP to create decrypted-masked records (325). The step of designing (340) includes: unmasking the decrypted-masked records inside the garbled circuit prior to processing them. The method can further include: performing oblivious transfers (350) between the CSP and the RecSys (3502), wherein the RecSys receives the garbled values of the decrypted-masked records and the records are kept private from the RecSys and the CSP.
According to one aspect of the present principles, the step of jointly evaluating can further include: unmasking the masked user profile with a first mask to obtain the user profile (360); encrypting the user profile to create an encrypted user profile (360); calculating in the RecSys a first product of the encrypted user profile and the masked item profile for the at least one particular item (370); calculating in the CSP at least one second product of the encrypted user profile and at least one second mask for the at least one particular item (375); subtracting in the RecSys the at least one second product from the first product to create at least one encrypted recommendation for the at least one particular item (380); and decrypting the at least one encrypted recommendation for the at least one particular item (385). The first mask can be chosen by the requesting user (315) and the at least one second mask can be chosen by the CSP (340). The steps of encrypting and decrypting can use an additively homomorphic encryption scheme chosen by the requesting user (360).
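The joint evaluation of steps 360-385 can be illustrated with a minimal sketch. It assumes an additive item mask, integer (fixed-point) profile entries, and a toy Paillier instance as the additively homomorphic scheme. The function names, the tiny key size and the single-process layout are illustrative assumptions and are not part of the described system; in the protocol only the requesting user would hold the decryption key.

```python
import math
import secrets

# Toy Paillier parameters built from two Mersenne primes.
# Assumption for illustration only: far too small to be secure.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)   # private key (held only by the requesting user)
MU = pow(PHI, -1, N)      # private key; modular inverse needs Python 3.8+


def encrypt(m):
    """Paillier encryption of an integer m (negative values wrap modulo N)."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c):
    """Paillier decryption, mapped back to a signed integer."""
    m = (pow(c, PHI, N2) - 1) // N * MU % N
    return m - N if m > N // 2 else m


def encrypted_dot(enc_vec, plain_vec):
    """Enc(<u, w>) from the ciphertexts Enc(u_k) and plaintext weights w_k."""
    acc = 1
    for c, w in zip(enc_vec, plain_vec):
        acc = acc * pow(c, w % N, N2) % N2
    return acc


def blind_recommendation(u, v):
    """Predicted rating <u, v>, computed without any party other than the user seeing it."""
    rho = [secrets.randbelow(2**20) for _ in v]     # second mask, chosen by the CSP
    v_masked = [a + b for a, b in zip(v, rho)]      # masked item profile held by the RecSys
    enc_u = [encrypt(x) for x in u]                 # user encrypts her unmasked profile
    first = encrypted_dot(enc_u, v_masked)          # RecSys: Enc(<u, v + rho>)
    second = encrypted_dot(enc_u, rho)              # CSP: Enc(<u, rho>)
    enc_rating = first * pow(second, -1, N2) % N2   # RecSys: homomorphic subtraction
    return decrypt(enc_rating)                      # user decrypts the recommendation
```

For example, `blind_recommendation([3, -1, 2], [4, 5, -2])` returns the inner product of the two integer vectors, while the RecSys side only ever handles the masked item profile and ciphertexts.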
According to one aspect of the present principles, the method can further include: receiving the number of tokens and items of each record (220, 305). Furthermore, the method can further include: padding each record with null entries when the number of tokens of each record is smaller than a value representing a maximum value, in order to create records with a number of tokens equal to said value (3052). The source of the set of records can be a database.
According to one aspect of the present principles, a system for securely generating recommendations through matrix factorization, the system comprising a set of users which will provide a respective set of records, a Crypto-Service Provider (CSP) which will provide a secure matrix factorization circuit and a RecSys which will evaluate the records, such that each record is kept private from parties other than its respective user, wherein each user, the CSP and the RecSys each include: a processor (602), for receiving at least one input/output (604); and at least one memory (606, 608) in signal communication with the processor, wherein the RecSys processor is configured to: receive a set of records, wherein each record comprises a set of tokens and a set of items, and wherein each record is kept secret from parties other than the respective user; receive a request from a requesting user for at least one particular item; evaluate the set of records with a garbled circuit based on matrix factorization, wherein the output of the garbled circuit comprises masked item profiles for the at least one particular item and a masked user profile for the one requesting user; and wherein the requesting user, RecSys and CSP processors are configured to: jointly evaluate the masked item profiles and the masked user profile in order to generate recommendations to the requesting user about the at least one particular item (360-385), wherein each recommendation and user profile for the requesting user is kept secret from parties other than the requesting user, and the item profiles for the at least one particular item are kept secret from all parties and wherein a user profile and an item profile are unmasked versions of the respective masked user profile and masked item profile. 
The CSP processor in the system can be configured to: design the garbled circuit to perform matrix factorization of the set of records, wherein the garbled circuit output comprises masked item profiles for the at least one particular item and a masked user profile for the one requesting user; and transfer the garbled circuit to the RecSys. The CSP processor in the system can be configured to design the garbled circuit by being configured to: design a matrix factorization operation as a Boolean circuit. The CSP processor in the system can be configured to design the matrix factorization circuit by being configured to: construct an array of the set of records; and perform the operations of sorting, copying, updating, comparing and computing gradient contributions on the array. The CSP processor in the system can be further configured to: receive a set of parameters for the design of the garbled circuit, wherein the parameters were sent by the RecSys.
According to one aspect of the present principles, each user processor can be configured to: encrypt the respective record to create an encrypted record prior to providing the record. The CSP processor can be further configured to: generate public encryption keys in the CSP; and send the keys to the set of users. The encryption scheme can be a partially homomorphic encryption, wherein the RecSys processor is further configured to: mask the encrypted records to create masked records; and the CSP processor is further configured to: decrypt the masked records to create decrypted-masked records. The CSP processor in the system can be configured to design the garbled circuit by being further configured to: unmask the decrypted-masked records inside the garbled circuit prior to processing them. The RecSys processor and the CSP processor in the system can be further configured to perform oblivious transfers, wherein the RecSys receives the garbled values of the decrypted-masked records and the records are kept private from the RecSys and the CSP.
According to one aspect of the present principles, the requesting user processor can be further configured to: unmask the masked user profile with a first mask to obtain the user profile; encrypt the user profile to create an encrypted user profile; and decrypt at least one encrypted recommendation for the at least one particular item; wherein the RecSys processor is further configured to: calculate a first product of the encrypted user profile and the masked item profile for the at least one particular item; subtract at least one second product from the first product to create at least one encrypted recommendation for the at least one particular item; and wherein the CSP processor is further configured to: calculate in the CSP the at least one second product of the encrypted user profile and at least one second mask for the at least one particular item. The first mask can be chosen by the requesting user and the at least one second mask can be chosen by the CSP. The requesting user processor can be further configured to: use an additively homomorphic encryption scheme chosen by the requesting user.
According to one aspect of the present principles, the RecSys processor in the system can be further configured to: receive the number of tokens of each record, wherein the number of tokens was sent by the Source. Each user processor can be configured to: pad the respective record with null entries when the number of tokens of each record is smaller than a value representing a maximum value, in order to create records with a number of tokens equal to said value. The source of the set of records can be a database.
Additional features and advantages of the present principles will be made apparent from the following detailed description of illustrative embodiments which proceeds with reference to the accompanying figures.
The present principles may be better understood in accordance with the following exemplary figures briefly described below
In accordance with the present principles, a method is provided for performing recommendations based on a collaborative filtering technique known as matrix factorization securely, in a privacy-preserving and blind fashion.
The method of the present principles can serve as a service to make a recommendation about an item in a corpus of records, each record comprising a set of tokens and items. The set of records includes more than one record and the set of tokens includes at least one token. A skilled artisan will recognize in the example above that a record could represent a user; the tokens could be a user's ratings of the corresponding items in the record. The tokens can also represent ranks, weights or measures associated with items, and the items can represent persons, tasks or jobs. For example, the ranks, weights or measures can be associated with the health of an individual, and a researcher is trying to correlate the health measures of a population. Or they can be associated with the productivity of an individual and a company is trying to predict schedules for certain jobs, based on prior history. However, to ensure the privacy of the individuals involved, the service wishes to do so in a blind fashion, without learning the contents of each record, the item profiles it provides, or any statistical information extracted from user data (records). In particular, the service should not learn (a) in which records each token/item appeared or, a fortiori, (b) what tokens/items appear in each record, (c) the values of the tokens, and (d) the item profiles or any statistical information extracted from user data. In the following, terms and words like “privacy-preserving”, “private” and “secure” are used interchangeably to indicate that the information regarded as private by a user (record) is only known by the user; the word “blind” is used to indicate that parties other than the user are blind to the recommendation as well.
There are several challenges associated with performing matrix factorization in a privacy-preserving way. First, to address the privacy concerns, matrix factorization should be performed without the recommender ever learning the users' ratings, or even which items they have rated. The latter requirement is key: earlier studies show that even knowing which movie a user has rated can be used to infer, e.g., her gender. Second, such a privacy-preserving algorithm ought to be efficient, and scale gracefully (e.g., linearly) with the number of ratings submitted by users. The privacy requirements imply that the matrix factorization algorithm ought to be data-oblivious: its execution ought not to depend on the user input. Moreover, the operations performed by matrix factorization are non-linear; thus it is not a priori clear how to implement matrix factorization efficiently under both of these constraints. Finally, in a practical, real-world scenario, users have limited communication and computation resources, and should not be expected to remain online after they have supplied their data. Instead it is desirable to have a “send and forget” type solution that can operate in the presence of users that move back and forth between being online and offline from the recommendation service.
As an overview of matrix factorization, in the standard “collaborative filtering” setting, n users rate a subset of m possible items (e.g., movies). For [n] := {1, . . . , n} the set of users, and [m] := {1, . . . , m} the set of items, denote by ℳ ⊂ [n]×[m] the user/item pairs for which a rating has been generated, and by M = |ℳ| the total number of ratings. Finally, for (i, j) ∈ ℳ, denote by r_{i,j} ∈ ℝ the rating generated by user i for item j. In a practical setting, both n and m are large numbers, typically ranging between 10^4 and 10^6. In addition, the ratings provided are sparse, that is, M = O(n+m), which is much smaller than the total number of potential ratings n×m. This is consistent with typical user behavior, as each user may rate only a finite number of items (not depending on m, the “catalogue” size).
Given the ratings in ℳ, a recommender system wishes to predict the ratings for user/item pairs in [n]×[m]\ℳ. Matrix factorization performs this task by fitting a bi-linear model on the existing ratings. In particular, for some small dimension d ∈ ℕ, it is assumed that there exist vectors u_i ∈ ℝ^d, i ∈ [n], and v_j ∈ ℝ^d, j ∈ [m], such that
\[ r_{i,j} = \langle u_i, v_j \rangle + \varepsilon_{i,j} \tag{1} \]
where ε_{i,j} are i.i.d. (independent and identically distributed) Gaussian random variables. The vectors u_i and v_j are called the user and item profiles, respectively, and ⟨u_i, v_j⟩ is the inner product of the vectors. The notation used is U = [u_i^T]_{i∈[n]} ∈ ℝ^{n×d} for the n×d matrix whose i-th row comprises the profile of user i, and V = [v_j^T]_{j∈[m]} ∈ ℝ^{m×d} for the m×d matrix whose j-th row comprises the profile of item j.
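Given the ratings R = {r_{i,j} : (i, j) ∈ ℳ}, the recommender typically computes the profiles U and V by performing the following regularized least squares minimization:
\[ \min_{U,V} \; \frac{1}{M} \sum_{(i,j)\in\mathcal{M}} \bigl( r_{i,j} - \langle u_i, v_j \rangle \bigr)^2 + \lambda \sum_{i\in[n]} \|u_i\|_2^2 + \mu \sum_{j\in[m]} \|v_j\|_2^2 \tag{2} \]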
for some positive λ, μ > 0. One skilled in the art will recognize that, assuming Gaussian priors on the profiles U and V, the minimization in (2) corresponds to maximum likelihood estimation of U and V. Note that, having the user and item profiles, the recommender can subsequently predict the ratings R̂ = {r̂_{i,j} : i ∈ [n], j ∈ [m]} such that, for user i and item j:
\[ \hat{r}_{i,j} = \langle u_i, v_j \rangle, \quad i \in [n],\ j \in [m] \tag{3} \]
The regularized mean square error in (2) is not a convex function; several methods for performing this minimization have been proposed in literature. The present principles focus on gradient descent, a popular method used in practice, which is described as follows. Denoting by F(U, V) the regularized mean square error in (2), gradient descent operates by iteratively adapting the profiles U and V through the adaptation rule:
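\[ u_i(t) = u_i(t-1) - \gamma \nabla_{u_i} F\bigl(U(t-1), V(t-1)\bigr), \qquad v_j(t) = v_j(t-1) - \gamma \nabla_{v_j} F\bigl(U(t-1), V(t-1)\bigr) \tag{4} \]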
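where γ > 0 is a small gain factor and
\[ \nabla_{u_i} F(U,V) = -\frac{2}{M} \sum_{j:(i,j)\in\mathcal{M}} v_j \bigl( r_{i,j} - \langle u_i, v_j \rangle \bigr) + 2\lambda u_i, \qquad \nabla_{v_j} F(U,V) = -\frac{2}{M} \sum_{i:(i,j)\in\mathcal{M}} u_i \bigl( r_{i,j} - \langle u_i, v_j \rangle \bigr) + 2\mu v_j \tag{5} \]
where U(0) and V(0) consist of uniformly random norm 1 rows (i.e., profiles are selected u.a.r. (uniformly at random) from the norm 1 ball).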
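The gradient descent procedure described above can be sketched in the clear (with no privacy protection) as follows. This is a minimal illustration under stated assumptions: the objective is taken to be the mean squared error over the observed ratings plus the two regularization terms, and the function names, default parameter values and the toy data layout (a dictionary keyed by (i, j)) are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit_vector(d, rng):
    """Random vector of Euclidean norm 1 (normalized Gaussian sample)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(dot(g, g))
    return [x / norm for x in g]


def objective(ratings, U, V, lam, mu):
    """Assumed form of F(U, V): mean squared error plus regularization."""
    mse = sum((r - dot(U[i], V[j])) ** 2 for (i, j), r in ratings.items()) / len(ratings)
    return mse + lam * sum(dot(u, u) for u in U) + mu * sum(dot(v, v) for v in V)


def factorize(ratings, n, m, d=2, lam=0.01, mu=0.01, gamma=0.02, iters=500, seed=0):
    """Plain gradient descent on F; ratings maps (i, j) -> r_ij."""
    rng = random.Random(seed)
    U = [unit_vector(d, rng) for _ in range(n)]
    V = [unit_vector(d, rng) for _ in range(m)]
    M = len(ratings)
    for _ in range(iters):
        # Regularization part of each gradient.
        grad_U = [[2 * lam * x for x in u] for u in U]
        grad_V = [[2 * mu * x for x in v] for v in V]
        # Data part: one contribution per observed rating, O(M) per iteration.
        for (i, j), r in ratings.items():
            e = 2.0 * (r - dot(U[i], V[j])) / M
            for k in range(d):
                grad_U[i][k] -= e * V[j][k]
                grad_V[j][k] -= e * U[i][k]
        # Simultaneous update of all profiles from the previous iterate.
        U = [[x - gamma * g for x, g in zip(u, gu)] for u, gu in zip(U, grad_U)]
        V = [[x - gamma * g for x, g in zip(v, gv)] for v, gv in zip(V, grad_V)]
    return U, V
```

A predicted rating for any pair is then `dot(U[i], V[j])`. The loop over `ratings.items()` is exactly the data-dependent access pattern that a data-oblivious circuit cannot use directly.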
Another aspect of the present principles is proposing a secure multi-party computation (MPC) algorithm for matrix factorization based on sorting networks and Yao's garbled circuits. Secure multi-party computation (MPC) was initially proposed by A. Chi-Chih Yao in the 1980's. Yao's protocol (a.k.a. garbled circuits) is a generic method for secure multi-party computation. In a variant thereof, adapted from “Privacy-preserving ridge regression on hundreds of millions of records”, in IEEE S&P, 2013, by V. Nikolaenko, U. Weinsberg, S. Ioannidis, M. Joye, D. Boneh, and N. Taft, the protocol is run between a set of n input owners, where a_i denotes the private input of user i, 1 ≤ i ≤ n, an Evaluator, that wishes to evaluate ƒ(a_1, . . . , a_n), and a third party, the Crypto-Service Provider (CSP). At the end of the protocol, the Evaluator learns the value of ƒ(a_1, . . . , a_n) but no party learns more than what is revealed from this output value. The protocol requires that the function ƒ can be expressed as a Boolean circuit, e.g. as a graph of OR, AND, NOT and XOR gates, and that the Evaluator and the CSP do not collude.
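To make the notion of a garbled gate concrete, the following is a minimal sketch of garbling and evaluating a single two-input Boolean gate. It is a textbook-style toy, not the construction used by any particular framework: the label length, the use of SHA-256 as the row cipher and the all-zero tag used to recognize the correct row are assumptions made for illustration.

```python
import hashlib
import secrets

LABEL_LEN = 16
TAG = bytes(LABEL_LEN)  # all-zero tag marking a correctly decrypted row


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def pad(label_a, label_b):
    """Derive a 32-byte one-time pad from two input wire labels."""
    return hashlib.sha256(label_a + label_b).digest()


def garble_gate(gate):
    """Garbler side: random labels per wire value and a shuffled encrypted truth table."""
    labels = {w: (secrets.token_bytes(LABEL_LEN), secrets.token_bytes(LABEL_LEN))
              for w in ("a", "b", "out")}
    table = []
    for x in (0, 1):
        for y in (0, 1):
            out_label = labels["out"][gate(x, y)]
            table.append(xor(pad(labels["a"][x], labels["b"][y]), out_label + TAG))
    secrets.SystemRandom().shuffle(table)  # hide which row encodes which inputs
    return labels, table


def evaluate_gate(table, label_a, label_b):
    """Evaluator side: holds one label per input wire and learns one output label."""
    for row in table:
        plain = xor(pad(label_a, label_b), row)
        if plain[LABEL_LEN:] == TAG:
            return plain[:LABEL_LEN]
    raise ValueError("no row decrypted correctly")
```

The evaluator sees only opaque labels, so it cannot tell which bit each label stands for; in the variant described above the input labels would reach it through (proxy) oblivious transfer.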
Many frameworks that implement Yao's garbled circuits have recently become available. A different approach to general purpose MPC is based on secret-sharing schemes and another is based on fully-homomorphic encryption (FHE). Secret-sharing schemes have been proposed for a variety of linear algebra operations, such as solving a linear system, linear regression, and auctions. Secret-sharing requires at least three non-colluding online authorities that equally share the workload of the computation, and communicate over multiple rounds; the computation is secure as long as no two of them collude. Garbled circuits assume only two non-colluding authorities and far less communication, which is better suited to the scenario where the Evaluator is a cloud service and the Crypto-Service Provider (CSP) is implemented in a trusted hardware component.
Regardless of the cryptographic primitive used, the main challenge in building an efficient algorithm for secure multi-party computation is in implementing the algorithm in a data-oblivious fashion, i.e., so that the execution path does not depend on the input. In general, any RAM program executable in bounded time T can be converted to a O(T^3) Turing machine (TM), which is a theoretical computing machine invented by Alan Turing to serve as an idealized model for mathematical calculation, and wherein O(T^3) means that the complexity is proportional to T^3. In addition, any bounded T-time TM can be converted to a circuit of size O(T log T), which is data-oblivious. This implies that any bounded T-time executable RAM program can be converted to a data-oblivious circuit with a O(T^3 log T) complexity. Such complexity is too high and is prohibitive in most applications. A survey of algorithms for which efficient data-oblivious implementations are unknown can be found in “Secure multi-party computation problems and their applications: A review and open problems”, in New Security Paradigms Workshop, 2001, by W. Du and M. J. Atallah; the matrix factorization problem broadly falls into the category of Data Mining summarization problems.
Sorting networks were originally developed to enable sorting parallelization as well as an efficient hardware implementation. These networks are circuits that sort an input sequence (a_1, a_2, . . . , a_n) into a monotonically increasing sequence (a′_1, a′_2, . . . , a′_n). They are constructed by wiring together compare-and-swap circuits, their main building block. Several works exploit the data-obliviousness of sorting networks for cryptographic purposes. However, encryption is not always enough to ensure privacy. If an adversary can observe a user's access patterns to encrypted storage, the adversary can still learn sensitive information about what the user's applications are doing. Oblivious RAM solves this problem by continuously shuffling memory as it is being accessed, thereby completely hiding what data is being accessed or even when it was previously accessed. In oblivious RAM, sorting is used as a means of generating a data-oblivious random permutation. More recently, it has been used to perform data-oblivious computations of a convex hull, all-nearest neighbors, and weighted set intersection.
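The data-obliviousness of a sorting network can be seen in a short sketch. The example below uses an odd-even transposition network, chosen here only because it is simple to state; it has O(n^2) comparators, whereas practical constructions such as Batcher's networks use O(n log^2 n). The function names are illustrative.

```python
def comparators(n):
    """Fixed comparator positions of an odd-even transposition network on n wires.

    The list depends only on n, never on the values being sorted.
    """
    return [(i, i + 1) for rnd in range(n) for i in range(rnd % 2, n - 1, 2)]


def compare_and_swap(a, b):
    """The basic building block: outputs (smaller, larger)."""
    return min(a, b), max(a, b)


def sort_oblivious(values):
    """Apply the same comparator sequence regardless of the input values."""
    out = list(values)
    for i, j in comparators(len(out)):
        out[i], out[j] = compare_and_swap(out[i], out[j])
    return out
```

Because `comparators(n)` is fixed in advance, the same wiring can be laid out as a circuit at design time, before any input is known.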
The present principles propose a method based on secure multi-party sorting which is close to weighted set intersection but which incorporates garbled circuits.
According to the present principles, a protocol is proposed that allows the RecSys to execute matrix factorization and provide recommendations, while neither the RecSys nor the CSP learns anything useful about the users, including the recommendations, R̂, given to the users, for which the encrypted values are the outputs of RecSys in
The item profile can be seen as a metric which defines an item as a function of the ratings of a set of users/records. Similarly, a user profile can be seen as a metric which defines a user as a function of the ratings of a set of users/records. In this sense, an item profile is a measure of approval/disapproval of an item, that is, a reflection of the features or characteristics of an item. And a user profile is a measure of the likes/dislikes of a user, that is, a reflection of the user's personality. If calculated based on a large set of users/records, an item or user profile can be seen as an independent measure of the item or user, respectively. One with skill in the art will realize that there is a utility in learning the item profiles alone. First, the embedding of items in ℝ^d through matrix factorization allows the recommender to infer (and encode) similarity: items whose profiles have small Euclidean distance are items that are rated similarly by users. As such, the task of learning the item profiles is of interest to the recommender beyond the actual task of recommendations. In particular, the users may not need or wish to receive recommendations, as may be the case if the Source is a database. Second, having obtained the item profiles, there is a trivial way in which the recommender can use them to provide relevant recommendations without any additional data revelation by users. The recommender can send V to a user (or release it publicly); knowing her ratings per item, user i can infer her (private) profile, u_i, by solving (2) with respect to u_i for a given V (this is a separable problem), so that each user can obtain her profile by performing ridge regression over her ratings. Having u_i and V, the user can predict all her ratings to other items locally through (3).
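The local computation available to a user once V is known can be sketched as a small ridge regression. The sketch assumes the per-user problem is to minimize the sum of squared errors over her own ratings plus `lam` times the squared norm of her profile, where `lam` is a generic ridge weight (any normalization of the overall objective would rescale it). The helper names and the Gauss-Jordan solver are illustrative choices.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    d = len(b)
    aug = [row[:] + [b[k]] for k, row in enumerate(A)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(d):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[k][d] / aug[k][k] for k in range(d)]


def user_profile(V, my_ratings, lam=0.1):
    """Ridge regression: minimize sum_j (r_j - <u, v_j>)^2 + lam * ||u||^2.

    V is a list of item profiles; my_ratings maps item index j -> rating.
    """
    d = len(V[0])
    A = [[lam if a == b else 0.0 for b in range(d)] for a in range(d)]
    rhs = [0.0] * d
    for j, r in my_ratings.items():
        for a in range(d):
            rhs[a] += r * V[j][a]
            for b in range(d):
                A[a][b] += V[j][a] * V[j][b]
    return solve(A, rhs)  # solves (sum_j v_j v_j^T + lam I) u = sum_j r_j v_j


def predict(u, V):
    """Predicted ratings <u, v_j> for every item j."""
    return [sum(x * y for x, y in zip(u, v)) for v in V]
```

Only the user's own ratings and the item profiles enter this computation, which is why it can run entirely on the user's side.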
This is the subject of a co-pending application by the inventors filed on the same date as this application and titled “A METHOD AND SYSTEM FOR PRIVACY-PRESERVING RECOMMENDATION BASED ON MATRIX FACTORIZATION AND RIDGE REGRESSION”.
Both of the scenarios discussed above presume that neither the recommender nor the users object to the public release of V. For the sake of simplicity, as well as on account of the utility of such a protocol to the recommender, a co-pending application by the inventors filed on the same date as this application and titled “A METHOD AND SYSTEM FOR PRIVACY PRESERVING MATRIX FACTORIZATION” allows the recommender to learn the item profiles. The present principles extend this design so that users learn their predicted ratings while the recommender performs the operation in a blind fashion and does not learn any useful information about the users, not even V.
According to the present principles, it is assumed that the security guarantees will hold under the honest but curious threat model. In other words, the RecSys and CSP follow the protocols as prescribed; however, these interested parties may elect to analyze protocol transcripts, even off-line, in order to infer some additional information. It is further assumed that the recommender and CSP do not collude.
The preferred embodiment of the present principles comprises a protocol satisfying the flowchart 300 in
The above construction works only for users whose user profiles were computed through matrix factorization. A new user, who has not yet submitted her data, cannot get recommendations this way. Co-pending application Ser. No. ______ and titled “A METHOD AND SYSTEM FOR PRIVACY-PRESERVING RECOMMENDATION BASED ON MATRIX FACTORIZATION AND RIDGE REGRESSION” addresses this particular case.
Technically, this protocol leaks the number of tokens provided by each user. This can be rectified through a simple protocol modification, e.g., by “padding” the records submitted with an appropriate number of “null” entries until reaching a pre-set maximum number 312. For simplicity, the protocol was described without this “padding” operation.
As garbled circuits can only be used once, any future computation on the same ratings would require the users to re-submit their data through proxy oblivious transfer. A proxy oblivious transfer is an oblivious transfer in which three or more parties are involved. For this reason, the protocol of the present principles adopted the hybrid approach, combining public-key encryption with garbled circuits.
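The padding modification can be sketched as follows. The representation of a "null" entry as `(None, None)` and the function name are hypothetical; in an actual circuit a null entry would need an encoding with the same bit width as a real (item, rating) pair so that it is indistinguishable once encrypted.

```python
NULL_ENTRY = (None, None)  # hypothetical placeholder for a "null" (item, rating) pair


def pad_record(record, max_len):
    """Extend a user's record with null entries up to a pre-set maximum length."""
    if len(record) > max_len:
        raise ValueError("record exceeds the pre-set maximum number of tokens")
    return list(record) + [NULL_ENTRY] * (max_len - len(record))
```

After padding, every user submits exactly `max_len` entries, so the count of genuine ratings is no longer visible from the record length.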
In the present principles, public-key encryption is used as follows: Each user i encrypts her respective mask θi under the public key, pkCSP, provided by the CSP with a semantically secure encryption algorithm ξpk
The CSP public-key encryption algorithm is partially homomorphic: a constant can be applied to an encrypted message without the knowledge of the corresponding decryption key. Clearly, an additively homomorphic scheme such as Paillier or Regev can also be used to add a constant, but hash-ElGamal, which is only partially homomorphic, suffices and can be implemented more efficiently in this case.
Upon receiving M ratings from users—recalling that the encryption is partially homomorphic—the RecSys obscures them with random masks ĉ=c⊕η, where η is a random or pseudo-random variable and ⊕ is an XOR operation. The RecSys sends them to the CSP together with the complete specifications needed to build a garbled circuit. In particular, the RecSys specifies the dimension of the user and item profiles (i.e., parameter d), the total number of ratings (i.e., parameter M), and the total number of users and of items, as well as the number of bits used to represent the integer and fractional parts of a real number in the garbled circuit.
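The masking step relies on the fact that a hashed-ElGamal ciphertext can be XOR-ed with a constant without knowing the decryption key. The sketch below illustrates this with a toy instance; the group parameters (a Mersenne prime modulus and base 3), the 16-byte message format and all function names are assumptions for illustration and are not a secure or prescribed instantiation.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy modulus (a Mersenne prime); NOT a secure group choice
G = 3            # toy base, assumed for illustration
MSG_LEN = 16


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def H(x):
    """Hash a group element down to a MSG_LEN-byte pad."""
    return hashlib.sha256(x.to_bytes(16, "big")).digest()[:MSG_LEN]


def keygen():
    sk = secrets.randbelow(P - 2) + 1
    return sk, pow(G, sk, P)


def encrypt(pk, msg):
    """Hashed ElGamal: ciphertext is (g^r, H(pk^r) XOR msg)."""
    r = secrets.randbelow(P - 2) + 1
    return pow(G, r, P), xor(H(pow(pk, r, P)), msg)


def decrypt(sk, ct):
    c1, c2 = ct
    return xor(H(pow(c1, sk, P)), c2)


def mask(ct, eta):
    """Applied without the key: the plaintext becomes msg XOR eta."""
    c1, c2 = ct
    return c1, xor(c2, eta)
```

In this sketch the user would call `encrypt` with the CSP's public key, the RecSys would call `mask` with a fresh `eta`, and the CSP's `decrypt` would then return only the masked value, which is later unmasked inside the garbled circuit.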
Whenever the RecSys wishes to perform matrix factorization over M accumulated ratings, it reports M to the CSP. The CSP may provide the RecSys with a garbled circuit that (a) decrypts the inputs and then (b) performs matrix factorization. In “Privacy-preserving ridge regression on hundreds of millions of records”, in IEEE S&P, 2013, by V. Nikolaenko, U. Weinsberg, S. Ioannidis, M. Joye, D. Boneh, and N. Taft, decryption within the circuit is avoided by using masks and homomorphic encryption. The present principles apply this idea to matrix factorization, but only require a partially homomorphic encryption scheme.
Upon receiving the encryptions, the CSP decrypts them and gets the masked values (i, (j, ri,j)⊕η). Then, using the matrix factorization as a blueprint, the CSP prepares a Yao's garbled circuit that:
At the end of the protocol, the RecSys sends the respective ûi to each user i, who can then recover her profile ui by removing the mask: ui=ûi−θi, i∈[n]. If a user i wishes to get item recommendations, the user encrypts her profile under her own public key, pki, with an additively homomorphic encryption algorithm pk
The computation of matrix factorization by the gradient descent operations outlined in (4) and (5) involves additions, subtractions and multiplications of real numbers. These operations can be efficiently implemented in a circuit. The K iterations of gradient descent (4) correspond to K circuit “layers”, each computing the new values of profiles from values in the preceding layer. The outputs of the circuit are the item profiles V, while the user profiles are discarded.
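Because a circuit operates on bits, the real-number additions, subtractions and multiplications mentioned above are typically carried out on fixed-point integers. The following is a minimal sketch of that arithmetic, assuming 16 fractional bits; the constant `FRAC_BITS` and the helper names are hypothetical, and overflow of the integer part is not modeled.

```python
FRAC_BITS = 16  # hypothetical number of bits for the fractional part

def to_fixed(x: float) -> int:
    """Encode a real number as a scaled integer."""
    return int(round(x * (1 << FRAC_BITS)))

def to_float(a: int) -> float:
    """Decode a fixed-point integer back to a float."""
    return a / (1 << FRAC_BITS)

def fx_add(a: int, b: int) -> int:
    return a + b

def fx_sub(a: int, b: int) -> int:
    return a - b

def fx_mul(a: int, b: int) -> int:
    """Multiply, then drop the extra fractional bits (truncation)."""
    return (a * b) >> FRAC_BITS

def fx_dot(u, v) -> int:
    """Inner product of two fixed-point vectors."""
    total = 0
    for a, b in zip(u, v):
        total = fx_add(total, fx_mul(a, b))
    return total

# Prediction error r - <u, v> for one rating, computed entirely on integers.
u = [to_fixed(0.5), to_fixed(-1.25)]
v = [to_fixed(2.0), to_fixed(0.5)]
err = fx_sub(to_fixed(4.0), fx_dot(u, v))
```

Fewer fractional bits give a smaller circuit at the cost of a larger truncation error in each multiplication.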
One with skill in the art will observe that the time complexity of computing each iteration of gradient descent is O(M), when operations are performed in the clear, e.g., in the RAM model. The computation of each gradient (5) involves adding 2M terms, and profile updates (4) can be performed in O(n+m)=O(M).
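For reference, gradient descent in the clear can be sketched as follows. Equations (4) and (5) are not reproduced in this passage, so the sketch assumes the usual regularized squared-error objective, with γ as the step size and λ, μ as the user and item regularization weights; the function names and default values are hypothetical. Each iteration makes one pass over the M ratings and one pass over the n+m profiles.

```python
import random

def init_profiles(count, d, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(count)]

def loss(ratings, U, V, lam, mu):
    """Assumed objective: squared error plus L2 regularization."""
    err = sum((r - sum(a * b for a, b in zip(U[i], V[j]))) ** 2 for i, j, r in ratings)
    reg_u = lam * sum(x * x for u in U for x in u)
    reg_v = mu * sum(x * x for v in V for x in v)
    return err + reg_u + reg_v

def factorize(ratings, n, m, d=2, K=200, gamma=0.01, lam=0.1, mu=0.1, seed=0):
    rng = random.Random(seed)
    U, V = init_profiles(n, d, rng), init_profiles(m, d, rng)
    for _ in range(K):
        # Regularization part of the gradients: O(n + m).
        gU = [[2 * lam * x for x in u] for u in U]
        gV = [[2 * mu * x for x in v] for v in V]
        # One pass over the ratings: O(M) terms.
        for i, j, r in ratings:
            e = r - sum(a * b for a, b in zip(U[i], V[j]))
            for t in range(d):
                gU[i][t] -= 2 * e * V[j][t]
                gV[j][t] -= 2 * e * U[i][t]
        # Profile updates: O(n + m).
        for i in range(n):
            U[i] = [x - gamma * g for x, g in zip(U[i], gU[i])]
        for j in range(m):
            V[j] = [x - gamma * g for x, g in zip(V[j], gV[j])]
    return U, V

ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
U0, V0 = factorize(ratings, 3, 3, K=0)
U, V = factorize(ratings, 3, 3, K=200)
```

Here the ratings are looked up by index, which is exactly what a circuit cannot do without knowing at design time which (i, j) pairs are rated.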
The main challenge in implementing gradient descent as a circuit lies in doing so efficiently. To illustrate this, one may consider the following naïve implementation:
Unfortunately, this implementation is inefficient: every iteration of the gradient descent algorithm has a circuit complexity of O(n×m). When M<<n×m, as is usually the case in practice, the above circuit is drastically less efficient than gradient descent in the clear. In fact, the quadratic cost O(n×m) is prohibitive for most datasets. The inefficiency of the naïve implementation arises from the inability to identify, at circuit design time, which users rate an item and which items are rated by a user; this prevents the circuit from exploiting the inherent sparsity of the data.
Conversely, according to the preferred embodiment of the present principles, a circuit implementation is provided based on sorting networks whose complexity is O((n+m+M)log²(n+m+M)), i.e., within a polylogarithmic factor of the implementation in the clear.
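To give a feel for the gap between the two bounds, the short calculation below compares the leading terms for a hypothetical dataset. The dataset sizes are made up for illustration, and constant factors are ignored, so the numbers indicate orders of magnitude only and are not gate counts.

```python
import math

def naive_cost(n: int, m: int) -> float:
    """Leading term of the naive circuit, per iteration: n * m."""
    return float(n * m)

def sorting_cost(n: int, m: int, M: int) -> float:
    """Leading term of the sorting-based circuit: N * log2(N)^2 with N = n + m + M."""
    N = n + m + M
    return N * math.log2(N) ** 2

# Hypothetical sizes: one million users, 100,000 items, ten million ratings.
n, m, M = 10**6, 10**5, 10**7
ratio = naive_cost(n, m) / sorting_cost(n, m, M)
```

The sparser the ratings relative to n×m, the larger this ratio becomes.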
In summary, both the input data, corresponding to the tuples (i, j, ri,j), and placeholders ⊥ for both the user and item profiles are stored together in an array. Through appropriate sorting operations, user or item profiles can be placed close to the input with which they share an identifier. Linear passes through the data allow the computation of gradients, as well as updates of the profiles. When sorting, the placeholder is treated as +∞, i.e., larger than any other number.
The matrix factorization algorithm according to a preferred embodiment of the present principles and satisfying the flowchart 400 in
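$$s_{5,k} \leftarrow s_{3,k} \cdot s_{5,k-1} + (1 - s_{3,k}) \cdot s_{5,k}, \quad \text{for } k = 2, \ldots, M+n$$
$$s_{6,k} \leftarrow s_{3,k} \cdot s_{6,k-1} + (1 - s_{3,k}) \cdot s_{6,k}, \quad \text{for } k = 2, \ldots, M+m$$
$$s_{6,k} \leftarrow s_{6,k} + s_{3,k+1} \cdot s_{6,k+1} + (1 - s_{3,k}) \cdot 2\gamma\mu\, s_{6,k}, \quad \text{for } k = M+n-1, \ldots, 1$$
$$s_{5,k} \leftarrow s_{5,k} + s_{3,k+1} \cdot s_{5,k+1} + (1 - s_{3,k}) \cdot 2\gamma\lambda\, s_{5,k}, \quad \text{for } k = M+n-1, \ldots, 1$$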
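The first of the linear passes above, which copies each user profile into the rating tuples that follow it, can be sketched as below. The sketch assumes that the flag s3 is 0 for a profile tuple and 1 for a rating tuple, that the array has already been sorted by user id and then by flag, and that placeholder profiles are represented by zero vectors; the dictionary layout and the use of Python's built-in sort in place of a sorting network are illustrative choices only.

```python
def profile_tuple(user, profile):
    """Tuple carrying a user profile (flag s3 = 0); the item id is a placeholder."""
    return {"s1": user, "s2": None, "s3": 0, "s4": 0.0, "s5": list(profile)}

def rating_tuple(user, item, rating, d):
    """Tuple carrying a rating (flag s3 = 1); the profile is a zero placeholder."""
    return {"s1": user, "s2": item, "s3": 1, "s4": rating, "s5": [0.0] * d}

def copy_user_profiles(S):
    """Left-to-right pass: s5[k] <- s3[k] * s5[k-1] + (1 - s3[k]) * s5[k]."""
    for k in range(1, len(S)):
        flag = S[k]["s3"]
        S[k]["s5"] = [flag * prev + (1 - flag) * cur
                      for prev, cur in zip(S[k - 1]["s5"], S[k]["s5"])]
    return S

profiles = {0: [1.0, 2.0], 1: [3.0, 4.0]}
S = [rating_tuple(0, 7, 5.0, 2), rating_tuple(1, 7, 3.0, 2), rating_tuple(0, 9, 4.0, 2)]
S += [profile_tuple(u, p) for u, p in profiles.items()]

# Stand-in for the sorting network: user id first, then flag, so that each
# profile tuple sits immediately before the ratings of the same user.
S.sort(key=lambda t: (t["s1"], t["s3"]))
copy_user_profiles(S)
```

The pass performs the same arithmetic on every position regardless of the data, which is what makes it expressible as a circuit.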
The gradient descent iterations comprise the following three major steps:
The above operations are repeated K times, where K is the desired number of gradient descent iterations. Finally, at the termination of the last iteration, the array is sorted with respect to the flags (i.e., s3,k) as a primary index, and the item ids (i.e., s2,k) as a secondary index. This brings all item profile tuples to the first m positions in the array, from which the item profiles can be output. Furthermore, in order to obtain the user profiles, at the termination of the last iteration, the array is sorted with respect to the flags (i.e., s3,k) as a primary index, and the user ids (i.e., s1,k) as a secondary index. This brings all user profile tuples to the first n positions in the array, from which the user profiles can be output.
One with skill in the art will recognize that each of the above operations is data-oblivious, and can be implemented as a circuit. Copying and updating profiles requires Θ(n+m+M) gates, so the overall complexity is determined by sorting which, e.g., using Batcher's circuit, yields an O((n+m+M)log²(n+m+M)) cost. Sorting and the gradient computation in step C6 of the algorithm are the most computationally intensive operations; fortunately, both are highly parallelizable. In addition, sorting can be further optimized by reusing previously computed comparisons at each iteration. In particular, this circuit can be implemented as a Boolean circuit (e.g., as a graph of OR, AND, NOT and XOR gates), which allows the implementation to be garbled, as previously explained.
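The data-oblivious character of sorting with Batcher's network can be seen in the following sketch of odd-even mergesort. The sequence of compare-exchange positions depends only on the input length, never on the values. The function names are hypothetical; the input is padded to a power of two with +∞, and the compare-exchange is written with `min`/`max` as a stand-in for the corresponding comparator sub-circuit.

```python
def batcher_comparators(n):
    """Yield comparator pairs (a, b), a < b, of Batcher's odd-even mergesort
    for n inputs, where n is a power of two."""
    p = 1
    while p < n:
        k = p
        while k >= 1:
            for j in range(k % p, n - k, 2 * k):
                for i in range(min(k, n - j - k)):
                    if (i + j) // (2 * p) == (i + j + k) // (2 * p):
                        yield i + j, i + j + k
            k //= 2
        p *= 2

def oblivious_sort(values):
    """Sort by applying the fixed comparator sequence; padding entries are +inf."""
    n = 1
    while n < len(values):
        n *= 2
    data = list(values) + [float("inf")] * (n - len(values))
    for a, b in batcher_comparators(n):
        lo, hi = min(data[a], data[b]), max(data[a], data[b])
        data[a], data[b] = lo, hi
    return data[:len(values)]

sorted_ids = oblivious_sort([5, 3, 9, 1, 4])
```

Comparators yielded within the same inner `for j` loop touch disjoint positions, which is one source of the parallelism noted above.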
According to the present principles, the implementation of the matrix factorization algorithm described above together with the protocol previously described provides a novel method for recommendation, in a privacy-preserving fashion. In addition, this solution yields a circuit with a complexity within a polylogarithmic factor of matrix factorization performed in the clear by using sorting networks. Furthermore, an additional advantage of this implementation is that the garbling and the execution of this circuit are highly parallelizable.
In an implementation of a system according to the present principles, the garbled circuit construction was based on FastGC, a publicly available garbled circuit framework. FastGC is a Java-based open-source framework, which enables circuit definition using elementary XOR, OR and AND gates. Once the circuits are constructed, the framework handles garbling, oblivious transfer and the complete evaluation of the garbled circuit. However, before garbling and executing the circuit, FastGC represents the entire ungarbled circuit in memory as a set of Java objects. These objects incur a significant memory overhead relative to the memory footprint that the ungarbled circuit should introduce, as only a subset of the gates is garbled and/or executed at any point in time. Moreover, although FastGC performs garbling in parallel to the execution process as described above, both operations occur in a sequential fashion: gates are processed one at a time, once their inputs are ready. A skilled artisan will clearly recognize that this implementation is not amenable to parallelization.
As a result, the framework was modified to address these two issues, reducing the memory footprint of FastGC but also enabling parallelized garbling and computation across multiple processors. In particular, the ability to partition a circuit horizontally into sequential “layers” was introduced, each layer comprising a set of vertical “slices” that can be executed in parallel. A layer is created in memory only when all its inputs are ready. Once it is garbled and evaluated, the entire layer is removed from memory, and the following layer can be constructed, thus limiting the memory footprint to the size of the largest layer. The execution of a layer is performed using a scheduler that assigns its slices to threads, enabling them to run in parallel. Although parallelization was implemented on a single machine with multiple cores, the implementation can be extended to run across different machines in a straightforward manner since no shared state between slices is assumed.
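The layer-and-slice scheduling described above can be sketched in a few lines. This is not the modified FastGC code, which is Java-based and is not shown in this description; it is a hypothetical illustration in which a layer is a function that, given the previous layer's outputs, builds a list of independent slices, and a thread pool evaluates the slices of one layer before the next layer is built. No garbling is performed, and Python threads serve only to show the scheduling pattern.

```python
from concurrent.futures import ThreadPoolExecutor

def run_layered(layer_builders, inputs, workers=4):
    """Evaluate layers sequentially; evaluate the slices of each layer in parallel."""
    values = inputs
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for build in layer_builders:
            slices = build(values)  # layer is created only once its inputs are ready
            values = list(pool.map(lambda s: s(), slices))  # slices share no state
            del slices  # layer is discarded before the next one is built
    return values

def square_layer(vals):
    """One slice per input value."""
    return [lambda v=v: v * v for v in vals]

def pair_sum_layer(vals):
    """One slice per adjacent pair of values."""
    return [lambda a=a, b=b: a + b for a, b in zip(vals[0::2], vals[1::2])]

result = run_layered([square_layer, pair_sum_layer], [1, 2, 3, 4])
```

Because only one layer's slices exist at a time, peak memory in this pattern is bounded by the largest layer, mirroring the footprint argument above.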
Finally, to implement the numerical operations outlined in the algorithm, FastGC was extended to support additions and multiplications over the reals with fixed-point number representation, as well as sorting. For sorting, Batcher's sorting network was used. Fixed-point representation introduces a tradeoff between the accuracy loss resulting from truncation and the size of the circuit. Furthermore, the implementation of the algorithm was optimized in multiple ways, in particular:
It is to be understood that the present principles may be implemented in various forms of hardware, software, firmware, special purpose processors, or a combination thereof. Preferably, the present principles are implemented as a combination of hardware and software. Moreover, the software is preferably implemented as an application program tangibly embodied on a program storage device. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units (CPU), a random access memory (RAM), and input/output (I/O) interface(s). The computer platform also includes an operating system and microinstruction code. The various processes and functions described herein may either be part of the microinstruction code or part of the application program (or a combination thereof), which is executed via the operating system. In addition, various other peripheral devices may be connected to the computer platform such as an additional data storage device and a printing device.
It is to be further understood that, because some of the constituent system components and method steps depicted in the accompanying figures are preferably implemented in software, the actual connections between the system components (or the process steps) may differ depending upon the manner in which the present principles are programmed. Given the teachings herein, one of ordinary skill in the related art will be able to contemplate these and similar implementations or configurations of the present principles.
Although the illustrative embodiments have been described herein with reference to the accompanying figures, it is to be understood that the present principles are not limited to those precise embodiments, and that various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present principles. All such changes and modifications are intended to be included within the scope of the present principles as set forth in the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
PCT/US2013/076353 | Dec 2013 | US | national |
This application claims the benefit of and priority to the U.S. Provisional patent applications filed on Aug. 9, 2013: Ser. No. 61/864,088 and titled “A METHOD AND SYSTEM FOR PRIVACY PRESERVING MATRIX FACTORIZATION”; Ser. No. 61/864,085 and titled “A METHOD AND SYSTEM FOR PRIVACY PRESERVING COUNTING”; Ser. No. 61/864,094 and titled “A METHOD AND SYSTEM FOR PRIVACY-PRESERVING RECOMMENDATION TO RATING CONTRIBUTING USERS BASED ON MATRIX FACTORIZATION”; and Ser. No. 61/864,098 and titled “A METHOD AND SYSTEM FOR PRIVACY-PRESERVING RECOMMENDATION BASED ON MATRIX FACTORIZATION AND RIDGE REGRESSION”. In addition, this application claims the benefit of and priority to the PCT Patent Application filed on Dec. 19, 2013, Serial No. PCT/US13/76353 and titled “A METHOD AND SYSTEM FOR PRIVACY PRESERVING COUNTING” and to the U.S. Provisional patent application filed on Mar. 4, 2013: Ser. No. 61/772,404 and titled “PRIVACY-PRESERVING LINEAR AND RIDGE REGRESSION”. The provisional and PCT applications are expressly incorporated by reference herein in their entirety for all purposes.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2014/036359 | 5/1/2014 | WO | 00 |
Number | Date | Country |
---|---|---|
61772404 | Mar 2013 | US |
61864088 | Aug 2013 | US |
61864094 | Aug 2013 | US |
61864098 | Aug 2013 | US |
61864085 | Aug 2013 | US |