The present invention generally relates to data privacy, and more particularly relates to private approximation protocols.
The availability of distributed massive datasets has led to significant privacy concerns. While generic techniques such as secure function evaluation (SFE) and fully homomorphic encryption (FHE) are available, such techniques concern exact computation. For large datasets, computing even basic statistics exactly is prohibitive or impossible.
In one embodiment, a method for transforming a two-party approximation protocol into a private approximation protocol is disclosed. The method comprises receiving a first input $x \in \{0, 1, \ldots, M\}^n$ and a second input $y \in \{0, 1, \ldots, M\}^n$ of a two-party approximation protocol (TPAP) for approximating a function of the form $f(x, y) = \sum_{j=1}^{n} g(x_j, y_j)$, where g is any non-negative efficiently computable function. Variable B is set as a public upper bound on ƒ(x, y) for the first input x and the second input y. The variable $l = O^*(1)$. The following is performed until $\sum_{j=1}^{l} z_j \ge l/t$ or $B < 1$, where t is an arbitrary number: (1) a private importance sampling protocol is executed with the first input x, the second input y, and a third input $1^k$, independently for $j \in [l]$, where k is a security parameter, the output of the private importance sampling protocol being shares of $I_j \in [n] \cup \{\bot\}$; (2) l coin tosses $z_1, \ldots, z_l$, where $z_j = 1$ iff $I_j \ne \bot$, are independently generated; and (3) B is divided by 2. A determination is made that $\sum_{j=1}^{l} z_j \ge l/t$ or $B < 1$. A private (ε, δ)-approximation protocol Ψ for $f(x, y) = \sum_{j=1}^{n} g(x_j, y_j)$, where ε is an arbitrary number and δ = exp(−k), is outputted.
In another embodiment, an information processing system for transforming a two-party approximation protocol into a private approximation protocol is disclosed. The information processing system comprises a memory and a processor that is communicatively coupled to the memory. A private approximation protocol generator is communicatively coupled to the processor and the memory. The private approximation protocol generator is configured to perform a method. The method comprises receiving a first input $x \in \{0, 1, \ldots, M\}^n$ and a second input $y \in \{0, 1, \ldots, M\}^n$ of a two-party approximation protocol (TPAP) for approximating a function of the form $f(x, y) = \sum_{j=1}^{n} g(x_j, y_j)$, where g is any non-negative efficiently computable function. Variable B is set as a public upper bound on ƒ(x, y) for the first input x and the second input y. The variable $l = O^*(1)$. The following is performed until $\sum_{j=1}^{l} z_j \ge l/t$ or $B < 1$, where t is an arbitrary number: (1) a private importance sampling protocol is executed with the first input x, the second input y, and a third input $1^k$, independently for $j \in [l]$, where k is a security parameter, the output of the private importance sampling protocol being shares of $I_j \in [n] \cup \{\bot\}$; (2) l coin tosses $z_1, \ldots, z_l$, where $z_j = 1$ iff $I_j \ne \bot$, are independently generated; and (3) B is divided by 2. A determination is made that $\sum_{j=1}^{l} z_j \ge l/t$ or $B < 1$. A private (ε, δ)-approximation protocol Ψ for $f(x, y) = \sum_{j=1}^{n} g(x_j, y_j)$, where ε is an arbitrary number and δ = exp(−k), is outputted.
In yet another embodiment, a computer program product for transforming a two-party approximation protocol into a private approximation protocol is disclosed. The computer program product comprises a storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method. The method comprises receiving a first input $x \in \{0, 1, \ldots, M\}^n$ and a second input $y \in \{0, 1, \ldots, M\}^n$ of a two-party approximation protocol (TPAP) for approximating a function of the form $f(x, y) = \sum_{j=1}^{n} g(x_j, y_j)$, where g is any non-negative efficiently computable function. Variable B is set as a public upper bound on ƒ(x, y) for the first input x and the second input y. The variable $l = O^*(1)$. The following is performed until $\sum_{j=1}^{l} z_j \ge l/t$ or $B < 1$, where t is an arbitrary number: (1) a private importance sampling protocol is executed with the first input x, the second input y, and a third input $1^k$, independently for $j \in [l]$, where k is a security parameter, the output of the private importance sampling protocol being shares of $I_j \in [n] \cup \{\bot\}$; (2) l coin tosses $z_1, \ldots, z_l$, where $z_j = 1$ iff $I_j \ne \bot$, are independently generated; and (3) B is divided by 2. A determination is made that $\sum_{j=1}^{l} z_j \ge l/t$ or $B < 1$. A private (ε, δ)-approximation protocol Ψ for $f(x, y) = \sum_{j=1}^{n} g(x_j, y_j)$, where ε is an arbitrary number and δ = exp(−k), is outputted.
The accompanying figures where like reference numerals refer to identical or functionally similar elements throughout the separate views, and which together with the detailed description below are incorporated in and form part of the specification, serve to further illustrate various embodiments and to explain various principles and advantages all in accordance with the present invention, in which:
Operating Environment
The computer system/server 102 is illustratively shown in the form of a general-purpose computing device. The components of computer system/server 102 include, but are not limited to, one or more processors or processing units 104, a system memory 106, and a bus 108 that couples various system components including system memory 106 to processor 104. The bus 108 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnects (PCI) bus.
Computer system/server 102 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer system/server 102 and includes both volatile and non-volatile media, and removable and non-removable media. The system memory 106 of this embodiment includes computer system readable media in the form of volatile memory, such as random access memory (RAM) 112 and cache memory 114.
Computer system/server 102 can further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example, storage system 116 of this embodiment is provided for reading from and writing to a non-removable, non-volatile magnetic media (i.e., a “hard drive”). A magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (i.e., a “floppy disk”), and an optical disk drive for reading from or writing to a removable, non-volatile optical disk such as a CD-ROM, DVD-ROM, or other optical media can also be provided. In such instances, each is connected to bus 108 by one or more data media interfaces. Additionally, memory 106 includes at least one program product having one or more program modules that are configured to carry out the functions of embodiments of the present invention.
Program/utility 118, having one or more program modules 120, is stored in memory 106. In this embodiment, Program/utility 118 also includes an operating system, one or more application programs, other program modules, and program data. Each of the operating system, one or more application programs, other program modules, and program data, or some combination thereof, may include an implementation of a networking environment. Program modules 120 generally carry out the functions and/or methodologies of embodiments of the present invention.
The computer system/server 102 also comprises a private approximation protocol (PAP) generator 110 that performs one or more of the functions discussed below with respect to
Overview
An approximation protocol for a function ƒ is a two-party protocol in which one party has an input vector x, the other has an input vector y, and the parties output an approximation to ƒ(x, y). The approximation protocol is private if the parties do not learn anything about each other's input other than what follows from the exact value ƒ(x, y). It is insufficient to use secure function evaluation or fully homomorphic encryption on a standard, non-private protocol for approximating f. This is because the approximation may reveal information about x and y that does not follow from ƒ(x, y). In the past, efficient private approximation protocols were only known for a few specific problems.
One type of private approximation protocol is a two-party private approximation protocol. Generally speaking, a two-party protocol for a function ƒ(x, y), where the first party has input x and the second input y, is a private approximation protocol (PAP) of ƒ(x, y) if it satisfies the following two properties. First, the output F(x, y) must be a functionally private approximation (FPA). That is, it approximates ƒ(x, y) in the usual sense, for example, is an (ε, δ)-approximation (F(x, y) is an (ε, δ)-approximation of ƒ(x, y) if ∀x, y, Pr[(1−ε)ƒ(x, y)≦F(x, y)≦(1+ε)ƒ(x, y)]≧1−δ), and its distribution can be simulated given only the exact function value ƒ(x, y). Thus, an FPA captures the intuition that each party learns nothing about the other party's input from the output except what follows from ƒ(x, y) and the party's own input. The second condition of a PAP is that the entire view of the parties can be simulated given only ƒ(x, y).
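The accuracy half of the (ε, δ)-approximation definition above can be checked empirically. The following is a minimal sketch, not part of the disclosed protocol: the helper names and the toy estimator are hypothetical, and the check says nothing about the functional-privacy (simulatability) half of the definition.

```python
import random


def within_relative_error(estimate, exact, eps):
    """True iff (1 - eps) * exact <= estimate <= (1 + eps) * exact."""
    return (1 - eps) * exact <= estimate <= (1 + eps) * exact


def empirical_failure_rate(estimator, exact, eps, trials, rng):
    """Fraction of trials in which a randomized estimator leaves the band.

    `estimator` is any callable taking a random.Random instance and
    returning one estimate; it stands in for the output F(x, y).
    """
    misses = sum(
        not within_relative_error(estimator(rng), exact, eps)
        for _ in range(trials)
    )
    return misses / trials
```

An estimator is consistent with an (ε, δ)-approximation when the measured failure rate stays at or below δ, up to sampling error in the measurement itself.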
In general, it is insufficient to perform secure function evaluation (SFE) or fully homomorphic encryption (FHE) on a standard, non-private protocol for approximating ƒ. This is because the approximation F(x, y) may reveal information about x and y that does not follow from ƒ(x, y). For example, if ƒ(x, y) is the Hamming distance between x and y, the least significant bit of the approximation may equal an arbitrary bit of x. Given a protocol that outputs an FPA, it can be compiled in a generic way using an FHE scheme to obtain a PAP while increasing the computation, communication, and round complexity by only an O*(1) factor. The notation O*(ƒ) means $f(k, n, M, \varepsilon) \cdot \mathrm{poly}(k \varepsilon^{-1} \log(nM) \log(1/\delta))$, where k is a security parameter. Thus, the main focus of previous work on PAPs is on designing FPAs. An FPA is also independently motivated, for instance, if two honest parties wish to publish a statistic of their joint data that is functionally private.
Similarity estimation is a basic primitive for comparing massive data sets. A generic similarity measure between vectors $x, y \in \{-M, -M+1, \ldots, M\}^n$ is $\sum_{j=1}^{n} g(x_j, y_j)$, for some function g. One of the well-studied similarity measures is the $l_p$-distance $\|x-y\|_p$ for $p \ge 0$, or equivalently, the p-th power of the $l_p$-distance, known as the p-th frequency moment. Here, the function $g(z) = |z|^p$, so that $\|x-y\|_p^p = \sum_{j=1}^{n} |x_j - y_j|^p$. When p=0, $0^0$ is interpreted as 0, and so $l_0$ measures the number of coordinates for which x and y differ.
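As a concrete instance of the similarity measure above, the following sketch computes the p-th power of the l_p-distance exactly and in the clear, including the convention that 0^0 is read as 0 when p=0. The function names are illustrative only, and nothing here is private or approximate.

```python
def g_p(xj, yj, p):
    """Per-coordinate term |xj - yj|**p, with 0**0 interpreted as 0."""
    d = abs(xj - yj)
    if p == 0:
        return 0 if d == 0 else 1
    return d ** p


def f_p(x, y, p):
    """Sum of g_p over all coordinates: the p-th power of the l_p-distance."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(g_p(a, b, p) for a, b in zip(x, y))
```

For p=0 the result is the number of coordinates in which the two vectors differ, which is the Hamming norm of x−y.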
One known PAP for the $l_p$-distances gives an $O^*(\sqrt{n})$ communication protocol for privately approximating the Hamming distance between bit-strings. This has been extended to O*(1) communication and $O^*(n^2)$ work for the Euclidean distance, for which Hamming distance on bit-strings is a special case. The work has also been reduced to O*(n) using the fast Fourier transform (FFT). There are also known PAPs for the problem of finding the $l_2$-heavy hitters of x−y, and to a weaker extent the $l_1$-heavy hitters. The latter problem is used to detect all coordinates i for which $|x_i - y_i|$ is large. There is also a known FPA of the $l_p$-distance which critically relies on p-stable distributions for $p \in (0, 2]$. Nothing is known for $p \in \{0\} \cup (2, \infty)$, despite these being well-studied distances. The case p=0 is known as the Hamming norm, a generalization of Hamming distance to non-binary strings, while p=3 is the skewness and p=4 the kurtosis.
Embodiments of the present invention provide private approximation protocols (PAPs) for one or more of these functions. For example, one embodiment provides the following general transformation: any two-party protocol for outputting a (1+ε)-approximation to $f(x, y) = \sum_{j=1}^{n} g(x_j, y_j)$ with probability of at least ⅔, for any non-negative efficiently computable function g, can be compiled (e.g., via the PAP generator) into a two-party private approximation protocol with only a polylogarithmic factor loss in communication, computation, and round complexity.
By applying the transformation and variations of it provided by embodiments of the present invention, near-optimal private approximation protocols are obtained for a wide range of problems in data streaming. Near-optimal private approximation protocols are provided for the $l_p$-distance for every $p \ge 0$, for the heavy hitters and importance sampling problems with respect to any $l_p$-norm, for the max-dominance and other dominant $l_p$-norms, for the distinct summation problem, for entropy, for cascaded frequency moments, for subspace approximation and block sampling, and for measuring independence of datasets. Using a result for data streams, embodiments obtain private approximation protocols with polylogarithmic communication for every non-decreasing and symmetric function $g(x_j, y_j) = h(x_j - y_j)$ with at most quadratic growth. If the original (non-private) protocol is a simultaneous protocol, e.g., a sketching algorithm, then the only cryptographic assumption is efficient symmetric computationally-private information retrieval; otherwise it is fully homomorphic encryption. The various protocols provided by embodiments of the present invention generalize straightforwardly to more than two parties.
Protocol Privacy Definition and Tools
The following is a discussion of the various preliminaries for the PAP transformation process (e.g., as performed by the PAP generator). In this illustrative embodiment, the security parameter k is set to $k = \mathrm{polylog}(n)$. Thus, in the following definitions of privacy, it is insufficient to protect against poly(k)-time adversaries, as the parties themselves run in poly(n) time. Hence, security is defined throughout with respect to exp(k)-time algorithms. This embodiment relies on the notion of computational indistinguishability. Distributions $\mathcal{D}_1$ and $\mathcal{D}_2$ are computationally indistinguishable, denoted $\mathcal{D}_1 \stackrel{c}{\equiv} \mathcal{D}_2$, if for every pair of random variables $X_1 \sim \mathcal{D}_1$ and $X_2 \sim \mathcal{D}_2$ and for any family of exp(k)-size circuits $\{C_k\}$, $|\Pr[C_k(X_1) = 1] - \Pr[C_k(X_2) = 1]| = \exp(-k)$.
A two-party private protocol will now be defined. Given two parties/entities, Alice and Bob, let h be a possibly randomized mapping from input pairs (a, b) to output pairs (c, d). A randomized synchronous protocol proceeds in rounds. In each round a party sends a message based on the security parameter k, the party's input and random tape, as well as messages passed in previous rounds. During each round either party may decide to terminate based on the party's view, which is a party's input and its random tape together with all messages exchanged. It should be noted that in this embodiment a random tape of an entity is a string of random bits stored in memory by the entity and unknown to the other entity. Such a string can be generated in various ways, e.g., by using a random number generator such as AES (Advanced Encryption Standard).
To capture the privacy of a protocol Π for a mapping h, the random variable $\mathrm{REAL}_{\Pi,A}(k, (a, b))$ is used. This contains the view of Alice in Π with the input to the protocol set to (a, b), concatenated with the output of Bob (this concatenation is required for technical reasons). $\mathrm{REAL}_{\Pi,B}(k, (a, b))$ is similarly defined. Next, for an efficient (poly(n)-time) algorithm S known as a simulator, let $\mathrm{IDEAL}_{\Pi,A,S,h}(k, (a, b))$ be the output of the random process: (1) apply h to (a, b), resulting in a pair of outputs (c, d), (2) invoke S on (k, a, c), and (3) concatenate the output of S with d. $\mathrm{IDEAL}_{\Pi,B,S,h}(k, (a, b))$ is similarly defined.
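A private two-party protocol Π of a randomized mapping h is a protocol for which: (1) the distribution on outputs has $l_1$-distance exp(−k) from that of h, and (2) there is an efficient (poly(n)-time) simulator $S_A$ such that for any input pair (a, b), $\{\mathrm{REAL}_{\Pi,A}(k, (a, b))\}_{k \in \mathbb{N}} \stackrel{c}{\equiv} \{\mathrm{IDEAL}_{\Pi,A,S_A,h}(k, (a, b))\}_{k \in \mathbb{N}}$, and likewise an efficient simulator $S_B$ for which the analogous relation holds for Bob's view.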
The notion of a symmetric computationally-private information retrieval (SPIR) protocol is used, in which Alice has a string $a \in \{0, 1\}^n$ while Bob has an index $i \in [n]$. The randomized mapping is $h(a, i) = a_i$, and an SPIR protocol is a private protocol for h.
It is known how to construct an SPIR protocol from a PIR protocol (namely, a protocol for SPIR which relaxes privacy to only require that there is a simulator SB in the above definition for a private two-party protocol Π, rather than both simulators SA and SB). The PIR to SPIR transformation only incurs an O*(1) factor blowup in communication, computation, and number of rounds. Let C(n) be the communication of a PIR protocol with O*(n) work per party and O*(1) rounds. C(n) can be as low as O*(1). It is assumed that such a scheme exists in the following.
As an example, two parties are said to jointly evaluate a circuit with ROM if the (randomized) mapping the parties compute can be expressed as a circuit whose gates, in addition to those of a complete basis on bitstrings, can be lookup gates. Here, Alice (resp. Bob) builds a table RAε{0, 1}n (resp. RB), and the lookup gate, given a pair (A, j) (resp. (B, j)), outputs RA(j) (resp. RB(j)).
Given a PIR (and hence an SPIR) scheme with C(n)=O*(n), any circuit with ROM Λ can be privately computed with O*(|Λ|) communication, O*(n|Λ|) work, and O*(|Λ|) rounds, where |Λ| is the number of gates in Λ.
A standard composition theorem will now be given. An oracle-aided protocol using an oracle functionality $\mathcal{O}$ privately computes h if there are simulators $S_A$ and $S_B$ as in the above definition for a private two-party protocol Π, where the corresponding views of the parties are defined in the natural manner to include oracle answers. Suppose there is a private oracle-aided protocol for h given oracle functionality $\mathcal{O}$, and a private protocol for computing $\mathcal{O}$. Then the protocol defined by replacing each oracle call to $\mathcal{O}$ by a protocol that privately computes $\mathcal{O}$ is a private protocol for h.
Transformation of an Approximation Protocol into a PAP
The following is a detailed discussion on transforming any two-party protocol for approximating a function ƒ(x, y) of the form ƒ(x, y)=Σj=1ng(xj, yj), for any non-negative efficiently computable function g, into a PAP for ƒ(x, y) with the same communication, computation, and round complexity, up to an O*(1) factor. The computation also increases by an additive O*(n), but this does not affect the asymptotic complexity of any problem considered here, because all problems here require at least linear time. Despite the intuition that designing PAPs for functions is more difficult than feeding a protocol for an approximation into an SFE or an FHE scheme, the transformation provided by the PAP generator 110 shows there is still a generic compiler of an approximation protocol into a private one for a very large class of functions. While two parties are used here, the PAPs of embodiments of the present invention are also applicable to more than two parties.
The PAP generator 110 first transforms an approximation protocol into an FPA using an importance sampling procedure such as the g-Sampler protocol discussed below with respect to
where B is a known upper bound on $\sum_{j=1}^{n} g(x_j, y_j)$. In this context, “secret shares” means that the first party obtains $i \oplus r$ and the second party obtains r, where r is a random bitstring, so that neither party alone knows the index i, although their outputs taken together determine i.
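The XOR secret sharing just described can be sketched as follows. This is an illustrative, non-normative example: the function names are hypothetical, and `secrets.randbits` from the Python standard library stands in for each party's source of random bits.

```python
import secrets


def share_index(i, nbits):
    """Split index i into two XOR shares of nbits bits each.

    Returns (i XOR r, r), where r is a uniformly random nbits-bit string.
    """
    if not 0 <= i < (1 << nbits):
        raise ValueError("index does not fit in nbits bits")
    r = secrets.randbits(nbits)
    return i ^ r, r


def reconstruct_index(share_a, share_b):
    """Recover the index from both shares taken together."""
    return share_a ^ share_b
```

Each share on its own is uniformly distributed over nbits-bit strings, which is why neither party learns the index from its own share.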
Given a protocol TPAP for (O(1/log n), ⅓)-approximating $\sum_{j=1}^{n} g(x_j, y_j)$, the PAP generator 110 first amplifies TPAP's success probability to 1−exp(−k) by independent repetition, taking the median. There is also an assumed public upper bound B on $\sum_{j=1}^{n} g(x_j, y_j)$ for all x and y. For problems considered in this discussion, one can take $B = (Mn)^{O(1)}$. This embodiment of the present invention designs an efficient method for two parties to sample from the distribution π on $[n] \cup \{\bot\}$:
where
Some embodiments do not achieve a protocol sampling exactly from π, but they show how to sample from a distribution π′ with l1-distance exp(−k) from π, where k is a security parameter. The protocol starts by one party sending a seed of a pseudorandom generator to the other party, determining a pseudorandom string σ shared by both parties. This is the standard model, not the common reference string model.
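The seed exchange described above can be illustrated by expanding a short seed into a longer string that both parties derive identically. The sketch below swaps in SHAKE-256 from Python's `hashlib` as the expander, which is an assumption made for illustration; the text does not specify a particular pseudorandom generator, and `expand_seed` is a hypothetical helper name.

```python
import hashlib


def expand_seed(seed: bytes, nbytes: int) -> bytes:
    """Deterministically stretch a seed into nbytes of output.

    Both parties call this on the same seed and obtain the same string,
    which plays the role of the shared string sigma.
    """
    return hashlib.shake_256(seed).digest(nbytes)
```

Only the seed crosses the channel; the expanded string never has to be transmitted, so the cost of sharing σ is the seed length rather than the length of σ.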
A complete binary tree is considered on n coordinates. A probability ri is assigned to each leaf i of based on the execution of TPAP with random string σ as follows. Once σ is fixed, an approximation aS
for each (vj, wj) pair along this path. Since TPAP provides an (O(1/log n), exp(−k))-approximation, a telescoping product is obtained, and the following can be shown.
The concept is for the parties to perform a binary search on the coordinates of [n] by, starting from the root, applying TPAP independently on a node v and its sibling w, and choosing which node to recurse on based on the values aS
otherwise w is recursed on. Upon reaching a single coordinate i, the value g(xi, yi) is obtained by exchanging xi and yi.
This embodiment uses rejection sampling, a technique for generating samples from a target distribution ƒ(z) by drawing from a proposal distribution g(z) that is often easier to sample from than ƒ(z), subject to the restriction that ƒ(z)<Vg(z) for some bound V>1 (here ƒ and g denote distributions, not the functions being approximated). This restriction cannot hold for all z since ƒ(z) and g(z) are distributions; in this embodiment, the only z for which it will not hold is z=⊥. Rejection sampling is used to adjust the probability of outputting i so that it equals
The probability to reject the sample i knowing g(xi, yi) and computing ri, and rejecting with probability
can be determined. To do the rejection sampling, the probability, in this embodiment, is an overestimate of
with overwhelming probability, over the choice of σ, as otherwise this is not a valid probability, and the protocol is not simulatable. For correctness, this must hold even when B≧2Σj=1ng(xj, yj). Indeed,
since
If i is rejected, this probability mass contributes to π′(⊥). Rejection sampling is only possible because embodiments zoom in on individual coordinates, for which the exact probability
can be efficiently computed.
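The interplay between the tree descent and the rejection step can be sketched in the clear, without any cryptography. The sketch rests on two labeled assumptions: the target probability of index i is taken to be g(x_i, y_i)/B, and the acceptance probability at a leaf reached with probability β is taken to be g(x_i, y_i)/(B·β). The function `estimate` is a hypothetical stand-in for TPAP with the shared string fixed, and `example_noise` is an arbitrary distortion chosen only for the demonstration.

```python
from fractions import Fraction
import random


def estimate(g, lo, hi, noise):
    """Stand-in for TPAP on coordinates lo..hi-1: the exact block sum
    multiplied by a fixed, block-dependent distortion factor."""
    return sum(g[lo:hi]) * noise(lo, hi)


def example_noise(lo, hi):
    """Arbitrary distortion of about 10 percent, alternating by block."""
    return Fraction(11, 10) if (lo // (hi - lo)) % 2 == 0 else Fraction(9, 10)


def sample(g, B, noise, rng):
    """Binary search down the tree, then rejection at the leaf.
    Returns an index, or None for the symbol ⊥."""
    lo, hi, beta = 0, len(g), Fraction(1)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        p_left = estimate(g, lo, mid, noise)
        p_right = estimate(g, mid, hi, noise)
        if p_left == 0 and p_right == 0:
            return None
        go_left = p_left / (p_left + p_right)
        if rng.random() < go_left:
            hi, beta = mid, beta * go_left
        else:
            lo, beta = mid, beta * (1 - go_left)
    accept = Fraction(g[lo]) / (B * beta)  # assumed acceptance rule
    if accept > 1:
        raise ValueError("fail: acceptance probability exceeds 1")
    return lo if rng.random() < accept else None


def exact_distribution(g, B, noise):
    """Output distribution of sample(), computed exactly by walking the tree."""
    dist = {i: Fraction(0) for i in range(len(g))}

    def walk(lo, hi, beta):
        if beta == 0:
            return
        if hi - lo == 1:
            # Reach probability times acceptance probability.
            dist[lo] = beta * (Fraction(g[lo]) / (B * beta))
            return
        mid = (lo + hi) // 2
        p_left = estimate(g, lo, mid, noise)
        p_right = estimate(g, mid, hi, noise)
        if p_left == 0 and p_right == 0:
            return
        go_left = p_left / (p_left + p_right)
        walk(lo, mid, beta * go_left)
        walk(mid, hi, beta * (1 - go_left))

    walk(0, len(g), Fraction(1))
    dist[None] = 1 - sum(dist.values())
    return dist
```

Under these assumptions the distortion introduced by the estimates cancels: the reach probability β appears once in the descent and once in the denominator of the acceptance probability, so each index is output with probability exactly g/B and all rejected mass goes to ⊥.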
Given the procedure of this embodiment, an information-truncation technique is leveraged. A coin is set to 1 if and only if (iff) the character ⊥ is not sampled by the importance sampling procedure of this embodiment. The local rejection probabilities in the protocol of this embodiment collectively add up, over the n coordinates, to the probability that the coin toss is 0. The coin has expectation
This is done independently for O*(1) coins. If most of the coins are 0, then B is halved and the process is repeated. This process of halving B depends only on the value Σj=1ng(xj, yj), so is simulatable. When B is close to Σj=1ng(xj, yj), with overwhelming probability a large fraction of coins will be 1, and Σj=1ng(xj, yj) can be (ε, δ)-approximated. In this embodiment application of the information-truncation technique is simpler because with importance sampling each coin toss involves all coordinates.
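The halving loop can be simulated in the clear to see why it yields an estimate. The sketch below makes several assumptions that are not confirmed by the text: each coin is 1 with probability f/B, the loop stops once at least l/t coins are 1, and the estimate is B·(number of ones)/l using the value of B with which the coins were tossed. The name `estimate_sum` is hypothetical, and the exact value f is used only to bias the coins in place of the sampling protocol.

```python
import random


def estimate_sum(f, B, l, t, rng):
    """Halve B until at least l/t of l coins come up 1, then rescale.

    f   : exact value, used here only to bias the simulated coins
    B   : public upper bound on f
    l   : number of coins per round
    t   : threshold parameter (the text gives 8 as an example value)
    rng : a random.Random instance
    """
    while True:
        prob_one = min(1.0, f / B)
        ones = sum(rng.random() < prob_one for _ in range(l))
        if ones >= l / t or B / 2 < 1:
            return B * ones / l
        B = B / 2
```

Because the stopping round is determined by how B compares with f, the number of halvings depends only on f and not on the individual inputs, which matches the simulatability remark above.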
Transforming this FPA into a PAP can be done using FHE. However, if TPAP is a simultaneous protocol with shared randomness, the weaker assumption of symmetric computationally-private information retrieval (SPIR) with O*(1) communication and O*(n) work can be used. This is true for almost all applications of the illustrated embodiments of the present invention, which have sketching algorithms. In an SPIR protocol, there is a user with an index $i \in [n] = \{1, 2, \ldots, n\}$ and a server with a string $x \in \{0, 1\}^n$ who execute a protocol in which the user learns only $x_i$, while the server learns nothing about i, assuming both parties run in poly(n) time. A known construction coupled with a symmetric version satisfies this under the well-studied Φ-Hiding Assumption. If one is willing to lose a factor of $n^{\gamma}$ for an arbitrarily small constant γ, one can just assume additively homomorphic encryption, for which there are many more schemes.
To perform the transformation of the FPA to a PAP based on SPIR, the seed is exchanged to generate σ in the clear. In contrast to σ, the randomness used to perform the binary search is unknown to the parties, and the traversal to a leaf i of the tree $\mathcal{T}$, together with the computation of $r_i$, is done obliviously. At a given level i in the tree, each party prepares a sketch for all possible $2^i$ internal nodes. Then SPIR can be used inside of a secure circuit ROM 111 to retrieve the sketches corresponding to the children of the current node in level i−1, combine the sketches, and choose which node to traverse in the sample according to the outputs of TPAP. In this way, the parties do not learn which nodes are traversed. Upon reaching a single coordinate i, the value $g(x_i, y_i)$ is obtained using SPIR, and secret-shared by the parties.
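The per-level bookkeeping described above, in which each party prepares one sketch per node of a level, can be illustrated with the block layout alone. The helper names are hypothetical, n is assumed to be a power of two, coordinates are 0-based, and node indices within a level are 1-based so that the children of node c are 2c−1 and 2c.

```python
def level_blocks(n, j):
    """Split coordinates 0..n-1 into 2**j contiguous half-open blocks."""
    count = 1 << j
    if n % count != 0:
        raise ValueError("n must be divisible by 2**j")
    size = n // count
    return [(b * size, (b + 1) * size) for b in range(count)]


def children(choice):
    """1-based indices, on the next level, of the left and right child."""
    return 2 * choice - 1, 2 * choice
```

For example, with n=8 a party would run its half of TPAP once per block of `level_blocks(8, 2)` and store the four resulting states as that level's ROM table; only the two entries returned by `children` for the current node are then retrieved.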
Transformation Protocol
and is 0 otherwise. If
abort and output fail. The entire procedure is repeated
times, where ε in (0, 1) is an accuracy parameter, obtaining coins C1, . . . , Cs.
Step 3 shows that the process of
or B<1, where t can be any value, such as 8 in this embodiment. Step 4 shows that the output is
which is a private (ε, δ)-approximation protocol for ƒ(x, y)=Σj=1ng(xj, yj). Using an alternative notation,
is outputted as an estimate to g(a, b).
Sampling Protocol
on the left and right child, L and R, of q. By the properties of FHE, these values L and H are unknown to the parties.
As can be seen from
and
In step 1, an initialization process is performed where S=[n], δ=exp(−k),
β=1, and q to be a pointer to the root of a complete binary tree on n leaves. S is a simulator and is discussed below with respect to
In step 2, for j=1, 2, . . . , log n, in the j-th iteration, the following is performed. In sub-step 2(a), the PAP generator 110, for both Alice and Bob, breaks the coordinate set [n] into
contiguous blocks of coordinates x1, . . . , x2
on xl and yl for each lε[2j], using σ as the randomness for each execution. Let the resulting states of TPAP be stateA(1), stateA(2), . . . , stateA(2j) and stateB(1), stateB(2), . . . , stateB(2j), the ROM tables of the parties.
For example, for j=1 and x1, . . . , xn/2, the output of TPAP is Outa1 (stateA(1)), for j=1 and xn/2+1, . . . , xn, the output of TPAP is Outa2 (stateA(2)), for j=1 and y1, . . . , yn/2, the output of TPAP is Outb1 (stateB(1)), and for j=1 and yn/2+1, . . . , yn, the output of TPAP is Outb2 (stateB(2)). For j=i (some value between 1, 2, . . . , log n) and x1, . . . , xn/2i, the output of TPAP is Outa1 (stateA(1)), for j=i and xn/2i+1, . . . , x2n/2i to xn−n/2i+1, . . . , xn. the output of TPAP is Outa2 (stateA(2)) and Outa2
In sub-step 2(c), the secure circuit ROM performs the following algorithm. In sub-step 2(c)(i), the secure circuit ROM 111 maintains the state of q internally (it is secret-shared between the two parties). In sub-step 2(c)(ii), the secure circuit ROM 111 views the set [2j] as the internal nodes in the j-th level of a complete binary tree, using SPIR to retrieve stateA(L), stateA(R), stateB(L) and stateB(R), where L and R are the left and right child of q, respectively. For example, when j=i (shown in the example above) private information retrieval is performed where the value of Choice from previous iteration is used to privately and efficiently retrieve Outa2*Choice−1, Outa2*Choice, Outb2*Choice−1, and Outb2*Choice.
In sub-step 2(c)(iii), the secure circuit ROM 111 combines stateA(L) and stateB(L) to obtain
For example, L is set equal to Outa1−Outb1 and pL is the estimator associated with TPAP on input L. In another example, L is set equal to Outa2*Choice−1−Outb2*Choice−1 and pL is the estimator associated with TPAP on input L. The secure circuit ROM 111 combines stateA(R) and stateB(R) to obtain
For example, R is set equal to Outa2−Outb2 and pR is the estimator associated with TPAP on input R. In another example, R is set equal to Outa2*Choice−Outb2*Choice and pR is the estimator associated with TPAP on input R. In sub-step 2(c)(iv), suppose first that (pL, pR)≠(0, 0). The secure circuit ROM 111 sets q to point to L with probability
and otherwise sets q to point to R. In the first case it sets
In the second case, the secure circuit ROM 111 sets
If (pL, pR)=(0, 0), the secure circuit ROM 111 outputs a pointer q to ⊥ and β remains the same. Using the first example discussed above, Choice is set equal to 1 with probability of
and 2 with probability
Using the second example discussed above, Choice is then set equal to 2*Choice−1 with probability of
In sub-step 2(c)(v), if j=log n, the secure circuit ROM 111 outputs a secret-sharing (e, f) of q and β to the two parties.
In step 3, the PAP generator 110, for each of Alice and Bob, creates ROM tables for the entries of x and y, respectively. In step 4, the secure circuit ROM performs the following algorithm. In sub-step 4(a), the secure circuit ROM 111 uses inputs e and f to reconstruct q and β. If q points to ⊥, the secure circuit ROM outputs a secret-sharing of ⊥ to the two parties. Using the examples discussed above, Choice points to an index in {1, 2, . . . , n} and private information retrieval is used to obtain xChoice and yChoice. Otherwise, in sub-step 4(b), the secure circuit ROM uses SPIR to retrieve xq and yq, and computes g(xq, yq). In other words, the state of the previous iterations is used to compute the probability p that the protocol sets Choice to the current value. In sub-step 4(c), the secure circuit ROM puts
If p>1, output fail. Otherwise, with probability p, the secure circuit ROM, in sub-step 4(d), outputs a secret sharing of q to the two parties, else output a secret sharing of ⊥. In other words, a coin is outputted which is 1 with probability of
and is 0 otherwise. If
abort and output fail. In step 5, the entities output the output of the secure circuit evaluation in step 4.
Thus, the PAP generator 110 transforms any two-party protocol for approximating a function ƒ(x, y) of the form ƒ(x, y)=Σj=1ng(xj, yj), for any non-negative efficiently computable function g, into a PAP for ƒ(x, y) with the same communication, computation, and round complexity, up to an O*(1) factor (the computation also increases by an additive O*(n)).
In one embodiment, the parties run in O*(n) time with respect to the protocols of
A function h′ is functionally private with respect to a function h if there is a poly(n)-time simulator S such that for any input x, $\{S(h(x))\} \stackrel{c}{\equiv} \{h'(x)\}$. The illustrated embodiment defines a private approximation protocol of a function h. A two-party private (ε, δ)-approximation protocol of h is a private protocol that computes a randomized mapping ĥ satisfying the following two properties: 1) ĥ is functionally private for h, and 2) ĥ is an (ε, δ)-approximation of h.
It can be assumed, without loss of generality, that n is a power of 2. First, the importance sampling with regard to g is defined. In the g-sampling functionality, both parties receive integers B and k, as discussed above with respect to
where
The output is a secret-sharing of a random Iε[n]∪{⊥} from a distribution π′ with ∥π′−π∥1≦exp(−k). Throughout, TPAP (n′, ε′, δ′) is a protocol for (ε′, δ′)-approximating Σjg(xj, yj) on n′ coordinates. Suppose TPAP has r(n′, ε′, δ′) rounds, c(n′, ε′, δ′) total communication, and t(n′, ε′, δ′) total time. The importance sampling procedure provided by the protocol of
It will now be shown that for ζ=Θ(1/log n), the g-Sampler protocol correctly implements the g-sampling functionality. Let I be the value secret-shared by the two parties upon termination of the protocol. It needs to be shown that I is sampled from a distribution π′ that has $l_1$-distance exp(−k) from π. Consider the complete binary tree $\mathcal{T}$ on coordinate set [n], and consider the 2n−1 subsets $S_v$ associated with nodes v of $\mathcal{T}$. Since δ=exp(−k), by a union bound over these 2n−1 subsets, with probability at least 1−(2n−1)exp(−k)=1−exp(−k), TPAP on vectors x, y restricted to the coordinates in $S_v$ provides a (1±ζ)-approximation simultaneously for every node v of $\mathcal{T}$. Let the random string σ used by the protocol be fixed, and condition on the event ε of it having this property. The protocol does not actually invoke TPAP on all subsets $S_v$, though it is assumed to be correct on all such $S_v$.
Fixing σ, all invocations of TPAP become deterministic, and so for each node vε, there is a well-defined probability rv, over the coin tosses of the binary search in step 2(c)(iv) that the protocol reaches node v. Namely, suppose v is at shortest path distance l from the root v0 of . Let v0, v1, v2, . . . , vl=v be the unique path from the root of to v. Let w1, w2, . . . , . . . , wl be the siblings of v1, v2, . . . , vl1, respectively. Then,
where the pv
Since it conditions on event ε, using the non-negativity of g, a telescoping is obtained:
for a small enough ζ=Θ(1/log n).
An analogous argument shows also that
Notice that these bounds on rv also hold if ΣjεS
But β=rq for a leaf qε, and by the above
and so p≦1. Hence, a fail is not outputted in step 4(c). It follows, for the fixed choice of σ, that the probability coordinate I=i is outputted is
Since there is a distribution, for fixed σ, it follows that
Event ε occurs with probability 1−exp(−k), and the above holds for any choice of σ for which ε occurs.
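The telescoping argument above can be checked on a small instance. The following is a minimal plaintext sketch, not the private protocol itself: it assumes exact subtree sums in place of the TPAP estimates (that is, ζ=0) and no secret sharing, and the function name `reach_probability` is hypothetical. Under those assumptions, descending the binary tree and choosing each child with probability proportional to its subtree sum reaches leaf i with probability exactly g_i divided by the total.

```python
from fractions import Fraction


def reach_probability(g_values, leaf):
    """Probability that a top-down descent reaches `leaf`.

    Assumption: at every internal node the descent picks a child with
    probability proportional to the exact sum of g over that child's
    coordinates (the protocol in the text uses approximate sums instead).
    """
    lo, hi = 0, len(g_values)
    prob = Fraction(1)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        left = sum(g_values[lo:mid])
        right = sum(g_values[mid:hi])
        total = left + right
        if total == 0:
            # An all-zero subtree is never entered with positive probability.
            return Fraction(0)
        if leaf < mid:
            prob *= Fraction(left, total)
            hi = mid
        else:
            prob *= Fraction(right, total)
            lo = mid
    return prob


# Hypothetical per-coordinate values g(x_j, y_j) for n = 8.
g = [3, 0, 1, 4, 0, 2, 5, 1]
probabilities = [reach_probability(g, i) for i in range(len(g))]
```

With exact sums the product of branch probabilities telescopes, so `probabilities[i]` equals `Fraction(g[i], sum(g))`; with (1±ζ)-approximate sums each of the log n factors is off by at most a (1±ζ) ratio, which is why ζ=Θ(1/log n) is chosen.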
It will now be shown that the g-Sampler protocol can be implemented in O*(c(n, ζ, ⅓)) communication, a total of O*(t(n, ζ, ⅓)+n) time, and O*(r(n, ζ, ⅓)) rounds. In an embodiment where TPAP is a simultaneous protocol, there are log n iterations of step 2. In the j-th iteration, both parties invoke TPAP 2^j times on inputs of size n/2^j to achieve a (ζ, exp(−k))-approximation. Here, c(n, ζ, δ)=O(k)·c(n, ζ, ⅓), t(n, ζ, δ)=O(k)·t(n, ζ, ⅓), and r(n, ζ, δ)=O(k)·r(n, ζ, ⅓), since TPAP may be independently repeated O(log 1/δ) times, with the median of its outputs taken as the result.
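The median-based amplification mentioned above can be sketched as follows. This is an illustrative, non-private helper; the name `median_amplify` and the callable `run_once` (standing in for one independent TPAP execution) are hypothetical.

```python
import statistics


def median_amplify(run_once, repetitions):
    """Call `run_once` independently `repetitions` times; return the median.

    If each call lands in the target interval with probability at least 2/3,
    the median leaves the interval only when at least half of the calls do,
    which a Chernoff bound makes exponentially unlikely in `repetitions`.
    """
    if repetitions < 1:
        raise ValueError("repetitions must be at least 1")
    return statistics.median(run_once() for _ in range(repetitions))


# Five hypothetical outputs of a base estimator whose true value is near 10;
# two of them are wildly wrong, yet the median is unaffected.
outputs = iter([10.1, 9.9, 1000.0, 10.0, -500.0])
result = median_amplify(lambda: next(outputs), 5)
```

Choosing `repetitions` proportional to log(1/δ) = k gives failure probability exp(−k), which matches the O(k) factors stated above.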
Step 3 and step 4 of the g-Sampler protocol shown in
For the embodiment where TPAP is a general protocol, the entire g-Sampler protocol can be implemented using FHE. In the j-th iteration of step 2 of the g-Sampler protocol shown in
on the left and right child, L and R, of q. Since FHE only increases communication, round, and time complexities by a O*(k) factor (assuming the original time complexity is at least linear), this completes the proof.
It will now be shown that Main protocol of
Since B is halved in step 2c, by linearity of expectation, E[Ψ] = Σ_{j=1}^n g(x_j, y_j). For the concentration, with probability 1−exp(−k), if B ≥ Θ(k)·Σ_{j=1}^n g(x_j, y_j), then
On the other hand, if B = O(k)·Σ_{j=1}^n g(x_j, y_j), then for sufficiently large l=O*(1), by a Chernoff bound:
and by a union bound one can assume this holds for all such values of B. If Σ_{j=1}^n g(x_j, y_j) = 0, Main outputs 0. Else, there is a B for which
it follows that in step 3
and this sum provides a (1±ε)-approximation to
with probability 1−exp(−k).
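The halving loop analyzed above can be illustrated in the clear. The sketch below is a plaintext simulation only, with no privacy. It assumes that each importance-sampling call returns a non-⊥ index with probability ƒ(x, y)/B (capped at 1), and it takes the stopping threshold as a caller-supplied parameter because the exact condition is not reproduced in this text. The names `main_estimate`, `num_samples`, and `threshold` are hypothetical.

```python
import random


def main_estimate(g_values, upper_bound, num_samples, threshold, rng):
    """Plaintext simulation of a halving estimator for sum(g_values).

    Assumptions (not taken from the protocol text):
      * one sampling call succeeds (z_j = 1) with probability min(1, f / B);
      * the loop stops once at least `threshold` of the `num_samples`
        coin tosses are 1, or once the halved bound drops below 1.
    """
    if upper_bound <= 0:
        raise ValueError("upper_bound must be positive")
    total = sum(g_values)
    bound = float(upper_bound)
    while True:
        accept = min(1.0, total / bound)
        hits = sum(1 for _ in range(num_samples) if rng.random() < accept)
        bound_used = bound   # the bound in force when the samples were drawn
        bound /= 2.0         # B is halved after every round
        if hits >= threshold or bound < 1.0:
            # E[hits] = num_samples * f / bound_used, so this is unbiased
            # for the round in which the loop stops.
            return bound_used * hits / num_samples


zero_case = main_estimate([0, 0, 0], 64, 100, 10, random.Random(0))
tight_case = main_estimate([8], 8, 100, 10, random.Random(0))
```

The estimate rescales by the bound in force when the samples were drawn, which is twice the halved value; this mirrors the remark above that B is halved before the output is formed.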
Now it will be shown that Main is functionally private. As can be seen from the exemplary pseudo code 400 of
and (b) B=B/2. In step 4, the above process is performed until
or B<1. In step 4, the output is
The probabilities that z_j=1 in the simulated and the real view differ only by a factor of 1±exp(−k). It follows that the distributions of Ψ and Ψ′ have l_1-distance exp(−k), which completes the proof.
Next it will be shown that the protocol is private and efficient and that Main satisfies the requirements of the definition given above with respect to a private two-party protocol Π of a randomized mapping h. The first part follows from the above. Based on the discussion above with respect to the g-Sampler protocol privately implementing the g-sampling functionality and the discussion with respect to a private oracle-aided protocol for h, the calls to g-Sampler can be replaced with an oracle functionality. Based on the discussion above with respect to a PIR (and hence an SPIR) scheme with C(n)=O*(n), the functionality in step 2 can be implemented privately. For efficiency, there is only an O*(1) overhead in each of these measures from that of protocol g-Sampler, so the lemma follows from the above discussion with respect to the g-Sampler protocol being implemented in O*(c(n, ζ, ⅓)) communication.
Accordingly, embodiments of the present invention provide private approximation protocols (PAPs) for various approximation functions. For example, one embodiment provides the following general transformation: any two-party protocol for outputting a (1+ε)-approximation to ƒ(x, y)=Σ_{j=1}^n g(x_j, y_j) with probability of at least ⅔, for any non-negative efficiently computable function g, can be compiled, via the PAP generator, into a two-party private approximation protocol with only a polylogarithmic factor loss in communication, computation, and round complexity. In general it is insufficient to use secure function evaluation or fully homomorphic encryption on a standard, non-private protocol for approximating ƒ. This is because the approximation may reveal information about x and y that does not follow from ƒ(x, y).
In
The following are various examples of how the transformation discussed above can be applied. The first example is with respect to l_p-distances. Combining the above transformation with l_p-estimation algorithms for g(x_j, y_j)=|x_j−y_j|^p yields near-optimal O*(n^{1−2/p}) communication, O*(n) computation, and O*(1) round PAPs for the l_p-distance, p>2, as well as a near-optimal O*(1) communication, O*(n) computation, and O*(1) round PAP for the l_0-distance. No sublinear communication PAPs were previously known for these problems.
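For concreteness, the functions covered by the transformation all have the coordinate-wise form ƒ(x, y)=Σ_j g(x_j, y_j). The snippet below is a plain, non-private reference computation of that form for the l_p and l_0 cases; the helper names `separable_sum`, `lp_term`, and `l0_term` are hypothetical and serve only to make the shape of g explicit.

```python
def separable_sum(x, y, g):
    """Exact f(x, y) = sum_j g(x_j, y_j) for equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(g(a, b) for a, b in zip(x, y))


def lp_term(p):
    """Return g(a, b) = |a - b|**p, the per-coordinate term of the l_p case."""
    return lambda a, b: abs(a - b) ** p


def l0_term(a, b):
    """g(a, b) = 1 if the coordinates differ, else 0 (the l_0 case)."""
    return 1 if a != b else 0


x = [1, 4, 2]
y = [3, 1, 2]
l3_value = separable_sum(x, y, lp_term(3))   # 8 + 27 + 0
l0_value = separable_sum(x, y, l0_term)      # two coordinates differ
```

Any non-negative, efficiently computable `g` can be passed the same way, which is the only property of g that the transformation relies on.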
Even though PAPs or FPAs are known for p ∈ (0, 2], the framework of embodiments of the present invention has several advantages. One is that the transformation avoids some rounding issues of real numbers needed to ensure FPA in previous works; in one embodiment the parties can compute g(x_i, y_i) to arbitrary precision after communicating x_i and y_i, where i is the coordinate sampled by the importance sampling procedure. Another advantage is that embodiments of the present invention transform any protocol for l_p into a PAP, making new tradeoffs possible. Embodiments of the present invention can use protocols more suitable for inputs given as a list of ranges, with faster update time, or that use less randomness. For example, one embodiment improves the update time for l_2 by a factor of k using a known algorithm with ε=1/log n (to do binary search), while for p ∈ (0, 2) one embodiment improves by a factor of k/(loglog n) using a known algorithm. The communication of one embodiment is a factor of log2 n/k times that of a known algorithm.
The following example is with respect to heavy hitters and compressed sensing. Letting z=x−y, one embodiment seeks an r-sparse vector z̃ with ∥z−z̃∥_p^p ≤ (1+ε)∥z−z_opt∥_p^p, where z_opt is an r-sparse vector minimizing ∥z−z_opt∥_p^p. It is known that if only z_opt is leaked, then Ω(n) communication is required. The problem is relaxed by allowing ∥z∥_2 to also be leaked, and it is known how to near-optimally solve the heavy hitters problem for p ∈ {1, 2} in this case.
Plugging the private l_p protocols of one embodiment into the main protocol of a known algorithm, this embodiment improves on this by showing how to near-optimally solve the problem of finding z̃ with ∥z̃−z∥_p^p ≤ (1+ε)∥z_opt−z∥_p^p while leaking z_opt and ∥z∥_p^p, for every p ≥ 0. If p ∈ [0, 2], the communication is O*(1), while if p>2 the communication is O*(n^{1−2/p}), which is required. The information this embodiment leaks is more natural than that leaked in the known algorithm, which for p=1 leaks ∥z∥_2 and z̃ rather than ∥z∥_1 and z̃, the latter being equivalent to leaking ∥z−z̃∥_1 and z̃, that is, the error incurred by the sparse representation. One minor point is that this embodiment needs a non-private near-optimal heavy-hitters protocol for every l_p.
Another example is with respect to general similarity measures. While the transformation of one embodiment gives near-optimal PAPs for any function of the form ƒ(x, y)=Σ_{j=1}^n g(x_j, y_j), for non-negative g, one may want to know for which g this embodiment obtains PAPs with O*(1) communication, O*(n) computation, and O*(1) rounds. For this, the embodiment uses a known theorem, which says the following for functions g(x_j, y_j)=h(x_j−y_j). Define π_ε(x) with respect to h, for ε>0, as π_ε(x)=min{x, min{|z|ε+:|h(x)−h(x+z)|>εh(x)}}. Then a function h is tractable if h(1)>0 and ∀k, ∀N0∃t∀x, yε+, ∀Rε+∀ε:
This intuitively corresponds to functions h(x) that grow slower than x². If h is tractable, h(0)=0, h is non-decreasing on non-negative inputs, and h(x)=h(−x), then h can be computed in O*(1) space and 1-pass in a data stream. Assuming h can be computed in O*(1) time, the total time is also O*(n). It was observed that the known algorithm computes a linear sketch, thereby defining a sketching protocol, and via the transformation of one embodiment of the present invention, the first, and in fact near-optimal, PAP is obtained for any such h, which includes functions as bizarre as h(x)=(x(x+1))^{0.5} arctan(x+1). There is nothing close to an NBE for these problems, much less a sharply concentrated one.
An additional example is with respect to max-dominance norm, dominant l_p-norms, and distinct summation. The max-dominance norm is useful in financial applications and IP network monitoring. Alice has x ∈ {0, 1, . . . , M}^n, Bob has y ∈ {0, 1, . . . , M}^n, and the max-dominance norm is Σ_{j=1}^n max(x_j, y_j). This problem, and its generalization, the dominant l_p-norm (Σ_{j=1}^n max(x_j, y_j)^p)^{1/p} for p>0, have been studied. There are no sharply concentrated NBEs known for p>0. For example, the estimators Z are distributed as p-Fréchet, which, if the dominant l_p-norm is c, have Pr[Z>z]=1−exp(−c^p z^{−p}). For p ≤ 1, there is no expectation, while for general p these are heavy-tailed, so there is a non-negligible (1/poly(n)) probability of observing a value that is poly(n) times c. Nevertheless, the known algorithms give (ε, δ)-approximations for these problems in O*(1) space, and by the transformation of one embodiment of the present invention, near-optimal PAPs are obtained. This embodiment also obtains a near-optimal PAP for the related distinct summation problem in sensor networks, which also does not have a sharply concentrated NBE. Here, for each j ∈ [n] there is a v_j ∈ {1, . . . , M}, and Alice has either (j, v_j) or (j, 0), while Bob has either (j, v_j) or (j, 0). The problem is to compute Σdistinct(j, v
The next example is with respect to entropy with relative error. Entropy
is defined for inputs x, y with (x+y)_i ≥ 0 for all i ∈ [n]. Here, if x_i+y_i=0,
is interpreted as 0. The variables x_i or y_i are allowed to be negative, but their sum is required to be non-negative. This is the strict turnstile model in streaming, for which entropy is well-studied, and sketching algorithms with relative error and O*(1) space and update time are known. There are no known NBEs concentrated enough to achieve relative error. The natural NBE is to sample a coordinate i with probability
and output
However, while the estimator is unbiased, its concentration is poor, so it can only be used to achieve additive error. One embodiment of the present invention achieves relative error. H(x, y) is not in the class of functions handled by the transformation of this embodiment. The crucial observation is that for any parameter T ≥ Σ_{j=1}^n (x_j+y_j), the function
also has an efficient relative error algorithm, given the values T and Σ_{j=1}^n (x_j+y_j). Indeed, the embodiment runs an efficient algorithm for H(x, y), gets Ĥ, and outputs
The additive error is at most
The embodiment fixes T=Σ_{j=1}^n (x_j+y_j) and, in recursive calls in the binary search, uses the same value of T rather than Σ_{j∈S} (x_j+y_j) for the set S under consideration (so the embodiment recursively computes H_T rather than H). In the outer level of recursion, H(x, y)=H_T(x, y), and H_T has the form required by the transformation, so the embodiment obtains a PAP for H(x, y) with relative error. The embodiment does not need FHE, since Σ_{j=1}^n (x_j+y_j) can be obtained using SFE.
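The relationship between H and H_T can be made concrete under an explicit assumption. The formulas for H and H_T are not reproduced in this text, so the sketch below assumes the usual forms H = Σ_i (m_i/S) log(S/m_i) and H_T = Σ_i (m_i/T) log(T/m_i), where m_i = x_i+y_i and S = Σ_i m_i. Under that assumption, H_T = (S/T)·(H + log(T/S)), so an estimate of H together with S and T determines H_T. All function names are hypothetical, and natural logarithms are used only for illustration.

```python
import math


def entropy(m):
    """Assumed form of H: sum_i (m_i / S) * log(S / m_i), zero terms skipped."""
    s = sum(m)
    return sum((v / s) * math.log(s / v) for v in m if v > 0)


def entropy_t(m, t):
    """Assumed form of H_T: sum_i (m_i / T) * log(T / m_i), for T >= sum(m)."""
    return sum((v / t) * math.log(t / v) for v in m if v > 0)


def entropy_t_from_h(h, s, t):
    """Recover H_T from H, S = sum(m), and T via H_T = (S/T) * (H + log(T/S))."""
    return (s / t) * (h + math.log(t / s))


m = [3, 1, 0, 4]        # hypothetical coordinate sums m_i = x_i + y_i
s_total = sum(m)        # S = 8
direct = entropy_t(m, 20)
via_identity = entropy_t_from_h(entropy(m), s_total, 20)
```

Setting T = S makes the correction term vanish, which is consistent with the statement that H(x, y)=H_T(x, y) at the outer level of recursion; each summand of H_T depends only on (x_j, y_j) and the fixed T, which is what gives it the coordinate-wise form.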
Another example is with respect to l_p-sampling and cascaded moments, with applications. An important primitive is to return a sample according to the distribution π, which is used for purposes other than estimating Σ_{j=1}^n g(x_j, y_j). This is useful for cascaded moments, earthmover distance, and non-bipartite matching, as well as machine learning problems such as classification and minimum enclosing ball (here g(z)=z²), and forward sampling in a database. There are no known NBEs for any of these problems, much less sharply concentrated ones. The importance sampling procedure of embodiments of the present invention directly and near-optimally solves this sampling primitive privately.
As an example application, it is known how to estimate the cascaded moment F_q(F_p(A)) of an n×d matrix A, defined as Σ_{i=1}^n (Σ_{j=1}^d |A_{i,j}|^p)^q, for integers q, p, and to give a near-optimal O*(n^{1−2/(qp)} d^{1−2/p}) space algorithm for integers q ≥ p ≥ 2. It is also known how to achieve near-optimal space for q=1 and any p, and near-optimal space for Fq(P) for any q. To obtain a PAP, one embodiment of the present invention first uses the importance sampling procedure to sample a row A_i with probability
for a constant C>1 and an upper bound B on Fq(Fp(A)). The crucial observation is that Fq(Fp(A))=Πj
However, the one embodiment cannot do this in a black box fashion, since it needs an approximation s to the probability (i, j1), . . . , (i, jq) is sampled to then compute |Ai, j
for a constant C′ it can ensure is at least 1; and then do rejection sampling to output a coin with bias
Using the information-truncation technique, the one embodiment thus obtains a PAP with only an O*(1) overhead.
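As a reference point for the quantity being approximated, the cascaded moment F_q(F_p(A)) = Σ_i (Σ_j |A_{i,j}|^p)^q defined above can be computed exactly in the clear. The sketch below is neither private nor space-efficient; the function name `cascaded_moment` is hypothetical.

```python
def cascaded_moment(matrix, p, q):
    """Exact F_q(F_p(A)): the sum over rows of (sum_j |A[i][j]|**p) ** q."""
    return sum(sum(abs(entry) ** p for entry in row) ** q for row in matrix)


A = [[1, -2],
     [3, 0]]
f1_f2 = cascaded_moment(A, p=2, q=1)   # (1 + 4) + (9 + 0)
f2_f2 = cascaded_moment(A, p=2, q=2)   # 5**2 + 9**2
```

For integer q, expanding the outer power turns each row's contribution into a sum over q-tuples of column indices, which is the kind of structure the sampling steps described above exploit.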
Yet a further example is given with respect to subspace approximation and sampling blocks. Approximating a point set by a subspace is known in the linear algebra field. The particular form considered is in the form of regression, and in the form of approximation to a fixed subspace. In the setting of one embodiment of the present invention, Alice has an n×d matrix A, Bob has an n×d matrix B, and C=A+B, representing n records each with d attributes. They want to secret share a core-set, i.e., a small weighted subset of rows of C so that later, for any fixed j-dimensional subspace F of ℝ^d, cost(C, F)=Σ_{i=1}^n dist(C_i, F) can be (1+ε)-approximated from the core-set with functional privacy and probability 1−exp(−k). Here, dist is the l_2-distance of a point to a subspace.
One embodiment first reviews a core-set construction, where the main algorithms are DimReduction and AdaptiveSampling. Assume the dimension j of the query subspace is constant. It is known how to efficiently obtain an O(1)-approximation D_j to the best j-subspace using approximate volume sampling. Then r=O(ε^{−2} log 1/δ) samples s_1, . . . , s_r are drawn with replacement from C, where
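The cost function in this example can be evaluated directly when the subspace is given explicitly. The sketch below is a plain, non-private computation; it assumes F is described by an orthonormal basis (an assumption made here for simplicity), and the names `dist_to_subspace` and `subspace_cost` are hypothetical.

```python
import math


def dist_to_subspace(point, basis):
    """l2 distance from `point` to the span of `basis`.

    Assumption: `basis` is a list of orthonormal vectors, so the squared
    distance is |point|^2 minus the squared lengths of the projections.
    """
    sq_norm = sum(c * c for c in point)
    sq_proj = sum(sum(c * b for c, b in zip(point, vec)) ** 2 for vec in basis)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.0, sq_norm - sq_proj))


def subspace_cost(rows, basis):
    """cost(C, F): the sum over rows C_i of dist(C_i, F)."""
    return sum(dist_to_subspace(row, basis) for row in rows)


C = [(3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
F = [(1.0, 0.0, 0.0)]             # the x-axis, a 1-dimensional subspace
total_cost = subspace_cost(C, F)  # distances 4 and 5
```

A core-set replaces `rows` with a small weighted subset whose weighted distance sum approximates this value for every fixed F.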
Point si is assigned weight
For each si, let si′=proj(si, Dj), the projection of si onto Dj, which is assigned a weight of
Finally, all points are projected onto Dj. In recursive steps, an O(1)-approximation Dj−1 to the best j−1-subspace of proj(C, Dj) is found, and the above sampling procedure is repeated. The recursion stops when all points are projected to the origin. The weighted core-set is the union of the si and si′ over the j+1 stages. It has been shown that for any fixed subspace F, the sum of (weighted) distances of core-set points to F is an unbiased estimator of cost(C, F) and is an (ε, δ)-approximation.
While some embodiments of the present invention have an NBE, which in this case becomes sharply concentrated by setting δ=exp(−k), the obstruction is that there is no way of implementing the NBE in a communication-efficient manner. Indeed, even obtaining an approximation to each ∥C_i∥_2 requires Ω(n) communication, and it is unclear how to use these to obtain an NBE for subspace approximation. First the PAP of one embodiment is described assuming additively homomorphic encryption, which achieves O*(d²) communication, O*(nd) work, and O*(1) rounds. Then it is shown how to reduce the communication to near-optimal O*(d) assuming FHE.
Consider the quantity F_1(l_2(C))=Σ_{i=1}^n ∥C_i∥_2. One embodiment uses the same approach as for cascaded moments to first sample a row C_i with probability
using that an O*(1)-communication and O*(nd)-computation protocol for (ε, δ)-approximation to F1(l2) exists. Now ∥Ci∥2 cannot be expressed as a low-degree polynomial, but the one embodiment uses SPIR to retrieve Ai, Bi, then compute ∥Ci∥2 exactly with O*(d) communication, which allows rejection sampling to be done to output a coin with bias
for an upper bound B. One embodiment repeatedly halves B until a sample Ci
and is additively shared. An SFE computes the d×d projection matrix P1 corresponding to Ci
Given this implementation of approximate volume sampling, implementing a known algorithm can again be done by sampling a homomorphically encrypted row according to its l2 norm (these rows are now normal and projection vectors). Inductively, the entire procedure of the known algorithm can be implemented this way. Setting δ=exp(−k), one embodiment of the present invention gets a sharply concentrated NBE. The critical use of the transformation of the one embodiment was to privately obtain a sample according to its l2-norm in an unbiased way. The PAP of the one embodiment generalizes to sampling rows (blocks) according to any norm (not just l2).
To achieve communication O*(d), note that the projection matrices P_i have rank at most j, so they can instead be communicated using FHE with O*(d) bits. There is an Ω(d) lower bound, which holds even for storing a core-set consisting of a single point.
A further example is given with respect to l_2-distance to independence of datasets. In the streaming version of the problem, Alice has (i, j, a_{i,j}) ∈ [n]² × {0, 1, . . . , M}, and Bob has (i, j, b_{i,j}) ∈ [n]² × {0, 1, . . . , M}. Define the joint probabilities
and marginals
A known algorithm obtains an (ε, δ)-approximation for h(a, b)=Σ_{i,j} (p_{i,j}−q_i r_j)² in O*(1) space and O*(n²) time. The algorithm chooses two independent vectors u, v ∈ {−1, +1}^n, each 4-wise independent, maintains s=Σ_{i,j} u_i v_j (a_{i,j}+b_{i,j}), t_1=Σ_i u_i Σ_j (a_{i,j}+b_{i,j}), t_2=Σ_j v_j Σ_i (a_{i,j}+b_{i,j}), and L, and computes
It averages O(ε^{−2}) independent copies, and takes the median of O(log 1/δ) independent averages. The algorithm is not an NBE due to the median operation.
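The role of the sketch values s, t_1, t_2, and L can be illustrated with a small exact computation. The definitions of the joint probabilities and marginals are not reproduced in this text, so the sketch below assumes p_{i,j}=(a_{i,j}+b_{i,j})/L with L the sum of all entries, q_i=Σ_j p_{i,j}, and r_j=Σ_i p_{i,j}. Under that assumption, s/L − t_1·t_2/L² equals Σ_{i,j} u_i v_j (p_{i,j}−q_i r_j) for any sign vectors, so its square is a natural single-copy estimate of h(a, b). The function names are hypothetical, and fixed sign vectors stand in for 4-wise independent ones.

```python
from fractions import Fraction


def independence_sketch(a, b, u, v):
    """Return (s, t1, t2, L) for n-by-n inputs a, b and sign vectors u, v."""
    n = len(a)
    m = [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]
    total = sum(sum(row) for row in m)
    s = sum(u[i] * v[j] * m[i][j] for i in range(n) for j in range(n))
    t1 = sum(u[i] * sum(m[i]) for i in range(n))
    t2 = sum(v[j] * sum(m[i][j] for i in range(n)) for j in range(n))
    return s, t1, t2, total


def signed_deviation(a, b, u, v):
    """Sum of u_i v_j (p_ij - q_i r_j) straight from the assumed definitions."""
    n = len(a)
    m = [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]
    total = sum(sum(row) for row in m)
    p = [[Fraction(m[i][j], total) for j in range(n)] for i in range(n)]
    q = [sum(p[i]) for i in range(n)]
    r = [sum(p[i][j] for i in range(n)) for j in range(n)]
    return sum(u[i] * v[j] * (p[i][j] - q[i] * r[j])
               for i in range(n) for j in range(n))


a = [[1, 0], [2, 1]]
b = [[0, 3], [1, 0]]
u = [1, -1]
v = [1, -1]
s, t1, t2, L = independence_sketch(a, b, u, v)
from_sketch = Fraction(s, L) - Fraction(t1 * t2, L * L)
```

Because s, t_1, t_2, and L are all linear in the inputs, each party can compute its own share locally and the shares add, which is what makes the sketch usable as a two-party protocol.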
To obtain a PAP, one embodiment of the present invention combines the techniques used for entropy and cascaded moments. First, the embodiment treats q, r, and L=Σ_{i′,j′} (a_{i′,j′}+b_{i′,j′}) as fixed, coming from the outer level of recursion. Define
The key observation is that the sketch provides an (ε, δ)-approximation even if p, q, and r are arbitrary vectors (of dimension n² and n, respectively). The embodiment samples an i* ∈ [n], expressing h(a, b, q, r, L) as
and uses binary search to obtain an i* ∈ [n] with probability
for an upper bound B on h(a, b, q, r, L) and a C>1 that can be computed. In the binary search, the embodiment sums over all i and j in sketches t_1, t_2, and L above, but for s only sums over the i in the current candidate set (though other embodiments sum over all j ∈ [n]).
Given the fixing of i*, a coordinate j* is sampled next. This is done by halving the candidate set for j* recursively. In the first step of the binary search, since the parties do not know i* (it is secret-shared), they construct sketches s_i^{A(L)}=Σ_{j=1}^{n/2} u_i v_j a_{i,j}, s_i^{A(U)}=Σ_{j=n/2+1}^{n} u_i v_j a_{i,j}, s_i^{B(L)}=Σ_{j=1}^{n/2} u_i v_j b_{i,j}, and s_i^{B(U)}=Σ_{j=n/2+1}^{n} u_i v_j b_{i,j} for each i ∈ [n], and SPIR is used for an SFE to retrieve s_{i*}^{A(L)}, s_{i*}^{A(U)}, s_{i*}^{B(L)}, and s_{i*}^{B(U)}. Future steps are similar, resulting in a sampled pair (i*, j*) with probability
for a value C′>1 that can be computed, and an upper bound B on h(a, b, q, r, L)=h(a, b). Via rejection sampling, one or more embodiments can flip a coin with probability
and these embodiments can halve B, etc., in a simulatable way to obtain an (ε, δ)-approximation of h(a, b).
Non-Limiting Examples
Aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.), or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Also, aspects of the present invention have been discussed above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. A computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiments above were chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
Number | Date | Country | |
---|---|---|---|
20120260348 A1 | Oct 2012 | US |