The present invention relates to the field of distributed computing, such as cloud computing. More specifically, the present invention is related to a method of securely executing an unbounded input stream by non-interactive, multi-party distributed computation.
Cloud computing is a form of distributed computing over a network, with the ability to run a program on many connected computers at the same time. The same concept is also used for private storage. Such distributed computing is directed to running applications that share data or support the critical operations of an enterprise, with rapid access to flexible and low-cost IT resources. These services are based on on-demand delivery of IT resources via the Internet with pay-as-you-go pricing, and are offered by several vendors such as Microsoft and Amazon Web Services. However, the cloud computing model cannot fully protect the user's privacy, since the user cannot be sure that data over which he has no control will not leak.
Information theoretically secure multi-party computation implies severe communication overhead among the computing participants, as there is a need to reduce the polynomial degree after each multiplication. In particular, when the input is (practically) unbounded, the number of multiplications and, therefore, the communication bandwidth among the participants may be practically unbounded. In some scenarios, communication among the participants is better avoided altogether, so as to avoid linkage among the secret share holders. For example, when processes in computing clouds operate over streaming secret shares without communicating with each other, they can actually hide their linkage and activity in the cloud. An adversary that is able to compromise processes in the cloud may need to capture and analyze a very large number of possible shares.
A dealer may want to repeatedly compute functions on a long file with the assistance of m servers, without leaking either the input file or the result of the computation to any of the servers. There are two constraints: (1) the dealer is allowed to share each symbol of the input file among the servers and is allowed to halt the computation at any point (the dealer is otherwise stateless); (2) each server is not allowed to establish any communication beyond the shares of the inputs that it receives and the information it provides to the dealer during reconstruction.
Secure multi-party computation (MPC) is a powerful concept in secure distributed computing. The goal of secure MPC is to enable a set of m mutually distrusting parties to jointly and securely compute a function ƒ of their private inputs, even in the presence of a computationally unbounded active adversary Adv. For example, two millionaires can compute which one is richer, without revealing their actual worth. In secure MPC, two or more parties want to conduct a computation based on their private inputs, but no party is willing to disclose its own input to anybody else.
Secure multi-party computation participants can compute any function on any input, in a distributed network where each participant holds one of the inputs, ensuring independence of the inputs, correctness of the computation, and that no information is revealed to a participant in the computation beyond the information that can be inferred from that participant's input and output. Like other cryptographic protocols, the security of an MPC protocol can rely on different assumptions:
Secure multi-party computation can be realized in various settings for computing general functions. However, the general scheme may be impractical due to efficiency reasons, partly due to the communication required among the participants.
In communicationless information theoretically secure multi-party computation over long input streams, a dealer D may secretly share an initial value among the m servers (participants). Subsequently, the dealer is responsible for handling the input stream (or an input file) and distributing appropriate shares to the participants. If the dealer is assumed to be stateless, the dealer is allowed to temporarily store the current input to the system, process the input and send (not necessarily simultaneously) secret shares of the inputs to the participants. One of the participants may act as the dealer, or the participants may alternate among themselves in serving as the dealer. In such a case, one participant communicates with the rest to convey the input (shares); still, the inherent quadratic complexity needed to reduce the polynomial degree in classical information theoretically secure multi-party computation should be avoided. Moreover, in case the input symbols have been shared and assigned to the participants in the initialization phase, every participant can independently (and asynchronously) process the shares of the input, and send the result when the global output has to be determined. For example, shares of a file may be assigned up-front to the participants to allow repeated searches for patterns, without revealing either the file or the search result to the participants. No participant returns any information during the execution of the algorithm. At any point in the execution, the dealer may ask some participants to send their results back, and the dealer can then reconstruct the actual result of the algorithm.
Benaloh et al (“Secret sharing homomorphisms: Keeping shares of a secret sharing” CRYPTO, Lecture Notes in Computer Science, pp 251-260, 1986) describes the homomorphism property of Shamir's linear secret sharing scheme, with the help of communication to decrease the polynomial degree. Cramer et al. (“Share conversion, pseudorandom secret-sharing and applications to secure computation”, Lecture Notes in Computer Science, pp 342-362, 2005) presented a method for converting shares of a secret into shares of the same secret in a different secret-sharing scheme using only local computation and no communication between players. They showed how this can be combined with any pseudorandom function to create, from initially distributed randomness, any number of Shamir's secret-shares of (pseudo)random values without communication. Damgard et al. (“Efficient conversion of secretshared values between different fields”, IACR Cryptology ePrint Archive, 2008) showed how to effectively convert a secret-shared bit over a prime field to another field. By using a pseudorandom function, they showed how to convert arbitrary many bit values from one initial random replicated share.
Waters (“Functional encryption for regular languages”, in Safavi-Naini and Canetti, pp 218-235) provides a functional encryption system that supports functionality for regular languages. In this system a secret key is associated with a deterministic finite automaton (DFA) M. A ciphertext, ct, encrypts a message msg associated with an arbitrary length string w. A user is able to decrypt the ciphertext ct if and only if the automaton M associated with his private key accepts the string w. Motivated by the need to outsource file storage to untrusted clouds while still permitting limited usage of that data by third parties, Mohassel et al. (“An efficient protocol for oblivious dfa evaluation and applications”, Lecture Notes in Computer Science, pp 398-415, 2012) presented practical protocols by which a client (the third party) can evaluate a DFA on an encrypted file stored at a server (the cloud), once authorized to do so by the file owner. However, all the above schemes rely on mathematical tasks that are commonly believed, but not proven, to be hard, and are therefore not information theoretically secure.
Dolev et al. (“Secret swarm unit, reactive k-secret sharing” Lecture Notes in Computer Science, pp 123-137, 2007 and “Reactive k-secret sharing”, Ad Hoc Networks, 2012) presented the settings for infinite private computation and presented a few functions that can operate under a global input. Dolev et al. (“Swarming secrets”, 47th annual Allerton conference on Communication, control, and computing, 2009) presented schemes that support infinite private computation among participants, implementing an oblivious universal Turing machine. At each single input of the machine, participants need to broadcast information in order to reduce the degree of the polynomial used to share secrets. Based on a combination of secret-sharing techniques and the decomposition of finite state automata, Dolev et al. (“Secret sharing krohn-rhodes: Private and perennial distributed computation”, ICS, pp 32-44, 2011) proposed the first communicationless scheme for private and perennial distributed computation on common inputs in a privacy preserving manner, assuming that even if the entire memory contents of a subset of the participants are exposed, no information about the state of the computation is revealed. This scheme does not assume an a priori bound on the number of inputs. However, the scheme assumes a global input, which reveals information on the computation, and the computational complexity of the algorithm of each participant is exponential in the number of states of the automaton. Relying on the existence of one-way functions or common long one-time pads, Dolev et al. showed how to process an a priori unbounded number of inputs over a Finite State Automaton (FSA) at a cost that is linear in the number of FSA states. Although the authors can hide the current state of the FSA, the dealer must supply the input symbols in plain text to each participant.
Ostrovsky et al. (“Private searching on streaming data”, Journal of Cryptology, 2007) defined the problem of private filtering, where a data stream is searched for predefined keywords. The schemes are implemented using the Paillier homomorphic cryptosystem. The proposed scheme has been improved by reducing the communication and storage complexity.
Fully Homomorphic Encryption
Gentry et al (“Fully homomorphic encryption using ideal lattices”, STOC, pp 169-178, ACM, 2009) presented the first fully homomorphic encryption (FHE) scheme, which is capable of performing encrypted computation on Boolean circuits. A user specifies encrypted inputs to the program, and the server computes on the encrypted inputs without gaining information concerning the input or the computation state. Following Gentry's outline, many subsequent FHE schemes have been proposed, and some of them have even been implemented. However, the FHE schemes that follow the outline of Gentry's original construction are inefficient, in that their per-gate computation overhead is a large polynomial in the security parameter, and they are furthermore only computationally secure.
All the above schemes rely on mathematical tasks that are commonly believed, but not proven, to be hard and, therefore, are not information theoretically secure.
It is therefore an object of the present invention to provide a method for securely executing an unbounded input stream by non-interactive, multi-party distributed computation of a specific type of automata.
It is another object of the present invention to provide a method for securely executing an unbounded input stream by non-interactive, multi-party distributed computation, in which computation is carried out by several participants over unbounded stream of secret shared inputs.
It is a further object of the present invention to provide a method for securely executing an unbounded input stream by non-interactive, multi-party distributed computation, in which the participants do not communicate among themselves throughout the execution.
Other objects and advantages of the present invention will become clear as the description proceeds.
The present invention is directed to a method of securely executing an unbounded or practically unbounded input stream, by non-interactive, multi-party computation, comprising the following steps:
The automaton may be a reset automaton, or a permutation automaton, where all the component permutation automata are powers of the same automaton.
The results and inputs of the first equation may be used to compute the result of the subsequent equation.
Each cascaded equation may be mapped to an automaton by mapping variables of the equations into a node of the automaton.
Several cascade automata may be executed in parallel, to get a product of automata.
At the execution stage, the dealer may repeatedly send secret shares of the input stream and each party computes new values.
In one aspect, the execution of cascaded equations automata is performed by:
The communication-less information theoretically secure multi-party computation may be performed over practically infinite input streams, or even infinite input streams.
The dealer may temporarily store and process the input stream, and send different secret shares of the input streams to the parties, which do not communicate with each other.
The parties may not return any information back.
At any point in the execution, in response to a call to the parties from the dealer to send their partial results back, the dealer may reconstruct the actual computation result, based on the partial results.
The series of cascaded equations may be executed serially, starting from the first equation, then the second equation and so forth, until the execution of the last equation is completed.
The cascaded equations may be executed by the parties by:
In one aspect, during the initial stage:
The automaton may be executed to obtain:
A string matching search may be performed on a file by:
A copy of the automaton may be sent to each cloud at a different time.
The present invention is also directed to a method of securely executing a bounded input stream, by non-interactive, multi-party computation, comprising the following steps:
Wrong shares elimination may be carried out whenever one or more parties send corrupted information.
The accumulating automaton may be a DAG Accumulating Automaton (DAA) represented by a Directed Acyclic Graph (DAG) structure.
The accumulating automaton may be marked by a vector of values, one integer value for each node in the accumulating automaton.
The accumulating automaton may be executed by:
In one aspect, whenever communicationless multi-party computation is required, using m servers, the following steps are performed:
The DAA may be executed to obtain:
The DAA may be implemented as a flip flop automaton.
The present invention is further directed to a system for securely executing an unbounded or practically unbounded input stream of symbols, by non-interactive, multi-party computation, which comprises:
In the drawings:
The present invention proposes a scheme for information theoretically secure, non-interactive, multi-party computation of a specific type of automata. The computation is carried out by several participants over an unbounded stream of secret shared inputs, and these participants do not communicate among themselves throughout the execution. At any stage of the scheme, the input symbol and current state of the original automaton are concealed perfectly against any coalition of participants that is not larger than a given threshold.
The scheme performs the computation correctly for any finite-state automaton, which can be described as a cascade product (or equivalently wreath product) of component automata of two types.
A component automaton is either a reset automaton, or a permutation automaton, where all the component permutation automata are powers of the same automaton.
It is required that the parties process an unbounded stream of input and that the scheme be non-interactive. In addition, it is required that the input stream is not public, but is shared among the parties so that any small coalition of participants can't obtain any input symbol. Using this approach, both the state of A and the input stream are secret for any small coalition of parties.
A scheme to correctly compute the final state of an automaton is presented, where all parties share the FSA A and a dealer has a secret initial state. The dealer distributes shares of the secret state to the participants, which then receive a stream of input.
For each input symbol that arrives, each of the participants receives only a share of the symbol. This way, any small enough coalition of parties (excluding the dealer) does not have any information on the initial state or on any input symbol. Finally, given a signal from the dealer, the participants terminate the execution and submit their internal state to the dealer, who computes the current state that defines the computation result.
In the proposed scheme, the dealer correctly computes the final state of A if A can be represented as a cascade (wreath) product of component automata of the following two types: the accumulating automata and the cascaded equations automata. Both types of automata will be used in the construction of secure and private multi-party computation among participants that use no communication among themselves while processing a practically or truly unbounded input stream.
According to the present invention, a very long file can be secret shared and stored. None of the parties can gain any information about the files. String matching (searches) can be repeatedly performed by these cloud agents without them knowing anything concerning the search result.
It is assumed that there is one dealer D who wants to perform secure private computation over a very long input stream, which may actually be unbounded. The dealer uses m cloud servers or agents P1, . . . , Pm, which perform a computation over the input stream received from D. The dealer D sends different input shares to every agent. Agents do not communicate with each other. No agent can learn anything about the original inputs that D partitions into shares, as the dealer uses Shamir's secret sharing to partition every symbol of the original input to be sent to the agents. At any given stage, the dealer D may collect the state of the agents and obtain the computation result. The agents use memory that is logarithmic in the length of the input and, therefore, can accommodate practically unbounded inputs.
String matching is a basic task used in a variety of scopes. A pattern (string) has to be found as part of text processing, also as part of malware (virus) defense, pattern recognition, bioinformatics and database query. It is possible to use the method proposed by the present invention to perform string matching that can support database updates, such as delete or insert operations.
The inputs are a text and a pattern, where the pattern is usually much shorter than the text. The goal is to find whether or not the pattern appears in the text.
A simplified non-secure version of the algorithm will be described first, followed by the way to obtain information theoretically secure computation extending the simplified version.
Each input value assigns a new (integer) value to every node. $N_i^{(j)}$ denotes the value of the node Ni immediately after step j. According to the pattern, an input vector $\vec{v}$ is defined, in which each element matches one corresponding element in the pattern. Since the pattern consists of four characters, {L, O, V, E}, a vector of four binary values that represents each possible character in the pattern is used. If the input character does not appear in the pattern, then the value of the vector $\vec{v}$ is set to (0,0,0,0). In particular, when the input symbol is O the vector $\vec{v}$ is set to (0,1,0,0), and when the input symbol is, say, C, the vector $\vec{v}$ is set to (0,0,0,0). The value of N1 is initialized to 1 and remains unchanged during the entire string matching process. For any given input vector (v1,v2,v3,v4), the markings of all nodes of the graph are simultaneously computed as follows:
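\[
\begin{aligned}
N_2^{(i+1)} &= N_1^{(i)} \cdot v_1 \\
N_3^{(i+1)} &= N_2^{(i)} \cdot v_2 \\
N_4^{(i+1)} &= N_3^{(i)} \cdot v_3 \\
N_5^{(i+1)} &= N_5^{(i)} + N_4^{(i)} \cdot v_4
\end{aligned}
\qquad \text{Eq. (1)}
\]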
Equation 1 defines the transition functions for the string matching algorithm. N5, which is an accumulated node, accumulates values, while the rest of the nodes recompute values based only on values of the neighboring nodes.
At any time, the value of the node N5 can be checked. If N5>0 then there is at least one match. Actually, the value of the node N5 encodes the number of times the pattern has occurred in the input stream. It is assumed that the number of occurrences does not exceed the maximal integer that the system can maintain and represent for N5.
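The non-secure algorithm of Equation 1 can be illustrated by the following minimal Python sketch. The function names `encode` and `count_matches` are hypothetical and are introduced only for illustration; the list `nodes` holds the markings N1 to N5, and all nodes are updated simultaneously from the previous marking, as Equation 1 requires.

```python
def encode(symbol, pattern):
    """Map a symbol to the binary input vector (v1, ..., v4):
    v_j is 1 if the symbol equals the j-th pattern character, else 0."""
    return [1 if symbol == p else 0 for p in pattern]


def count_matches(text, pattern="LOVE"):
    """Run the accumulating automaton of Eq. (1) over `text` (hypothetical helper)."""
    k = len(pattern)
    # nodes[0] is N1 (constant 1); nodes[k] is the accumulating node (N5 for "LOVE").
    nodes = [1] + [0] * k
    for symbol in text:
        v = encode(symbol, pattern)
        new = nodes[:]  # every node is computed from the OLD marking
        for j in range(1, k):
            new[j] = nodes[j - 1] * v[j - 1]          # N_{j+1} = N_j * v_j
        new[k] = nodes[k] + nodes[k - 1] * v[k - 1]   # N5 = N5 + N4 * v4
        nodes = new
    return nodes[k]


if __name__ == "__main__":
    print(count_matches("I LOVE LOVE"))
```

A returned value greater than zero indicates at least one match, and the value itself is the number of occurrences of the pattern in the input.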
The following example presents a secure multi-party string matching algorithm that uses Shamir's secret sharing scheme to mimic the algorithm presented above. Throughout the protocol, the computation is carried out over a large finite field. It is assumed that no computation overflows during the execution of the protocol.
The values of the nodes are shared among several participants using secret sharing, and so are the entries of the vector that represents each symbol of the input text. It is assumed that the input symbols are represented by secret shares of polynomials of degree 1. Since the transition function includes a multiplication, the degree of the polynomial that encodes the value of a certain node is one more than the degree of the preceding node. In this particular example, N5 is encoded by a polynomial of degree 5, so at least six participants should be used to ensure that the result encoded in N5 can be decoded.
For simplicity, it is assumed that there are six participants P1, . . . , P6 that undertake the task of multi-party computation string matching. For the five nodes of the graph, five random polynomials f1 to f5 are defined, where fi is of degree i. Each corresponding polynomial is used to secret share each node's initial value among the six participants, where each participant Pi receives one share. The initial share of the node N1 that is maintained by the participant Pi is denoted by $S_{P_i,N_1}^{(0)}$.
Each symbol α is mapped to an input vector $\vec{v}$. Then each element in the input vector $\vec{v}$ is secret shared into six parts by a random polynomial of degree 1. Each share of the input vector is then sent to one of the participants. For the participant Pi, 1≦i≦6, the share of the vector entry $v_j$ is denoted $S_{i,v_j}$.
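Immediately after processing the kth input symbol, the value of the (share of) node Nj that is stored by the participant Pi is denoted as $S_{P_i,N_j}^{(k)}$. Each participant Pi updates its shares as follows:
\[
\begin{aligned}
S_{P_i,N_1}^{(k+1)} &= S_{i,v_0} \\
S_{P_i,N_2}^{(k+1)} &= S_{P_i,N_1}^{(k)} \cdot S_{i,v_1} \\
S_{P_i,N_3}^{(k+1)} &= S_{P_i,N_2}^{(k)} \cdot S_{i,v_2} \\
S_{P_i,N_4}^{(k+1)} &= S_{P_i,N_3}^{(k)} \cdot S_{i,v_3} \\
S_{P_i,N_5}^{(k+1)} &= S_{P_i,N_5}^{(k)} + S_{P_i,N_4}^{(k)} \cdot S_{i,v_4}
\end{aligned}
\]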
Whenever it is desired to compute the result of the algorithm, all the participants are asked to send back the value that corresponds to N5. Having the shares of all participants, it is possible to reconstruct the actual value of N5 using Lagrange interpolation. The value obtained indicates whether or not the search succeeded in finding the string.
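The secret-shared variant of the string matching scheme may be sketched as follows. This is only an illustrative sketch under stated assumptions: the prime modulus `P`, the evaluation points `XS`, and the function names `share`, `reconstruct`, `step` and `secure_count` are hypothetical; the sketch is fixed to four-character patterns and six participants; each participant's share of N1 is simply kept unchanged; and Python's `random` module is used for brevity, although it is not a cryptographically secure source of randomness.

```python
import random

P = 2**127 - 1            # prime modulus of the field (assumed for illustration)
XS = [1, 2, 3, 4, 5, 6]   # evaluation points of participants P1..P6 (assumed)


def share(secret, degree):
    """Shamir-share `secret` with a random polynomial of the given degree."""
    coeffs = [secret] + [random.randrange(P) for _ in range(degree)]
    return [sum(c * pow(x, e, P) for e, c in enumerate(coeffs)) % P for x in XS]


def reconstruct(shares):
    """Lagrange interpolation at x = 0 using the shares of all six participants."""
    total = 0
    for i, xi in enumerate(XS):
        num, den = 1, 1
        for j, xj in enumerate(XS):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + shares[i] * num * pow(den, -1, P)) % P
    return total


def step(state, v):
    """Local update of one participant; `state` holds its shares of N1..N5
    and `v` holds its shares of the input vector (v1..v4)."""
    n1, n2, n3, n4, n5 = state
    return [n1, n1 * v[0] % P, n2 * v[1] % P, n3 * v[2] % P,
            (n5 + n4 * v[3]) % P]


def secure_count(text, pattern="LOVE"):
    # Dealer: share the initial marking (1,0,0,0,0); node Nj uses a degree-j polynomial.
    init = [share(1 if j == 0 else 0, j + 1) for j in range(5)]
    states = [[init[j][i] for j in range(5)] for i in range(6)]
    for symbol in text:
        # Dealer: share each vector entry with a fresh degree-1 polynomial.
        vec = [share(1 if symbol == c else 0, 1) for c in pattern]
        # Each participant uses only its own shares; participants exchange no messages.
        states = [step(states[i], [vec[j][i] for j in range(4)]) for i in range(6)]
    # Dealer: collect the six shares of N5 and interpolate.
    return reconstruct([s[4] for s in states])


if __name__ == "__main__":
    print(secure_count("XLOVELOVE"))
```

In this sketch the share of N5 always lies on a polynomial of degree at most 5, which is why the six shares suffice for the interpolation.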
The greatest value is associated with the node N5 where this value represents the number of times the pattern was found in the text, namely, is bounded by the length of the input text. Thus, for every practical system a field that can be represented by a counter of, say, 128 bits will surely suffice.
The participants do not know the inputs and the results during the entire execution of the string matching. It is possible to secure the pattern by executing such string matching over all possible strings, collect all results, and compute only the result of the pattern of interest.
The above method also works for simultaneous multiple strings matching, which means that it is possible to search more than one string simultaneously. An example of a directed graph for matching more than one string at the same time is described in
To allow any string matching, the basic wildcard characters (characters that can be used to substitute for any other character or characters in a string) “?” and “*”, will be implemented.
String Matching Algorithm with Question Mark in the Pattern
A character “?” is a character that may be substituted by any single character of all the possible characters. The directed graph for the matching algorithm that includes a question mark in the pattern is described in
The transition function for this algorithm given in Equation 2 is similar to the one defined by Equation 1. In Equation 2, each node value is computed depending on the input and/or the previous state of the node. At step k, under each input, for each node Ni, the next value is computed as follows:
String Matching Algorithm with a Star Wildcard in the Pattern
A wildcard character “*” is a character that may be substituted by any number of the characters from all the possible characters. The directed graph for the matching algorithm for a pattern with a star is described in
The transition function for this algorithm, given in Equation (4), is similar to the one defined by Equation (3). In step k, under each input, the next value of each node Ni is computed as follows:
In subsection 2.1 it is shown how to perform a basic string matching algorithm on a directed graph. In subsection 2.2 a secure and private implementation of the algorithm in the scenario of multi-party computation without communication is detailed. Based on the basic implementation, methods for implementing more complicated string matching algorithms with wildcards in the pattern are presented. Thus, it is possible to implement (practically) any string matching algorithm securely and privately without communication between participants. The limitation on the value in the accumulating nodes is only theoretical, as for any text length n (say, even a practically nonexistent length of 2^128 characters) and a pattern that yields l accumulating nodes, l·log n bits are needed to encode a state. The size of the field of the numbers should be n or (slightly) larger.
The string matching scheme is generalized by defining general accumulating automata (AA). Then, ways are shown to implement DAG accumulating automata (DAA), whose structure is a directed acyclic (not necessarily connected) graph (DAG automata are natural extensions of tree automata, operating on DAGs instead of on trees). Then, ways to mark the AAs and the corresponding semantics for such marking are defined.
Accumulating automata are state-transition systems, as defined next:
Definition 1 An accumulating automaton is a triple A=(V, Σ, T) where:
One may consider a more expensive operation, where δ(p,r,α)=q, p·r·α=q (or even an operation that multiplies the marking of more than two nodes). This type of operation yields an addition of the degree of the polynomial used to secret share the node.
An accumulating automaton is represented by a (possibly disconnected) directed graph where each regular node is depicted by a circle, accumulating node by two concentric circles, and transitions by (directed) arcs. Input symbols are labeled as symbols above the corresponding transitions.
Definition 2 (DAG Accumulating Automata, DAA): An accumulating automaton that defines a graph G that is acyclic, namely, without cycles and self-loops, is a DAG accumulating automaton. In other words, a DAG accumulating automaton is an accumulating automaton for which it holds that for any p in V, there do not exist α1, . . . , αn∈Σ and δ1, . . . , δn∈T, such that
Moreover, for every p and q in V, if there exists α∈Σ such that δ(p,α)=q then p≠q.
Definition 3 (Marking of accumulating automata): A marking of an accumulating automaton A=(V,Σ,T) is a vector of values, one integer value for each node in V. A marked automaton A is a 4-tuple (V,Σ,T,M), where M is the marking vector.
Definition 4 (Execution semantics of AA): The behavior of an accumulating automaton is defined as a relation on its markings, as follows. Assuming that immediately after the j step, node pi has the value nP
A simple example of a DAG accumulating automaton DAAαβγ =(V,Σ,T) is illustrated in
The initial marking of the automaton is:
The initial marking automaton is depicted in
Executing the DAA means retrieving symbols one by one from the input stream and feeding them to the DAA. Each input triggers transitions of the automaton, resulting in a new marking. Assuming that the input symbol is α, the input vector is set to $\vec{v}$=(v0,v1,v2,v3)=(0,1,0,0), and the new marking of the automaton is computed.
The transitions are computed as follows:
\[
\begin{aligned}
N_1^{(1)} &= v_0 = 0; & N_2^{(1)} &= N_1^{(0)} \cdot v_1 = 1 \\
N_3^{(1)} &= N_2^{(0)} \cdot v_2 = 0; & N_4^{(1)} &= N_3^{(0)} \cdot v_3 = 0
\end{aligned}
\]
Here, the new marking of the automaton is as in
The marking of DAAαβγ is changed by the input symbol that is sent by the dealer. At any time, it is possible to check the marking of DAAαβγ. If the marking is (0,0,1,0), then the input stream is αβ. An accumulating automaton DAAαβγ can be used to accept the language αβγ. The marking of DAAαβγ reveals whether the input stream is accepted or not.
The marking of the automaton under all possible input streams will now be analyzed, to check whether or not the automaton represents the function properly. Prior to the first input, the marking of DAAαβγ is (1,0,0,0) and the state of the automaton is “rejected”. If (1) the first input symbol is not α; (2) the first input symbol is α and the second symbol is not β; or (3) the first two input symbols are αβ and the third symbol is not γ, then the marking of DAAαβγ becomes (0,0,0,0), and therefore the state of the automaton is “rejected”. In all three cases above, no successive additional input symbol will change the marking of the automaton to (0,0,0,1), implying that the whole input stream will be rejected. In other words, the marking of DAAαβγ is (0,0,0,1), and the state of the automaton is “accepted”, if and only if the first three input symbols are αβγ. Any extra input(s) will change the marking of the automaton to (0,0,0,0), and the state of the automaton is also changed to “rejected”.
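The case analysis above can be checked with the following minimal Python sketch of DAAαβγ. The ASCII letters "a", "b" and "c" stand in for α, β and γ; the input vectors for β and γ, namely (0,0,1,0) and (0,0,0,1), are assumed by analogy with the vector (0,1,0,0) given for α; and the names `VECTORS`, `run_daa` and `accepted` are hypothetical.

```python
# Input vectors (v0, v1, v2, v3); "a", "b", "c" stand in for alpha, beta, gamma.
# Only the vector for alpha is given in the text; the other two are assumed by analogy.
VECTORS = {
    "a": (0, 1, 0, 0),
    "b": (0, 0, 1, 0),
    "c": (0, 0, 0, 1),
}


def run_daa(stream):
    """Return the marking (N1, N2, N3, N4) after processing `stream`."""
    marking = (1, 0, 0, 0)  # initial marking
    for symbol in stream:
        v0, v1, v2, v3 = VECTORS.get(symbol, (0, 0, 0, 0))
        n1, n2, n3, n4 = marking
        # All nodes are updated simultaneously from the previous marking.
        marking = (v0, n1 * v1, n2 * v2, n3 * v3)
    return marking


def accepted(stream):
    """The stream is accepted exactly when the marking is (0,0,0,1)."""
    return run_daa(stream) == (0, 0, 0, 1)


if __name__ == "__main__":
    for s in ["", "a", "ab", "abc", "abca", "bac"]:
        print(repr(s), run_daa(s), accepted(s))
```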
It is assumed that one dealer wants to execute a DAA under a long input stream with the help of m servers, without leaking the marking of the automaton or the input to the automaton. The dealer may secret share the marking of the DAA into m shares and assign each share to one of the servers. When the dealer wants to execute the DAA, he secret shares each input into m shares and sends each share to a distinct server. Each server manipulates its local DAA share and local input share to obtain a share of the new marking of the DAA. At some point, the dealer asks all the servers to send their shares back and uses these shares to reconstruct the current marking of the original DAA.
An unprivileged subgroup of the servers will have no information concerning the inputs (except an upper bound on their length) or the computation result. The servers do not know, in terms of information theoretical security, the actual value of the input sequence or the marking of the DAA.
Before stating the relationship between DAA and secret sharing, a route and the polynomial degree of a node in (the graph G of) a DAA are defined, as well as the polynomial degree of the entire DAA. The accumulating field of a DAA is also defined. A sequence of nodes {Ni
δj
The longest route always starts in a free node, i.e., a node with no incoming arcs. Let t be the secret sharing threshold, the minimal number of participants needed to reveal the automaton state, where t−1 is the polynomial degree in which the marking of the free nodes and the inputs are encoded.
Assuming t to be the secret sharing threshold, for any node Ni in a DAA, if the maximal length of a route from a free node to Ni is len, the polynomial degree of Ni is deg=(len+1)(t−1). The greatest polynomial degree of a node in a DAA is defined to be the polynomial degree of the DAA.
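The degree rule deg=(len+1)(t−1) may be illustrated by the following Python sketch. It assumes that the length of a route is counted in arcs, which agrees with the string matching example above, where t=2 and node Ni is shared with a polynomial of degree i. The function name `polynomial_degrees` and the node labels are hypothetical, and self-loops of accumulating nodes are not listed as arcs.

```python
from functools import lru_cache


def polynomial_degrees(nodes, edges, t):
    """Return {node: polynomial degree} for a DAG given as (source, target) arcs.

    `t` is the secret sharing threshold; free nodes and inputs use degree t - 1.
    """
    preds = {n: [] for n in nodes}
    for src, dst in edges:
        preds[dst].append(src)

    @lru_cache(maxsize=None)
    def longest(n):
        # Number of arcs on the longest route from any free node to n
        # (0 for a free node, which has no incoming arcs).
        return max((longest(p) + 1 for p in preds[n]), default=0)

    return {n: (longest(n) + 1) * (t - 1) for n in nodes}


if __name__ == "__main__":
    nodes = ["N1", "N2", "N3", "N4", "N5"]
    edges = [("N1", "N2"), ("N2", "N3"), ("N3", "N4"), ("N4", "N5")]
    degrees = polynomial_degrees(nodes, edges, t=2)
    print(degrees, "DAA degree:", max(degrees.values()))
```

The polynomial degree of the whole DAA is the maximum over the returned values.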
An accumulating automaton with cycles (beyond self-cycles with corresponding character 1 as demonstrated in the sequel) implies an infinite polynomial degree.
For any DAA with polynomial degree d, it is possible to implement and execute the DAA among d+1 participants without communication, and to hide the (practically) unbounded input stream except for an upper bound on the length of the input.
The maximal number that should be represented by a marking variable in a DAG accumulating automaton DAA is defined as the accumulating field of the DAA.
A sufficiently large accumulating field should be used to avoid overflow during the execution. The total number of accumulating nodes an≦|V| and the maximal number of active outgoing edges aoe≦|V| of a node imply a bound on the accumulating field. An edge is active when the dealer assigns 1 to the label of the edge. Unlike a traditional deterministic automaton, in this case there can be several edges from one node with the same label that lead to (at most |V|−1) distinct nodes. Note that aoe is bounded by |V|−1. The worst case is considered, where all accumulating nodes are lined up one after the other (possibly according to a topological sort output), each multiplying its value by the number of outgoing arcs as an input to the next node in the line. Basically, for bounding the possible values, the maximal value that can be accumulated in the ith node is considered to be the value that is added, after multiplication by aoe, to the marking of the (i+1)st node with each input.
For an input stream of length n and a constant sized DAA the computing field of each node is in Θ(log n) bits.
Some applications of DAG accumulating automata will be described, which can recognize regular languages, context free languages and context sensitive languages. Several extensions to the transition function of the directed accumulating automaton are also presented, namely: the possibility of the dealer ignoring characters, the possibility of loops with unconditional arcs, denoted by the label 1, and harvesting of the result by comparing values. In some cases, the graph of the DAA is not connected, thereby allowing the implementation of every connected component by a different set of participants. The structure and initial marking of each DAA that can recognize a particular language in the above classes are given. Every DAA can be securely and privately executed according to the presented scheme.
Assuming an automaton Aff depicted in
DAG accumulating automaton DAAff of flip flop automaton can be found in
The alphabet of DAAff is Σ={α,β}. On initializing the automaton, N1 is set to 1, N2 is set to 1 and N3 is set to 0. Let the (k+1)th input symbol be mapped to $\vec{v}$=(v0,v1,v2). The dealer sends a different mapping vector for each input symbol. If the input symbol is α, $\vec{v}$ is set to (1,1,0). If the input symbol is β, $\vec{v}$ is set to (1,0,1). If the input symbol is γ, the dealer discards it. Such an action is allowed by the dealer, as is sending spontaneous inputs or several characters in one input vector simultaneously. Then, the new value of all the nodes is computed as follows:
$$
\begin{aligned}
N_1^{(k+1)} &= v_0 \\
N_2^{(k+1)} &= N_1^{(k)} \cdot v_1 \\
N_3^{(k+1)} &= N_1^{(k)} \cdot v_2
\end{aligned}
\qquad \text{Eq. (5)}
$$
After any input symbol, it is possible to check the marking of DAAff. If N2 is 1, the current state of automaton Aff is S0. If N3 is 1, the current state of automaton Aff is S1.
According to the transitions of DAAff, it can be seen that if the input symbol is α, N2 will be set to 1 and N3 will be set to 0. Also, if the input symbol is β, N2 will be set to 0 and N3 will be set to 1.
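The secret shared execution of DAAff can be sketched as follows, as an illustration rather than the exact procedure of the invention. The sketch assumes threshold t=2 (degree-1 sharing polynomials), a small prime field of size 101, and three servers with evaluation points 1, 2 and 3; three servers are used because interpolating the degree-2 polynomials of N2 and N3 needs three points. The names `share`, `reconstruct` and `run_flip_flop` are hypothetical, and `random.Random` stands in for a cryptographically secure source of randomness.

```python
import random

P = 101          # assumed prime field size (must exceed the number of servers)
XS = [1, 2, 3]   # hypothetical evaluation points, one per server
MAPPING = {"α": (1, 1, 0), "β": (1, 0, 1)}  # symbol -> (v0, v1, v2)


def share(secret, rng):
    """Degree-1 Shamir shares of `secret`, one per server."""
    a = rng.randrange(P)
    return [(secret + a * x) % P for x in XS]


def reconstruct(shares):
    """Lagrange interpolation at 0 over GF(P)."""
    total = 0
    for i, xi in enumerate(XS):
        num, den = 1, 1
        for j, xj in enumerate(XS):
            if j != i:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + shares[i] * num * pow(den, P - 2, P)) % P
    return total


def run_flip_flop(stream, seed=0):
    rng = random.Random(seed)  # stand-in for a secure random source
    # The dealer shares the initial marking N1=1, N2=1, N3=0.
    n1, n2, n3 = share(1, rng), share(1, rng), share(0, rng)
    for sym in stream:
        if sym not in MAPPING:   # the dealer discards other symbols, e.g. γ
            continue
        v0, v1, v2 = (share(v, rng) for v in MAPPING[sym])
        # Each server s updates its own shares locally, as in Eq. (5),
        # always reading the previous share of N1.
        n1, n2, n3 = (
            list(v0),
            [n1[s] * v1[s] % P for s in range(3)],
            [n1[s] * v2[s] % P for s in range(3)],
        )
    # Collection: the dealer interpolates the marking of N2 and N3.
    return reconstruct(n2), reconstruct(n3)


print(run_flip_flop("αβα"))  # marking (N2, N3) after the last symbol
```

No server communicates with another during the loop; only the final call to `reconstruct` combines shares, which corresponds to the dealer collecting the marking.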
DAG accumulating automaton of the algorithm described in
The first node N1 is initially set to 1 while all the other nodes are initially set to 0. For each input symbol, the new marking of the automaton is computed. Let the (k+1)th input symbol be mapped to $\vec{v}$=(v1,v2). If the input symbol is α, $\vec{v}$ is set to (1,0). If the input symbol is β, $\vec{v}$ is set to (0,1).
The new value of all the regular nodes is computed as follows:
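$$
\begin{aligned}
N_1^{(k+1)} &= N_7^{(k)} \\
N_2^{(k+1)} &= N_1^{(k)} \cdot v_1 \\
N_3^{(k+1)} &= N_2^{(k)} \cdot v_2 \\
N_4^{(k+1)} &= N_3^{(k)} \cdot v_1 \\
N_6^{(k+1)} &= N_1^{(k)} \\
N_7^{(k+1)} &= N_6^{(k)}
\end{aligned}
\qquad \text{Eq. (6)}
$$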
The new value of accumulating node N5 is computed as follows
$$
N_5^{(k+1)} = N_5^{(k)} + N_1^{(k)} \cdot v_2 + N_2^{(k)} \cdot v_1 + N_3^{(k)} \cdot v_2
\qquad \text{Eq. (7)}
$$
After any input symbol, it is possible to check the marking of DAA(αβα)*. Only if N1=1 and N5=0, the input stream is accepted, otherwise rejected.
It should be noted that along the cycle defined by N1, N6 and N7, the degree for the secret sharing does not change, since the cycle involves only multiplication by the constant 1.
According to the transitions of DAA(αβα)*, it is clear that in the initial marking of the automaton, N1 is set to 1 and N5 is set to 0. If the input stream is in (αβα)*, N1 will be set to 1 and N5 will stay 0. If the input stream is not in (αβα)*, N1 will be set to 0 and/or N5 will not be 0.
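The update rules of Eq. (6) and Eq. (7) can be checked in the clear with a short sketch. It deliberately omits secret sharing and works on plain integers, and the function name `accepts_aba_star` is hypothetical. The acceptance test is the one stated above (N1=1 and N5=0).

```python
def accepts_aba_star(stream):
    """Plain (unshared) execution of the DAA for (αβα)*."""
    # Marking N[1..7]; index 0 is unused. N1 starts at 1, the rest at 0.
    N = [0, 1, 0, 0, 0, 0, 0, 0]
    for sym in stream:
        v1, v2 = {"α": (1, 0), "β": (0, 1)}[sym]
        old = N[:]  # all updates read the marking of step k
        # Regular nodes, Eq. (6)
        N[1] = old[7]
        N[2] = old[1] * v1
        N[3] = old[2] * v2
        N[4] = old[3] * v1
        N[6] = old[1]
        N[7] = old[6]
        # Accumulating node, Eq. (7): counts unexpected symbols
        N[5] = old[5] + old[1] * v2 + old[2] * v1 + old[3] * v2
    return N[1] == 1 and N[5] == 0


for s in ["", "αβα", "αβααβα", "αβ", "ααα", "βαβ"]:
    print(repr(s), accepts_aba_star(s))
```

The cycle N1, N6, N7 marks every third position, while N5 records the first mismatch inside each group of three symbols.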
The free node N1 is initially set to 1 while all the other nodes are initially set to 0. For each input symbol, the new marking of the automaton is computed. Let the (k+1)th input symbol be mapped to $\vec{v}$=(v0, v1, v2), where v0 is always set to 0. If the input symbol is α, $\vec{v}$ is set to (0,1,0). If the input symbol is β, $\vec{v}$ is set to (0,0,1).
The new value of all the regular nodes is computed as follows
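$$
\begin{aligned}
N_1^{(k+1)} &= v_0 \\
N_2^{(k+1)} &= N_1^{(k)} \cdot v_1 + N_7^{(k)} \\
N_3^{(k+1)} &= N_2^{(k)} \cdot v_1 \\
N_4^{(k+1)} &= N_3^{(k)} \cdot v_2 \\
N_5^{(k+1)} &= N_1^{(k)} \cdot v_1 + N_4^{(k)} \cdot v_1 \\
N_6^{(k+1)} &= N_2^{(k)} \\
N_7^{(k+1)} &= N_6^{(k)}
\end{aligned}
\qquad \text{Eq. (8)}
$$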
The new value of the accumulating node N8 is computed as follows
$$
N_8^{(k+1)} = N_8^{(k)} + N_1^{(k)} \cdot v_2 + N_2^{(k)} \cdot v_2 + N_3^{(k)} \cdot v_1 + N_4^{(k)} \cdot v_2 + N_5^{(k)} \cdot v_2
\qquad \text{Eq. (9)}
$$
After any input symbol, it is possible to check the marking of DAAα(αβα)*. Only if N5=1 and N8=0, the input stream is accepted, otherwise rejected.
According to the transitions of DAAα(αβα)*, it is clear that in the initial marking of the automaton, N5 is set to 0 and N8 is set to 0. If the input stream is α, N5 will be set to 1 and N8 will stay 0. If the input stream is in α(αβα)*, N5 will be set to 1 and N8 will stay 0. If the input stream is not in α(αβα)*, N5 will be set to 0 or N8 will not equal 0.
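The behavior described above can be traced with a plain (unshared) sketch of Eq. (8) and Eq. (9). The function name `accepts_a_aba_star` is hypothetical, and the marking is kept as ordinary integers rather than shares.

```python
def accepts_a_aba_star(stream):
    """Plain (unshared) execution of the DAA for α(αβα)*."""
    # Marking N[1..8]; index 0 is unused. Only the free node N1 starts at 1.
    N = [0, 1, 0, 0, 0, 0, 0, 0, 0]
    for sym in stream:
        v0, v1, v2 = {"α": (0, 1, 0), "β": (0, 0, 1)}[sym]
        old = N[:]  # all updates read the marking of step k
        # Regular nodes, Eq. (8)
        N[1] = v0
        N[2] = old[1] * v1 + old[7]
        N[3] = old[2] * v1
        N[4] = old[3] * v2
        N[5] = old[1] * v1 + old[4] * v1
        N[6] = old[2]
        N[7] = old[6]
        # Accumulating node, Eq. (9)
        N[8] = (old[8] + old[1] * v2 + old[2] * v2 + old[3] * v1
                + old[4] * v2 + old[5] * v2)
    return N[5] == 1 and N[8] == 0


for s in ["α", "ααβα", "ααβααβα", "", "αα", "β", "ααα"]:
    print(repr(s), accepts_a_aba_star(s))
```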
Recognizing the Context Free Language α^sβ^s
Initial Marking and Execution of DAAα
All the free nodes N1, N3, N5 are initially set to 1 while the other nodes are initially set to 0. Let the (k+1)th input symbol be mapped to $\vec{v}$=(v0, v′0, v″0, v1, v2), where v0 and v′0 will always be set to 1 and v″0 will always be set to 0. When the new marking of the automaton is computed, v0 is given to N1, v′0 is given to N3 and v″0 is given to N5. If the input symbol is α, $\vec{v}$ is set to (1,1,0,1,0). If the input symbol is β, $\vec{v}$ is set to (1,1,0,0,1).
The new value of all the regular nodes is computed as follows
$$
\begin{aligned}
N_1^{(k+1)} &= v_0 \\
N_3^{(k+1)} &= v'_0 \\
N_5^{(k+1)} &= v''_0
\end{aligned}
\qquad \text{Eq. (10)}
$$
All the accumulating nodes are computed as follows
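$$
\begin{aligned}
N_2^{(k+1)} &= N_2^{(k)} - N_1^{(k)} \cdot v_1 \\
N_4^{(k+1)} &= N_4^{(k)} - N_3^{(k)} \cdot v_2 \\
N_6^{(k+1)} &= N_6^{(k)} - N_5^{(k)} \cdot v_2 \\
N_7^{(k+1)} &= N_7^{(k)} - N_6^{(k)} \cdot v_2 \\
N_8^{(k+1)} &= N_8^{(k)} - N_7^{(k)} \cdot v_1 + N_5^{(k)} \cdot v_2
\end{aligned}
\qquad \text{Eq. (11)}
$$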
Result of DAAα
After any input symbol, it is possible to check the marking of DAAα
Correctness of DAAα
According to the transitions of DAAα
Recognizing the Context Sensitive Language α^sβ^sγ^s
Initial marking and execution of DAAα
All the free nodes N1, N3, N5, N7 are initially set to 1 while the other nodes are initially set to 0. Let the (k+1)th input symbol be mapped to $\vec{v}$=(v0, v′0, v″0, v‴0, v1, v2, v3), where v0, v′0, v″0 will always be set to 1 and v‴0 will always be set to 0. When the new marking of the automaton is computed, v0 is given to N1, v′0 is given to N3, v″0 is given to N5 and v‴0 is given to N7. If the input symbol is α, $\vec{v}$ is set to (1,1,0,1,0). If the input symbol is β, $\vec{v}$ is set to (1,1,0,0,1).
The new value of all the regular nodes is computed as follows
$$
\begin{aligned}
N_1^{(k+1)} &= v_0 \\
N_3^{(k+1)} &= v'_0 \\
N_5^{(k+1)} &= v''_0
\end{aligned}
\qquad \text{Eq. (12)}
$$
All the accumulating nodes are computed as follows
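$$
\begin{aligned}
N_2^{(k+1)} &= N_2^{(k)} + N_1^{(k)} \cdot v_1 \\
N_4^{(k+1)} &= N_4^{(k)} + N_3^{(k)} \cdot v_2 \\
N_6^{(k+1)} &= N_6^{(k)} + N_5^{(k)} \cdot v_1 \\
N_7^{(k+1)} &= N_7^{(k)} + N_6^{(k)} \cdot v_0 \\
N_8^{(k+1)} &= N_8^{(k)} + N_7^{(k)} \cdot v_1 + N_5^{(k)} \cdot v_2
\end{aligned}
$$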
In Equation 12, there are three equations e1, e2, e3 with three variables N1, N2, N3 and three inputs v1, v2, v3.
When the input stream is actually unbounded, the present invention uses cascaded equations automata (which are a novel type of automata).
Cascaded equations will be defined first, as well as their execution. Then mapping of the cascaded equations into an automaton will be described.
Definition 7 (Execution of cascaded equations). Cascaded equations are a series of equations e1, e2, . . . , ef, where the results and inputs of the first equations e1, e2, . . . , ei are used to compute the result of the next equation ei+1. On the other hand, an equation ei cannot use the result of any ej such that j>i. Cascaded equations are computed serially from e1 to ef: the first equation is computed, then the second, and so on, until the last equation is computed.
Given the following cascaded equations:
$$
\begin{aligned}
e_1 &: N_1^{(k+1)} = N_1^{(k)} + v_1 \\
e_2 &: N_2^{(k+1)} = N_2^{(k)} + N_1^{(k+1)} \cdot v_2
\end{aligned}
\qquad \text{Eq. (13)}
$$
There are two equations, e1 and e2, in the cascaded equations described in Eq. (13). The two equations compute a vector of variables (N1,N2) using a vector of inputs (v1, v2). Before executing the cascaded equations, the variables of the vector are initialized. Then, at the execution stage, new values for N1 and N2 are computed sequentially using modulo-two arithmetic, first using e1 to compute N1 and then using e2 to compute N2. The input vector (v1, v2) may take one of the values (00), (01), (10) and (11). The state of the automaton is defined by the vector of the values of N1 and N2, which may take the values (00), (01), (10) or (11). A node of the automaton is denoted s(N1, N2), and the input vectors (11), (10), (01) and (00) are denoted by α, β, γ and τ, respectively. By computing the cascaded equations using modulo-two arithmetic, it is possible to obtain the automaton depicted in
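The mapping from the cascaded equations of Eq. (13) to an automaton can be enumerated mechanically. The following sketch computes every transition under modulo-two arithmetic; the names `step` and `transitions` are hypothetical, and the symbol naming follows the convention stated above ((11) is α, (10) is β, (01) is γ and (00) is τ).

```python
from itertools import product

# Input vector (v1, v2) -> symbol name, as in the text.
SYMBOLS = {(1, 1): "α", (1, 0): "β", (0, 1): "γ", (0, 0): "τ"}


def step(state, v):
    """One execution of the cascaded equations of Eq. (13), modulo two."""
    n1, n2 = state
    v1, v2 = v
    n1_new = (n1 + v1) % 2            # e1
    n2_new = (n2 + n1_new * v2) % 2   # e2 uses the freshly computed N1
    return (n1_new, n2_new)


# Transition table of the resulting automaton: (state, symbol) -> next state.
transitions = {
    (state, SYMBOLS[v]): step(state, v)
    for state in product((0, 1), repeat=2)
    for v in SYMBOLS
}

for (state, sym), nxt in sorted(transitions.items()):
    print(f"s{state} --{sym}--> s{nxt}")
```

Because e2 reads the new value of N1, the order of evaluation matters; swapping the two assignments would describe a different automaton.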
Next, the following cascaded equations are given:
$$
\begin{aligned}
e_1 &: N_1^{(k+1)} = v_1 \\
e_2 &: N_2^{(k+1)} = N_2^{(k)} + N_1^{(k+1)} \cdot v_2
\end{aligned}
\qquad \text{Eq. (14)}
$$
By computing the cascaded equations in Eq. (14) using modulo-two arithmetic, it is possible to get the corresponding automaton which is depicted in
Secret sharing is used to allow secure multi-party executions of the cascaded equations automata. The execution of cascaded equations automata is performed in three stages: an initial stage, an execution stage and a collection stage. The automaton in
Each variable's values in the cascaded equation automata are shared among several participants using secret sharing. Entries of the vector that represent each symbol of the input stream are also secret shared. For the particular example in
The dealer maps each input symbol a to an input vector $\vec{v}$. Then each element in the input vector $\vec{v}$ is secret shared into three parts by a random polynomial of degree 1. Each share of the input vector is then sent to one of the participants. Each participant computes the new value of N1 and N2 according to Eq. (13). Then, every participant holds the new share of N1 and N2.
Whenever it is desired to compute the result of the algorithm, all the participants are asked to send back the values that correspond to N1 and N2. Having the shares of all participants, it is possible to reconstruct the actual value of N1 and N2 using Lagrange interpolation. The value obtained indicates the current state of the automaton in
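The three stages can be sketched for Eq. (13) with three participants. The text available here does not state which field carries the shares, so the sketch makes a loud assumption: it uses GF(4), the smallest field of characteristic two with three nonzero evaluation points, so that modulo-two arithmetic on the secrets is preserved. The initial values N1=N2=0, the names `gf4_mul`, `share`, `reconstruct`, `run_shared` and `run_plain`, and the use of `random.Random` in place of a secure random source are likewise assumptions.

```python
import random

XS = [1, 2, 3]  # the three nonzero elements of GF(4), one per participant


def gf4_mul(a, b):
    """Multiply in GF(4) = GF(2)[x]/(x^2+x+1); elements are ints 0..3."""
    r = 0
    if b & 1:
        r ^= a
    if b & 2:
        r ^= a << 1
    if r & 4:            # reduce x^2 to x + 1
        r ^= 0b111
    return r


def share(bit, rng):
    """Shares f(x) = bit + a*x for a random a; addition in GF(4) is XOR."""
    a = rng.randrange(4)
    return [bit ^ gf4_mul(a, x) for x in XS]


def reconstruct(shares):
    # Lagrange interpolation at 0: for degree <= 2 over GF(4), all three
    # Lagrange coefficients equal 1, so f(0) = f(1) + f(2) + f(3).
    return shares[0] ^ shares[1] ^ shares[2]


def run_shared(inputs, seed=0):
    rng = random.Random(seed)
    n1, n2 = share(0, rng), share(0, rng)        # initial stage (assumed 0, 0)
    for v1, v2 in inputs:                        # execution stage
        s1, s2 = share(v1, rng), share(v2, rng)
        for p in range(3):                       # each participant, locally
            n1[p] ^= s1[p]                       # e1
            n2[p] ^= gf4_mul(n1[p], s2[p])       # e2 uses the new N1 share
    return reconstruct(n1), reconstruct(n2)      # collection stage


def run_plain(inputs):
    n1 = n2 = 0
    for v1, v2 in inputs:
        n1 = (n1 + v1) % 2
        n2 = (n2 + n1 * v2) % 2
    return n1, n2


seq = [(1, 1), (0, 1), (1, 0)]
print(run_shared(seq), run_plain(seq))
```

The share of N1 stays on a degree-1 polynomial because it is only ever added to, while the share of N2 lies on a degree-2 polynomial, which three participants can still interpolate regardless of the length of the input.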
Mapping from Cascaded Equations to Automaton
Each equation in the cascaded equations has a result. The results of the equations are selected to define a vector, the values of which encode a node in the cascaded equations automaton. A vector of variables of the cascaded equations is regarded as the input symbols to the mapping automaton.
Every set of cascaded equations can be mapped to an automaton by mapping the variables of the equations into a node of the automaton.
The cascaded equations automata scheme information theoretically secures the inputs and the states of the automaton.
A product of automata may be defined by executing several cascade automata in parallel. Two or more cascade equations with the same input can be merged together to obtain a new automaton.
Given that A=A1× . . . ×Ak is a cascade product of automata and B is a permutation automaton, |A1|= . . . =|Ak|=|B|, and assuming that for every i=1, . . . , k the automaton Ai is either a reset automaton or a permutation automaton that can be represented by a cascaded equation, where all transitions are in the same cyclic group as the transitions of B, then A can be secret shared for unbounded split input by n+1 parties with threshold 1, where n is computed as follows:
Let Φi be a function of the input and the states of A1, . . . , Ai−1 that outputs the input for Ai. By representing Φi as a multivariate polynomial, its highest degree term is of the form
$$
x_1^{\alpha_1} \cdots x_{i-1}^{\alpha_{i-1}}
$$
ni is defined to be ni=n1·α1+ . . . +ni−1·αi−1. Then, n is defined by max(n1, . . . , nk).
This result can be further generalized by having B (and each Ai) be either a reset automaton or a set of non-intersecting permutation automata (i.e., there are several non-intersecting sets of nodes, where each node is a permutation automaton). One additional generalization is the use of other modular operations (beyond mod 2) and hence larger fields.
The realization of non-permutation automaton as illustrated in
Secure and Private Repeated Computations on a Secret Shared File
The methods introduced above are implemented on a fixed (large) file. First, the file (e.g., biometric data) is secret shared and the shares are stored in clouds for future computation. It is then possible to compute repeatedly and iteratively on the secret shared file (for example, to search the file for different strings) by constructing the accumulating automaton for the needed computation and sending a copy of the automaton (possibly at different times) to each cloud that maintains shares of the file. Each cloud then performs calculations on the accumulating automaton using its file share as the input. At the end, each cloud sends the final state of the accumulating automaton back as an answer to the computation request. The final states received from the clouds allow the reconstruction of the state of each node of the accumulating automaton to obtain the computation result (for example, whether or not the string was found). This scheme is depicted in
In this stage, the basic parameters for the whole scheme are defined: the Alphabet (e.g., ASCII, binary) that the scheme works on, the computation field of all the accumulating automata and the highest polynomial degree the system can deal with.
In this stage, the given file f, the chosen Alphabet and number of clouds are used to output secret shares of each character of the file, where each character is encoded by a vector of secret shares, one secret share for each possible character.
This stage uses the user computation task as an input and outputs an automaton.
This stage uses the accumulating automaton and the shares of the file to output the result of the computation. The result is the share of the final marking of the accumulating automaton.
This is the final stage, in which the user receives the marking shares to output the computation result.
Suppose a provider, Peter, wants to store a network log file in clouds, and a user, David, wants to search for the string “attack America” in the file, but Peter does not want to give the whole file to David in clear text.
First, Peter uses the Initialization stage to produce streams of shares of his log file (a vector of shares for each character, character after character) and then stores each stream in a different cloud (or cloud virtual machine), not necessarily simultaneously. The clouds are unaware of their counterparts in the process. Then, David uses the Automaton Construction stage to get an accumulating automaton for the searching task (in some cases it is possible to give different independent parts of the accumulating automaton to different clouds). David sends the accumulating automaton to each cloud. Every cloud runs the Automaton Execution stage on its share of the file and the accumulating automaton. Each cloud sends the marks of the final states of the accumulating automaton back to David. David executes the Result Reconstruction stage to find the computation result. During the whole procedure, no cloud knows the exact network log file and only David knows the computation result.
It is possible to execute any string matching privately and securely in terms of information theoretic security. Other canonical examples of regular languages, context free languages and context sensitive languages can be computed efficiently in terms of information theoretic security. Remote authentication and data stream processing systems using cloud services can be implemented based on the schemes proposed by the present invention. It is also possible to design a general accumulating automaton (in the style of an FPGA), in which each original symbol is mapped to several symbols, so that the dealer is able to choose the non-participating arcs by always assigning zero to their labels. The information sent by malfunctioning participants or even malicious participants may be eliminated from the collected information by standard error correcting schemes, such as the Berlekamp-Welch method (described in U.S. Pat. No. 4,633,470).
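The search task in this example can be illustrated in the clear with a short sketch that leaves out secret sharing entirely. It assumes a chain-shaped accumulating automaton (one node per pattern prefix and one accumulating counter), which is an illustrative choice rather than the automaton of the invention, and it encodes each file character as a vector with one entry per possible character, as described for the file sharing stage. The names `one_hot` and `count_matches` are hypothetical.

```python
def one_hot(ch, alphabet):
    """Encode a character as a 0/1 vector with one entry per alphabet symbol."""
    return [1 if ch == a else 0 for a in alphabet]


def count_matches(text, pattern, alphabet):
    """Count occurrences of `pattern` using only additions and multiplications.

    chain[0] is a free node held at 1; chain[i] is 1 exactly when the last
    i characters read equal pattern[:i]. `acc` is the accumulating node.
    """
    idx = {a: i for i, a in enumerate(alphabet)}
    m = len(pattern)
    chain = [1] + [0] * (m - 1)
    acc = 0
    for ch in text:
        v = one_hot(ch, alphabet)   # in the shared scheme, a vector of shares
        old = chain[:]              # all updates read the marking of step k
        acc += old[m - 1] * v[idx[pattern[m - 1]]]
        for i in range(1, m):
            chain[i] = old[i - 1] * v[idx[pattern[i - 1]]]
    return acc


print(count_matches("abcabcab", "abc", "abc"))
```

In a secret shared execution each entry of `v` and each node value would be a share, and each cloud would run the same loop on its own shares; the polynomial degree of the counter then grows with the length of the pattern, which determines how many clouds are needed for reconstruction.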
The above examples and description have of course been provided only for the purpose of illustration, and are not intended to limit the invention in any way. As will be appreciated by the skilled person, the invention can be carried out in a great variety of ways, employing more than one technique from those described above, all without exceeding the scope of the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/IL2014/050372 | 4/23/2014 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
61815748 | Apr 2013 | US | |
61870838 | Aug 2013 | US |