This disclosure relates to the technical field of machine learning, and in particular, to a method, device and system for building blockchain-based secure aggregation in federated learning with data removal.
Over the past few years, blockchain has drawn significant attention from both academia and industry. Blockchain is a novel paradigm where mutually distrustful parties make transactions and manage data without involving a trusted third party. Blockchain achieves tamper-resistance and traceability for the transactions, offering anonymity and decentralization for the parties. Due to these features, blockchain can be applied to a wide spectrum of applications, such as cryptocurrency, financial services, crowd-sourcing systems, Vehicular Ad Hoc Networks (VANETs) and the like.
Federated learning (FL), also known as collaborative learning, is a machine learning technique that trains an algorithm via multiple independent sessions, each using its own dataset. This approach stands in contrast to traditional centralized machine learning techniques, where local datasets are merged into one training session, as well as to approaches that assume that local data samples are identically distributed. Federated learning addresses critical issues such as data privacy, data security, data access rights and access to heterogeneous data. Its applications span industries including defense, telecommunications, the Internet of Things, and pharmaceuticals.
Blockchain-based federated learning provides a potential approach to addressing these critical issues in federated learning, owing to the characteristics of the blockchain. However, FL's evolving landscape introduces fresh challenges, such as the "Right to be Forgotten" (RTBF). The privacy and security problems in the RTBF scenario have not been adequately considered. Therefore, there is a need for new approaches to blockchain-based federated learning.
In a first aspect of the disclosure, there is provided a method for building blockchain-based secure aggregation in federated learning with data removal, applied by a server in a system for building blockchain-based secure aggregation in federated learning with data removal, comprising:
In an implementation, the method further comprises: for an (i+1)-th iteration, sending a list of a third quantity of client nodes together with an aggregate result of model training information previously transmitted from the second quantity of client nodes, to each of the third quantity of client nodes, for allowing the third quantity of client nodes to reconstruct local models to be used in the (i+1)-th iteration, wherein the third quantity of client nodes are a subset of the second quantity of client nodes.
In an implementation, the symmetric bivariate polynomial denoted by F(x, y) and the asymmetric bivariate polynomial denoted by G(x, y) are selected and sent to the client nodes by a one-time dealer in an initialization process.
In an implementation, for any client node id, the pairwise seed computed for another client node with identifier id′ from the symmetric bivariate polynomial is fid1(id′)=F(id, id′), where F(id, id′)=F(id′, id).
In an implementation, the cypher text cid of the client node id is generated by:
wherein xid represents the model training information obtained by the client node id, U1 represents a set of identities of the first quantity of client nodes, PRG represents a Pseudo-Random Generator, gid2(0) represents the private seed computed from the asymmetric bivariate polynomial as G(0, id), p represents a roughly λ-bit prime, and λ is a security parameter.
In an implementation, the method further comprises:
In a second aspect of the present disclosure, there is provided a method for building blockchain-based secure aggregation in federated learning with data removal, applied by a first client node in a system for building blockchain-based secure aggregation in federated learning with data removal, comprising:
In an implementation, the symmetric bivariate polynomial F(x, y) and the asymmetric bivariate polynomial G(x, y) are selected and sent to the client nodes by a one-time dealer in an initialization process.
In an implementation, the method further comprises:
In an implementation, for the first client node id, the pairwise seed computed for another client node with identifier id′ from the symmetric bivariate polynomial is fid1(id′)=F(id, id′), where F(id, id′)=F(id′, id).
In an implementation, the cypher text cid of the client node id is generated by:
wherein xid represents the model training information obtained by the client node id, U1 represents a set of identities of the first quantity of client nodes, PRG represents a Pseudo-Random Generator, gid2(0) represents the private seed computed from the asymmetric bivariate polynomial as G(0, id), p represents a roughly λ-bit prime, and λ is a security parameter.
In an implementation, the method further comprises:
In an implementation, the reconstructing the local model to be used in the (i+1)-th iteration comprises:
In an implementation, the method further comprises:
In a third aspect of the present disclosure, there is provided a system for building blockchain-based secure aggregation in federated learning with data removal, comprising a server, a plurality of client nodes, and a blockchain, wherein:
In an implementation, the symmetric bivariate polynomial is F(x, y) and the asymmetric bivariate polynomial is G(x, y); the pairwise seed computed by a client node id for another client node with identifier id′ from the symmetric bivariate polynomial is fid1(id′)=F(id, id′), where F(id, id′)=F(id′, id); and the cypher text cid of the client node id is generated by:
wherein xid represents the model training information obtained by the client node id, U1 represents a set of identities of the first quantity of client nodes, PRG represents a Pseudo-Random Generator, gid2(0) represents the private seed computed from the asymmetric bivariate polynomial as G(0, id), p represents a roughly λ-bit prime, and λ is a security parameter.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description serve to explain the principles of the present disclosure.
In order to more clearly illustrate technical solutions in embodiments of the present disclosure or in the conventional technology, the accompanying drawings to be used in the description of the embodiments or the conventional technology are briefly introduced below. Obviously, other drawings may be obtained from these drawings by those skilled in the art without any creative effort.
The exemplary embodiments of the present disclosure are described below in detail with reference to the drawings. It should be understood that the exemplary embodiments described below are used only to illustrate and interpret the present disclosure and are not intended to limit the present disclosure.
It should be noted that the exemplary embodiments of the present disclosure and features in the exemplary embodiments may be combined with each other in the case of no conflict, and all the combinations fall within the protection scope of the present disclosure. In addition, although a logical order is shown in the flowchart, the steps shown or described may be performed in a different order from the order here in some cases.
In implementations, a computing device that performs a method for building blockchain-based secure aggregation in federated learning with data removal may include one or more processors (CPU, central processing unit), an input/output interface, a network interface and a memory.
The memory may include volatile memory, such as a random access memory (RAM), and/or non-volatile memory, such as a read-only memory (ROM) or a flash RAM, among other forms of computer readable media. The memory is an example of the computer readable medium. The memory may include a module 1, a module 2, . . . , and a module N (N is an integer greater than 2).
The computer readable medium includes non-volatile and volatile media as well as removable and non-removable storage media. A storage medium may store information by means of any method or technology. The information may be a computer readable instruction, a data structure, and a module of a program or other data. A storage medium of a computer includes, for example, but is not limited to, a phase change memory (PRAM), a static random access memory (SRAM), a dynamic random access memory (DRAM), other types of RAMs, a ROM, an electrically erasable programmable read-only memory (EEPROM), a flash memory or other memory technologies, a compact disk read-only memory (CD-ROM), a digital versatile disc (DVD) or other optical storages, a cassette tape, a magnetic disk storage or other magnetic storage devices, or any other non-transmission media, and may be used to store information accessible to the computing device.
In this disclosure, a method, device and system for building blockchain-based secure aggregation in federated learning with data removal are provided.
Reference is made to
The core attraction of FL lies in its decentralized nature, where data remains locally stored among participating clients for training a machine learning model. Each client 102 shares model training information, such as local model parameters or gradients (rather than data), with a server 101, which then combines all the model parameters or gradients to create a global model in a sharing round; this global model is then sent back to the clients for further training on their local data, and the process repeats until the model converges.
Nonetheless, as FL has gained prominence, it has also become a focal point for various privacy threats and integrity-compromising attacks. Specifically, local data can inadvertently be disclosed through the leakage of local gradient information. Furthermore, the adoption of global models introduces its own set of privacy risks. For example, a participating client may craft malicious updates with the intention of stealing private information from other participants. This can be achieved by observing the variations between the global models during successive training processes, such as discerning individual typing habits in a keyboard application. On the flip side, trust in the server 101 is not always guaranteed, and there is a potential for the server 101 to incorrectly aggregate models or return a global model that has been compromised, thereby posing a significant threat to data value and data privacy.
FL's evolving landscape introduces fresh challenges, such as the "Right to be Forgotten" (RTBF). In this context, participating clients may find the need to revoke the usage of their data during FL training, referred to as the paradigm of federated unlearning; e.g., a client may request an application to stop tracking and collecting their text data.
There is a need to address the privacy and integrity concerns in the RTBF context for the two critical reasons below:
Still referring to
In some implementations, the training procedure addresses the optimization problem of minMΣi=1nf(M, Di), where f(M, Di) is the objective function for measuring the effectiveness of the parameters M in modeling the local data Di, and n is the number of participating clients 102. Initially, the server 101 distributes a global model with initial parameters M to all clients 102. Each client 102 participates in solving the optimization problem of minM
where η is the learning rate and Bi is a batch of data randomly chosen from the local data Di. During a sharing round, the server 101 selects a subset of clients 102 to share their local models with the server 101, which then integrates them into a global model by
according to an aggregation (or averaging) rule. The aggregate result is then downloaded again by the clients 102 and is used to update their local models, and the clients 102 continue training over the same local data. Let Mj and Mji denote the global model and the i-th client's local model at the j-th iteration, respectively.
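For illustration, one sharing round of the above procedure may be sketched in Python as follows; the function and variable names, batch size and learning rate are illustrative assumptions, not part of the disclosed protocol.

```python
import numpy as np

def sharing_round(global_M, clients, eta=0.01, batch_size=32):
    # One FL sharing round: local SGD at each client, then server-side averaging.
    local_models = []
    for data, grad_fn in clients:          # grad_fn(M, batch) -> gradient of f
        M = global_M.copy()
        idx = np.random.choice(len(data), size=min(batch_size, len(data)), replace=False)
        M -= eta * grad_fn(M, data[idx])   # M_i <- M_i - eta * grad f(M_i, B_i)
        local_models.append(M)
    return np.mean(local_models, axis=0)   # aggregation (averaging) rule
```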
The server 101 coordinates the whole training procedure. The server 101 may store the per-client 102 local models and the global models in persistent storage in each sharing round.
There are a large number of clients 102 in dynamic connection with the server 101 via a network, e.g., WiFi. Each client 102 trains a local model on its local training data, and reports local model parameters to the server 101 in selected iterations.
During the process of cross-iteration training, some cases may arise, owing to various constraints faced by end clients 102, such as limited computing power, communication constraints, short battery life, or lack of interest in participating. With reference to the
In some examples, there is a potentially malicious server 101 capable of executing erroneous aggregate computations. The server 101 could produce malformed aggregate outcomes. Furthermore, the server 101 might engage in deliberate substitution of model parameters from some participating clients 102, or the strategic deletion of model parameters from clients 102 requesting data erasure. Detecting such misbehavior becomes more intricate during unlearning. For instance, the server 101 might aim to exclude an honest client's model update during regular learning, or irregularly incorporate a model update from a revoked client 102 during unlearning. However, honest clients 102 lack the ability to differentiate between revoked and non-revoked clients 102, exacerbating the challenge of identifying the server 101's misbehavior. The server 101 could also glean sensitive insights from each client's training data by initiating inference attacks on local or global models across multiple training iterations. In an especially malicious scenario, the server 101 might collaborate with a small subset of participating clients 102 to execute inference attacks. This disclosure seeks to alleviate privacy threats stemming from data deletion. In this context, adversaries could compromise the privacy of deleted data by analyzing models before and after unlearning.
It is assumed that clients 102 are honest-but-curious, which means they can honestly follow the training protocol, but try their best to learn private information of counterparts. Moreover, due to the inherent nature of FL, participating clients 102 consistently acquire global models irrespective of any revocation. Consequently, data that has been revoked might be susceptible to an attack from a client 102 capable of launching membership inference attacks on a global model before and after its deletion, similar to the malicious server 101 scenario previously mentioned. It is important to note that there are no overlapping data points across clients 102' individual training sets. An embodiment of the present disclosure accounts for a fraction α of clients 102 that may drop out, as well as a fraction β that might become corrupted. Notably, the combined value of α and β does not exceed ½. An embodiment of the present disclosure imposes a requirement where at least a fraction ω of clients 102 must remain non-malicious to preserve privacy throughout each round of secure aggregation. It is notable that ω is set such that ω≥1−α−β. Lastly, an embodiment of the present disclosure assumes the presence of an authenticated secure channel between any two clients 102, possibly achieved through a signature-based mechanism or similar means.
Model Privacy. In normal case, both local model privacy and global model privacy are protected: (1) Local model privacy. Each client's local models are protected against the server 101 and other clients 102. Aggregate results do not reveal any individual local model of a client. (2) Global model privacy. Aggregate results can only be accessed by participating clients 102, and cannot be obtained by the server 101.
Aggregation Verifiability. Some embodiments of the present disclosure provide aggregation verifiability for a participating client 102 to quickly verify whether aggregate outcomes are generated correctly. Regardless of deletion, the client 102 can be convinced that the aggregate outcomes are properly generated, rather than unexpectedly removing a participating client's local models, or non-compliantly including a revoked client's local models.
Now referring to
In S201, the server selects, from the plurality of client nodes in the system, a first quantity of client nodes, to participate in an i-th iteration of the federated learning, where i is an integer.
In S202, the server sends a list of the selected client nodes to each of the first quantity of client nodes.
For example, the server may send a list U1 of the identifiers of the first quantity of client nodes, to each of the selected client nodes.
In S203, the server may acquire model training information transmitted by each of a second quantity of client nodes among the first quantity of client nodes. The model training information is transmitted from the client nodes to the server in the form of cypher text. The client nodes separately train respective local models with their own local training data to obtain respective model training information.
In an embodiment of the present disclosure, the cypher text is generated by each client node through performing homomorphic encryption on its model training information based on pairwise seeds computed for each of the first quantity of client nodes from a symmetric bivariate polynomial and a private seed computed from an asymmetric bivariate polynomial.
In an embodiment of the present disclosure, the symmetric bivariate polynomial F(x, y) and the asymmetric bivariate polynomial G(x, y) are selected and sent to the client nodes by a one-time dealer in an initialization process.
In an embodiment of the present disclosure, during the initialization process, system parameters are initialized, including n selected client nodes; their public, distinct and non-zero identifiers id∈U={1, . . . , n}; a security parameter λ; threshold values t and h; the length m of the model training information (for example, gradient) parameters; a large prime finite field R=Zp, where p is a roughly λ-bit prime larger than n; and a pseudo-random generator PRG: {0, 1}*→Zpm.
In an embodiment of the present disclosure, the one-time dealer is trusted, and is configured to: select two secret keys s1, s2 from Zp; generate a symmetric bivariate polynomial associated with s1, F(x, y)=a00+a10x+a01y+a11xy+ . . . +at,txtyt (mod p), in which ai,j=aj,i, ai,j, aj,i∈Zp, i, j∈[0, t], and let F(0,0)=s1, and then share with each client node id∈[n] the secret key shares {fid1(y)=F(id, y), fid2(x)=F(x, id)}, by separately substituting id for variable x and for variable y; and generate an asymmetric bivariate polynomial associated with s2, that is, G(x, y)=b00+b10x+b01y+b11xy+ . . . +bt,hxtyh (mod p), in which bi,j≠bj,i, bi,j, bj,i∈Zp, i∈[0, t], j∈[0, h], and let G(0,0)=s2, and then share with each client node id∈[n] the secret key shares {gid1(y)=G(id, y), gid2(x)=G(x, id)}.
In a preferred embodiment of the present disclosure, in the polynomials F(x, y) and G(x, y), the degrees t and h satisfy the relationship (t+1)t−1<h<n.
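The dealer setup may be sketched as follows; the fixed Mersenne prime, Python's random module, and the delivery of shares as callable restrictions (rather than transmitted coefficient vectors) are simplifying assumptions for illustration only.

```python
import random

p = 2**61 - 1  # stand-in for the roughly λ-bit prime

def eval_poly(coeffs, x, y):
    # coeffs[i][j] is the coefficient of x^i * y^j in Z_p
    return sum(c * pow(x, i, p) * pow(y, j, p)
               for i, row in enumerate(coeffs) for j, c in enumerate(row)) % p

def dealer_setup(n, t, h, s1, s2):
    # Symmetric F(x, y): a[i][j] == a[j][i] and F(0, 0) = s1.
    a = [[0] * (t + 1) for _ in range(t + 1)]
    for i in range(t + 1):
        for j in range(i, t + 1):
            a[i][j] = a[j][i] = random.randrange(1, p)
    a[0][0] = s1
    # Asymmetric G(x, y): degree t in x, degree h in y, G(0, 0) = s2.
    b = [[random.randrange(1, p) for _ in range(h + 1)] for _ in range(t + 1)]
    b[0][0] = s2
    # Each client id receives the univariate restrictions of F and G.
    return {cid: {"f1": (lambda y, c=cid: eval_poly(a, c, y)),   # F(id, y)
                  "f2": (lambda x, c=cid: eval_poly(a, x, c)),   # F(x, id)
                  "g1": (lambda y, c=cid: eval_poly(b, c, y)),   # G(id, y)
                  "g2": (lambda x, c=cid: eval_poly(b, x, c))}   # G(x, id)
            for cid in range(1, n + 1)}

shares = dealer_setup(n=4, t=1, h=2, s1=7, s2=9)
assert shares[1]["f1"](3) == shares[3]["f1"](1)   # F(1, 3) == F(3, 1)
```

The final assertion checks the pairwise-seed property F(id, id′)=F(id′, id) relied upon in the masking step below.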
By utilizing the symmetric and asymmetric bivariate polynomials obtained in the initialization process, each client node can perform homomorphic encryption on its own model training information to obtain the cypher text of the model training information.
For example, for any client node id, the client node id can calculate a pairwise seed for another client node in U1 with identifier id′ from the symmetric bivariate polynomial as fid1(id′)=F(id, id′). Due to the symmetric characteristics of F(x, y), F(id, id′)=F(id′, id). In addition, the client node id can calculate, from the asymmetric bivariate polynomial, a private seed gid2(0)=G(0, id).
In an exemplary embodiment of the present disclosure, with the calculated seeds together with the initialized pseudo-random generator PRG, the client node id can calculate the cypher text cid of the model training information xid, for example, by masking xid with the PRG expansions of the pairwise seeds and of the private seed modulo p, as sketched below.
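A minimal sketch of this masking step follows. The sign convention (adding PRG masks toward higher identifiers and subtracting toward lower ones) and the hash-based seed expander are assumptions made for illustration; the disclosure does not fix these details here. Here f1 and g2 are the share functions distributed by the dealer sketch above.

```python
import hashlib

p, m = 2**61 - 1, 4

def prg(seed):
    # Toy seed expander to an m-vector over Z_p (illustrative only, not secure).
    return [int.from_bytes(hashlib.sha256(f"{seed}|{j}".encode()).digest(), "big") % p
            for j in range(m)]

def mask(x_id, my_id, U1, f1, g2):
    # c_id = x_id + signed pairwise PRG masks + private PRG mask (mod p).
    c = [v % p for v in x_id]
    for other in U1:
        if other == my_id:
            continue
        pad = prg(f1(other))               # pairwise seed F(my_id, other)
        sign = 1 if other > my_id else -1  # assumed ordering convention
        c = [(ci + sign * pi) % p for ci, pi in zip(c, pad)]
    priv = prg(g2(0))                      # private seed g_id^2(0) = G(0, id)
    return [(ci + pi) % p for ci, pi in zip(c, priv)]
```

Because F(id, id′)=F(id′, id), the signed pairwise masks cancel in the sum of the cypher texts, leaving Σxid blinded only by the private-seed masks PRG(G(0, id)).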
As described above and as will be understood by those skilled in the art, there may be cases such as rejection or dropout, in which some of the client nodes in U1 may be unwilling to share their model training information in the i-th iteration or may drop out owing, for example, to disconnecting from the server or running out of local resources. In such scenarios, the server cannot acquire the cypher text of model training information from all of the first quantity of client nodes, but only from a subset U2 of the first quantity of client nodes, namely, acquiring the cypher text of model training information from the second quantity of client nodes. In other words, U2⊆U1 and |U2|>t.
In S204, the server may aggregate the cypher text of the model training information, acquired from the second quantity of client nodes, to obtain an aggregate result.
In an embodiment, the server may obtain the aggregate result Σid∈U2cid∈Rm.
In S205, the server may broadcast a list of the second quantity of client nodes and the aggregate result via a blockchain.
In an embodiment of the present disclosure, the server may broadcast the list ({id}id∈U
In an embodiment of the present disclosure, the client nodes may go through a consistency check thereafter. In this consistency check, each client node agrees on the set of surviving counterparts, and the set size must be larger than t. Otherwise, the consistency check fails.
In an embodiment of the present disclosure, in the case that the consistency check is successful, one client node from the second quantity of client nodes U2 can request the server to remove all contributions of its model training information that have been included in the previously generated aggregate result. By following the rule of RTBF, the revocation may be achieved in the following way, in accordance with an embodiment of the present disclosure.
For example, after receiving the request for revocation, in the (i+1)-th iteration, the server may determine the currently on-line client nodes. For example, the on-line client nodes include the client nodes from the second quantity of client nodes except those who have requested revocation. In this case, the on-line client nodes form a subset U3 of the second quantity of client nodes, in a third quantity.
In an embodiment of the present disclosure, the method may further include:
In S206, for the (i+1)-th iteration, sending, by the server, a list of the third quantity of client nodes together with an aggregate result of model training information previously transmitted from the second quantity of client nodes, to each of the third quantity of client nodes.
The server may send ({id}id∈U
Each of the third quantity of client nodes, i.e., the on-line client nodes, can reconstruct local models to be used in the (i+1)-th iteration, based on ({id}id∈U
In an embodiment of the present disclosure, each client node id from the third quantity of client nodes can receive ({id}id∈U
As can be understood, the iterations repeat until the model converges.
The method according to embodiments of the present disclosure, as described above, allows dropout-resiliency and corruption-tolerance guarantees, reduces communication and computational costs on the server side while not sacrificing the round complexity of communication, and protects the global model privacy against the server.
Furthermore, the method according to embodiments of the present disclosure, as described above, preserves the data privacy of any revoked client against the remaining clients, who can always obtain the aggregate models before and after data revocation.
In addition, the method according to embodiments of the present disclosure, as described above, integrates a lightweight homomorphic authentication mechanism.
In accordance with an exemplary implementation of the present disclosure, deletion privacy may be protected against potential inference attacks from curious remaining client nodes, by adding a small amount of discrete Gaussian noise to clients' local models before aggregation.
Specifically, in S2031, the server may acquire the cipher text cid and a tag tagid transmitted by each of the second quantity of client nodes, where the cipher text cid and the tag tagid are computed by MaskAndMAC(pp, sk, id, xid)→(cid, tagid), wherein pp represents a public parameter component available to the server and the client nodes, and sk represents a secret key secretly shared among the client nodes.
In S2041, the server aggregates respective tags into an aggregate tag.
In S2051, the server broadcasts the aggregate tag associated with the aggregate result via the blockchain.
In this way, the deletion privacy may be further protected.
Now referring to
In S301, the first client node id may receive, from the server in the system, a list U1 of selected client nodes, where the selected client nodes are in a first quantity and are selected to participate in an i-th iteration of the federated learning, where i is an integer. The first quantity of client nodes are selected by the server from the plurality of client nodes in the system, and the first client node is one of the first quantity of the selected client nodes.
In S302, the first client node may acquire model training information by training a local model with local training data.
The first quantity of client nodes may separately train their local models with their own local training data to acquire respective model training information.
In S303, the first client node may generate cypher text of the model training information by performing homomorphic encryption on the model training information based on pairwise seeds computed for each of the first quantity of client nodes from a symmetric bivariate polynomial and a private seed computed from an asymmetric bivariate polynomial.
In an embodiment of the present disclosure, the symmetric bivariate polynomial F(x, y) and the asymmetric bivariate polynomial G(x, y) are selected and sent to the client nodes by a one-time dealer in an initialization process.
In an embodiment of the present disclosure, during the initialization process, system parameters are initialized, including n selected client nodes; their public, distinct and non-zero identifiers id∈U={1, . . . , n}; a security parameter λ; threshold values t and h; the length m of the model training information (for example, gradient) parameters; a large prime finite field R=Zp, where p is a roughly λ-bit prime larger than n; and a pseudo-random generator PRG: {0, 1}*→Zpm.
In an embodiment of the present disclosure, the one-time dealer is trusted, and is configured to: select two secret keys s1, s2 from Zp; generate a symmetric bivariate polynomial associated with s1, F(x, y)=a00+a10x+a01y+a11xy+ . . . +at,txtyt (mod p), in which ai,j=aj,i, ai,j, aj,i∈Zp, i, j∈[0, t], and let F(0,0)=s1, and then share with each client node id∈[n] the secret key shares {fid1(y)=F(id, y), fid2(x)=F(x, id)}, by separately substituting id for variable x and for variable y; and generate an asymmetric bivariate polynomial associated with s2, that is, G(x, y)=b00+b10x+b01y+b11xy+ . . . +bt,hxtyh (mod p), in which bi,j≠bj,i, bi,j, bj,i∈Zp, i∈[0, t], j∈[0, h], and let G(0,0)=s2, and then share with each client node id∈[n] the secret key shares {gid1(y)=G(id, y), gid2(x)=G(x, id)}.
In a preferred embodiment of the present disclosure, in the polynomials F(x, y) and G(x, y), the degrees t and h satisfy the relationship (t+1)t−1<h<n.
By utilizing the symmetric and asymmetric bivariate polynomials obtained in the initialization process, each client node can perform homomorphic encryption on its own model training information to obtain the cypher text of the model training information.
For example, the first client node id can calculate a pairwise seed for another client node in U1 with identifier id′ from the symmetric bivariate polynomial as fid1(id′)=F(id, id′). Due to the symmetric characteristics of F(x, y), F(id, id′)=F(id′, id). In addition, the first client node id can calculate, from the asymmetric bivariate polynomial, a private seed gid2(0)=G(0, id).
In an exemplary embodiment of the present disclosure, with the calculated seeds and together with the initialized pseudo-random generator PRG, the first client node id can calculate the cypher text cid of the model training information xid, for example, by cid←xid+Σid′∈U
In S304, the first client node may send the cypher text of the model training information to the server.
As described above and as will be understood by those skilled in the art, there may be cases such as rejection or dropout, in which some of the client nodes in U1 may be unwilling to share their model training information in the i-th iteration or may drop out owing, for example, to disconnecting from the server or running out of local resources. In such scenarios, the server cannot acquire the cypher text of model training information from all of the first quantity of client nodes, but only from a subset U2 of the first quantity of client nodes, namely, acquiring the cypher text of model training information from the second quantity of client nodes. In other words, U2⊆U1 and |U2|>t. In this way, the server may obtain the aggregate result Σid∈U2cid∈Rm.
As described above and as will be understood by those skilled in the art, there may be a case of revocation, in which one client node from the second quantity of client nodes U2 can request the server to remove all contributions of its model training information that have been included in the previously generated aggregate result. In this case, the server may determine the currently on-line client nodes. For example, the on-line client nodes include the client nodes from the second quantity of client nodes except those who have requested revocation. In this case, the on-line client nodes form a subset U3 of the second quantity of client nodes, in a third quantity.
In an embodiment of the present disclosure, the method may further include the following steps.
In S305, for an (i+1)-th iteration, the first client node may acquire, from the server, a list of the third quantity of client nodes together with an aggregate result of model training information previously transmitted from the second quantity of client nodes.
In S306, for the (i+1)-th iteration, the first client node may reconstruct a local model to be used in the (i+1)-th iteration by using the list of the third quantity of client nodes together with the aggregate result of model training information previously transmitted from the second quantity of client nodes.
In an exemplary implementation, the reconstructing the local model to be used in the (i+1)-th iteration may include:
To provide improved privacy protection with regard to data revocation, discrete Gaussian noise may be added to clients' local models before aggregation, as described with reference to
In S3031, for the i-th iteration, the first client node may compute the cipher text cid and a tag tagid by MaskAndMAC(pp, sk, id, xid)→(cid, tagid), wherein pp represents a public parameter component available to the server and the client nodes, sk represents a secret key secretly shared among the client nodes, and xid represents the model training information of the first client node id.
In S3032, the first client node may send the cipher text cid and the tag tagid to the server.
In this implementation, the server may aggregate respective tags into an aggregate tag and broadcast the aggregate tag associated with the aggregate result via the blockchain.
In S3033, for the (i+1)-th iteration, the first client node may acquire, from the blockchain, the aggregate tag, and before reconstructing the local model, verify the correctness of the aggregate result by checking the validity of the aggregate tag against the aggregate result.
The methods for building blockchain-based secure aggregation in federated learning with data removal may find application in a variety of fields, such as Internet of Things (IoT), healthcare, etc.
In the following, a method (which may also be referred to as a protocol) for building blockchain-based secure aggregation in federated learning with data removal according to another embodiment of the present disclosure is described with reference to
All participating clients have access to a public bulletin board, e.g., a blockchain. The blockchain may store all historical masked aggregate models and the corresponding identities of the involved clients. The blockchain is very useful for consistency checks among clients at runtime. This work employs a permissioned blockchain involving a small group of members who are organized by a secure consensus protocol. Off-the-shelf blockchains, such as Hyperledger and BCOS, can be adopted.
In this embodiment of the present disclosure, bivariate polynomial-based secret sharing (BPSS) is adopted. A dealer can hide a secret in the constant term of the bivariate polynomial. The dealer may compute two polynomials related to the secret as well as a participant's identifier, and then, securely distribute the two polynomials (instead of a point) to the participant. Formally, a bivariate polynomial is defined as F(x, y)=Σi=0tΣj=0tai,jxiyj mod p, with the same degree t in variable x and in variable y, where p is a large prime and ai,j∈Zp. A dealer chooses a secret s∈Zp and lets F(0,0)=s. According to the coefficient features, a bivariate polynomial can be classified as a symmetric bivariate polynomial (SBP) if ai,j=aj,i, or an asymmetric bivariate polynomial (ABP) if ai,j≠aj,i. Such features bring some advantages to the BPSS. Concretely, for symmetric BPSS, a participant i is configured with shares {F(i, y), F(x, i)}. A pair of shared keys can then be established between participants i and j due to F(i, j)=F(j, i). For asymmetric BPSS, there are two pairwise shared keys between participants i and j, that is, F(i, j) and F(j, i).
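These key-establishment properties can be checked numerically with small illustrative polynomials over a toy prime (the same F and G reused in the four-client example later in this description):

```python
p = 101  # small illustrative prime

def F(x, y):  # symmetric bivariate polynomial: one shared key per pair
    return (2 + x + y + 4 * x * y) % p

def G(x, y):  # asymmetric bivariate polynomial: two pairwise keys
    return (3 + x + 2 * y + x * y + 4 * y * y + 2 * x * y * y) % p

i, j = 1, 3
assert F(i, j) == F(j, i)   # SBP: F(1, 3) = F(3, 1) = 18
assert G(i, j) != G(j, i)   # ABP: G(1, 3) = 67, G(3, 1) = 21
```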
The derived secret sharing mechanism satisfies the additive homomorphism property. Formally, let χ be the share domain, Y be the secret domain and ⊕ be the addition operation. Denote Fr as a secret reconstruction function for a (t+1, n)-BPSS mechanism. The previously described BPSS mechanism has the additive homomorphism property if, for all sets of (t+1) shares, whenever y=Fr(x1, . . . , xt+1) and y′=Fr(x′1, . . . , x′t+1), then y⊕y′=Fr(x1⊕x′1, . . . , xt+1⊕x′t+1).
A secure Pseudo-Random Generator (PRG) is required in this embodiment of the present disclosure, which extends a fixed-size uniformly random seed to an m-length random vector over a prime field as output. The PRG is secure if the output is computationally indistinguishable from a uniformly selected vector over the same prime field. More importantly, a key-homomorphic PRG mechanism may be adopted, which enables ΣPRG(ki)=PRG(Σki). The main reason is that the PRG computation of each seed for masking gradients dominates the computation overhead of clients.
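For illustration of the key homomorphism only, the following toy linear expander satisfies ΣPRG(ki)=PRG(Σki); it is emphatically not a secure PRG, and a practical instantiation would use, e.g., a lattice-based key-homomorphic construction.

```python
import hashlib

p, m = 2**61 - 1, 4

# Public expansion coefficients, derived from a public string.
A = [int.from_bytes(hashlib.sha256(f"coef{j}".encode()).digest(), "big") % p
     for j in range(m)]

def prg(k):
    # Toy *linear* expander PRG(k)_j = A_j * k mod p: key-homomorphic by
    # construction, but NOT cryptographically secure.
    return [(aj * k) % p for aj in A]

k1, k2 = 12345, 67890
summed = [(u + v) % p for u, v in zip(prg(k1), prg(k2))]
assert summed == prg((k1 + k2) % p)   # ΣPRG(ki) = PRG(Σki)
```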
In S401, one-time dealer setup is performed. System parameters are initialized, including n selected clients; their public, distinct and non-zero identifiers id∈U={1, . . . , n}; a security parameter λ; threshold values t and h; the length m of gradient parameters; and a large prime finite field R=Zp, where p is a roughly λ-bit prime larger than n. A pseudo-random generator PRG: {0, 1}*→Zpm is also initialized.
A trusted dealer is entitled to:
Note that in the above polynomials, the degrees t and h satisfy the relationship (t+1)t−1<h<n.
In S402, masking and uploading for round 1 is executed by a client. The client id is to:
In S403, aggregation is performed by the server. The server is to:
In S404, consistency check for round 2 is performed, in which each client agrees on the set of surviving counterparts, and the set size must be larger than t. Otherwise, the protocol aborts.
In S405, client unmasking for round 3 is executed. The client id is to:
//pairwise shared keys gid1(id′), gid′2(id) and gid′1(id), gid2(id′) have been built between id and id′ owing to the Dealer Setup round.
To provide better understanding of the embodiments of the present disclosure, a four-client scenario with the foregoing protocol that does not consider deletion privacy is described.
One-Time Dealer Setup: A dealer generates an SBP F(x, y)=2+x+y+4xy and an ABP G(x, y)=3+x+2y+xy+4y2+2xy2. Herein, the coefficients are non-zero elements of the field R. Each client id∈{1, 2, 3, 4} is shared with secret keys as follows:
With the setup, the protocol involves four clients, and allows one dropped client and one corrupted client at any time of subsequent execution.
Masking and Uploading: Individual client id∈{1, 2, 3, 4} masks its local gradient xid, which is a non-zero element of the field R (assuming |xid|=1 for simplicity), by using secret keys as follows:
It is assumed that the 4th client drops out in this round, so the gradients the server receives are from the first three clients. Finally, the gradient sum is Σid∈{1,2,3}cid.
Client Unmasking: With Σid∈{1,2,3}cid returned by the server, the clients collaborate to unmask the sum. It is assumed that client 2 is corrupted in this round. The collaboration of the remaining honest clients with id=1 and id=3 should successfully decrypt and obtain Σid∈{1,2,3}xid. Note that F(1, 2), F(1, 3), F(2, 1), F(2, 3), F(3, 1), F(3, 2) can be mutually cancelled out, and therefore, the next-step computation is to remove F(2, 4), F(1, 4), F(3, 4) and to recover G(0, 2), G(0, 1), G(0, 3). It is noted that two pairwise shared keys {G(1, 3), G(3, 1)} are pre-established between clients 1 and 3, such that they can privately exchange the Lagrange components of the shares F(1, 4) and F(3, 4), as well as G(1, 2) and G(3, 2), via private channels. Then, F(x, 4) and G(x, 2) are reconstructed via Lagrange interpolation, and thus F(2, 4) and G(0, 2) are obtained. The clients can recover G(0, 1) and G(0, 3) in a similar way. As a result, clients 1 and 3 can obtain the gradient sum Σid∈{1,2,3}xid.
It is noted that the total number of clients should be at least larger than three times the number of corrupted clients (introduced in the next section). It is emphasized that clients 1 and 3 are honest, and they do not collude in the masking and uploading phase to recover F(x, y) and thereby an individual client's gradient.
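The toy run can be verified end-to-end. The sketch below uses F and G directly as masks (omitting the PRG for brevity) and assumes the sign convention in which a client adds masks toward higher identifiers and subtracts toward lower ones; both simplifications are illustrative assumptions rather than the disclosed construction.

```python
p = 101  # toy prime

def F(x, y): return (2 + x + y + 4*x*y) % p
def G(x, y): return (3 + x + 2*y + x*y + 4*y*y + 2*x*y*y) % p

def mask(x, cid, U1):
    c = x + G(0, cid)                                  # private seed G(0, id)
    for other in U1:
        if other != cid:
            c += F(cid, other) if other > cid else -F(cid, other)
    return c % p

def lagrange_at(x0, pts):
    # Evaluate the degree-1 polynomial through two points at x0 (mod p).
    (x1, y1), (x2, y2) = pts
    return (y1 * (x2 - x0) + y2 * (x0 - x1)) * pow(x2 - x1, p - 2, p) % p

U1, survivors = [1, 2, 3, 4], [1, 2, 3]                # client 4 drops out
xs = {1: 10, 2: 20, 3: 30}
agg = sum(mask(xs[i], i, U1) for i in survivors) % p   # server-side sum

# Clients 1 and 3 interpolate the missing values from their own shares:
F24 = lagrange_at(2, [(1, F(1, 4)), (3, F(3, 4))])     # F(x, 4) at x = 2
G0 = {j: lagrange_at(0, [(1, G(1, j)), (3, G(3, j))]) for j in survivors}
result = (agg - F(1, 4) - F24 - F(3, 4) - sum(G0.values())) % p
assert result == sum(xs.values()) % p                  # recovers 60
```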
To provide better understanding of the embodiments of the present disclosure, a general description of the foregoing protocol is provided with four rounds. A centralized server coordinates the communication and aggregates the trained gradients of n clients which are selected in an iteration of the FL training. A gradient is denoted as an m-length vector with elements in the field Zp. The identity of each client is unique and denoted as id∈{1, . . . , n}, for ease of presentation. The client id's local gradient is denoted as xid, which is an m-length vector in Zp, i.e., |xid|=m.
To start with, a trusted dealer is entitled to generate two secrets, and to configure the shares of the secrets for each client using polynomial derivatives. A symmetric bivariate polynomial has the degree t in variables x and y, and an asymmetric bivariate polynomial has the degree t in x and the degree h in y, where (t+1)t−1<h<n. Recall that the degree t and the degree h utilized by the previously described toy protocol are 1 and 2, respectively. Owing to such a setup, dropped or corrupted clients are tolerated at any time, and the remaining clients are able to recover the final sum as long as the honest client count is more than t. Moreover, every pair of clients shares two pairs of common keys for privately exchanging messages, such that secure recovery can be completed at the client side rather than the server side. Also, the clients are not required to establish a key agreement, e.g., a Diffie-Hellman key agreement. Notice that a consistency round is involved for ensuring that all surviving clients have a consistent view of their counterparts. With such a consistent view, surviving clients are able to exchange shares via private communication, in order to recover the aggregate local models.
It is noted that t<n/3 can ensure the accurate reconstruction of secrets. This relationship is established based on principles from coding theory. In scenarios where the client count is represented by n, the system can tolerate up to (n−t)/2 instances of corruption. Hence, the condition t<(n−t)/2, which is equivalent to 3t<n, i.e., t<n/3, holds great significance in guaranteeing the precision of secret reconstruction. In practice, the condition t<(n−t)/2 remains satisfied, provided that the two polynomials F(x, y) and G(x, y) are designed according to the parameter setting (t+1)t−1<h<n.
The communication cost for the server is dominated by receiving the masked inputs from participating clients, which is O(mn). The server's computation cost is only caused by aggregating the clients' masked inputs, each of which is of length m, resulting in O(m).
To provide better privacy protection, a refined protocol is further provided according to another embodiment of the present disclosure, in the presence of data revocation. When the unmasking phase is done, each client obtains a noisy aggregate result of all clients' local models. This aims to provide client-level DP, which prevents non-revoked clients from learning private data associated with a revoked client.
Note that the privacy requirement is weaker than that of local DP, and the noise introduced at the individual client level is not adequate to guarantee DP on its own. Therefore, the privacy preservation of a noisy local model greatly depends on the security of the foregoing protocol. For example, a delayed client's noisy local model is privacy-preserving, only if the majority of clients and the server are honest, since the foregoing protocol allows client dropout and corruption at any time.
In the considered scenario, DP is defined with respect to the addition or exclusion of all local data belonging to a single client (cf. D′), which achieves client-level privacy.
Definition 1: (Differential Privacy) The mechanism M is (ε, δ)-differentially private, where ε>0 and δ≥0, iff for all possible output sets S and for any two neighboring datasets D1 and D2 differing by a dataset D′, i.e., D1−D2=D′ or D2−D1=D′, it holds that Pr[M(D1)∈S]≤(1+ε)·Pr[M(D2)∈S]+δ.
In the above definition, ε>0 bounds the distinguishability of all possible output sets on the adjacent datasets D1 and D2; δ≥0 refers to the probability that the (1+ε) bound fails to hold for the datasets D1 and D2 under the noisy mechanism M. M:D→R is defined with a domain D, e.g., training sets, and with a range R, e.g., all trained global models. The notion of DP implies that the parameter distributions of two models trained on datasets D1 and D2 are indistinguishable, where D1 and D2 differ only in a specific client's local data.
Definition 2: (Rényi Differential Privacy) The mechanism M is (α, ε2α/2)-Rényi differentially private, where α∈(1, ∞), iff for all possible output sets S and for any two neighboring datasets D1 and D2 differing by a dataset D′, the Rényi divergence of order α of the two outputs M(D1)∈S and M(D2)∈S is represented
Sensitivity for Querying Global Model
Recall that Mi is the local model learnt over the i-th client's training set. It is assumed that the sizes of the clients' training sets are equal, and therefore, M is influenced by the data associated with each particular client in equal proportion. The sensitivity of querying M is defined as
If any local model Mi has a bounded norm c, that is, ∥Mi∥2≤c, then SM is bounded by at most c. However, a local model may not be bounded by the desired c. Therefore, each client is to clip its local model, achieving ∥Mi∥2≤c.
Randomized Hadamard Transform: The DP mechanism can be integrated with the foregoing protocol by doing modular arithmetic. In order to mitigate the errors caused by modular rounding, the Hadamard transform operation shown in Definition 3 will be adopted, which consumes O(m log m) for an m-dimension gradient vector x. Herein, it is assumed that m is a power of 2. Notice that the definition uses a Walsh-Hadamard matrix denoted in Definition 4.
Definition 3: The randomized Hadamard transform operation on an m-dimension vector x=(x1, . . . , xm) is defined as H(x)=(1/√m)·Hm·D·x, where d1, . . . , dm∈{−1, +1}, sampled independently and uniformly at random, are the diagonal values of the m×m matrix D. Also, the inverse transform operation is denoted as H−1(y)=D·(1/√m)·Hm·y.
Definition 4: The Walsh-Hadamard matrix Hm∈{−1, +1}m×m is defined recursively as Hm=(Hm/2 Hm/2; Hm/2 −Hm/2), and when m equals 1, Hm=(1), which satisfies Hm·HmT=m·Im.
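Under the assumption that the unnormalized fast Walsh-Hadamard transform realizes multiplication by Hm, the transform and its inverse can be checked numerically with the following illustrative sketch:

```python
import numpy as np

def fwht(v):
    # Unnormalized fast Walsh-Hadamard transform: computes H_m @ v in O(m log m).
    v = np.asarray(v, dtype=np.float64).copy()
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            a, b = v[i:i + h].copy(), v[i + h:i + 2 * h].copy()
            v[i:i + h], v[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return v

m = 8
rng = np.random.default_rng(0)
d = rng.choice([-1.0, 1.0], size=m)   # diagonal of D, uniform over {-1, +1}
x = rng.normal(size=m)

y = fwht(d * x) / np.sqrt(m)          # H(x) = (1/sqrt(m)) * H_m * D * x
x_back = d * (fwht(y) / np.sqrt(m))   # H^{-1}(y) = D * (1/sqrt(m)) * H_m * y
assert np.allclose(x, x_back)
```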
Let the server only do the aggregation calculation. For adapting to the modular integer arithmetic of the foregoing protocol, the client rounds the float-type model parameters, and adds noise sampled from a discrete Gaussian distribution. Notice that this step happens before the phase of masking and uploading. As demonstrated by Algorithm 1, each client scales and clips the local model xid to ensure ∥xid∥2≤c/γ. Then, each client performs the randomized Hadamard transform operation on xid, and proceeds to round xid, such that the resulting integer-type xid is sufficiently small. After that, the client perturbs the parameters with the noise
Last, each client undoes the rounding of the unmasked aggregate outcome Σxid, which belongs to Zp (see the unmasking phase in
and finally, obtains the float-type Σxid′.
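A non-normative sketch of this client-side pipeline (reusing fwht() from the preceding sketch) is given below. The scale factor, the noise parameter, the rounded continuous Gaussian used as a stand-in for a true discrete Gaussian sampler, and the sharing of the diagonal matrix D across clients as common randomness are all illustrative assumptions not fixed by the text.

```python
import numpy as np

p, m = 2**31 - 1, 8
c, gamma, scale = 1.0, 4.0, 2**16
rng = np.random.default_rng(1)
d = rng.choice([-1.0, 1.0], size=m)   # diagonal of D, assumed shared by all clients

def client_preprocess(x, sigma=3.0):
    x = x * min(1.0, (c / gamma) / np.linalg.norm(x))  # scale and clip: ||x||_2 <= c/gamma
    x = fwht(d * x) / np.sqrt(m)                       # randomized Hadamard transform
    z = np.rint(x * scale)                             # round to small integers
    z += np.rint(rng.normal(0.0, sigma, size=m))       # crude discrete-Gaussian stand-in
    return z.astype(np.int64) % p

def unround_aggregate(agg):
    v = np.where(agg > p // 2, agg - p, agg).astype(np.float64) / scale
    return d * (fwht(v) / np.sqrt(m))                  # undo rounding and the transform

xs = [rng.normal(size=m) for _ in range(3)]
agg = sum(client_preprocess(x) for x in xs) % p        # stands in for secure aggregation
approx = unround_aggregate(agg)                        # ~ sum of clipped inputs, plus noise
```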
Apart from protecting the privacy of the client inputs in the case of data revocation or non-revocation, the issue of enforcing the authenticity of the server-side computation is further addressed. Despite the existence of many verifiable aggregation mechanisms, they do not solve the problem comprehensively, since the case of proper deletion at the server is not considered. Specifically considered is a case where an honest counterpart proactively revokes its participation from the training process, and a surviving client is to be convinced that the revoked client's local models are properly removed from the global models, instead of being maliciously excluded by the untrusted server. Therefore, an enhanced protocol is proposed to provide authenticity guarantees for the server aggregation and, meanwhile, enable fast verification for resource-restrained clients.
In the enhanced protocol according to an embodiment of the present disclosure, homomorphic message authentication codes (HMACs) are introduced. Let K, M, I be the key space, the message space and the index space, respectively. There are three probabilistic polynomial-time (PPT) algorithms as follows:
Zero-Knowledge Proof: ZK enables a prover who holds a private witness wit to convince a verifier of a statement C(wit)=1 without revealing the witness, where C is the evaluating circuit known by both the prover and the verifier. ZK proof protocols (such as zero-knowledge succinct non-interactive arguments of knowledge [18]) now make it possible to support generic computation via the reduction to a generic NP-complete problem.
In the enhanced protocol according to an embodiment of the present disclosure, at a high-level view, the HMAC, represented as a tag tag, is adopted to verify the correctness of a sum Σcid that aggregates a specific set of cipher texts cid (id∈U). In the case of data revocation, assuming the revocation of the data of a client îd∈U, it is needed to attest the authenticity of an updated sum Σid∈U\{îd}cid by providing an updated tag (denoted t̂ag), as well as an additional proof. This proof is provided by the client îd. It is an ownership proof attesting that the client îd's cipher text was ever included in Σcid, and meanwhile, is not included in Σid∈U\{îd}cid. Then, the server can forward the proof to convince the other remaining clients that the claimed cipher text sum Σid∈U\{îd}cid is properly generated under the agreement of the client îd.
A tuple of four PPT algorithms is involved in the enhanced protocol according to an embodiment of the present disclosure. For simplicity, it is assumed in the following algorithms that |U| is larger than a pre-configured security threshold. In addition, let 'msg' indicate whether a client revokes its message or not in an aggregation round. The 'msg' can be a message 'revocation' or 'non-revocation'.
This algorithm is run at the client. By taking as input the public parameters, the secret key, the cipher text sum, the tag sum and possibly an ownership proof, it verifies the validity of the tag with respect to the cipher text, as well as the validity of the ownership proof. If they are valid, the algorithm outputs 1, representing "accept", and the aggregate result x; otherwise, it outputs 0, standing for "reject".
The enhanced protocol according to an embodiment of the present disclosure is provided below. The enhanced protocol runs the same steps as the previous protocol, except for proof generation in the aggregation phase, and verification of the computation correctness of the server in the phase of client unmasking.
One-Time Dealer Setup: There are the same algorithm parameters as those in S401, except for selecting s3∈Zp and distributing it to all participating clients via secure channels.
Masking and Uploading (Round 1): The phase implements the aforementioned MaskAndMAC algorithm. Each client executes the same computation as that in S402, except for additionally computing an HMAC-enabled tag with respect to each cipher text cid:
Define a one-degree polynomial yid(x)=cid+tagid·x∈Zp[x], such that yid(0)=cid and yid(s3)=PRG(gid2(0)∥2).
Aggregation: The phase implements the AggAndProve algorithm. In a normal case, the server aggregates the HMACs of the cipher texts by computing tag←Σid∈Utagid, in addition to aggregating the cipher texts c←Σid∈Ucid. In a revocation case, upon receiving the tag tagîd∈T of a revocation requester îd and a publicly verifiable ownership proof, the server computes ĉ←Σid∈U\{îd}cid, which can be verified against the proof. Herein, tagîd+t̂ag=tag and t̂ag=Σid∈U\{îd}tagid.
Consistency Check (Round 2)
Client Unmasking (Round 3): The phase implements the UnmaskAndVerify algorithm. Before unmasking the aggregate cipher text as in the previously described S405, each client verifies the correctness of the aggregate cipher text. In one case, it checks the validity of the tag tag against c, by checking whether Σid∈UPRG(gid2(0)∥2)=tag·s3+c mod p holds. In another case, it verifies the validity of t̂ag against ĉ by checking whether Σid∈U\{îd}PRG(gid2(0)∥2)=t̂ag·s3+ĉ mod p holds, and whether the ownership proof is verified. If the aggregate cipher text is correct, the client decrypts the cipher text c or ĉ.
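The tag mechanics above can be illustrated with the following sketch. The tag-generation formula tagid←(PRG(gid2(0)∥2)−cid)·s3−1 mod p is inferred from the relations yid(0)=cid and yid(s3)=PRG(gid2(0)∥2); the scalar cipher texts and the hash-based PRG stand-in are illustrative assumptions.

```python
import hashlib

p = 2**61 - 1

def prg2(seed):
    # Hypothetical stand-in for PRG(g_id^2(0) || 2): one field element per client.
    return int.from_bytes(hashlib.sha256(f"{seed}|2".encode()).digest(), "big") % p

def make_tag(c_id, g_seed, s3):
    # From y_id(0) = c_id and y_id(s3) = PRG(g_seed || 2):
    # tag_id = (PRG(g_seed || 2) - c_id) / s3  (mod p)
    return (prg2(g_seed) - c_id) * pow(s3, p - 2, p) % p

s3 = 987654321
clients = {1: (111, 5), 2: (222, 6), 3: (333, 7)}   # id -> (cipher c_id, seed g_id^2(0))

tags = {i: make_tag(c, g, s3) for i, (c, g) in clients.items()}
c_sum = sum(c for c, _ in clients.values()) % p
tag_sum = sum(tags.values()) % p
# Client-side check: sum of PRG outputs == tag * s3 + c (mod p)
assert sum(prg2(g) for _, g in clients.values()) % p == (tag_sum * s3 + c_sum) % p

# Revocation of client 3: server publishes updated sums; same check over U \ {3}.
c_hat = (c_sum - clients[3][0]) % p
tag_hat = (tag_sum - tags[3]) % p
assert sum(prg2(g) for i, (_, g) in clients.items() if i != 3) % p \
       == (tag_hat * s3 + c_hat) % p
```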
The secret s3 is additionally distributed for HMAC generation in the setup phase. Then, each client uses the secret to compute the corresponding HMAC-based tag for its cipher text. With all HMAC-based tags from the clients, the server aggregates the tags. After the aggregate tag and the aggregate cipher text are returned to the clients, the clients verify the validity of the aggregate cipher text. If the verification is successful, a sufficient number of clients collaborate to decrypt the aggregate cipher text.
In addition, a computing device is provided in an embodiment of the present disclosure. As shown in
The computing device may include one or more processors 501, and one processor is shown in
The memory 502 may be used to store software programs and modules, and the processor 501 executes various functional applications and performs data processing of the computing device by running the software programs and modules stored in the memory 502. The memory 502 may mainly include a program storage area and a data storage area. The program storage area may store an operating system, an application program required for at least one function, and the like. Additionally, the memory 502 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. The input device 503 may be used to receive input numerical or character information, and to generate signal input related to user settings and function control of the computing device.
Specifically, in this embodiment, the processor 501 loads the executable files corresponding to the processes of one or more application programs into the memory 502 according to the following instructions, and the processor 501 executes the application programs stored in the memory 502, thereby realizing various functions of the computing device.
In addition, a system for building blockchain-based secure aggregation in federated learning with data removal is provided according to an embodiment of the present disclosure. As shown in
It should be noted that, in this document, relational terms such as “first” and “second” etc. are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply there is such actual relationship or sequence between these entities or operations. Moreover, the terms “comprising”, “comprises” or any other variation thereof are intended to encompass a non-exclusive inclusion such that a process, method, article or device that includes a list of elements includes not only those elements, but also includes other elements not explicitly listed, or elements inherent to such a process, method, article or apparatus. Without further limitation, an element defined by the phrase “comprising a . . . ” does not preclude the presence of additional identical elements in a process, method, article or apparatus that includes the element.
The above embodiments are preferred embodiments of the invention, which are not intended to limit the scope of the present invention. Any changes, modifications, substitutions, combinations or simplifications made without departing from the spirit and principle of the invention shall fall within the scope of the invention.
This application claims the benefit of priority from U.S. Provisional Application No. 63/593,847, filed on Oct. 27, 2023. The content of the aforementioned application, including any intervening amendments thereto, is incorporated herein by reference in its entirety.