Unless otherwise indicated, the subject matter described in this section is not prior art to the claims of the present application and is not admitted as being prior art by inclusion in this section.
Computing systems that provide applications/services to clients (e.g., email systems, social media platforms, Software-as-a-Service (SaaS) systems, etc.) typically implement (1) a registration procedure that enables new clients to submit registration requests and thereby register themselves with the system, and (2) an authentication procedure that enables registered clients to submit authentication requests and thereby authenticate themselves (or in other words, login) to the system. Among other things, these registration and authentication procedures allow different clients to be associated with different privileges, preferences, persistent state, and so on with respect to the provided applications/services.
In scenarios where each registration and authentication request submitted by a client is processed independently by a single machine (i.e., node) of the computing system (referred to herein as “single-node registration and authentication”), the implementation of (1) and (2) is straightforward. For example, with respect to (1), a node can receive a registration request from a new client that includes an email address and password; transmit some secret information (e.g., a verification code) in an email to the submitted email address; receive the secret information back from the client, thereby verifying that the client is the owner of the email address; and create a registration entry for the client comprising the email address and password (or a hash of the password). With respect to (2), a node can receive an authentication request from a client that includes an email address and password; check whether the submitted password matches the password included in a registration entry keyed by the submitted email address; and if this check is successful (which indicates that the client has been successfully authenticated), grant the client access to the system.
Unfortunately, single-node registration and authentication is vulnerable to attacks by adversaries and is generally unable to guarantee correctness, security, or privacy for the registration and authentication procedures if just one node becomes corrupted. As used herein, the property of correctness means that honest (i.e., uncorrupted) clients can successfully complete registration and authentication, the property of security means that dishonest clients cannot login to the system using the passwords/credentials of other clients (or as non-registered clients), and the property of privacy means that the passwords/credentials of honest clients cannot be obtained by the adversary.
For example, assume there are three nodes S1, S2, and S3 that each implement single-node registration and authentication and an adversary A corrupts (i.e., takes control of) S1, while leaving S2 and S3 uncorrupted. In this scenario, because each node handles registration and authentication requests independently of the others, adversary A has free rein to censor clients (or in other words, prevent clients from registering and authenticating) through S1, impersonate clients (or in other words, login to the system using the passwords/credentials of other clients) through S1, and perform other problematic actions, such as initiating an offline brute-force attack to try to obtain the plaintexts of hashed client passwords.
In the following description, for purposes of explanation, numerous examples and details are set forth in order to provide an understanding of various embodiments. It will be evident, however, to one skilled in the art that certain embodiments can be practiced without some of these details or can be practiced with modifications or equivalents thereof.
The present disclosure is directed to techniques for implementing distributed registration and authentication, or in other words the collaborative processing of client registration and authentication requests by multiple nodes in a computing system. In one set of embodiments these techniques leverage threshold secret sharing, which is a cryptographic method for sharing a secret among N parties in a manner that requires at least T+1 of the N parties to cooperate in order to reconstruct/reveal the secret, where T is some threshold value less than N. An example of such a scheme is Shamir's secret sharing. In another set of embodiments these techniques also leverage additively homomorphic encryption, which is a type of encryption that enables users to perform additive computations on encrypted data without first decrypting that data.
With the techniques of the present disclosure, a group of N nodes can efficiently perform distributed registration and authentication in a correct, secure, and privacy-preserving fashion, even if up to T of the N nodes are corrupted by an adversary (subject to certain constraints on T and the nature of the network interconnecting the nodes and clients). These and other aspects are described in further detail below.
As noted in the Background section, according to one approach known as single-node registration and authentication, each node Si can handle client registration and authentication requests independently. For example, if node S1 receives a registration request from client C, S1 can process that registration request in its entirety without any interaction or communication with other nodes. Similarly, if node S2 receives an authentication request from client C after it has been registered via node S1, S2 can process that authentication request in its entirety without any interaction or communication with other nodes. However, this single-node approach quickly falls apart in the case where an adversary is able to take control of any individual node (i.e., Scorrupted) because the adversary can thereafter censor clients via Scorrupted (resulting in a loss of correctness), impersonate clients via Scorrupted (resulting in a loss of security), and mount an offline attack to invert hashed client passwords via Scorrupted (resulting in a potential loss of privacy).
To address the foregoing and other similar issues, embodiments of the present disclosure provide a distributed registration and authentication approach that enables nodes S1, . . . , SN of computing system 104 to process each client registration/authentication request in a collaborative manner. In various embodiments, this distributed approach relies on a threshold secret sharing scheme like Shamir's secret sharing that provides two functions: Share and Reconstruct. The Share function takes as input a secret s and outputs N cryptographic portions or “shares” of s (collectively known as a “sharing” of s and denoted as [s]). Each share si in [s] can be distributed to a party in a set of N parties, which allows the parties to carry out various operations on secret s using their respective shares (e.g., secure multiparty computations) while keeping s itself hidden from each party.
The Reconstruct function takes as input the shares created via the Share function and outputs the original secret s. Significantly, this Reconstruct function requires at least T+1 shares in order to reconstruct/reveal s, where T is some threshold number less than N. This guarantees that a coalition of up to T out of the N parties cannot learn secret s by colluding and disclosing their shares of s to each other; instead, a coalition of at least T+1 parties, and thus T+1 shares, is required. For this reason, [s] is sometimes referred to as a “T-out-of-N” sharing of s. In the case of Shamir's secret sharing, each share si corresponds to a point p(i) on a T-degree polynomial p, with point p(0) being set to s. It is well known that such a T-degree polynomial cannot be uniquely interpolated with fewer than T+1 points, and thus this construction hides the secret encoded at p(0) from any subset of at most T parties.
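By way of illustration only, the Share and Reconstruct functions of Shamir's secret sharing can be sketched as follows. This is a minimal sketch under stated assumptions rather than a definitive implementation: the field modulus P, the function names share and reconstruct, and the representation of each share as an (i, p(i)) pair are all hypothetical choices made for this example.

```python
import secrets

# Illustrative prime field modulus (the Mersenne prime 2^61 - 1); an assumption.
P = 2**61 - 1

def share(secret, n, t):
    """Return n shares of `secret`; any t+1 of them suffice to reconstruct."""
    # Random degree-t polynomial p with p(0) = secret.
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t)]

    def p(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % P
        return acc

    # Share i is the point (i, p(i)); x = 0 is never handed out.
    return [(i, p(i)) for i in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange-interpolate p(0) from a list of (i, p(i)) points."""
    secret = 0
    for i, y_i in shares:
        num, den = 1, 1
        for j, _ in shares:
            if j != i:
                num = (num * (-j)) % P
                den = (den * (i - j)) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        secret = (secret + y_i * num * pow(den, -1, P)) % P
    return secret

shares = share(1234, n=7, t=2)
recovered = reconstruct(shares[:3])  # any T+1 = 3 shares are enough
```

In this sketch, passing any three of the seven shares to reconstruct yields the original secret, whereas two shares alone are consistent with every possible value of p(0).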
In addition to the above, threshold secret sharing schemes are linear in nature, which means that given two sharings [x] and [y] where each party i holds xi and yi, the parties can locally obtain the sharing [z] where z=αx+βy+γ for some arbitrary public values α, β, γ.
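The linearity property can be illustrated with the following sketch, in which each party computes its share of [z] purely from its own shares of [x] and [y] and the public values. The modulus P and the helper names (share, reconstruct, linear_combine) are assumptions introduced for this example only.

```python
import secrets

P = 2**61 - 1  # illustrative prime field modulus (assumption)

def share(secret, n, t):
    """Shamir sharing: share i is (i, p(i)) for a random degree-t p with p(0)=secret."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t)]
    return [(i, sum(c * pow(i, k, P) for k, c in enumerate(coeffs)) % P)
            for i in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0."""
    secret = 0
    for i, y_i in shares:
        num, den = 1, 1
        for j, _ in shares:
            if j != i:
                num = (num * (-j)) % P
                den = (den * (i - j)) % P
        secret = (secret + y_i * num * pow(den, -1, P)) % P
    return secret

def linear_combine(x_shares, y_shares, alpha, beta, gamma):
    """Each party i locally computes z_i = alpha*x_i + beta*y_i + gamma."""
    return [(i, (alpha * x_i + beta * y_i + gamma) % P)
            for (i, x_i), (_, y_i) in zip(x_shares, y_shares)]

x_shares = share(10, n=7, t=2)
y_shares = share(20, n=7, t=2)
z_shares = linear_combine(x_shares, y_shares, alpha=3, beta=5, gamma=7)
z = reconstruct(z_shares[:3])  # 3*10 + 5*20 + 7
```

No messages are exchanged between the parties in linear_combine; adding the public constant γ to every share shifts the underlying polynomial, and hence p(0), by γ.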
With the foregoing explanation of threshold secret sharing in mind, the distributed registration procedure of the present disclosure can generally proceed as shown below. In certain embodiments, this implementation assumes threshold value T is less than N/3, where T is the maximum number of nodes out of S1, . . . , SN that may be corrupted by an adversary (such that the adversary is able to control the node's internal state and network communications); all network messages passed between client C and nodes S1, . . . , SN, as well as between the nodes themselves, are secured from tampering and eavesdropping via a mechanism such as Transport Layer Security (TLS); and network 108 is synchronous, which means that if a receiver does not receive an expected network message from a sender within a bounded period of time, the receiver proceeds with its configured processing using a default message containing some predefined value (e.g., zero or null).
Further, the distributed authentication procedure of the present disclosure can generally proceed as shown below. In certain embodiments, the same constraints/assumptions noted with respect to the distributed registration procedure also apply to this distributed authentication procedure.
With these distributed implementations, client registration and authentication can be achieved while advantageously guaranteeing the properties of correctness, security, and privacy. For example, regarding correctness, an honest client will always be able to register and login by virtue of the characteristics of the threshold secret sharing scheme, as long as there are at most T corrupted nodes. Accordingly, an adversary cannot deny service or censor specific clients.
Regarding security, at the time a given client attempts to authenticate/login by submitting a communication endpoint address and a sharing [R′] based on its authentication credential R, if that address was not previously registered with the shares of [R] by nodes S1, . . . , SN, the delta computed by the nodes at step (2) of the distributed authentication procedure will not equal zero (for example, it may be undefined). Thus, each uncorrupted node will output a result indicating that the client has not been successfully authenticated, thereby denying access to the client.
And regarding privacy, because the original random value R generated at the time of registration is unknown to the nodes, the probability that an adversary guesses R is 1/|F|, where F is the threshold secret sharing scheme's underlying field. With a large enough field, an adversary cannot feasibly guess R. Further, the distributed authentication procedure above makes it impossible to run an offline brute-force attack to guess R, as each guess (which requires the submission of an authentication request) involves processing/interaction by all nodes S1, . . . , SN. Yet further, in certain embodiments the foregoing distributed registration and authentication procedures can be enhanced such that, at the time of securely computing the deltas at steps (4) and (2) respectively, nodes S1, . . . , SN can multiply each delta by a fresh random value Q. This enhancement, which is described in greater detail in sections (3) and (4) below, ensures that the nodes cannot learn anything regarding original random value R in the case where the client sends a sharing of a value R′ that is different from R.
Starting with step 202, client C can send a registration request to nodes S1, . . . , SN that includes a communication endpoint address ADDR owned by C (or alternatively, owned by a user/individual operating C). This communication endpoint address can be, e.g., an email address, a telephone number, an ID/username associated with a messaging application, or the like.
At step 204, nodes S1, . . . , SN can receive the registration request and generate/establish a T-out-of-N sharing of a secret random value R (denoted as [R]) via a threshold secret sharing scheme, such that (1) each node Si holds a share Ri of [R], and (2) no node knows the value of R. For example, in one set of embodiments nodes S1, . . . , SN can receive their respective shares R1, . . . , RN from a trusted third-party that invokes the Share function of the threshold secret sharing scheme on R. In another set of embodiments, two nodes (e.g., S1 and S2) can create sharings of randomly selected values Y and Z where Y is only known to S1, Z is only known to S2, and random value R=Y+Z. Nodes S1 and S2 can then distribute the shares of [Y] and [Z] to the other nodes and each node can compute its share of [R] as the sum of its shares of [Y] and [Z]. With this method, no single node will know both Y and Z and thus cannot learn R.
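The second approach to establishing [R] (in which R=Y+Z) can be sketched as follows. This is an illustrative simulation in a single process, not a networked implementation; the modulus P, the parameters N and T, and the helper names are assumptions made for the example.

```python
import secrets

P = 2**61 - 1   # illustrative prime field modulus (assumption)
N, T = 7, 2     # example parameters with T < N/3

def share(secret, n=N, t=T):
    """Shamir sharing: share i is (i, p(i)) for a random degree-t p with p(0)=secret."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t)]
    return [(i, sum(c * pow(i, k, P) for k, c in enumerate(coeffs)) % P)
            for i in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0."""
    secret = 0
    for i, y_i in shares:
        num, den = 1, 1
        for j, _ in shares:
            if j != i:
                num = (num * (-j)) % P
                den = (den * (i - j)) % P
        secret = (secret + y_i * num * pow(den, -1, P)) % P
    return secret

Y = secrets.randbelow(P)   # chosen by S1 and known only to S1
Z = secrets.randbelow(P)   # chosen by S2 and known only to S2
y_shares = share(Y)        # S1 distributes one share of [Y] to every node
z_shares = share(Z)        # S2 distributes one share of [Z] to every node

# Each node Si adds its two shares locally to obtain its share Ri of [R].
r_shares = [(i, (y_i + z_i) % P)
            for (i, y_i), (_, z_i) in zip(y_shares, z_shares)]
```

In this sketch the value R is never computed in the clear by any node; it exists only as the sharing r_shares, which reconstructs to (Y+Z) mod P when at least T+1 shares are combined.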
Upon generating/establishing [R], each node Si can send its respective share Ri to client C via C's communication endpoint address ADDR (step 206).
At step 208, client C can receive the shares of [R] at ADDR and can reconstruct R (via the threshold secret sharing scheme's Reconstruct function) using the received shares. Client C can then generate a new T-out-of-N sharing of the reconstructed value of R (denoted as [R′]) comprising shares R′1, . . . , R′N (step 210) and can send share R′i to each node Si (step 212).
At step 214, nodes S1, . . . , SN can receive shares R′1, . . . , R′N and can securely compute, via a secure multiparty computation (MPC) protocol, [Δ]=[Q]·([R]−[R′]), where [Q] is a new sharing of a non-zero secret random value Q that is generated by the nodes for this specific registration request. As mentioned previously, the use of Q in this computation prevents the nodes from learning anything regarding R in the scenario where R′ is different from R and thereby bolsters the privacy of the solution. In alternative embodiments where this additional degree of privacy preservation is not needed, the nodes can simply compute [Δ]=[R]−[R′].
Nodes S1, . . . , SN can thereafter collectively reconstruct/reveal delta value Δ using their shares of [Δ] (step 216) and each node Si can check whether Δ=0 (step 218). If the answer is yes (which means R=R′), node Si can conclude that client C correctly re-shared R at step 208 and thus is the owner of communication endpoint address ADDR. Accordingly, node Si can create and store a local registration entry Ei for client C that includes ADDR and the node's share Ri of [R] (step 220).
On the other hand, if the answer at step 218 is no, node Si can conclude that client C is not the owner of the communication endpoint address (or otherwise re-shared a value R′ that is different from R in violation of the registration procedure). In this case, node Si can take no action for registering client C (step 222).
Finally, at step 224, client C can store R as its authentication credential for logging into system 104 and workflow 200 can end.
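The arithmetic behind the check [Δ]=[Q]·([R]−[R′]) of steps 214-218 can be sketched as follows. This sketch is deliberately simplified and is not a secure MPC protocol: a single process plays all roles, and the product is formed by having each node multiply its shares locally, which yields a degree-2T sharing of Δ that can be opened when at least 2T+1 shares are available (which holds when T is less than N/3). A production MPC protocol would typically add degree reduction, re-randomization, and protection against malicious shares. The modulus P, the parameters, and all function names are assumptions.

```python
import secrets

P = 2**61 - 1   # illustrative prime field modulus (assumption)
N, T = 7, 2     # T < N/3, so 2T+1 <= N

def share(secret, n=N, t=T):
    """Shamir sharing: share i is (i, p(i)) for a random degree-t p with p(0)=secret."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t)]
    return [(i, sum(c * pow(i, k, P) for k, c in enumerate(coeffs)) % P)
            for i in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0."""
    secret = 0
    for i, y_i in shares:
        num, den = 1, 1
        for j, _ in shares:
            if j != i:
                num = (num * (-j)) % P
                den = (den * (i - j)) % P
        secret = (secret + y_i * num * pow(den, -1, P)) % P
    return secret

def registration_check(R, R_prime):
    """Return True if the nodes would create registration entries (delta == 0)."""
    r_shares = share(R)            # shares Ri held by the nodes (step 204)
    rp_shares = share(R_prime)     # shares R'i sent by the client (steps 210-212)
    # Fresh non-zero mask Q; in a real deployment no single party would know Q.
    q_shares = share(1 + secrets.randbelow(P - 1))
    # Node i computes Q_i * (R_i - R'_i) locally: a degree-2T sharing of delta.
    delta_shares = [(i, q * (r - rp) % P)
                    for (i, r), (_, rp), (_, q)
                    in zip(r_shares, rp_shares, q_shares)]
    delta = reconstruct(delta_shares)   # uses all N >= 2T+1 shares
    return delta == 0

honest = registration_check(42, 42)   # client re-shared the correct R
cheat = registration_check(42, 43)    # client re-shared a different value
```

Because Q is non-zero and the arithmetic is over a field, Δ equals zero exactly when R=R′; when R≠R′, the opened value Q·(R−R′) is masked by Q rather than revealing R−R′ directly.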
Starting with steps 302 and 304, client C can generate a new T-out-of-N sharing of R (denoted as [R′]) comprising shares R′1, . . . , R′N and can send an authentication request to each node Si that includes ADDR and share R′i. In various embodiments, sharing [R′] is not the same as sharing [R′] described in distributed registration workflow 200; instead, [R′] here is a completely new sharing of R that is created by client C for this specific authentication request.
At step 306, each node Si can receive the authentication request, match the communication endpoint address ADDR included in the authentication request to the address identified in its local registration entry Ei, and retrieve share Ri from Ei. Nodes S1, . . . , SN can then securely compute, via an MPC protocol, [Δ]=[Q]·([R]−[R′]) using the shares of [R′] included in the received authentication requests and the shares of [R] retrieved from the local registration entries (step 308). As discussed with respect to workflow 200, [Q] is a new sharing of a non-zero secret random value Q that is generated by the nodes for this specific authentication request and is used to obfuscate the true value of [R]−[R′] in cases where R≠R′, thereby preventing the nodes from learning anything regarding R.
At steps 310 and 312, nodes S1, . . . , SN can collectively reconstruct/reveal delta value Δ using their shares of [Δ] and each node Si can check whether Δ=0. If the answer is yes (which means R=R′), node Si can conclude that client C has been successfully authenticated and can output a result bi indicating authentication success (step 314).
On the other hand, if the answer at step 312 is no, node Si can conclude that client C has not been successfully authenticated. In this case, node Si can output a result bi indicating authentication failure (step 316).
Finally, at step 318, system 104 can grant client C access if at least N−T nodes have output a result indicating authentication success per step 314 and workflow 300 can end.
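Workflow 300, including the N−T acceptance rule of step 318, can be sketched as follows. As with any single-process simulation, this is illustrative only: the local share multiplication stands in for a real MPC protocol, and the per-node dictionary layout of registration entries, the lying_nodes parameter, the example address, and all function names are hypothetical.

```python
import secrets

P = 2**61 - 1   # illustrative prime field modulus (assumption)
N, T = 7, 2     # T < N/3

def share(secret, n=N, t=T):
    """Shamir sharing: share i is (i, p(i)) for a random degree-t p with p(0)=secret."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t)]
    return [(i, sum(c * pow(i, k, P) for k, c in enumerate(coeffs)) % P)
            for i in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0."""
    secret = 0
    for i, y_i in shares:
        num, den = 1, 1
        for j, _ in shares:
            if j != i:
                num = (num * (-j)) % P
                den = (den * (i - j)) % P
        secret = (secret + y_i * num * pow(den, -1, P)) % P
    return secret

def authenticate(entries, addr, client_shares, lying_nodes=()):
    """entries[i] maps ADDR -> share R_(i+1) stored by node S_(i+1) (assumed layout)."""
    stored = [e.get(addr) for e in entries]            # step 306
    if any(s is None for s in stored):
        return False                                   # address never registered
    q_shares = share(1 + secrets.randbelow(P - 1))     # fresh non-zero mask Q
    delta_shares = [(i, q * (r - rp) % P)              # step 308 (simplified)
                    for r, (i, rp), (_, q)
                    in zip(stored, client_shares, q_shares)]
    delta = reconstruct(delta_shares)                  # step 310
    results = [delta == 0] * N                         # steps 312-316: each b_i
    for i in lying_nodes:                              # corrupted nodes report failure
        results[i] = False
    return sum(results) >= N - T                       # step 318

R = secrets.randbelow(P)
entries = [{"c@example.com": r_i} for _, r_i in share(R)]   # after registration
ok = authenticate(entries, "c@example.com", share(R))
```

In this sketch, an honest client is still granted access when up to T nodes falsely report failure, since the remaining N−T honest nodes all output success.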
One drawback with the distributed registration workflow shown in
To address this issue,
Generally speaking, an AHE scheme guarantees that, for any m and every m-ary linear function ƒ, Dec(sk, Ev(pk, ƒ, ct1, . . . , ctm))=ƒ(Dec(sk, ct1), . . . , Dec(sk, ctm)) where (sk, pk)←KeyGen(λ) for some λ and cti is either the result of Enc(pk, ·) or Ev(pk, ƒ, ·). In other words, it doesn't matter whether you (1) first evaluate ƒ on the ciphertexts ct1, . . . , ctm using the Ev function (resulting in ctf) and then decrypt ctf or (2) first decrypt each ciphertext cti separately using the Dec function (resulting in plaintexts pt1, . . . , ptm) and then evaluate ƒ on pt1, . . . , ptm; the results of (1) and (2) are the same.
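The AHE property above can be illustrated with a toy instance of the Paillier cryptosystem, which is one well-known additively homomorphic scheme; the present disclosure does not mandate any particular scheme, so its use here is an assumption. The fixed primes, the function names keygen, enc, dec, and ev, and the representation of a linear function ƒ as a coefficient list plus a constant are likewise hypothetical and chosen for illustration only. The key size is far too small for real use.

```python
import math
import secrets

def keygen():
    """Toy Paillier key generation with fixed Mersenne primes (insecure; demo only)."""
    p, q = 2**89 - 1, 2**107 - 1
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                               # valid because g = n + 1
    return (lam, mu, n), n                             # (sk, pk)

def enc(pk, m):
    """Enc(pk, m) = (1 + n)^m * r^n mod n^2 for random r."""
    n2 = pk * pk
    r = secrets.randbelow(pk - 1) + 1
    return ((1 + m * pk) * pow(r, pk, n2)) % n2

def dec(sk, ct):
    """Dec(sk, ct) = L(ct^lam mod n^2) * mu mod n, with L(u) = (u - 1) / n."""
    lam, mu, n = sk
    u = pow(ct, lam, n * n)
    return ((u - 1) // n * mu) % n

def ev(pk, coeffs, const, cts):
    """Evaluate f(m_1..m_k) = sum(a_i * m_i) + const directly on ciphertexts."""
    n2 = pk * pk
    acc = (1 + const * pk) % n2          # deterministic encryption of const
    for a, ct in zip(coeffs, cts):
        acc = (acc * pow(ct, a, n2)) % n2  # ct^a encrypts a * m
    return acc

sk, pk = keygen()
pts = [3, 5, 11]
cts = [enc(pk, m) for m in pts]
coeffs, const = [2, 7, 1], 4

path1 = dec(sk, ev(pk, coeffs, const, cts))                           # evaluate, then decrypt
path2 = sum(a * dec(sk, ct) for a, ct in zip(coeffs, cts)) + const    # decrypt, then evaluate
```

Both paths yield 2·3+7·5+1·11+4=56, which is the equality Dec(sk, Ev(pk, ƒ, ct1, . . . , ctm))=ƒ(Dec(sk, ct1), . . . , Dec(sk, ctm)) for this particular ƒ.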
This means that instead of having each node Si send its share Ri to client C, each node Si can encrypt Ri using an AHE scheme and send the resulting ciphertext cti to a designated node (e.g., S1), which maintains the secrecy of the shares from S1. Designated node S1 can then compute ctf←Ev(pk, ƒ, ct1, . . . , ctN) (where ƒ is the Reconstruct function of the threshold secret sharing scheme used to generate [R]) and can send ctf to client C. Client C can thereafter reconstruct R by computing Dec(sk, ctf) because per the properties of the AHE scheme, Dec(sk, ctf) is equivalent to Reconstruct(Dec(sk, ct1), . . . , Dec(sk, ctN)). Accordingly, with this approach, only a single message needs to be received by client C in order to reconstruct R (i.e., the message comprising ciphertext ctf sent from designated node S1), thereby significantly streamlining the registration process.
Turning now to workflow 400, client C can compute a secret key/public key pair (sk, pk) by invoking the KeyGen function of an AHE scheme (step 402) and can send a registration request to nodes S1, . . . , SN that includes a communication endpoint address ADDR owned by C (or alternatively, owned by a user/individual operating C) and the public key pk (step 404). This communication endpoint address can be, e.g., an email address, a telephone number, an ID/username associated with a messaging application, or the like.
At step 406, nodes S1, . . . , SN can receive the registration request and generate/establish a T-out-of-N sharing of a secret random value R (denoted as [R]) via a threshold secret sharing scheme, such that (1) each node Si holds a share Ri of [R], and (2) no node knows the value of R. As mentioned previously, in one set of embodiments nodes S1, . . . , SN can receive their respective shares R1, . . . , RN from a trusted third-party that invokes the Share function of the threshold secret sharing scheme on R. In another set of embodiments, two nodes (e.g., S1 and S2) can create sharings of randomly selected values Y and Z where Y is only known to S1, Z is only known to S2, and random value R=Y+Z. Nodes S1 and S2 can then distribute the shares of [Y] and [Z] to the other nodes and each node can compute its share of [R] as the sum of its shares of [Y] and [Z]. With this method, no single node will know both Y and Z and thus cannot learn R.
At step 408, each node Si can compute ciphertext cti by invoking the Enc function of the AHE scheme on its share Ri using public key pk (i.e., Enc(pk, Ri)) and can send cti to a designated node (e.g., S1). In response, designated node S1 can compute ciphertext ctf by invoking the Ev function of the AHE scheme on the Reconstruct function of the threshold sharing scheme, ciphertexts ct1, . . . , ctN, and public key pk (i.e., Ev(pk, Reconstruct, ct1, . . . , ctN)) (step 410). Designated node S1 can then send a single message comprising ctf to client C via C's communication endpoint address ADDR (step 412).
At step 414, client C can receive ciphertext ctf at ADDR and can compute R by invoking the Dec function of the AHE scheme on ctf using its secret key sk (i.e., Dec(sk, ctf)). Client C can subsequently generate a new T-out-of-N sharing of the reconstructed value of R (denoted as [R′]) comprising shares R′1, . . . , R′N (step 416) and can send share R′i to each node Si (step 418).
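Steps 402 through 414 can be sketched end to end as follows, again using toy Paillier as an assumed AHE scheme and Shamir's secret sharing over an assumed prime field of size F. One detail of this particular pairing is worth flagging: the Paillier plaintext space (integers modulo n) differs from the sharing field, so in this sketch the client reduces the decrypted value modulo F to obtain R, and the toy modulus n is chosen large enough that the integer sum of coefficient-weighted shares never wraps. All names and parameters are hypothetical.

```python
import math
import secrets

F = 2**61 - 1   # Shamir field modulus (assumption)
N, T = 7, 2

def share(secret, n=N, t=T):
    """Shamir sharing: share i is (i, p(i)) for a random degree-t p with p(0)=secret."""
    coeffs = [secret % F] + [secrets.randbelow(F) for _ in range(t)]
    return [(i, sum(c * pow(i, k, F) for k, c in enumerate(coeffs)) % F)
            for i in range(1, n + 1)]

def lagrange_at_zero(xs):
    """Public coefficients l_i with Reconstruct(shares) = sum(l_i * share_i) mod F."""
    out = []
    for i in xs:
        num, den = 1, 1
        for j in xs:
            if j != i:
                num = num * (-j) % F
                den = den * (i - j) % F
        out.append(num * pow(den, -1, F) % F)
    return out

def keygen():
    """Toy Paillier keys from fixed Mersenne primes (insecure; demo only)."""
    p, q = 2**89 - 1, 2**107 - 1
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    return (lam, pow(lam, -1, n), n), n

def enc(pk, m):
    n2 = pk * pk
    r = secrets.randbelow(pk - 1) + 1
    return ((1 + m * pk) * pow(r, pk, n2)) % n2

def dec(sk, ct):
    lam, mu, n = sk
    return ((pow(ct, lam, n * n) - 1) // n * mu) % n

sk, pk = keygen()                             # step 402: client C
R = secrets.randbelow(F)                      # stands in for the jointly generated R
shares = share(R)                             # step 406: node Si holds Ri
cts = [enc(pk, r_i) for _, r_i in shares]     # step 408: each node encrypts Ri

# Step 410: designated node S1 evaluates Reconstruct on the ciphertexts.
n2 = pk * pk
ct_f = 1                                      # trivial encryption of 0
for l_i, ct in zip(lagrange_at_zero([i for i, _ in shares]), cts):
    ct_f = ct_f * pow(ct, l_i, n2) % n2

recovered = dec(sk, ct_f) % F                 # step 414: client C decrypts
```

In this sketch the designated node handles only ciphertexts and public Lagrange coefficients, and client C obtains R from the single ciphertext ct_f rather than from N separate share messages.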
At step 420, nodes S1, . . . , SN can receive shares R′1, . . . , R′N and can securely compute, via a secure multiparty computation (MPC) protocol, [Δ]=[Q]·([R]−[R′]), where [Q] is a new sharing of a non-zero secret random value Q that is generated by the nodes for this specific registration request. As mentioned previously, the use of Q in this computation prevents the nodes from learning anything regarding R in the scenario where R′ is different from R and thereby bolsters the privacy of the solution. In alternative embodiments where this additional degree of privacy preservation is not needed, the nodes can simply compute [Δ]=[R]−[R′].
Nodes S1, . . . , SN can then collectively reconstruct/reveal delta value Δ using their shares of [Δ] (step 422) and each node Si can check whether Δ=0 (step 424). If the answer is yes (which means R=R′), node Si can conclude that client C correctly re-shared R at step 418 and thus is the owner of communication endpoint address ADDR. Accordingly, node Si can create and store a local registration entry Ei for client C that includes ADDR and the node's share Ri of [R] (step 426).
On the other hand, if the answer at step 424 is no, node Si can conclude that client C is not the owner of the communication endpoint address (or otherwise re-shared a value R′ that is different from R in violation of the registration procedure). In this case, node Si can take no action for registering client C (step 428).
Finally, at step 430, client C can store R as its authentication credential for logging into system 104 and workflow 400 can end.
In various embodiments, workflow 400 assumes that designated node S1 is semi-honest (i.e., will not actively attempt to sabotage the registration process) and thus will not send an incorrect value for ciphertext ctf to client C at step 412. In addition, workflow 400 assumes that the Reconstruct function of the threshold secret sharing scheme is a linear function. This latter assumption will be true if the threshold secret sharing scheme is, e.g., Shamir's secret sharing.
Certain embodiments described herein can employ various computer-implemented operations involving data stored in computer systems. For example, these operations can require physical manipulation of physical quantities—usually, though not necessarily, these quantities take the form of electrical or magnetic signals, where they (or representations of them) are capable of being stored, transferred, combined, compared, or otherwise manipulated. Such manipulations are often referred to in terms such as producing, identifying, determining, comparing, etc. Any operations described herein that form part of one or more embodiments can be useful machine operations.
Further, one or more embodiments can relate to a device or an apparatus for performing the foregoing operations. The apparatus can be specially constructed for specific required purposes, or it can be a generic computer system comprising one or more general purpose processors (e.g., Intel or AMD x86 processors) selectively activated or configured by program code stored in the computer system. In particular, various generic computer systems may be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations. The various embodiments described herein can be practiced with other computer system configurations including handheld devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like.
Yet further, one or more embodiments can be implemented as one or more computer programs or as one or more computer program modules embodied in one or more non-transitory computer readable storage media. The term non-transitory computer readable storage medium refers to any storage device, based on any existing or subsequently developed technology, that can store data and/or computer programs in a non-transitory state for access by a computer system. Examples of non-transitory computer readable media include a hard drive, network attached storage (NAS), read-only memory, random-access memory, flash-based nonvolatile memory (e.g., a flash memory card or a solid state disk), persistent memory, NVMe device, a CD (Compact Disc) (e.g., CD-ROM, CD-R, CD-RW, etc.), a DVD (Digital Versatile Disc), a magnetic tape, and other optical and non-optical data storage devices. The non-transitory computer readable media can also be distributed over a network coupled computer system so that the computer readable code is stored and executed in a distributed fashion.
Finally, boundaries between various components, operations, and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the invention(s). In general, structures and functionality presented as separate components in exemplary configurations can be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component can be implemented as separate components.
As used in the description herein and throughout the claims that follow, “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise.
The above description illustrates various embodiments along with examples of how aspects of particular embodiments may be implemented. These examples and embodiments should not be deemed to be the only embodiments and are presented to illustrate the flexibility and advantages of particular embodiments as defined by the following claims. Other arrangements, embodiments, implementations, and equivalents can be employed without departing from the scope hereof as defined by the claims.
The present application is related to commonly owned U.S. patent application Ser. No. 17/543,513 filed Dec. 6, 2021 and entitled “Distributed Registration and Authentication via Threshold Secret Sharing.” The entire content of this related application is incorporated herein by reference for all purposes.