The described technology provides key rotation verification without decryption. Two ciphertext inputs encrypted from a plaintext input by an encryption function using different cryptographic keys are input, wherein the encryption function is selected from a function family having an output space of one or more convex sets. A divergence between the two ciphertext inputs is computed. A membership oracle is executed on the two ciphertext inputs, wherein the two ciphertext inputs are determined to be members of the same convex set of the one or more convex sets if the computed divergence satisfies a separation condition. It is validated that the two ciphertext inputs both correspond to the same plaintext input, responsive to determining that the two ciphertext inputs are members of the same convex set of the one or more convex sets, wherein the two ciphertext inputs do not correspond to the same plaintext input if the two ciphertext inputs are not members of the same convex set of the one or more convex sets.
This summary is provided to introduce a selection of concepts in a simplified form that is further described below in the Detailed Description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Other implementations are also described and recited herein.
In data storage, data records are typically secured by cryptographic methods, such as encrypting the data records using one or more cryptographic keys. However, such keys are subject to attack by adversaries, who have an increasing chance of discovering the keys over time. As such, key rotation may be used to revoke old keys and replace them with new keys, thereby substantially resetting the adversary's efforts to obtain access to the encrypted data records. Alternatively, key rotation may be employed to enforce data access revocation or expiration. Generally, key rotation refers to the process of (periodically) exchanging the cryptographic keys that are used to secure the data. For example, the old key is used to decrypt the data records, and then the new key is used to encrypt the data records. In this manner, compromised keys or unauthorized data access can be revoked in favor of secure keys and data access.
In practical applications, an administrator or user may wish to verify that the key rotation was successful. One such verification method would involve decrypting the data record using the new key and then comparing the decrypted plaintext to a known version of the plaintext. Unfortunately, this procedure introduces several security risks and undesirable overhead. For example, the verifier would need access to the known version of the plaintext, which circumvents the security imposed by the encryption in the first place. Furthermore, decrypting the encrypted data record using the new key once again risks exposing the plaintext to unwanted parties. Moreover, verifying key rotation using decryption is a computationally intensive process, particularly on large databases. As such, more secure and less resource-intensive methods of verifying key rotation would be beneficial.
In one implementation, key rotation is performed by replacing a first ciphertext input, which is an encryption of the plaintext input based on a first cryptographic key, with a second ciphertext input, which is an encryption of the plaintext input based on a second (different) cryptographic key. As such, the ciphertext inputs correspond to the same plaintext 116 if the key rotation was successful, and the described technology validates this without decrypting the ciphertext inputs themselves or requiring a comparison to a known version of the plaintext.
At some subsequent point in time, an administrator or user (collectively referred to as a “user”) wishes to verify that the key rotation was successful and that the new keys 110 were indeed used to encrypt the encrypted data records 102 and may be used to decrypt the encrypted data records 102. However, since the encrypted data is likely stored on third-party server(s), the user also wishes to avoid the security risks and resource utilization associated with actually decrypting the encrypted data records 102 using the new keys and then comparing the decrypted data records with a known version of the plaintext (which the user may or may not even possess). Moreover, doing so would be grossly inefficient. Accordingly, the verification operation 112 validates the successful key rotation without decrypting the encrypted data records 102.
The described technology will be disclosed herein with both formal notation and proofs, as well as a narrative technical description. As an introductory matter, the concept of “learning with errors” or LWE is described. The learning with errors (LWE) problem has emerged as a popular hard problem for constructing lattice-based/post-quantum cryptographic solutions. Many cryptosystems rely on the hardness assumption of the LWE problem, including without limitation identity-based, leakage-resilient, fully homomorphic, functional, public-key/key-encapsulation encryptions, oblivious transfer, (blind) signatures, PRFs (pseudorandom functions), secret sharing, hash functions, secure matrix multiplication computation, verifiable quantumness, non-interactive zero-knowledge proof system for an NP language (e.g., NP is a class of languages where, given x and a proof y, one can deterministically check, in time polynomial in the size of x and y, as to whether y really does prove that x is in the NP language. In addition, the size of y is bounded by some polynomial in the size of x), certifiable randomness generation, obfuscation, and quantum homomorphic encryption.
Definition 1 (Decision-LWE) For positive integers n and q≥2, and an error (probability) distribution χ=χ(n) over ℤ_q, the decision-LWE_{n,q,χ} problem is to distinguish between the following pairs of distributions:
Definition 2 (Search-LWE [05]) For positive integers n and q≥2, and an error (probability) distribution χ=χ(n) over ℤ_q, the search-LWE_{n,q,χ} problem is to recover s ∈ ℤ_q^n, given (A, As+e), where
For certain noise distributions and a sufficiently large q, the LWE problem is as hard as the worst-case SIVP (shortest independent vectors problem) and GapSVP (Gap shortest vector problem) under a quantum reduction. The fixed vector s can be sampled from a low norm distribution (in particular, from the noise distribution χ), and the resulting problem is as hard as the original LWE problem. The noise distribution χ can also be a simple low-norm distribution.
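As an illustration of the pair (A, As+e) used in the search-LWE definition, the following Python sketch generates a toy instance and recovers the error from a known secret. It is a minimal sketch under stated assumptions: the number of samples m, the bounded uniform error (a stand-in for χ), the parameter values, and all function names are hypothetical and are not taken from the described technology. The parameters are far too small to be secure.

```python
import random


def lwe_instance(n, m, q, noise_bound, rng):
    """Return (A, b, s) with b = A*s + e (mod q).

    Assumptions for illustration only: A is an m x n matrix uniform over Z_q,
    s is uniform over Z_q^n, and each error coordinate is uniform in
    [-noise_bound, noise_bound] as a toy stand-in for the distribution chi.
    """
    s = [rng.randrange(q) for _ in range(n)]
    A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
    e = [rng.randint(-noise_bound, noise_bound) for _ in range(m)]
    b = [(sum(a * t for a, t in zip(row, s)) + e_i) % q
         for row, e_i in zip(A, e)]
    return A, b, s


def centered(x, q):
    """Map x in Z_q to its representative closest to zero."""
    x %= q
    return x - q if x > q // 2 else x


def residual(A, b, s, q):
    """Compute b - A*s (mod q), centered around zero.

    With the correct secret s this is exactly the small error vector e.
    """
    return [centered(b_i - sum(a * t for a, t in zip(row, s)), q)
            for row, b_i in zip(A, b)]


rng = random.Random(0)
A, b, s = lwe_instance(n=8, m=16, q=3329, noise_bound=2, rng=rng)
errors = residual(A, b, s, q=3329)
```

With the correct s, every entry of `errors` lies within the noise bound; without s, the search problem is to find it from (A, b) alone.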
Optimization is a fundamental problem in mathematics and computer science, with many real-world applications. One of the most successful continuous optimization paradigms is convex optimization, which optimizes a convex function over a convex set that is given explicitly (by a set of constraints) or implicitly (by an oracle). A convex optimization problem is an optimization problem in which the objective function is a convex function, and the feasible set is a convex set. A function ƒ mapping some subset of ℝ^n into ℝ ∪ {±∞} is convex if its domain is convex and for all θ ∈ [0,1] and all x,y in its domain, the following condition holds:
ƒ(θx+(1−θ)y)≤θƒ(x)+(1−θ)ƒ(y).
A set S is convex if for all members x,y ∈ S and all θ ∈ [0,1], we have that θx+(1−θ)y ∈ S. Concretely, a convex optimization problem is the problem of finding some x* ∈ C attaining inf{ƒ(x): x ∈ C}, where the objective function ƒ: D ⊆ ℝ^n → ℝ is convex, as is the feasible set C. If such a point exists, it is referred to as an optimal point or solution, and the set of all optimal points is called the optimal set. If ƒ is unbounded below over C or the infimum is not attained, then the optimization problem is said to be unbounded. Otherwise, if C is the empty set, then the problem is said to be infeasible.
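The convexity condition above can be spot-checked numerically. The following Python sketch samples random x, y, and θ and tests the inequality ƒ(θx+(1−θ)y) ≤ θƒ(x)+(1−θ)ƒ(y) for a one-dimensional function. The function name, the sampling ranges, and the tolerance are assumptions made for illustration. A sampled check can refute convexity by finding a violation, but it cannot prove convexity.

```python
import random


def is_convex_on_samples(f, sample, trials=1000, tol=1e-9, seed=0):
    """Return False if any sampled triple (x, y, theta) violates convexity.

    f      -- real-valued function of one real variable
    sample -- callable taking a random.Random and returning a point in the domain
    tol    -- slack that absorbs floating-point rounding
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, theta = sample(rng), sample(rng), rng.random()
        lhs = f(theta * x + (1 - theta) * y)
        rhs = theta * f(x) + (1 - theta) * f(y)
        if lhs > rhs + tol:
            return False  # found a counterexample to the inequality
    return True


def draw(rng):
    """Sample a point from the interval [-10, 10]."""
    return rng.uniform(-10.0, 10.0)


square_ok = is_convex_on_samples(lambda x: x * x, draw)
negated_square_ok = is_convex_on_samples(lambda x: -x * x, draw)
```

Here `square_ok` is True because x² is convex, while `negated_square_ok` is False because −x² is concave and the sampler finds a violating triple.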
An example environment in which the described technology may be applied is provided below, describing a blockchain environment implementing a cryptocurrency and digital payment system intended to be a blockchain-based cooperative digital storage and data retrieval platform. In this platform, the concept of “space-time” is used to allow the metering of the data stored in the network with an expiry time. The platform aims to provide the functionality of recycling and re-allocating the free storage on participating nodes. The platform can be seen as a blockchain with a marketplace based on the platform's cryptocurrency for selling and buying extra storage capacity. Primarily, there are five actors in this ecosystem:
In an application of the described technology, a user wants the retrieval miners to perform a secure key rotation/update on his or her ciphertext. The user would also like to verify the integrity of the underlying plaintext (e.g., ensure that the retrieval miner performed the key rotation operation correctly), but without first decrypting the ciphertext and comparing the plaintext against a known version of the plaintext. Clearly, there are the following two serious drawbacks to such decryption and comparison:
In contrast, the described technology addresses the problem of verifying the integrity and correctness of ciphertext generated via key update(s), such that the verification procedure addresses one or more of the following requirements:
In this context, the described technology provides key rotation verification without decrypting the ciphertext to be verified. Let ℱ: 𝒦 × 𝒫 → X be a function family with a convex range X ⊂ ℝ ∪ {±∞}. Each member function f(k, p) ∈ ℱ is indexed by the parameters k ∈ ℤ_q^n and p ∈ ℤ_p^m. The challenge in designing such a function family for the example use case is that the function family must be non-invertible unless at least one of the parameters is possessed along with the function output. Multiple quantum-safe function classes exist that can satisfy the requirements necessary for ℱ. As proof, such a function family is constructed by altering a lattice-based key-homomorphic pseudorandom function (PRF) family. The full construction is presented in the following text.
Let l = ⌈log q⌉. Define a gadget vector as:
g = (1, 2, 4, . . . , 2^{l−1}) ∈ ℤ_q^l.
Define a deterministic decomposition function g^{−1}: ℤ_q → {0,1}^l, such that g^{−1}(a) is a “short” vector and, ∀a ∈ ℤ_q, it holds that ⟨g, g^{−1}(a)⟩ = a, where ⟨·,·⟩ denotes the inner product. The function g^{−1} is defined as:
g^{−1}(a) = (x_0, x_1, . . . , x_{l−1}) ∈ {0,1}^l,
where a = Σ_{i=0}^{l−1} x_i 2^i is the binary representation of a. The gadget vector is used to define the gadget matrix G as:
G = I_n ⊗ g = diag(g, . . . , g) ∈ ℤ_q^{n×nl},
where I_n is the n×n identity matrix and ⊗ denotes the Kronecker product. The binary decomposition function, g^{−1}, is applied entry-wise to vectors and matrices over ℤ_q. Thus, g^{−1} is extended to get another deterministic decomposition function G^{−1}: ℤ_q^{n×m} → {0,1}^{nl×m} such that G·G^{−1}(A) = A.
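The gadget definitions above can be made concrete with a short Python sketch that builds g, the binary decomposition, the gadget matrix G, and the entry-wise extension, and then checks the identity G·G^{−1}(A) = A over ℤ_q. The function names, the list-of-lists matrix representation, and the toy values q = 17 and n = 2 are assumptions chosen for illustration, and the logarithm is taken to be base 2 with a ceiling.

```python
def bit_length_of_modulus(q):
    """Return l = ceil(log2 q) for an integer q >= 2."""
    return (q - 1).bit_length()


def gadget_vector(q):
    """g = (1, 2, 4, ..., 2^(l-1))."""
    return [1 << i for i in range(bit_length_of_modulus(q))]


def g_inv(a, q):
    """Binary decomposition of a in Z_q, least significant bit first."""
    a %= q
    return [(a >> i) & 1 for i in range(bit_length_of_modulus(q))]


def gadget_matrix(n, q):
    """G = I_n (Kronecker product) g, an n x (n*l) block-diagonal matrix."""
    g = gadget_vector(q)
    l = len(g)
    G = [[0] * (n * l) for _ in range(n)]
    for i in range(n):
        for j in range(l):
            G[i][i * l + j] = g[j]
    return G


def G_inv(A, q):
    """Apply g_inv entry-wise: an n x m matrix over Z_q becomes (n*l) x m bits."""
    l = bit_length_of_modulus(q)
    n, m = len(A), len(A[0])
    out = [[0] * m for _ in range(n * l)]
    for i in range(n):
        for c in range(m):
            for j, bit in enumerate(g_inv(A[i][c], q)):
                out[i * l + j][c] = bit
    return out


def matmul_mod(X, Y, q):
    """Matrix product X*Y reduced modulo q."""
    return [[sum(x * y for x, y in zip(row, col)) % q for col in zip(*Y)]
            for row in X]


q, n = 17, 2
A = [[3, 16, 0], [9, 1, 12]]
recomposed = matmul_mod(gadget_matrix(n, q), G_inv(A, q), q)
```

For these values, `recomposed` equals `A`, and the inner product of `gadget_vector(17)` with `g_inv(13, 17)` equals 13.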
Let T be a full binary tree with at least one node, with T.r and T.ℓ denoting its right and left subtrees, respectively. For two randomly sampled matrices A_0, A_1 ∈ ℤ_q^{n×nl}, define the function A_T(x): {0,1}^{|T|} → ℤ_q^{n×nl} as:
where x = x_ℓ ∥ x_r, for x_ℓ ∈ {0,1}^{|T.ℓ|}, x_r ∈ {0,1}^{|T.r|}. The KH-PRF function family is defined as:
ℱ_{A_0,A_1,T} = {F_s: {0,1}^{|T|} → X}.
A member of the function family is indexed by the seed s ∈ {−1,0,1}^n as: F_s(x) = s·A_T(x) + e mod q, where
It is also worth mentioning that using such errors to generate hard LWE instances is a debatable topic as doing so causes a large Rényi divergence from the errors used by other deterministic—and even probabilistic—hard-to-invert functions that are based on LWE. However, since there is no concrete evidence proving that such errors lead to significantly weaker LWE instances, these errors were chosen along with adding an additional constraint which is:
where wt(x) denotes the Hamming weight of x. On the other hand, sampling the seeds as s ∈ {−1,0,1}^n is a proven method to generate hard-to-invert LWE instances. Note that the function family has two parameters, namely the key/seed s and the plaintext/input x.
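The key-homomorphic property that the family relies on can be illustrated with a simplified Python sketch. It is not the construction itself: the error term e is omitted, and the input-dependent matrix A_T(x) is replaced by an arbitrary random public matrix M, both of which are assumptions made only so that the algebra is easy to see. The function names and parameter values are hypothetical. Under these simplifications, F_{s1}(x) + F_{s2}(x) = F_{s1+s2}(x) holds exactly modulo q; with errors present, the relation would be expected to hold only up to the accumulated error.

```python
import random


def prf_noiseless(s, M, q):
    """Return the row vector s * M mod q (the error term is omitted).

    s -- seed as a list of n integers
    M -- n x k public matrix standing in for A_T(x) (assumption)
    """
    return [sum(s_i * row[j] for s_i, row in zip(s, M)) % q
            for j in range(len(M[0]))]


def add_mod(u, v, q):
    """Coordinate-wise sum of two vectors modulo q."""
    return [(a + b) % q for a, b in zip(u, v)]


rng = random.Random(1)
q, n, k = 257, 4, 12
M = [[rng.randrange(q) for _ in range(k)] for _ in range(n)]
s1 = [rng.choice((-1, 0, 1)) for _ in range(n)]
s2 = [rng.choice((-1, 0, 1)) for _ in range(n)]
s_sum = [a + b for a, b in zip(s1, s2)]  # entries may fall outside {-1, 0, 1}

lhs = add_mod(prf_noiseless(s1, M, q), prf_noiseless(s2, M, q), q)
rhs = prf_noiseless(s_sum, M, q)
```

Here `lhs` and `rhs` are equal, which is the linearity in the seed that updatable encryption schemes exploit when moving a ciphertext from an old key to a new key.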
It can be proven that the output space of ℱ is a convex set. The proof follows directly from the output space being ℤ_q^n. ℱ is a quantum-safe family of key-homomorphic PRFs and can be used to realize secure, symmetric, bi-directional updatable encryption. Based on the described technology, no two plaintexts can exist in the same smaller convex set within ℤ_q^n. The central idea is to limit the permitted plaintexts such that
for any two plaintexts x1, x2.
To verify that a given ciphertext C corresponds to plaintext P, the verifier has access to a separation oracle for the convex set to which encryptions of P belong within ℤ_q^n. However, in order to implement a separation oracle, a membership oracle is used in one implementation as follows: translate C to the nearest element of a sufficiently “coarse” public subset of p ≪ q well-separated values in ℤ_q^n (e.g., a subgroup), where p is a prime. Let C′ be a previous encryption of P, i.e., before the latest key update which generated C. Compute the Rényi divergence between C and C′ and if that computed divergence is
(an example separation condition), then output “yes” (validating that the two ciphertexts correspond to the same plaintext), else output “no” (indicating that the two ciphertexts do not correspond to the same plaintext). Using this membership oracle, one or more quantum algorithms can implement a separation oracle with query complexity Õ(1). Accordingly, the membership oracle can determine whether C and C′ correspond to the same plaintext P without decrypting either of them.
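The oracle steps above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the text does not specify how a ciphertext vector is turned into a probability distribution, so the sketch uses a smoothed histogram of the rounded coordinates. The rounding rule, the order alpha, the function names, and the `threshold` parameter (including the direction of the comparison) are all hypothetical and merely stand in for the separation condition.

```python
import math
from collections import Counter


def round_to_coarse(c, q, p):
    """Map each coordinate in Z_q to the index of the nearest of p evenly spaced values."""
    return [(((x % q) * p + q // 2) // q) % p for x in c]


def histogram(values, p, smoothing=1.0):
    """Smoothed empirical distribution over {0, ..., p-1}; smoothing keeps every mass positive."""
    counts = Counter(values)
    total = len(values) + smoothing * p
    return [(counts.get(v, 0) + smoothing) / total for v in range(p)]


def renyi_divergence(P, Q, alpha=2.0):
    """Renyi divergence of order alpha between discrete distributions P and Q.

    D_alpha(P || Q) = log(sum_i P_i^alpha * Q_i^(1 - alpha)) / (alpha - 1),
    defined here for alpha > 0, alpha != 1, and strictly positive Q.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = sum(p_i ** alpha * q_i ** (1 - alpha) for p_i, q_i in zip(P, Q))
    return math.log(total) / (alpha - 1)


def same_convex_set(c_new, c_old, q, p, threshold, alpha=2.0):
    """Hypothetical membership test: answer True ("yes") when the divergence is small."""
    P = histogram(round_to_coarse(c_new, q, p), p)
    Q = histogram(round_to_coarse(c_old, q, p), p)
    return renyi_divergence(P, Q, alpha) <= threshold


example = renyi_divergence([0.5, 0.5], [0.9, 0.1], alpha=2.0)
```

For the two-point distributions in the last line, the divergence is log(0.25/0.9 + 0.25/0.1), and the divergence of any distribution from itself is zero, which is why identical inputs always pass the hypothetical threshold test.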
With regard to the separation condition, the manner in which the function family is selected allows the system to leverage this separation condition, which essentially states that for carefully selected (e.g., two or more) plaintexts, the probability that any function from the function family maps them to the same convex set is negligible. Hence, the separation condition separates the domain via the separation exhibited in the range of the function family.
A divergence operation 204 computes a divergence between the two ciphertext inputs. In one implementation, the divergence is a Rényi divergence, although other types of divergence computations may be employed. A membership operation 206 executes a membership oracle on the two ciphertext inputs. The two ciphertext inputs are determined to be members of the same convex set of the one or more convex sets if the computed divergence satisfies a separation condition. A validation operation 208 validates that the two ciphertext inputs correspond to the same plaintext input, responsive to determining that the two ciphertext inputs are members of the same convex set of the one or more convex sets, wherein the two ciphertext inputs do not correspond to the same plaintext input if the two ciphertext inputs are members of different convex sets of the one or more convex sets. In this manner, the validation of successful key rotation does not require decryption of either ciphertext input and/or the possession of the original plaintext.
In at least one implementation, a key rotator 304 is configured to select the encryption function from the function family having the output space of one or more convex sets.
An input interface 306 is configured to input two ciphertext inputs encrypted from one or more plaintext inputs by an encryption function using different encryption keys. For example, a first ciphertext input was encrypted using an old cryptographic key, and a second ciphertext input was encrypted using a new cryptographic key, after a key rotation action. The encryption function is selected from a function family having an output space of one or more convex sets. In some implementations, the input interface 306 receives the two ciphertexts from a key rotator 304 of the system 300. In other implementations, the input interface 306 receives the two ciphertexts from external systems or from storage.
A divergence evaluator 308 is configured to compute a divergence between the two ciphertext inputs. In one implementation, the divergence is a Rényi divergence, although other types of divergence computations may be employed. A membership verifier 310 is configured to execute a membership oracle on the two ciphertext inputs. The two ciphertext inputs are determined to be members of the same convex set of the one or more convex sets if the computed divergence satisfies a separation condition. A validator 312 is configured to validate that the two ciphertext inputs correspond to the same plaintext input, responsive to determining that the two ciphertext inputs are members of the same convex set of the one or more convex sets, wherein the two ciphertext inputs do not correspond to the same plaintext input if the two ciphertext inputs are not members of the same convex set of the one or more convex sets. In this manner, the validation of successful key rotation does not require decryption of either ciphertext input and/or the possession of the original plaintext.
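The division of labor among the divergence evaluator, the membership verifier, and the validator can be sketched as a small Python class. The class name, method names, the pluggable `divergence` callable, and the `threshold` field are assumptions introduced for illustration and are not taken from the described system. The toy divergence in the usage lines is a mean absolute difference, which is not a Rényi divergence and is used only to exercise the control flow.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Ciphertext = Sequence[int]


@dataclass(frozen=True)
class RotationVerifier:
    """Hypothetical pipeline: divergence evaluator, membership verifier, validator."""

    divergence: Callable[[Ciphertext, Ciphertext], float]
    threshold: float  # stands in for the separation condition (assumption)

    def evaluate_divergence(self, c_old: Ciphertext, c_new: Ciphertext) -> float:
        """Divergence evaluator: compare the two ciphertexts without decrypting them."""
        return self.divergence(c_old, c_new)

    def is_same_set(self, c_old: Ciphertext, c_new: Ciphertext) -> bool:
        """Membership verifier: same set when the divergence meets the condition."""
        return self.evaluate_divergence(c_old, c_new) <= self.threshold

    def validate(self, c_old: Ciphertext, c_new: Ciphertext) -> str:
        """Validator: report "yes" for the same plaintext, otherwise "no"."""
        return "yes" if self.is_same_set(c_old, c_new) else "no"


def toy_divergence(a: Ciphertext, b: Ciphertext) -> float:
    """Placeholder measure (mean absolute difference), not a Renyi divergence."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


verifier = RotationVerifier(divergence=toy_divergence, threshold=1.0)
close_result = verifier.validate([1, 2, 3], [1, 2, 4])
far_result = verifier.validate([1, 2, 3], [10, 20, 30])
```

Keeping the divergence as an injected callable mirrors the statement that other types of divergence computations may be employed, and neither method touches a key or a plaintext.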
In an example computing device 400, as shown in
The computing device 400 includes a power supply 416, which is powered by one or more batteries or other power sources and which provides power to other components of the computing device 400. The power supply 416 may also be connected to an external power source that overrides or recharges the built-in batteries or other power sources.
The computing device 400 may include one or more communication transceivers 430 that may be connected to one or more antenna(s) 432 to provide network connectivity (e.g., mobile phone network, Wi-Fi®, Bluetooth®) to one or more other servers and/or client devices (e.g., mobile devices, desktop computers, or laptop computers). The computing device 400 may further include a network adapter 436, which is a type of computing device. The computing device 400 may use the adapter and any other types of computing devices for establishing connections over a wide-area network (WAN) or local-area network (LAN). It should be appreciated that the network connections shown are exemplary and that other computing devices and means for establishing a communications link between the computing device 400 and other devices may be used.
The computing device 400 may include one or more input devices 434 such that a user may enter commands and information (e.g., a keyboard or mouse). These and other input devices may be coupled to the server by one or more interfaces 438, such as a serial port interface, parallel port, or universal serial bus (USB). The computing device 400 may further include a display 422, such as a touch screen display.
The computing device 400 may include a variety of tangible processor-readable storage media and intangible processor-readable communication signals. Tangible processor-readable storage can be embodied by any available media that can be accessed by the computing device 400 and includes both volatile and nonvolatile storage media, removable and non-removable storage media. Tangible processor-readable storage media excludes communications signals (e.g., signals per se) and includes volatile and nonvolatile, removable and non-removable storage media implemented in any method or technology for storage of information such as processor-readable instructions, data structures, program modules or other data. Tangible processor-readable storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CDROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other tangible medium which can be used to store the desired information and which can be accessed by the computing device 400. In contrast to tangible processor-readable storage media, intangible processor-readable communication signals may embody processor-readable instructions, data structures, program modules or other data resident in a modulated data signal, such as a carrier wave or other signal transport mechanism. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, intangible communication signals include signals traveling through wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media.
Various software components described herein are executable by one or more hardware processors, which may include logic machines configured to execute hardware or firmware instructions. For example, the processors may be configured to execute instructions that are part of one or more applications, services, programs, routines, libraries, objects, components, data structures, or other logical constructs. Such instructions may be implemented to perform a task, implement a data type, transform the state of one or more components, achieve a technical effect, or otherwise arrive at a desired result.
Aspects of processors and storage may be integrated together into one or more hardware logic components. Such hardware-logic components may include field-programmable gate arrays (FPGAs), program- and application-specific integrated circuits (PASIC/ASICs), program- and application-specific standard products (PSSP/ASSPs), system-on-a-chip (SOC), and complex programmable logic devices (CPLDs), for example.
The terms “module,” “program,” and “engine” may be used to describe an aspect of a remote control device and/or a physical controlled device 802 implemented to perform a particular function. It will be understood that different modules, programs, and/or engines may be instantiated from the same application, service, code block, object, library, routine, API, function, etc. Likewise, the same module, program, and/or engine may be instantiated by different applications, services, code blocks, objects, routines, APIs, functions, etc. The terms “module,” “program,” and “engine” may encompass individual or groups of executable files, data files, libraries, drivers, scripts, database records, etc.
It will be appreciated that a “service,” as used herein, is an application program executable across multiple user sessions. A service may be available to one or more system components, programs, and/or other services. In some implementations, a service may run on one or more server computing devices.
While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any inventions or of what may be claimed, but rather as descriptions of features specific to particular embodiments of particular described technology. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Thus, particular embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In certain implementations, multitasking and parallel processing may be advantageous.
A number of implementations of the described technology have been described. Nevertheless, it will be understood that various modifications can be made without departing from the spirit and scope of the recited claims.