The advancement of science is possible when knowledge is shared and information is exchanged in a seamless manner. In a world where many businesses rely on information as their main assets, analysis over data is a crucial competitive advantage. Consequently, the amount of data processed and stored will continue to increase, creating a demand for virtualized services. To this end, some applications can be provided as cloud computing resources including Internet of Things (IoT), machine learning, virtual reality (VR) and blockchain. As a result, concerns about custody and privacy of data are on the rise.
Modern concealment/encryption employs mathematical techniques that manipulate positive integers or binary bits. Asymmetric concealment/encryption, such as RSA (Rivest-Shamir-Adleman), relies on number theoretic one-way functions that are predictably difficult to factor and can be made more difficult with an ever-increasing size of the encryption keys. Symmetric encryption, such as DES (Data Encryption Standard) and AES (Advanced Encryption Standard), uses bit manipulations within registers to shuffle the concealed text/cryptotext/ciphertext to increase “diffusion” as well as register-based operations with a shared key to increase “confusion.” Diffusion and confusion are measures for the increase in statistical entropy on the data payload being transmitted. The concepts of diffusion and confusion in encryption are normally attributed as first being identified by Claude Shannon in the 1940s. Diffusion is generally thought of as complicating the mathematical process of generating unencrypted (plain text) data from the encrypted (cryptotext/ciphertext) data, thus, making it difficult to discover the encryption key of the concealment/encryption process by spreading the influence of each piece of the unencrypted (plain) data across several pieces of the concealed/encrypted (cryptotext) data. Consequently, an encryption system that has a high degree of diffusion will typically change several characters of the concealed/encrypted (cryptotext/ciphertext) data for the change of a single character in the unencrypted (plain) data making it difficult for an attacker to identify changes in the unencrypted (plain) data. Confusion is generally thought of as obscuring the relationship between the unencrypted (plain) data and the concealed/encrypted (cryptotext) data. 
Accordingly, a concealment/encryption system that has a high degree of confusion would entail a process that drastically changes the unencrypted (plain) data into the concealed/encrypted (cryptotext/ciphertext) data in a way that, even when an attacker knows the operation of the concealment/encryption method (such as the public standards of RSA, DES, and/or AES), it is still difficult to deduce the encryption key.
Homomorphic Encryption is a form of encryption that allows computations to be carried out on ciphertext while it remains concealed/encrypted, without decrypting it. The computation generates a concealed/encrypted result which, when decrypted, matches the result of the same operations performed on the unencrypted plaintext.
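As a small illustration of this property, the following sketch uses unpadded ("textbook") RSA, which is homomorphic with respect to multiplication only. The primes, exponent, and messages are toy values chosen for illustration; they are not secure, and this is not the FHE system described in this document.

```python
# Toy parameters only: tiny primes, no padding, not secure.
p, q = 61, 53
n = p * q                       # public modulus
phi = (p - 1) * (q - 1)
e = 17                          # public exponent, coprime with phi
d = pow(e, -1, phi)             # private exponent (modular inverse, Python 3.8+)

def encrypt(m):
    return pow(m, e, n)

def decrypt(c):
    return pow(c, d, n)

a, b = 7, 12
# Multiply the two ciphertexts without ever decrypting them.
c_prod = (encrypt(a) * encrypt(b)) % n
# Decrypting the product of ciphertexts gives the product of plaintexts (mod n).
recovered = decrypt(c_prod)
print(recovered == (a * b) % n)
```

A fully homomorphic scheme supports both addition and multiplication on ciphertexts; the sketch only shows the multiplicative case.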
The word homomorphism comes from the ancient Greek language: ὁμός (homos) meaning “same” and μορφή (morphe) meaning “form” or “shape.” Homomorphism may have different definitions depending on the field of use. In mathematics, for example, homomorphism may be considered a transformation of a first set into a second set where the relationship between the elements of the first set is preserved in the relationship of the elements of the second set.
For instance, a map f between sets A and B is a homomorphism of A into B if
$$f(a_1 \ \mathrm{op}\ a_2) = f(a_1) \ \mathrm{op}\ f(a_2) \quad \text{for all } a_1, a_2 \in A,$$
where “op” is the respective group operation defining the relationship between A and B.
More specifically, for abstract algebra, the term homomorphism may be a structure-preserving map between two algebraic structures such as groups, rings, or vector spaces. Isomorphisms, automorphisms, and endomorphisms are typically considered special types of homomorphisms. Among other more specific definitions of homomorphism, algebra homomorphism may be considered a homomorphism that preserves the algebra structure between two sets.
An embodiment of the present invention may comprise a method for Homomorphic Encryption (HE) compatible encoding and decoding of rational data for encrypted data transmission with a Fully Homomorphic Encryption (FHE) system between a source computing device and a destination computing device, the method comprising: encoding by the source computing device at least one rational number x/y into at least one integer corresponding to the at least one rational number x/y as a function of p-adic arithmetic performed on each of the at least one rational number x/y such that the at least one integer retains homomorphic properties; encrypting by the source computing device the at least one integer into at least one ciphertext with the FHE system operating on the source computing device; sending by the source computing device the at least one ciphertext to the destination computing device; decrypting by the destination computing device the at least one ciphertext into the at least one integer with the FHE system operating on the destination computing device; and decoding by the destination computing device the at least one integer into the at least one rational number x/y corresponding to the at least one integer as a function of inverse p-adic arithmetic performed on each of the at least one integer.
An embodiment of the present invention may further comprise a PIE (p-adic encoding) system that encodes and decodes rational data with Homomorphic Encryption (HE) compatibility for encrypted data transmission with a Fully Homomorphic Encryption (FHE) system between a source computing device and a destination computing device, the PIE system comprising: the source computing device, wherein the source device further comprises: a PIE encode subsystem that encodes at least one rational number x/y into at least one integer corresponding to the at least one rational number x/y as a function of p-adic arithmetic performed on each of the at least one rational number x/y such that the at least one integer has homomorphic properties and is HE compatible; the FHE system operating on the source computing device that encrypts the at least one integer into at least one ciphertext; a ciphertext send subsystem that sends the at least one ciphertext to the destination computing device; the destination computing device, wherein the destination computing device further comprises: the FHE system operating on the destination computing device that decrypts the at least one ciphertext into the at least one integer; and a PIE decode subsystem that decodes the at least one integer into the at least one rational number x/y corresponding to the at least one integer as a function of inverse p-adic arithmetic performed on each of the at least one integer.
A large part of current research in Homomorphic Encryption (HE) aims towards making HE practical for real-world applications. In any practical HE, an important issue is to convert the application data (type) to the data type suitable for the HE.
The main purpose of this work is to investigate an efficient HE-compatible encoding method that is generic, and can be easily adapted to apply to the HE schemes over integers or polynomials.
p-adic number theory provides a way to transform rationals to integers, which makes it a candidate for encoding rationals. Although one may use naive number-theoretic techniques to perform rational-to-integer transformations without reference to p-adic numbers, we contend that the theory of p-adic numbers is the proper lens to view such transformations.
In this work we identify mathematical techniques (supported by p-adic number theory) as appropriate tools to construct a generic rational encoder, which is compatible with HE. Based on these techniques, we propose a new encoding scheme PIE (p-adic encoding) that can be easily combined with both AGCD (Approximate Greatest Common Divisor)-based and RLWE (Ring Learning With Error)-based HE to perform high precision arithmetic. After presenting an abstract version of PIE, we show how it can be attached to two well-known HE schemes: the AGCD-based IDGHV (Integer—Dijk, Gentry, Halevi, and Vaikuntanathan) scheme and the RLWE-based (modified) Fan-Vercauteren (FV) scheme. We also discuss the advantages of our encoding scheme in comparison with previous works.
Much of current research and development in HE is focused on efficient implementation with suitable software and/or hardware support and developing practically usable libraries for HE that can be used for various machine learning and data analysis applications. These works clearly aim towards making HE practical for real-world applications.
The state-of-the-art HE schemes are defined to process (modulo) integer inputs or polynomial inputs (with modulo integer coefficients). For a significantly large number of practical applications, an HE scheme should be able to operate on real/rational numbers. In any practical HE an important issue is to convert the application data (type) to the data type suitable for the HE. This is usually achieved by encoding real-valued data to convert it into a “suitable” form compatible with homomorphic encryption. Any encoding must come with a matching decoding. Additionally, such an encoding must be homomorphic with respect to addition and multiplication, and injective. Most importantly, any such encoding technique must be efficient and not hinder the efficiency of the underlying HE scheme.
The interest in HE-compatible encoding to process real/rational inputs efficiently is evident from a number of previous works. In most of the RLWE (Ring Learning with Error) hardness-based homomorphic encryption schemes, a plaintext is viewed as an element of the ring $R_t = \mathbb{Z}_t[x]/\Phi_m(x)$, where $\Phi_m(x)$ is the m-th cyclotomic polynomial and $\mathbb{Z}_t$ is the ring of integers modulo t. Encoding integer input to a polynomial in $R_t$ is relatively straightforward, namely one can consider the base t representation of the integer. To allow integer and rational inputs, one must define an encoding that converts elements of $\mathbb{Z}$ or $\mathbb{Q}$ (typically represented as fixed-point decimal numbers in applications) into elements of $R_t$. Previous works have proposed several encoding methods for integers and rationals. One previously taken approach is to scale the fixed-point numbers to integers and then encode them as polynomials (using a suitable base). Another approach is to consider them as fractional numbers. It was shown that these two representations are isomorphic. The latter approach, although avoiding the overhead of bookkeeping homomorphic ciphertext, is difficult to analyze.
All of the aforementioned encodings share a problem; namely, t must have a sufficiently large value for the encoding to work correctly. This large value of t means one may need to choose large parameters for the overall homomorphic encryption scheme, hindering efficiency. A clever solution to this problem was proposed by Chen, Laine, Player and Xia (CLPX), which borrows a mathematical technique from Hoffstein and Silverman and combines it with the homomorphic encryption scheme proposed by Fan and Vercauteren (FV). The main idea of the so-called CLPX encoding is to replace the modulus t with the polynomial x−b for some positive integer b, turning the plaintext space into the quotient ring $\mathbb{Z}/(b^n+1)\mathbb{Z}$. Note that CLPX encoding converts fractional or fixed-point numbers and the scheme combines it with a modified version (which we will call ModFV) of the original FV scheme.
In the CLPX encoding, the rational (input) domain is a finite subset of $\mathbb{Q}$ and, therefore, is not closed under the usual compositions (addition and multiplication), which can potentially lead to overflow problems. That is, if the composition of two rational inputs lies outside the domain then its decoding (after homomorphic computation) will be incorrect. However, the CLPX authors do not provide any analytical discussion or solution towards solving this problem. The theory behind our encoding, which also transforms fixed-point (decimal) numbers, allows us to provide an analytical solution to this problem.
The main aim of our work is to investigate an efficient HE-compatible encoding method that is generic (not necessarily targeted for a specific HE scheme) and can be easily adapted to apply to the HE schemes over integers or polynomials. The results of this work are as follows:
We show our encoding scheme allows for a much larger input space compared to the previous encoding schemes for an RLWE-based HE without severely compromising the circuit depth that can be evaluated using the HE. To the best of our knowledge this is the first work discussing an encoding scheme for AGCD-based schemes.
We implemented PIE in C++, together with proof-of-concept implementations of the IDGHV (Batch FHE) and ModFV schemes (the FHE part of our implementation is not optimized), to estimate the efficiencies of the encoding and decoding. The results of our experiment are given in Section 6.
In this section we introduce the basic ideas and techniques from p-adic number theory that are necessary for developing our encoding scheme. We emphasize that the ideas described in this section are self-contained and do not assume prior knowledge of p-adic number theory.
For a real number r, the functions $\lfloor r \rfloor$, $\lceil r \rceil$, $\lfloor r \rceil$ denote the usual “floor”, “ceiling”, and “round-to-nearest-integer” functions. For an integer a, $|a|_{bits}$ denotes the bit length of a. The ring of integers is denoted by $\mathbb{Z}$, and the field of rationals by $\mathbb{Q}$. For a positive integer n, $\mathbb{Z}/n\mathbb{Z}$ denotes the ring of integers modulo n. In case n is prime, we sometimes write $\mathbb{F}_n$. To distinguish this ring (field) from sets of integer representatives, we denote by $\mathbb{Z}_n$ the set $[-\lceil (n-1)/2 \rceil, \lfloor (n-1)/2 \rfloor] \cap \mathbb{Z}$. For integers a, n we denote by a mod n the unique integer $\bar{a} \in \mathbb{Z}_n$ such that $n \mid (a - \bar{a})$. Similarly, we use the elements of $\mathbb{Z}_n$ as representatives of the cosets of $\mathbb{Z}/n\mathbb{Z}$, and sometimes use $\mathbb{Z}_n$ in place of $\mathbb{Z}/n\mathbb{Z}$, though in this case we are careful to put “mod n” where appropriate. For a polynomial p, $\lfloor p \rceil$ and $[p]_n$ denote the rounding of each coefficient to the nearest integer, and the reduction of each coefficient modulo n. We write log(·) for $\log_2(\cdot)$ everywhere. “Input space” will always mean the set of fractions for which encoding correctness holds, and “message space” always means a subset of the input space for which homomorphic correctness (for arithmetic circuits up to a certain depth) holds.
2.2 Results and Techniques from p-adic Arithmetic.
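Roughly speaking, p-adic number theory allows us to represent a rational $x/y \in \mathbb{Q}$ using integers. If $x/y \in \mathbb{Q}$ and p is a prime then we have:
$$\frac{x}{y} = \sum_{j=n}^{\infty} a_j p^j, \qquad (1)$$
where $0 \le a_j < p$ and $n \in \mathbb{Z}$. When $n \in \mathbb{Z}^{+} \cup \{0\}$ the sum in (1) is called a p-adic integer. Equivalently, observe that any rational x/y can be rewritten in the form:
$$\frac{x}{y} = \frac{a}{b}\, p^{v}, \qquad \gcd(a, p) = \gcd(b, p) = 1.$$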
The number v is called the p-adic valuation of x/y. In case v≥0, x/y is a p-adic integer. The ring of p-adic integers is denoted by $\mathbb{Z}_p$.
An r-segment p-adic representation, a.k.a. Hensel code, simply truncates the above sum after j=r−1. In this case, the power series in eq. (1) becomes the finite sum:
$$\sum_{j=n}^{r-1} a_j p^j.$$
A natural consequence of this truncated representation is a mapping (discussed in detail in Definition 3) from a set of rationals to $\mathbb{Z}/p^r\mathbb{Z}$. This mapping is the main component of our encoding scheme.
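The truncated representation can be sketched as follows for the case where x/y is a p-adic integer (p does not divide y). The function name `hensel_code` and the sample values are illustrative assumptions, not part of the described system; the sketch computes $x \cdot y^{-1} \bmod p^r$ and reads off its r base-p digits $a_0, \dots, a_{r-1}$.

```python
from fractions import Fraction

def hensel_code(frac, p, r):
    """Return (h, digits) where h = x * y^{-1} mod p^r and digits are the
    base-p digits a_0, ..., a_{r-1} of h (least significant first).
    Assumes p does not divide the denominator (p-adic valuation >= 0)."""
    x, y = frac.numerator, frac.denominator
    if y % p == 0:
        raise ValueError("denominator divisible by p: not a p-adic integer")
    modulus = p ** r
    h = (x * pow(y, -1, modulus)) % modulus   # modular inverse needs Python 3.8+
    digits = []
    rest = h
    for _ in range(r):
        digits.append(rest % p)
        rest //= p
    return h, digits

# 1/3 as a 4-segment 5-adic expansion.
h, digits = hensel_code(Fraction(1, 3), 5, 4)
print(h, digits)
```

Multiplying `h` by 3 and reducing modulo 5^4 gives 1, which is one way to check that the integer really stands in for 1/3.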
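A specific set of rational numbers (p-adic numbers) called the Farey rationals is defined as follows.
Definition 1 (Farey Rationals): Given a prime p and an integer r≥1, let
$$N = \left\lfloor \sqrt{(p^r - 1)/2} \right\rfloor.$$
The Farey rationals are defined as:
$$\mathcal{F}_N = \{\, x/y \;:\; 0 \le |x| \le N,\ 1 \le y \le N \,\},$$
where gcd(x, y)=gcd(y, p)=1 (gcd denotes the greatest common divisor).
We note that every rational in $\mathcal{F}_N$ has p-adic valuation v≥0, and therefore $\mathcal{F}_N \subset \mathbb{Z}_p$ (the p-adic integers); i.e. every Farey rational is a p-adic integer.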
For describing the mapping on which our encoder is based, we need to introduce the Modified Extended Euclidean algorithm (MEEA). The MEEA is simply a truncated version of the extended Euclidean algorithm (EEA) and is similarly efficient. We pause briefly to describe the EEA. Recall that the EEA calculates the greatest common divisor of two integers $x_0, x_1$ along with the associated Bézout coefficients $y, z \in \mathbb{Z}$ such that $x_1 \cdot y + x_0 \cdot z = \gcd(x_0, x_1)$. The computation generates the tuples $(x_2, \dots, x_n)$, $(y_2, \dots, y_n)$, $(z_2, \dots, z_n)$, and $q_i = \lfloor x_{i-1}/x_i \rfloor$ such that:
$$x_{i+1} = x_{i-1} - q_i x_i, \qquad y_{i+1} = y_{i-1} - q_i y_i, \qquad z_{i+1} = z_{i-1} - q_i z_i,$$
with $y_0 = 0$, $y_1 = 1$, $z_0 = 1$, $z_1 = 0$.
Moreover, for each i ≤ n, we have $y_i x_1 + z_i x_0 = x_i$. The computation stops with $x_n = 0$, at which point $x_{n-1} = \gcd(x_0, x_1)$.
Definition 2 (MEEA): Given $x_0, x_1 \in \mathbb{Z}$, MEEA($x_0, x_1$) is defined as the output $(x, y) = ((-1)^{i+1} x_i, (-1)^{i+1} y_i)$ of the extended Euclidean algorithm (as described above) once $|x_i| \le N$.
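The truncated algorithm of Definition 2 can be sketched as below. This is a minimal sketch, assuming the usual starting values $y_0 = 0$, $y_1 = 1$ for the recurrence and non-negative inputs with $x_0 > x_1$; the function name `meea` and the explicit parameter `N` are illustrative choices rather than the reference implementation.

```python
def meea(x0, x1, N):
    """Truncated extended Euclid on (x0, x1).

    Iterates x_{i+1} = x_{i-1} - q_i * x_i (and the same for y) until
    |x_i| <= N, then returns ((-1)^(i+1) * x_i, (-1)^(i+1) * y_i).
    Assumes x0 > x1 >= 0, y_0 = 0 and y_1 = 1."""
    x_prev, x_cur = x0, x1
    y_prev, y_cur = 0, 1
    i = 1
    while abs(x_cur) > N:
        q = x_prev // x_cur
        x_prev, x_cur = x_cur, x_prev - q * x_cur
        y_prev, y_cur = y_cur, y_prev - q * y_cur
        i += 1
    sign = (-1) ** (i + 1)
    return sign * x_cur, sign * y_cur

# Recover 1237/100 from its image x * y^{-1} mod 3^22 (illustrative values).
g = 3 ** 22
N = 125261
h = (1237 * pow(100, -1, g)) % g
print(meea(g, h, N))
```

The sign factor makes the returned denominator non-negative, so a negative rational is reported with the sign on its numerator.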
Now we are ready to define the necessary mapping from the Farey rationals $\mathcal{F}_N$ to $\mathbb{Z}_{p^r}$.
Definition 3: The mapping Hp
The H-mapping is injective and, therefore, gives a unique representation of each element of $\mathcal{F}_N$ in $\mathbb{Z}_{p^r}$.
Proposition 1. For all x/y∈N and h∈Hp
Proof. Let x/y∈N, Hp
Proposition 2. The mappings Hp
Proof. (i) Let a/b, c/d∈N. By definition of the Farey rationals, a, b, c, d are co-prime with p. That Hp
Example 1. Given rationals a=12.37 and b=8.3, we choose p=3, r=22. Here $N = \lfloor \sqrt{(p^r - 1)/2} \rfloor = 125261$. We compute the encodings of a and b as $h_1$ and $h_2$:
We can now compose the rationals with addition, subtraction, and multiplication, and decode to check correctness:
Replacing the prime power with a composite. The above results can be extended when pr is replaced by an arbitrary positive integer g. Let p1, . . . , pk be distinct primes, g=p1r
We briefly recall (the integer version of) the Chinese Remainder Theorem (CRT), as it is necessary for our encoding scheme.
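Definition 4 (Chinese Remainder Theorem). Let $n_1, \dots, n_k$ be k pairwise co-prime integers, and $n = \prod_{i=1}^{k} n_i$. The CRT describes the isomorphism $\mathbb{Z}/n\mathbb{Z} \cong \mathbb{Z}/n_1\mathbb{Z} \times \dots \times \mathbb{Z}/n_k\mathbb{Z}$ given by:
$$x \bmod n \;\mapsto\; (x \bmod n_1, \dots, x \bmod n_k).$$
We denote the unique x such that $x = h_i \bmod n_i$ for $(h_1, \dots, h_k) \in \mathbb{Z}/n_1\mathbb{Z} \times \dots \times \mathbb{Z}/n_k\mathbb{Z}$ by $\mathrm{CRT}_{n_1, \dots, n_k}(h_1, \dots, h_k)$.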
Remark 1. In the following definition, we abuse notation slightly and identify CRT . . . ( . . . ) not with actual ring elements in /n, but with integer representatives in n.
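A minimal sketch of the CRT reconstruction, returning an integer representative in the symmetric range described in the notation section, might look as follows. The function name `crt` and the sample moduli are illustrative assumptions.

```python
from math import prod

def crt(residues, moduli):
    """Return the integer x in the symmetric range
    [-ceil((n-1)/2), floor((n-1)/2)] with x = h_i mod n_i for every i,
    where n is the product of the pairwise co-prime moduli."""
    n = prod(moduli)
    x = 0
    for h_i, n_i in zip(residues, moduli):
        m_i = n // n_i
        # m_i is invertible mod n_i because the moduli are pairwise co-prime.
        x += h_i * m_i * pow(m_i, -1, n_i)
    x %= n
    if x > (n - 1) // 2:
        x -= n            # move to the symmetric representative
    return x

print(crt([2, 3, 2], [3, 5, 7]))
print(crt([2, 4], [3, 5]))
```

Returning the symmetric representative rather than a value in [0, n) mirrors Remark 1, which identifies the CRT output with an integer rather than a coset.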
Definition 5. The injective mapping $H_g: \mathcal{F}_N \to \mathbb{Z}_g$ and its inverse are defined as:
The following proposition is an extension of proposition 1 for composite g, and its proof proceeds similarly to the proof of proposition 1.
Proposition 3. Let N=└√{square root over ((g−1)/2)}┘. For all x/y∈N and h∈Hg(N)g,
Proposition 4. The mapping Hg is homomorphic with respect to addition and multiplication, and Hg−1 is homomorphic as in proposition 2.
Proof. Let N=└√{square root over ((g−1)/2)}┘, and u, u′∈N. Using the homomorphic properties of the CRT where necessary, we have:
By proposition 2(i), each
Whence Hg(u+u′)=Hg(u)+Hg(u′). The proof that Hg(u·u′)=Hg(u)·Hg(u′) is analogous.
To establish the homomorphic properties of Hg−1 simply replace pr by g everywhere in the proof of proposition 2(ii).
Example 2. Suppose we have the same rationals of Example 1: a=12.37 and b=8.3. We now choose p=6, r=17 and $g = p^r + 1 = 16926659444737$, which yields $N = \lfloor \sqrt{(g-1)/2} \rfloor = 2909180$. The encodings of a and b are:
Again, we compose the encodings, and verify the correctness of the results:
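The checks in Examples 1 and 2 can be reproduced with the sketch below. It assumes the closed form $H_g(x/y) = x y^{-1} \bmod g$ for encoding and a truncated extended Euclidean loop for decoding; the function names are illustrative. The prime-power modulus $3^{22}$ is the one whose N equals the 125261 quoted in Example 1, and $6^{17}+1$ is the composite modulus of Example 2.

```python
from fractions import Fraction
from math import isqrt

def encode(frac, g):
    """Map x/y to x * y^{-1} mod g (requires gcd(y, g) = 1)."""
    return (frac.numerator * pow(frac.denominator, -1, g)) % g

def decode(h, g):
    """Recover the fraction x/y with |x|, |y| <= N from h modulo g."""
    N = isqrt((g - 1) // 2)
    r0, r1, t0, t1 = g, h % g, 0, 1
    while r1 > N:
        q = r0 // r1
        r0, r1, t0, t1 = r1, r0 - q * r1, t1, t0 - q * t1
    return Fraction(r1, t1)   # Fraction moves any sign to the numerator

a, b = Fraction(1237, 100), Fraction(83, 10)   # 12.37 and 8.3
results = {}
for g in (3 ** 22, 6 ** 17 + 1):
    ha, hb = encode(a, g), encode(b, g)
    # Compose the integer encodings, then decode the result.
    results[g] = (decode(ha + hb, g), decode(ha - hb, g), decode(ha * hb, g))
    print(g, results[g])
```

Note that the integer sums and products are not reduced before decoding; `decode` reduces modulo g itself, which is harmless as long as the true rational result stays within the bound N.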
Remark 2. Definitions 3 and 5 coincide when g=pr (a prime power), so one should take the latter as the general definition of H and H−1, picking g to be a prime power when necessary.
Size of the set. The cardinality of $\mathcal{F}_N$ for $N = \lfloor \sqrt{(g-1)/2} \rfloor$ depends heavily on the choice of g. This is because the number of fractions x/y with |x|, |y|≤N that fail the condition gcd(y, g)=1 depends on the prime factorization of g: the more “small” prime factors g has, the more fractions fail the gcd condition.
Proposition 5. The cardinality of $\mathcal{F}_N$ for $N = \lfloor \sqrt{(g-1)/2} \rfloor$ is given by:
where $\Phi(k) = \sum_{i=1}^{k} \phi(i)$ and ϕ is Euler's totient function.
Proof. Use the fact that the kth Farey sequence (the kth Farey sequence is the set of reduced fractions in the interval [0, 1] with numerator and denominator each at most k) has length 1+Φ(k), and then enforce the gcd condition on the Farey rationals.
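The dependence on the factorization of g can be observed directly by brute-force counting for small moduli. The sketch below does not use the closed formula of Proposition 5; it simply enumerates reduced fractions x/y with |x| ≤ N, 1 ≤ y ≤ N and gcd(y, g) = 1. The function name `farey_count` and the two sample moduli are illustrative.

```python
from math import gcd, isqrt

def farey_count(g):
    """Count reduced fractions x/y with |x| <= N, 1 <= y <= N, gcd(y, g) = 1,
    where N = floor(sqrt((g - 1) / 2)). Returns (N, count)."""
    N = isqrt((g - 1) // 2)
    count = 0
    for y in range(1, N + 1):
        if gcd(y, g) != 1:
            continue                     # denominator not invertible mod g
        for x in range(-N, N + 1):
            if gcd(x, y) == 1:           # reduced form; gcd(0, y) = y admits 0/1 only
                count += 1
    return N, count

# A prime modulus versus a nearby modulus with small prime factors (3 * 5 * 7).
print(farey_count(101))
print(farey_count(105))
```

Both moduli give the same N, but the composite one excludes every denominator divisible by 3, 5, or 7, so its set is visibly smaller.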
Simulations show that when g is an odd prime,
This fact will be used for comparison with existing work in section 5.2 below.
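Let g be a positive integer, $N = \lfloor \sqrt{(g-1)/2} \rfloor$, and make the Farey rationals $\mathcal{F}_N$ the input space. We define encoding and decoding as follows:
$$\mathrm{PIE.Encode}(x/y) = H_g(x/y), \qquad \mathrm{PIE.Decode}(h) = H_g^{-1}(h).$$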
Proposition 6. For all m, m′∈N such that m·m′∈N,
Proof. Use proposition 3(i), proposition 3(iii), and proposition 4.
Corollary 1. Let p be a multivariate polynomial with coefficients in . For all m0, . . . , mk∈N such that p(m0, . . . , mk)∈N,
As indicated in the preceding results, for the encoding (and decoding) to yield the correct result when used in an HE scheme, one must ensure that if two or more elements from $\mathcal{F}_N$ are combined using additions and/or multiplications then any intermediates and the final output must not lie outside the set $\mathcal{F}_N$. For this reason, we will define the (rational) message space to be the following subset of $\mathcal{F}_N$:
$$\mathcal{G}_M = \{\, x/y \in \mathcal{F}_N \;:\; 0 \le |x| \le M,\ 1 \le |y| \le M \,\}. \qquad (8)$$
The main idea behind choosing a subset of $\mathcal{F}_N$ as the set of messages is that when elements from $\mathcal{G}_M$ are combined, the resulting element can remain in $\mathcal{F}_N$. Ensuring the output lands in $\mathcal{F}_N$ induces a bound on the number of computations that can be performed, and determines the choice of parameters involved therein. At this point, one might wonder whether we need to do something similar with the range $\mathbb{Z}_g$ of the encoder to make sure that overflow modulo g does not occur during computations. The answer is “no”. This is because proposition 3(iii) along with the above message space restriction imply that overflow modulo g does not affect decoding.
The choice of M depends jointly on the rational data one must encode, and the circuits one must evaluate over those data. We elaborate this in the following section.
3.1 Choosing the Message Space M.
We will describe an arithmetic circuit in terms of the multivariate polynomial it computes. To this end, recall that the 1-norm of a polynomial is simply the sum of the absolute values of its coefficients.
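Polynomials with which PIE is compatible. Let $\mathcal{P}_{d,t}$ denote the set of polynomials in the variables $x_1, x_2, \dots$ with total degree at most d and 1-norm at most t, whose coefficients have absolute value at least 1. For example, $\mathcal{P}_{d,t}$ contains polynomials of the form:
$$p(x_1, \dots, x_k) = \sum_{\alpha} c_\alpha\, x_1^{\alpha_1} \cdots x_k^{\alpha_k}, \qquad \alpha_1 + \dots + \alpha_k \le d,$$
where each $|c_\alpha| \ge 1$, and $\sum_\alpha |c_\alpha| \le t$.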
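The following proposition establishes an upper bound on the output of a polynomial in $\mathcal{P}_{d,t}$ when all inputs are from the message space $\mathcal{G}_M$.
Proposition 7. If $x_1/y_1, \dots, x_k/y_k \in \mathcal{G}_M$, $p \in \mathcal{P}_{d,t}$ is k-variate, and $p(x_1/y_1, \dots, x_k/y_k) = x/y$, then:
$$|x| \le t\, M^{dt} \quad \text{and} \quad |y| \le M^{dt}.$$
Proof. Note that p can be written as $p = \sum_{i=1}^{I} c_i p_i$, where $\sum_i |c_i| \le t$, each $|c_i| \ge 1$, and each $p_i$ is a monomial of degree at most d.
Since $\deg(p_i) \le d$, the evaluation $p_i(x_1/y_1, \dots, x_k/y_k)$ is a fraction of the form $a_i/b_i$, where $a_i$ is a product of at most d of the numerators and $b_i$ is the product of the corresponding denominators.
As each $x_j/y_j \in \mathcal{G}_M$, we have $|a_i|, |b_i| \le M^d$.
Since $x/y = \sum_{i=1}^{I} c_i \cdot a_i/b_i$, there are nonzero integers α and β such that:
$$\frac{x}{y} = \frac{\alpha}{\beta}, \qquad \alpha = \sum_{i=1}^{I} c_i a_i \prod_{j \ne i} b_j, \qquad \beta = \prod_{i=1}^{I} b_i.$$
It follows from $\sum_i |c_i| \le t$ and the above bound on $|a_i|, |b_i|$ that:
$$|\alpha| \le t\, M^{dI} \quad \text{and} \quad |\beta| \le M^{dI}.$$
The proof is completed by observing that $|c_i| \ge 1$, for all i, implies $I \le t$.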
Proposition 8. A sufficient condition for compatibility of PIE with polynomials in $\mathcal{P}_{d,t}$ as in Corollary 1:
$$t\, M^{dt} \le N. \qquad (9)$$
Proof. Suppose M is chosen according to eq. (9), and let $p \in \mathcal{P}_{d,t}$ be k-variate. According to proposition 7, if $m \in \mathcal{G}_M^k$ and p(m)=x/y, then:
$$|x| \le t\, M^{dt} \le N \quad \text{and} \quad |y| \le M^{dt} \le N.$$
Clearly gcd(g, y)=1, since y is a factor of the product of the denominators in m. Thus $p(m) \in \mathcal{F}_N$, and the proof is completed.
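To make the trade-off between M, d, and t concrete, the sketch below computes the largest message bound M for a given N, assuming the sufficient condition takes the form $t \cdot M^{dt} \le N$ (the bound that the proof of Proposition 7 points to). That form, and the helper names `iroot` and `max_message_bound`, are assumptions made for illustration.

```python
def iroot(n, k):
    """Largest integer m with m**k <= n, for n >= 0 and k >= 1."""
    lo, hi = 0, 1
    while hi ** k <= n:
        hi *= 2
    # Invariant: lo**k <= n < hi**k
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** k <= n:
            lo = mid
        else:
            hi = mid
    return lo

def max_message_bound(N, d, t):
    """Largest M with t * M**(d*t) <= N (assumed form of the condition)."""
    return iroot(N // t, d * t)

N = 125261                      # the N obtained for the modulus 3^22
print(max_message_bound(N, 2, 1))
print(max_message_bound(N, 1, 2))
```

With a very large N, such as the roughly 18850-bit value discussed later for IDGHV parameters, the same function shows how many multiplications of 64-bit fractions fit before the bound is exceeded.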
4 PIE with a Batch FHE over Integers
Batch FHE. Let λ be the security parameter, γ and η be the bit-length of the public and secret key respectively, and ρ be the bit-length of noise. Further, choose $\ell_Q$-bit integers $Q_1, \dots, Q_\ell$. The IDGHV scheme is defined as follows.
IDGHV.KGen(1λ, . Choose distinct η-bit primes p1, . . . , , and let π be their product. Choose a uniform 2λ
Let pk={x0, , (xi)1≤i≤τ, } and sk=.
IDGHV.Enc(pk, m). For $m = (m_1, \dots, m_\ell) \in \mathbb{Z}/Q_1\mathbb{Z} \times \dots \times \mathbb{Z}/Q_\ell\mathbb{Z}$, choose a random binary vector $b = (b_1, \dots, b_\tau)$ and output the ciphertext:
The security of the IDGHV scheme is based on the decisional approximate GCD problem (DACD).
4.1 PIE with IDGHV.
Definition 6. Let C be an arithmetic circuit and ρ′=max{ρ+log()+, 2ρ+log(τ)}. C is a permitted circuit if every input being bounded in absolute value by implies the output is bounded in absolute value by 2η−4.
Describing circuits in terms of the multivariate polynomial they compute yields a sufficient condition for determining whether a given circuit is permitted.
Lemma 1. Let C be an arithmetic circuit over the rationals comprised of addition/subtraction and multiplication gates, f be the multivariate polynomial that C computes, and |f|1 be the 1 norm of f. If
then C is a permitted circuit.
One can show that for a circuit with multiplicative depth D, the total degree of the polynomial f computed by the circuit is at most 2D−1+1≈2D−1. Further, we note that maximum value of deg(f) is (roughly) inversely proportional to ||bits=, so the multiplicative depth of permitted circuits decreases as the bit size of the increases.
We assume here that log(|f|1)«η, ρ′, so it suffices to choose such that μ/(ρ′+) is not too small. To this end, suppose we want to support circuits computing a polynomial of degree at most δ. Then we choose <2+, =O(ρ), and η≥ρ′Θ(δ). In particular, we recommend:
First, we pause to remind the reader of the relevant parameter sizes for IDGHV.
For ciphertexts of the form c=(q, 1r1+m1, . . . , +), we have |pi|bits=η, ||bits=, and ρ′=max{ρ+log()+, 2ρ+log(τ)}.
In the following discussion, $g = \prod_{i=1}^{\ell} Q_i$, $N = \lfloor \sqrt{(g-1)/2} \rfloor$, and $\mathcal{G}_M$ is the message space, where M≤N.
Choosing circuits first. Given a set of circuits, we must choose d and t so that d,t contains the polynomials which the circuits in the set compute. To this end, choose d, t to satisfy lemma 1. That is,
We put t=1 for convenience and to maximize the multiplicative depth of permitted circuits, whence the permitted circuits are given by d,1 for d≈(η−4)/(ρ′+)−1. Rewriting eq. (9) to get a bound on |M|bits and using the above values of d, t we obtain:
Note that t may be chosen much larger, though too large a value may force M to be unreasonably small in order to satisfy eq. (9).
Choosing messages first. M must satisfy eq. (9). Thus, circuits which compute polynomials in d,t are permitted as long as
This inequality is satisfied by choosing:
Thus, we may choose:
Note that this will require the values of and to be quite large. E.g., M log(M)≤
Two Encoding Options. There are two ways to combine PIE with IDGHV: using the Chinese Remainder Theorem, and component-wise. The former encodes single rationals, while the latter encodes vectors of rationals. Depending on the application a user can choose one of these two. We elaborate them below.
We use the Chinese Remainder Theorem (CRT) to convert the integer output of PIE.Encode to a vector of integers which is the input to IDGHV. We encode and decode with IDGHV as the underlying encryption scheme as follows:
Encoding and decoding above are computed with Hg and its inverse.
Choosing M for CRT Encoding. M must be chosen according to eq. (12). That is,
In the component-wise encoding, for each i, PIE.Encode(hi) and PIE.Decode(hi) are computed with as the modulus, i.e., the encoding and decoding functions are HQ
Choosing the Mi for Component-wise Encoding. Since we are encoding with primes i instead of their product, it suffices here to make a minor change to eq. (12). Namely, we put =1. This yields:
Remark 3. $|M|_{bits} = 23$ simply means that the message space is composed of fractions whose numerators and denominators are up to 23 bits. Note that the co-primality restriction will not apply if M is smaller than every prime factor of $g = \prod_i Q_i$.
Choosing the $Q_i$ appropriately. We emphasize that PIE may be attached to IDGHV regardless of the choice of the $Q_i$. However, the input space $\mathcal{F}_N$ (of PIE) may be too small to be useful if the number and size of the $Q_i$ are too small. In contrast, note that the $Q_i$ can be small as long as there are “enough” of them. Similarly, if the number of $Q_i$ is small, then their product should be quite large. As an example of the former, if $Q_i = 3$ for i=1, . . . , 5, then the message space of IDGHV is (isomorphic to) $\mathbb{Z}/3^5\mathbb{Z}$. The encoding modulus for PIE is $3^5 = 243$, which is co-prime with 10, so we can encode certain decimal numbers up to precision 2 such as 1.37=137/100.
We can use parameters to determine the size of each element in the corresponding message space by coupling PIE with IDGHV. Let $Q_1, \dots, Q_\ell$ be distinct primes (public key elements in IDGHV). For encoding a single message, we take the product of all $Q_i$'s as g and encode the rational message using g. Four different configurations are provided: Toy, Small, Medium, and Large. In the Medium configuration, we have 138 56-bit $Q_i$'s. This gives us a g of roughly 7728 bits with an N of roughly 3864 bits. In the Large configuration, we have 531 71-bit $Q_i$'s. This gives us a g of roughly 37701 bits with an N of roughly 18850 bits.
A large N resulting from (secure) HE parameters is very advantageous. For example, if we take $N \approx 2^{18850}$ and $M = 2^{64} - 1$ (which allows fractions with numerators and denominators of up to 64 bits to be encoded), then we can use eq. (9) to find sets of polynomials $\mathcal{P}_{d,t}$ with which PIE is compatible. In this case, we get compatibility with polynomials in 2
5 PIE with Modified Fan-Vercauteren HE
The modified FV scheme. We give a brief description of a modification of the FV HE scheme that is based on the decisional Ring Learning With Errors (RLWE) problem. The main difference between the modified FV (ModFV) and FV is that the former encrypts integers while the latter encrypts polynomials. In particular, ModFV is obtained from FV by attaching the Hat Encoder. We recall the encoder here.
Definition 7 (Hat Encoder). Let $\|\cdot\|_\infty$ denote the polynomial infinity norm. For $m \in \mathbb{Z}/(b^n+1)\mathbb{Z}$, b≥2 and n≥1, let $\hat{m}$ be the polynomial with lowest degree such that $\|\hat{m}\|_\infty \le (b+1)/2$ and $\hat{m}(b) = m \bmod (b^n+1)$. Such a polynomial always exists and has degree at most n−1.
Roughly speaking, the Hat encoder takes the base-b expansion of m with coefficients in Zb
We are now ready to define ModFV. For n a power of 2 (typically at least 1024), denote the 2nth cyclotomic ring of integers by R=[x]/(xn+1), and let Ra denote the ring obtained by reducing the coefficients of R modulo a. The plaintext space is the ring =b
5.1 PIE with ModFV
We stress that although CLPX uses a function having the same definition as our “H-function”, their approach is not based on techniques from p-adic number theory. Consequently, the decode functions and input spaces differ dramatically between CLPX and PIE. A comparison of the input spaces is provided in section 5.2.
In pairing PIE with ModFV, we distinguish two cases: $b^n+1$ prime and $b^n+1$ composite. We note, however, that the definitions of encoding and decoding are identical for both cases. The differences lie in how b and n are chosen, and the resulting input spaces.
Put $N = \lfloor \sqrt{((b^n+1)-1)/2} \rfloor = \lfloor \sqrt{b^n/2} \rfloor$ and let the message space $\mathcal{G}_M$ be as in Eq. (8).
That is, $\mathcal{G}_M$ is the set of reduced fractions x/y satisfying: |x|≤M, 1≤|y|≤M, and $\gcd(b^n+1, y) = 1$. M is chosen to be much smaller than N according to eq. (9) and eq. (14). We define encoding as follows:
bn+1 prime. Note that since bn+1 is prime, the function Hb
Choosing b and n for bn+1 a prime. As one might suspect, there are rather few choices for b and n which make bn+1 prime. The known Fermat primes (primes of the form 22
bn+1 composite. For a composite bn+1, the mapping Hb
Proposition 9. If g is a positive integer and $x/y \in \mathcal{F}_N$, then $H_g(x/y) = x y^{-1} \bmod g$.
Proof. This is immediate if g is prime, so suppose g is composite with prime factorization
and h=Hg(x/y). By definition 5,
By the definition of the CRT, h is the unique integer in g such that h=hi mod pir
As noted above, $b^n+1$ may be large enough to make factoring infeasible. In this case, determining the entire input space is also infeasible, because one must enforce the condition: $\gcd(y, b^n+1) \ne 1 \Rightarrow x/y \notin \mathcal{F}_N$. This is not a problem however, as we only need a suitable subset of $\mathcal{F}_N$; namely the message space $\mathcal{G}_M$. We note that if y and b have the same prime factors, then $\gcd(y, b^n+1) = 1$, whence we can encode x/y as long as every prime factor of y is a factor of b. For example, we may choose $b = p_1 p_2 \cdots p_k$, the product of the first k primes for some k≥1, meaning we can encode all $x/y \in \mathcal{G}_M$ such that any prime factor of y is one of $p_1, \dots, p_k$. This approach can certainly give us a sufficiently large set of fractions as the message space of PIE, though this set may not be the entirety of $\mathcal{G}_M$.
We further distinguish the case where $b=p$ is prime, for this allows us to encode certain $p$-adic non-integers ($p$-adic numbers with negative valuation). In particular, since $p$ and $p^n+1$ are always co-prime, we can encode rationals of the form $x/p^k$ ($k>0$) that are contained in $\mathcal{F}_N$.
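As a small worked instance of this case, the sketch below encodes 5/9 = 5/3^2 modulo 3^6 + 1 = 730. The parameters p = 3, n = 6 and the function name `encode` are assumptions chosen for illustration. The encoding rule used is the one stated in Proposition 9.

```python
def encode(x, y, g):
    """H_g(x/y) = x * y^{-1} mod g; requires gcd(y, g) = 1."""
    return (x * pow(y, -1, g)) % g


# Hypothetical parameters: p = 3, n = 6, so the modulus is p^n + 1 = 730.
p, n = 3, 6
g = p**n + 1

# 5/9 has negative 3-adic valuation, yet 9 is invertible modulo 730
# because 3 and 3^6 + 1 share no common factor.
h = encode(5, 9, g)

# Sanity check: the encoding h satisfies 9 * h = 5 (mod g).
roundtrip_ok = (9 * h - 5) % g == 0
```

Here `h` evaluates to 325, since 9 · 325 = 2925 = 4 · 730 + 5.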
Compatible Circuits. The performance of ModFV is assessed by evaluating so-called regular (arithmetic) circuits. We directly apply the bounds from their analysis of such circuits to our encoder for FV. A regular circuit is parameterized by non-negative integers A, D, L, and consists of evaluating A levels of additions followed by one level of multiplication, iterated D times, where inputs are integers from [−L, L]. Note that such a circuit has multiplicative depth D. The output c of a regular circuit satisfies:
We define permitted circuits in essentially the same way as Section 4.1.
Definition 8. For fixed A, D, L, an arithmetic circuit C is an (A, D, L)-permitted circuit if every input being bounded in absolute value by L implies that the output is bounded in absolute value by V(A, D, L).
Eq. 13 implies every regular circuit parameterized by A, D, L is an (A, D, L)-permitted circuit. When the context is clear, we will omit “(A, D, L)” and simply write “permitted circuit”.
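To make the structure of a regular circuit concrete, the sketch below tracks the worst-case output magnitude through the A-additions-then-one-multiplication pattern described above. It assumes fan-in-2 gates, so each addition level at most doubles the magnitude and each multiplication level squares it. The closed form is derived here from that recurrence and is only presumed to play the role of the bound V(A, D, L); both function names are hypothetical.

```python
def regular_circuit_bound(A, D, L):
    """Worst-case output magnitude of a regular circuit with inputs in [-L, L].

    Assumes pairwise (fan-in 2) additions and multiplications.
    """
    v = L
    for _ in range(D):
        v *= 2**A   # A levels of pairwise additions: magnitude at most doubles each level
        v *= v      # one level of multiplication: magnitude is squared
    return v


def closed_form(A, D, L):
    """Closed form of the recurrence above: L^(2^D) * 2^(2A(2^D - 1))."""
    return L ** (2**D) * 2 ** (2 * A * (2**D - 1))
```

For example, with A = 1, D = 1, L = 2 both functions give (2 · 2)^2 = 16.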
Lemma 2. Fix non-negative integers A, D, L. Let C be an arithmetic circuit, f be the multivariate polynomial that C computes, $\|f\|_1$ be the $\ell_1$ norm of f, and V=V(A, D, L). If $\|f\|_1 L^{\deg(f)}<V$ or, equivalently,
then C is a permitted circuit.
Proof. Let C be an arithmetic circuit, and f be the k-variate polynomial which C computes. We can express f as a finite sum $\sum_i c_i f_i$, where the $f_i$ are monomials and the $c_i$ are the coefficients.
For $x\in[-L, L]^k$ and $\mathbf{L}=(L, L, \ldots, L)\in\{L\}^k$, we use the triangle inequality and $\deg(f_i)\le\deg(f)$ to obtain:
The above inequalities yield |f(x)|≤ V, completing the proof.
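The inequality behind Lemma 2 can be spot-checked by brute force. In the sketch below a polynomial is represented as a dictionary from exponent tuples to integer coefficients. This representation, the helper names, and the sample polynomial are all assumptions made for illustration, and the check presumes L ≥ 1.

```python
from itertools import product


def l1_norm(poly):
    """Sum of the absolute values of the coefficients."""
    return sum(abs(c) for c in poly.values())


def degree(poly):
    """Total degree: the largest exponent sum over all monomials."""
    return max(sum(exps) for exps in poly)


def evaluate(poly, point):
    """Evaluate the polynomial at an integer point."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for xi, e in zip(point, exps):
            term *= xi**e
        total += term
    return total


# Hypothetical polynomial f(x1, x2) = 3*x1^2*x2 - 2*x2 + 5.
f = {(2, 1): 3, (0, 1): -2, (0, 0): 5}
L = 4

# Bound from the lemma: ||f||_1 * L^deg(f) = 10 * 4^3.
bound = l1_norm(f) * L ** degree(f)

# Largest |f(x)| over every integer point of [-L, L]^2.
worst = max(abs(evaluate(f, pt)) for pt in product(range(-L, L + 1), repeat=2))
```

Here `bound` is 640 while the true maximum `worst` is 189, so the bound holds with considerable slack for this polynomial.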
To guarantee that PIE works seamlessly with ModFV, we must ensure that the maximum degree of polynomials compatible with ModFV does not exceed the maximum degree of polynomials compatible with PIE. Thus, according to lemma 2 and equation 9, we require:
where f computes an (A, D, L)-permitted circuit, and d,t is the set of polynomials with which PIE is compatible. In practice, this inequality is easily satisfied because log(N)/log(M) is quite large and t is chosen to be small.
CLPX adapts the polynomial encoding idea from previous works while addressing the problem of plaintext polynomial coefficient growth. As explained above, to obtain the maximum circuit depth (corresponding to homomorphic computation) for PIE with ModFV, we can directly use their analysis. Table 2 shows that, when ModFV is used with the PIE scheme, the multiplicative depths of compatible circuits are almost the same as when it is used with the CLPX encoding.
The definition of the CLPX input space depends on whether $b>2$ is even or odd. If $b$ is odd, then $b^n+1$ is even, which means no fractions with even denominators can be encoded, and, moreover, $b^n+1$ will not be prime. We consider the odd case to be too restrictive, and, therefore, only compare the input space of PIE with the input space of CLPX when $b$ is even.
Proposition 10. For b even, the cardinality of the input space is
By proposition 5 and eq. (7), when bn+1 is prime, the cardinality of N is approximately 0.6(bn+1). Consequently, using proposition 10, we see the cardinality of N is roughly 0.6(b−1)-times (since bn is quite large,
the size of . Thus, our input space is larger when b≥3, and our size advantage is directly proportional to the size of b, as shown in table 3.
For $b^n+1$ composite, our size advantage seems to remain, though it is less clear-cut than in the prime case, since our examples use quite small $b$ and $n$. In table 4, we estimate the size of $\mathcal{F}_N$ by using proposition 5 and the approximation $\Phi(n)\approx 3n^2/\pi^2$. Note that, in practice, $b$ and $n$ will be much larger than the numbers provided in the table, and we cannot speculate as to how the relationship between $|\mathcal{F}_N|$ and the size of the CLPX input space varies as $b$ and $n$ become large enough for practical applications.
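The quality of the approximation can be illustrated with a brute-force computation. The sketch assumes that Φ denotes the totient summatory function, Φ(n) = φ(1) + ... + φ(n), which is the standard function with asymptotic 3n²/π². The function names are hypothetical.

```python
from math import gcd, pi


def totient_summatory(n):
    """Phi(n): sum of Euler's totient phi(k) for k = 1..n, by brute force."""
    return sum(
        sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)
        for k in range(1, n + 1)
    )


def approx(n):
    """The approximation 3 n^2 / pi^2."""
    return 3 * n * n / pi**2


exact_21 = totient_summatory(21)                    # exact value for n = 21
ratio_200 = totient_summatory(200) / approx(200)    # exact / approximate for n = 200
```

For n = 21 the exact value is 140, against an approximation of about 134. The relative error shrinks as n grows.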
We implemented PIE (in C++) together with proof-of-concept implementations of IDGHV and ModFV schemes (the FHE part of our implementation is not optimized) using NTL.
Since our encoding does not affect the run time of the underlying HE scheme, we provide benchmark times taken for encoding and decoding only. We estimated the runtime of encoding and decoding using two sets, each containing 10,000 rational numbers. The first set contains rationals with numerator and denominator up to 32 bits and the second set contains rationals with numerator and denominator up to 64 bits. These sets are simply the message space $\mathcal{M}=\{x/y : |x|\le M,\ 0<y\le M\}$ for $M=2^{32}-1$ and $M=2^{64}-1$, respectively. Runtimes are obtained as the average runtime over all the elements in each set. The results are shown in table 5. All experiments are done on a MacBook Pro with Apple M1 Max, 32 GB RAM, 1 TB SSD.
Our implementation of encoding and decoding is not optimized for performance. We have used NTL for computing inverse in the encoding function. For the MEEA in decoding, we implemented the (truncated) extended Euclidean algorithm.
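For orientation, the two operations described here (a modular inverse for encoding, a truncated extended Euclidean algorithm for decoding) can be sketched in a few lines. This is not the NTL-based C++ implementation referred to above. It is a minimal Python illustration under the definitions used in this description, and the names `farey_bound`, `encode`, and `decode` are assumptions.

```python
from fractions import Fraction
from math import isqrt


def farey_bound(g):
    """N = floor(sqrt((g - 1) / 2))."""
    return isqrt((g - 1) // 2)


def encode(frac, g):
    """H_g(x/y) = x * y^{-1} mod g; requires gcd(y, g) = 1."""
    return (frac.numerator * pow(frac.denominator, -1, g)) % g


def decode(h, g):
    """Truncated extended Euclidean algorithm on (g, h).

    Stops at the first remainder that is <= N and returns remainder / cofactor.
    """
    N = farey_bound(g)
    r0, r1 = g, h % g
    t0, t1 = 0, 1
    while r1 > N:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    # Invariant: r1 = t1 * h (mod g), so h encodes r1 / t1.
    return Fraction(r1, t1)  # Fraction moves any negative sign to the numerator
```

With g = 11^3 = 1331, `encode(Fraction(2, 3), 1331)` gives 888 and `decode(888, 1331)` recovers 2/3. Likewise 443 decodes to −2/3.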
A Supplemental: Encodings with Primes and Prime Powers
Assume we want to encode the following fractions:
Let $p=11$ and $r=3$, so $p^r=1331$ and $N=\lfloor\sqrt{(p^r-1)/2}\rfloor=25$. Since the above fractions lie in $\mathcal{F}_{25}$, we can encode them as follows:
Due to the restriction $\gcd(\text{denominator}, p^r)=1$, many fractions $x/y$ which satisfy $|x|, |y|\le N$ cannot be encoded. E.g., when $p^r=11^3$, 23/22 cannot be encoded. Of course, this is because the mapping Hp
Let S be a set of fractions such that:
One can choose a prime that is sufficient for encoding and decoding all fractions by simply checking the largest numerator or denominator in absolute value and set it as the value of b and then find the right prime p such that:
The largest quantity in S is 61, so we set b=61, which means we need a prime p that satisfies:
The smallest prime to satisfy the above inequality is 7451, which gives $N=\lfloor\sqrt{(7451-1)/2}\rfloor=61$. That allows us to encode all fractions in S. We emphasize that this process works for any finite set of rationals.
Equivalently, one could choose a small prime which is co-prime with all of the denominators, and then choose an exponent $r$ large enough to allow the fractions to be encoded. For example, $p=3$ is co-prime with all denominators in S, which means we must choose $r$ large enough so that $3^r\ge 2(61)^2+1=7443$. That is,
So $p^r=3^9$ also suffices to encode the members of S.
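Both parameter choices can be reproduced with a short search. The sketch assumes the threshold 2b² + 1 that the text states for the prime-power case also governs the choice of a single prime. The helper names are hypothetical and trial division is used only because the numbers are small.

```python
def is_prime(m):
    """Trial-division primality test (adequate for small m)."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def smallest_prime_at_least(m):
    """Smallest prime p with p >= m."""
    while not is_prime(m):
        m += 1
    return m


def smallest_exponent(p, m):
    """Smallest r >= 1 with p^r >= m."""
    r = 1
    while p**r < m:
        r += 1
    return r


b = 61                                   # largest numerator or denominator in S
threshold = 2 * b * b + 1                # 2 * 61^2 + 1 = 7443
p = smallest_prime_at_least(threshold)   # option 1: a single prime
r = smallest_exponent(3, threshold)      # option 2: a power of the small prime 3
```

The search returns p = 7451 and r = 9, matching the values above.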
However, can we actually do something with it? If we hope to compute over the image of S, we need to choose a prime (power) that allows “room” for including the outputs of the operations we expect to work with. Instead of choosing a prime from strict parameters, a more conservative approach could be to consider the bit length of the largest numerator or denominator and the function one wishes to compute. If this time we let $b$ be the bit-length of the largest numerator or denominator in absolute value and the function be $f(x_1, x_2, \ldots, x_n)=x_1x_2\cdots x_n$, then we need a prime that satisfies the following inequality:
Say that we have n=5. Since 61 is a 6-bit number, we set b=6. We now need a prime such that:
We choose p=3693628617552068003, a 62-bit prime which gives us the following encodings of the members of S:
and we can check that
which decodes to
and matches
This example shows the intuition behind Proposition 7 and Definition 8.
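The same idea can be exercised end to end on a product of five fractions. The sketch below does not reuse the members of S or the 62-bit prime given above. The five inputs are hypothetical, and the modulus is the Mersenne prime 2^61 − 1, chosen here because its bound N = 2^30 − 1 exceeds 61^5, the largest numerator or denominator such a product can reach. The function names are assumptions.

```python
from fractions import Fraction
from math import isqrt, prod


def encode(frac, g):
    """H_g(x/y) = x * y^{-1} mod g; requires gcd(y, g) = 1."""
    return (frac.numerator * pow(frac.denominator, -1, g)) % g


def decode(h, g):
    """Truncated extended Euclidean algorithm; stops once the remainder <= N."""
    N = isqrt((g - 1) // 2)
    r0, r1 = g, h % g
    t0, t1 = 0, 1
    while r1 > N:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    return Fraction(r1, t1)


g = 2**61 - 1  # Mersenne prime used as an illustrative modulus

# Hypothetical inputs with numerators and denominators of at most 6 bits.
inputs = [Fraction(3, 7), Fraction(-5, 61), Fraction(11, 13),
          Fraction(2, 9), Fraction(61, 4)]

# Multiply the encodings modulo g, never touching the fractions themselves.
encoded_product = 1
for fr in inputs:
    encoded_product = (encoded_product * encode(fr, g)) % g

expected = prod(inputs)                 # exact rational product for comparison
recovered = decode(encoded_product, g)  # decoding the integer product
```

Decoding the product of the encodings yields the product of the fractions, here −55/546, because the result still lies within the bound N.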
Extending the set $\mathcal{F}_N$. While the Farey rationals $\mathcal{F}_N$ have a very simple description and are easy to work with, they have a downside: their size. For example, if $p=907$, then $N=21$ and the cardinality of $\mathcal{F}_N$ is 559. This means that $907-559=348$ integers in $\mathbb{Z}_{907}$ do not have a pre-image (under $H_{907}^{-1}$) in $\mathcal{F}_N$. We address this by extending $\mathcal{F}_N$ to a set $\mathcal{F}_{N,g}$.
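The count in this example can be reproduced by direct enumeration. The sketch assumes the Farey rationals for modulus g are the reduced fractions x/y with |x| ≤ N, 1 ≤ y ≤ N and gcd(y, g) = 1, which is consistent with the constraints stated earlier. The function name is hypothetical.

```python
from math import gcd, isqrt


def farey_rationals(g):
    """Reduced fractions x/y with |x| <= N, 1 <= y <= N, gcd(y, g) = 1."""
    N = isqrt((g - 1) // 2)
    return {
        (x, y)
        for x in range(-N, N + 1)
        for y in range(1, N + 1)
        if gcd(x, y) == 1 and gcd(y, g) == 1   # gcd(0, y) = y, so 0 appears only as 0/1
    }


p = 907
count = len(farey_rationals(p))   # size of the Farey set for N = 21
unused = p - count                # residues with no Farey pre-image
```

The enumeration gives `count == 559` and `unused == 348`, in agreement with the figures above.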
Definition 9 (Extended Farey Rationals). For a positive integer g, the extended Farey rationals are defined as the set of reduced fractions:
Clearly $\mathcal{F}_N\subseteq\mathcal{F}_{N,g}$. We also note that for all $m\in\mathcal{F}_{N,g}$, $H_g^{-1}(H_g(m))=m$ (generalize the proof of Proposition 1(i)). The following proposition provides a necessary, though not sufficient, condition for a rational number to be in $\mathcal{F}_{N,g}$.
Proposition 11. Let $g$ be a positive integer, and $N=\lfloor\sqrt{(g-1)/2}\rfloor$. If $x/y\in\mathcal{F}_{N,g}$, then $|x|\le N$ and $|y|\le 2N+1$.
Proof. Let $h\in\mathbb{Z}_g$, and suppose $H_g^{-1}(h)=x/y$. By definition of MEEA, $x/y=x_i/y_i$ for some $x_i$, $y_i$ computed by the EEA. That $|x|\le N$ is immediate from the definition of $H_g^{-1}$ (i.e., the stopping condition in MEEA). The outputs of the EEA satisfy:
for all k.
By definition, $x_{i-1}>N$. Whence, for $N'=\sqrt{(g-1)/2}$,
It follows that
completing the proof.
This proposition simplifies the process of deciding whether a given reduced rational number $x/y$ is in $\mathcal{F}_{N,g}$:
if and only if
Two Options for the Message Space. For a fixed positive integer g, we now have two sets of rationals which can serve as the domain of the encoder:
The advantage of $\mathcal{F}_N$ is its simplicity. $\mathcal{F}_{N,g}$, on the other hand, is larger than $\mathcal{F}_N$ and, when $g$ is prime, has exactly $g$ elements.
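The trade-off between the two sets can be observed for g = 907. Because the defining formula of the extended set is not reproduced here, the sketch assumes, following the proof of Proposition 11, that the extended Farey rationals are exactly the outputs of the decoder over all residues 0, ..., g − 1. The name `decode_pair` is hypothetical.

```python
from fractions import Fraction
from math import isqrt


def decode_pair(h, g):
    """Truncated extended Euclidean algorithm; returns (x, y) with y > 0."""
    N = isqrt((g - 1) // 2)
    r0, r1 = g, h % g
    t0, t1 = 0, 1
    while r1 > N:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    if t1 < 0:                 # normalize the sign onto the numerator
        r1, t1 = -r1, -t1
    return r1, t1


g = 907
N = isqrt((g - 1) // 2)        # 21
pairs = [decode_pair(h, g) for h in range(g)]

# Every residue decodes to a distinct rational, so the set has g elements.
distinct = {Fraction(x, y) for x, y in pairs}

# Bounds from Proposition 11: |x| <= N and |y| <= 2N + 1.
within_bounds = all(abs(x) <= N and y <= 2 * N + 1 for x, y in pairs)

# Re-encoding each decoded fraction returns the original residue.
roundtrip = all((x * pow(y, -1, g)) % g == h for h, (x, y) in enumerate(pairs))

# Decoded fractions whose denominator also stays within N.
small_denominator = sum(1 for _, y in pairs if y <= N)
```

For this modulus all 907 residues decode to distinct fractions within the stated bounds, and 559 of them have a denominator of at most N, which matches the size of the smaller set.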
In the embodiment shown in
The encrypted data 110 starts at the source computing device as one or more rational numbers (e.g., of the form x/y). An embodiment at the source 102 encodes the rational numbers into corresponding integers as a function of p-adic arithmetic performed on the rational numbers. The p-adic generated integers have homomorphic properties due to the p-adic arithmetic operations and are compatible with a Homomorphic Encryption (HE) system, including existing Fully Homomorphic Encryption (FHE) systems such as AGCD based systems like IDGHV scheme based systems and/or RWLE based systems like ModFV scheme based systems. An embodiment delivers the p-adic generated integer(s) to the Encryption portion of the FHE system 112 running on the source device for encryption of the integer data. The original or result ciphertext(s) 110 received at the destination device 106 is decrypted by the Decryption portion of the FHE system 114 running on the destination computing device 106 into a p-adic compatible integer(s). An embodiment on the destination device 106 decodes the decrypted integer(s) into corresponding rational numbers to obtain the ultimate desired values.
Generally, communications, including concealed/encrypted communications, are bi-directional such that the source 102, intermediary 104, and destination 106 computing devices may change roles as the encrypted data 110 source 102, intermediary 104, and the encrypted data 110 destination 106 as is necessary to accommodate the transfer of data back and forth between the computing devices 102, 104, 106. Notably, the intermediary computing device 104 does not require knowledge of the secret keys to perform the homomorphic arithmetic operations, so it is likely that the intermediary computing device 104 will be at least computationally isolated from the source 102 and destination 106 computing devices. Additionally, while the computing devices 102, 104, 106 are depicted as separate devices in
Further, as shown in
Various embodiments may implement the network/bus communications channel 108 using any communications channel 108 capable of transferring electronic data between the source 102, intermediary 104, and destination 106 computing devices. For instance, the network/bus communication connection 108 may be an Internet connection routed over one or more different communications channels during transmission between the source 102, intermediary 104, and destination 106 devices. Likewise, the network/bus communication connection 108 may be an internal communications bus of a computing device, or even the internal bus of a processing or memory storage Integrated Circuit (IC) chip, such as a memory chip or a Central Processing Unit (CPU) chip. The network/bus communication channel 108 may utilize any medium capable of transmitting electronic data communications, including, but not limited to: wired communications, wireless electro-magnetic communications, fiber-optic cable communications, light/laser communications, sonic/sound communications, etc., and any combination thereof of the various communication channels.
The various embodiments may provide the control and management functions detailed herein via an application operating on the source 102, intermediary 104, and/or destination 106 computing devices. The source 102, intermediary 104, and/or destination 106 computing devices may each be a computer or computer system, or any other electronic devices capable of performing the communications and computations of an embodiment. The source 102, intermediary 104, and/or destination 106 devices may include, but are not limited to: a general-purpose computer, a laptop/portable computer, a tablet device, a smart phone, an industrial control computer, a data storage system controller, a CPU, a Graphical Processing Unit (GPU), an Application Specific Integrated Circuit (ASIC), and/or a Field Programmable Gate Array (FPGA). Notably, the first 102, second 104, and/or third 106 computing devices may be the storage controller of a data storage media (e.g., the controller for a hard disk drive) such that data delivered to/from the data storage media is always encrypted so as to limit the ability of an attacker to ever have access to unencrypted data. Embodiments may be provided as a computer program product which may include a computer-readable, or machine-readable, medium having stored thereon instructions which may be used to program/operate a computer (or other electronic devices) or computer system to perform a process or processes in accordance with the various embodiments. 
The computer-readable medium may include, but is not limited to, hard disk drives, floppy diskettes, optical disks, Compact Disc Read-Only Memories (CD-ROMs), Digital Versatile Disc ROMS (DVD-ROMs), Universal Serial Bus (USB) memory sticks, magneto-optical disks, ROMs, random access memories (RAMs), Erasable Programmable ROMs (EPROMs), Electrically Erasable Programmable ROMs (EEPROMs), magnetic optical cards, flash memory, or other types of media/machine-readable medium suitable for storing electronic instructions. The computer program instructions may reside and operate on a single computer/electronic device or various portions may be spread over multiple computers/devices that comprise a computer system. Moreover, embodiments may also be downloaded as a computer program product, wherein the program may be transferred from a remote computer to a requesting computer by way of data signals embodied in a carrier wave or other propagation medium via a communication link (e.g., a modem or network connection, including both wired/cabled and wireless connections).
The processes 214-216 of the intermediary computing device 204 are not necessary if it is not desired to perform homomorphic calculations with at least one additional ciphertext to obtain a result ciphertext, in which case the original at least one ciphertext may simply be sent to the destination computing device 206 for decryption. Assuming homomorphic calculation operations are desired, at process 214, the intermediary computing device 204 homomorphically computes at least one arithmetic function with the at least one ciphertext and at least one additional ciphertext in accord with the FHE system to obtain at least one result ciphertext. The potential arithmetic functions are one or more of addition, subtraction, and multiplication. Notably, the intermediary computing device 204 does not have the knowledge needed to decrypt any ciphertext, meaning the arithmetic functions performed in process 214 at the intermediary computing device are performed homomorphically on encrypted data. At process 216, the intermediary computing device 204 sends the at least one result ciphertext to the destination computing device 206.
At process 218, the destination computing device 206 decrypts the at least one ciphertext or the at least one result ciphertext into at least one unencrypted p-adic compatible integer value in accord with the FHE system running on the destination computer 206. At process 220, the destination computing device 206 decodes the integer/result integer into at least one corresponding rational number (e.g., x/y)/result rational number (e.g., xr/yr) using inverse p-adic arithmetic.
The FHE system running on the source device 202 and the destination device 206 should be of the same type. The various p-adic rational number to integer encoding/decoding embodiments are compatible with both AGCD based FHE systems like IDGHV scheme based systems and/or RWLE based FHE systems like ModFV scheme based systems. For the AGCD based FHE system (e.g., IDGHV), the at least one rational number may be a single rational number. In the case of the single rational number for the AGCD based FHE system, the p-adic encoding of the various embodiments further encodes the single rational number as a function of the Chinese Remainder Theorem (CRT) algorithms. For the AGCD based FHE system, when the at least one rational number is a multivariate vector of rational numbers, the p-adic encoding of the at least one rational number is performed component-wise on the multivariate vector of rational numbers. For the RWLE based FHE system (e.g., ModFV), a mapping parameter $b^n+1$ of the p-adic arithmetic has number base $b$ and power $n$ chosen such that the mapping parameter $b^n+1$ is prime. Alternatively, for the RWLE based FHE system, the mapping parameter $b^n+1$ of the p-adic arithmetic may have number base $b$ and power $n$ chosen such that the mapping parameter $b^n+1$ is not prime but has co-prime factors, in which case the mapping of the p-adic arithmetic is also defined by the Chinese Remainder Theorem (CRT) algorithms.
Additionally, while the flow charts and flow chart details described above with respect to
The foregoing description of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and other modifications and variations may be possible in light of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application to thereby enable others skilled in the art to best utilize the invention in various embodiments and various modifications as are suited to the particular use contemplated.
This application is based upon and claims the benefit of U.S. provisional application Ser. No. 63/386,700, filed Dec. 9, 2022, entitled “PIE p-adic Encoding for High-Precision Arithmetic in Homomorphic Encryption,” all of which is also specifically incorporated herein by reference for all that it discloses and teaches.
Number | Date | Country
---|---|---
63386700 | Dec 2022 | US