The invention generally relates to an encryption method of binary data and its application to secure calculation of Hamming distances between two data.
The invention applies especially to the field of biometric identification or authentication.
Many techniques of biometric identification or authentication are already known. In general, they are executed jointly by a control server of an individual or an object, which can carry out acquisition of a biometric datum on the individual or the object, and by a management server of a base comprising N biometric data of the same kind.
The datum of the individual or of the object, acquired by the control server, is compared to all the data of the base so as to identify whether at least one datum of the base corresponds to the acquired datum, and identify the individual or the object as an individual or an object indexed in the base.
For this to happen, it is usual to calculate the Hamming distance between the datum of the individual and one or more data of the base, that is, the number of bits different from one datum to the other. This number can conventionally be calculated by performing the “exclusive OR” operation (known under the acronym XOR) between the two data, then by counting the Hamming weight, that is, the number of bits at 1 of the result obtained.
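The unsecured calculation described above can be sketched as follows. This is a minimal illustration on plain (non-encrypted) data; the function names and the sample values are hypothetical and do not come from the description.

```python
def hamming_weight(bits: int) -> int:
    """Number of bits set to 1 in a non-negative integer."""
    return bin(bits).count("1")


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ (XOR, then weight)."""
    return hamming_weight(a ^ b)


# Hypothetical 8-bit data
a = 0b10110100
b = 0b10011101
print(hamming_distance(a, b))  # a XOR b = 0b00101001, so this prints 3
```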
A major problem in this context is ensuring the confidentiality of data used. Indeed, the database comprises private information which the control server must not be able to access, and inversely the management server must not obtain information on the individual, and especially must not have access to the biometric datum which is exploited.
To respond to this problem, secure calculation techniques have been developed which let servers perform calculations on encrypted data to obtain calculation results without decrypting the data or having access to them.
In particular, a data encryption and secure calculation technique on the encrypted data by this technique has been developed to perform the “exclusive OR” operation between two data.
This technique is described in the publication by S. Goldwasser and S. Micali, Probabilistic encryption and how to play mental poker keeping secret all partial information, in H. R. Lewis, B. B Simons, W. A. Burkhard, L. H. Landweber (eds.) STOC, pp. 365-377. ACM (1982).
The main drawback of this method is that it encrypts the data only bit by bit, which considerably prolongs the calculation time necessary for its execution.
There is therefore a need for development of a faster data encryption method enabling secure calculation of a Hamming distance.
The aim of the invention is to eliminate the insufficiencies of the prior art by proposing a method for data encryption and secure calculation of Hamming distance on whole data, and not bit by bit.
Another aim of the invention is to propose a method for secure identification or authentication of an individual.
In this respect, the aim of the invention is an encryption method of a binary datum characterized in that it comprises the steps consisting of:
Advantageously, but optionally, the encryption method according to the invention can further comprise at least one of the following characteristics:
The invention also proposes a decryption method of an encrypted datum obtained by application to a binary datum of the encryption method described previously, the decryption method comprising:
An application proposed by the invention is a method for secure calculation of the “exclusive or” operation between two binary encrypted data by carrying out the encryption method described hereinabove, comprising the steps consisting of:
Another application proposed by the invention is a method for secure calculation of a Hamming distance between two binary encrypted data by carrying out the encryption method described hereinabove, the method comprising the steps consisting of:
Advantageously, but optionally, the method for secure calculation of a Hamming distance proposed by the invention can further comprise at least one of the following characteristics:
The invention also proposes a method for authentication or identification of an individual, comprising the comparison of a binary acquired datum on the individual with one or more reference binary data acquired on indexed individuals, each comparison comprising calculating the Hamming distance between the datum of the individual and a datum of the base, said calculation being performed by carrying out the method for secure calculation of a Hamming distance described hereinabove.
Advantageously, but optionally, in the method of authentication or identification of an individual, the datum of the individual and the datum or the data of the base are biometric data obtained by encoding the same biometric trait on the individual and the indexed individual(s).
The invention finally proposes a system of identification or authentication of an individual, comprising at least one control server of an individual to be identified or authenticated, and at least one management server of a reference database of indexed individuals, the control server being adapted to perform acquisition of a biometric binary datum of an individual, the control server and the management server being adapted to:
Other characteristics, aims and advantages of the present invention will emerge from the following detailed description with respect to the appended figures given by way of nonlimiting examples and in which:
FIGS. 3a and 3b show two variant embodiments of the calculation of a Hamming distance between two data.
Context and Formalism
In what follows, operations are performed on binary data, that is, calculations are made in base 2. In particular, the nullity of a value means its nullity in base 2, that is, the value must be congruent to 0 modulo 2.
The following definition is also used hereinbelow: a d-sparse matrix, where d is an integer, is a matrix comprising d non-zero elements on each line, with the rest of the matrix comprising only 0.
Also, an encryption function is said to be homomorphic for an operation • if, given two encrypted data c1 and c2 obtained by said encryption respectively from data m1 and m2, it is possible to determine the cipher c3 of a datum m3=m1•m2 by knowing only the public key (and not the secret key) of the encryption employed.
In reference to
The encryption method is an asymmetrical encryption method, based on use of a public key pk accessible to everyone and enabling encryption of data, and a secret key sk accessible only to the recipient of the data, and necessary for performing data decryption.
The method therefore comprises a first step 100 for generating a public key pk and a secret key sk.
The public key pk is a d-sparse matrix $M \in \{0,1\}^{m \times n}$, that is, the matrix comprises m lines and n columns, m and n being integers, and it comprises on each line d elements equal to 1, the rest of the matrix comprising only 0. d is therefore less than n.
The secret key sk is a set of l indexed sets $(S_j)_{j=1,\ldots,l}$ such that, for any j between 1 and l, $j \in S_j$ and $\sum_{i \in S_j} M_i = 0$ (modulo 2), where $M_i$ denotes the i-th line of M.
Generation of the public key and of the secret key can be performed in different ways, whereof two preferred embodiments are described hereinbelow.
According to a first embodiment, this step 100 comprises generation 110 of l indexed matrices Hj selected uniformly from the matrices comprising q lines and n columns, and where each line of the matrix contains exactly three 1s and each column contains zero or two 1s.
During a step 120, a 3-sparse matrix M is generated comprising m lines and n columns, m being greater than q, the lines of M being selected according to a law of uniform distribution.
During a step 130, l indexed sets Sj are randomly generated, j between 1 and l, each comprising q integer elements between 1 and m, and such that for any j, $j \in S_j$ and $S_j \cap S_k = \emptyset$ for j≠k.
Next, during a step 140, for any j between 1 and l, the lines of M indexed by the elements of Sj are replaced by the lines of Hj.
The public key pk is therefore M, and the private key sk is the set of the Sj: $sk = \{S_j\}_{j \in \{1,\ldots,l\}}$.
This method produces the characteristics of the public key and of the secret key described hereinabove, and especially the fact that each sum of the lines of the matrix M indexed by the elements of a Sj is zero.
In fact, for each j, q lines of M are replaced by the q lines of the corresponding matrix Hj. Now, each column of Hj comprises exactly 0 or 2 elements equal to 1. The summation of these lines is therefore zero (that is, congruent to 0 modulo 2).
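Steps 110 to 140 of this first embodiment can be sketched as follows. This is only an illustration under stated assumptions: indices are 0-based, each line is stored as the set of positions of its 1s, and each Hj is built as the incidence structure of a random 3-regular graph (lines are vertices, columns are edges), which is one convenient way to obtain three 1s per line and zero or two 1s per column and is not necessarily the selection law intended by the description. The names `cubic_incidence`, `keygen` and `xor_rows` and the parameter values are hypothetical.

```python
import random


def cubic_incidence(q, columns, rng):
    """Return q lines with three 1s each; every column in `columns` holds two 1s.

    Lines are vertices and columns are edges of a random simple 3-regular
    graph (stub pairing with rejection). Assumes q is even and q >= 4.
    """
    while True:
        stubs = [v for v in range(q) for _ in range(3)]
        rng.shuffle(stubs)
        edges = {frozenset(stubs[i:i + 2]) for i in range(0, 3 * q, 2)}
        # Reject self-loops (sets of size 1) and repeated edges (fewer distinct sets)
        if len(edges) == 3 * q // 2 and all(len(e) == 2 for e in edges):
            break
    rows = [set() for _ in range(q)]
    for col, edge in zip(columns, edges):
        for v in edge:
            rows[v].add(col)
    return rows


def keygen(m, n, l, q, seed=None):
    """Return (M, S): public 3-sparse matrix and secret index sets."""
    assert q % 2 == 0 and m - l >= l * (q - 1) and n >= 3 * q // 2
    rng = random.Random(seed)
    # Step 120: m random 3-sparse lines
    M = [set(rng.sample(range(n), 3)) for _ in range(m)]
    # Step 130: disjoint sets S_j of q line indices, with j in S_j
    others = list(range(l, m))
    rng.shuffle(others)
    S = [[j] + others[j * (q - 1):(j + 1) * (q - 1)] for j in range(l)]
    # Steps 110 and 140: plant the lines of H_j at the positions listed in S_j
    for Sj in S:
        cols = rng.sample(range(n), 3 * q // 2)
        for i, row in zip(Sj, cubic_incidence(q, cols, rng)):
            M[i] = row
    return M, S


def xor_rows(M, idx):
    """Sum modulo 2 of the lines of M indexed by idx (symmetric difference)."""
    acc = set()
    for i in idx:
        acc ^= M[i]
    return acc


M, S = keygen(m=40, n=30, l=4, q=6, seed=1)
print(all(not xor_rows(M, Sj) for Sj in S))  # every S_j sums to zero: prints True
```

The final check reproduces the argument above: within each Sj, every column is hit zero or two times, so the lines cancel modulo 2.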
Alternatively, the generation step 100 of the public key pk and of the private key sk comprises generation, during a step 110′, of l d-sparse indexed matrices Hj, j between 1 and l, d being an even integer greater than 3, and the elements of said matrices being selected according to a law of uniform distribution, each comprising q lines and q/3 columns, where q is strictly less than m.
During a step 120′, a d-sparse matrix M is generated comprising m lines and n columns.
During a step 130′, l indexed first sets $U_j \subset \{l+1, \ldots, m\}$ are randomly generated, j between 1 and l, each comprising q elements, and such that two separate sets Uj and Uk include no common element: $U_j \cap U_k = \emptyset$.
During a step 140′, l second sets Tj are randomly generated, j between 1 and l, of integers between 1 and n, such that each set Tj comprises q/3 elements.
Next, during a step 150′, elements of M are replaced by elements of each matrix Hj, j between 1 and l, as follows: Mu
During a step 160′, a line of M, indexed by an element $j_i$ of Uj, is identified which is the sum of the lines of M indexed by the elements of a subset Wj of Uj, and this line is permuted with the jth line of M. Such a line exists given the properties of the matrices and the sets generated during the preceding steps.
The public key pk obtained is the matrix M and the private key sk is the set $\{S_j = W_j \cup \{j\}\}_{j \in \{1,\ldots,l\}}$.
The fact that the sum of the lines of M indexed by the elements of the Sj is zero comes from the fact that the jth line of M is equal to the sum of the lines of Wj and that the additions are made in binary.
Following step 100 for generation of the public key and the private key, the encryption method comprises coding 200 of the binary datum c to obtain an encoded datum y.
Encoding is carried out by means of linear encoding for advantageously resolving the problem known as “wiretap channel”, disclosed and presented in the article by Wyner, A. D.: The wiretap channel, The Bell System Technical Journal 54(8), 1355-1387.
The problem addressed in this article is that of proposing a linear encoding which encodes a datum A into an encoded datum B such that, if B reaches a recipient via a noiseless line, that is, without undergoing modifications, the recipient can decode it to obtain the datum A.
However, if B travels via a noisy line, so that whoever receives it holds only a partial datum B, as is typically the case for an attack by a third party, it is impossible to decode it to obtain the datum A.
This type of encoding therefore ensures that partial knowledge of the encoded datum B does not make it possible to recover the datum A.
Coding verifying these properties is for example coding of the type called “coset coding”, also presented in the article.
Referring again to the encryption method, the coding step 200 of the binary datum c is advantageously performed by means of linear coset coding.
This type of encoding exploits a linear code C of parameters [n,k,d] with a control matrix H of dimensions (n-k)*k.
The encoding of a datum m is a datum x such that Htx=m. The operation m=Htx is performed to decode the encoded datum x.
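Coset coding can be illustrated by the following sketch. It assumes, for illustration only, a control matrix in systematic form H = [I | A] and the convention that decoding is the product H·x over GF(2); the toy matrix A (which makes H the parity-check matrix of the [7,4] Hamming code), the message length and the function names are hypothetical and are not taken from the description.

```python
import random

# Hypothetical toy parameters: 3 message bits hidden in 7 encoded bits.
# H = [I_3 | A]; its 7 columns are the 7 distinct non-zero vectors of 3 bits.
A = [[1, 1, 0, 1],
     [1, 0, 1, 1],
     [0, 1, 1, 1]]


def mat_vec(A, t):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, t)) % 2 for row in A]


def coset_encode(m, rng=random):
    """Return a random x such that H.x = m (a random element of the coset of m)."""
    t = [rng.randrange(2) for _ in range(len(A[0]))]     # free random part
    head = [mi ^ si for mi, si in zip(m, mat_vec(A, t))]  # forces H.x = m
    return head + t


def coset_decode(x):
    """Syndrome H.x with H = [I | A]."""
    r = len(A)
    return [h ^ s for h, s in zip(x[:r], mat_vec(A, x[r:]))]


m1, m2 = [1, 0, 1], [0, 1, 1]
x1, x2 = coset_encode(m1), coset_encode(m2)
x_sum = [a ^ b for a, b in zip(x1, x2)]
# Decoding is exact, and the XOR of two encodings decodes to the XOR of the messages
print(coset_decode(x1) == m1, coset_decode(x_sum) == [a ^ b for a, b in zip(m1, m2)])
```

Because the encoding is randomized, one message has many encodings, and because decoding is linear, the XOR of two encoded data decodes to the XOR of the two data.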
In the case of the encryption method described in reference to
During a step 300, an encrypted datum b is generated such that $b = M \cdot x + e + (y_1, \ldots, y_l, 0, \ldots, 0)$, where M is the public matrix, that is, the sparse matrix obtained at step 100, x is a randomly generated binary column vector of size n, e is a randomly generated binary noise vector of size m, the first l bits of the term $(y_1, \ldots, y_l, 0, \ldots, 0)$ are the elements of the encoding y of the datum c, and its last m−l bits are 0.
Advantageously, the elements of the noise vector e are Bernoulli variables, that is, they follow a Bernoulli law of parameter ε: each element of e takes the value 1 with probability ε. This is denoted $e \leftarrow_R \mathrm{Ber}_\varepsilon^m$.
ε is preferably a low value, of the order of $n^{-0.2}$. The role of this noise vector is to make recovering y from b difficult.
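Step 300 can be sketched as follows, with all operations taken modulo 2. The matrix used in the example is a plain random 3-sparse matrix without the planted lines of step 100, so it only illustrates the shape of the computation; the function name `encrypt` and the sizes are hypothetical.

```python
import random


def encrypt(M, y, eps, rng=random):
    """Return b = M.x + e + (y_1, ..., y_l, 0, ..., 0) over GF(2).

    M   : public key, a list of m lines, each a list of n bits
    y   : encoded datum, a list of l bits with l <= m
    eps : Bernoulli parameter of the noise vector e
    """
    m, n = len(M), len(M[0])
    x = [rng.randrange(2) for _ in range(n)]                 # random binary vector
    e = [1 if rng.random() < eps else 0 for _ in range(m)]   # Bernoulli noise
    padded = list(y) + [0] * (m - len(y))                    # (y, 0, ..., 0)
    Mx = [sum(a & b for a, b in zip(row, x)) % 2 for row in M]
    return [u ^ v ^ w for u, v, w in zip(Mx, e, padded)]


# Hypothetical toy sizes; this M carries no secret key
rng = random.Random(0)
n, m = 16, 24
M = [[0] * n for _ in range(m)]
for row in M:
    for j in rng.sample(range(n), 3):
        row[j] = 1

b = encrypt(M, [1, 0, 1, 1], eps=n ** -0.2, rng=rng)
print(len(b))  # the encrypted datum has m bits: prints 24
```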
The encryption method performed here has a high level of security, especially due to encoding of the datum c by coding verifying the properties of the “wire-tap channel”.
In fact, as indicated earlier, this coding ensures that a third party who obtained partial knowledge of the encoded datum y could not decode it.
In this case, a third party who obtained the encrypted datum b could not decrypt it because, even if he were to obtain partial information on y, this would give him no information on the datum c. The encrypted datum b obtained includes m bits.
Decryption 2000 of a datum b, comprising m bits, obtained by carrying out the method described hereinabove, will now be described. For this, it is necessary to have the secret key sk, that is, the set of indexed sets Sj.
During a step 2100, the sum of the bits of b indexed by the elements of Sj is calculated for each j between 1 and l, which corresponds to a bit yj of the encoded datum y. The sequence of the yj constitutes the encoded datum $y = (y_1, \ldots, y_l)$.
Indeed, the summation of the elements of M·x indexed by the elements of Sj is zero, due to the choice of Sj. The summation of the elements of b indexed by Sj will therefore give yj, added to a negligible error term. Consequently, the bits obtained by summation of the elements of b, indexed by the sets Sj, are the bits of y, up to noise.
During a step 2200, the obtained datum y is decoded by applying decoding of the linear code of the coset type, that is, c=H·y, where c is the binary datum decrypted.
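Step 2100 can be illustrated on a deliberately tiny key. In this sketch, which is an assumption-laden toy and not the parameters of the description, indices are 0-based, l = 2, m = 8, n = 12, and the public matrix consists only of two planted blocks in which each column holds exactly two 1s. The noise parameter is set to 0 in the example so that the output is deterministic; with a non-zero parameter each recovered bit may be flipped by the noise term. Step 2200 (coset decoding of y) is not shown.

```python
import random

# Four lines with three 1s each; every column index 0..5 appears exactly twice
K4 = [{0, 1, 2}, {0, 3, 4}, {1, 3, 5}, {2, 4, 5}]
S = [[0, 2, 3, 4], [1, 5, 6, 7]]   # toy secret key, with j in S_j
M = [None] * 8                      # toy public key, lines as sets of 1-positions
for j, Sj in enumerate(S):
    for i, row in zip(Sj, K4):
        M[i] = {6 * j + c for c in row}


def encrypt(y, eps, rng):
    """b = M.x + e + (y, 0, ..., 0) over GF(2), for the toy key above."""
    x = [rng.randrange(2) for _ in range(12)]
    b = []
    for i, row in enumerate(M):
        bit = sum(x[c] for c in row) % 2          # (M.x)_i
        bit ^= 1 if rng.random() < eps else 0      # noise e_i
        bit ^= y[i] if i < len(y) else 0           # i-th bit of (y, 0, ..., 0)
        b.append(bit)
    return b


def decrypt_to_y(b, S):
    """Step 2100: y_j is the sum modulo 2 of the bits of b indexed by S_j."""
    return [sum(b[i] for i in Sj) % 2 for Sj in S]


rng = random.Random(0)
print(decrypt_to_y(encrypt([1, 0], eps=0.0, rng=rng), S))  # noise-free: prints [1, 0]
```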
The advantage of the proposed encryption method is that it is homomorphic for the “XOR” (exclusive OR) operation, symbolised by the operator ⊕, that is, for two messages c1 and c2 of l bits to be encrypted, the cipher of c1⊕c2 can be obtained from b1 and b2, the data obtained respectively by encryption of c1 and c2.
In this case, the exclusive or of b1 and b2 is a possible cipher of c1⊕c2 by the encryption method 1000, that is, performing the exclusive or operation between b1 and b2 corresponds to encryption of c1⊕c2 by the same encryption method 1000 with the same parameters.
This property derives from the linear character of the coset coding as used here.
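The property can be checked term by term. Writing $\tilde{y}_i = (y_{i,1}, \ldots, y_{i,l}, 0, \ldots, 0)$ for the padded encoding of $c_i$, and $x_i$, $e_i$ for the random vector and noise vector drawn when encrypting $c_i$ (notation introduced here for illustration only), and recalling that addition modulo 2 is the exclusive OR:

```latex
\begin{aligned}
b_1 \oplus b_2
  &= \left(M x_1 \oplus e_1 \oplus \tilde{y}_1\right) \oplus \left(M x_2 \oplus e_2 \oplus \tilde{y}_2\right) \\
  &= M\,(x_1 \oplus x_2) \;\oplus\; (e_1 \oplus e_2) \;\oplus\; (\tilde{y}_1 \oplus \tilde{y}_2)
\end{aligned}
```

The result has the form of a cipher whose random vector is $x_1 \oplus x_2$ and whose encoded datum is $y_1 \oplus y_2$, which, by linearity of the coset coding, is an encoding of $c_1 \oplus c_2$. The noise term $e_1 \oplus e_2$ is no longer of parameter ε: if the two noise vectors are independent, each of its elements equals 1 with probability $2\varepsilon(1-\varepsilon)$.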
Method for Secure Calculation of Hamming Distance
The encryption and decryption method described hereinabove allows performing secure calculation 3000 of Hamming distances between two binary data c1 and c2, this calculation being performed jointly by two processing units U1 and U2.
The notion of “secure” calculation indicates that the result of calculation must be obtained without either processing unit being able to access the data held by the other.
This calculation can be made according to two variants shown respectively in
In reference to
The calculation method comprises obtaining 3100 the cipher of the result of the exclusive OR operation between the nonencrypted binary data E(c1⊕c2), this result being obtained by performing the “exclusive OR” operation between the ciphers: b1⊕b2=E(c1)⊕E(c2), as per the homomorphic properties of the encryption method for the exclusive OR operation described hereinabove.
The method next comprises permutation 3200 of the first l bits of the result obtained at the preceding step by applying a randomly selected permutation σ. The result obtained corresponds to the cipher of the permutation of the result of the “exclusive OR” operation between the two non-encrypted data, that is, E(σ(c1⊕c2)). Now, a permutation does not modify the Hamming weight of a sequence of bits.
The message σ(c1⊕c2) therefore has the same Hamming weight as c1⊕c2, and this Hamming weight is the Hamming distance between c1 and c2.
Therefore, during a step 3300 it suffices to decrypt the message E(σ(c1⊕c2)) and determine the Hamming weight of the result obtained to obtain the Hamming distance between c1 and c2.
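The invariance used in steps 3200 and 3300 can be checked on plain data. The sketch below works on non-encrypted bits only and does not involve the encryption method; the sample values and names are hypothetical.

```python
import random


def hamming_weight(bits):
    """Number of elements equal to 1 in a list of bits."""
    return sum(bits)


c1 = [1, 0, 1, 1, 0, 0, 1, 0]
c2 = [1, 1, 0, 1, 0, 1, 1, 0]
diff = [a ^ b for a, b in zip(c1, c2)]   # c1 XOR c2

sigma = list(range(len(diff)))
random.shuffle(sigma)                     # randomly selected permutation
permuted = [diff[i] for i in sigma]       # sigma(c1 XOR c2)

# The permutation hides which positions differ, but not how many
print(hamming_weight(diff), hamming_weight(permuted))  # prints 3 3
```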
As indicated hereinabove, several implementations of this method by processing units U1 and U2 are possible.
According to a first embodiment, illustrated in
During a first step 3010, each processing unit encrypts the datum which it holds by carrying out the encryption method 1000 described hereinabove. The unit U1 holding the secret key then transfers its encrypted datum E(c1) to the other unit U2 during a step 3020.
Next, the unit U2 conducts the exclusive OR operation 3100 between the two encrypted data, then selects and carries out the permutation σ 3200 of the first l bits of the result obtained to produce E(σ(c1⊕c2)). The unit U2 transfers this result to the unit U1 during a step 3210, and the unit U1 decrypts the result by carrying out the decryption method 2000 by way of the secret key sk which it holds, to obtain σ(c1⊕c2), and counts its Hamming weight to obtain the Hamming distance between c1 and c2.
Optionally, the result of the Hamming distance between the data can be communicated by unit U1 to unit U2.
According to an alternative embodiment, shown in
This situation applies especially in the case of dematerialised processing of data (“cloud computing”), where the unit U1 is a remote server which stores confidential data of individuals and must not have access to them.
In this situation, it is the unit U1 which carries out the exclusive OR operation 3100 between the two encrypted data, and which selects and applies 3200 the permutation σ of the first l bits of the result obtained. Next, during a step 3210, the unit U1 transfers E(σ(c1⊕c2)) obtained at step 3200 to the unit U2.
During a step 3300, by application of the method 2000, by way of the secret key which it holds, the unit U2 deciphers the datum received from the unit U1 to obtain the datum σ(c1⊕c2), counts its Hamming weight and obtains the Hamming distance between c1 and c2.
Optionally, the unit U2 can also transfer the Hamming distance between the data ci to the unit U1.
Application to Identification or Secure Authentication
This calculation method 3000 of a Hamming distance is advantageously applied to identification (comparison of an individual with a plurality of individuals as candidates for detecting correspondence between the individual and one of the candidates) or biometric authentication (comparison of an individual with an individual candidate for detecting correspondence) of an individual.
A biometric datum of an individual is compared to one (in the case of authentication) or more (in the case of identification) data of indexed individuals, each comparison being made by calculation of the Hamming distance between the data.
The biometric data are digital encodings of biometric traits of individuals and must correspond to the same biometric trait so they can be comparable: this trait can be one or two irises, one or more fingerprints, face shape, venous network shape, DNA, palm prints, etc.
A system for biometric identification or authentication 1 of an individual adapted to execution of the method 3000 advantageously comprises a control server SC of an individual to be identified and a management server SG of a biometric database, said base comprising at least one biometric reference datum ci acquired on an indexed individual.
The control server SC advantageously comprises means for acquiring a biometric datum b on an individual to be identified or authenticated, and can for example be a fingerprint reader, an identity document reader, or a camera.
The control SC and management SG servers are advantageously configured to execute one or the other of the variant embodiments of the method 3000 described hereinabove.
In the execution shown in
Typically, if a Hamming distance between b and one of the data ci is less than a predetermined threshold, a correspondence is detected between the individual on whom the datum b has been acquired and the reference individual on whom the datum ci has been acquired.
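The threshold decision can be sketched as follows. In this illustration the distances are computed directly on plain data, whereas in the system described they would be obtained by the secure calculation method 3000; the function names, the sample data and the threshold value are hypothetical.

```python
def hamming_distance(a, b):
    """Number of positions in which two bit lists of equal length differ."""
    return sum(x != y for x, y in zip(a, b))


def matches(probe, references, threshold):
    """Indices of the reference data whose distance to the probe is below the threshold."""
    return [i for i, ref in enumerate(references)
            if hamming_distance(probe, ref) < threshold]


# Hypothetical reference data c_i of the base and acquired datum b
references = [
    [1, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0, 0, 1, 1],
]
probe = [1, 0, 1, 1, 0, 1, 1, 0]

# Distances to the three references are 1, 7 and 3
print(matches(probe, references, threshold=3))  # prints [0]
```

With a single reference datum this corresponds to authentication, and with several reference data to identification.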
In the execution shown in
In terms of the method 3000, the control server obtains the Hamming distance between the datum b and one or more data ci of the base, and in the same way can detect correspondence between the individual and one or more indexed individuals.
An encryption method for securely calculating a Hamming distance on whole data, and no longer bit by bit, has therefore been presented; this calculation can also be applied to biometric identification or authentication.
| Number | Date | Country | Kind |
|---|---|---|---|
| 1350904 | Feb 2013 | FR | national |

| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/EP2014/051759 | 1/30/2014 | WO | 00 |