This invention relates to the field of secure client-server communication. More specifically, this invention relates to using shuffled biometric data in an asymmetric encoding/decoding authentication scheme, and then using this data in secure and privacy-preserving business transactions, even where the security of the data cannot be assumed during transmission to the server or during subsequent storage.
Public key cryptography (asymmetric cryptography, e.g. RSA) is a method of secure communication in which one key is used for encryption and another for decryption. The method is heavily used in modern client-server transactions.
A biometric is a physical characteristic of a person that is used to authenticate the person's identity: voice, fingerprint, retina scan, handwritten signature and so on. An authentication system tests whether the query biometric is close to the stored biometric associated with the claimed identity.
Authentication of a person is used in many business transactions, including transactions over public networks such as the Internet. Despite numerous authentication schemes, neither privacy nor security is guaranteed in such transactions.
Authentication methods in client-server communication are based on mathematical encryption and/or biometric sampling.
Mathematical encryption has two main types: symmetric and asymmetric. Problems with the prior art are illustrated here using one of the asymmetric methods, RSA (U.S. Pat. No. 4,405,829).
RSA makes use of three numbers: a modulus, for example 55; a public key, for example 7; and a private key, for example 23. The modulus and the public key must be known to anyone who sends a message, in order to encrypt it; the private key should be known only to the person receiving the message. In fact, this person generates all three numbers and publishes two of them (the modulus and the public key). The pair of modulus and public key is simply referred to as the public key; the pair of modulus and private key, as the private key.
A message may be represented as an integer. A sender encrypts this number, for example 2, in the following way:
2^7 mod 55 = 128 mod 55 = (2*55 + 18) mod 55 = 18,
and sends the result—number 18, to a receiver. The receiver decrypts the result using his private key:
18^23 mod 55 = (18^1 * 18^2 * 18^4 * 18^16) mod 55 = (18 * 49 * 36 * 26) mod 55 = 825552 mod 55 = 2,
and gets the number encrypted by the sender—number 2.
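The arithmetic of this toy example can be checked with a few lines of Python. This is only a sketch of the numbers used in the text: the names MODULUS, PUBLIC_EXPONENT, PRIVATE_EXPONENT, encrypt and decrypt are chosen here for illustration, and numbers this small offer no security.

```python
# Toy RSA parameters taken from the example in the text.
# The constant and function names are illustrative assumptions.
MODULUS = 55
PUBLIC_EXPONENT = 7
PRIVATE_EXPONENT = 23


def encrypt(message: int) -> int:
    """Raise the message to the public exponent, modulo the modulus."""
    return pow(message, PUBLIC_EXPONENT, MODULUS)


def decrypt(ciphertext: int) -> int:
    """Raise the ciphertext to the private exponent, modulo the modulus."""
    return pow(ciphertext, PRIVATE_EXPONENT, MODULUS)


ciphertext = encrypt(2)
recovered = decrypt(ciphertext)
print(ciphertext, recovered)  # prints: 18 2
```

The built-in three-argument pow performs the modular exponentiation without ever forming the full power, which is what makes the same calculation practical for very long keys.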
For any given public key there is only one private key that gives the correct result for every message. If the numbers in the public key have only one or two digits, like 7 and 55, it is easy to find the corresponding private key. However, if these numbers have 100 digits or, better, 1000 digits, it is virtually impossible to calculate the corresponding private key. That is why nobody except the person who generated all three numbers can decrypt the messages (while it remains easy to generate long private and public keys).
This scheme has two main drawbacks.
The first drawback is related to the anonymous character of communication based on RSA. The identity of the person who published the public key is unknown, so there could be a legal issue related, for example, to a dispute over whether or not a contract was signed by a user. This drawback was addressed in a series of works devoted to building a fuzzy extractor, an algorithm for extracting keys from biometric data; see Cryptology ePrint Archive, Report 2004/176, by Andrew Burnett, Adam Duffy and Tom Dowling. The scheme proposed in the report ensures that a biometric reading provided by the alleged signer would be enough to verify the signature.
The second drawback of RSA is related to the possibility of breaking it. Although it is mathematically virtually impossible to find the private key today, tomorrow's mathematical ideas and technical tools may change the situation. Besides, real-life implementations of RSA (storing long numbers in computer memory, for example) allow non-mathematical tools such as hacking, bribery and blackmail to be used successfully.
Those realities dictate strict requirements for verifying the identity of a user, including the explicit use of his biometrics (as opposed to the implicit use of biometrics described above in connection with fuzzy extractors). Biometric samples are stored on the server during the enrollment process and are then compared against biometric samples submitted with an authentication request. A security breach of the data transmitted to or stored at the server in this case leads to the total loss of the client's biometrics: they can no longer be used, now or in the future, on any server. Besides, the user's interests may be harmed during the period between the moment the server is compromised and the moment the compromise is discovered.
According to U.S. Pat. No. 6,507,912, January 2003, by Matyas Jr. et al., the sampling of biometric characteristics may be performed on the client side using a key transmitted from the server to the client. Authentication can be done by comparing key-dependent samples collected during the enrollment process against key-dependent data submitted with the authentication request. This is only a partial solution to the problem, because the key and the transformation rule may be revealed by a breach of the server. Besides, the user's interests may be harmed during the period between the moment the server is compromised and the moment the compromise is discovered.
A similar solution is proposed in US Patent Application 20040019570, January 2004, by Bolle R. M. et al. A transformation is used to create a distorted biometric. The distorted biometric is used to identify the user without requiring the user to provide actual physical characteristics about himself. This is a partial solution as well, due to the nature of the proposed distortion. For example, according to this solution, the image of a fingerprint is divided into nine rectangles, which are swapped with each other and rotated by 90, 180 or 270 degrees. The process of recognition is based on extracting so-called features, such as a ridge bifurcation in a fingerprint, which are sub-characteristics of the overall signal. Based on these features, a more compact template representation is built and used for identification. The proposed distortion therefore does not destroy the features that characterize the user, and hence does not completely solve the privacy problem if the server is compromised. Besides, the user's interests may be harmed during the period between the moment the server is compromised and the moment the compromise is discovered.
The object of the invention is achieved by sampling a biometric on the client and shuffling arrays of biometric data on the client. The shuffling sequence is calculated on the basis of a “twister”: a secret number, word or other information known only to the user. The result is a “once-twisted signature”: shuffled arrays of biometric data.
During the enrollment process, samples of once-twisted signatures are transmitted to the server and stored there. Real signatures are not stored anywhere and are never transmitted anywhere. Also during the enrollment process, the user's public key is generated on the client from the user's real signature with the help of a fuzzy extractor. This public key and the public string from which it is generated are transmitted to the server and stored there.
During subsequent connections to the server, the user's username is transmitted to the server. The server generates a random number, encodes it using the user's public key and sends it back to the client along with the public key/string.
On the client, the user's private key is recreated from the user's real signature with the help of the fuzzy extractor. This private key is used to decode the random number received from the server. After that, the real signature is twisted twice on the client: the first time using the “twister” known only to the user (by the same rules as during enrollment), and the second time using the random number generated on the server. The twice-twisted signature is then transmitted to the server.
On the server, the user's once-twisted signature is extracted from the database and twisted a second time using the random number that was sent to the client. The result is compared against the twice-twisted signature received from the client. The method of comparison guarantees that the result of the comparison depends neither on the particular “twister” used by the user nor on the particular number generated on the server; the result depends on whether or not the real signature used during enrollment closely matches the real signature used during the authentication process.
The advantage of the present invention is the improved privacy of the user. It is guaranteed explicitly by one-way shuffling of biometric arrays on the client side using input known only to the user, so there is no way to restore the real signature or its biometric features from intercepted or stolen twisted signatures. Another advantage is that the security required of the server is less crucial. Even if an attacker can steal all information stored on the server, he cannot decode the number sent from the server to a client because he does not have the private key (which is not stored anywhere, but repeatedly recreated on the client from the real biometric of the user). So the attacker cannot generate a correct twice-twisted signature even if he knows the once-twisted signature.
The present invention will now be described more fully using some specific examples of the implementation, see also U.S. patent application Ser. No. 10/725,116, filed on Dec. 2, 2003, and entitled TWISTED SIGNATURE, by Victor Gorelik. The present invention may, however, be embodied in many different forms and should not be construed as limited to the provided examples.
In particular, the present invention may be embodied as systems (apparatus), methods and/or computer program products, or as an embodiment combining software and hardware aspects.
The present invention is valid for different types of biometrical data like voice, fingerprints, retina scan and so on. For purposes of illustration only, the handwriting signature is chosen.
The first step is to obtain a real biometric sample on the client side. For example, a user signs an on-line form using a computer mouse.
The second step of enrollment is to use a fuzzy extractor to calculate a public key/string on the basis of this real signature. The user's secret twister (for example, the number “7788”) may be used as an additional input for the calculations.
The same twister is used to transform the real signature into a twisted one (see central part of
Both calculation of public key/string and transformation of the real signature may be triggered by clicking button “Twist” of the on-line form.
The last step of the enrollment process is submitting the twisted signature and public key/string to the server, for example, by clicking a button “Submit” on the on-line form. The twisted signature and the public key/string are stored on the server under the user's username, for example “Cindy”.
The first step of the verification process is the submission of the user's username to the server.
The server responds by generating a random number, for example 2, encoding this number using the user's public key, for example (7, 55), and sending the result (18) to the client along with the values of the public key/string.
During the next step of verification, the user signs the on-line form and enters the same secret word as was used during enrollment (“7788” in our example).
The client uses the fuzzy extractor and all known information (the real signature obtained on the client during verification, the received public key/string, and, optionally, the “twister”) to recreate the private key ((23, 55) in our example). This private key is used to decode the random number received from the server. The result of decoding in our example is 2. In many cases, it would be enough for verification purposes to send this result to the server and make sure it matches the number generated there. However, as mentioned in “Problems with the prior art”, the private key may eventually be constructed from the public key, so the explicit use of the biometric in the next steps (see below) increases the security of the verification.
During the next step of verification, the real signature is twisted for the second time on the client side. The first transformation is done with the help of the user's twister (“7788”). The result of this transformation is then transformed with the help of the random number received from the server (2 in our example). The resulting twice-twisted signature is submitted to the server. The server uses the randomly generated number (2 in our example) to twist the once-twisted signature stored in the server's database, and compares the result of this twisting against the twice-twisted signature received from the client. The method of comparison will be explained later in this description. The result of the comparison is expressed as a coefficient between −100% and 100%. If the coefficient is close to 100%, verification is granted.
The following is an explanation of how to transform the actual signature into the twisted one on the client side (or once-twisted signature into twice-twisted one) and how to compare two twisted signatures on the server side.
The actual signature can be presented as 3 arrays:
x_0, x_1, x_2, ..., x_{N-1},
y_0, y_1, y_2, ..., y_{N-1},
t_0, t_1, t_2, ..., t_{N-1},
where x_i and y_i are the mouse coordinates at the moment t_i, and N is the number of mouse coordinates recorded while the form is being signed. For purposes of illustration, the pace of signing (the array t_0, t_1, t_2, ..., t_{N-1}) and additional characteristics (such as z-pressure as a function of time) are not considered. Only two arrays, the x-array and the y-array, are considered below; they determine the shape of the signature completely. (Other types of biometric data can also be represented as several arrays of numbers, and similar procedures are applied.)
There are N!*N! ways in which the real signature can be twisted by shuffling the original arrays {x_0, x_1, x_2, ..., x_{N-1}} and {y_0, y_1, y_2, ..., y_{N-1}}. To choose one of these ways, the user enters the secret distortion word (“7788” in our example). Each character in the word has a numerical value, for example its ASCII code. The sum of these values is equal to 222 in our example. If N is known, say N = 100, the value of the “shift” can be calculated as 222 mod 100 = 22.
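The shift calculation can be sketched in Python as follows. The function name shift_from_twister is an assumption made for this illustration; the rule itself (sum of ASCII codes, reduced modulo N) is the one given in the example above.

```python
def shift_from_twister(twister: str, n: int) -> int:
    """Sum the character codes of the secret word and reduce modulo n.

    The function name is a hypothetical choice for this sketch.
    """
    return sum(ord(ch) for ch in twister) % n


code_sum = sum(ord(ch) for ch in "7788")  # 55 + 55 + 56 + 56
shift = shift_from_twister("7788", 100)
print(code_sum, shift)  # prints: 222 22
```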
The original array {x_0, x_1, x_2, ..., x_{N-1}} corresponding to the real signature is replaced by the new array:
x_0 is replaced by x_22,
x_1 is replaced by x_23,
...,
x_77 is replaced by x_99,
x_78 is replaced by x_0,
x_79 is replaced by x_1,
and so on.
The original array {y_0, y_1, y_2, ..., y_{N-1}} corresponding to the real signature is replaced by a new array using a double shift: 44 instead of 22. This way of shuffling creates a twisted signature, each point of which has an x-coordinate equal to the x-coordinate of one point of the real signature and a y-coordinate equal to the y-coordinate of another point of the real signature.
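The replacement rule above amounts to a cyclic shift of each array. A minimal Python sketch follows, assuming the function names cyclic_shift and twist (they are not from the original text) and using placeholder data in which each coordinate is simply its own index, so that the output shows which original position lands where.

```python
def cyclic_shift(values, shift):
    """Return a new list whose element i is values[(i + shift) % n]."""
    n = len(values)
    return [values[(i + shift) % n] for i in range(n)]


def twist(xs, ys, shift):
    """Shift the x-array by `shift` and the y-array by twice that amount."""
    return cyclic_shift(xs, shift), cyclic_shift(ys, 2 * shift)


# Placeholder data (an assumption for illustration): the index itself
# stands in for each coordinate of a 100-point signature.
xs = list(range(100))
ys = list(range(100))
twisted_x, twisted_y = twist(xs, ys, 22)
print(twisted_x[0], twisted_x[77], twisted_x[78], twisted_y[0])  # prints: 22 99 0 44
```

Because the two arrays are shifted by different amounts, each point of the twisted signature pairs the x-coordinate of one original point with the y-coordinate of another.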
To compare two actual (not twisted) signatures, the technique of correlation coefficients can be used (Miller et al., John E. Freund's Mathematical Statistics, Prentice Hall, NJ, 1999). Let Cx be the correlation coefficient between the arrays
x1 = {x1_0, x1_1, x1_2, ..., x1_{N-1}},
x2 = {x2_0, x2_1, x2_2, ..., x2_{N-1}},
of x-coordinates of the first and second actual signatures, and let Cy be the correlation coefficient between the arrays
y1 = {y1_0, y1_1, y1_2, ..., y1_{N-1}},
y2 = {y2_0, y2_1, y2_2, ..., y2_{N-1}},
of y-coordinates of the first and second actual signatures. If both Cx and Cy are close to 100% (or their average is close to 100%), the two signatures are close.
Methods of calculating Cx and Cy are identical, so only the case of Cx is described.
Calculation of the correlation coefficient between two arrays x1 and x2 consists of 3 steps.
At the first step, the graphic of each array is shifted in the vertical direction so that the average value of each new array is equal to zero, see
If the two original arrays x1 and x2 have exactly the same shape, then the new arrays X1 and X2 will have the following property: if, for example, X1_5 is positive, then X2_5 is positive as well; if X1_9 is negative, then X2_9 is negative as well, and so on.
The second step is the calculation of the product
X1_0*X2_0 + X1_1*X2_1 + ... + X1_{N-1}*X2_{N-1}    (1)
If original arrays x1 and x2 have the same shape, each term in this expression will be positive (negative multiplied by negative is positive) and the sum will be big. If the original arrays are not exactly the same, but have similar shapes, then most of the terms will be positive and the sum will still be big.
The third step is normalization. Normalization ensures that the correlation coefficient between two arrays of exactly the same shape will be equal to 100%; the coefficient between two arrays with opposite shapes (upside down) will be equal to −100%; the coefficient between two arrays with very different shapes (between “signal” and “noise”) will be close to zero.
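The three steps can be sketched in Python as follows. The text does not spell out the normalization formula; this sketch assumes the standard Pearson normalization (division by the square root of the product of the two sums of squares), and the function name correlation_percent is hypothetical.

```python
import math


def correlation_percent(a, b):
    """Correlation coefficient of two equal-length arrays, in percent.

    Assumes the standard Pearson normalization; the name is illustrative.
    """
    if len(a) != len(b) or not a:
        raise ValueError("arrays must be non-empty and of equal length")
    n = len(a)
    # Step 1: shift each array so that its average value is zero.
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    centered_a = [v - mean_a for v in a]
    centered_b = [v - mean_b for v in b]
    # Step 2: sum of element-wise products, expression (1) in the text.
    product_sum = sum(p * q for p, q in zip(centered_a, centered_b))
    # Step 3: normalize so that identical shapes give exactly 100%.
    norm = math.sqrt(sum(p * p for p in centered_a) * sum(q * q for q in centered_b))
    if norm == 0:
        raise ValueError("a constant array has no defined correlation")
    return 100.0 * product_sum / norm


same = correlation_percent([1, 2, 3, 4], [10, 20, 30, 40])
opposite = correlation_percent([1, 2, 3, 4], [40, 30, 20, 10])
print(round(same), round(opposite))  # prints: 100 -100
```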
The important point to notice is that if we shuffle the sequence of the coordinates in the first array x1 and the sequence of the coordinates in the second array x2 in exactly the same way, we do not change the correlation coefficient, because the sum (1) does not depend on the order of the items. That is why signatures twisted by shuffling can be used instead of actual ones.
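This invariance can be checked numerically. The sketch below uses made-up sample arrays and an arbitrary permutation produced by Python's random module; the Pearson normalization and all names are assumptions for illustration only.

```python
import math
import random


def correlation_percent(a, b):
    """Pearson correlation in percent (assumed normalization)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    ca = [v - mean_a for v in a]
    cb = [v - mean_b for v in b]
    product_sum = sum(p * q for p, q in zip(ca, cb))
    return 100.0 * product_sum / math.sqrt(
        sum(p * p for p in ca) * sum(q * q for q in cb)
    )


# Made-up sample arrays standing in for the x-coordinates of two signatures.
x1 = [3.0, 7.5, 9.0, 4.0, 1.5, 6.0, 8.5, 2.0]
x2 = [2.5, 8.0, 9.5, 3.0, 2.0, 5.0, 9.0, 1.0]

# One arbitrary permutation, applied to both arrays in exactly the same way.
order = list(range(len(x1)))
random.Random(7788).shuffle(order)
shuffled_x1 = [x1[i] for i in order]
shuffled_x2 = [x2[i] for i in order]

before = correlation_percent(x1, x2)
after = correlation_percent(shuffled_x1, shuffled_x2)
print(math.isclose(before, after))  # prints: True
```

The comparison uses math.isclose rather than exact equality because reordering a floating-point sum can change its last few bits, even though the mathematical value is the same.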
The level of what “is close to 100%” is established on the basis of statistical characteristics of the signatures and on the level of desired security. Table 1 contains correlation coefficients of x and y-arrays for 6 twisted signatures of one person. The first number in each cell of the table corresponds to the x-coefficient; the second number corresponds to the y-coefficient.
The correlation coefficients between the corresponding actual signatures (or between twice-twisted signatures) are absolutely the same.
Comparison of Person2's twisted signature with six twisted signatures of Person1 gives:
(0, 6); (−12, −11); (−28, −15); (−21, −4); (−5, 3); (−12, 23).
Comparison of Person3's twisted signature with six twisted signatures of Person1 gives:
(71, 5); (61, 0); (41, −1); (60, −3); (53, 12); (48, 24).
In the described example, a level of 70% for the correlation coefficient may be used. If the coefficient between the twice-twisted signature obtained on the client and the twice-twisted signature obtained on the server is greater than 70%, verification is granted. At first glance, a level of 70% does not look high enough to be sure that the signature is the same. However, one step must be completed before any calculation in the verification process: the user has to indicate her username (“Cindy”), after which the server uses only the information stored under the name “Cindy”. This makes the results much more reliable.
The method of comparison guarantees that the result of the comparison depends neither on the particular “twister” used by the user nor on the particular number generated on the server; the result depends on whether or not the real signature used during enrollment closely matches the real signature used during the authentication process (because two different signatures give the same correlation coefficients before and after twisting). As opposed to an arbitrary distortion of the biometric, the method proposed here guarantees that the signatures of two different persons will differ to the same degree before and after twisting.
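The twisting and comparison steps of verification can be put together in a short Python sketch. Several things here are assumptions made only for illustration: the random number is used directly as a cyclic shift, both Cx and Cy must exceed the 70% level, the normalization is the standard Pearson one, the signature data is synthetic, and the fuzzy-extractor and key-decoding steps are omitted entirely. All function and variable names are hypothetical.

```python
import math


def cyclic_shift(values, shift):
    """Return a new list whose element i is values[(i + shift) % n]."""
    n = len(values)
    return [values[(i + shift) % n] for i in range(n)]


def twist(signature, shift):
    """Shift the x-array by `shift` and the y-array by twice that amount."""
    xs, ys = signature
    return cyclic_shift(xs, shift), cyclic_shift(ys, 2 * shift)


def correlation_percent(a, b):
    """Pearson correlation in percent (assumed normalization)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    ca = [v - mean_a for v in a]
    cb = [v - mean_b for v in b]
    product_sum = sum(p * q for p, q in zip(ca, cb))
    return 100.0 * product_sum / math.sqrt(
        sum(p * p for p in ca) * sum(q * q for q in cb)
    )


def server_verify(stored_once_twisted, received_twice_twisted, challenge, threshold=70.0):
    """Twist the stored signature with the challenge and compare the arrays."""
    expected_x, expected_y = twist(stored_once_twisted, challenge)
    received_x, received_y = received_twice_twisted
    cx = correlation_percent(expected_x, received_x)
    cy = correlation_percent(expected_y, received_y)
    return cx > threshold and cy > threshold


N = 100
TWISTER_SHIFT = 22  # shift derived from "7788" in the example
CHALLENGE = 2       # random number from the example, after decoding

# Synthetic stand-ins for real mouse data (assumption for illustration).
enrolled = (
    [i + 50 * math.sin(0.3 * i) for i in range(N)],
    [30 * math.cos(0.2 * i) for i in range(N)],
)
# A second signing by the same person: the same shape with a small wobble.
fresh = (
    [x + 2 * math.sin(1.7 * i) for i, x in enumerate(enrolled[0])],
    [y + 2 * math.cos(2.3 * i) for i, y in enumerate(enrolled[1])],
)
# An upside-down copy, used only because its coefficient is known to be negative.
mirrored = ([-x for x in fresh[0]], [-y for y in fresh[1]])

stored = twist(enrolled, TWISTER_SHIFT)  # the only signature data the server keeps

genuine_ok = server_verify(stored, twist(twist(fresh, TWISTER_SHIFT), CHALLENGE), CHALLENGE)
mirrored_ok = server_verify(stored, twist(twist(mirrored, TWISTER_SHIFT), CHALLENGE), CHALLENGE)
print(genuine_ok, mirrored_ok)  # prints: True False
```

In this sketch the server never sees the untwisted arrays: it only shifts what it stored by the challenge and correlates the result with what the client submitted.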
The advantage of the present invention is the improved privacy of the user. It is guaranteed explicitly by one-way shuffling of biometric arrays on the client side using input known to user only, so there is no way to restore the real signature or its biometric features from intercepted or stolen twisted signatures.
Another advantage is that the security required of the server is less crucial. Even if an attacker can steal all information stored on the server, he cannot decode the number sent from the server to a client because he does not have the private key (which is not stored anywhere, but repeatedly recreated on the client from the real biometric of the user). An attacker cannot generate a correct twice-twisted signature even if he knows the once-twisted signature. So the user's interests will not be harmed during the period between the moment of a breach of server security and the moment the breach is discovered.
In the case of hand-written signature, the twisted and the twice-twisted signatures look like strange but possible signatures of another person, see
Another example of a possible embodiment of the present invention is a system where the server and the client are implemented in one device and do not use a public network for communication. In this kind of system, the server is a subsystem that stores twisted samples of biometric data and makes decisions regarding verification and/or identification of a client. The client is a subsystem that collects biometric data, twists this data and submits the twisted data to the server. For example, teller machines may store the public key/string and the twisted biometric of the customer, generated from the real biometric and a secret code known only to the customer. The server does not know this secret code; only the person who submits his or her fingerprints along with this code knows it. After a twisted fingerprint is generated on the client, the secret code and the real signature are no longer needed, so they are not stored anywhere. Even in the case of a breach of server security, the customer risks only the twisted fingerprints saved on the server. For other applications, the customer can use the same real fingerprints along with a different secret code.
In the drawings and specification above, there have been disclosed typical embodiments of the invention and, although specific terms are employed, they are used in a generic and descriptive sense only and not for purposes of limitation.
This application is related to U.S. patent application Ser. No. 10/725,116, filed on Dec. 2, 2003, and entitled TWISTED SIGNATURE, to Victor Gorelik.