1. Field of the Invention
The present invention relates to a method of searching encrypted data using an inner product operation, and more particularly, to a method of searching encrypted data for a plurality of keywords by using an inner product operation, thereby increasing security.
This work was supported by the IT R&D program of MIC/IITA. [2005-Y-001-03, Developments of Next Generation Security Technology]
2. Description of the Related Art
The present invention relates to a method of searching encrypted data while protecting a user's privacy when the user stores important data in a server.
The user may store a large amount of important data in an internal or external server because of limited local storage space, etc. However, if data is stored in an internal or external server, the data may be leaked through malicious behavior of a server manager. If the data is stored in plaintext, the server manager may easily leak or illegally use the contents of the data.
In order to prevent such an invasion of privacy, methods of encrypting data and storing the encrypted data have been researched. However, it is difficult to search encrypted data by using a general searching method. For this reason, methods of searching encrypted data have been required.
A method of searching encrypted data was first proposed by Song, et al. (IEEE Security and Privacy Symposium 2000), and research on various methods based on symmetric key cryptography and public key cryptography has been carried out. Most methods are used to search for one keyword. In order to search for a plurality of keywords, a single-keyword searching method may be repeatedly performed. However, while searching for a plurality of keywords, a user may not want to disclose information on the individual keywords in order to protect privacy. Further, if the server is unreliable, it may be possible for the server to access data by combining the single-keyword search results.
Another method includes storing a conjunction of a plurality of keywords in a server as a Meta key, and performing a search without disclosing the individual keywords. However, in a case of combining m keywords, the maximum number of necessary Meta keys is 2^m. For this reason, as the number of keywords increases, the number of Meta keys increases exponentially.
Furthermore, a method of searching for multiple keywords on the basis of pairing has been proposed by Golle, et al. (ACNS 2004). However, this method is inefficient, which limits its practical application.
Accordingly, it is an object of the present invention to provide a searching method which is capable of rapidly and securely searching encrypted data stored in an unreliable server for multiple keywords while satisfying both security and efficiency requirements.
According to an aspect of the present invention, there is provided a method of searching encrypted data using an inner product operation. This method includes: generating a private key; encrypting a plurality of documents using the private key; generating index keyword values by converting a plurality of keywords in the plurality of documents into numerical values using the private key; transmitting the encrypted documents and a set of the generated index keyword values to a server so as to be stored; receiving at least one keyword for search data; generating a search keyword value by converting each of the received keywords into a numerical value using the private key; performing an inner product operation on a set of the search keyword values to generate search information; and transmitting the search information to a server.
In the generating of the index keyword values, the index keyword values may be generated using a hash function. Further, in the generating of the search keyword values, the search keyword values may be generated using the hash function.
The generating of the search information may include: generating a set of random values; and calculating the inner product of the generated random value set and the search keyword value set to generate the search information. In the transmitting of the search information, the random value set may also be transmitted to the server as the search information.
Each of a plurality of keywords may include a keyword field value representing the location of the corresponding keyword. In the transmitting of the search information, the keyword field value corresponding to each of the received keywords may also be transmitted as the search information.
According to another aspect of the invention, a terminal for searching encrypted data includes: a keyword input unit that receives at least one search keyword with respect to data to be searched; a keyword value generating unit that generates at least one search keyword value by converting each of the received keywords into a numerical value; an inner product operating unit that performs an inner product operation on a set of the search keyword values; and a search information transmitting unit that transmits to a server the inner product value as search information.
The keyword value generating unit may generate the search keyword values using a hash function.
The inner product operating unit may generate a set of random values and perform the inner product operation on the generated random value set and the search keyword value set. The search information transmitting unit may transmit to the server the random value set as the search information.
Each of the received keywords may include a keyword field value representing the location of the corresponding keyword, and the search information transmitting unit may transmit the keyword field value corresponding to each of the received keywords as the search information.
According to still another aspect of the invention, a server for searching encrypted data includes: an encrypted data storage unit that stores encrypted documents and numerical index keyword values; an inner product operating unit that performs an inner product operation using search information transmitted from a terminal and the stored index keyword values; and a comparing unit that compares the obtained inner product value with an inner product value included in the search information. The inner product operating unit performs an inner product operation for each of the stored documents, and the comparing unit compares the inner product result of each of the stored documents with the inner product value included in the search information to determine whether two inner product values are matched with each other.
The search information may include a random value set, and the inner product operating unit may perform an inner product operation on the random value set and the stored index keyword values.
The search information may include at least one keyword field value representing the location of the keyword, and the inner product operating unit may perform an inner product operation on index keyword values corresponding to the keyword field values.
First, a system for searching encrypted data according to an embodiment of the present invention will be described with reference to
A system according to an embodiment of the present invention includes a terminal 10 and a server 20.
The terminal 10 includes an input unit 110 that receives data to be encrypted or at least one search keyword, an encrypted data generator 120 that encrypts data, a search information generator 150 that generates search information to be transmitted to a server, a keyword value generator 130 and an inner product operator 140 that perform operations for generating encrypted data or search information, and a transceiver 160 that transmits the information.
The server 20 includes a transceiver 210 that receives encrypted data or search information from the terminal 10, an encrypted data storage unit 240 that stores encrypted data or related information, and an inner product operator 220 and a comparator 230 that perform operations based on search information in order to search for desired documents.
A method of searching encrypted data according to an embodiment of the present invention will be described with reference to
The searching method in the terminal 10 generally includes generating a private key (S100), encrypting a plurality of documents (S210), generating an index of keywords in the individual documents (S220), transmitting the encrypted documents and the index to the server (S230), and receiving search keywords and transmitting search information to the server (S310 to S350). The searching method used in the server 20 includes comparing the search information transmitted from the terminal 10 to a stored index to extract documents matched to the search information and returning the matched documents (S400 to S430).
First, the general environment in which the present invention is applied will be described. Hereinafter, a user means a user's terminal 10.
It is assumed that a user encrypts important data and stores the encrypted data in an unsecured server. It is assumed that the total number of documents is n. The n documents are denoted by D1 to Dn. Furthermore, it is assumed that the number of keyword fields with respect to each document is m. For example, if data is an email, four keyword fields, such as a “From” field, a “To” field, a “Date” field, and a “Subject” field, may be assumed. Keywords are assigned to corresponding fields. For the security of the scheme, it is assumed that the same word cannot appear in any two different keyword fields. As shown in
Assuming the above-mentioned environment, a searching method according to an embodiment of the present invention will be described with reference to
Now, a process in which the encrypted data generator 120 encrypts a plurality of documents and a plurality of keywords in the documents and stores an index will be described.
First, a user randomly generates a private key K for encrypting data and a keyword set for data searching (S100). The private key K should be a secret that no one knows other than the user. The length of the private key is determined according to an encrypting algorithm.
It is assumed that the user has n documents D1, D2, . . . , Dn, that m keyword fields exist in each of the documents, and that keywords with respect to a document Di are denoted by Wi1, Wi2, . . . , Wim. In other words, Wij means a keyword corresponding to a j-th keyword field with respect to the document Di.
The user randomly selects k1, k2, . . . , km in GF(p) and keeps these values secret. Here, it is preferable that p be a prime number. Further, since the size of p relates to the security of the scheme, it is preferable to set the size of p to be equal to the length of the private key K. In consideration of the security of a current encrypting algorithm and computing power, it is preferable to set the size of p to 120 bits or more. Similar to the private key K, the values k1, k2, . . . , km should be a secret that no one knows other than the user.
The user encrypts the individual documents Di using the private key K to generate encrypted documents EK(Di) (S210). Here, EK( ) is a symmetric key encryption algorithm, in which K is a private key. Further, the user calculates vectors (hK(Wi1⊕1), hK(Wi1⊕2)), (hK(Wi2⊕1), hK(Wi2⊕2)), . . . , (hK(Wim⊕1), hK(Wim⊕2)) (where hK( ) is a keyed hash function, in which K is a private key). In order to simplify those symbols, (hK(Wij⊕1), hK(Wij⊕2)) is defined as Hij. Next, the user selects a random value ai from GF(p). The value ai should be kept secret, and the user does not need to remember or store the value ai. Finally, the user randomly generates a public vector Gi=(gi1,gi2). Using those generated values, the user generates the following values with respect to each document Di (S220):
Gi, <Hi1,Gi>+k1ai, <Hi2,Gi>+k2ai, . . . , <Him,Gi>+kmai, EK(Di) [Expression 1]
The calculation of Expression 1 is performed in GF(p). Here, <Hij,Gi>+kjai is a numerical value which a keyword corresponding to the j-th keyword field of each document Di is converted into and is referred to as an index keyword value. <Hij,Gi> means the inner product of Hij and Gi. For example, the inner product of a vector (a,b) and a vector (c,d) (that is, <(a,b),(c,d)>) is ac+bd. The user performs the above-mentioned operation on all of the documents D1, D2, . . . , Dn, and this operation is performed by the keyword value generator 130 and the inner product operator 140.
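The index generation of Expression 1 may be sketched as follows. This is only an illustrative sketch under stated assumptions, not the embodiment itself: HMAC-SHA256 stands in for the keyed hash function hK( ), the prime p is taken to be 2^127 − 1, the operation W⊕1 is realized by XOR-ing the integer encoding of the keyword, and the names keyed_hash, inner, and index_document are hypothetical.

```python
import hashlib
import hmac
import secrets

# Assumed prime modulus p (a Mersenne prime); the text only asks for 120 bits or more.
P = 2**127 - 1

def keyed_hash(key: bytes, word: str, tag: int) -> int:
    """h_K(W xor tag) mapped into GF(P). HMAC-SHA256 is an assumed choice."""
    w = int.from_bytes(word.encode("utf-8"), "big") ^ tag
    data = w.to_bytes((w.bit_length() + 7) // 8 or 1, "big")
    digest = hmac.new(key, data, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % P

def inner(u, v):
    """Inner product of two 2-element vectors in GF(P): <(a,b),(c,d)> = ac + bd."""
    return (u[0] * v[0] + u[1] * v[1]) % P

def index_document(key, field_keys, keywords):
    """Return (G_i, [<H_ij, G_i> + k_j * a_i for each keyword field j])."""
    a_i = secrets.randbelow(P)  # per-document secret value, discarded afterwards
    g_i = (secrets.randbelow(P), secrets.randbelow(P))  # public vector G_i
    values = []
    for k_j, w_ij in zip(field_keys, keywords):
        h_ij = (keyed_hash(key, w_ij, 1), keyed_hash(key, w_ij, 2))
        values.append((inner(h_ij, g_i) + k_j * a_i) % P)
    return g_i, values
```

The returned pair corresponds to one row of index data for a document; the encrypted document EK(Di) would be produced separately by a symmetric key encryption algorithm, which is omitted from this sketch.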
The public vectors Gi, the index keyword values <Hij,Gi>+kjai with respect to the individual keywords, and the encrypted documents EK(Di) generated by the above-mentioned method are stored in the form of Expression 2 in the server through the transceiver 160:
G1, <H11,G1>+k1a1, <H12,G1>+k2a1, . . . , <H1m,G1>+kma1, EK(D1)
G2, <H21,G2>+k1a2, <H22,G2>+k2a2, . . . , <H2m,G2>+kma2, EK(D2)
Gn, <Hn1,Gn>+k1an, <Hn2,Gn>+k2an, . . . , <Hnm,Gn>+kman, EK(Dn). [Expression 2]
The server 20 stores the values of Expression 2 in the encrypted data storage unit (database) 240 corresponding to the user.
Next, a process in which the user inputs at least one keyword and searches for desired data will be described. When the user wants to search for documents including a plurality of keywords, in general, a conjunction of the keyword search results is used to search for corresponding documents. In order to keep the individual keywords concealed from the server 20, the keywords transmitted to the server should be encrypted, and it is preferable that the transmitted value cannot be separated into individual keywords. Therefore, even when a plurality of keywords are searched, the individual keywords need to be encapsulated together, without being divided, and transmitted to the server 20.
Returning to the data searching method according to the embodiment of the present invention, if at least one search keyword is input through the input unit 110 of the terminal 10 (S310), first, the location of each keyword field and a corresponding keyword value are extracted. When t search keywords are input, pairs of the locations of keyword fields to be searched and keyword values are denoted by (i(1),Wi(1)) , (i(2),Wi(2)), . . . , (i(t),Wi(t)). For example, with respect to a plurality of documents as shown in
H1=(hK(Wi(1)⊕1), hK(Wi(1)⊕2)), H2=(hK(Wi(2)⊕1), hK(Wi(2)⊕2)), . . . , Ht=(hK(Wi(t)⊕1), hK(Wi(t)⊕2)).
The obtained values H1 to Ht are defined as search keyword values.
Next, the user randomly selects a set of arbitrary random values s1, s2, . . . , st−1 in GF(p), and the inner product operator 140 calculates a numerical value st meeting Expression 3 (S330):
s1ki(1)+s2ki(2)+ . . . +st−1ki(t−1)+stki(t)=0. [Expression 3]
An inner product operation is performed on the search keyword values and the random value set generated in the above-mentioned method to obtain a value of s1H1+s2H2+ . . . +st−1Ht−1+stHt (S340). Next, the obtained value, the locations of the keyword fields (i(1), i(2), . . . , i(t)), and the set of values s1, s2, . . . , st−1, st are transmitted to the server 20 through the transceiver 160 (S350). The server 20 searches the encrypted documents using the transmitted data and the stored values of Expression 2.
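The generation of the search information (S330 to S350) may be sketched as follows. The sketch rests on assumptions that are not part of the embodiment: HMAC-SHA256 stands in for the keyed hash function, p is taken to be 2^127 − 1, keyword field locations are 0-based, the secret values k1 to km are nonzero, and the names keyed_hash and make_search_info are hypothetical. The last value st is obtained by solving Expression 3 with a modular inverse, and the sketch assumes at least two search keywords.

```python
import hashlib
import hmac
import secrets

# Assumed prime modulus p (a Mersenne prime).
P = 2**127 - 1

def keyed_hash(key: bytes, word: str, tag: int) -> int:
    """h_K(W xor tag) mapped into GF(P). HMAC-SHA256 is an assumed choice."""
    w = int.from_bytes(word.encode("utf-8"), "big") ^ tag
    data = w.to_bytes((w.bit_length() + 7) // 8 or 1, "big")
    digest = hmac.new(key, data, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % P

def make_search_info(key, field_keys, query):
    """query is a list of (field_location, keyword) pairs with t >= 2 entries.

    Returns (s1*H1 + ... + st*Ht, field locations, [s1, ..., st]).
    """
    positions = [pos for pos, _ in query]
    hs = [(keyed_hash(key, w, 1), keyed_hash(key, w, 2)) for _, w in query]

    # s_1 .. s_{t-1} are random; s_t is chosen so that
    # s_1*k_i(1) + ... + s_t*k_i(t) = 0 in GF(P) (Expression 3).
    s = [secrets.randbelow(P) for _ in query[:-1]]
    partial = sum(s_l * field_keys[pos] for s_l, pos in zip(s, positions)) % P
    k_last = field_keys[positions[-1]]  # must be nonzero modulo P
    s.append((-partial * pow(k_last, -1, P)) % P)  # pow(x, -1, P) needs Python 3.8+

    # Component-wise linear combination of the search keyword values.
    combined = (
        sum(s_l * h[0] for s_l, h in zip(s, hs)) % P,
        sum(s_l * h[1] for s_l, h in zip(s, hs)) % P,
    )
    return combined, positions, s
```

Because the server never receives k1 to km, it cannot derive st by itself, which is why the sketch returns the complete list of values including st.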
The server 20 performs the following processes in order to determine whether a document EK(Dj) is a desired document. First, the inner product operator 220 calculates an inner product <s1H1+s2H2+ . . . +st−1Ht−1+stHt,Gj> using the transmitted value s1H1+s2H2+ . . . +st−1Ht−1+stHt and Gj stored in the server. Next, using the transmitted keyword field locations (i(1), i(2), . . . , i(t)), the transmitted values s1, s2, . . . , st−1, st, and the index keyword values stored in the server 20, the inner product operator 220 calculates s1(<Hji(1),Gj>+ki(1)aj)+s2(<Hji(2),Gj>+ki(2)aj)+ . . . +st(<Hji(t),Gj>+ki(t)aj) (S410).
Next, the comparator 230 calculates the difference between the two calculated values, that is, <s1H1+s2H2+ . . . +st−1Ht−1+stHt,Gj>−{s1(<Hji(1),Gj>+ki(1)aj)+s2(<Hji(2),Gj>+ki(2)aj)+ . . . +st(<Hji(t),Gj>+ki(t)aj)} (S420). If every transmitted search keyword matches the keyword in the corresponding keyword field of the document Dj stored in the server, each search keyword value equals the corresponding stored vector Hji( ), the inner product terms cancel, and the difference becomes −aj(s1ki(1)+s2ki(2)+ . . . +stki(t)), which is 0 by Expression 3.
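The cancellation may be written out term by term. The following merely restates the quantities defined above in LaTeX notation; the symbol \Delta_j for the difference and the summation index l are introduced here only for illustration.

```latex
\begin{align*}
\Delta_j
  &= \Big\langle \sum_{l=1}^{t} s_l H_l,\; G_j \Big\rangle
   - \sum_{l=1}^{t} s_l \big( \langle H_{j,i(l)}, G_j \rangle + k_{i(l)}\, a_j \big) \\
  &= \sum_{l=1}^{t} s_l \,\langle H_l - H_{j,i(l)},\; G_j \rangle
   \;-\; a_j \sum_{l=1}^{t} s_l\, k_{i(l)} \\
  &= \sum_{l=1}^{t} s_l \,\langle H_l - H_{j,i(l)},\; G_j \rangle
   \qquad \text{(the second sum is 0 by Expression 3).}
\end{align*}
```

When every search keyword matches, H_l = H_{j,i(l)} for each l and \Delta_j = 0. When at least one keyword differs, the remaining sum is a combination of random-looking field elements and is nonzero except with small probability.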
The comparator 230 performs the above-mentioned process on all of the encrypted documents EK(Di) of Expression 2, finds the encrypted documents for which the result of the above-mentioned process is 0, and transmits the matching encrypted documents to the terminal 10 through the transceiver 210 (S430).
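The whole exchange, from index generation to the comparison in the server, may be sketched end to end as follows. This is a toy sketch under assumptions that are not part of the embodiment: HMAC-SHA256 stands in for the keyed hash function, p is taken to be 2^127 − 1, keyword field locations are 0-based, document encryption is omitted, and all function names and the sample e-mail keywords are hypothetical.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1  # assumed prime modulus p

def hvec(key, word):
    """H = (h_K(W xor 1), h_K(W xor 2)); HMAC-SHA256 is an assumed keyed hash."""
    out = []
    for tag in (1, 2):
        w = int.from_bytes(word.encode("utf-8"), "big") ^ tag
        data = w.to_bytes((w.bit_length() + 7) // 8 or 1, "big")
        out.append(int.from_bytes(hmac.new(key, data, hashlib.sha256).digest(), "big") % P)
    return tuple(out)

def inner(u, v):
    return (u[0] * v[0] + u[1] * v[1]) % P

def build_index(key, ks, docs):
    """Terminal side: one row (G_i, [<H_ij, G_i> + k_j * a_i]) per document."""
    rows = []
    for words in docs:
        a = secrets.randbelow(P)
        g = (secrets.randbelow(P), secrets.randbelow(P))
        rows.append((g, [(inner(hvec(key, w), g) + k * a) % P for k, w in zip(ks, words)]))
    return rows

def trapdoor(key, ks, query):
    """Terminal side: search information for t >= 2 (field location, keyword) pairs."""
    pos = [p for p, _ in query]
    hs = [hvec(key, w) for _, w in query]
    s = [secrets.randbelow(P) for _ in query[:-1]]
    partial = sum(x * ks[p] for x, p in zip(s, pos)) % P
    s.append((-partial * pow(ks[pos[-1]], -1, P)) % P)  # solves Expression 3
    comb = tuple(sum(x * hv[c] for x, hv in zip(s, hs)) % P for c in (0, 1))
    return comb, pos, s

def server_search(rows, comb, pos, s):
    """Server side: keep documents whose two values differ by 0 in GF(P)."""
    hits = []
    for j, (g, vals) in enumerate(rows):
        left = inner(comb, g)
        right = sum(x * vals[p] for x, p in zip(s, pos)) % P
        if (left - right) % P == 0:
            hits.append(j)
    return hits

key = secrets.token_bytes(16)
ks = [secrets.randbelow(P - 1) + 1 for _ in range(4)]  # nonzero k_1 .. k_4
docs = [  # fields: From, To, Date, Subject
    ["alice", "bob", "2007-11-01", "budget"],
    ["alice", "carol", "2007-11-02", "budget"],
    ["dave", "bob", "2007-11-03", "lunch"],
]
rows = build_index(key, ks, docs)
matches = server_search(rows, *trapdoor(key, ks, [(0, "alice"), (3, "budget")]))
print(matches)
```

With the sample data, the query for "alice" in the first field and "budget" in the fourth field should select the first two documents, apart from a negligible chance of a false match. The server in this sketch sees only the rows, the combined vector, the field locations, and the values s1 to st.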
Finally, the user decrypts the encrypted document in the terminal 10 using the private key K (S360) and thus the corresponding documents become accessible.
As described above, the method of searching encrypted data for multiple keywords, in which the server does not know the contents of the data and the contents of the index, is provided. Therefore, it is possible to protect the privacy of the user, to search for a plurality of keywords at the same time, and to prevent information on each keyword from leaking out to the server.
Further, in order to search for t keywords, the server needs to perform one inner product operation, t finite field multiplications, and t finite field additions for each document. An inner product operation requires 2 finite field multiplications and one finite field addition. Therefore, searching for t keywords requires a total of (t+2) finite field multiplications and (t+1) finite field additions for each document. This computational cost is lower than that of the existing method, which requires several pairing operations for each document when multiple keywords are searched for. Therefore, it is possible to improve the efficiency.
According to the method of searching encrypted data using an inner product operation, it is possible to search for data desired by the user while keeping the contents of data and keywords concealed from the server. Therefore, it is possible to protect the privacy of the user with respect to important data.
Further, it is possible to search for a plurality of keywords at the same time, and to prevent the server from accessing the user's data by keeping information on the keywords concealed from the server.
Furthermore, it is possible to search for multiple keywords using less computation than the existing method based on a pairing operation, thereby improving the search efficiency.
Although the method of searching for multiple keywords according to a representative embodiment of the present invention has been described, it will be appreciated that modifications and variations can be made in the present invention without deviating from the spirit or scope of the invention, in which the results calculated by an inner product operation are transmitted to the server and documents matched to the results are searched for. A process of generating a public vector or a set of random values may be performed in other ways, and the document encryption and the index generation are not limited to the above-mentioned embodiment. Further, in the process of converting keywords into numerical values, methods other than a hash function may be used as long as the keywords are converted into an encrypted numerical form.
Further, according to the above-mentioned method, it is assumed that a plurality of keywords have fixed keyword field values. However, as long as a keyword set is maintained in a form on which an inner product operation can be performed, the field values for individual documents may vary.
Furthermore, in the searching system according to the embodiment of the present invention, each component may include other components. For example, the encrypted data generator or the search information generator may include the transceiver.
Number | Date | Country | Kind |
---|---|---|---|
10-2007-0119661 | Nov 2007 | KR | national |