Data storage systems store data on behalf of one or more users of such data. The data may or may not be stored in encrypted form. The users may submit a search request to the data storage system to search for particular data of interest. The data storage system performs the search and transmits the requested data to the user.
For a detailed description of various examples, reference will now be made to the accompanying drawings in which:
Users may store data in encrypted form (called “encrypted data”) in a storage device. Such users may want their encrypted data to be searchable without first having to decrypt it. That is, when a user desires to perform a search of certain data items, it would be desirable for the encrypted data to be searchable while still in its encrypted form. For some applications, different data items may be associated with the particular encryption key that was used to encrypt such data items. Further, different sets of data items may be encrypted with different encryption keys. The association between each encryption key and the encrypted data set that the key was used to encrypt should be protected. The examples disclosed herein provide searchable encryption techniques while accommodating users' desire to use the proper encryption key for their own encrypted data.
The data storage system 100 includes a storage device 110 coupled to a management unit 130. The storage device 110 includes non-transitory storage such as non-volatile storage (magnetic storage, optical storage, solid state storage, etc.), volatile storage (e.g., random access memory), or combinations thereof. Each client 50 may encrypt data and submit such encrypted data to the data storage system for storage in the storage device 110. Data is encrypted based on an encryption key and each client 50 may use a different encryption key to encrypt the data for each such client. Further, a given client 50 may use multiple different encryption keys to encrypt different sets of data. The storage device 110 stores encrypted data on behalf of multiple clients 50 and such encrypted data may include sets of data encrypted with different encryption keys. The encrypted data is stored in a data structure 120 contained in storage device 110.
In the example of
In accordance with the disclosed examples, each client 50 is able to perform a search for encrypted data stored on the data storage system using any of a plurality of search tokens. The search tokens may include any or all of:
The plaintext keyword may be, for example, any string of alphanumeric characters desired by a user to be associated with a particular encrypted data record. The plaintext keyword may be a string of alphanumeric characters that is contained in the plaintext version of the encrypted data record, but the plaintext keyword need not be present in the plaintext version of the encrypted data record.
The encrypted keyword is, as the name suggests, an encrypted version of an otherwise plaintext keyword. Any suitable encryption algorithm can be employed to actually encrypt a plaintext keyword to produce a corresponding encrypted keyword. For example, a technique may be used that can produce a cryptographically unpredictable value that is computed based on the plaintext keyword.
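As a minimal sketch of one way to compute such a cryptographically unpredictable value, the snippet below applies a keyed hash (HMAC-SHA256) to the plaintext keyword. The choice of HMAC-SHA256 and the names `encrypt_keyword` and `keyword_key` are assumptions made for illustration only; the description above permits any suitable algorithm.

```python
import hashlib
import hmac


def encrypt_keyword(keyword_key: bytes, plaintext_keyword: str) -> str:
    """Return a deterministic, key-dependent token for a plaintext keyword.

    Assumption: HMAC-SHA256 stands in for "any suitable" algorithm.
    Without keyword_key, the output cannot feasibly be predicted.
    """
    mac = hmac.new(keyword_key, plaintext_keyword.encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()
```

Because the computation is deterministic for a given key, a client holding the same key can recompute the encrypted keyword later and submit it as a search token.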
The key search token is a string of symbols (e.g., bits) having high entropy, which means that its prediction is computationally infeasible. A prediction task is “computationally infeasible” if the probability of success is less than a threshold. The key search token is not an encryption key in that it is not used to actually encrypt a data record or a keyword. The key search token is chosen independently of the encryption key that is used to encrypt the data record associated with the key search token, meaning that the key search token is not mathematically derived from that encryption key. In some implementations, the key search token is determined based on a random number generator.
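For the random-number-generator case, a key search token might be produced as sketched below. The function name `new_key_search_token` and the 32-byte default length are hypothetical choices, not values taken from this description.

```python
import secrets


def new_key_search_token(num_bytes: int = 32) -> bytes:
    """Draw a high-entropy token from the operating system's CSPRNG.

    Assumption: 32 bytes (256 bits) is used only as an illustrative length.
    The token is independent of any encryption key and encrypts nothing.
    """
    return secrets.token_bytes(num_bytes)
```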
Similarly, token 132 in table 126 is a plaintext keyword and is associated with IDs 1, 2, 7, and 8, which means that token 132 is associated with the encrypted data records in table 122 that are themselves associated with IDs 1, 2, 7, and 8. Token 134 in table 126 is a key search token and is associated with IDs 2 and 3, which means that this key search token is associated with the encrypted data records in table 122 that are themselves associated with IDs 2 and 3.
At 154, the management unit 130 determines one or more encrypted data records associated with the key search token received from the client 50. This operation may be performed by examining table 126 to identify all entries that include that particular key search token. The management unit 130 then uses the IDs associated with that key search token to access table 122 to obtain the encrypted data records associated with the IDs. At 156, the encrypted data record(s) determined from operation 154 is (are) then transmitted back to the client 50 that initiated the search request.
If, by chance, the management unit 130 is unable to locate a data record that comports with the key search token provided by the client, the management unit 130 does not return a data record and may transmit an error message to the client indicative of the problem.
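The lookup performed at 154 and 156, together with the error case, can be sketched as follows. This is an illustrative model only: the two dictionaries stand in for tables 126 and 122, and the names `search_by_token` and `RecordNotFoundError` are hypothetical.

```python
from typing import Dict, List


class RecordNotFoundError(LookupError):
    """Raised when no encrypted data record matches the search token."""


def search_by_token(
    token_index: Dict[str, List[int]],  # stands in for table 126: token -> IDs
    records: Dict[int, bytes],          # stands in for table 122: ID -> encrypted record
    token: str,
) -> List[bytes]:
    """Return the encrypted records whose IDs are associated with the token."""
    ids = token_index.get(token, [])
    found = [records[record_id] for record_id in ids if record_id in records]
    if not found:
        # Mirrors the error message the management unit may send to the client.
        raise RecordNotFoundError("no record matches the supplied search token")
    return found


# Usage: a key search token associated with IDs 2 and 3.
table_122 = {1: b"enc-1", 2: b"enc-2", 3: b"enc-3"}
table_126 = {"key-search-token": [2, 3]}
matches = search_by_token(table_126, table_122, "key-search-token")
```

The records are returned still encrypted; the lookup never needs an encryption key.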
The method of
The key search token may be generated in any of a variety of manners. For example,
At 168, the method includes the client 50 causing the third child “encryption key” to be derived from the parent encryption key to be used as a key search token. The phrase “encryption key” is placed in quotes in this context to indicate that this search token is derived from a parent encryption key using a key derivation function, but the key search token is not itself used to encrypt anything. As explained above, the key search token has properties (e.g., being a strong secret and having high entropy) sufficient to make it suitable for use as an encryption key, but it is not actually used to encrypt anything (e.g., data records, keywords).
One suitable key derivation function that can be used for the method of
In another example, the key search token may be chosen as an encryption key that is mathematically independent of the parent encryption key.
Each client 50 may securely maintain a copy of the information needed to recreate any of the search tokens that enable the searching of encrypted data records. Such information may include a list of all of the client's child keys themselves. Alternatively or additionally, such information may include the parent encryption key along with the salt values. The client 50 may re-compute the child keys based on the parent key and the salt values using the same key derivation functions used previously to encrypt the data records and keywords themselves (first and second child keys, respectively) as well as to generate the key search tokens (third child key). As noted above, the client 50 may interact with the key manager 75 to obtain or recompute the various child keys using the parent encryption key and salt values.
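Re-computation of the three child keys from a stored parent key and salt values can be sketched as below. PBKDF2-HMAC-SHA256 is used here purely as a stand-in key derivation function and is an assumption, as are the iteration count, the key length, and the names `derive_child_key` and `recompute_child_keys`.

```python
import hashlib
from typing import Dict


def derive_child_key(parent_key: bytes, salt: bytes,
                     iterations: int = 100_000, length: int = 32) -> bytes:
    """Derive one child key from the parent key and a salt value.

    Assumption: PBKDF2-HMAC-SHA256 stands in for the key derivation function.
    """
    return hashlib.pbkdf2_hmac("sha256", parent_key, salt, iterations, dklen=length)


def recompute_child_keys(parent_key: bytes, salts: Dict[str, bytes],
                         iterations: int = 100_000) -> Dict[str, bytes]:
    """Recreate every child key from the parent key and the stored salts."""
    return {
        purpose: derive_child_key(parent_key, salt, iterations)
        for purpose, salt in salts.items()
    }


# Usage: one salt per purpose (record key, keyword key, key search token).
salts = {"record": b"salt-1", "keyword": b"salt-2", "search_token": b"salt-3"}
child_keys = recompute_child_keys(b"parent-key-material", salts, iterations=1_000)
```

Since the derivation is deterministic, the client needs to retain only the parent key and the salts to recover all three values, including the key search token.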
The encryption process (e.g., to encrypt the data records and/or the keywords) may be an authenticated encryption process. An authenticated encryption process permits a client 50, with knowledge of the authentication key, to determine whether an encrypted data record has been altered since its encryption. An authenticated encryption scheme includes, for example, an “encrypt-then-MAC” technique. In this scenario, an encryption key used for encryption may be replaced with two keys—one for symmetric encryption and the other for the MAC.
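The MAC half of an encrypt-then-MAC scheme can be sketched as follows. The ciphertext is treated as opaque bytes produced by some symmetric cipher that is not shown, and HMAC-SHA256 is an assumed choice of MAC; the function names are hypothetical.

```python
import hashlib
import hmac

TAG_LEN = 32  # length in bytes of an HMAC-SHA256 tag


def attach_tag(mac_key: bytes, ciphertext: bytes) -> bytes:
    """Compute a tag over the already-encrypted bytes and append it."""
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext + tag


def verify_and_strip_tag(mac_key: bytes, tagged: bytes) -> bytes:
    """Return the ciphertext if the tag is valid; raise ValueError otherwise."""
    if len(tagged) < TAG_LEN:
        raise ValueError("input too short to contain a tag")
    ciphertext, tag = tagged[:-TAG_LEN], tagged[-TAG_LEN:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many tag bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("encrypted data record was altered or the key is wrong")
    return ciphertext
```

A client would verify the tag with the authentication key before attempting decryption with the separate symmetric encryption key.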
The above discussion is meant to be illustrative of the principles and various embodiments of the present invention. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/US2014/048872 | 7/30/2014 | WO | 00 |