In order to describe the manner in which the above-recited and other advantages and features can be obtained, a more particular description is provided below and will be rendered by reference to specific embodiments thereof, which are illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments and are therefore not to be considered limiting of scope, implementations will be described and explained with additional specificity and detail through the use of the accompanying drawings.
Embodiments are discussed in detail below. While specific implementations are discussed, it should be understood that this is done for illustration purposes only. A person skilled in the relevant art will recognize that other components and configurations may be used without departing from the spirit and scope of the subject matter of this disclosure.
Processing device 102 may be, for example, a server or other processing device capable of executing a database system. Processing device 104 may be a personal computer (PC) or other processing device capable of executing applications and communicating with processing device 102 via network 106.
Network 106 may be a wired or wireless network and may include a number of devices connected via wired or wireless means. Network 106 may include only one network or a number of different networks, some of which may be networks of different types.
In operating environment 100, processing device 104 may execute an application, which accesses information in a database of processing device 102 via network 106. The application may create, delete, read or modify data in the database of processing device 102.
Processor 220 may include at least one conventional processor or microprocessor that interprets and executes instructions. Memory 230 may be a random access memory (RAM) or another type of dynamic storage device that stores information and instructions for execution by processor 220. Memory 230 may also store temporary variables or other intermediate information used during execution of instructions by processor 220. ROM 240 may include a conventional ROM device or another type of static storage device that stores static information and instructions for processor 220. Storage device 250 may include any type of media for storing data and/or instructions. When processing device 200 is used to implement processing device 102, storage device 250 may include one or more databases of a database system.
Input device 260 may include one or more conventional mechanisms that permit a user to input information to processing device 200, such as, for example, a keyboard, a mouse, or other input device. Output device 270 may include one or more conventional mechanisms that output information to the user, including a display, a printer, or other output device. Communication interface 280 may include any transceiver-like mechanism that enables processing device 200 to communicate with other devices or networks. In one embodiment, communication interface 280 may include an interface to network 106.
Processing device 200 may perform such functions in response to processor 220 executing sequences of instructions contained in a computer-readable medium, such as, for example, memory 230, or other medium. Such instructions may be read into memory 230 from another computer-readable medium, such as storage device 250, or from a separate device via communication interface 280.
In a typical database system, data may be viewed as being stored in tables. A row of the table may correspond to a record in a file. Some database systems may permit data stored in a column of a table to be encrypted. Such database systems may permit an equality search on data in the encrypted column, provided the data is deterministically encrypted. That is, a search may be performed for rows in a table having a particular plaintext value corresponding to deterministically encrypted ciphertext in an encrypted column of the database. Deterministic encryption always encrypts a given plaintext item to the same ciphertext item when using a given cryptographic key. Thus, data patterns may be recognizable, resulting in information leakage.
Non-deterministic encryption methods such as, for example, use of block ciphers in cipher-block chaining (CBC) mode with a random initialization vector, or other non-deterministic encryption methods, may encrypt the same plaintext data items to different ciphertext data items. For example, non-deterministic encryption using a block cipher in CBC mode with a random initialization vector may encrypt each block of plaintext by XORing the current block of plaintext with the previous ciphertext block (or, for the first block, with the initialization vector) before encrypting the current block. Thus, a value of a ciphertext data item may be based not only on a corresponding plaintext data item and a cryptographic key, but may also be based on other data, such as, for example, previously encrypted blocks of data or a random initialization vector.
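The CBC chaining described above may be illustrated with a minimal sketch. The sketch assumes a toy 16-byte Feistel block cipher built from HMAC-SHA256 purely as a stand-in for a real block cipher; the toy cipher, the padding scheme, and all function names are illustrative assumptions, are not part of this disclosure, and are not suitable for protecting real data.

```python
import hashlib
import hmac
import os

BLOCK = 16  # block size in bytes for this toy cipher (assumption)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher: keyed hash cut to half a block.
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[: BLOCK // 2]


def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[: BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[: BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, plaintext: bytes, iv: bytes = None) -> bytes:
    # A fresh random IV per call is what makes the result non-deterministic.
    iv = os.urandom(BLOCK) if iv is None else iv
    pad = BLOCK - len(plaintext) % BLOCK
    data = plaintext + bytes([pad]) * pad  # PKCS#7-style padding
    prev, out = iv, [iv]
    for i in range(0, len(data), BLOCK):
        # XOR the current plaintext block with the previous ciphertext block,
        # then encrypt the result.
        prev = encrypt_block(key, _xor(data[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)  # IV is stored in front of the ciphertext


def cbc_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    plain = b"".join(
        _xor(decrypt_block(key, cur), prev) for prev, cur in zip(blocks, blocks[1:])
    )
    return plain[: -plain[-1]]  # strip padding
```

In this sketch, two calls to cbc_encrypt with the same key and plaintext produce different ciphertexts because each call draws a new initialization vector, whereas passing the same fixed iv twice reproduces the deterministic behavior described earlier.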
Embodiments consistent with the subject matter of this disclosure relate to database systems in which ranged lookups may be performed on deterministically or non-deterministically encrypted data of an encrypted column of a database. In one embodiment, an indexing structure for performing a ranged lookup on data in an encrypted column of a database is provided. The indexing structure may include a number of entries. Each of the entries may include an index value, which may be calculated by decrypting a respective data item from the encrypted column of the database and applying a transformation function to the respective decrypted data item to produce the index value. The transformation function may be defined in such a way that the produced index value reveals less information than the corresponding decrypted data item from the encrypted column of the database.
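As a minimal sketch of how one entry of such an indexing structure might be computed, the following example decrypts a data item, applies a transformation function, and stores the result as the index value. The names IndexEntry, make_entry, and bucket_by_thousand, as well as the XOR-based toy_encrypt and toy_decrypt stand-ins, are hypothetical and are used for illustration only.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class IndexEntry:
    index_value: int  # transformed value; reveals less than the plaintext
    row_id: int       # retrieval information for the corresponding row


def toy_encrypt(value: int) -> bytes:
    # Placeholder for real column encryption (NOT secure; illustration only).
    return bytes(b ^ 0x5A for b in str(value).encode())


def toy_decrypt(ciphertext: bytes) -> int:
    # Placeholder for real column decryption, inverse of toy_encrypt.
    return int(bytes(b ^ 0x5A for b in ciphertext).decode())


def bucket_by_thousand(value: int) -> int:
    # Example transformation function: keeps only the thousands bucket,
    # so 52001 and 52750 both map to 52.
    return value // 1000


def make_entry(
    ciphertext: bytes,
    row_id: int,
    decrypt: Callable[[bytes], int],
    transform: Callable[[int], int],
) -> IndexEntry:
    # Decrypt the data item, then transform it to obtain the index value.
    return IndexEntry(transform(decrypt(ciphertext)), row_id)
```

With this example transformation, make_entry(toy_encrypt(52750), 7, toy_decrypt, bucket_by_thousand) yields an entry whose index value is 52. The index still orders rows by magnitude, but the exact value cannot be recovered from the index alone.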
In some implementations, the transformation function may be defined for a particular encrypted column of the database. In embodiments consistent with the subject matter of this disclosure, a user may be permitted to define or modify the transformation function for the particular encrypted column of the database. In some implementations, only those users who are authorized to modify and retrieve decrypted data from all encrypted columns of the database may be permitted to define or modify the transformation function for a particular encrypted column of the database. In such implementations, limiting permission to define or modify the transformation function to those users may prevent an escalation of privileges attack.
As an example of an escalation of privileges attack, assume that a database system permits a user to define a transformation function for an encrypted column of the database even when the user is not authorized to access decrypted data for the encrypted column. The user may define or modify the transformation function to be weak such that all or nearly all information from respective decrypted data items from the encrypted column of the database may be stored as index values of an indexing structure for performing a ranged lookup operation. At this point, a copy of the encrypted data, or an equivalent produced by the weak transformation function, may be available in plaintext in the system, allowing the user to view it directly and nullifying the benefits of data encryption.
In embodiments consistent with the subject matter of this disclosure, after a user defines or modifies the transformation function for a particular encrypted column of the database, index values in respective entries of the indexing structure of the database may be recalculated according to the modified transformation function and the indexing structure may be rearranged such that a ranged lookup may be performed by traversing the indexing structure according to the recalculated index values.
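The authorization check and the subsequent recalculation may be sketched as follows. This is an illustrative sketch only: the class ColumnIndex, the boolean flag standing in for the authorization decision, and the use of a sorted list in place of a tree-shaped indexing structure are all assumptions made for brevity.

```python
from typing import Callable, List, Tuple


class ColumnIndex:
    """Hypothetical index over one encrypted column (sorted list as a stand-in)."""

    def __init__(
        self,
        encrypted_column: List[bytes],
        decrypt: Callable[[bytes], int],
        transform: Callable[[int], int],
    ) -> None:
        self.encrypted_column = encrypted_column
        self.decrypt = decrypt
        self.transform = transform
        self.entries: List[Tuple[int, int]] = []
        self._recalculate()

    def _recalculate(self) -> None:
        # Recompute every index value with the current transformation
        # function, then rearrange (here: sort) the entries.
        self.entries = sorted(
            (self.transform(self.decrypt(item)), row_id)
            for row_id, item in enumerate(self.encrypted_column)
        )

    def redefine_transform(
        self,
        may_access_all_encrypted_columns: bool,
        new_transform: Callable[[int], int],
    ) -> None:
        # Refuse the change unless the requester may already read the
        # decrypted data; otherwise a weak transform could leak plaintext.
        if not may_access_all_encrypted_columns:
            raise PermissionError("requester may not define the transformation function")
        self.transform = new_transform
        self._recalculate()
```

In this sketch, an unauthorized call leaves the existing index untouched, while an authorized call replaces the transformation function and rebuilds every entry so that later ranged lookups traverse the recalculated index values.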
In some implementations, one or more ranged lookup operators may be defined for performing ranged lookups on a particular encrypted column of the database. In such implementations, use of a ranged lookup operator, which is not defined for performing a ranged lookup on the particular encrypted column of the database, may result in a failed ranged lookup operation.
In one implementation, the indexing structure may include a B-tree or other indexing structure, which may be used to perform a ranged lookup operation to find one or more rows in the database having a particular plaintext data item, corresponding to encrypted data of an encrypted column of the database, which satisfies the ranged lookup operation.
Database systems typically use some type of indexing scheme for quickly searching data stored in a column of a database in order to access particular records or rows. One well-known indexing scheme includes use of a B-tree, although other indexing schemes may also be used in other embodiments.
Index node 302 may include a link 304, which may be a link to index node 312 having entries with corresponding index values less than index value 3452 of index node 302, a link 306, which is a link to index node 320 having an entry with a corresponding index value greater than index value 3452 and less than index value 6598 of index node 302, a link 308, which may link index node 302 to index node 326 having one or more entries with respective index values greater than index value 6598 and less than index value 8746 of index node 302, and a link 310, which may link index node 302 to an index node 328 having one or more entries with respective index values greater than index value 8746 of index node 302.
Further, index node 312 may include a link 314 to index node 330, which may include one or more entries having index values less than index value 1578 of index node 312, a link 316 to index node 332, which may include one or more entries including index values greater than index value 1578 and less than index value 2094 of index node 312, and a link 318 to index node 334, which may include one or more entries including index values greater than index value 2094 of index node 312. Index node 320 may include a link 322 to index node 336, which may include one or more entries including index values less than index value 4678 of index node 320, and a link 324 to index node 338, which may include one or more entries including index values greater than index value 4678 of index node 320.
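The way the links select a child node can be sketched with the index values given above. In this sketch the class IndexNode and the function path_to are hypothetical names, and the lower-level nodes are left empty because their contents are not specified in the description.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndexNode:
    name: str
    values: List[int]  # sorted index values held by this node
    children: List["IndexNode"] = field(default_factory=list)  # len(values) + 1 links


def path_to(root: IndexNode, target: int) -> List[str]:
    """Return the names of the nodes visited while searching for target."""
    path = []
    node = root
    while node is not None:
        path.append(node.name)
        i = bisect_left(node.values, target)
        if i < len(node.values) and node.values[i] == target:
            break  # target is an index value of this node
        # Link i leads to values between values[i - 1] and values[i].
        node = node.children[i] if node.children else None
    return path


def leaf(name: str) -> IndexNode:
    return IndexNode(name, [])  # contents not given in the description


node_312 = IndexNode("312", [1578, 2094], [leaf("330"), leaf("332"), leaf("334")])
node_320 = IndexNode("320", [4678], [leaf("336"), leaf("338")])
node_302 = IndexNode(
    "302", [3452, 6598, 8746], [node_312, node_320, leaf("326"), leaf("328")]
)
```

For example, path_to(node_302, 2000) visits nodes 302, 312, and 332, because 2000 is less than 3452 and lies between 1578 and 2094.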
Because a ranged lookup operation may result in a number of rows of the database which satisfy the ranged lookup operation, the exemplary B-tree indexing structure of
Each of the index nodes may include a different number of items than as shown in the exemplary indexing structure of
In embodiments consistent with the subject matter of this disclosure, an indexing structure, such as, for example, the indexing structure of
The process may begin by processing device 102 decrypting a data item from an encrypted column of the database (act 402). Processing device 102 may then apply the transformation function to the decrypted data item to produce a transformed data item that reveals less information than the decrypted data item (act 404). Processing device 102 may create an entry in an indexing structure, which includes the transformed decrypted data item and retrieval information such as, for example, a pointer or a link, for retrieving a corresponding row in the database (act 406). Processing device 102 may then determine whether there are more data items in the encrypted column of the database (act 408). If processing device 102 determines that more data items exist in the encrypted column of the database, then processing device 102 may access a next data item from the encrypted column of the database (act 412) and may repeat acts 402-408.
If, while performing act 408, processing device 102 determines that there are no additional data items in the encrypted column of the database, then processing device 102 may arrange the entries of the indexing structure such that the transformed decrypted data items in each entry of the indexing structure may be used as index values for performing a ranged lookup operation (act 410). In one embodiment, arranging the entries of the indexing structure may include setting the links or pointers of the indexing structure to point to other appropriate entries of the indexing structure.
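Acts 402 through 412 may be sketched as a simple loop. The sketch below is one possible reading, not a definitive implementation: build_index, last_four_digits, and the byte-reversal toy_encrypt and toy_decrypt stand-ins are hypothetical, and sorting a list stands in for setting the links or pointers of a tree-shaped indexing structure.

```python
from typing import Callable, List, Tuple


def toy_encrypt(plaintext: str) -> bytes:
    # Placeholder for real column encryption (byte reversal; NOT secure).
    return plaintext.encode()[::-1]


def toy_decrypt(ciphertext: bytes) -> str:
    # Placeholder for real column decryption, inverse of toy_encrypt.
    return ciphertext[::-1].decode()


def last_four_digits(ssn: str) -> str:
    # Example transformation function: keep only the last four digits.
    return "".join(c for c in ssn if c.isdigit())[-4:]


def build_index(
    encrypted_column: List[bytes],
    decrypt: Callable[[bytes], str],
    transform: Callable[[str], str],
) -> List[Tuple[str, int]]:
    entries = []
    # Acts 408 and 412: continue while more data items exist in the column.
    for row_id, item in enumerate(encrypted_column):
        plaintext = decrypt(item)             # act 402
        index_value = transform(plaintext)    # act 404
        entries.append((index_value, row_id))  # act 406: value plus retrieval info
    entries.sort()                            # act 410: arrange the entries
    return entries


column = [toy_encrypt(s) for s in ("123-45-6789", "987-65-4321", "555-12-0042")]
index = build_index(column, toy_decrypt, last_four_digits)
```

Here each entry pairs a transformed value with a row identifier, so the index orders rows by the last four digits while the full Social Security numbers remain only in the encrypted column.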
After receiving the ranged lookup request, processing device 102 may determine whether a ranged lookup operator of the ranged lookup request is defined for use on the encrypted column of the database (act 504). In one implementation, ranged lookup operators such as, for example, “<”, “≦”, “>”, “≧” and “LIKE”, as well as other or different ranged lookup operators, may be defined for performing a ranged lookup operation on the encrypted column of the database. “<” may be used to find entries in the database having a value less than a particular value, “≦” may be used to find entries in the database having a value less than or equal to a particular value, “>” may be used to find entries in the database having a value greater than a particular value, “≧” may be used to find entries in the database having a value greater than or equal to a particular value, and “LIKE” may be used to find matching entries that may have been truncated by application of a transformation function such as, for example, entries that match a particular value for the last four digits of a Social Security number.
If, during act 504, processing device 102 determines that the ranged lookup operator in the ranged lookup request is not defined with respect to the encrypted column, then processing device 102 may return an indication to the requester that the ranged lookup request could not be performed (act 506).
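The check of acts 504 and 506 may be sketched as a lookup in a per-column table of defined operators. The table contents, the column names, and the function validate_request are hypothetical and serve only to illustrate the check.

```python
from typing import Dict, Set, Tuple

# Hypothetical per-column definitions of permitted ranged lookup operators.
DEFINED_OPERATORS: Dict[str, Set[str]] = {
    "salary": {"<", "<=", ">", ">="},
    "ssn": {"LIKE"},
}


def validate_request(column: str, operator: str) -> Tuple[bool, str]:
    """Act 504: is the operator defined for this encrypted column?"""
    if operator in DEFINED_OPERATORS.get(column, set()):
        return True, ""
    # Act 506: indicate to the requester that the lookup cannot be performed.
    return False, f"operator {operator!r} is not defined for column {column!r}"
```

In this sketch, validate_request("salary", "<") succeeds, while validate_request("salary", "LIKE") returns a failure indication instead of searching the indexing structure.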
If, during act 504, processing device 102 determines that the ranged lookup operator in the ranged lookup request is defined with respect to the encrypted column, then processing device 102 may search or traverse an indexing structure such as, for example, the indexing structure of
If processing device 102 determines that a corresponding item was found, as a result of performing act 508, then processing device 102 may use retrieval information included in an entry of the indexing structure corresponding to the found item to retrieve a corresponding row in the database and to provide the corresponding row to the requester (act 514). Processing device 102 may then use the indexing structure to determine whether additional items satisfy the ranged lookup request (act 516). In one implementation, act 516 may be performed by processing device 102 accessing a link to entries of the indexing structure having an index value equal to the index value of the current entry of the indexing structure, and by traversing the indexing structure, in a manner as illustrated by the exemplary indexing structure of
The process may end when processing device 102 determines that no additional items satisfy the ranged lookup request.
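A ranged lookup over transformed index values may be sketched as follows. Several assumptions are made that are not stated in the description: the indexing structure is represented as a sorted list of (index value, row identifier) pairs, the transformation function is monotonically non-decreasing, and each candidate row is decrypted and compared against the requested bound because the transformed value alone may not decide the comparison. The function ranged_lookup and the placeholder decryption are hypothetical.

```python
from bisect import bisect_left, bisect_right
from typing import Callable, List, Tuple

COMPARATORS = {
    "<": lambda v, b: v < b,
    "<=": lambda v, b: v <= b,
    ">": lambda v, b: v > b,
    ">=": lambda v, b: v >= b,
}


def ranged_lookup(
    entries: List[Tuple[int, int]],  # sorted (index_value, row_id) pairs
    encrypted_column: List[bytes],
    decrypt: Callable[[bytes], int],
    transform: Callable[[int], int],  # assumed monotonically non-decreasing
    operator: str,
    bound: int,
) -> List[int]:
    if operator not in COMPARATORS:
        raise ValueError(f"unsupported operator: {operator}")
    keys = [key for key, _ in entries]
    t = transform(bound)
    if operator in ("<", "<="):
        candidates = entries[: bisect_right(keys, t)]
    else:
        candidates = entries[bisect_left(keys, t):]
    compare = COMPARATORS[operator]
    # Assumed final check on the decrypted value of each candidate row.
    return [
        row_id
        for _, row_id in candidates
        if compare(decrypt(encrypted_column[row_id]), bound)
    ]


def placeholder_decrypt(item: bytes) -> int:
    return int(item.decode())  # stand-in for real decryption


def bucket(value: int) -> int:
    return value // 1000  # example transformation function


column = [b"52750", b"18200", b"31999", b"32000"]
entries = sorted((bucket(placeholder_decrypt(c)), i) for i, c in enumerate(column))
```

With this example data, a request for values less than 32000 narrows the index to three candidate entries and returns the two rows whose decrypted values actually satisfy the comparison.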
If processing device 104 determines that the requester is authorized to define or redefine a transformation function, then processing device 104 may permit the transformation function to be defined or altered by the requester (act 608). Processing device 104 may then recalculate the index values of the indexing structure (act 610). For example, processing device 104 may access data items from the encrypted column, decrypt the data items, and apply the transformation function to each decrypted data item to produce a transformed data item. The transformed data item may then be stored as an index value in an entry of the indexing structure. Processing device 104 may repeat the recalculating of the index values of the indexing structure until all index values have been recalculated. After all of the index values of the indexing structure have been recalculated, processing device 104 may rearrange the indexing structure (act 612). For example, in an indexing structure such as the indexing structure shown in
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms for implementing the claims.
Although the above description may contain specific details, they should not be construed as limiting the claims in any way. Other configurations of the described embodiments are part of the scope of this disclosure. Further, implementations consistent with the subject matter of this disclosure may have more or fewer acts than as described, or may implement acts in a different order than as shown. Accordingly, the appended claims and their legal equivalents should only define the invention, rather than any specific examples given.