This application relates in general to data encryption, and in particular to a computer-implemented system and method for protecting sensitive data via data re-encryption.
Companies tend to collect large amounts of data during the normal course of business. At least a portion of the data includes sensitive information, such as financial transactions, medical profiles, and customer identification, including social security numbers. Once collected, the companies must store the data, sometimes for long periods of time, as required by company policy or by government guidelines and policies. However, a majority of the companies are unable to store the data themselves due to the considerable amount of storage space required and thus rely on leasing storage and computing power from larger companies. Servers used by the larger companies to store the data are public and often cloud based.
Additionally, the field of business intelligence depends on analytics to identify trends, steer strategies, and support successful business practices. The analysis is commonly performed by analysts hired by a company. These analysts are generally entrusted with important tools, including decryption keys to decrypt the stored data prior to analysis. However, if an unauthorized individual, such as an adversary, obtains the decryption key, access to the entire database storing the data is granted. Unfortunately, mobile devices of the analysts are often not equipped with strong intrusion prevention mechanisms, which makes the analyst a weak link for attack by an adversary.
Protecting data owners' sensitive information from unauthorized individuals is extremely important to prevent misappropriation of the data. Currently, sensitive data can be protected via an access control mechanism at a server on which the data is stored so that the server first engages with a party interested in the data and then requires the interested party to enter necessary credentials to pass authentication protocols established by the access control mechanism before accessing the data. Unfortunately, the number of security breaches has recently increased due to unauthorized access to the credentials of authorized users.
In addition to requiring a user to enter credentials, stored data can be encrypted prior to storage as an additional security layer to reduce the effects of breach by preventing access to the data content. However, encryption itself is generally not secure enough to prevent disclosure of the data content. For instance, to encrypt the data, companies generally utilize a public key to encrypt the data prior to storage. Subsequently, a user associated with the company needs to access the data, but to do so, must obtain a secret key of the company to decrypt the encrypted data. Allowing multiple users of the company access to the secret key places the data in a vulnerable position since the user can provide the key to unauthorized users. Additionally, the secret key can be accessed directly by unauthorized users, resulting in access to the data content. Unfortunately, obtaining a secret key can be fairly easy since humans are often easily fooled by simple social engineering attacks.
Therefore, there is a need for an approach to improved data protection and breach prevention. Preferably, the data protection and breach prevention will include a re-encryption scheme for large amounts of plaintext data to reduce the effects of unauthorized access to the data itself or via individuals authorized to access the data.
A secure cloud-computing architecture can be used to increase security of sensitive data and reduce opportunities for breach over conventional security methods. Public and secret encryption keys can be generated for a data owner storing sensitive data on a cloud based server and for each individual authorized to access the data. Prior to storage, the data is encrypted using the data owner's public key. A user can submit a query to access the encrypted data and results of the query are determined based on the stored encrypted data. A re-encryption key is generated for the requesting user using his public key. The encrypted data results, which are in ciphertext form, are then re-encrypted to a different ciphertext form, using the re-encryption key. The re-encrypted results are provided to the requesting individual and then decrypted for analysis and further use. Specifically, decryption of the re-encrypted ciphertext reveals the underlying plaintext, namely the query result.
An embodiment provides a computer-implemented method for protecting sensitive data via data re-encryption. Encrypted data is maintained. A data query is received from a user associated with a public key and a secret key. Results of the query are computed by identifying at least a portion of the encrypted data and by adding plaintext for the identified portion of the encrypted data as the results. A re-encryption key is generated for the results using the public key of the user and the results are re-encrypted using the re-encryption key. The re-encrypted results are then transmitted to the user.
Still other embodiments of the present invention will become readily apparent to those skilled in the art from the following detailed description, wherein are described embodiments of the invention by way of illustrating the best mode contemplated for carrying out the invention. As will be realized, the invention is capable of other and different embodiments and its several details are capable of modifications in various obvious respects, all without departing from the spirit and the scope of the present invention. Accordingly, the drawings and detailed description are to be regarded as illustrative in nature and not as restrictive.
Companies collect and store large amounts of data. Hired analysts can analyze the collected data, such as for use in business intelligence. However, analysts are often considered to be a weak link since each is trusted with extremely important tools, such as a decryption key to decrypt encrypted data prior to analysis. Unfortunately, most of the analysts are not equipped with strong intrusion prevention mechanisms and thus, attacks are common. Separately encrypting data for each data owner and re-encrypting the data for a particular user authorized by a data owner helps prevent breach of the data due to unauthorized access to the server, untrustworthy users, or attacks on a user privy to security information of the data.
Re-encryption makes it possible for a decryption key, used to decrypt an encrypted database, to no longer reside at the endpoints.
The server 11 includes an encryptor 12, set-up module 13, and key generator 14. The encryptor 12 utilizes the public key of the data owner to encrypt the data 16 stored on the database 15. Once encrypted, the data 23 is transmitted for storage in a database 22 associated with a cloud based server 19. Cloud based storage offers extremely large amounts of storage space, which relieves data owners of the burden of storing all data locally. To access the encrypted data 23 from the cloud based servers, authorized users are each associated with a public key 29 and a secret key 30, where the public key can be maintained in a database 28 associated with a computing device 24 of that user. Alternatively, the public key 29 can be stored in a database of a cloud based server. The users' public 29 and secret 30 keys can be generated by the owner's server 11 via the set-up module 13, which outputs parameters that can be used by the key generator 14 to generate the keys for each user authorized to access the data 23, as further described below with reference to
Each authorized user can access the owner's encrypted data 23 via the computing device 24, such as a desktop or laptop computer, as well as a mobile device, for performing analytics on the data. Specifically, the computing device 24 is associated with a server 25 having a query generator 26 to generate the query and a decryptor 27. The query is transmitted to the cloud based server 19, which includes a query receiver 20 to receive and parse the query, and a result finder 21 that processes the encrypted data 23 in response to the query and generates one or more encrypted results. The results of the query are computed by adding the underlying plaintext of the results. However, prior to providing the encrypted results to the user, the results are transmitted to a proxy re-encryption server 31, which includes a key generator 32 and a re-encryptor 33. The key generator 32 generates a re-encryption key 35 for each authorized user based on the secret key of the data owner and the public key of that requesting user, as described below in further detail with respect to
The mobile computing devices and servers can each include one or more modules for carrying out the embodiments disclosed herein. The modules can be implemented as a computer program or procedure written as source code in a conventional programming language and presented for execution by the central processing unit as object or byte code. Alternatively, the modules could also be implemented in hardware, as integrated circuitry, and each of the client and server can act as a specialized computer. The various implementations of the source code and object and byte codes can be held on a computer-readable storage medium, such as a floppy disk, hard drive, digital video disk (DVD), random access memory (RAM), read-only memory (ROM) and similar storage mediums. Other types of modules and module functions are possible, as well as other physical hardware components.
Re-encrypting ciphertext helps prevent data breaches, as well as minimizes the effect of any breach that occurs, by preventing the sharing of decryption keys and adding an additional level of security. Specifically, if a re-encryption key is stolen, all the unauthorized user can do with the key is convert the ciphertext from one public key to another. In other words, the unauthorized user is not able to decrypt the ciphertext.
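The property described above can be illustrated with a minimal sketch. The code below is not the scheme of this application; it is a classic ElGamal-style proxy re-encryption (in the style of Blaze, Bleumer, and Strauss) over a toy group, chosen only because it is short. The modulus `P`, subgroup order `Q`, generator `G`, and all function names are assumptions made for illustration, and the parameters are far too small for real use. Note also that this stand-in derives the re-encryption key from two secret keys, whereas the text above derives it from the data owner's secret key and the user's public key.

```python
import secrets

# Toy parameters (assumed for illustration): P = 2*Q + 1, G generates the order-Q subgroup.
P, Q, G = 1019, 509, 4

def keygen():
    """Return (secret key, public key) with pk = G^sk mod P."""
    sk = secrets.randbelow(Q - 1) + 1          # sk in [1, Q-1], invertible mod Q
    return sk, pow(G, sk, P)

def encrypt(pk, m):
    """Encrypt group element m under pk as (pk^r, G^r * m)."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(pk, r, P), (pow(G, r, P) * m) % P

def rekey(sk_from, sk_to):
    """Re-encryption key sk_to / sk_from mod Q (stand-in construction)."""
    return (sk_to * pow(sk_from, -1, Q)) % Q

def reencrypt(rk, ct):
    """Convert a ciphertext from one public key to another without decrypting."""
    c1, c2 = ct
    return pow(c1, rk, P), c2

def decrypt(sk, ct):
    """Strip the mask: (G^(sk*r))^(1/sk) = G^r, then divide it out."""
    c1, c2 = ct
    mask = pow(c1, pow(sk, -1, Q), P)
    return (c2 * pow(mask, -1, P)) % P

owner_sk, owner_pk = keygen()
user_sk, user_pk = keygen()
message = pow(G, 42, P)                         # a group element standing in for data
stored = encrypt(owner_pk, message)             # encrypted under the owner's key
converted = reencrypt(rekey(owner_sk, user_sk), stored)
recovered = decrypt(user_sk, converted)         # the user never sees owner_sk
```

The proxy holding only the re-encryption key performs a single exponentiation on the first ciphertext component; it never handles a decryption key or the plaintext.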
Data collected by the owner can be encrypted (block 44) using the owner's public key and stored on one or more cloud based servers, as further described below with reference to
Each user authorized to access data of an owner is associated with a public key and secret key. The keys of the users should be related based on their common relationship with the data owner. The set-up phase determines parameters for generating the keys, such that the keys of all the authorized users are tied together based on a common set of parameters.
A set of elements, $\mathbb{Z}_N = \{0, 1, \ldots, N-1\}$, is accessed (block 53) and two subsets of elements are randomly selected (block 54) from the set $\mathbb{Z}_N$. Each randomly selected element is processed (block 55) through exponentiation to generate two calculated sets of elements. For example, two subsets of elements $\alpha_0, \alpha_1, \ldots, \alpha_{k-1}$ and $\beta_0, \beta_1, \ldots, \beta_{k-1}$ can be randomly selected from the set $\mathbb{Z}_N$, and separately employed in the following equations:
to generate parameters $h_0, \ldots, h_{k-1}$ and $f_0, \ldots, f_{k-1}$
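As a rough sketch of this step, the following code samples two lists of random exponents and exponentiates a base with each one. The equations themselves are not reproduced in this passage, so the base `G`, the modulus `P`, and the use of the prime `Q` in place of $N$ are all assumptions made for illustration, as is the function name `setup_params`.

```python
import secrets

# Toy parameters (assumed): G generates a subgroup of prime order Q modulo P.
P, Q, G = 1019, 509, 4

def setup_params(k):
    """Sample k alphas and k betas, then exponentiate to obtain h_i and f_i."""
    alphas = [secrets.randbelow(Q) for _ in range(k)]   # stand-in for alpha_i from Z_N
    betas = [secrets.randbelow(Q) for _ in range(k)]    # stand-in for beta_i from Z_N
    h = [pow(G, a, P) for a in alphas]                  # assumed form: h_i = G^alpha_i
    f = [pow(G, b, P) for b in betas]                   # assumed form: f_i = G^beta_i
    return (alphas, betas), (h, f)

(alphas, betas), (h, f) = setup_params(4)
```

The exponents stay with the party running set-up, while the exponentiated values `h` and `f` play the role of the parameters used later by the encode step.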
Also during the set-up phase, a decryption oracle is constructed (block 56) for use with a decode algorithm to decrypt encrypted and re-encrypted ciphertext using one of the prime numbers, as further described below with reference to
Once set-up has been performed, public keys and secret keys for the data owner and one or more users can be generated. For example, upon hiring a new employee, a public and secret key can be generated for the employee to access the encrypted data stored on a cloud based server and owned by the employer.
Additionally, two groups of two elements each are randomly selected (block 62) from the set $\mathbb{Z}_N$. For example, $a, \tilde{a} \leftarrow \mathbb{Z}_N$ and $b, \tilde{b} \leftarrow \mathbb{Z}_N$. The public key is computed (block 63) using the random elements from $G$ and the two groups of random elements from $\mathbb{Z}_N$. Specifically, the random elements from $G$ and from $\mathbb{Z}_N$ are combined to generate $g^a$, $g^b$, $\tilde{g}^{\tilde{a}}$, and $\tilde{g}^{\tilde{b}}$. The public key is then output (block 65) as $pk = ((g, g^a, g^b), (\tilde{g}, \tilde{g}^{\tilde{a}}, \tilde{g}^{\tilde{b}}))$.
Meanwhile, the secret key is computed (block 64) based on both groups of random elements selected from $\mathbb{Z}_N$ and one of the secret parameters. In one embodiment, $p_1$ is used for the secret parameter; however, another one of the secret parameters can be used. The secret key is then output (block 65) as $sk = ((a, b, g), (\tilde{a}, \tilde{b}, \tilde{g}), p_1)$. Upon output (block 65), the public and secret keys can be maintained by the data owner, as well as provided to the associated user for accessing stored data. The parameters from the set-up phase can also be used to generate public and secret keys for the data owner using the above identified processes.
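The shape of the two keys can be sketched as follows. This is a minimal illustration, assuming a toy prime-order subgroup in place of the group $G$ and the prime `Q` in place of $N$; the function name `keygen` and the way the secret parameter is passed in are hypothetical.

```python
import secrets

# Toy parameters (assumed): BASE generates a subgroup of prime order Q modulo P.
P, Q, BASE = 1019, 509, 4

def random_group_element():
    """Pick a non-identity element of the order-Q subgroup."""
    return pow(BASE, secrets.randbelow(Q - 1) + 1, P)

def random_exponent():
    """Pick a nonzero exponent so that it can later be inverted mod Q."""
    return secrets.randbelow(Q - 1) + 1

def keygen(secret_param):
    """Return (pk, sk) laid out as in the text: two triples, plus p1 in sk."""
    g, g_t = random_group_element(), random_group_element()
    a, a_t = random_exponent(), random_exponent()
    b, b_t = random_exponent(), random_exponent()
    pk = ((g, pow(g, a, P), pow(g, b, P)),
          (g_t, pow(g_t, a_t, P), pow(g_t, b_t, P)))
    sk = ((a, b, g), (a_t, b_t, g_t), secret_param)
    return pk, sk

pk, sk = keygen(secret_param=7)   # 7 is a placeholder for the secret parameter p1
```

Each half of the public key exposes only exponentiated values, while the matching half of the secret key holds the exponents themselves.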
Prior to storing data, especially sensitive data, on cloud based servers, the data owner can encrypt the data to help prevent breach, as well as reduce access to the data should breach occur.
For each message, all of the first message blocks $M_i[1]$ are collected (block 74) and all of the second message blocks $M_i[2]$ are collected (block 77). Subsequently, an encode algorithm is run (blocks 75, 78) separately on the first set of blocks and the second set of blocks. The algorithm for each group of blocks can be run simultaneously or asynchronously. The encode algorithm receives as input the public key of the data owner $pk = ((g, g^a, g^b), (\tilde{g}, \tilde{g}^{\tilde{a}}, \tilde{g}^{\tilde{b}}))$ and a message. In one embodiment, the encode algorithm is as follows:
While the encode algorithm runs, two groups of two elements are sampled from $\mathbb{Z}_N$, $r, s \leftarrow \mathbb{Z}_N$ and $\tilde{r}, \tilde{s} \leftarrow \mathbb{Z}_N$, and are used to encode the message. Specifically, the group of first message blocks is encrypted to generate (block 76) ciphertext $C_1$ according to the following equation:
$$C_1 = \left((g^a)^r,\ (g^b)^s,\ g^{r+s} \cdot \mathrm{Encode}\left(\{h_i\}_{i=0}^{k-1},\ (m_0[1], \ldots, m_{k-1}[1])\right)\right)$$
where $h_i$ represents a portion of the public parameters identified during the set-up phase, as described above with reference to
$$C_2 = \left((\tilde{g}^{\tilde{a}})^{\tilde{r}},\ (\tilde{g}^{\tilde{b}})^{\tilde{s}},\ \tilde{g}^{\tilde{r}+\tilde{s}} \cdot \mathrm{Encode}\left(\{f_i\}_{i=0}^{k-1},\ (m_0[2], \ldots, m_{k-1}[2])\right)\right)$$
where $f_i$ represents a portion of the public parameters identified during the set-up phase, as described above with reference to
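The three-component ciphertext can be sketched in a few lines. The code below covers one half of the key only and rests on several assumptions that this passage does not confirm: a toy prime-order group replaces the actual group, `Encode` is taken to be the product of each public parameter raised to its message block, and the `unmask` helper removes the blinding factor by inverting the exponents `a` and `b`. All names are hypothetical.

```python
import secrets

# Toy parameters (assumed): G generates a subgroup of prime order Q modulo P.
P, Q, G = 1019, 509, 4

def encode(bases, blocks):
    """Assumed form of Encode: product of bases[i] ** blocks[i] mod P."""
    out = 1
    for h, m in zip(bases, blocks):
        out = (out * pow(h, m, P)) % P
    return out

def encrypt_half(pk_half, bases, blocks):
    """Build (W, X, Y) = ((g^a)^r, (g^b)^s, g^(r+s) * Encode(...))."""
    g, g_a, g_b = pk_half
    r, s = secrets.randbelow(Q), secrets.randbelow(Q)
    W = pow(g_a, r, P)
    X = pow(g_b, s, P)
    Y = (pow(g, r + s, P) * encode(bases, blocks)) % P
    return W, X, Y

def unmask(sk_half, ct):
    """Remove the g^(r+s) factor using a and b, leaving the Encode value."""
    a, b, _g = sk_half
    W, X, Y = ct
    mask = (pow(W, pow(a, -1, Q), P) * pow(X, pow(b, -1, Q), P)) % P  # g^r * g^s
    return (Y * pow(mask, -1, P)) % P

a, b = 5, 7                                    # example secret exponents
pk_half = (G, pow(G, a, P), pow(G, b, P))
sk_half = (a, b, G)
bases = [pow(G, 3, P), pow(G, 11, P)]          # stand-ins for h_0, h_1
blocks = [2, 5]                                # stand-ins for m_0[1], m_1[1]
ciphertext = encrypt_half(pk_half, bases, blocks)
encoded = unmask(sk_half, ciphertext)
```

Under these assumptions the value left after unmasking is still an encoded group element, not the message blocks themselves, which is why a separate decode step is needed.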
Once the encrypted data is stored, a user can analyze the data by submitting a query. Results of the query can be identified via a cloud based server, for example. However, prior to providing the results to the requesting user, the encrypted results are re-encrypted from under the data owner's public key to under the user's public key.
Specifically, a first half of the re-encryption key is computed (block 94) using the first random element $z$ selected from the set $\mathbb{Z}_N$, a first part of the data owner's secret key $(a, b, g)$, and a first part of the user's public key $(g, g^a, g^b)$. Also, a second half of the re-encryption key is computed (block 95) using the second random element $\tilde{z}$ selected from the set $\mathbb{Z}_N$, a second part of the data owner's secret key $(\tilde{a}, \tilde{b}, \tilde{g})$, and a second part of the user's public key $(\tilde{g}, \tilde{g}^{\tilde{a}}, \tilde{g}^{\tilde{b}})$. Once calculated, the two halves of the re-encryption key are combined (block 96) and output (block 97) for re-encrypting the encrypted data results.
Re-encrypting ciphertext of the encrypted data allows a user to decrypt the data using his secret key, rather than the secret key of the data owner, and provides an additional level of security.
Once the user receives the re-encrypted data results, the secret key of the user can be used to decrypt the re-encrypted data. Additionally, the data owner can decrypt the encrypted data, if necessary, using the data owner's secret key.
If the ciphertext is fresh, the oracle generated during set-up is accessed (block 114) and a first part of the fresh ciphertext is processed (block 115) via the oracle and a Decode algorithm. Specifically, the first part of the fresh ciphertext C1=(W, X, Y) is used to compute the following:
where a and b are obtained from the secret key.
Once the values of W, X, and Y, which are provided above with respect to
The oracle is queried via the encode algorithm and
is input into the oracle, which outputs (block 116) the first message blocks $M_i[1]$ associated with the first part of the ciphertext using the decode algorithm below:
$$\mathrm{Decode}\left(0,\ 2^{\omega}-1,\ p_1,\ \{h_i\}_{i=0}^{k-1},\ \mathit{inp}\right),$$
where $p_1$ is a private parameter and $h_i$ includes public parameters, both of which are discussed above with respect to
Simultaneously with or asynchronously from processing of the first part of the ciphertext, processing (block 117) of the second part of the ciphertext $C_2 = (\tilde{W}, \tilde{X}, \tilde{Y})$ can occur via the Decode algorithm, without the oracle. Specifically, the following is computed using the second part of the ciphertext:
Once the values of $\tilde{W}$, $\tilde{X}$, and $\tilde{Y}$, which are provided above with respect to
Subsequently, the Decode algorithm is run as provided below:
$$\mathrm{Decode}\left(0,\ 2^{\omega}-1,\ p_2,\ \{f_i\}_{i=0}^{k-1}\right),\ \mathrm{Encode}\left(\{f_i\}_{i=0}^{k-1},\ (m_0[2], \ldots, m_{k-1}[2])\right)$$
Results of the Decode algorithm include plaintext of the second blocks (block 118) of the message segments, $(m_0[2], \ldots, m_{k-1}[2])$. Finally, the plaintext of the first and second blocks of the message segments are combined (block 119) as the decrypted message, which is output (block 120) to the user for processing and analysis, such as for conducting market research or identifying behavior trends and patterns.
If the ciphertext is determined to be a sum of freshly generated ciphertext, such as the re-encrypted ciphertext, decryption can be performed as provided above, except that the minimum and maximum parameters in the decode algorithm are computed based on the expected range of plaintext messages in the original fresh ciphertexts and a total number of such fresh ciphertexts added.
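The effect of the range parameters can be sketched with a bounded search. This is a simplified illustration, assuming a single base (one message block, $k = 1$), a toy prime-order group, and an exponent-based encoding in which adding plaintexts corresponds to multiplying encoded values; the function names and the brute-force search strategy are assumptions, not the Decode algorithm of this application.

```python
# Toy parameters (assumed): G generates a subgroup of prime order Q modulo P.
P, Q, G = 1019, 509, 4

def decode_range(omega, num_added=1):
    """Min and max plaintext for a sum of num_added messages of omega bits each."""
    return 0, num_added * (2 ** omega - 1)

def bounded_dlog(base, target, lo, hi):
    """Find m in [lo, hi] with base**m == target (mod P) by linear search."""
    acc = pow(base, lo, P)
    for m in range(lo, hi + 1):
        if acc == target:
            return m
        acc = (acc * base) % P
    raise ValueError("no exponent in the given range matches the target")

h = pow(G, 3, P)                       # stand-in for a public parameter h_0
omega = 4                              # each fresh plaintext fits in 4 bits
plaintexts = [9, 15, 12]
summed = 1
for m in plaintexts:                   # multiplying encodings adds the exponents
    summed = (summed * pow(h, m, P)) % P

lo, hi = decode_range(omega, num_added=len(plaintexts))
total = bounded_dlog(h, summed, lo, hi)
```

With three 4-bit plaintexts the search range widens from 0..15 to 0..45; searching the narrower single-message range would fail to find the sum of 36.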
While the invention has been particularly shown and described as referenced to the embodiments thereof, those skilled in the art will understand that the foregoing and other changes in form and detail may be made therein without departing from the spirit and scope of the invention.
Number | Date | Country | |
---|---|---|---|
20170366519 A1 | Dec 2017 | US |