The present disclosure relates to the technical fields of discrete logarithm problem, multiplication encryption, encrypted database and searchable encryption, in particular to a database encryption method supporting composable SQL query.
A discrete logarithm problem refers to the problem of finding logarithms in a computable finite cyclic group. This problem was published by W. Diffie and E. Hellman in 1976. Up to now, the discrete logarithm problem is still considered as a hard problem to solve, and thus is widely used in encryption.
Multiplication encryption is widely used in cryptography, including well-known encryption solutions such as ElGmal and RSA. Encryption can achieve “perfect security” and ensure the multiplication homomorphism of the ciphertext, that is, the multiplication of the ciphertext is also the multiplication of the plaintext. Therefore, multiplication encryption achieves good security and operability. In 2014, a solution to encrypt a database by using the multiplication encryption solution was proposed in the SIGMOD conference, but this solution is not secure and quite different from the present disclosure.
Encryption database technology was first proposed in 2011. The first encrypted database model is CryptDB, which skillfully combines various technologies to realize various query operations of an encrypted database. However, since 2015, there have been a series of work attacks on CryptDB. These works prove that the security of CryptDB is extremely poor. Although CryptDB has landed, it should not be used for commercial activities. In 2014, a solution for encrypting a database was proposed in SIGMOD. This solution is called SDB, which puts forward a novel encryption solution, which realizes most of various query operations. However, this solution is still proved to be unsafe and difficult to apply to commercial activities.
The searchable encryption technology has developed for decades, and so far, there are many technologies used for searchable encryption, including DET (deterministic encryption) and OPE (order-preserving encryption). However, so far, these technologies have been proved to be either unsafe or too complicated to be used in real scenes. In addition, although composable SQL query is the mainstream demand of the encrypted database, no database supports safe composable SQL query at present.
With the rapid development of Internet, users (including individuals and companies) are increasingly demanding to store large-scale data (over GB). At present, it has become a mainstream trend for users to store data in the service providers that provide storage services, that is, the cloud in a broad sense. However, there are two problems when data is stored in service providers: if data is stored without encryption in service providers, the privacy of the users will be exposed to the service providers; if the data is encrypted and stored in the service provider, the user cannot operate the encrypted data safely and efficiently, especially the composable SQL query.
Therefore, how to support the operations of users' private data, especially the composable SQL query including updating, adding, searching for a specified keyword and querying a specified range, is still a problem to be solved.
In view of the shortcomings of the prior art, the purpose of the present disclosure is to provide a database encryption method supporting composable SQL query.
The purpose of the present disclosure is realized by the following technical solution: a database encryption method supporting composable SQL query, includes the following steps:
(1) Encrypting stored data: a user encrypts and preprocesses his own stored data, and uploads an encryption result and preprocessed data to a service provider, who provides software, apparatuses, electronic devices or storage media for running a database to store the data uploaded by the user; the step (1) specifically comprises the following sub-steps:
(1.1) The user generating random number sets as a row key and a column key of the data for his own data; the user encrypting the data with the row key and the column key based on multiplication encryption, and outputting an encrypted database.
(1.2) The user uploading the encrypted database to the service provider, and the user selecting any encryption scheme to encrypt and upload the row key generated in step (1.1) to the service provider to realize encrypted storage of the row key, and meanwhile selectively storing the column keys locally without encryption as needed or encrypting and uploading the column keys to the service provider.
(1.3) The user preprocessing an instruction to generate auxiliary data needed to run the instruction; the user uploading the auxiliary data to the service provider, wherein the service provider cannot obtain any privacy information about the database from the auxiliary data; the instruction comprises operations of updating, inserting, deleting, adding, searching for a certain specified keyword and querying a specified range.
(1.4) The service provider selecting a storage form according to an actual situation, wherein the storage form comprises database software, apparatuses, electronic devices or storage media, and storing the encrypted data and the auxiliary data based on the storage form, and carrying out execution of subsequent instructions.
(2) Setting a composable SQL query instruction, and the user uploading the composable SQL query instruction to the service provider according to actual demands, and uploading auxiliary parameters used for the query instruction.
(3) Running the composable SQL query instruction set in step (2): the service provider runs the query instruction according to the received composable SQL query instruction and the auxiliary parameters from the user, saving a calculation result after the instruction is run, updating the data and returning a query result to the user.
Further, the step (1.1) is specifically as follows:
The user randomly generates two large prime numbers p and q to obtain a large integer N=p·q; the user takes the large integer N as an order of an generator g and finds out the generator g on a finite field ZN
c
(i,j)=(N+1)t
Further, the step (1.3) is specifically as follows:
For the instructions of deleting and adding, the user does not need to generate auxiliary data D.
For the instructions of updating and inserting, once a user updates or inserts a row of data, it is necessary to generate a row key in advance and encrypt and store the row key in the service provider; during updating and inserting, the user downloads the row key generated in advance from the service provider, and obtains the column key from the service provider or locally, so as to decrypt and update the updated data, re-encrypt and upload the data to the service provider or directly encrypt the inserted data and upload the data to the service provider.
For the instruction of searching for a certain specified keyword, the user is allowed to select whether each row is the same keyword between two columns or whether all rows in a certain column are a certain specified keyword; if the user queries whether the same row is the same keyword between two columns, the user needs to generate a column of encrypted random numbers αi(i=1, 2, . . . , n) for each query, and the corresponding column key is (mα, xα), and the encryption scheme is E(αi)=(mα·gr
For the instruction of querying the specified range, the user selects two columns to compare with each other or one column to compare with the same constant to query the specified range; if the user compares the two columns, it needs to generate a column of encrypted random numbers αi(i=1, 2, . . . , n) for each comparison operation, the corresponding column key is (mα, xα), and the encryption scheme is E(αi)=(mα·gr
if the user compares a column with the same constant, the user needs to generate an additional encrypted column of γ in addition to the auxiliary data needed for the comparison between the two columns, that is, the encrypted element in each row is γ, the corresponding column key is (mγ, xγ), and the element in the ith row is encrypted as E(γ)i=(mγ·gr
Further, the step (2) specifically comprises the following sub-steps:
(2.1) The users uploading operation instructions to the service provider according to actual business needs; wherein the operation instructions comprise updating, adding, searching for a certain specified keyword and querying a specified range.
(2.2) The user calculating the auxiliary parameters according to the uploaded instructions, and uploading the auxiliary parameters to the service provider along with the instructions or after receiving a request of the service provider:
The step (2.2) is specifically as follows: the user does not need to calculate any auxiliary parameters for the operations of updating, inserting and deleting; for the operation of adding, if a column A and a column B are added, and the corresponding column keys are (mA, xA) and (mB, mB), the column key of a newly generated column C=A+B is calculated and stored by the user; the corresponding calculation mode is (mC=mA·mB mod N2, xC=xA+xB mod N); the user chooses to store the key locally without encryption or encrypt and upload the key to the service provider for storage; for the instruction of searching for a certain specified keyword, the user selects to query whether each row is the same keyword between two columns or whether all rows of a certain column are a certain specified keyword; if the user queries whether each row is the same keyword between two columns such as the column A and the column B, and the corresponding column keys are (mA, xA) and (mB, xB), the user needs to calculate:
(I) The column key (mA′=mA·mB−1·mα mod N2, xA′=xA−XB+xα mod N) of a column A′=A−B+α.
(II) Calculating the auxiliary parameter ζ=−xS−1xA′ mod N, η=mA′msζ mod N2 to obtain the auxiliary parameter (ζ,η) uploaded by the user at last.
If the user queries whether all rows in a certain column such as the column A, are a certain specified keyword such as v, the user needs to calculate:
(I) v·γ−1, where γ is the auxiliary data in step (1.3).
(II) The column key (mγ′=mγv·γ
(III) The column key (mA′=mA·(mγ′)−1·mα mod N2, xA′=xA−xγ′+xα mod N) of a column A′=A−γ′+α.
(IV) Calculating the auxiliary parameter P: ζ=−xS−1xA′ mod N, η=MA′msζ mod N2; obtaining the auxiliary parameter (v·γ−1, ζ, η) uploaded by the user at last;
for the instruction of querying a specified range, the user selects to compare the values of two columns with each other or compare the value of a column with the same constant to query the specified range; if the user compares the values between the same rows of the two columns, for example the column A and the column B, and the column keys corresponding to the two columns are (mA, xA) and (mB, xB), the user needs to calculate:
(I) The column key (mA′=mA·mB−1·mα mod N2, xA′=xA−XB+xα mod N) of the column A′=A−B+α; (2) ζ=−XS−1XA′ mod N, η=MA′msζ mod N2; obtaining the auxiliary parameter (ζ,η) uploaded by the user at last.
If the user compares the values of all rows in a certain column, for example the column A, with the same constant, for example v, the user needs to calculate:
(I) v·γ−1, where γ is the auxiliary data in step (1.3).
(II) Calculating the column key (mγ′=mγv·γ
(III) Calculating the column key (mA′=mA·(mγ′)−1·mα mod N2, xA′=xA−xγ′+xα mod N) of the column A′=A−γ′+α.
(IV) Calculating the auxiliary parameter ζ=−xS−1xA′ mod N, η=mA′msζ mod N2; obtaining the auxiliary parameter (v·γ−1, ζ, η) uploaded by the user at last.
Further, the step (3) specifically comprises the following sub-steps:
(3.1) The service provider performing corresponding calculation and operation after receiving the composable SQL query instruction of the user, and putting forward a request for the auxiliary parameters to the user according to the requirements in the calculation or selecting required parameters from the auxiliary parameters uploaded by the user along with the instruction.
(3.2) The service provider carrying out calculation according to the instruction after receiving the auxiliary parameters, saves the calculation result after the instruction is run, updates the data and return the query result to the user.
Further, the step (3.1) is specifically as follows: for the instruction of deleting, the service provider only needs to perform a normal deletion operation and update the database; for operations of insertion and update, if the user encrypts and stores the row key and column key at the service provider, the service provider only needs to return the row key and column key requested by the user and receive the new data uploaded by the user for normal insertion and update; for the operation of addition, if the column A and the column B are added to get a column C, the service provider calculates the ith row element Ci of the column C as follows: Ci=Ai×Bi(i=1, 2, 3, . . . n), where Ai and Bi represent the ith row elements of the column A and the column B respectively; in the operation of the above instructions, the service provider does not need the user to provide any auxiliary parameters.
For the operation of searching for a certain specified keyword, the user selects to query whether each row is the same keyword between two columns or whether all rows in a certain column are a certain specified keyword; if the user queries whether each row is the same keyword between two columns, for example the column A and the column B, and the corresponding column keys are (mA, xA) and (mB, xB), the service provider performs the following calculations:
(I) Calculating a column B′ as follows: the ith row element B′i=BiN-1 mod N2 of the column B′, where Bi is the ith row element of the column B.
(II) Calculating a column A′ as follows: the ith row element A′i=Ai×B′i×E(α)i mod N2 of the column A′, where Ai is the ith row element of the column A and E(α)i is the ith row element of the column α of the auxiliary data in step (1.3).
(III) The service provider requesting the auxiliary parameter P to the user or selects the required parameter from the auxiliary parameters uploaded by the user along with the instruction.
If the user queries whether all the rows in a certain column, for example the column A, are an external keyword, for example v, the service provider directly asks the user for auxiliary parameters or selects the required parameters from the auxiliary parameters uploaded by the user along with the instruction.
For the instruction of querying a specified range, the user may select to compare the values of two columns with each other or the value of a certain column with the same constant to query the specified range; if the user compares the values of two columns, for example the column A and the column B, and the corresponding column keys are (mA, xA) and (mB, xB), the service provider carries out the following calculations:
(I) Calculating the column B′ as follows: the ith row element ith B′i=BiN-1 mod N2 of the column B′, where Bi is the ith row element of the column B.
(II) Calculating the column A′ as follows: the ith row element A′i=Ai×B′i×E(α)i mod N2 of the column A′, where Ai is the ith row element of the column A and E(α)i is the ith row element of the column α of the auxiliary data in step (1.3).
(III) The service provider puts forward a request for the auxiliary parameters to the user or selects the required parameters from the auxiliary parameters uploaded by the user along with the instruction.
If the user compares the values of all the rows in a certain column, for example the column A, with the same constant, for example v, the service provider directly puts forward a request for the auxiliary parameters to the user or selects the required parameters from the auxiliary parameters uploaded by the user along with the instruction.
Further, the step (3.2) is specifically as follows: for the operation of searching for a specified keyword, the user selects to query whether each row is the same keyword between two columns or whether all rows in a certain column are a certain specified keyword; if the user queries whether each row is the same keyword between two columns, for example the column A and the column B, and the corresponding column keys are (mA, xA) and (mB, xB), the service provider receives the auxiliary parameter (ζ,η) calculated and uploaded by the user in step (2), and carries out the following calculations:
(I) Calculation of a column a, wherein the ith row element ai of the column a is calculated as follows: ai=A′i×η×E(1)iζ mod N2, where A′i is the ith row element of the column A′ in step (3.1) and E(1)i is the ith row element of the column S of the auxiliary data in step (1.3).
(II) Calculation of a column H, wherein the ith row element Hi of the column H is calculated as follows: Hi=(hie
If the user queries whether all the rows in a certain column, for example the column A, are an external keyword, for example v, the service provider receives the auxiliary parameter (v·γ−1, ζ, η) uploaded by the user and carries out the following calculations:
(I) Calculation of the column γ′, where γ′ is a new column calculated by the service provider according to the column γ in the auxiliary data in step (1.3) and the calculated auxiliary parameter P uploaded by the user, and the ith row element E(γ′)i of the column γ′ is calculated from the ith row element E(γ)i of the column γ: E(γ′)i=E(γ)iv·γ
(II) Calculation of the column A′ as follows: the ith row element A′i=Ai×E(γ′)iN-1×E(α)i mod N2 of the column A′, where Ai is the ith row element of the column A and E(α)i is the ith row element of the column α in the auxiliary data in step (1.3).
(III) Calculation of the column a, wherein the ith row element of the column a is calculated as follows: ai=A′i×η×E(1)iζ mod N2, where A′i is the ith row element of the column A′ in step (3.1) and E(1)i is the ith row element of the column S in the auxiliary data in step (1.3).
(IV) Calculation of the column H, wherein the ith row element Hi of the column H is calculated as follows: Hi=(hie
(V) Comparison of (he
For the instruction of querying a specified range, the user selects to compare the values of two columns with each other or the value of a certain column with the same constant to query the specified range; if the user compares the values of the same rows between two columns, for example the column A and the column B, and the corresponding column keys of the two columns are (mA, xA) and (mB, xB), then the service provider receives the auxiliary parameter P: (ζ, η) uploaded by the user, and carries out the following calculations:
(I) Calculation of the column a, wherein the ith row element ai of the column a is calculated as follows: ai=A′i×η×E(1)iζ mod N2, where A′i is the ith row element of the column A′ in step (3.1) and E(1)i is the ith row element of the column S in the auxiliary data in step (1.3).
(II) Calculation of the column a′, wherein the ith row element a; of the column a′ is calculated as follows: a′i=ai×(βi·ti−1)+ui−βiαi mod N, where βiti−1 is the i1 row element of the column β in the auxiliary data in step (1.3), and ui−βiαi is the ith row element of the column u in step (1.3).
(III) Comparison of a′i with
wherein if
the element in the ith row of the column A is smaller; if not, the element in the ith row of the column B is not greater than the element in the ith row of the column A.
If the user compares the values of all rows in a certain column, for example, the column A, with the same constant, for example v, the service provider receives the auxiliary parameter (v·γ−1, ζ, η) uploaded by the user and carries out the following calculations:
(I) Calculation of the column γ′, wherein γ′ is a new column calculated by the service provider according to the column γ in the auxiliary data in step (1.3) and the calculated auxiliary parameter P uploaded by the user, and the ith row element E(γ′)i of the column γ′ is calculated from the ith row element E(γ)i of the column γ: E(γ′)i=E(γ)iv·γ
(II) Calculation of the column A′ as follows: the ith row element A′i=Ai×E(γ′)iN-1×E(α)i mod N2(i=1, 2, . . . , n) of the column A′, where Ai is the ith row element of the column A and E(α)i is the ith row element of the column α in the auxiliary data in step (1.3).
(III) Calculation of the column a, wherein the ith row element ai of the column a is calculated as follows: ai=A′i×η×E(1)iζ mod N2, where A′i is the ith row element of the column A′ in step (3.1) and E(1)i is the ith row element of the column S in the auxiliary data in step (1.3).
(IV) Calculation of the column a′, wherein the ith row element a; of the column a′ is calculated as follows: a′i=ai×(βi·ti−1)+ui−βiαi mod N, where βiti−1 is the i1 row element of the column β in the auxiliary data in step (1.3) and ui−βiαi is the ith row element of the column u in step (1.3).
(V) Comparison of a′i with
wherein if
the element in the ith row of the column A is not less than v; otherwise, the element in the ith row of the column A is less than v.
According to the above technical solution, the present disclosure has the following beneficial effects:
1. Based on the discrete logarithm problem, the present disclosure realizes the encryption protection of user data and reduces the leakage of user privacy information, thus providing a safe and complete protection solution for cloud storage of user data in actual commercial activities.
2. The present disclosure realizes the safe and efficient execution of the composable SQL query instructions on the encrypted data, including updating, inserting, deleting, adding, searching for a specified keyword and querying a specified range, and meets the requirements of users for remote operations and query of cloud storage data in actual commercial activities.
3. The present disclosure has the advantages of strong universality, safety and high efficiency, privacy protection, simple and convenient use, high efficiency, less memory and time consumption, and the like.
In order to make the purpose, technical solution and advantages of this application clearer, the technical solution of this application will be clearly and completely described below with reference to the specific embodiments of this application and the corresponding drawings. Obviously, the described embodiments are only part of, not all of the embodiments of this application. Based on the embodiments in this application, all other embodiments obtained by those skilled in the art without creative labor belong to the scope of protection in this application.
Supposing that a user A of a sales company stores his own data set M in a service provider B, and asks the service provider B not to get any information about M. In addition, the user A requires that the data set M can be operated without revealing privacy (operations include but are not limited to updating, inserting, deleting, adding, searching for specified keywords and querying a specified range, for example returning orders with a transaction volume greater than 5000). To solve this situation, the database encryption method supporting composable SQL query of the present disclosure is used to meet the requirements of the user A, which specifically includes the following steps:
(1) Encrypting stored data: the user A encrypts and preprocesses data M, and uploads the encryption result and preprocessed data to the service provider B, which provides software, apparatuses, electronic devices or storage media for running a database for storing the data uploaded by the user.
Specifically, in step (1.1), the user A generates random number sets R and C as a row key and a column key of the database M, respectively; based on multiplication encryption, the user A encrypts data with the keys R and C, and outputs the encrypted database.
Specifically, the user A randomly generates two large prime numbers p and q to obtain a large integer N=p q; the user takes the large integer N as an order of an generator g and finds out the generator g on a finite field ZN
c
(i,j)=(N+1)t
In step (1.2), the user A uploads the encrypted database M to the service provider B; in addition, the user A chooses any safe encryption scheme to encrypt a random number set R and uploads the encrypted random number set R to the service provider B; the user A can also choose to store a random number set C locally without encryption or uploads it with encryption to the service provider B as needed;
Specifically, in the whole process, assuming that the database has elements of n rows and m columns; in addition to the encrypted database ciphertext, the user A will upload n encrypted row keys to the service provider B at the same time, and store m column keys locally without encryption or upload the m column keys with encryption to the service provider B; any safe encryption scheme, for example AES encryption, may be adopted for the encryption schemes of row keys and column keys; the encrypted database and the encrypted row keys are collectively referred to as the encryption result, and the encryption result is uploaded by the user A to the service provider B for storage; the service provider B cannot obtain any sensitive information about the database M of the user A from the encryption result; if the user A needs to restore the plaintext of the database elements, it needs to download the encryption elements and the corresponding row keys from service provider B at the same time; if the column keys are encrypt and uploaded to the service provider B, the user A needs to download the column keys corresponding to the encryption elements; if that column keys are stored without encryption locally by the user A, then the user A only needs to find the corresponding column keys of the encrypt elements locally; if the user A needs to query a column in the database, the user A needs the column key of this column; if the column key is encrypted and uploaded to the service provider B, the user A needs to download the column key of this column from the service provider B; if the column key is stored without encryption locally by the user A, then the user A only needs to find the column key of this column locally.
In step (1.3), the user A preprocesses the instruction to generate auxiliary data D required for executing the instruction; the user A uploads the auxiliary data D to the service provider B, and the service provider B cannot obtain any privacy information about the database M from the auxiliary data D; the service provider B must obtain the auxiliary parameter P calculated by the user A before performing the specified operation; the instructions include, but are not limited to, updating, inserting, deleting, adding, searching for specified keywords, querying the specified range, and the like;
Specifically, for the instructions of deleting and adding, the user A does not need to generate auxiliary data D; for the instructions of updating and inserting, once the user A updates or inserts a row of data, it is necessary to generate a row key in advance and encrypt and store the row key in the service provider B; during updating and inserting, the user A downloads the row key generated in advance from the service provider B, and obtains the column key from the service provider B or locally, so as to decrypt and update the updated data, re-encrypt and upload the data to the service provider B or directly encrypt the inserted data and upload the data to the service provider B; for the instruction of searching for a certain specified keyword, the user A can select whether each row is the same keyword between two columns or whether all rows in a certain column are a certain specified keyword; if the user A queries whether the same row is the same keyword between two columns, the user A needs to generate a column of encrypted random numbers αi(i=1, 2, . . . , n) (collectively referred to as a column α) for each query, and the corresponding column key is (mα, xα), and the encryption scheme is E(αi)=(mα·gr
The above column α, column S, and column γ of the auxiliary data can be used for instructions of searching for a specified keyword or querying a specified range. All the above auxiliary data should be uploaded to the service provider B before the query instruction is uploaded, and the service provider B can complete the query instruction with the help of the auxiliary data, thus greatly improving the execution efficiency of the query instruction.
In step (1.4), the service Provider B selects database software, apparatuses, electronic devices or storage media according to the actual situation to store the encrypted M, the random number set R and the auxiliary data D, and performs subsequent instruction execution based on this mode.
In step (2), a composable SQL query instruction is set, and the user A uploads the instruction to the service provider B according to the actual demand, and calculates and uploads the auxiliary parameter P for executing the instruction.
Specifically, in step (2.1), the users A uploads operation instructions to the service provider B according to actual business needs; wherein the operation instructions include updating, adding, searching for a certain specified keyword and querying a specified range;
in step (2.2), the user A calculates the auxiliary parameter P according to the uploaded instruction Q, and uploading the auxiliary parameter P to the service provider B along with the instruction or after receiving a request of the service provider B.
Specifically, the user A does not need to calculate any auxiliary parameter P for the operations of updating, inserting and deleting; for the operation of adding, if a column A and a column B are added, and the corresponding column keys are (mA, xA) and (mB, xB), the column key of a newly generated column C=A+B is calculated and stored by the user A; the corresponding calculation mode is (mC=mA·mB mod N2, xC=xA+xB mod N); the user A chooses to store the key locally without encryption or encrypt and upload the key to the service provider b for storage; for the instruction of searching for a certain specified keyword, the user selects to query whether each row is the same keyword between two columns or whether all rows of a certain column are a certain specified keyword.
If the user queries whether each row is the same keyword between two columns, for example the column A and the column B, and the corresponding column keys are (mA, xA) and (mB, xB), the user needs to calculate:
(I) The column key (mA′=mA·mB−1·mα mod N2, xA, =XA−XB+xα mod N) of a column A′=A−B+α.
(II) Calculating the auxiliary parameter P: ζ=−xs−1xA′ mod N, η=MA′msζ mod N2 to obtain the auxiliary parameter P uploaded by the user A at last as (ζ, η).
If the user A queries whether all rows in a certain column, for example the column A, are a certain specified keyword, for example v, the user A needs to calculate:
(I) v·γ−1, where γ is the auxiliary data in step (1.3);
(II) The column key (mγ′=mγv·γ
(III) The column key (mA′=mA·(mγ′)−1·mα mod N2, xA′=xA−xγ′+xα mod N) of a column A′=A−γ′+α.
(IV) Calculating the auxiliary parameter P: ζ=−xs−1xA′ mod N, η=mA, msζ mod N2; obtaining the auxiliary parameter (v·γ−1, ζ, η) uploaded by the user at last;
for the instruction of querying a specified range, the user A may select to compare the values of two columns with each other or compare the value of a column with the same constant to query the specified range; if the user A compares the values between the same rows of the two columns, for example the column A and the column B, and the column keys corresponding to the two columns are (mA, xA) and (mB, xB), the user A needs to calculate:
(I) The column key (mA′=mA·mB−1·mα mod N2, xA′=xA−XB+xα mod N) of the column A′=A−B+α; (2) ζ=−xs−1xA′ mod N, η=MA′msζ mod N2; obtaining the auxiliary parameter (ζ,η) uploaded by the user at last.
If the user A compares the values of all rows in a certain column, for example the column A, with the same constant, for example v, the user A needs to calculate:
(I) v·γ−1, where γ is the auxiliary data in step (1.3);
(II) Calculating the column key (mγ′=mγv·γ
(III) Calculating the column key (mA′=mA·(mγ′)−1·mα mod N2, xA′=xA−xγ′+xα mod N) of the column A′=A−γ′+α.
(IV) Calculating the auxiliary parameter P: ζ=−xs−1xA′ mod N, η=mA′msζ mod N2; obtaining the auxiliary parameter (v·γ−1, ζ, η) uploaded by the user at last.
In step (3), the composable SQL query instruction set in step (2) is run: the service provider B runs the query instruction according to the received composable SQL query instruction and the auxiliary parameter P from the user, saves a calculation result after the instruction is run, updates the data and returns a query result U to the user A, which includes the following sub-steps:
(3.1) The service provider B performing corresponding calculation and operation after receiving the composable SQL query instruction of the user, and putting forward a request for the auxiliary parameter P to the user A according to the requirements in the calculation or selecting required parameters from the auxiliary parameter P uploaded by the user A along with the instruction.
Specifically, for the instruction of deleting, the service provider B only needs to perform a normal deletion operation and update the database; for operations of insertion and update, if the user A encrypts and stores the row key and column key at the service provider B, the service provider B only needs to return the row key and column key requested by the user A and receive the new data uploaded by the user A for normal insertion and update; for the operation of addition, if the column A and the column B are added to get a column C, the service provider B calculates the ith row element Ci of the column C as follows: Ci=Ai×Bi(i=1,2,3, . . . n), where Ai and Bi represent the ith row elements of the column A and the column B respectively; in the operation of the above instructions, the service provider B does not need the user A to provide any auxiliary parameters.
For the operation of searching for a certain specified keyword, the user A may select to query whether each row is the same keyword between two columns or whether all rows in a certain column are a certain specified keyword; if the user A queries whether each row is the same keyword between two columns, for example the column A and the column B, and the corresponding column keys are (mA, xA) and (mB, xB), the service provider B performs the following calculations:
(I) Calculating a column B′ as follows: the ith row element B′i=BiN-1 mod N2 of the column B′, where Bi is the ith row element of the column B.
(II) Calculating a column A′ as follows: the ith row element A′i=Ai×B′i×E(α)i mod N2 of the column A′, where Ai is the ith row element of the column A and E(α)i is the ith row element of the column α of the auxiliary data in step (1.3).
(III) The service provider B requesting the auxiliary parameter P to the user or selects the required parameter from the auxiliary parameter P uploaded by the user A along with the instruction.
If the user A queries whether all the rows in a certain column, for example the column A, are an external keyword, for example v, the service provider B directly asks the user for auxiliary parameter P or selects the required parameter from the auxiliary parameters P uploaded by the user A along with the instruction.
For the instruction of querying a specified range, the user A can select to compare the values of two columns with each other or the value of a certain column with the same constant to query the specified range; if the user A compares the values of two columns, for example the column A and the column B, and the corresponding column keys are (mA, xA) and (mB, xB), the service provider B carries out the following calculations:
(I) Calculating the column B′ as follows: the ith row element B′i=BiN-1 mod N2 of the column B′, where Bi is the ith row element of the column B.
(II) Calculating the column A′ as follows: the ith row element A′i=Ai×B′i×E(α)i mod N2 of the column A′, where Ai is the ith row element of the column A and E(α)i is the ith row element of the column α of the auxiliary data in step (1.3).
(III) The service provider B puts forward a request for the auxiliary parameters P to the user or selects the required parameter from the auxiliary parameter P uploaded by the user A along with the instruction.
If the user A compares the values of all the rows in a certain column, for example the column A, with the same constant, for example v, the service provider B directly puts forward a request for the auxiliary parameter P to the user or selects the required parameters from the auxiliary parameter P uploaded by the user A along with the instruction.
In step (3.2) the service provider B carries out calculation according to the instruction after receiving the auxiliary parameter P, saves the calculation result after the instruction is run, updates the data and return the query result U to the user A.
Specifically, for the operation of searching for a specified keyword, the user A selects to query whether each row is the same keyword between two columns or whether all rows in a certain column are a certain specified keyword; if the user queries whether each row is the same keyword between two columns, for example the column A and the column B, and the corresponding column keys are (mA, xA) and (mB, xB), the service provider B receives the auxiliary parameter P: (ζ, η) calculated and uploaded by the user A in step (2), and carries out the following calculations:
(I) Calculation of a column a, wherein the ith row element ai of the column a is calculated as follows: ai=A′i×η×E (1)iζ mod N2, where A′i is the ith row element of the column A′ in step (3.1) and E(1)i is the ith row element of the column S of the auxiliary data in step (1.3).
(II) Calculation of a column H, wherein the ith row element Hi of the column H is calculated as follows: Hi=(hie
If the user A queries whether all the rows in a certain column, for example the column A, are an external keyword, for example v, the service provider receives the auxiliary parameter P: (v·γ−1, ζ, η) uploaded by the user and carries out the following calculations:
(I) Calculation of the column γ′, where γ′ is a new column calculated by the service provider according to the column γ in the auxiliary data in step (1.3) and the calculated auxiliary parameter P uploaded by the user, and the ith row element E(γ′)i of the column γ′ is calculated from the ith row element E(γ)i of the column γ: E(γ′)i=E(γ)iv·γ
(II) Calculation of the column A′ as follows: the ith row element A′i=Ai×E(γ′)iN-1×E(α)i mod N2 of the column A′, where Ai is the ith row element of the column A and E(α)i is the ith row element of the column α in the auxiliary data in step (1.3).
(III) Calculation of the column a, wherein the ith row element of the column a is calculated as follows: ai=A′i×η×E(1)iζ mod N2, where A′i is the ith row element of the column A′ in step (3.1) and E(1)i is the ith row element of the column S in the auxiliary data in step (1.3).
(IV) Calculation of the column H, wherein the ith row element Hi of the column H is calculated as follows: Hi=(hie
(V) Comparison of (he
For the instruction of querying a specified range, the user A may select to compare the values of two columns with each other or the value of a certain column with the same constant to query the specified range; if the user compares the values of the same rows between two columns, for example the column A and the column B, and the corresponding column keys of the two columns are (mA, xA) and (mB, xB), then the service provider B receives the auxiliary parameter P: (ζ, η) uploaded by the user, and carries out the following calculations:
(I) Calculation of the column a, wherein the ith row element ai of the column a is calculated as follows: ai=A′i×η×E (1)iζ mod N2, where A′i is the ith row element of the column A′ in step (3.1) and E(1)i is the ith row element of the column S in the auxiliary data in step (1.3).
(II) Calculation of the column a′, wherein the ith row element a; of the column a′ is calculated as follows: a′i=ai×(βi·ti−1)+ui−βiαi mod N, where βiti−1 is the ith row element of the column β in the auxiliary data in step (1.3), and ui−βiαi is the ith row element of the column u in step (1.3).
(III) Comparison of a′i with
wherein if
the element in the ith row of the column A is smaller; otherwise, the element in the ith row of the column B is not greater than the element in the ith row of the column A.
If the user A compares the values of all rows in a certain column, for example, the column A, with the same constant, for example v, the service provider B receives the auxiliary parameter P: (v·γ−1, ζ, η) uploaded by the user and carries out the following calculations:
(I) Calculation of the column γ′, wherein γ′ is a new column calculated by the service provider according to the column γ in the auxiliary data in step (1.3) and the calculated auxiliary parameter P uploaded by the user, and the ith row element E(γ′)i of the column γ′ is calculated from the ith row element E(γ)i of the column γ: E(γ′)i=E(γ)iv·γ
(II) Calculation of the column A′ as follows: the ith row element A′i=Ai×E(γ′)iN-1×E(α)i mod N2(i=1, 2, . . . , n) of the column A′, where Ai is the ith row element of the column A and E(α)i is the ith row element of the column α in the auxiliary data in step (1.3).
(III) Calculation of the column a, wherein the ith row element ai of the column a is calculated as follows: ai=A′i×η×E(1)iζ mod N2, where A′i is the ith row element of the column A′ in step (3.1) and E(1)i is the ith row element of the column S in the auxiliary data in step (1.3).
(IV) Calculation of the column a′, wherein the ith row element a; of the column a′ is calculated as follows: a′i=ai×(βi·ti−1)+ui−βiαi mod N, where βiti−1 is the i1 row element of the column β in the auxiliary data in step (1.3) and ui−βiαi is the ith row element of the column u in step (1.3).
(V) Comparison of a′i with
wherein if
the element in the ith row of the column A is not less than v; otherwise, the element in the ith row of the column A is less than v.
The present disclosure discloses a system for encrypting a database supporting composable SQL query, which can perform the functions of safely storing user data and supporting updating, inserting, deleting, adding, searching for a specified keyword and querying a specified range for user data provided by any embodiment of the present disclosure. The system for encrypting a database supporting composable SQL query includes a user device module and a service provider module; the user device module encrypts and preprocesses user data, and uploads the encryption result and preprocessed data to the service provider module; the user module executes data operation instructions, and uploads operation instructions to the service provider module according to actual requirements, wherein the operation instructions include updating, inserting, deleting, adding, searching for a specified keyword and querying a specified ranges, and uploads auxiliary parameter operation instructions to the service provider module, wherein the auxiliary parameter operation instructions include updating, inserting, deleting, adding, searching for a specified keyword and querying a specified range; the service provider module runs the instruction according to the received operation instruction and the auxiliary parameter, stores the calculation result after the instruction is run, updates the data and returns the query result to the user module.
In the field of database, an enterprise runs the database test international standard TPC-C for trading, that is, a certain warehouse accepts orders from multiple users at the same time, and the warehouse has multiple transactions with these users at the same time. In order to save memory and improve performance, the warehouse encrypts transaction data and stores it in the cloud. According to the demand of real-time transactions, the warehouse continuously submits SQL instructions to the encrypted data in the cloud (instructions include updating, inserting, deleting, adding, searching for a specified keyword and querying a specified range).
According to the meaning of the instructions for actual transactions, the instructions are divided into five sets:
1. New-Order: the customer enters a new order transaction.
2. Payment: update the customer's account balance and reflect its payment status.
3. Delivery: delivery (simulated batch transaction).
4. Order-Status query: query the status of customers' recent transactions.
5. Stock-Level query: query the stock level of the warehouse so as to replenish the goods in time.
In this embodiment, a server with two 2.5 GHz Intel Xeon Gold 6248 processors and 256 GB memory is used to simulate the cloud service provider, and the method of the present disclosure is used to execute the five sets of instructions respectively, and the time spent is shown in Table 1.
As shown in Table 1 above, the present disclosure supports the execution of five TPC-C transactions. When the five TPC-C transactions are completed once, the shortest time used by the present disclosure is 101 seconds, and the longest time is only 422 seconds (Delivery transactions). As each TPC-C transaction consists of dozens of SQL query instructions, it can be seen that in actual commercial activities, even if the transmission delay is considered, the present disclosure can still be completed in about 10 seconds. In addition, with the increase of the number of executions, the time of the present disclosure is always stable, so it can maintain stable performance in actual commercial activities. Therefore, by using the method of the present disclosure, the database runs faster, consumes less time and runs stably. To sum up, based on the discrete logarithm problem, the present disclosure realizes the encryption protection of user data and reduces the leakage of user privacy information, thus providing a safe and complete protection solution for cloud storage of user data in actual commercial activities; according to the present disclosure, the composable SQL query instructions can be safely and efficiently executed on encrypted data, and the instructions include updating, inserting, deleting, adding, searching for a specified keyword and querying a specified range, so that the requirements of users for remote operation and query of cloud storage data in actual commercial activities are met; the present disclosure has the advantages of strong universality, safety and high efficiency, privacy protection, simple and convenient use, high efficiency, less memory and time consumption and the like.
The steps of the method or algorithm described combined with the embodiments of the present disclosure may be implemented in a hardware manner, or may be implemented in a manner in which a processor executes software instructions. The software instructions may consist of corresponding software modules, and the software modules can be stored in Random Access Memory (RAM), flash memory, Read Only Memory (ROM), Erasable Programmable ROM (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), registers, hard disks, removable hard disks, CD-ROMs or any other forms of storage media well-known in the art. An exemplary storage medium is coupled to the processor, such that the processor can read information from, and write information to, the storage medium. The storage medium can also be an integral part of the processor. The processor and storage medium may reside in an Application Specific Integrated Circuit (ASIC). Alternatively, the ASIC may be located in a node device, such as the processing node described above. In addition, the processor and storage medium may also exist in the node device as discrete components.
It should be noted that when the data compression apparatus provided in the foregoing embodiment performs data compression, division into the foregoing functional modules is used only as an example for description. In an actual application, the foregoing functions can be allocated to and implemented by different functional modules based on a requirement, that is, an inner structure of the apparatus is divided into different functional modules, to implement all or some of the functions described above. For details about a specific implementation process, refer to the method embodiment. Details are not described herein again.
All or some of the foregoing embodiments may be implemented by using software, hardware, firmware, or any combination thereof. When the software is used for implementation, all or some of the embodiments may be implemented in a form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a server or a terminal, all or some of the procedures or functions according to the embodiments of this application are generated. The computer instructions may be stored in a computer-readable storage medium or may be transmitted from a computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from a web site, computer, server, or data center to another web site, computer, server, or data center in a wired (for example, a coaxial optical cable, an optical fiber, or a digital subscriber line) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium accessible by a server or a terminal, or a data storage device, such as a server or a data center, integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disk (DVD)), or a semiconductor medium (for example, a solid-state drive).
What is described above is only preferred embodiments of the present disclosure, and it is not intended to limit the present disclosure. Any modifications, equivalents, improvements and the like made within the spirit and principle of the present disclosure shall be included in the scope of protection of the present disclosure.
The present application is a continuation of International Application No. PCT/CN2021/110269, filed on Aug. 3, 2021, the content of which is incorporated herein by reference in its entirety.
Number | Date | Country | |
---|---|---|---|
Parent | PCT/CN2021/110269 | Aug 2021 | US |
Child | 17888521 | US |