This invention relates to a technology for searching encrypted graph structure data as it is in its encrypted state without decrypting the encrypted graph structure data.
In recent years, big data businesses for collecting, storing, and analyzing a large volume of data to extract valuable knowledge therefrom have become widespread. In particular, for example, in order to analyze content on social networks, a large volume of graph structure data is stored in a database for the analysis.
However, a related-art relational database requires a large amount of calculation resources to handle graph structure data, for example, when extracting all nodes adjacent to a specific node, or when searching for paths between a specific node and another node, which involves a large amount of join processing. Therefore, a graph database dedicated to such processing has attracted attention.
Meanwhile, large-capacity storage devices, high-speed CPUs, and a system for distributively controlling the storage devices and the CPUs are required for handling a large volume of data, and hence the use of external resources including a cloud is under consideration. However, a security problem occurs when a user using a cloud service stores data in an external organization. Therefore, a secret information processing technology for outsourcing data after subjecting the data to encryption processing and executing retrieval or other such processing on the data as it is in its encrypted state has attracted attention (see, for example, JP 2012-123614 A, ADVANCED ENCRYPTION STANDARD (AES), [online], Nov. 26, 2001, Federal Information Processing Standards Publication 197, [retrieved on Oct. 2, 2014], Internet <http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf>, and Secure Hash Standard (SHS), [online], March 2012, FEDERAL INFORMATION PROCESSING STANDARDS PUBLICATION 180-4, [retrieved on Oct. 2, 2014], Internet <http://csrc.nist.gov/publications/fips/fips180-4/fips-180-4.pdf>).
In order to solve the above-mentioned security problem that occurs in the outsourcing of data, for example, in JP 2012-123614 A, there is proposed an encryption retrieval processing system for executing retrieval as follows. That is, data is encrypted and then outsourced to a cloud or the like. Subsequently, a computer being used by a user transmits an encrypted retrieval query to the cloud. On the cloud side, the data is searched as it is in its encrypted state.
Only the encrypted data is transmitted to the cloud side without showing its plaintext data, which can solve the above-mentioned problem in terms of privacy. In the invention of JP 2012-123614 A, there is described a method of encrypting respective cells of table-type data and further searching the respective cells as they are in the encrypted state of the data.
When data to be encrypted is data including a graph structure, in order to execute the retrieval involving traverse processing, which is specific to the graph structure, for searching paths between a specific node (start point) and another node (end point) as described above, an encryption query for retrieving an intermediate node existing on the path is required in addition to an encryption query for retrieving the start point node and the end point node. Each time the intermediate node is identified during the traverse processing, the cloud needs to request the user to generate and transmit the encryption queries of those nodes, which raises a problem in that a large amount of communications and calculation resources are required.
Therefore, this invention has an object to reduce a load required for retrieval processing for data including an encrypted graph structure.
A representative aspect of the present disclosure is as follows. A method of retrieving an encrypted graph, which allows a first computer comprising a processor and a memory to generate an encrypted graph by encrypting a graph including a start point, an edge, and an end point, and allows a second computer comprising a processor and a memory to retrieve the encrypted graph, the method comprising: a first step of generating, by the first computer, a secret key through use of a secret key generating function of a searchable encryption algorithm set in advance; a second step of generating, by the first computer, the encrypted graph by encrypting the graph through use of an encryption function of the searchable encryption algorithm; a third step of generating, by the first computer, an encryption query by encrypting a query for retrieving the graph through use of a searchable encryption query function of the searchable encryption algorithm; a fourth step of transmitting, by the first computer, encrypted graph data obtained by associating the encrypted graph with the encryption query for each edge of the graph to the second computer; a fifth step of transmitting, by the first computer, a searchable encryption matching function of the searchable encryption algorithm to the second computer; a sixth step of receiving and storing, by the second computer, the encrypted graph data and the searchable encryption matching function; a seventh step of generating, by the first computer, an encrypted graph retrieval query by encrypting the start point and the end point of the graph to be retrieved through the use of the searchable encryption query function, and transmitting the encrypted graph retrieval query to the second computer; an eighth step of receiving, by the second computer, the encrypted graph retrieval query, and executing retrieval processing through use of the searchable encryption matching function with the encrypted graph retrieval query and the encrypted graph data being used as input; and a ninth step of transmitting, by the second computer, a result of the retrieval processing to the first computer.
According to this invention, it is possible to execute the retrieval processing specific to the graph data while ensuring security of the graph data entrusted to a computer system.
Now, a first embodiment of this invention is described in detail with reference to the accompanying drawings. The first embodiment describes an example of retrieval involving traverse processing of a graph, conducted on graph data encrypted through use of a searchable cipher. As an example of the traverse processing, a user terminal queries a database server as to whether or not the graph data contains a path having a length equal to or smaller than n from a specific vertex node set as a start point to another vertex node set as an end point. It is assumed that the maximum value (or threshold value) n of the length of the path to be retrieved is set in advance.
The terms used in the first embodiment are defined as follows.
1. Traverse processing for graph data. The traverse processing for graph data represents processing for retrieving a specific vertex node from data including a graph structure, further retrieving a node adjacent to the vertex node, and further retrieving a node adjacent to the above-mentioned adjacent node, that is, processing for repeatedly retrieving a node adjacent to a specific node and another node further adjacent to the above-mentioned adjacent node.
2. Common key searchable encryption algorithm. The common key searchable encryption algorithm (hereinafter referred to as “searchable encryption algorithm”) collectively represents not only a common key encrypting (searchable encrypting) function of conducting normal probabilistic encryption and decryption but also encryption capable of conducting matching determination (hereinafter referred to as “matching processing”) on a plaintext in its encrypted state, without decrypting it. Only an entity holding the secret key (in the first embodiment, the user terminal) can execute the encryption, the decryption, and the generation of an encryption retrieval query to be used for retrieval.
The matching processing between a ciphertext and an encryption query can be conducted by an entity holding the search key (in the first embodiment, the database server). More specifically, the searchable encryption algorithm includes the following four function groups [searchable encryption secret key generating function, searchable encryption encrypting function, searchable encryption query function, searchable encryption matching function].
(1) The searchable encryption secret key generating function (secret key generating function) represents a secret key generation algorithm defined by the searchable encryption algorithm, which is hereinafter referred to as “secret key generation processing”. A security parameter and a key seed are set as function input, and a binary string including a specific bit length and corresponding to the secret key set as function input in the following items (2) and (3) and the search key set as function input in the following item (4) is set as output.
(2) The searchable encryption encrypting function (encryption function) represents a probabilistic encryption algorithm defined by the searchable encryption algorithm. The ciphertext is output with the plaintext and the secret key being used as the function input.
(3) The searchable encryption query function represents a probabilistic query generation algorithm defined by the searchable encryption algorithm. The encryption query is output with a plaintext query and the secret key being used as the function input.
(4) The searchable encryption matching function represents a matching algorithm between the ciphertext and the encryption query, which is defined by the searchable encryption algorithm. With a ciphertext argument, an encryption query argument, and the search key being used as the function input, [plaintext matched] is output as a result when the plaintext corresponding to the ciphertext and the plaintext relating to the encryption query match each other, and otherwise, [plaintext mismatched] is output as a result. In the first embodiment, the encrypted graph and the encryption query are input to the searchable encryption matching function as the ciphertexts, and the encrypted graph from which [plaintext matched] has been output becomes a retrieval result of the encryption query.
The first embodiment is described below by fixing one searchable encryption algorithm, that is, by fixing the searchable encryption secret key generating function, the searchable encryption encrypting function, the searchable encryption query function, and the searchable encryption matching function. As a specific searchable encryption algorithm, a publicly-known or well-known technology such as the one disclosed in JP 2012-123614 A may be employed.
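The interface formed by the four functions of the items (1) to (4) can be pictured with the following toy Python sketch. It is not the algorithm of JP 2012-123614 A and is not secure. The HMAC-based construction, the names `keygen`, `encrypt`, `make_query`, and `match`, and the 16-byte nonce are assumptions made only for illustration; the security parameter input and the decryption are omitted for brevity.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16  # assumed nonce size for this toy construction


def _prf(key: bytes, data: bytes) -> bytes:
    """Keyed pseudo-random function (HMAC-SHA256), used as a building block."""
    return hmac.new(key, data, hashlib.sha256).digest()


def keygen(key_seed: bytes) -> tuple[bytes, bytes]:
    """(1) Derive the secret key and the search key from a key seed."""
    secret_key = _prf(key_seed, b"secret-key")
    search_key = _prf(secret_key, b"search-key")
    return secret_key, search_key


def _token(plaintext: str, secret_key: bytes) -> bytes:
    """Deterministic token that only a holder of the secret key can compute."""
    return _prf(secret_key, plaintext.encode("utf-8"))


def encrypt(plaintext: str, secret_key: bytes) -> bytes:
    """(2) Probabilistic: a fresh nonce makes every ciphertext differ."""
    search_key = _prf(secret_key, b"search-key")
    nonce = os.urandom(NONCE_LEN)
    return nonce + _prf(search_key, nonce + _token(plaintext, secret_key))


def make_query(plaintext: str, secret_key: bytes) -> bytes:
    """(3) Probabilistic: the token is masked with a nonce-derived pad."""
    search_key = _prf(secret_key, b"search-key")
    nonce = os.urandom(NONCE_LEN)
    pad = _prf(search_key, nonce)
    masked = bytes(a ^ b for a, b in zip(_token(plaintext, secret_key), pad))
    return nonce + masked


def match(ciphertext: bytes, enc_query: bytes, search_key: bytes) -> bool:
    """(4) True for [plaintext matched], False for [plaintext mismatched]."""
    q_nonce, masked = enc_query[:NONCE_LEN], enc_query[NONCE_LEN:]
    token = bytes(a ^ b for a, b in zip(masked, _prf(search_key, q_nonce)))
    c_nonce, tag = ciphertext[:NONCE_LEN], ciphertext[NONCE_LEN:]
    return hmac.compare_digest(tag, _prf(search_key, c_nonce + token))
```

In this sketch the entity holding only the search key can run `match` but cannot produce a ciphertext or a query for a new plaintext, which mirrors the division of roles between the user terminal and the database server.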
Now, with reference to
The auxiliary storage device 102 is configured to store program codes. The program codes are loaded into the memory 103 and executed by the CPU 101.
The memory 103 is configured to store a secret key generation module 110 configured to generate a secret key 150 and a search key 160, an encrypted graph data generation module 120 configured to generate encrypted graph data 400 by encrypting plaintext (non-ciphertext) graph data 140, and an encrypted graph retrieval query generation module 130 configured to generate an encrypted graph retrieval query 500 for searching the encrypted graph data 400.
The respective functional modules of the secret key generation module 110, the encrypted graph data generation module 120, and the encrypted graph retrieval query generation module 130 are loaded into the memory 103 as the program codes.
The CPU 101 is configured to perform processing based on a program of each of the functional modules, to thereby operate as the functional module configured to provide a predetermined function. For example, the CPU 101 is configured to perform processing based on a secret key generation program, to thereby function as the secret key generation module 110. The same applies to the other program codes. The CPU 101 further operates as the functional module configured to provide the function of each of a plurality of pieces of processing executed by each program. The computer and a computer system represent a device and a system, respectively, including those functional modules.
Information including programs and tables for implementing the respective functional modules of the user terminal 100 can be stored in: a storage device, for example, the auxiliary storage device 102, a nonvolatile semiconductor memory, a hard disk drive, or a solid state drive (SSD); or a computer-readable non-transitory data storage medium, for example, an IC card, an SD card, or a DVD.
The secret key generation module 110 is configured to generate the secret key 150 and the search key 160 based on the searchable encryption secret key generating function described above in the item (1).
The encrypted graph data generation module 120 is configured to generate the encrypted graph data 400 as the ciphertext from the secret key 150 and the plaintext graph data 140 through use of the searchable encryption encrypting function described above in the item (2).
The encrypted graph retrieval query generation module 130 is configured to generate an encryption query from the secret key 150 and a plaintext graph through use of the searchable encryption query function described above in the item (3), and to set the encryption query in the encrypted graph data 400 as described later. The encrypted graph retrieval query generation module 130 is further configured to transmit in advance the searchable encryption matching function described above in the item (4) and the search key 160 to the database server 200.
The auxiliary storage device 202 is configured to store programs. The programs are loaded into the memory 203 and executed by the CPU 201.
The memory 203 is configured to store a database management system (hereinafter referred to as “DBMS”) 300 including an encrypted graph retrieval module 310, the encrypted graph data 400 distributed from the user terminal 100, the search key 160, and a searchable encryption matching function 170.
The respective functional modules of the DBMS 300 are loaded into the memory 203 as programs. In the same manner as the user terminal 100, the CPU 201 is configured to perform processing based on the program of each of the functional modules, to thereby operate as the functional module configured to provide a predetermined function.
As described later, the DBMS 300 is configured to receive an encrypted graph retrieval query from the user terminal 100, cause the encrypted graph retrieval module 310 to search the encrypted graph data 400, and return a retrieval result to the user terminal 100.
Now, with reference to
The plaintext graph data 140 of the first embodiment is formed of directed graph data. The plaintext graph data 140 is expressed by a table formed of fields of an edge number 141, a start point 142, an edge 143, and an end point 144 as shown in
In
For example, an edge a has a vertex node A as the start point and a vertex node B as the end point. As shown in
As shown in
In addition, the encrypted graph data 400 has the encrypted start point 402, the encrypted edge 403, the encrypted end point 404 that are converted into the encryption query by the searchable encryption query function as an encryption query start point 451, an encryption query edge 452, and an encryption query end point 453. The encrypted graph data 400 is handled as a table-type data format including the above-mentioned seven fields in one row. The encrypted graph data 400 is information obtained by combining, for each edge, the encrypted graph data and the encryption query for searching the encrypted graph data without decrypting the encrypted graph data. For example, the first row [1,A,a,B] of the plaintext graph data of
In
For example, the encrypted start point 402 of the edge number 401 of 1 is the start point of the encrypted graph obtained by generating the start point A of the plaintext graph data 140 as the ciphertext by the encrypted graph data generation module 120 through the use of the searchable encryption encrypting function.
Further, the encryption query start point 451 of the edge number 401 of 1 is the encryption query obtained by converting the start point A of the plaintext graph data 140 by the encrypted graph retrieval query generation module 130 through the use of the searchable encryption query function.
Then, the encrypted graph data 400 is generated so that the encrypted graph (from 402 to 404) obtained by encrypting the plaintext graph data 140 and the encryption query (from 451 to 453) for retrieving the encrypted graph are set as data in one row for each edge.
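The seven-field row layout can be sketched in Python as follows. This is a minimal illustration, not the embodiment's implementation: the class and function names are hypothetical, the edge number is kept as a plain integer (an assumption), and `placeholder_enc` and `placeholder_query` are readable stand-ins for the searchable encryption encrypting function and the searchable encryption query function, which in practice would take the secret key 150 and output opaque data.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class EncryptedRow:
    """One row of the encrypted graph data 400 (one row per edge)."""
    edge_number: int    # field 401 (kept in the clear here: an assumption)
    enc_start: bytes    # field 402, ciphertext of the start point
    enc_edge: bytes     # field 403, ciphertext of the edge
    enc_end: bytes      # field 404, ciphertext of the end point
    query_start: bytes  # field 451, encryption query for the start point
    query_edge: bytes   # field 452, encryption query for the edge
    query_end: bytes    # field 453, encryption query for the end point


def build_encrypted_graph(
    plain_rows: Iterable[tuple[int, str, str, str]],
    enc: Callable[[str], bytes],
    qry: Callable[[str], bytes],
) -> list[EncryptedRow]:
    """Combine, for each edge, the encrypted graph and its encryption query."""
    return [
        EncryptedRow(num, enc(s), enc(e), enc(t), qry(s), qry(e), qry(t))
        for num, s, e, t in plain_rows
    ]


def placeholder_enc(value: str) -> bytes:
    """Stand-in for the encrypting function; NOT encryption."""
    return b"Enc(" + value.encode("utf-8") + b")"


def placeholder_query(value: str) -> bytes:
    """Stand-in for the query function; NOT encryption."""
    return b"Query(" + value.encode("utf-8") + b")"


# The first row [1,A,a,B] of the plaintext graph data 140.
rows = build_encrypted_graph([(1, "A", "a", "B")], placeholder_enc, placeholder_query)
```

Keeping the encryption query of the end point in the same row as the ciphertexts is what later lets the database server follow an edge to its successor without asking the user terminal for a new query.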
The encrypted graph retrieval query 500 of
The encrypted graph retrieval query generation module 130 generates, based on a query received by the user terminal 100, the encrypted graph retrieval query 500 by inputting the data Query(A) obtained by converting the plaintext A into the encryption query by the searchable encryption query function to the start point query field and the data Query(D) obtained by converting a plaintext D into the encryption query by the searchable encryption query function to the end point query field.
Now, with reference to
First, the secret key generation module 110 of the user terminal 100 uses the preset searchable encryption secret key generating function to generate the secret key 150 to be used as the input for the searchable encryption encrypting function and the searchable encryption query function and the search key 160 to be used as the input for the searchable encryption matching function (S100). In the secret key generation module 110, the searchable encryption secret key generating function, the searchable encryption encrypting function, the searchable encryption query function, and the searchable encryption matching function are set as the preset searchable encryption algorithm.
Subsequently, the encrypted graph data generation module 120 of the user terminal 100 generates the encrypted graph, formed of the encrypted start point 402 through the encrypted end point 404, by encrypting the plaintext graph data 140 held by the user terminal 100 based on the data format of the encrypted graph data 400 shown in
The encrypted graph data generation module 120 further generates the encryption query, formed of the encryption query start point 451 through the encryption query end point 453, by encrypting the query for searching the plaintext graph data 140 through use of the above-mentioned searchable encryption query function and the above-mentioned secret key 150 generated in Step S100.
The encrypted graph data generation module 120 generates the encrypted graph data 400 by combining the encrypted graph and the encryption query for each of the edges of the plaintext graph data 140 (S200).
Subsequently, the user terminal 100 transmits the encrypted graph data 400 to the database server 200. Subsequently, the user terminal 100 transmits the above-mentioned search key 160 generated in Step S100 and the searchable encryption matching function 170 to the database server 200, and brings the prestorage processing to an end.
First, the user terminal 100 receives a retrieval request from the user. The encrypted graph retrieval query generation module 130 generates the encrypted graph retrieval query 500 based on the received retrieval request and the data format shown in
For example, in order to determine whether or not the path including the vertex node A set as the start point and the vertex node D set as the end point exists in the graph data 140, the encrypted graph retrieval query generation module 130 generates the encrypted graph retrieval query 500 from the two pieces of data in which the data Query(A) obtained by converting the plaintext A into the encryption query by the searchable encryption query function is input to the start point query field and the data Query(D) obtained by converting the plaintext D into the encryption query by the searchable encryption query function is input to the end point query field as shown in
Subsequently, the user terminal 100 transmits the above-mentioned encrypted graph retrieval query 500 generated in Step S300 to the database server 200. Subsequently, the DBMS 300 of the database server 200 inputs the received encrypted graph retrieval query 500 to the encrypted graph retrieval module 310, and executes the encrypted graph retrieval processing on the encrypted graph data 400 to generate the retrieval result. The encrypted graph retrieval module 310 generates the retrieval result as an encryption retrieval processing result 600. Then, the DBMS 300 returns the encryption retrieval processing result 600 to the user terminal 100 (S400).
By the above-mentioned processing, the DBMS 300 of the database server 200 searches the encrypted graph data 400 by the encrypted graph retrieval query 500 received from the user terminal 100. The encrypted graph retrieval module 310 of the DBMS 300 generates the encryption retrieval processing result 600 from the encrypted graph data 400 that matches a search condition, without decrypting the encrypted graph data 400, and returns the encryption retrieval processing result 600 to the user terminal 100.
Through execution of the processing of
First, in
Subsequently, the encrypted graph retrieval module 310 of the DBMS 300 sets the start point query 501 of the encrypted graph retrieval query 500 as the encryption query argument of the searchable encryption matching function 170. Further, the encrypted graph retrieval module 310 sets each piece of the encrypted data in the column of the encrypted start point 402 within the encrypted graph data 400 as the ciphertext argument of the searchable encryption matching function 170. Then, the encrypted graph retrieval module 310 inputs the encryption query argument, the ciphertext argument, and the search key 160 to the searchable encryption matching function 170, and executes the matching processing.
Then, the encrypted graph retrieval module 310 stores the edge number of each of the rows hit in the matching in the internal variable memory area [pool: S1] (S402).
Subsequently, the encrypted graph retrieval module 310 determines whether or not the edge number data is stored in the internal variable memory area [pool: S1] (S403). In other words, the encrypted graph retrieval module 310 determines whether or not the matched start point exists in the encrypted graph data 400.
When the edge number is not stored in the internal variable memory area [pool: S1], the procedure advances to Step S404.
In Step S404, the encrypted graph retrieval module 310 generates [retrieval result: R] as a retrieval processing result by setting the internal variable memory area [retrieval result: R] to No, outputs the retrieval processing result, and brings the encrypted graph retrieval processing (S400) to an end.
Meanwhile, when the edge number is stored in the internal variable memory area [pool: S1], the procedure advances to Step S405.
In Step S405, the encrypted graph retrieval module 310 sets encrypted data in the t-th row of the column of the encrypted end point 404 within the encrypted graph data 400 as the ciphertext argument for each [edge number: t] stored in the internal variable memory areas [pool: S1] to [pool: Sn]. The encrypted graph retrieval module 310 sets the end point query 502 of the encrypted graph retrieval query 500 as the encryption query argument. Then, the encrypted graph retrieval module 310 inputs each of the ciphertext argument, the encryption query argument, and the search key 160 to the searchable encryption matching function 170, and executes the matching processing as to whether or not the encrypted end point 404 of each [edge number: t] matches the end point query 502 of the encrypted graph retrieval query 500.
Subsequently, when there is encrypted data hit in the above-mentioned matching processing of Step S405, the procedure advances to Step S407. In Step S407, the encrypted graph retrieval module 310 outputs [retrieval result: R] by setting the internal variable memory area [retrieval result: R] to Yes, and brings the encrypted graph retrieval processing (S400) to an end.
Meanwhile, when there is no encrypted data hit in the above-mentioned matching processing of Step S405, the procedure advances to Step S408 of
First, the encrypted graph retrieval module 310 increments the internal variable [path length: i] by 1 (S408).
Subsequently, the encrypted graph retrieval module 310 sets the encryption query end point 453 in the t-th row within the encrypted graph data 400 as the encryption query argument of the searchable encryption matching function 170 for each [edge number: t] stored in the internal variable memory area [pool: Si]. The encrypted graph retrieval module 310 further sets each piece of encrypted data in the column of the encrypted start point 402 within the encrypted graph data 400 as the ciphertext argument of the searchable encryption matching function 170. Then, the encrypted graph retrieval module 310 inputs the ciphertext argument, the encryption query argument, and the search key 160 to the searchable encryption matching function 170, and executes the matching processing. The encrypted graph retrieval module 310 stores the edge number of each row hit in the matching in the internal variable memory area [pool: S(i+1)] (S409).
Subsequently, the encrypted graph retrieval module 310 determines whether or not the edge number is stored in the internal variable memory area [pool: S(i+1)] (S410). When the edge number is stored in the internal variable memory area [pool: S(i+1)], the encrypted graph retrieval module 310 determines that the encrypted edge 403 including the encrypted end point 404 of [edge number: t] set as the start point exists, and the procedure advances to Step S412.
Meanwhile, when the edge number is not stored in the internal variable memory area [pool: S(i+1)], the encrypted graph retrieval module 310 determines that the encrypted edge 403 including the encrypted end point 404 of [edge number: t] set as the start point does not exist, and the procedure advances to Step S411. In Step S411, the encrypted graph retrieval module 310 outputs [retrieval result: R] by setting the internal variable memory area [retrieval result: R] to No, and brings the encrypted graph retrieval processing (S400) to an end.
In Step S412, the encrypted graph retrieval module 310 sets encrypted data in the t-th row of the column of the encrypted end point 404 within the encrypted graph data 400 as the ciphertext argument for each [edge number: t] stored in the internal variable memory area [pool: S(i+1)]. The encrypted graph retrieval module 310 further sets the end point query 502 of the encrypted graph retrieval query 500 as the encryption query argument. Then, the encrypted graph retrieval module 310 inputs each of the ciphertext argument, the encryption query argument, and the search key 160 to the searchable encryption matching function 170, and executes the matching processing as to whether or not the encrypted end point 404 of each [edge number: t] matches the end point query 502 of the encrypted graph retrieval query 500.
Subsequently, the encrypted graph retrieval module 310 determines whether or not there is encrypted data hit in the matching processing of Step S412 (S413). When there is encrypted data hit in the matching processing, the procedure advances to Step S414, and when there is no hit encrypted data, the procedure advances to Step S415.
In Step S414, the encrypted graph retrieval module 310 outputs [retrieval result: R] by setting the internal variable memory area [retrieval result: R] to Yes, and brings the encrypted graph retrieval processing (S400) to an end.
Meanwhile, when there is no hit data, in Step S415, the encrypted graph retrieval module 310 conducts a comparison between the internal variable [path length: i] and the maximum value n of the path length. When i is equal to n, the procedure advances to Step S416, and when i is smaller than n, the procedure returns to Step S408. In Step S408, [path length: i] is incremented, and then the above-mentioned processing is repeatedly conducted.
In Step S416, the encrypted graph retrieval module 310 outputs [retrieval result: R] by setting the internal variable memory area [retrieval result: R] to No, and brings the encrypted graph retrieval processing (S400) to an end.
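The retrieval loop described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the embodiment's implementation: `enc`, `qry`, and `match` are a toy HMAC-based stand-in for the searchable encryption functions (with a deterministic query for brevity), the `Row` type keeps only the fields the loop needs, the graph apart from the edge from A to B is a hypothetical example, and the counter `length` tracks the length of the paths being checked rather than reproducing the flowchart's [path length: i] bookkeeping exactly.

```python
import hashlib
import hmac
import os
from typing import NamedTuple


def _prf(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


def qry(value: str, secret_key: bytes) -> bytes:
    """Toy encryption query (deterministic here, unlike the real function)."""
    return _prf(secret_key, value.encode("utf-8"))


def enc(value: str, secret_key: bytes, search_key: bytes) -> bytes:
    """Toy probabilistic ciphertext: nonce plus a tag over the query token."""
    nonce = os.urandom(16)
    return nonce + _prf(search_key, nonce + qry(value, secret_key))


def match(ciphertext: bytes, enc_query: bytes, search_key: bytes) -> bool:
    """Toy matching function; needs only the search key."""
    expected = _prf(search_key, ciphertext[:16] + enc_query)
    return hmac.compare_digest(ciphertext[16:], expected)


class Row(NamedTuple):
    edge_number: int
    enc_start: bytes  # encrypted start point 402
    enc_end: bytes    # encrypted end point 404
    query_end: bytes  # encryption query end point 453


def path_exists(rows, start_query, end_query, search_key, n) -> bool:
    """Return True if a path of length <= n links the two queried nodes."""
    # Cf. S402: edges whose encrypted start point matches the start point query.
    pool = [r for r in rows if match(r.enc_start, start_query, search_key)]
    length = 1
    while pool and length <= n:
        # Cf. S405/S412: does any pooled edge end at the queried end point?
        if any(match(r.enc_end, end_query, search_key) for r in pool):
            return True
        # Cf. S409: follow each pooled edge using its stored end point query.
        reached = {}
        for r in pool:
            for nxt in rows:
                if match(nxt.enc_start, r.query_end, search_key):
                    reached[nxt.edge_number] = nxt
        pool = list(reached.values())
        length += 1
    return False  # empty pool, or no hit within length n


# Hypothetical graph: A->B, A->C, B->D, C->D.
secret_key, search_key = os.urandom(32), os.urandom(32)
plain = [(1, "A", "B"), (2, "A", "C"), (3, "B", "D"), (4, "C", "D")]
rows = [
    Row(t, enc(s, secret_key, search_key), enc(e, secret_key, search_key),
        qry(e, secret_key))
    for t, s, e in plain
]
found = path_exists(rows, qry("A", secret_key), qry("D", secret_key), search_key, 2)
```

The server-side function never receives `secret_key`; every hop is resolved from the end point queries already stored in the rows.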
By the above-mentioned processing, the database server 200 can reduce a load required for the retrieval processing for data including an encrypted graph structure. In addition, the database server 200 can determine whether or not the encrypted graph data 400 contains a path having a length equal to or smaller than n from the vertex node corresponding to the start point query 501 to the vertex node corresponding to the end point query 502.
Now, the reduction of the load required for the retrieval processing is described below.
A computer (not shown) of the user encrypts each cell of the table of
The user transmits an encryption query Query1 for a searchable cipher of the vertex node A and an encryption query Query2 for a searchable cipher of the vertex node D to the cloud, which causes the cloud to execute the matching processing of Query1 and Query2 against the encrypted data of each cell.
As a result of the matching processing, such a result as shown in
As a result, the cloud learns that the vertex of Query1 and the vertex of Query2 are not adjacent nodes. In order to search for a path connecting Query1 and Query2, the cloud temporarily returns, for example, the encrypted end point node “dcd1ce1” in the fourth column of the first row to the computer of the user. The computer of the user decrypts the encrypted data “dcd1ce1”, obtains the vertex node B, and then generates an encryption query Query3 for a searchable cipher of the vertex node B. Then, the computer of the user returns Query3 to the cloud, and the cloud executes the matching processing between Query3 and the encrypted data of each cell, and obtains Query1→Query3→Query2 as one of the shortest paths.
In addition, in order to search for another one of the shortest paths, the cloud returns, for example, a piece of encrypted data “d26e5cd” in the fourth column of the second row to the computer of the user, obtains an encryption query Query4 for a vertex node C, and then conducts the same processing, to thereby obtain Query1→Query4→Query2 as another one of the shortest paths.
In a case where each cell is thus encrypted through the use of the searchable cipher of a related-art technology, there exists a problem in that, when the graph structure is traversed (in short, retrieval processing for repeatedly tracing the adjacent node, for example, retrieving a given vertex node, further retrieving a node adjacent to the given vertex node, and further retrieving another node adjacent thereto), a large amount of processing for generating, transmitting, and receiving data required for the traverse processing occurs between a cloud user and the cloud, which may cause processing delay.
In contrast, in the first embodiment, as illustrated in
With this configuration, it is possible to suppress an increase in amounts of transmission and reception of the encrypted data and the encryption query, and particularly in the traverse processing, it is possible to suppress an increase in amount of the processing for generating, transmitting, and receiving data required for the processing, which can reduce the load required for the retrieval processing for data including the encrypted graph structure.
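The difference in communication can be made concrete with a rough cost model in Python. The model is an assumption made for illustration: it counts one additional user-generated encryption query for each newly found intermediate node, as in the Query3 and Query4 example above, and it operates on a hypothetical plaintext edge list purely to count queries.

```python
def related_art_query_count(edges, start, end, n):
    """Count the encryption queries the user generates in the related art.

    Assumed model: two queries for the start and end points, plus one
    query per intermediate node discovered while traversing up to n hops.
    """
    queries = 2  # Query1 (start point) and Query2 (end point)
    frontier, seen = {start}, {start}
    for _ in range(n):
        reached = {dst for src, dst in edges if src in frontier}
        if end in reached:
            return queries
        fresh = reached - seen      # nodes needing a new user-side query
        queries += len(fresh)
        seen |= fresh
        frontier = fresh
    return queries


# In the first embodiment only the start and end point queries are sent.
EMBODIMENT_QUERY_COUNT = 2

# Hypothetical graph matching the walk-through: A->B, A->C, B->D, C->D.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
related = related_art_query_count(edges, "A", "D", 2)
```

Under this model the related-art count grows with the number of intermediate nodes visited, while the embodiment's count stays fixed regardless of the path length.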
This invention is not limited to the first embodiment described above, and various changes can be made within the scope of the gist of the invention.
For example, in the first embodiment, the maximum value n of the path length is assumed to be set in advance, but does not necessarily need to be set in advance. When the user terminal 100 transmits the encrypted graph retrieval query 500 to the database server 200, the maximum value n of the path length may be designated by adding the column of the maximum value of the path length to the data format columns of the encrypted graph retrieval query 500 and setting n as the value of its field. Further, when it is clear in advance that there is no closed path in the directed graph, it is guaranteed that the encrypted graph retrieval processing (S400) is to be terminated without the designation of the maximum value n of the path length, and hence the maximum value n of the path length does not need to be designated.
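The role of the maximum value n can be sketched on a plaintext graph. The code below is an illustration only, not the encrypted graph retrieval processing (S400) itself: `path_exists` is a hypothetical helper, the edge list is assumed from the start points and end points used in this description, and the traversal is an ordinary depth-first search. With `max_len=None` the search has no bound, which is safe only when the directed graph has no closed path; with a closed path present, the bound is what guarantees termination.

```python
from typing import Optional

EDGES = [("A", "B"), ("A", "C"), ("C", "B"), ("B", "D"), ("C", "D")]


def path_exists(edges, start, end, max_len: Optional[int] = None) -> bool:
    """Depth-first search for a path from start to end using at most max_len edges.

    max_len=None means no bound; use it only for graphs without closed paths.
    """
    def visit(vertex, used):
        if vertex == end:
            return True
        if max_len is not None and used >= max_len:
            return False  # the bound n has been reached on this branch
        return any(visit(nxt, used + 1) for src, nxt in edges if src == vertex)

    return visit(start, 0)


# A graph with a closed path (D -> A): the bound keeps the search finite.
CYCLIC = EDGES + [("D", "A")]
found_in_cyclic = path_exists(CYCLIC, "A", "E", max_len=4)
```

Calling `path_exists(CYCLIC, "A", "E")` without a bound would recurse indefinitely, which corresponds to the case in which the maximum value n must be designated.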
Further, in the first embodiment, the encrypted graph retrieval processing (S400) is executed to solve a determination problem as to whether or not a path having a length equal to or smaller than n, with a specific vertex node being set as the start point and another vertex node being set as the end point, exists in the graph data. However, the encrypted graph retrieval processing (S400) does not necessarily need to be processing for solving the above-mentioned determination problem. For example, in order to list the edges including a specific vertex as the start point from such an encrypted graph as described in JP 2012-123614 A, the column of the encrypted start point of the encrypted graph may be searched by the encryption query of the encrypted specific vertex, and data in each row of the encrypted graph hit in the matching processing may be set as the encryption retrieval processing result 600.
Similarly, a specific edge may be converted into the encryption query, the column of the encrypted edge of the encrypted graph may be searched by the encryption query of the encrypted specific edge, and data in each row of the encrypted graph hit in the matching processing may be set as the encryption retrieval processing result 600.
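The two column searches described above (by start point and by edge) can be sketched with one generic helper. This is a minimal sketch under assumptions: `query`, `encrypt`, and `match` are hypothetical toy functions built from HMAC and SHA-256 rather than the searchable encryption method of JP 2012-123614 A, and the edge labels a to e for the edge numbers 1 to 5 are assumed for illustration.

```python
import hashlib
import hmac
import os

KEY = b"user-secret-key"  # hypothetical secret key held by the user terminal


def query(value: str) -> bytes:
    """Toy Query(value): deterministic search token."""
    return hmac.new(KEY, value.encode(), hashlib.sha256).digest()


def encrypt(value: str) -> tuple:
    """Toy Enc(value): randomized cell that reveals nothing without a query."""
    nonce = os.urandom(16)
    return nonce, hashlib.sha256(query(value) + nonce).digest()


def match(cell: tuple, q: bytes) -> bool:
    """Toy matching function: true when the cell encrypts the queried value."""
    nonce, tag = cell
    return hmac.compare_digest(hashlib.sha256(q + nonce).digest(), tag)


# Rows: (edge number, start point, edge, end point); labels are assumed.
PLAIN = [(1, "A", "a", "B"), (2, "A", "b", "C"), (3, "C", "c", "B"),
         (4, "B", "d", "D"), (5, "C", "e", "D")]
START, EDGE, END = 1, 2, 3  # column indices within a row

ENCRYPTED = [(n, encrypt(s), encrypt(e), encrypt(t)) for n, s, e, t in PLAIN]


def search_column(table, column: int, q: bytes) -> list:
    """Return every row whose cell in the given column matches the query."""
    return [row for row in table if match(row[column], q)]


rows_from_a = search_column(ENCRYPTED, START, query("A"))  # edges starting at A
rows_edge_d = search_column(ENCRYPTED, EDGE, query("d"))   # the row of edge d
```

The rows returned by `search_column` play the role of the encryption retrieval processing result 600; they remain encrypted and would be decrypted only by a holder of the secret key.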
Further, as another example of the encrypted graph retrieval processing (S400), when a shortest path exists in the graph data including a specific vertex node set as the start point and another vertex node set as the end point, the shortest path may be set as the encryption retrieval processing result 600 in such a manner as illustrated in
As shown in
Further, in order to narrow down the shortest paths to designated one from the encryption retrieval processing result 600 for the retrieval of the shortest path, a set of the encryption queries of edges of the shortest path to be narrowed down to may be added to the encrypted graph retrieval query 500. For example, when the shortest paths are narrowed down to a path formed of the edge a and an edge d, the encrypted graph retrieval query 500 may be set by adding {Query(a),Query(d)}, which is the set of the edges of the shortest path to be narrowed down to, to the encrypted graph retrieval query 500 of
Further, in a case of employing an existing (publicly-known or well-known) algorithm for searching vertices and paths while traversing the graph, when other encrypted vertices adjacent to a specific encrypted vertex are listed by generating the encrypted graph data in the data format shown in
Further, in the first embodiment, the encrypted graph retrieval module 310 cannot cause the searchable encryption matching function 170 to function without the search key 160. Therefore, the encrypted graph data 400 can be searched only on the computer to which the search key 160 has been distributed. This can ensure security when the retrieval processing is conducted by outsourcing the encrypted graph data 400.
Next, a second embodiment of this invention is described with reference to the accompanying drawings.
As illustrated in
The first embodiment and the second embodiment differ from each other in that the functions and processing of the user terminal 100 of the first embodiment are divided into those of the data provider terminal 1400 and those of the data user terminal 1500 illustrated in
The data user terminal 1500 is configured to issue a plaintext graph retrieval request, and to acquire the encrypted graph retrieval query 500 from the data provider terminal 1400 to access the database server 200.
A hardware configuration of the data provider terminal 1400, the data formats of the plaintext and the encrypted graph data 400, the data format of the encrypted graph retrieval query 500, and the processing of the encrypted graph retrieval are the same as those of the user terminal 100 of the first embodiment. The same components as those of the first embodiment are denoted by like reference symbols. The database server 200 is the same as that of the first embodiment.
The data user terminal 1500 has the same hardware configuration as that of the user terminal 100 of the first embodiment, and has a software configuration including a plaintext graph retrieval query generation module 1510 configured to generate a plaintext graph retrieval query 420 from the plaintext graph retrieval request.
First, in the same manner as the user terminal 100 of the first embodiment, the secret key generation module 110 of the data provider terminal 1400 uses the searchable encryption secret key generating function to generate the secret key 150 to be used as the input for the searchable encryption encrypting function and the searchable encryption query function and the search key 160 to be used as the input for the searchable encryption matching function 170 (S100).
Subsequently, the data provider terminal 1400 generates the encrypted graph data 400 including the encrypted graph and the encryption query by encrypting the plaintext graph data 140 held by the data provider terminal 1400 based on the data format shown in
Subsequently, the data provider terminal 1400 transmits the encrypted graph data 400 to the database server 200.
Subsequently, the data provider terminal 1400 transmits the above-mentioned search key 160 generated in Step S100 to the database server 200. The data provider terminal 1400 further transmits the searchable encryption matching function 170 to the database server 200, and brings the prestorage processing to an end.
First, the data user terminal 1500 causes the plaintext graph retrieval query generation module 1510 to generate the plaintext graph retrieval query 420, and transmits the plaintext graph retrieval query 420 to the data provider terminal 1400 (S500). The plaintext graph retrieval query 420 has the same data format as the data format of the encrypted graph retrieval query shown in
For example, the plaintext graph retrieval query 420 issues a query as to whether or not the path including the vertex node A set as the start point and the vertex node D set as the end point exists in the graph data. In the case of the above-mentioned query, the plaintext graph retrieval query generation module 1510 generates, as the plaintext graph retrieval query 420, two pieces of data including the plaintext A input to the field of the start point query 501 and the plaintext D input to the field of the end point query 502 as shown in
Subsequently, the data provider terminal 1400 receives the plaintext graph retrieval query 420 from the data user terminal 1500. The encrypted graph retrieval query generation module 130 of the data provider terminal 1400 generates the encrypted graph retrieval query 500 based on the data format shown in
Subsequently, when the data user terminal 1500 receives the encrypted graph retrieval query 500 generated in Step S300, the data user terminal 1500 transmits the encrypted graph retrieval query 500 as it is to the database server 200.
Subsequently, in the same manner as in the first embodiment, the DBMS 300 of the database server 200 inputs the received encrypted graph retrieval query 500 to the encrypted graph retrieval module 310, and executes the encrypted graph retrieval processing on the encrypted graph data 400. The encrypted graph retrieval module 310 generates the retrieval result as the encryption retrieval processing result 600. Then, the DBMS 300 returns the encryption retrieval processing result 600 to the data user terminal 1500 (S400).
Further, when the determination problem as to whether or not the path including the vertex node A set as the start point and the vertex node D set as the end point exists in the graph data is queried as the encrypted graph retrieval query 500, the encryption retrieval processing result 600 can output a binary result of [Yes] or [No].
When the encrypted graph retrieval query 500 is used to search a set of vertex nodes adjacent to a specific vertex node, the ciphertexts of the vertex nodes adjacent to the specific vertex node are output from the database server 200 as the encryption retrieval processing result 600.
In this case, the data user terminal 1500 transmits the received encryption retrieval processing result 600 to the data provider terminal 1400 holding the secret key 150, and requests the decryption processing. The data provider terminal 1400 may then return the set of the plaintexts of the adjacent vertex nodes to the data user terminal 1500.
The data provider terminal 1400 may instead transmit the secret key 150 to the data user terminal 1500 to allow the data user terminal 1500 to generate the encrypted graph retrieval query and conduct the decryption processing for the encryption retrieval processing result 600.
As described above in the second embodiment, even in the case of employing the data provider terminal 1400 and the data user terminal 1500 that are separately provided, it is possible to suppress the increase in amounts of the transmission and reception of the encrypted data and the encryption query, and particularly in the traverse processing, it is possible to suppress the increase in amount of the processing for generating, transmitting, and receiving data required for the processing, which can reduce the load required for the retrieval processing for data including the encrypted graph structure.
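The division of roles in the second embodiment can be sketched as three cooperating objects. The code is a simplified illustration, not the claimed implementation: the class names, the toy HMAC-based functions, and the row layout (an encrypted start point paired with the encryption query of the end point) are assumptions made for this sketch. It shows that the secret key stays at the data provider terminal, that the data user terminal only relays the encrypted query, and that the database server can trace adjacent nodes without further exchanges.

```python
import hashlib
import hmac
import os

EDGES = [("A", "B"), ("A", "C"), ("C", "B"), ("B", "D"), ("C", "D")]


class DataProviderTerminal:
    """Toy stand-in for the terminal that holds the plaintext graph and the secret key."""

    def __init__(self, edges):
        self._key = os.urandom(32)  # never leaves this object
        self._edges = edges

    def _query(self, v: str) -> bytes:
        return hmac.new(self._key, v.encode(), hashlib.sha256).digest()

    def _encrypt(self, v: str) -> tuple:
        nonce = os.urandom(16)
        return nonce, hashlib.sha256(self._query(v) + nonce).digest()

    def encrypted_graph(self) -> list:
        # Each row: (Enc(start), Query(end)); the stored query lets the
        # server continue a traversal without asking the terminals again.
        return [(self._encrypt(s), self._query(e)) for s, e in self._edges]

    def encrypt_query(self, plaintext_query: dict) -> dict:
        # Corresponds to generating the encrypted query from the plaintext one.
        return {name: self._query(v) for name, v in plaintext_query.items()}


class DatabaseServer:
    """Toy stand-in for the server; it holds no secret key."""

    def __init__(self, rows):
        self._rows = rows

    @staticmethod
    def _match(cell: tuple, q: bytes) -> bool:
        nonce, tag = cell
        return hmac.compare_digest(hashlib.sha256(q + nonce).digest(), tag)

    def path_exists(self, enc_query: dict, max_len: int) -> str:
        frontier = [enc_query["start"]]
        for _ in range(max_len):
            frontier = [q_end for q in frontier
                        for cell, q_end in self._rows if self._match(cell, q)]
            if enc_query["end"] in frontier:
                return "Yes"
        return "No"


class DataUserTerminal:
    """Toy stand-in for the terminal that issues plaintext retrieval requests."""

    def __init__(self, provider, server):
        self._provider = provider
        self._server = server

    def ask(self, start: str, end: str, max_len: int = 3) -> str:
        plaintext_query = {"start": start, "end": end}
        enc_query = self._provider.encrypt_query(plaintext_query)
        return self._server.path_exists(enc_query, max_len)  # forwarded as it is


provider = DataProviderTerminal(EDGES)
server = DatabaseServer(provider.encrypted_graph())
user = DataUserTerminal(provider, server)
```

With these objects, `user.ask("A", "D")` involves one exchange with the provider and one with the server regardless of how many intermediate vertices the server traces.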
In Step S300 of
Next, a third embodiment of this invention is described with reference to the accompanying drawings.
The third embodiment is different from the first embodiment and the second embodiment in that a key management server 1600 configured to conduct processing using the secret key 150, for example, encryption, decryption, and query generation, is provided in the third embodiment.
The database server 200 is the same as that of the first embodiment. The data user terminal 1500 is the same as that of the second embodiment.
Now, the third embodiment is described with reference to
The data provider terminal 1400A includes only the plaintext graph data 140 within the configuration of the second embodiment. Meanwhile, the key management server 1600 is obtained by excluding the plaintext graph data 140 from the user terminal 100 of the first embodiment. In other words, the key management server 1600 includes the secret key generation module 110, the encrypted graph data generation module 120, the encrypted graph retrieval query generation module 130, the secret key 150, the search key 160, and the encrypted graph retrieval query 500. In the following description, the same components as those of the first embodiment or the second embodiment are denoted by like reference symbols.
First, in the same manner as the user terminal 100 of the first embodiment, the secret key generation module 110 of the key management server 1600 uses the searchable encryption secret key generating function to generate the secret key 150 to be used as the input for the searchable encryption encrypting function and the searchable encryption query function and the search key 160 to be used as the input for the searchable encryption matching function 170 (S100).
Subsequently, the data provider terminal 1400 transmits the plaintext graph data 140 held by the data provider terminal 1400, which is illustrated in
Subsequently, the key management server 1600 transmits the encrypted graph data 400 to the database server 200. Subsequently, the key management server 1600 transmits the above-mentioned search key 160 generated in Step S100 to the database server 200. The key management server 1600 further transmits the searchable encryption matching function 170 to the database server 200, and brings the prestorage processing to an end.
First, in the same manner as in the second embodiment, the data user terminal 1500 causes the plaintext graph retrieval query generation module 1510 to generate the plaintext graph retrieval query 420, and transmits the plaintext graph retrieval query 420 to the key management server 1600 (S500).
Subsequently, the encrypted graph retrieval query generation module 130 of the key management server 1600 generates the encrypted graph retrieval query 500 based on the data format shown in
Subsequently, in the same manner as in the first embodiment, the DBMS 300 of the database server 200 inputs the received encrypted graph retrieval query 500 to the encrypted graph retrieval module 310, and executes the encrypted graph retrieval processing on the encrypted graph data 400. The encrypted graph retrieval module 310 generates the retrieval result as the encryption retrieval processing result 600, and returns the encryption retrieval processing result 600 to the key management server 1600 (S400).
At this time, when the determination problem as to whether or not the path including the vertex node A set as the start point and the vertex node D set as the end point exists in the graph data is queried as the encrypted graph retrieval query 500, the encryption retrieval processing result 600 can output the binary result of [Yes] or [No].
When the encrypted graph retrieval query 500 is used to search the set of vertex nodes adjacent to a specific vertex node, the ciphertexts of the vertex nodes adjacent to the specific vertex node are output from the database server 200 as the encryption retrieval processing result 600.
When the encryption retrieval processing result 600 thus includes the ciphertext, the key management server 1600 uses the secret key 150 to conduct the decryption processing for the ciphertext (S600). Subsequently, the key management server 1600 transmits a decrypted retrieval processing result 430 to the data user terminal 1500, and brings the retrieval processing to an end.
As described above in the third embodiment, even in the case of employing the data provider terminal 1400, the data user terminal 1500, and the key management server 1600 that are separately provided, it is possible to suppress the increase in amounts of the transmission and reception of the encrypted data and the encryption query, and particularly in the traverse processing, it is possible to suppress the increase in amount of the processing for generating, transmitting, and receiving data required for the processing, which can reduce the load required for the retrieval processing for data including the encrypted graph structure.
Next, a fourth embodiment of this invention is described with reference to the accompanying drawings.
In the first embodiment, the encrypted graph data 400 is encrypted in the data format shown in
As a result, before the user terminal 100 transmits the encrypted graph retrieval query 500, the database server 200 can acquire the graph structure of the encrypted graph data 400 indicating that the end point node of the edge number of 1 matches the start point node of the edge number of 4, that is, information indicating which node is the start point, which node is the end point, and which nodes are connected to each other. The same applies to the second and third embodiments.
In view of the foregoing, in the fourth embodiment, only after the user terminal 100 transmits the encrypted graph retrieval query 500, the database server 200 is allowed to calculate the graph structure, that is, which node is the start point, which node is the end point, and which nodes are connected to each other. In other words, the database server 200 cannot obtain a solution even by executing the encryption query until receiving the encrypted graph retrieval query 500.
Therefore, in the fourth embodiment, the data format of the encrypted graph data 400 of the first embodiment is changed to encrypted graph data 400B based on JP 2012-123614 A described above. The other configuration is the same as that of the first embodiment.
First, the secret key generation module 110 of the user terminal 100 is provided with a common key encryption method C(-,-) and a hash function H(-). In this case, a ciphertext obtained by encrypting a message m by a secret key sk through use of the common key encryption method C(-,-) is expressed by C(m,sk), and a hash value of the hash function H(-) of the message m is expressed by H(m). Further, ADVANCED ENCRYPTION STANDARD (AES), [online], Nov. 26, 2001, Federal Information Processing Standards Publication 197, [retrieved on Oct. 2, 2014], Internet <http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf> and Secure Hash Standard (SHS), [online], March 2012, FEDERAL INFORMATION PROCESSING STANDARDS PUBLICATION 180-4, [retrieved on Oct. 2, 2014], Internet <http://csrc.nist.gov/publications/fips/fips180-4/fips-180-4.pdf> described above may be used for the common key encryption method and the hash function, respectively.
For example, the value of the encryption query end point 453 of the edge number of 1 is Query(B) in
In this case, a secret key e1 for the encryption query end point 453 of C(Query(B),e1) of the edge number of 1 is a hash value of the above-mentioned hash function H of a homomorphic function value FR (708 in JP 2012-123614 A) to be used for generating the ciphertext of the encryption start point of Enc(A) of the edge number of 1 through the use of the searchable encryption encrypting function of a searchable encryption method disclosed in JP 2012-123614 A (referred to as “processing for creating secret registered data 712” in JP 2012-123614 A).
That is, the secret key e1=H(FR) is obtained. In the following, the homomorphic function value FR (708 in JP 2012-123614 A) to be used for generating the ciphertext Enc(X) through the use of the searchable encryption encrypting function of the searchable encryption method disclosed in JP 2012-123614 A is expressed by FR-Enc(X).
In the same manner, when the respective cells of the column of the encryption query end point 453 are encrypted by the common key encryption method C(-,-), in regard to the edge number 401 of 2 and the subsequent edge numbers 401, the values of secret keys e2, e3, e4, and e5 are set as hash values of the hash function H of the homomorphic function values FR-Enc(A), FR-Enc(C), FR-Enc(B), and FR-Enc(C) to be used for encrypting start points Enc(A), Enc(C), Enc(B), and Enc(C) of their corresponding edge numbers through the use of the searchable encryption encrypting function of the searchable encryption method disclosed in JP 2012-123614 A.
That is, e1=H(FR-Enc(A)), e2=H(FR-Enc(A)), e3=H(FR-Enc(C)), e4=H(FR-Enc(B)), and e5=H(FR-Enc(C)) are set.
In the above-mentioned manner, the respective cells of the column of the encryption query end point 453 are encrypted with the secret keys e1, e2, e3, e4, and e5, respectively. Therefore, when the above-mentioned prestorage processing illustrated in
Meanwhile, as disclosed in JP 2012-123614 A by Expression (8) used in the processing of Step S1311, the homomorphic function value FR-Enc(A) is acquired from the user terminal 100 by the database server 200 in the processing for conducting the matching processing between Enc(A) and Query(A) through the use of the searchable encryption matching function 170.
Therefore, the database server 200 receives Query(A) within the encrypted graph retrieval query 500 from the user terminal 100, acquires the homomorphic function value FR-Enc(A), and calculates a hash value H(FR-Enc(A))=e1 through use of the hash function H to obtain the key e1. With this configuration, the database server 200 can calculate the secret key e1 through use of the homomorphic function value FR-Enc(A) acquired from the user terminal 100, to thereby decrypt C(Query(B),e1) within the column of the encryption query end point 453 of the edge number of 1 and obtain Query(B). In other words, the user terminal 100 further encrypts the encryption query through use of the second key e1, and in the case of executing the retrieval, transmits the homomorphic function value for extracting the encrypted encryption query as a query execution key.
Then, the database server 200 calculates the second key e1 by the hash function H from the homomorphic function value being the query execution key, and decrypts the encrypted encryption query to obtain the encryption query. It suffices that the hash function H is transmitted to the database server 200 along with the search key 160 or the like in advance by the user terminal 100.
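The key derivation e = H(FR-Enc(X)) and the unwrapping of C(Query(Y),e) can be sketched as follows. This is only a sketch under loudly stated assumptions: `fr_value` is a hypothetical stand-in for the homomorphic function value FR of JP 2012-123614 A, `C` is a toy XOR cipher used in place of AES because it is both the encrypting and the decrypting operation, `H` is SHA-256, and each toy cell uses its own nonce, so the FR value differs per row. What the sketch preserves is the ordering: the server can compute the key only after a matching query has arrived.

```python
import hashlib
import hmac
import os

KEY = b"user-secret-key"  # hypothetical secret key held by the user terminal


def H(data: bytes) -> bytes:
    """Hash function H; SHA-256 is used here."""
    return hashlib.sha256(data).digest()


def query(v: str) -> bytes:
    """Toy Query(v): deterministic 32-byte search token."""
    return hmac.new(KEY, v.encode(), hashlib.sha256).digest()


def fr_value(q: bytes, nonce: bytes) -> bytes:
    """Toy stand-in for FR: computable from a cell's nonce only with a matching query."""
    return H(b"FR" + q + nonce)


def encrypt(v: str) -> tuple:
    """Toy Enc(v): (nonce, tag), where the tag hides the FR value."""
    nonce = os.urandom(16)
    return nonce, H(b"tag" + fr_value(query(v), nonce))


def match(cell: tuple, q: bytes):
    """Toy matching: return the FR value on a hit, None otherwise."""
    nonce, tag = cell
    fr = fr_value(q, nonce)
    return fr if hmac.compare_digest(H(b"tag" + fr), tag) else None


def C(message: bytes, sk: bytes) -> bytes:
    """Toy XOR cipher standing in for AES; handles messages of up to 32 bytes."""
    pad = H(b"pad" + sk)
    return bytes(a ^ b for a, b in zip(message, pad))


EDGES = [("A", "B"), ("A", "C"), ("C", "B"), ("B", "D"), ("C", "D")]


def build_table() -> list:
    """User terminal side: rows of (Enc(start), C(Query(end), e))."""
    rows = []
    for s, e in EDGES:
        nonce = os.urandom(16)
        cell = (nonce, H(b"tag" + fr_value(query(s), nonce)))
        key_e = H(fr_value(query(s), nonce))  # e = H(FR-Enc(start))
        rows.append((cell, C(query(e), key_e)))
    return rows


TABLE = build_table()


def server_next_queries(q: bytes) -> list:
    """Server side: unwrap the end-point queries of every row that q matches."""
    unwrapped = []
    for cell, wrapped in TABLE:
        fr = match(cell, q)
        if fr is not None:
            unwrapped.append(C(wrapped, H(fr)))  # decrypt with e = H(FR)
    return unwrapped
```

Before any query is received, the second column of `TABLE` holds only wrapped values, so the server cannot follow an edge; once Query(A) arrives, `server_next_queries(query("A"))` yields Query(B) and Query(C), and the traversal can continue from there.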
After that, the above-mentioned traverse processing illustrated in
Further, in the above-mentioned example, the hash value H(FR-Enc(X)) of the hash function H of the homomorphic function value FR-Enc(X) used for generating the ciphertext of the encrypted start point 402 of the same edge number is set as the secret key used for encrypting each cell of the column of the encryption query end point 453 by the common key encryption method C(-,-). The fourth embodiment is not limited thereto, and the hash value H(FR-Enc(X)) of the hash function H of the homomorphic function value FR-Enc(X) used for generating the ciphertext of the encrypted end point 404 of the same edge number may be set as the secret key used for encrypting each cell of the column of the encryption query start point 451 by the common key encryption method C(-,-).
That is, s1=H(FR-Enc(B)), s2=H(FR-Enc(C)), s3=H(FR-Enc(B)), s4=H(FR-Enc(D)), and s5=H(FR-Enc(D)) are set as secret keys s1, s2, s3, s4, and s5 used in respective ciphertexts C(Query(A),s1), C(Query(A),s2), C(Query(C),s3), C(Query(B),s4), and C(Query(C),s5) in the column of the encryption query start point 451 of
As described above, according to the fourth embodiment, it is possible not only to reduce the load required for the retrieval processing for data including the encrypted graph structure but also to improve confidentiality of the encrypted graph data 400B by encrypting the encryption query to a further extent. Therefore, when data is outsourced, it is possible to reduce the load required for the retrieval processing while ensuring security.
In the above-mentioned example of the fourth embodiment, the encryption query is decrypted by acquiring the homomorphic function value FR-Enc(X) from the user terminal 100 as the key for decrypting the encrypted encryption query, but the secret key itself may instead be acquired from the user terminal 100.
This invention is not limited to the embodiments described above, and encompasses various modification examples. For instance, the embodiments are described in detail for easier understanding of this invention, and this invention is not limited to modes that have all of the described components. Some components of one embodiment can be replaced with components of another embodiment, and components of one embodiment may be added to components of another embodiment. In each embodiment, other components may be added to, deleted from, or replace some components of the embodiment, and the addition, deletion, and the replacement may be applied alone or in combination.
Some or all of the components, functions, processing units, and processing means described above may be implemented by hardware by, for example, designing the components, the functions, and the like as an integrated circuit. The components, functions, and the like described above may also be implemented by software by a processor interpreting and executing programs that implement their respective functions. Programs, tables, files, and other types of information for implementing the functions can be put in a memory, in a storage apparatus such as a hard disk or a solid state drive (SSD), or on a recording medium such as an IC card, an SD card, or a DVD.
The control lines and information lines described are lines that are deemed necessary for the description of this invention, and not all of the control lines and information lines of a product are mentioned. In actuality, it can be considered that almost all components are coupled to one another.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2014/079615 | 11/7/2014 | WO | 00 |