The present invention relates to a graph searching apparatus and a graph searching method for executing a graph search, and further relates to a computer-readable recording medium on which a program for realizing them is recorded.
A graph search is an important technique for analyzing a Social Networking Service (SNS) (for example, clustering relationships between users) and analyzing a road connecting points on an electronic map (for example, shortest route search between the points). Further, in a graph search, it is known that the structure of the graph is represented using an adjacency matrix.
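As an illustration of the adjacency-matrix representation mentioned above, the following minimal sketch builds a 0/1 adjacency matrix from an edge list. The function name `adjacency_matrix`, the zero-based vertex numbering, and the sample graph are assumptions made for this example only and do not appear in the original description.

```python
def adjacency_matrix(num_vertices, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    matrix = [[0] * num_vertices for _ in range(num_vertices)]
    for u, v in edges:
        # An undirected edge sets both (u, v) and (v, u).
        matrix[u][v] = 1
        matrix[v][u] = 1
    return matrix


# Hypothetical 4-vertex graph: a path 0-1-2 plus the isolated vertex 3.
R = adjacency_matrix(4, [(0, 1), (1, 2)])
for row in R:
    print(row)
```

The matrix has one row and one column per vertex, which is why its size grows quadratically with the number of vertices.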
As a related technique, Patent Document 1 discloses a pattern extraction apparatus capable of efficiently obtaining the number of connected components of a graph. According to the pattern extraction apparatus of Patent Document 1, the number of connected components of a graph is obtained by representing a connection relationship of a pattern by using a vertex adjacency matrix and counting the number of diagonalized blocks of the vertex adjacency matrix.
However, in a graph search, as the number of vertices increases, the number of matrix elements increases, and thus the adjacency matrix becomes large. As a result, the time required for the graph search increases, and there is a desire to reduce that time.
An example of an object of the present invention is to provide a graph searching apparatus and a graph searching method that reduce the time required for the graph search, and a computer-readable recording medium.
In order to achieve the above object, a graph searching apparatus in one aspect of the present invention includes:
a generation means for selecting a plurality of vertices based on an adjacency relationship of vertices included in a graph and generating a frontier matrix in which different labels are respectively set for elements corresponding to the selected vertices; and a classification means for classifying the vertices using the frontier matrix and an adjacency matrix representing the adjacency relationship of the vertices included in the graph.
Further, in order to achieve the above object, a graph searching method in one aspect of the present invention includes:
(a) selecting a plurality of vertices based on an adjacency relationship of vertices included in a graph and generating a frontier matrix in which different labels are respectively set for elements corresponding to the selected vertices; and
(b) classifying the vertices using the frontier matrix and an adjacency matrix representing the adjacency relationship of the vertices included in the graph.
Further, in order to achieve the above object, a computer-readable recording medium in one aspect of the present invention includes a program recorded thereon, the program including instructions that cause a computer to carry out:
(a) selecting a plurality of vertices based on an adjacency relationship of vertices included in a graph and generating a frontier matrix in which different labels are respectively set for elements corresponding to the selected vertices; and
(b) classifying the vertices using the frontier matrix and an adjacency matrix representing the adjacency relationship of the vertices included in the graph.
As described above, according to the present invention, the time required for the graph search can be reduced.
Hereinafter, an example embodiment of the present invention will be described with reference to
[Device Configuration]
First, a configuration of a graph searching apparatus 10 in the present example embodiment will be described with reference to
The graph searching apparatus 10 illustrated in
Among them, the generation unit 11 selects a plurality of vertices based on an adjacency relationship of vertices included in a graph, and generates a frontier matrix in which different labels are respectively set for elements corresponding to the selected vertices. The classification unit 12 classifies the vertices using the frontier matrix and an adjacency matrix representing the adjacency relationship of the vertices included in the graph.
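The generation of a frontier matrix with a different label per selected vertex can be sketched as follows. This is a minimal sketch under stated assumptions: the labels are taken to be one-hot bit patterns (inferred from the labels "01", "10" and "11" used in the description of Modification 2), the frontier matrix is held as a single column with one entry per vertex, and the names `make_frontier`, `F1` and `width` as well as the sample vertex numbers are hypothetical.

```python
def make_frontier(num_vertices, start_vertices):
    """Return (frontier, bit_width) with one distinct one-hot label per start vertex."""
    # Assumption: one bit per starting point, so the bit width equals their count.
    bit_width = len(start_vertices)
    frontier = [0] * num_vertices
    for k, vertex in enumerate(start_vertices):
        frontier[vertex] = 1 << k  # k-th start vertex gets the label with only bit k set
    return frontier, bit_width


# Hypothetical example: 6 vertices, with vertices 0 and 3 selected as starting points.
F1, width = make_frontier(6, [0, 3])
labels = [format(x, f"0{width}b") for x in F1]
print(labels)
```

Elements of unselected vertices stay at zero, which here plays the role of "no label".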
An adjacency matrix (R) of
A frontier matrix (F1) of
A bit width of a label is determined based on the number of vertices that are the starting points. In the frontier matrix of
The matrix of
Note that when calculating the product of matrices, the logical product is used to calculate the product of the elements, and the logical sum is used to calculate the sum of the elements.
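The product described above can be illustrated with the following minimal sketch, in which the product of two elements is a logical product (AND) and the sum of the element products is a logical sum (OR). The representation of labels as integers, the function name `labeled_product`, and the sample graph are assumptions for illustration only.

```python
def labeled_product(R, F):
    """Multiply adjacency matrix R by frontier column F using AND for products and OR for sums."""
    n = len(R)
    result = [0] * n
    for i in range(n):
        acc = 0
        for j in range(n):
            # Logical product: the label F[j] passes through only where R[i][j] is 1.
            term = F[j] if R[i][j] else 0
            # Logical sum: accumulate the terms with bitwise OR.
            acc |= term
        result[i] = acc
    return result


# Hypothetical path graph 0-1-2-3 with labels 01 on vertex 0 and 10 on vertex 3.
R = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
F = [0b01, 0, 0, 0b10]
M = labeled_product(R, F)
print([format(x, "02b") for x in M])
```

Each label moves from a labeled vertex to its adjacent vertices, so one product corresponds to one search step from every starting point at once.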
Subsequently, the result (M1) of the product of
Subsequently, a result (S1: second matrix) of the sum of the result (M1) of the product of
Note that when calculating the sum of matrices, the logical sum is used to calculate the sum of the elements.
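The element-wise logical sum can be sketched as follows. The integer representation of the labels, the function name `labeled_sum`, and the sample values are assumptions for illustration only.

```python
def labeled_sum(A, B):
    """Add two label columns element by element using the logical sum (bitwise OR)."""
    return [a | b for a, b in zip(A, B)]


# Hypothetical values: a frontier column and the result of one product step.
frontier = [0b01, 0, 0, 0b10]
product = [0, 0b01, 0b10, 0]
S = labeled_sum(frontier, product)
print([format(x, "02b") for x in S])
```

The resulting column holds a label for every vertex reached so far, which is the role played by the second matrix.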
The matrix (first matrix) of
Subsequently, a frontier matrix (F3) is generated using the result (M2) of the product and the result (S1) of the sum of
Subsequently, a result (S2: second matrix) of the sum illustrated in
The matrix (first matrix) of
Subsequently, a frontier matrix (F4) is generated using the result (M3) of the product and the result (S2) of the sum of
Subsequently, a result (S3: second matrix) of the sum illustrated in
Note that when the elements corresponding to all of the vertices of the result (second matrix) of the sum are labeled, processing for classifying the vertices is terminated.
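The repetition of product, exclusion, and sum up to the termination condition above can be sketched as follows. This is a minimal sketch, not a definitive implementation: it assumes that the starting points lie in different connected sets, that labels are one-hot integers, and that the initial second matrix equals the first frontier matrix. The additional stop condition for an empty frontier is also an assumption, added so that the loop ends when some vertex cannot be reached from any starting point. The name `classify` and the sample graph are hypothetical.

```python
def classify(R, start_vertices):
    """Label every reachable vertex with the label of the starting point it is connected to."""
    n = len(R)
    frontier = [0] * n
    for k, vertex in enumerate(start_vertices):
        frontier[vertex] = 1 << k
    searched = list(frontier)  # second matrix: labels of vertices searched so far

    # Stop when all vertices are labeled (as in the text) or when nothing is
    # left to expand (assumption, for vertices unreachable from any start).
    while any(frontier) and not all(searched):
        # First matrix: product of the adjacency matrix and the frontier (AND / OR).
        product = [0] * n
        for i in range(n):
            for j in range(n):
                if R[i][j]:
                    product[i] |= frontier[j]
        # New frontier: exclude the elements of vertices that are already searched.
        frontier = [0 if searched[i] else product[i] for i in range(n)]
        # New second matrix: logical sum of the product and the previous sum.
        searched = [s | p for s, p in zip(searched, product)]
    return searched


# Hypothetical graph with two connected sets: 0-1-2 and 3-4-5.
R = [
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 0],
]
result = classify(R, [0, 3])
print([format(x, "02b") for x in result])
```

Both connected sets are expanded in the same iterations, so the number of loop iterations is governed by the deepest search rather than by the sum of the individual searches.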
As described above, in the present example embodiment, since the adjacency matrix and the frontier matrix are used to classify the vertices, the time required for the graph search can be reduced.
In addition, conventionally, when performing a graph search using a processor or the like, one vertex is selected, the graph search is completed for the selected vertex, then the next vertex is selected, and the graph search is thus performed sequentially. However, in the present example embodiment, since a plurality of vertices are selected and the graph search from those vertices is performed in parallel, the number of steps required for the graph search can be reduced as compared with the conventional case.
[System Configuration]
Subsequently, the configuration of the graph searching apparatus 10 in the present example embodiment will be described more specifically with reference to
As illustrated in
The system will be described.
The graph searching apparatus 10 is, for example, an information processing device such as a vector processor, a Central Processing Unit (CPU), or a Field-Programmable Gate Array (FPGA), or a server computer, a personal computer, or a mobile terminal equipped with any of them.
The external device 20 is a device having data required for graph search. Specifically, the external device 20 may be a storage device such as a database that stores data used for analysis of SNS, analysis of a road connecting points on an electronic map, or the like, a server computer, a personal computer, a mobile terminal, or the like. Note that the external device 20 communicates with the graph searching apparatus 10 by wire or wirelessly, and transmits the data to the graph searching apparatus 10.
The output device 30 obtains the output information that the output information generation unit 15 has converted into an outputtable format, and outputs an image, sound, and the like generated based on the output information. The output device 30 is, for example, an image display device using a liquid crystal display, an organic Electro Luminescence (EL) display, or a Cathode Ray Tube (CRT). Further, the image display device may include, for example, a sound output device such as a speaker. Note that the output device 30 may be a printing device such as a printer.
The graph searching apparatus will be described.
The obtaining unit 13 obtains data required for performing a graph search on a target graph. Specifically, the obtaining unit 13 receives the data transmitted from the external device 20 and outputs the data to the adjacency matrix generation unit 14. The obtaining unit 13 communicates with, for example, the external device 20 by wire or wirelessly and receives the data.
The adjacency matrix generation unit 14 generates an adjacency matrix representing the adjacency relationship of the vertices for the target graph using the data required for the graph search. Specifically, the adjacency matrix generation unit 14 first obtains the data required for the graph search. Subsequently, the adjacency matrix generation unit 14 generates the adjacency matrix corresponding to the target graph using the vertices included in the target graph and the adjacency relationship of each vertex. The adjacency matrix generation unit 14 generates, for example, the adjacency matrix (R) as illustrated in
The generation unit 11 selects vertices based on the adjacency relationship of the vertices included in the graph, and generates the frontier matrix in which different labels are respectively set for the elements corresponding to the selected vertices. Specifically, first, the generation unit 11 selects vertices from the vertices included in the graph.
For selection of vertices, for example, a user may select the vertices as appropriate, or the generation unit 11 may select the vertices. Note that when the generation unit 11 selects the vertices, the generation unit 11 may select preset vertices or may select the vertices based on the number of vertices.
Subsequently, the generation unit 11 sets different labels for the elements corresponding to the selected vertices of the frontier matrix having the same number of rows as the number of vertices. The generation unit 11 generates, for example, the frontier matrix (F1) as illustrated in
The classification unit 12 generates a new frontier matrix by referring to the result (second matrix) of the sum representing the vertices for which the graph search has been completed and excluding the elements corresponding to the searched vertices from the result (first matrix) of the product of the adjacency matrix and the frontier matrix. Thereafter, the classification unit 12 classifies the vertices by calculating the sum of the result of the product and the result of the sum and generating a new result (second matrix) of the sum.
Specifically, first, the classification unit 12 obtains the adjacency matrix and the frontier matrix. Subsequently, the classification unit 12 calculates the result of the product of the adjacency matrix and the frontier matrix. Subsequently, if the result of the product includes searched vertices, the classification unit 12 generates a new frontier matrix by referring to the result of the sum and excluding the elements corresponding to the searched vertices from the result of the product. The classification unit 12 generates the new frontier matrices (F3) and (F4), for example, as described above with reference to
Subsequently, the classification unit 12 calculates the sum of the result of the product and the result of the sum, and generates a new result of the sum, to classify the vertices. The classification unit 12 generates the new results (S1), (S2), and (S3) of the sum, for example, as described above with reference to
The output information generation unit 15 generates the output information used to output one or more of the graph, the adjacency matrix, the frontier matrix, the result of the product, the result of the sum, and the like to the output device. Thereafter, the output information generation unit 15 transmits the generated output information to the output device.
[Modification 1]
If the vertices selected by the generation unit 11 are in the same connected set, different labels will be respectively set for the selected vertices, and therefore the classification unit 12 will attach different labels respectively to vertices in the same connected set. In this case, the classification will be different from the actual graph.
In Modification 1, even if different labels are respectively set for vertices in the same connected set, the labels of the vertices in the same connected set are unified to the same label, and thus they can be classified in the same way as the actual graph.
Modification 1 will be described with reference to
The example of
The example of
Subsequently, the result (M1′) of the product of
Subsequently, a result (S1′: second matrix) of the sum of the result (M1′) of the product of
The example of
Subsequently, the classification unit 12 generates a frontier matrix (F3′) using the result (M2′) of the product and the result (S1′) of the sum of
Subsequently, the classification unit 12 calculates a result (S2′: second matrix) of the sum of the result (M2′) of the product of
According to Modification 1, even if the vertices are selected in the same connected set, since the labels in the same connected set can be set to the same label, the vertices can be classified.
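One possible way to obtain the unified labels of Modification 1 is sketched below. This is only an illustrative reading and not necessarily the procedure of the figures: it assumes one-hot labels held as integers, merges labels with the logical sum, and lets a vertex propagate again whenever its label gains bits, until no label changes. The name `classify_unifying` and the sample graph are hypothetical.

```python
def classify_unifying(R, start_vertices):
    """Propagate OR-merged labels until stable, so one connected set shares one label."""
    n = len(R)
    labels = [0] * n
    for k, vertex in enumerate(start_vertices):
        labels[vertex] = 1 << k
    frontier = list(labels)

    while any(frontier):
        # Product of the adjacency matrix and the frontier (AND for products, OR for sums).
        product = [0] * n
        for i in range(n):
            for j in range(n):
                if R[i][j]:
                    product[i] |= frontier[j]
        merged = [old | p for old, p in zip(labels, product)]
        # Assumption: only vertices whose label changed are placed in the next frontier.
        frontier = [new if new != old else 0 for new, old in zip(merged, labels)]
        labels = merged
    return labels


# Hypothetical connected graph in which both starting points (0 and 3)
# are adjacent to vertices 1 and 2; vertices 4 and 5 hang off 1 and 2.
edges = [(0, 1), (0, 2), (3, 1), (3, 2), (1, 4), (2, 5)]
R = [[0] * 6 for _ in range(6)]
for u, v in edges:
    R[u][v] = R[v][u] = 1

result = classify_unifying(R, [0, 3])
print([format(x, "02b") for x in result])
```

In this sketch the merged label "11" first appears on the vertices reached from both starting points and then spreads back to the starting points and onward to the remaining vertices.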
[Modification 2]
If the vertices selected by the generation unit 11 are in the same connected set, since different labels are respectively set for the selected vertices, the classification unit 12 will attach different labels respectively to the vertices in the same connected set. In this case, the classification will be different from the actual graph.
In Modification 2, even if different labels are respectively set for the vertices in the same connected set, the labels of the vertices in the same connected set are unified to the same label, and thus they can be classified in the same way as the actual graph.
In Modification 2, if the classification unit 12 detects duplicate vertices, the classification unit 12 generates connection information indicating that the vertices that are the starting points of the duplicate vertices are in the same connected set and stores the connection information in a storage unit. Further, the classification unit 12 selects a label corresponding to any of the starting points of the duplicate vertices and sets it for the corresponding element of the second matrix. Thereafter, the classification unit 12 unifies the labels of the elements corresponding to the vertices in the same connected set based on the connection information after the labels are set for all of the elements of the second matrix.
Modification 2 will be described with reference to
In the example of
The example of
In such a case, since “11” is attached to the elements corresponding to the vertices 5 and 6 of the result (M1′) of the product, the classification unit 12 stores in the storage unit the connection information indicating that the vertices 4, 5, 6, and 7 are in the same connected set. Alternatively, the classification unit 12 may store in the storage unit the connection information indicating that the vertices labeled with the label “01” or the label “10” are in the same connected set.
Note that the classification unit 12 sets the label “01” set for the vertex 4 or the label “10” set for the vertex 7 for the elements corresponding to the duplicate vertices 5 and 6 of the result (M1′) of the product.
Subsequently, the result (M1′) of the product of
Subsequently, the classification unit 12 calculates the result (S1″: second matrix) of the sum of the result (M1′) of the product of
The example of
Subsequently, the classification unit 12 calculates the result (S2″: second matrix) of the sum of the result (M2″) of the product of
Subsequently, when the vertices 4 to 9 of the result (S2″) of the sum are labeled, the classification unit 12 unifies the labels corresponding to the vertices in the same connected set to the same label based on the connection information. For example, as illustrated in
According to Modification 2, even if vertices are selected in the same connected set, since the labels in the same connected set can be set to the same label, the vertices can be classified.
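The use of connection information in Modification 2 can be sketched as follows. This is a minimal sketch under stated assumptions: labels are one-hot integers, the connection information is kept in a union-find structure over the starting points, the lowest set bit is the label chosen for a duplicate vertex, and the lowest-numbered starting point supplies the unified label. As a further assumption beyond the text, a connection is also recorded when a product label reaches an already labeled vertex carrying a different label, so that two searches meeting across an edge are merged as well. All names and the sample graph are hypothetical.

```python
def classify_with_connection_info(R, start_vertices):
    """Search from all starting points, record merged starts, and unify labels at the end."""
    n, k = len(R), len(start_vertices)
    parent = list(range(k))  # connection information over starting-point indices

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def connect(bits):
        # Record that every starting point whose bit is set is in one connected set.
        idx = [b for b in range(k) if (bits >> b) & 1]
        for other in idx[1:]:
            ra, rb = find(idx[0]), find(other)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)

    searched = [0] * n  # second matrix
    for b, vertex in enumerate(start_vertices):
        searched[vertex] = 1 << b
    frontier = list(searched)

    while any(frontier):
        product = [0] * n  # first matrix
        for i in range(n):
            for j in range(n):
                if R[i][j]:
                    product[i] |= frontier[j]
        frontier = [0] * n
        for i in range(n):
            if product[i] == 0:
                continue
            connect(product[i] | searched[i])
            if searched[i] == 0:
                chosen = product[i] & -product[i]  # keep one label for a duplicate vertex
                searched[i] = chosen
                frontier[i] = chosen

    # Unify labels using the stored connection information.
    return [(1 << find(s.bit_length() - 1)) if s else 0 for s in searched]


# Hypothetical graph: starts 0 and 3 share one connected set; start 6 is in another.
edges = [(0, 1), (0, 2), (3, 1), (3, 2), (1, 4), (2, 5), (6, 7)]
R = [[0] * 8 for _ in range(8)]
for u, v in edges:
    R[u][v] = R[v][u] = 1

result = classify_with_connection_info(R, [0, 3, 6])
print([format(x, "03b") for x in result])
```

Because each vertex is labeled only once during the search, the label unification is deferred to a single pass over the second matrix at the end.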
[Device Operation]
Subsequently, the operation of the graph searching apparatus in the embodiment, and Modifications 1 and 2 of the present invention will be described with reference to
As illustrated in
Subsequently, the adjacency matrix generation unit 14 generates the adjacency matrix representing the adjacency relationship of vertices for the target graph by using the data required for the graph search (step A2). Specifically, in step A2, the adjacency matrix generation unit 14 first obtains the data required for the graph search. Subsequently, in step A2, the adjacency matrix generation unit 14 generates the adjacency matrix corresponding to the target graph by using the vertices included in the target graph and the adjacency relationship of each vertex. The adjacency matrix generation unit 14 generates, for example, the adjacency matrix (R) as illustrated in
Subsequently, the generation unit 11 selects vertices based on the adjacency relationship of the vertices included in the graph, and generates a frontier matrix in which different labels are respectively set for the elements corresponding to the selected vertices (step A3). Specifically, in step A3, the generation unit 11 first selects vertices from the vertices included in the graph.
For the selection of vertices, for example, the user may select the vertices as appropriate, or the generation unit 11 may select the vertices. Note that when the generation unit 11 selects the vertices, the generation unit 11 may select preset vertices or may select the vertices based on the number of vertices.
Subsequently, in step A3, the generation unit 11 sets different labels for elements corresponding to the selected vertices of a frontier matrix having the same number of rows as the number of vertices. The generation unit 11 generates, for example, the frontier matrix (F1) as illustrated in
Subsequently, the classification unit 12 generates a new frontier matrix by referring to the result (second matrix) of the sum representing the vertices for which the graph search has been completed, and excluding the elements corresponding to the searched vertices from the result (first matrix) of the product of the adjacency matrix and the frontier matrix (step A4). Thereafter, the classification unit 12 classifies the vertices by calculating the sum of the result of the product and the result of the sum and generating a new result (second matrix) of the sum.
Specifically, the classification unit 12 first obtains the adjacency matrix and the frontier matrix in step A4. Subsequently, the classification unit 12 calculates the result of the product of the adjacency matrix and the frontier matrix in step A4.
Subsequently, in step A4, if the result of the product includes searched vertices, the classification unit 12 generates the new frontier matrix by referring to the result of the sum and excluding the elements corresponding to the searched vertices from the result of the product. The classification unit 12 generates the new frontier matrices (F3) and (F4), for example, as described above with reference to
Subsequently, in step A4, the classification unit 12 classifies the vertices by calculating the sum of the result of the product and the result of the sum and generating a new result of the sum. The classification unit 12 generates the new results (S1), (S2), and (S3) of the sum, for example, as described above with reference to
Subsequently, the output information generation unit 15 generates the output information to be used to output one or more of the graph, the adjacency matrix, the frontier matrix, the result of the product, the result of the sum, and the like to the output device (step A5). Thereafter, the output information generation unit 15 transmits the generated output information to the output device (step A6).
[Modification 1]
The operation of Modification 1 will be described with reference to
Subsequently, in step A4, the classification unit 12 generates the result (M1′) of the product of the adjacency matrix (R) of
Subsequently, in step A4, the classification unit 12 uses the result (M1′) of the product of
Subsequently, in step A4, the classification unit 12 calculates the result (S1′: second matrix) of the sum of the result (M1′) of the product of
Subsequently, in step A4, the classification unit 12 generates the result (M2′) of the product of the adjacency matrix (R) of
Subsequently, in step A4, the classification unit 12 generates a frontier matrix (F3′) using the result (M2′) of the product and the result (S1′) of the sum of
Subsequently, in step A4, the classification unit 12 calculates a result (S2′: second matrix) of the sum of the result (M2′) of the product of
[Modification 2]
The operation of Modification 2 will be described with reference to
In step A3, when the vertices 4 and 7 illustrated in
In step A4, if the classification unit 12 detects duplicate vertices, the classification unit 12 generates connection information indicating that the vertices that are the starting points of the duplicate vertices are in the same connected set and stores the connection information in a storage unit. Further, the classification unit 12 selects the label corresponding to any of the starting points of the duplicate vertices and sets it for the corresponding element of the second matrix. Thereafter, the classification unit 12 unifies the labels of the elements corresponding to the vertices in the same connected set based on the connection information after the labels are set for all the elements of the second matrix.
Specifically, in step A4, the classification unit 12 first generates the result (M1′) of the product of the adjacency matrix (R) of
In such a case, since “11” is attached to the elements corresponding to the vertices 5 and 6 of the result (M1′) of the product, in step A4, the classification unit 12 stores in the storage unit the connection information indicating that the vertices 4, 5, 6, and 7 are in the same connected set. Alternatively, in step A4, the classification unit 12 may store in the storage unit the connection information indicating that the vertices labeled with the label “01” or the label “10” are in the same connected set.
Note that, in step A4, the classification unit 12 sets the label “01” set for the vertex 4 or the label “10” set for the vertex 7 for the elements corresponding to the duplicate vertices 5 and 6 of the result (M1′) of the product.
Subsequently, in step A4, the classification unit 12 uses the result (M1′) of the product of
Subsequently, in step A4, the classification unit 12 calculates the result (S1″: second matrix) of the sum of the result (M1′) of the product of
Subsequently, in step A4, the classification unit 12 generates the result (M2″) of the product of the adjacency matrix (R) of
Subsequently, in step A4, the classification unit 12 calculates the result (S2″: second matrix) of the sum of the result (M2″) of the product of
Subsequently, in step A4, when the vertices 4 to 9 of the result (S2″) of the sum are labeled, the classification unit 12 unifies the labels corresponding to the vertices in the same connected set to the same label based on the connection information. For example, as illustrated in
[Effect of the Present Example Embodiment]
As described above, according to the present example embodiment, since the adjacency matrix and the frontier matrix are used to classify the vertices, the time required for the graph search can be reduced.
In addition, conventionally, when performing a graph search using a processor or the like, one vertex is selected, the graph search is completed for the selected vertex, then the next vertex is selected, and the graph search is thus performed sequentially. However, in the present example embodiment, since a plurality of vertices are selected and the graph search from those vertices is performed in parallel, the number of steps required for the graph search can be reduced as compared with the conventional case.
According to Modifications 1 and 2, even when the vertices are selected in the same connected set, since the labels in the same connected set can be set to the same label, the vertices can be classified.
[Program]
The program according to the embodiment of the present invention may be any program that causes the computer to execute steps A1 to A6 illustrated in
Further, the program in the present example embodiment may be executed by a computer system constructed by a plurality of computers. In this case, for example, each computer may function as any of the obtaining unit 13, the adjacency matrix generation unit 14, the generation unit 11, the classification unit 12, and the output information generation unit 15.
[Physical Configuration]
Here, the computer for realizing the graph searching apparatus by executing the programs in the embodiment, and Modifications 1 and 2 will be described with reference to
As illustrated in
The CPU 111 expands the program (code) in the present example embodiment stored in the storage device 113 into the main memory 112 and executes the program in a predetermined order to perform various operations. The main memory 112 is typically a volatile storage device such as a dynamic random access memory (DRAM). Further, the program in the present example embodiment is provided in a state of being stored in a computer-readable recording medium 120. The program in the present example embodiment may be distributed on the Internet connected via the communication interface 117. Note that the recording medium 120 is a non-volatile recording medium.
Further, examples of the storage device 113 include a semiconductor storage device such as a flash memory in addition to a hard disk drive. The input interface 114 mediates data transmission between the CPU 111 and an input device 118 such as a keyboard and a mouse. The display controller 115 is connected to a display device 119 and controls display on the display device 119.
The data reader/writer 116 mediates the data transmission between the CPU 111 and the recording medium 120, reads the program from the recording medium 120, and writes a processing result in the computer 110 to the recording medium 120. The communication interface 117 mediates the data transmission between the CPU 111 and another computer.
Further, examples of the recording medium 120 include a general-purpose semiconductor storage device such as CompactFlash (CF) (registered trademark) or Secure Digital (SD), a magnetic recording medium such as a flexible disk, and an optical recording medium such as a compact disk read only memory (CD-ROM).
Note that the graph searching apparatus 10 in the present example embodiment can also be implemented using hardware corresponding to each part instead of the computer in which the program is installed. Further, the graph searching apparatus 10 may be partially implemented by a program and the rest may be implemented by hardware.
[Supplementary Notes]
Regarding the above-described example embodiment, the following supplementary notes are further disclosed. A part or all of the above-described example embodiment can be described by (Supplementary note 1) to (Supplementary note 10) described below, but is not limited to the following description.
(Supplementary Note 1)
A graph searching apparatus including:
a generation unit configured to select a plurality of vertices based on an adjacency relationship of vertices included in a graph and generate a frontier matrix in which different labels are respectively set for elements corresponding to the selected vertices; and
a classification unit configured to classify the vertices using the frontier matrix and an adjacency matrix representing the adjacency relationship of the vertices included in the graph.
(Supplementary Note 2)
The graph searching apparatus described in supplementary note 1, in which
the classification unit
generates a new frontier matrix by referring to a second matrix representing vertices for which a graph search has been completed and excluding elements corresponding to the searched vertices from a first matrix that is the product of the adjacency matrix and the frontier matrix, and classifies the vertices by calculating the sum of the first matrix and the second matrix and generating a new second matrix.
(Supplementary Note 3)
The graph searching apparatus described in supplementary note 1 or 2, in which a bit width of the label is determined based on the number of vertices that are starting points.
(Supplementary Note 4)
The graph searching apparatus described in any one of supplementary notes 1 to 3, in which the generation unit and the classification unit are operated using a vector processor.
(Supplementary Note 5)
A graph searching method including:
(a) a step of selecting a plurality of vertices based on an adjacency relationship of vertices included in a graph and generating a frontier matrix in which different labels are respectively set for elements corresponding to the selected vertices; and
(b) a step of classifying the vertices using the frontier matrix and an adjacency matrix representing the adjacency relationship of the vertices included in the graph.
(Supplementary Note 6)
The graph searching method described in supplementary note 5, in which
in the (b) step,
a new frontier matrix is generated by referring to a second matrix representing vertices for which a graph search has been completed and excluding elements corresponding to the searched vertices from a first matrix that is the product of the adjacency matrix and the frontier matrix; and
the vertices are classified by calculating the sum of the first matrix and the second matrix and generating a new second matrix.
(Supplementary Note 7)
The graph searching method described in supplementary note 5 or 6, in which a bit width of the label is determined based on the number of vertices that are starting points.
(Supplementary Note 8)
A computer readable recording medium including a program recorded thereon, the program including instructions for causing a computer to carry out:
(a) a step of selecting a plurality of vertices based on an adjacency relationship of vertices included in a graph and generating a frontier matrix in which different labels are respectively set for elements corresponding to the selected vertices; and
(b) a step of classifying the vertices using the frontier matrix and an adjacency matrix representing the adjacency relationship of the vertices included in the graph.
(Supplementary Note 9)
The computer readable recording medium described in supplementary note 8, in which
in the (b) step,
a new frontier matrix is generated by referring to a second matrix representing vertices for which a graph search has been completed and excluding elements corresponding to the searched vertices from a first matrix that is the product of the adjacency matrix and the frontier matrix, and
the vertices are classified by calculating the sum of the first matrix and the second matrix and generating a new second matrix.
(Supplementary Note 10)
The computer readable recording medium described in supplementary note 8 or 9, in which
a bit width of the label is determined based on the number of vertices that are starting points.
Although the present invention has been described above with reference to the example embodiment, the present invention is not limited to the above example embodiment. Various changes that can be understood by those skilled in the art can be made within the scope of the present invention in terms of the structure and details of the present invention.
As described above, according to the present invention, the time required for graph search can be reduced. The present invention is useful in fields where graph search is required. For example, it can be used for analysis of SNS, analysis of a road connecting points on an electronic map, and the like.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2019/047708 | 12/5/2019 | WO |