This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2013-068967, filed on Mar. 28, 2013, the entire contents of which are incorporated herein by reference.
This invention relates to a technique for generating a graph that connects plural nodes with links.
Many techniques exist in which, when a graph that connects plural nodes with links is displayed, the position of each node is calculated by an algorithm that conforms to a mechanics model such as a spring model.
In such a technique, when the number of nodes increases, some nodes are typically not connected with links because of problems such as the calculation amount and the display space. When there is no link between nodes, there is little relation between those nodes in almost all cases. However, when the link is not actually provided, the information that there is little relation between those nodes is not reflected in the position calculation. In other words, the positions of the nodes are calculated as if the relation between those nodes were indifferent, rather than weak. Therefore, because of other constraint conditions, nodes that have little relation may be disposed in the neighborhood of each other.
Recently, attention has been paid to data mining, which is a technique for extracting knowledge from a large volume of data. In data mining, analysis for deriving the relations among the elements included in the large volume of data is often performed. For example, when it is assumed that the transmission source and the destination of an e-mail have a relation, data representing the relations among users is extracted by analyzing e-mail that is transmitted and received among certain users. Typically, e-mail is frequently exchanged between certain users, whereas other users do not exchange e-mail at all, so there are various relations among users.
When such data representing the relations among users is expressed by a two-dimensional graph, the relation may be represented by the thickness of the link. Alternatively, the positions of the nodes representing the users may be optimized by using a mechanics model such as the spring model described above, and the relations among the users may be expressed by the final positional relation of the nodes.
When the latter expression method is employed together with the aforementioned typical conventional method, the relation between nodes may be misunderstood. In other words, nodes may have little relation even though they are disposed in the vicinity of each other. In particular, when the relation is expressed by a two-dimensional graph, nodes and/or links overlap because of the limited space, and accordingly, it is not possible to identify nodes that are disposed in the vicinity of each other despite having little relation.
In many cases, such a graph is handled as being static. However, there are cases where the position of a node can be changed according to the user's instruction. In those cases, the meaning of changing the position of the node according to the user's instruction is not clear.
According to one aspect of embodiments, an information processing method includes: (A) obtaining, for each node of plural nodes in a graph, which are associated with each other, a display position at which the node is displayed on a display device; (B) calculating, for each node of the plural nodes, a movement vector according to a total sum of forces in conformity with a mechanics model in which an inertial force does not work, wherein the total sum of the forces is obtained by adding, with respect to all of nodes other than the node, a force that works in association with a distance concerning the display position with another node; (C) moving, for each node of the plural nodes, the display position by the calculated movement vector; and (D) while repeating the obtaining, the calculating and the moving or before performing the obtaining, the calculating and the moving, accepting an instruction corresponding to a user's operation for causing a display position of a certain node among the plural nodes to be changed, and changing the display position of the certain node on the display device according to the instruction.
The object and advantages of the embodiment will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the embodiment, as claimed.
The first data storage unit 110 stores data concerning degrees of association aij between elements i and j, which is obtained from an analysis apparatus 300 that performs a processing relating to the data mining, for example, to calculate data concerning degrees of association between elements. For example, the degree of association aij is expressed by a value from 0.0 to 1.0. In this embodiment, the degrees of association aij for all combinations of elements are set, and reference distances described later are also set.
The distance calculation unit 120 calculates a reference distance between a node i corresponding to an element i and a node j corresponding to an element j from the data concerning the degrees of association, which is stored in the first data storage unit 110, and stores the calculated reference distances into the second data storage unit 130. As described above, the reference distance dij is calculated for all combinations of nodes. The third data storage unit 190 stores data concerning a present display position vi (= (x, y), coordinate values in a two-dimensional space) of the node i.
The movement processing unit 140 calculates, for each node, data concerning a display position vi of a movement destination by using data stored in the second data storage unit 130 and data stored in the third data storage unit 190, and stores the calculated data into the fourth data storage unit 150. In this embodiment, the spring model is employed as one example of the mechanics model for calculation of the movement destination. However, as described later, the inertial force does NOT work on nodes.
The position obtaining unit 180 obtains data of the present display position from the fourth data storage unit 150 and the input unit 200, and stores the obtained data into the third data storage unit 190.
The display data generator 160 generates display data of a graph by using data of the display positions of nodes, which is stored in the fourth data storage unit 150, and data concerning the degrees of association aij between nodes, which is stored in the first data storage unit 110, and outputs the generated display data to the display unit 170. When the user inputs an instruction to move the position of a certain node by operating an input device such as a mouse through the input unit 200, the display data generator 160 performs processing to move the certain node in response to the instruction. Moreover, the instruction to move the position of the certain node is also output from the input unit 200 to the position obtaining unit 180, and the position obtaining unit 180 modifies the display position of the certain node, which is stored in the fourth data storage unit 150, and also stores the modified display position and the display positions of other nodes into the third data storage unit 190.
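The role of the position obtaining unit 180 described above can be sketched as follows. This is only an illustrative sketch: the function name obtain_positions and the representation of a movement instruction as a (node index, new position) pair are assumptions that do not appear in the embodiment.

```python
def obtain_positions(computed_positions, user_move=None):
    """Return the present display positions for the next iteration.

    computed_positions: list of (x, y) tuples, i.e. the movement destinations
        that would be read from the fourth data storage unit 150.
    user_move: None, or a hypothetical (node_index, (x, y)) pair that stands
        for a movement instruction accepted through the input unit 200.
    """
    # Copy so that the stored movement destinations are not modified in place.
    present = list(computed_positions)
    if user_move is not None:
        node_index, new_position = user_move
        # Only the node named in the instruction is overridden; all other
        # nodes keep the positions computed by the movement processing.
        present[node_index] = new_position
    return present
```

For example, obtain_positions([(0.0, 0.0), (1.0, 1.0)], user_move=(1, (5.0, 5.0))) keeps the first node where it is and places the second node at (5.0, 5.0).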
In principle, by disposing nodes that have a strong relation in the neighborhood of each other and by disposing nodes that have little relation far away from each other, this embodiment enables the user who views the graph to intuitively understand the relations among the elements corresponding to the nodes. For this purpose, disposition processing of the nodes is performed based on the spring model while assuming that there are links for all combinations of nodes. The reason is as follows: when the positions of the nodes are determined while assuming that there is no link between nodes that have little relation, a node disposition is obtained in which the fact that there is little relation is not taken into consideration. In other words, the processing is performed as if the relation were indifferent, and nodes that have little relation are often disposed in the vicinity of each other. More specifically, in that case, no repulsion works between nodes that have little relation.
However, when the number of nodes becomes huge, it becomes difficult to view the nodes because they overlap in the two-dimensional graph. Nevertheless, in order to maintain the aforementioned intuitiveness, a method for resolving the overlap of the nodes by other rules is not employed.
Moreover, as schematically illustrated in
In this embodiment, in order to solve the difficulty in viewing the graph due to the overlap of the nodes, and to verify whether nodes are disposed adjacently only because a temporary balance of the forces such as a local optimum has occurred or because they originally have a strong relation, the user who refers to the graph can move the node position interactively. When the spring model is employed, an intuitive and easily understandable phenomenon can be confirmed in which, when a certain node among nodes that have a strong relation is pulled away, the certain node immediately returns to the other nodes that have the strong relation with it, in other words, to the original position. On the other hand, when a node is moved that has a weak relation with the surrounding nodes and is disposed at its position only because of a temporary balance of the forces, another phenomenon is confirmed in which such a node moves to another position and becomes stable. Thus, by moving a node, it is possible to confirm the strength of its relation with the surrounding nodes, and accordingly the importance of the node.
In the following, processing contents of the information processing apparatus 100 will be explained by using
Firstly, the distance calculation unit 120 calculates a reference distance dij between nodes based on the degree of association aij between elements, which is stored in the first data storage unit 110, and stores the calculated reference distances into the second data storage unit 130 (
For example, data concerning the degrees of association, as illustrated in
The reference distance dij is expressed by the following matrix D.
Here, it is assumed that there are “n” nodes.
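As a minimal sketch of the reference distance calculation, the n × n matrix D could be built from the degrees of association as follows. The mapping d_ij = max_distance × (1 − a_ij), the function name reference_distances, and the parameter max_distance are assumptions introduced purely for illustration; the mapping actually used in the embodiment is not stated in this passage and may differ.

```python
def reference_distances(assoc, max_distance=1.0):
    """Build the matrix D of reference distances d_ij.

    assoc: n x n nested list of degrees of association a_ij in [0.0, 1.0].
    ASSUMPTION: a high degree of association maps to a short reference
    distance through d_ij = max_distance * (1 - a_ij). This mapping is
    hypothetical and only illustrates the data flow.
    """
    n = len(assoc)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # A reference distance is set for every pair of nodes,
                # including pairs whose degree of association is low.
                D[i][j] = max_distance * (1.0 - assoc[i][j])
    return D
```

Because every off-diagonal entry is filled, pairs with a low degree of association receive a long reference distance instead of being ignored, which matches the assumption that links exist for all combinations of nodes.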
Moreover, the position obtaining unit 180 sets an initial display position vi for each node i (1≦i≦n), and stores the initial display positions into the third data storage unit 190 (step S3). For example, the initial display position is set randomly.
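The random initialization at step S3 might be sketched as follows. The function name initial_positions, the rectangular display area given by width and height, and the optional seed are assumptions made for illustration only.

```python
import random


def initial_positions(n, width=1.0, height=1.0, seed=None):
    """Return n random initial display positions v_i = (x, y).

    ASSUMPTION: positions are drawn uniformly from a width x height
    rectangle; the embodiment only states that they are set randomly.
    """
    rng = random.Random(seed)  # a seed makes the layout reproducible
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height))
            for _ in range(n)]
```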
After that, the movement processing unit 140 calculates, for each node i, a movement vector fi according to a total sum of relative vectors (vj−vi), each of which is weighted according to the difference between the distance d(vi, vj) from the display position vi to the display position vj of another node j and the reference distance dij (more specifically, the difference normalized by the distance d(vi, vj)), and stores data of the calculated movement vectors fi into the fourth data storage unit 150 (step S5).
More specifically, the movement vector fi is calculated by the following expression.
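The expression itself is not reproduced in this text. One form that is consistent with the description of step S5 (relative vectors weighted by the difference between the present distance and the reference distance, normalized by the present distance, with a spring constant of 1) is given below; it is a reconstruction from the surrounding description and not necessarily the exact notation of the original.

```latex
f_i = \delta \sum_{j \neq i} \frac{d(v_i, v_j) - d_{ij}}{d(v_i, v_j)} \, (v_j - v_i)
```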
δ is a sufficiently small value. In other words, each time the loop is repeated, the node is moved slightly. d(vi, vj) represents a function for calculating the distance between a display position vi and a display position vj (e.g. the Euclidean distance). As described above, this expression represents that forces from all other nodes j work on one node i according to the differences from the reference distances dij. However, this expression is an example, and other expressions may be employed. For example, in this expression, the spring constant is “1”, however, other values may be employed for the spring constant.
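The calculation of step S5 can be sketched in code as follows, under the reading that each relative vector (vj−vi) is weighted by (d(vi, vj) − dij) / d(vi, vj) and that the sum is scaled by δ. The function name movement_vector, the default value of delta, and the handling of coincident nodes are assumptions for illustration, not part of the embodiment.

```python
import math


def movement_vector(i, positions, D, delta=0.05):
    """Compute the movement vector f_i of node i (spring constant 1).

    positions: list of (x, y) display positions v_j.
    D: n x n matrix of reference distances d_ij.
    delta: small step factor (the value 0.05 is an assumed default).
    """
    xi, yi = positions[i]
    fx = fy = 0.0
    for j, (xj, yj) in enumerate(positions):
        if j == i:
            continue
        dist = math.hypot(xj - xi, yj - yi)  # Euclidean distance d(v_i, v_j)
        if dist == 0.0:
            # ASSUMPTION: coincident nodes are skipped to avoid dividing
            # by zero; the text does not describe this case.
            continue
        weight = (dist - D[i][j]) / dist
        # weight > 0 pulls node i toward node j, weight < 0 pushes it away.
        fx += weight * (xj - xi)
        fy += weight * (yj - yi)
    return (delta * fx, delta * fy)
```

With two nodes at (0, 0) and (3, 0), a reference distance of 1, and delta = 0.1, node 0 receives a vector of about (0.2, 0), that is, it is pulled toward node 1 in proportion to the excess distance of 2.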
However, in this embodiment, the friction is infinite, and the inertial force does not work. When the inertial force is taken into account in this spring model, it takes a long time until the node disposition converges. As described above, there is a case where the strength of the relation is confirmed by manually changing the node position. If convergence takes a long time, it also takes a long time before the confirmation can begin, and the interactivity is lowered. Moreover, the kinetic energy increases when a node is moved by the user, so when the inertial force works, the system becomes unstable and the node disposition may not settle. Therefore, the spring model without the inertial force is employed in this embodiment. When there is no inertial force, the calculation of the speed and the like becomes unnecessary, which reduces the computation amount. In addition, the interactivity increases.
Furthermore, as for the weight of the node, a sufficiently large weight is employed so that the node does not move even by the spring.
This calculation corresponds to solving a problem in which the node positions vi are calculated so as to minimize the evaluation value of the following evaluation expression.
This embodiment differs from typical conventional art in that, instead of a scalar value such as the energy that was conventionally used, the direction of the force that works on each node (i.e. the relative vector) is taken into consideration.
Then, the movement processing unit 140 moves each node i by the movement vector fi to calculate the display position of the movement destination by (the movement vector fi+present display position vi), and stores the calculated display positions into the fourth data storage unit 150 (step S7).
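Steps S5 and S7 together can be sketched as one update that keeps no velocity between iterations, which is how the absence of the inertial force shows up in code. The function name move_nodes, the default delta, and the choice to update all nodes from the same snapshot of positions are assumptions for illustration.

```python
import math


def move_nodes(positions, D, delta=0.1):
    """Apply one iteration: compute every f_i, then set v_i <- v_i + f_i.

    No velocity is carried over between calls, so a node stops as soon as
    the forces on it balance (no inertial force).
    ASSUMPTION: all movement vectors are computed from the same present
    positions before any node is moved.
    """
    moves = []
    for i, (xi, yi) in enumerate(positions):
        fx = fy = 0.0
        for j, (xj, yj) in enumerate(positions):
            if j == i:
                continue
            dist = math.hypot(xj - xi, yj - yi)
            if dist > 0.0:  # assumed guard against coincident nodes
                weight = (dist - D[i][j]) / dist
                fx += weight * (xj - xi)
                fy += weight * (yj - yi)
        moves.append((delta * fx, delta * fy))
    # Movement destination = present display position + movement vector.
    destinations = [(x + fx, y + fy)
                    for (x, y), (fx, fy) in zip(positions, moves)]
    return destinations, moves
```

With two nodes at distance 3 and a reference distance of 1, each call shrinks the excess distance by the factor (1 − 2 × delta), so the pair approaches the reference distance monotonically without oscillating.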
For example, data as illustrated in
Then, the display data generator 160 disposes a node at the display position vi of each node i, which is stored in the fourth data storage unit 150, generates display data of a graph in which a link is set between nodes according to the degrees of association aij, which is stored in the first data storage unit 110, and, for example, displays the display data on the display unit 170 (step S9). Specifically, display of the link may be omitted when the degree of association is equal to or less than a predetermined value, the link may be thickened when the degree of association exceeds a predetermined value, or a link having a thickness corresponding to the degree of association may be set. However, as described above, the reference distance dij is calculated while assuming links exist for all combinations of nodes, and while taking into account all combinations of nodes, the movement vector fi is calculated at the step S5.
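The link display rule at step S9 might be sketched as follows. The function name link_width, the two threshold values, and the concrete widths are hypothetical; the embodiment only refers to predetermined values.

```python
def link_width(a_ij, hide_at_or_below=0.2, thicken_above=0.7):
    """Decide how the link between two nodes is drawn.

    Returns None when the link is not displayed, otherwise a line width.
    ASSUMPTION: the thresholds 0.2 and 0.7 and the widths 1.0 and 3.0 are
    illustrative values only.
    """
    if a_ij <= hide_at_or_below:
        return None   # link omitted from the display
    if a_ij > thicken_above:
        return 3.0    # thick link for a high degree of association
    return 1.0        # ordinary link
```

This rule affects only what is drawn. The movement vectors are still calculated over all combinations of nodes, including pairs for which link_width returns None.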
For example, display as illustrated in
When an instruction to end this processing is received from the user or the like, or when a preset end condition is satisfied, namely that the total sum of the movement vectors fi calculated at the step S5 is less than a predetermined value (step S11: Yes route), the processing ends. On the other hand, when the processing does not end (step S11: No route), the input unit 200 determines whether or not a movement instruction of a node is inputted from the user (step S12). When the movement instruction of the node is not inputted (step S12: No route), the processing shifts to step S15.
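The end condition at step S11 can be sketched as follows. The text does not state whether the total sum is taken over the vectors themselves or over their magnitudes; this sketch assumes the sum of magnitudes, and the function name has_converged and the default threshold are likewise assumptions.

```python
import math


def has_converged(moves, threshold=1e-3):
    """Return True when the total movement falls below the threshold.

    moves: list of movement vectors f_i as (fx, fy) tuples.
    ASSUMPTION: the total is the sum of the magnitudes |f_i|. Summing the
    vectors themselves would give zero whenever the pairwise forces cancel,
    even while the nodes are still moving.
    """
    total = sum(math.hypot(fx, fy) for fx, fy in moves)
    return total < threshold
```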
On the other hand, when the user operates the mouse or the like, and inputs the movement instruction by changing the display position of the displayed node by drag-and-drop, for example, (step S12: Yes route), the input unit 200 accepts the movement instruction, and outputs the movement data to the display data generator 160. Then, the display data generator 160 changes the display contents according to the movement instruction, and outputs data concerning the changed display contents to the display unit 170 (step S13).
Moreover, when there is no movement instruction from the input unit 200, the position obtaining unit 180 reads out data of the display positions, which is stored in the fourth data storage unit 150 as it is, and stores the read data into the third data storage unit 190. On the other hand, when the input unit 200 accepts the movement instruction, the input unit 200 outputs the movement instruction to the position obtaining unit 180. Then, the position obtaining unit 180 changes the display position for a node relating to the movement instruction according to the movement instruction, and then reads out data of the display positions for other nodes, which is stored in the fourth data storage unit 150, and stores the changed data and read data into the third data storage unit 190 (step S15). For example, data as illustrated in
By repeating such a processing, display as schematically illustrated in the following is made.
For example, when it is assumed that goods ordered together from the same store in a franchise, which handles a lot of goods, have a relation, and orders made on plural days from a lot of stores are analyzed by the analysis apparatus 300, relations among a lot of goods are extracted.
For example, when the steps S3 to S15 (except step S13) are repeated until the node positions are converged, in other words, the total sum of the vectors fi calculated at the step S5 is less than the predetermined value, a graph illustrated in
When the user drops the node “milk” in the state illustrated in
When the degree of association is high, the movement speed is also high. Because the inertial force does not work, the movement distance obtained each time the processing of the steps S5 to S15 is executed is greater than in a case where the degree of association is low, and the node position converges rapidly. Therefore, the verification of other nodes can be performed immediately.
As illustrated in
Next, as illustrated in
As described above, by performing the processing in this embodiment, the user can confirm, in the graph in which plural nodes are disposed in the two-dimensional space, whether this node is a node that is surrounded by other nodes that have strong relations with this node or a node that has a weak relation with the surrounding nodes and is disposed at the current position accidentally.
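The confirmation described above can be illustrated with a small simulation: three strongly related nodes (all reference distances equal to 1) are settled, one of them is dragged away, and the iteration is resumed. This is a toy sketch under stated assumptions. The function names step, settle and drag_and_settle, the values of delta and threshold, and the three-node data are all hypothetical.

```python
import math


def step(positions, D, delta=0.1):
    """One update without velocity state; returns positions and total |f_i|."""
    new_positions = []
    total = 0.0
    for i, (xi, yi) in enumerate(positions):
        fx = fy = 0.0
        for j, (xj, yj) in enumerate(positions):
            if j == i:
                continue
            dist = math.hypot(xj - xi, yj - yi)
            if dist > 0.0:  # assumed guard against coincident nodes
                weight = (dist - D[i][j]) / dist
                fx += weight * (xj - xi)
                fy += weight * (yj - yi)
        fx *= delta
        fy *= delta
        total += math.hypot(fx, fy)
        new_positions.append((xi + fx, yi + fy))
    return new_positions, total


def settle(positions, D, delta=0.1, threshold=1e-9, max_steps=10000):
    """Repeat step() until the total movement is below the threshold."""
    for _ in range(max_steps):
        positions, total = step(positions, D, delta)
        if total < threshold:
            break
    return positions


def drag_and_settle(positions, D, node, new_position, **kwargs):
    """Emulate a drag-and-drop of one node, then resume the iteration."""
    moved = list(positions)
    moved[node] = new_position
    return settle(moved, D, **kwargs)


if __name__ == "__main__":
    D = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
    start = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
    final = drag_and_settle(start, D, node=2, new_position=(0.5, 3.0))
    print(final)
```

In this toy case the dragged node ends up at the reference distance from both of its neighbors again. With only three nodes the neighbors are also pulled toward the dragged node, so the whole group shifts; in a graph with many nodes the surrounding nodes would be held in place by their other relations, and the dragged node would return close to its original position.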
Moreover, when it is assumed that the transmission source user and destination user have a relation, and the degrees of association are calculated by analyzing a lot of e-mail, the graph as illustrated in
Although the embodiment of this invention was explained, this invention is not limited to that embodiment. For example, as for the processing flow, as long as the processing result does not change, the order of the steps may be exchanged, and plural steps may be executed in parallel. For example, until the node positions are converged to a certain degree, in other words, when the total sum of the vectors fi calculated at the step S5 in
Moreover, the functional block diagram is a mere example, and does not correspond to the program module configuration and file configuration.
Moreover, the application range of this embodiment is not limited to the relations of goods or users, and the embodiment can be applied to various fields.
Furthermore, the information processing apparatus 100 may be configured by plural computers instead of a single computer.
In addition, the aforementioned information processing apparatus 100 is a computer device as illustrated in
The aforementioned embodiments are outlined as follows:
An information processing method relating to this embodiment includes (A) obtaining, for each node of plural nodes in a graph, which are associated with each other, a display position at which the node is displayed on a display device; (B) calculating, for each node of the plural nodes, a movement vector according to a total sum of forces in conformity with a mechanics model in which an inertial force does not work, wherein the total sum of the forces is obtained by adding, with respect to all of nodes other than the node, a force that works in association with a distance concerning the display position with another node; (C) moving, for each node of the plural nodes, the display position by the calculated movement vector; and (D) while repeating the obtaining, the calculating and the moving or before performing the obtaining, the calculating and the moving, accepting an instruction corresponding to a user's operation for causing a display position of a certain node among the plural nodes to be changed, and changing the display position of the certain node on the display device according to the instruction.
For example, when the display position of a certain node is forcibly moved after the display positions have temporarily converged, the movement vector is recalculated according to the display position after the movement. Therefore, when the certain node has strong relations with the surrounding nodes, the certain node returns to almost the original position, whereas when the certain node does not have such strong relations, a phenomenon is observed in which the display position of the certain node converges to another display position. Nodes that have a strong relation are originally disposed in the vicinity of each other. However, there are also nodes that are disposed in the vicinity only because of the curvature caused by displaying the graph in the two-dimensional space. According to the aforementioned processing, it is possible to easily confirm whether or not the node selected by the user is such a node. Because the movement vector is calculated while assuming that there are links for all combinations of nodes, weak repulsions work between nodes even when the relation between them is weak, and an appropriate node disposition is obtained.
The aforementioned calculating method may include calculating, for each node of the plural nodes, a movement vector according to a total sum of relative vectors, wherein the total sum of relative vectors is obtained by adding, with respect to all of nodes other than the node, a relative vector with another node, which is weighted according to a difference between a first distance concerning the display position with the another node and a second distance with the another node, which is preset according to a degree of association between the node and the another node. Because the inertial force does not work, the convergence of the node position can be stably and rapidly realized.
In addition, a link between nodes of the plural nodes may be selected as a display target according to a degree of association between the nodes. Although the links are set for all combinations of nodes, it is separately determined whether or not each link is displayed and how the link is displayed.
Incidentally, it is possible to create a program causing a computer to execute the aforementioned processing, and such a program is stored in a computer readable storage medium or storage device such as a flexible disk, CD-ROM, DVD-ROM, magneto-optic disk, a semiconductor memory, and hard disk. In addition, the intermediate processing result is temporarily stored in a storage device such as a main memory or the like.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiments of the present inventions have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2013-068967 | Mar 2013 | JP | national |