The present description generally relates to the technical field of artificial intelligence and measurement methods, and it relates in particular to knowledge graphs used for recommendations.
The present disclosure relates to a computer implemented system and methods to determine relationships between data elements using a knowledge graph.
In knowledge representation and reasoning, a knowledge graph is a knowledge base that uses a graph-structured data model to integrate data. Knowledge graphs store and process interlinked descriptions of entities, which are represented as nodes; the links between nodes, known as edges, represent the relationships between those entities.
A social network can be seen as an analogy for a knowledge graph, as social networks encode entities such as people and the relations between them. For example, two individuals can be connected through a relation such as friendship, and these relations are known as edges. A knowledge graph encodes its contents in a similarly connected manner, using nodes for entities and edges for relations between two entities, but unlike a social network, a knowledge graph is not required to encode knowledge focused on people.
By analysing existing edges in a graph, new subtle relations in the graph can be discovered. For example, two individuals with ten common friends may benefit from the recommendation of becoming friends. Carrying out this type of reasoning using artificial intelligence techniques can require assigning weights to different types of relationships, such as a friend or colleague, as each relationship can have a different relevance to the problem at hand. Therefore, assigning accurate weights to different types of relationships can greatly impact the accuracy of the recommendations.
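By way of illustration only, the effect of relationship-type weights on such a recommendation can be sketched as follows. The relation names, the weight values, and the scoring rule (multiplying the two relation weights for each shared neighbour) are assumptions made for this sketch and are not taken from the disclosure.

```python
from collections import defaultdict

# Hypothetical relation-type weights; the values are illustrative only.
RELATION_WEIGHTS = {"friend": 1.0, "colleague": 0.4}


def build_adjacency(edges):
    """Map each node to {neighbour: relation_type} for undirected edges."""
    adjacency = defaultdict(dict)
    for a, b, relation in edges:
        adjacency[a][b] = relation
        adjacency[b][a] = relation
    return adjacency


def recommendation_score(adjacency, a, b, weights):
    """Sum, over shared neighbours, the product of the two relation weights."""
    shared = set(adjacency[a]) & set(adjacency[b])
    return sum(weights[adjacency[a][n]] * weights[adjacency[b][n]] for n in shared)


edges = [
    ("alice", "carol", "friend"),
    ("bob", "carol", "friend"),
    ("alice", "dave", "colleague"),
    ("bob", "dave", "colleague"),
]
adjacency = build_adjacency(edges)
score = recommendation_score(adjacency, "alice", "bob", RELATION_WEIGHTS)
```

In this sketch a shared friend contributes more to the score than a shared colleague, so changing the weights changes which pairs would be recommended.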
According to one aspect of the present disclosure, a computer-implemented method is disclosed for determining relationships between data elements in a data store, where the data is obtained from multiple information sources. This disclosure addresses the problem of assigning weights in order to increase the customisability of a knowledge graph. The method includes establishing a knowledge graph having graph elements including a plurality of nodes representing data elements and a plurality of edges, wherein each edge joins two nodes in the plurality of nodes and represents a strength of relationship between said two nodes. The method further includes associating weights with graph elements and determining at least some of said weights as modifiable weights. Using training data comprising established relationship strength values between pairs of nodes, the method determines relationships between pairs of nodes by establishing a set of qualifying paths between each pair of nodes. Once the qualifying paths have been established, the strength of the relationship between the pair of nodes is determined by summing weights associated with each qualifying path in the set of qualifying paths to calculate a relationship strength value for the edge between the pair of nodes. Then, the established relationship strength values are compared with the calculated relationship strength values, and one or more of the modifiable weights are modified so that the calculated relationship strength values match the established relationship strength values. The steps of calculating the relationship strength values and modifying the modifiable weights are repeated until a matching criterion is reached. The relationship strengths are then established between data elements in the data store in accordance with the relationship strength calculated between the nodes representing those data elements.
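The calculation of a relationship strength value from a set of qualifying paths can be sketched as follows. The disclosure states that weights associated with each qualifying path are summed; how the weight of a single path is derived from its graph elements is not specified here, so this sketch assumes, purely for illustration, that a path weight is the product of the weights of the edge types along the path. The function names and weight values are hypothetical.

```python
from math import prod


def path_weight(path, type_weights):
    """Weight of one path: product of its edge-type weights (an assumption)."""
    return prod(type_weights[edge_type] for edge_type in path)


def relationship_strength(paths, type_weights):
    """Sum the weights of all qualifying paths between one pair of nodes."""
    return sum(path_weight(path, type_weights) for path in paths)


# Each qualifying path is written as the sequence of edge types it traverses.
type_weights = {"friend": 0.5, "colleague": 0.2}
paths = [("friend",), ("friend", "colleague"), ("colleague", "colleague")]
strength = relationship_strength(paths, type_weights)
```

Under the product assumption, longer paths contribute less whenever the weights are below one, which is consistent with the observation that long paths tend to lose their importance.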
As paths beyond a given length tend to lose their importance, the qualifying paths may be subject to one or more qualifying parameters. One qualifying parameter is a cut-off condition, which provides a beneficial trade-off between computational time and the importance of paths. Another qualifying parameter is a filter, by which pathways that match one or several irrelevance patterns are filtered away. It should be noted that the qualifying parameters are specified as input during initialization.
In some embodiments, the calculated relationship strength values are dependent on the modifiable weights that characterise the graph elements. The modifiable weights are then adjusted to reduce the error between the calculated relationship strength values and the established relationship strength values recorded in the training data. In such embodiments, the matching criterion is satisfied only when the changes in the modifiable weights become smaller than a predefined threshold.
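One way such an adjustment loop could look is sketched below. The disclosure does not prescribe an optimisation technique, so this sketch assumes gradient descent on a squared error, with the gradient estimated by central finite differences, and it again assumes that a path weight is the product of its edge-type weights. The function names, learning rate, threshold, and training values are all hypothetical.

```python
from math import prod


def strength(paths, weights):
    """Sum of path weights; a path weight is the product of its edge-type weights."""
    return sum(prod(weights[t] for t in path) for path in paths)


def loss(training_data, weights):
    """Squared error between calculated and established strength values."""
    return sum((strength(paths, weights) - target) ** 2
               for paths, target in training_data)


def train(training_data, weights, modifiable, rate=0.1,
          threshold=1e-6, max_iterations=10_000, eps=1e-6):
    """Adjust only the modifiable weights until every change is below threshold."""
    weights = dict(weights)
    for _ in range(max_iterations):
        steps = {}
        for name in modifiable:
            up = dict(weights)
            up[name] += eps
            down = dict(weights)
            down[name] -= eps
            slope = (loss(training_data, up) - loss(training_data, down)) / (2 * eps)
            steps[name] = -rate * slope
        for name, step in steps.items():
            weights[name] += step
        # Matching criterion: the largest weight change is below the threshold.
        if max(abs(step) for step in steps.values()) < threshold:
            break
    return weights


# Each training item: (qualifying paths for a node pair, established strength).
training_data = [
    ([("friend",)], 0.8),
    ([("colleague",)], 0.3),
    ([("friend", "colleague")], 0.24),
    ([("family",)], 1.0),
]
initial = {"friend": 0.5, "colleague": 0.5, "family": 1.0}
tuned = train(training_data, initial, modifiable=["friend", "colleague"])
```

In this sketch the weight for `family` is not listed as modifiable and is therefore left untouched, while the two modifiable weights move towards values that reproduce the established strengths.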
In a second aspect, this disclosure provides a method of selecting a manufacturing resource for manufacturing a product, wherein the method comprises a computer-implemented method of determining relationships between data elements according to the first aspect. The data elements relate to the manufacturing of products using a set of manufacturing resources, and the method comprises determining relationship strengths between first nodes representing the manufacturing resources in the set of manufacturing resources and a second node representing a product, wherein the resulting relationship strengths between the first nodes and the second node are ranked to order manufacturing resources for the product.
A third aspect of this disclosure includes a corresponding system to determine relationships between data elements comprising data obtained from multiple information sources. The system includes one or more connections that provide access to multiple data sources comprising the data elements, and computing apparatus comprising at least a memory and a processor. The processor is programmed to establish a knowledge graph having graph elements including a plurality of nodes each representing one of the data elements and a plurality of edges, wherein each edge joins two nodes in the plurality of nodes and represents a strength of relationship between said two nodes. The system also includes weights associated with graph elements, wherein some of said weights are modifiable weights and training data comprising established relationship strength values between pairs of nodes, wherein the training data is used to determine relationships between pairs of nodes. To determine relationships between pairs of nodes with an edge between them, a set of qualifying paths is established and a strength of relationship between the pair of nodes is determined by summing weights associated with each qualifying path in the set of qualifying paths to calculate the relationship strength value of the edge between the pair of nodes. The established relationship strength values are then compared to the calculated relationship strength values and one or more of the modifiable weights are modified so that the established relationship strength values match the calculated relationship strength values. The steps of calculating relationship strength values and modifying modifiable weights are repeated until the matching criterion is reached. Resulting relationship strengths are established between data elements in the data store in accordance with the relationship strength calculated between the nodes representing those data elements.
This system also includes one or more connections to data sources comprising at least one database in the memory of the system. The one or more connections to data sources comprise at least one connection via an API to an external system.
In a fourth aspect, the disclosure provides a system for selecting a manufacturing resource for manufacturing a product, wherein the system comprises a system for determining relationships between data elements according to the third aspect, wherein the data elements relate to manufacturing of products using a set of manufacturing resources. The system is programmed to determine relationship strengths between first nodes representing the manufacturing resources in the set of manufacturing resources and a second node representing a product. Furthermore, the resulting relationship strengths between the first nodes and the second node are ranked to order manufacturing resources for the product.
This is a challenging problem as knowledge graphs can be large, and the calculation of the relationship strength needs to be accurate and consistent throughout the graph. One context in which this problem is relevant is in the manufacturing of machine parts. In practice, it is difficult to determine how best to assign parts for printing using 3D printing devices, where several 3D printing devices can be available—for example, 3D printing devices in the network of an organisation, or in a farm of 3D printing devices. There may even be a choice as to whether the parts could be sourced by another manufacturing route. Currently, 3D printing farm technologies are unable to make an accurate decision about whether it is faster or more efficient to print parts in-house or outsource to an external supplier.
Specific embodiments of the disclosure will now be described, by way of example, with reference to the accompanying drawings of which:
The server (140) communicates with an API (150), exposing the data outputs to 3D printing devices (180) and environments in the cloud. In this embodiment, the API (150) is in communication with manufacturing environments (160), third-party environments (170) and 3D printing devices (180). The embodiment described here is of particular value in determining the best approach to providing a component—in particular, whether it is sourced through a manufacturer or some other supplier, or directly 3D printed internally. This is however only one of many possible applications for the approach described here, and other embodiments may relate to other decision-making processes.
In
Referring to
The training dataset, comprising nodes and established relationship strength values between the nodes, is also initialized at block (200). Once the knowledge graph (120) and training dataset have been defined, they are read into the system at blocks (210) and (220) from the data store (130). The weights associated with the nodes and edges are then initialized at block (230), and a subset of these weights is modified at block (240). For example, in one embodiment, only the weights of the edge types can be modified. The modifiable weights for each type of edge, node, and attribute in the knowledge graph (120) are parameters that can transform the input data during training.
In
During the training process (40), the set of relevant qualifying paths between nodes that satisfies a cut-off condition is retrieved at block (320). These qualifying paths can be computed on the fly or obtained from a cache if available: if the set of qualifying paths is readily available from a previous iteration at block (330), it is fetched at block (390). Otherwise, in one possible embodiment, the set of qualifying paths is computed at block (340) during the first iteration in which a given relationship strength value is processed. To compute the set of qualifying paths, the following sub-steps can be followed. First, the cut-off condition should be met. This is a beneficial trade-off between the computational time and the importance of the qualifying paths, and in this embodiment it takes the form of avoiding the generation of paths longer than a given threshold. In practice, paths beyond a given length tend to lose their importance. For example, in a social network, a short path such as the one-hop path “John is a friend of Mark” suggests a much stronger relationship than a multiple-hop path such as “John is a friend of a friend of a friend of Peter”. Once the cut-off condition has been met, the qualifying paths are further reduced by filtering at block (350) in order to remove information that is irrelevant to the application. This is performed by using one or several given patterns and filtering away pathways that match such a pattern. After the paths have been filtered, a cache of the final set of qualifying paths is generated for future use at block (360). It should be understood that parameters such as the cut-off threshold and the set of irrelevance patterns need to be specified as input.
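The sub-steps described above (cut-off, filtering, caching) can be sketched as follows. This is a minimal sketch under stated assumptions: the graph is represented as a dictionary of outgoing edges, a path is recorded as the tuple of edge types it traverses, and an irrelevance pattern is taken to be a sequence of edge types that must not appear contiguously in a path. None of these representations, nor the function names, are taken from the disclosure.

```python
def enumerate_paths(graph, source, target, max_hops):
    """Yield simple paths, as tuples of edge types, no longer than max_hops."""
    stack = [(source, (), {source})]
    while stack:
        node, types, visited = stack.pop()
        if node == target and types:
            yield types
            continue
        if len(types) == max_hops:
            continue  # cut-off condition: do not extend paths any further
        for neighbour, edge_type in graph.get(node, []):
            if neighbour not in visited:
                stack.append((neighbour, types + (edge_type,), visited | {neighbour}))


def matches(path, pattern):
    """True if the pattern occurs as a contiguous run of edge types in the path."""
    n = len(pattern)
    return any(path[i:i + n] == pattern for i in range(len(path) - n + 1))


_path_cache = {}


def qualifying_paths(graph, source, target, max_hops, irrelevance_patterns):
    """Compute the filtered path set once per node pair, then reuse the cache."""
    key = (source, target)
    if key not in _path_cache:
        candidates = enumerate_paths(graph, source, target, max_hops)
        _path_cache[key] = [
            path for path in candidates
            if not any(matches(path, pattern) for pattern in irrelevance_patterns)
        ]
    return _path_cache[key]


# Hypothetical graph: node -> list of (neighbour, edge_type).
graph = {
    "john": [("mark", "friend"), ("anna", "friend"), ("bob", "colleague")],
    "anna": [("mark", "friend"), ("zoe", "friend")],
    "bob": [("mark", "blocked")],
    "zoe": [("eve", "friend")],
    "eve": [("mark", "friend")],
}
patterns = [("blocked",)]
paths = qualifying_paths(graph, "john", "mark", 2, patterns)
```

The cache key in this sketch is the node pair only, which relies on the cut-off threshold and the irrelevance patterns being fixed at initialization; if they could change between calls, they would need to form part of the key.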
In
Referring to
However, in practice not all elements of the same type have the same weight. That is, a given element such as an edge can have two weights, one corresponding to its type, and one corresponding to the element at hand. For instance, the weight of the type is the weight of the friend relationship, and the weight of the element reflects that two friends who interact very often could have a stronger relationship than two friends who hardly interact. This disclosure focuses on tuning type-specific weights and allows the combination with element weights.
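A combination of a type-specific weight with an element weight can be sketched as follows. The disclosure does not state how the two weights are combined, so this sketch assumes a simple multiplication, with an element weight of 1.0 when none is recorded. The dictionary layout and the values are hypothetical.

```python
def effective_weight(edge, type_weights):
    """Type weight scaled by the element weight (assumed to default to 1.0)."""
    return type_weights[edge["type"]] * edge.get("element_weight", 1.0)


type_weights = {"friend": 0.8}  # tuned per edge type during training
close_friends = {"type": "friend", "element_weight": 1.5}  # interact often
distant_friends = {"type": "friend", "element_weight": 0.5}  # hardly interact
plain_friends = {"type": "friend"}  # no element weight recorded

close = effective_weight(close_friends, type_weights)
distant = effective_weight(distant_friends, type_weights)
plain = effective_weight(plain_friends, type_weights)
```

With this choice, tuning the type weight rescales every edge of that type while preserving the relative ordering given by the element weights.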
Lastly, in
This embodiment could be used for a number of different purposes. For instance, this approach is useful in manufacturing, particularly in the supplier and 3D printing space where manufacturing resources need to be selected for the manufacturing of products. This process includes automatically assigning parts from highly ranked suppliers to print products using 3D printing devices where there can be several 3D printing devices available. This is advantageous as it can make accurate decisions about whether it is faster or more efficient to print parts in-house or outsource to an external supplier.
Once the training process is complete, the manufacturing selection process and selection API are updated at blocks (700) and (710) respectively by ranking the options with the tuned knowledge graph. Pairs of nodes are ranked based on their calculated relationship strength values, wherein the highest ranked pair can automatically identify which supplier to choose that will result in the best outcome based on criteria such as cost, quality, reliability, or a plurality of other factors. Lastly, the data store (130) is updated with the newly ranked pairs of nodes at block (720) and the process ends at block (800).
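The ranking step can be sketched as follows, assuming that the relationship strength between each manufacturing resource node and the product node has already been calculated with the tuned knowledge graph. The resource names and strength values are invented for this sketch.

```python
def rank_resources(strengths):
    """Return resource names ordered from strongest to weakest relationship."""
    return sorted(strengths, key=strengths.get, reverse=True)


# Hypothetical calculated strengths between each resource and one product.
strengths = {
    "in_house_printer_a": 0.72,
    "external_supplier_x": 0.91,
    "in_house_printer_b": 0.35,
}
ranking = rank_resources(strengths)
best_option = ranking[0]
```

In this sketch the highest ranked entry would be offered as the preferred manufacturing route for the product.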
The skilled person will appreciate that further embodiments may be provided within the spirit and scope of the presently disclosed subject matter as claimed—these may, for example, relate to decision-making processes other than manufacture of components.
Number | Date | Country | Kind |
---|---|---|---|
202211004941 | Jan 2022 | IN | national |
This application is a national phase filing under 35 U.S.C. § 371 of and claims priority to PCT Patent Application No. PCT/EP2022/057704, filed on Mar. 23, 2022, which claims the priority benefit under 35 U.S.C. § 119 of Indian Patent Application No. 202211004941, filed on Jan. 29, 2022, the contents of which are hereby incorporated in their entireties by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2022/057704 | 3/23/2022 | WO | |