The present invention generally relates to systems and computer-implemented methods for predicting and classifying event sequences embedded in a knowledge graph.
Knowledge graphs (KGs) are widely known in the prior art and are utilized, for instance, in connection with managing knowledge-intensive processes. In this context, for example, a process instance may be defined as a sequence of events, and each event may be characterized by a number of attributes, where the attribute values are references to nodes of the knowledge graph. Given such a process instance as the input, embodiments of the present invention address the problem of accurately and efficiently predicting some property of the process instance, including but not limited to properties of future events. The links into the knowledge graph provide valuable information about the events, which the invented method and system take into account systematically.
The computation of graph embeddings is a state-of-the-art approach to determine a fixed-dimensional vector representation of each entity and relation type, which can then be used as input to further machine learning tasks. For example, Q. Wang et al.: “CoKE: Contextualized knowledge graph embedding”, arXiv preprint arXiv:1911.02168, proposes a method to compute context-dependent embeddings, where the context of a node is a sub-graph around it. Applying this method to the problem underlying the present invention, one could compute an embedding for each node and relation of the union of the knowledge graph and all events. The event embeddings could then serve as the input for the prediction task, where the context of an event is the process instance it occurs in.
However, computing a graph embedding is a computationally intensive process requiring substantial time, processing power, and energy. In application scenarios in which new events and new process instances arrive frequently over time, it is not an option to re-compute the embedding for each newly arrived event. To make things worse, re-computation of the embedding also triggers the need to re-train the prediction model which uses the embeddings as input, increasing computation time even more.
C. Esteban et al.: “Predicting the co-evolution of event and knowledge graphs”, in 2016 19th International Conference on Information Fusion (FUSION), pp. 98-105, IEEE, addresses the problem of predicting events with regard to the knowledge graph itself, where events represent nodes and relations becoming valid or obsolete over time. In contrast to this approach, embodiments of the present invention address the problem of events that are parts of processes appearing outside the context of the knowledge graph evolution.
Furthermore, state-of-the-art solutions are not able to perform real-time updates when new events occur, and they do not take into account the order of events in a sequence. Thus, the remaining problem is the real-time prediction and classification using sequences of events embedded in knowledge graphs, including updates when new events occur.
In an embodiment, the present disclosure provides a computer-implemented method for event sequence forecasting of a process instance, the method comprising: building up and training a three-layered prediction model including a first, a second and a third layer, wherein the first layer is a graph embedding layer that assigns a fixed-dimensional graph embedding vector to each node and relation type in a fused event and knowledge graph that contains available structural information including events, knowledge graph nodes, and links between the events and the knowledge graph nodes, wherein the second layer is an event embedding layer that assigns to each event of the process instance a fixed-dimensional event embedding vector, and wherein the third layer is a prediction layer that receives as input a sequence of event embeddings from the second layer and that generates as output a prediction of an unknown property of the event sequence used as input.
Subject matter of the present disclosure will be described in even greater detail below based on the exemplary figures. All features described and/or illustrated herein can be used alone or combined in different combinations. The features and advantages of various embodiments will become apparent by reading the following detailed description with reference to the attached drawings, which illustrate the following:
In accordance with an embodiment, the present invention improves and further develops a method and a system of the initially described type for predicting and classifying event sequences embedded in a knowledge graph in such a way that real-time predictions are possible, even for knowledge-intensive processes, with high accuracy and in a computationally efficient way.
In accordance with another embodiment, the present invention provides a computer-implemented method for event sequence forecasting of a process instance, the method comprising building up and training a three-layered prediction model including a first, a second and a third layer, wherein the first layer is a graph embedding layer that assigns a fixed-dimensional graph embedding vector to each node and relation type in a fused event and knowledge graph that contains available structural information including events, knowledge graph nodes, and the links between them, wherein the second layer is an event embedding layer that assigns to each event of the process instance a fixed-dimensional event embedding vector, and wherein the third layer is a prediction layer that receives as input a sequence of event embeddings from the second layer and that generates as output a prediction of an unknown property of the event sequence used as input.
Furthermore, in accordance with another embodiment, the present invention provides a system for event sequence forecasting of a process instance, the system comprising one or more processors configured to build up and train a three-layered prediction model including a first, a second and a third layer, wherein the first layer is a graph embedding layer that assigns a fixed-dimensional graph embedding vector to each node and relation type in a fused event and knowledge graph that contains available structural information including events, knowledge graph nodes, and the links between them, wherein the second layer is an event embedding layer that assigns to each event of the process instance a fixed-dimensional event embedding vector, and wherein the third layer is a prediction layer that receives as input a sequence of event embeddings from the second layer and that generates as output a prediction of an unknown property of the event sequence used as input.
According to another embodiment, the present invention provides methods and systems that use a fusion of Knowledge Graphs and Event Sequences as input data to effectively solve state representation and outcome prediction (regression or classification). Embodiments of the invention enable the efficient application of AI algorithms on the fused data, allowing relevant background knowledge to be incorporated in event sequence predictions or classifications. This is relevant for a wide range of applications such as administrative case handling, cyber-physical systems with control loops, business processes, etc.
According to embodiments, the present invention relates to a method and system for predicting and classifying event sequences embedded in a knowledge graph. More specifically, embodiments of the invention provide methods and systems that provide enhanced outcome and attribute prediction for event sequences by using graph-structured knowledge in a three-layered process which includes a knowledge graph embedding layer, an event embedding layer and a prediction layer. The knowledge graph embedding layer may be trained in an unsupervised manner, while the event embedding layer and prediction layer may be trained end-to-end in a supervised manner. The trained model may be used to dynamically update predictions when new events arrive. More specifically, the event embedding layer may be used to compute an embedding of an event, and the prediction layer may be used to update the prediction about the event sequence.
As already mentioned above, according to an embodiment the event embedding layer may be trained end-to-end in a supervised manner. In this regard, an event history containing previous event sequences with known outcomes may be used as training samples for training the event embedding layer.
According to embodiments, the event embedding layer may be configured to determine the event embedding vectors by using a parametrized function that maps the graph neighborhood of any event in the fused event and knowledge graph to a fixed-dimensional vector. In this context, it may be provided that the graph neighborhood of an event comprises the 2-hop neighborhood of the event. As will be appreciated by those skilled in the art, alternative definitions of the graph neighborhood of an event may be employed likewise (e.g., a 3-hop neighborhood).
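By way of illustration only, the graph neighborhood of an event could be collected with a breadth-first search bounded by the number of hops. The following is a minimal sketch, not a definitive implementation: the function name `k_hop_neighborhood`, the adjacency-dictionary representation, and all node names are assumptions made for this example.

```python
from collections import deque


def k_hop_neighborhood(adjacency, start, k=2):
    """Return all nodes reachable from `start` in at most `k` hops (excluding `start`)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond the k-th hop
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    seen.discard(start)
    return seen


# Hypothetical fused graph: one event linked to two knowledge graph nodes.
adjacency = {
    "event_1": ["drug_A", "patient_7"],
    "drug_A": ["event_1", "substance_X"],
    "patient_7": ["event_1", "ward_3"],
    "substance_X": ["drug_A", "supplier_9"],
    "ward_3": ["patient_7"],
    "supplier_9": ["substance_X"],
}
two_hop = k_hop_neighborhood(adjacency, "event_1", k=2)
```

Passing `k=3` would additionally include `supplier_9`, which corresponds to the alternative 3-hop definition mentioned above.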
According to embodiments of the invention, upon arrival of new events, the trained prediction model may be used to dynamically update predictions. In this regard, it may be provided that updating predictions includes using the event embedding layer to compute an embedding of the event from the embeddings of the related knowledge graph nodes. Additionally, it may be provided that the resulting embedding is fed into the prediction layer to update the predictions about the event sequence.
According to embodiments of the invention, the prediction layer may be configured to generate the prediction output by using a transformer-based sequence model. However, at this point it should be noted that the invention is not restricted to any particular model architectures for the three layers specified. Many graph embedding mechanisms (for the graph embedding layer), neighbor aggregation functions (for computing the event embeddings), and sequence models have been proposed in literature, and those skilled in the art will be able to identify and implement suitable embodiments of these layers.
According to embodiments of the invention, it may be provided that the third layer, or the second and the third layer, or all three layers is/are re-trained from time to time. In particular, such re-training may be executed as soon as an amount of new events that has arrived exceeds a critical mass defined by a configurable threshold. The amount of critical mass of new events may be configurable, or dynamically adjusted if e.g. the predictive accuracy drops below a certain threshold.
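The re-training policy described above could, for example, be sketched as a small counter object. This is only an illustrative sketch: the class name `RetrainTrigger`, its default values, and the rule of halving the threshold when accuracy drops are assumptions chosen for this example and are not prescribed by the description.

```python
from dataclasses import dataclass


@dataclass
class RetrainTrigger:
    event_threshold: int = 1000   # configurable critical mass (hypothetical default)
    min_accuracy: float = 0.8     # accuracy floor (hypothetical default)
    shrink_factor: float = 0.5    # how strongly to lower the threshold (assumption)
    new_events: int = 0

    def record_event(self) -> bool:
        """Count a newly arrived event; return True once the critical mass is exceeded."""
        self.new_events += 1
        return self.new_events > self.event_threshold

    def report_accuracy(self, accuracy: float) -> None:
        """Lower the critical mass when the predictive accuracy drops below the floor."""
        if accuracy < self.min_accuracy:
            self.event_threshold = max(1, int(self.event_threshold * self.shrink_factor))

    def reset(self) -> None:
        """Call after a re-training run has completed."""
        self.new_events = 0


trigger = RetrainTrigger(event_threshold=3)
decisions = [trigger.record_event() for _ in range(4)]
```

Which layers are re-trained once the trigger fires (only the third, the second and third, or all three) remains a separate design choice.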
According to embodiments, the obtained predictions may be used for automated decision making in a control loop. Specifically, the control loop may be configured to receive as input a desired outcome (i.e. setpoint) and to use the trained prediction model to predict the outcome for several possible actions, leading to several expected outcomes. The control loop may include a control element that selects those actions that lead to an expected outcome having the smallest difference to the setpoint.
According to an embodiment, the present invention relates to a tangible, non-transitory computer-readable medium having instructions thereon which, upon being executed by one or more processors, alone or in combination, provide for execution of a method according to one or more of the aspects described above.
The teaching of the present invention can be designed and further developed in several advantageous ways. To this end, reference is made to the dependent claims on the one hand and to the following explanation of preferred embodiments of the invention by way of example, illustrated by the figure, on the other hand. In connection with the explanation of the preferred embodiments of the invention with the aid of the figure, generally preferred embodiments and further developments of the teaching will be explained.
The control system 100 includes a control element 120 that is configured to select those actions that lead to the expected outcome having the smallest difference to the setpoint. Those actions will act on a plant (the real world) 130, which might also be influenced by other events (which cannot be controlled) and/or noise, as shown at 140.
According to an embodiment of the invention, the output of the plant 130, i.e. a changed state, is then taken as the input to a state representation update module 150. This module 150 is configured to update the representation of the real world.
Further, the control system 100 includes a prediction model 160 to predict the outcome based on the state representation. The model 160 is trained in accordance with embodiments of the present invention by using context knowledge 170 about the application environment as well as an event history 172 collected by previous instantiations and iterations of the control loop. The model 160 is configured to predict the outcome for several possible actions, leading to several expected outcomes 180i. This then closes the control loop, because, again, the control system 100 will take the actions leading to an expected outcome 180i close to the desired outcome 110. The system 100 will stop, as soon as the desired outcome 110 is reached.
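The selection step performed by the control element 120 can be sketched as a minimization over the candidate actions. The sketch below assumes a scalar outcome and a callable `predict_outcome` standing in for the trained prediction model 160. Both, as well as the action names and numeric values, are hypothetical.

```python
def select_action(candidate_actions, predict_outcome, setpoint):
    """Pick the action whose expected outcome has the smallest difference to the setpoint."""
    return min(candidate_actions, key=lambda action: abs(predict_outcome(action) - setpoint))


# Stand-in for the prediction model: a fixed lookup of expected outcomes per action.
expected_outcomes = {"action_a": 0.9, "action_b": 0.4, "action_c": 0.55}
chosen = select_action(expected_outcomes, expected_outcomes.get, setpoint=0.5)
```

For non-scalar outcomes, the absolute difference would have to be replaced by a distance measure suited to the application.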
As already mentioned above,
As shown in
Representing the State as a Sequence of Events with links into a Knowledge Graph has the following advantages: Using a Sequence model allows for taking into account the order of events, the time information and the progress over time. Fusing the sequence of events with a knowledge graph makes it possible to capture the attributes of each event, but also complex relationships as well as high-order features (e.g. motifs, i.e. statistically significant subgraph structures) captured by the graph structure.
According to embodiments of the invention, the machine learning prediction model 160 may be specifically designed to take all available structural information into account, including the knowledge graph 220, the stream/sequence of events 210, and the links between those parts, which form the unified graph 200. Furthermore, the machine learning prediction model 160 may be designed to enable real-time predictions in face of new events that are again inter-linked with the knowledge graph 220, without having to run the training procedure again even though the structure of the unified graph 200 may have changed.
According to embodiments of the invention, the machine learning model is built up from three layers with a specific training methodology in order to achieve both of the above design goals. More specifically, the prediction model 160 may be composed of the following three layers:
The Graph Embedding Layer may be configured to assign a fixed-dimensional graph embedding vector to each node and relation type in the fused Event and Knowledge Graph 200. As explained above, the fused graph 200 contains all available structural information, including events, knowledge graph nodes, and the links between them. This ensures that the full potential of the available information is exploited. The output of the first layer, i.e. the graph embedding vectors, may serve as the input to the next, i.e. the second layer.
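To make the structure of the fused graph more concrete, the following sketch shows one possible way of merging knowledge graph triples with events whose attribute values reference knowledge graph nodes. The triple representation, the function name `build_fused_graph`, and all example identifiers are assumptions for illustration. The description does not prescribe a particular data format.

```python
def build_fused_graph(kg_triples, events):
    """Merge KG triples (head, relation, tail) with event-to-node links.

    Each event is a dict with an 'id' and an 'attributes' mapping from
    attribute name to the knowledge graph node it references.
    """
    triples = list(kg_triples)
    for event in events:
        for attribute, node in event["attributes"].items():
            # The attribute name acts as the relation type of the link.
            triples.append((event["id"], attribute, node))
    nodes = {head for head, _, _ in triples} | {tail for _, _, tail in triples}
    nodes |= {event["id"] for event in events}  # events are nodes of the fused graph
    relations = {relation for _, relation, _ in triples}
    return triples, nodes, relations


kg_triples = [("drug_A", "contains", "substance_X")]
events = [{"id": "event_1", "attributes": {"prescribed": "drug_A", "patient": "patient_7"}}]
triples, nodes, relations = build_fused_graph(kg_triples, events)
```

An unsupervised graph embedding algorithm of choice could then be trained on `triples`, yielding one vector per element of `nodes` and `relations`.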
The Event Embedding Layer may be configured to receive as input the graph embedding vectors from the first layer, and to assign to each event a fixed-dimensional event embedding vector. This vector may be determined by a parameterized function mapping the graph neighborhood (e.g. 2-hop neighborhood) of any event in the fused event and knowledge graph 200 to a fixed-dimensional vector.
In this context, it is important to note that events are also nodes of the fused graph 200, so already in the first layer they have received an embedding vector. However, in the second layer they receive an embedding vector that only depends on their neighborhood in the knowledge graph.
The Prediction Layer may be configured to receive as input a sequence of event embeddings from the second layer, and to provide as output a prediction of an unknown property of the event sequence used as input, including but not limited to expected outcomes of actions. For example, if the outcome is discrete (e.g. a court decision ‘charge decision because enough evidence gathered’, or ‘charge decision because not enough evidence gathered’, or ‘charge decision for other reasons’; or a patient outcome ‘person healthy’, ‘person non-healthy’), the prediction layer will function as a classifier. On the other hand, if the outcome is continuous (e.g. blood pressure of a person), the sequence model will be a regression model.
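The distinction between the classifier and the regression case can be illustrated by the final step applied to the raw output of the sequence model. The sketch below assumes that the sequence model emits one score per class for discrete outcomes and a single value for continuous outcomes. The function names and the numeric values are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities (numerically stabilized)."""
    highest = max(scores)
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def predict(head_output, labels=None):
    """Return (label, probability) for discrete outcomes, or a scalar for continuous ones."""
    if labels is not None:
        probabilities = softmax(head_output)
        best = max(range(len(labels)), key=probabilities.__getitem__)
        return labels[best], probabilities[best]
    (value,) = head_output  # regression: exactly one output value is expected
    return value


label, probability = predict([2.0, 0.5], labels=["person healthy", "person non-healthy"])
blood_pressure = predict([120.0])
```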
According to an embodiment of the invention, the model training may be performed as follows:
The first layer of the prediction model, i.e. the graph embedding layer, may be trained using an unsupervised graph embedding training algorithm on the unified events and knowledge graph 200. The goal of this training step is to obtain a meaningful representation of the graph nodes and relation types. Taking not only the knowledge graph 220 as input for the embedding training, but taking the events as well has the advantageous effect of discovering additional relations that were not present in the pure knowledge graph 220 (for example, a pair of drugs that are often prescribed together are likely to be related to each other).
The second and the third layers of the prediction model, i.e. the event embedding layer and the prediction layer, may be trained together in an end-to-end fashion using supervised learning. It is assumed that the event history contains previous event sequences with known outcomes, which can be used as training or validation examples.
According to embodiments of the invention, it may be provided that, as new events arrive, the event embedding layer is used to compute an embedding of the new events, and those embeddings are used as input to the prediction layer, updating the predictions. Usage of the event embedding layer has the advantage of not having to re-train as new events arrive.
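The update path for a newly arrived event can be sketched as follows. The class `OnlinePredictor` and its two injected callables are assumptions for illustration: `embed_event` stands in for the trained event embedding layer and `predict_sequence` for the trained prediction layer. The simple averaging functions used here are placeholders, not the trained layers themselves.

```python
class OnlinePredictor:
    """Keeps the event sequence of one process instance and updates the prediction."""

    def __init__(self, node_embeddings, embed_event, predict_sequence):
        self.node_embeddings = node_embeddings    # frozen output of the first layer
        self.embed_event = embed_event            # stand-in for the second layer
        self.predict_sequence = predict_sequence  # stand-in for the third layer
        self.sequence = []

    def on_new_event(self, linked_nodes):
        # Only embeddings of already known knowledge graph nodes are looked up,
        # so no graph embedding has to be re-computed for the new event.
        neighbor_vectors = [self.node_embeddings[node] for node in linked_nodes]
        self.sequence.append(self.embed_event(neighbor_vectors))
        return self.predict_sequence(self.sequence)


def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def mean_first_component(sequence):
    return sum(vector[0] for vector in sequence) / len(sequence)


predictor = OnlinePredictor(
    {"node_a": [1.0, 0.0], "node_b": [0.0, 1.0]}, mean_vector, mean_first_component
)
first = predictor.on_new_event(["node_a", "node_b"])
second = predictor.on_new_event(["node_a"])
```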
According to an embodiment, the present invention provides a computer-implemented method for event sequence forecasting comprising the steps of
Over time, new event sequences will be recorded by the system, and new training labels will become available. Therefore, it will be beneficial to re-train the three layers from time to time. To save computational resources, it is possible to re-train only the third layer, or only the second and third layer.
It should be noted that the present invention does not impose any restrictions on the model architectures for the three layers specified. As such, any graph embedding mechanism (for the graph embedding layer), neighbor aggregation functions (for computing the event embeddings), and sequence models as described in literature may be employed.
In an embodiment of the event embedding layer, for each event, the layer may be provided with, as an input, a set of vectors (e.g. the embeddings of entities and relationships in the event's 2-hop neighborhood) $V = \{v_1, v_2, \ldots, v_l\}$, with $v_i \in \mathbb{R}^{n \times 1}$, $i \in 1, \ldots, l$. Here, $l$ describes the maximum number of 2-hop neighbors any entity has, and the remaining vectors $v_l, \ldots, v_{l-k} \in \mathbb{R}^{n \times 1}$ are filled with padding if an event has $l-k$ 2-hop neighbors. $n$ describes the fixed number of elements in the embedding vectors. As an output, an aggregation function may give a vector $v_{agg} \in \mathbb{R}^{n \times 1}$. The machine learning model may then be initialized to return the mean of the input vectors, $v_{agg,0} = \mathrm{mean}(v_1, v_2, \ldots, v_l)$, but will be trained to update its parameters in an end-to-end training process as described above.
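One simple parametrized aggregation function that starts out as the mean is a weighted sum whose weights are initialized uniformly. The following sketch is illustrative only: the class `WeightedAggregator`, the helper `pad_neighbors`, and in particular the use of zero vectors as padding are assumptions, since the padding value is not specified here.

```python
def pad_neighbors(vectors, max_neighbors, dim):
    """Pad the neighbor list to a fixed length (zero padding is an assumption)."""
    padding = [[0.0] * dim for _ in range(max_neighbors - len(vectors))]
    return list(vectors) + padding


class WeightedAggregator:
    """Weighted sum over l input vectors; uniform initial weights reproduce the mean."""

    def __init__(self, max_neighbors):
        # Trainable parameters; end-to-end training would update these values.
        self.weights = [1.0 / max_neighbors] * max_neighbors

    def __call__(self, vectors):
        dim = len(vectors[0])
        return [
            sum(weight * vector[j] for weight, vector in zip(self.weights, vectors))
            for j in range(dim)
        ]


neighbors = [[2.0, 4.0], [4.0, 8.0]]           # an event with two 2-hop neighbors
padded = pad_neighbors(neighbors, max_neighbors=4, dim=2)
v_agg_0 = WeightedAggregator(max_neighbors=4)(padded)
```

With uniform weights the output equals the mean over all `l` padded inputs. A trained model may assign different weights to different neighbor positions.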
Embodiments of the present invention provide the advantage that, by integrating knowledge graphs into the prediction of event sequences, prediction accuracy is significantly improved. Furthermore, embodiments of the present invention are computationally more efficient than prior art solutions since there is no need to (re)compute node embeddings as new events are observed by the system (in fact, only occasional retraining is advisable).
According to embodiments of the invention, the prediction model 160 including the context knowledge 170 and the event history 172, as well as the updated state representation module 150 including the initial state 152 shown in
Embodiments of the present invention can be implemented in the context of various application scenarios. Generally, any time-dependent process or control problem could be targeted that can be expressed as a sequence of events and for which knowledge can be graph-modeled, for instance processes with event data, including payroll management, case management, or day care management. Hereinafter, a number of exemplary use cases are outlined in some more detail:
This embodiment describes a process in a public safety context for risk prediction and supervision of (previous) offenders with the goal of avoiding re-offending. In order to do this, for each offender a so-called risk score may be computed, describing the risk of re-offending under certain given information. The goal of this embodiment is to create an optimal supervision process for each person who has previously offended. Actions can, for example, be more intense supervision (with counselors), but also closer monitoring (e.g. automated electronic monitoring with GPS monitoring, radiofrequency monitoring, e.g., as disclosed at https://offender-management.com/) or adaptation of probation regulations.
In this case, the knowledge graph describes the background information on the offenders (e.g. crime types, locations, weapons, educational background, relations to other offenders), and the events in the process are specific noticeable events happening for the offender recorded in the police database (e.g. the offender being witness to another crime or fight, the offender being at a location known for drug dealing). The goal is to predict (and dynamically adjust) the risk score in order to ensure a timely reaction to increasing (or decreasing) risk of the offender re-offending. In this case, real-time online updates may be necessary to react to the offender's changed circumstances by adjustment of supervision. Based on the outcome predicted in real time, i.e. the risk score, the related actions can be adapted. An example: The method in accordance with embodiments of the present invention predicts a significantly higher risk score due to an event related to the offender. In reaction to this, an external system for electronic monitoring could be triggered to start monitoring the location and to restrict the offender from specific geographical areas or even to enforce curfew hours (given that this is within probation regulations decided by the court).
In this embodiment the process is a process in a public safety context, from call-room (emergency calls) up until court decisions. The knowledge graph describes the background information (e.g. people involved, place of crime, weapons, evidence information), and the events in the process are the steps, e.g. emergency call, talk to the dispatcher, officers visiting the place of crime, interviews with suspects, etc. In this case, real-time online updates may be necessary in order to react to new events (not only after costly re-training of the representation). The method in accordance with embodiments of the present invention can be used to predict the outcome of the case, i.e. the court outcome (charge, or no charge decision). Based on the outcome predicted in real time, the case-related actions can be adapted. An example: The method in accordance with embodiments of the present invention predicts a negative court outcome because not enough evidence is presented. In reaction to this, an external system for automated information gathering may be triggered through a suitable automatically sent message to the external system, to enhance evidence. The system for automated information gathering may automatically set up calls and connect the investigators with potential witnesses or suspects, and may automatically scan databases from related available video surveillance systems for related useable information (e.g., pictures showing the suspect close to the place of crime). The additional evidence thus collected may be used to potentially change the court outcome.
This embodiment is about the treatment of a patient in the hospital. The patient with their symptoms, background information, and health-related attributes (e.g. age, weight, . . . ) is represented in the knowledge graph. The treatment steps for the patient are the events in the process. The goal is to predict the outcome, i.e. will the patient be cured or not with the given attributes and (planned) treatment steps. By formulating the problem in this way, it will be possible to simulate potential future treatment steps and analyze the different predicted outcomes. When treatment steps are actually happening in real-life the embedding and representation will be modified in real-time, in order to timely react to changes. The outcome of this simulation will be used to adapt the treatment steps: The outcomes of the simulation can be evaluated and applied by a human, e.g., by adjusting the medical devices (e.g. ventilator or the amount of morphine). Alternatively, these devices can be directly adjusted without a human in the loop to directly modify a machine to give more or less medicine—or to modify the dispensed daily amount and/or mix medication for patients on an individual level (within pre-defined bounds or according to bounds derived from a knowledge graph). For instance, the outcomes of the simulation may be used for controlling a medication dispensing robot (as described in https://www.pc-control.net/pdf/special_packaging_2014/solutions/pcc_special_packaging_2014_eco-dex_d.pdf).
The sensor network of a smart city provides a rich set of information including satisfaction of the citizen, crimes, waste, traffic issues, control of water and electricity. In this context, methods and systems described herein are able to provide recommendations to optimize and redirect the traffic flow in respect of pollution, scene of crime, traffic jams. Hence, in accordance with an embodiment of the invention it may be provided that all of this information is gathered through the sensor network as a Knowledge Graph and that the future development/trend for a city district is computed and predicted. Events in this case are for example traffic jams, road accidents, road blocks, high traffic volume, described over time by the event sequence. A timely reaction is needed in order to assure safety of all people at all time-steps (e.g., timely reaction to traffic accidents). The knowledge graph is needed in order to include information on relations and structures given in the city (e.g., road networks). The system predicts the outcome, e.g. “high fine dust pollution expected on a certain day”. Based on the predictions an intelligent traffic routing system can adjust the traffic for that day, e.g. conduct automated road blocks and adjust traffic lights.
Many modifications and other embodiments of the invention set forth herein will come to mind to the one skilled in the art to which the invention pertains having the benefit of the teachings presented in the foregoing description and the associated drawings. Therefore, it is to be understood that the invention is not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.
While subject matter of the present disclosure has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive. Any statement made herein characterizing the invention is also to be considered illustrative or exemplary and not restrictive as the invention is defined by the claims. It will be understood that changes and modifications may be made, by those of ordinary skill in the art, within the scope of the following claims, which may include any combination of features from different embodiments described above.
The terms used in the claims should be construed to have the broadest reasonable interpretation consistent with the foregoing description. For example, the use of the article “a” or “the” in introducing an element should not be interpreted as being exclusive of a plurality of elements. Likewise, the recitation of “or” should be interpreted as being inclusive, such that the recitation of “A or B” is not exclusive of “A and B,” unless it is clear from the context or the foregoing description that only one of A and B is intended. Further, the recitation of “at least one of A, B and C” should be interpreted as one or more of a group of elements consisting of A, B and C, and should not be interpreted as requiring at least one of each of the listed elements A, B and C, regardless of whether A, B and C are related as categories or otherwise. Moreover, the recitation of “A, B and/or C” or “at least one of A, B or C” should be interpreted as including any singular entity from the listed elements, e.g., A, any subset from the listed elements, e.g., A and B, or the entire list of elements A, B and C.
Number | Date | Country | Kind |
---|---|---|---|
21177403.9 | Jun 2021 | EP | regional |
This application is a U.S. National Phase application under 35 U.S.C. § 371 of International Application No. PCT/EP2021/066554, filed on Jun. 18, 2021, and claims benefit to European Patent Application No. EP 21177403.9, filed on Jun. 2, 2021. The International Application was published in English on Dec. 8, 2022 as WO 2022/253449 A1 under PCT Article 21(2).
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2021/066554 | 6/18/2021 | WO |