Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Process flows are widely encountered in many fields of knowledge. For example, organizations may perform processes in order to deliver services or products to customers.
It may be desirable to create a model of such process flows, for example in the context of computer processing. However, establishing such process models can be time-consuming and error-prone. This can be attributed to reliance upon domain experts for knowledge of the details of a specific process, where those experts may themselves be unfamiliar with model creation.
Embodiments afford recommendations for the accurate modeling of complex process flows. A repository is provided of known models (in graph form) of complex processes. Semantics expressed by the repository models are constrained to an existing vocabulary (e.g., one that does not include a particular term). During an initial training phase, a fine-tuned sequence-to-sequence language model is generated from a pre-trained language model (e.g., T5) and the semantics of the known repository process models, using transfer-learning techniques (e.g., from Natural Language Processing (NLP)). During runtime, an incomplete process model (also in graph form) is received having an unlabeled node. Embodiments afford a recommendation of a label for that node, based upon the fine-tuned sequence-to-sequence language model. The resulting recommended node label is expressed in a vocabulary that extends beyond the limited vocabulary of the repository (e.g., includes the particular term). In this manner, the accuracy and/or flexibility of modeling a complex process flow (e.g., affording a node label recommendation) can be enhanced.
The following detailed description and accompanying drawings provide a better understanding of the nature and advantages of various embodiments.
Described herein are methods and apparatuses that implement process modeling. In the following description, for purposes of explanation, numerous examples and specific details are set forth in order to provide a thorough understanding of embodiments according to the present invention. It will be evident, however, to one skilled in the art that embodiments as defined by the claims may include some or all of the features in these examples alone or in combination with other features described below, and may further include modifications and equivalents of the features and concepts described herein.
The storage layer includes a process model repository 108. This repository comprises a number of different known process models 110 in the form of graphs 112 (e.g., including nodes and edges). One specific example of a graph is a directed attributed graph.
Each of the known process models further includes a respective vocabulary 114 comprising terms 116. Examples of such terms can be labels for specific nodes of the existing process model.
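By way of illustration only, the following Python sketch shows one possible in-memory representation of such a repository process model as a directed attributed graph, together with its vocabulary of terms. The class names, field names, and node types used here are hypothetical and are not part of the embodiments.

```python
# Illustrative sketch only: one possible representation of a repository
# process model as a directed attributed graph. Names are hypothetical.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Node:
    node_id: str
    node_type: str          # e.g., "StartEvent", "Task", "ExclusiveGateway"
    label: str = ""         # activity label; empty for an unlabeled node

@dataclass
class ProcessModel:
    nodes: Dict[str, Node] = field(default_factory=dict)
    edges: List[Tuple[str, str]] = field(default_factory=list)  # (source_id, target_id)

    def predecessors(self, node_id: str) -> List[str]:
        # all nodes with an outgoing edge into node_id
        return [src for (src, tgt) in self.edges if tgt == node_id]

def vocabulary(model: ProcessModel) -> set:
    # the model's vocabulary is simply the set of terms used in its node labels
    terms = set()
    for node in model.nodes.values():
        terms.update(node.label.lower().split())
    return terms
```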
During a training phase 118, the modeling engine seeks to create an enriched, fine-tuned language model 120 by relying on transfer-learning techniques. That fine-tuned model includes a vocabulary 140 whose scope extends beyond the existing vocabulary of the models in the repository.
Accordingly, the modeling engine creates the fine-tuned language model by extracting 124 sequences from models of the repository. Next, the modeling engine verbalizes 126 the extracted sequences.
The storage layer also includes a pre-trained sequence-to-sequence language model 128 (one possible example of which is T5). The pre-trained language model includes a vocabulary 130 comprising term(s) 132 not present in any of the models of the repository.
Accordingly, the modeling engine references 133 the pre-trained language model and the verbalized extracted sequences to generate 134 the fine-tuned language model 120. That fine-tuned model is stored 135 in the storage layer. The fine-tuned language model comprises a vocabulary 140 that includes terms 142 outside of the vocabularies of the repository models.
Next, during a runtime 144, the modeling engine receives 145, from a user 146, an incomplete process model 148. That incomplete process model is also in graph form, comprising nodes and edges.
However, the incoming process model received from the user is incomplete. That is, the incomplete process model features at least one node 150 that is not labeled.
In order to recommend a label for the node(s), the modeling engine extracts 152 sequences from the incomplete model. Then, the modeling engine verbalizes 154 those extracted sequences to create input sequences 156.
Then, the input sequences are processed 158 according to the fine-tuned language model. This may comprise activity recommendation processing 159. The resulting output sequences 160 are ranked and stored 162 in the database.
Next, in order to afford 164 an output to the user, the output sequences are retrieved 166. These output sequences include at least one recommended node label 168 that includes a term 170 that is not present in existing vocabularies of the known repository models.
Process modeling performed according to embodiments may offer one or more benefits. One possible benefit is increased accuracy.
In particular, process model recommendation relying solely upon a repository may find limited applicability in situations where a process model under development includes activities not present in the models of the repository. That is, approaches relying only upon the model repository can recommend just those activity labels (or, at best, combinations of label parts) which already exist in the repository. This can limit the scope of recommended labels and reduce the usefulness of recommendations for modeling new processes.
At 204, an incomplete process model is received. At 206, a sequence is extracted from the incomplete process model.
At 208, the sequence is verbalized to create an input sequence. At 210, the input sequence is processed by the fine-tuned language model to generate an output sequence.
At 212, the output sequence is stored. At 214, the output sequence is afforded as a label recommendation for an unlabeled node of the incomplete process model.
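A minimal Python sketch of this runtime flow (steps 204-214) is provided below. The helper callables for extraction, verbalization, model invocation, and storage are assumed to be supplied by the surrounding system and are named hypothetically.

```python
# Hypothetical sketch of the runtime flow of steps 204-214: receive an
# incomplete model, extract and verbalize sequences, run the fine-tuned
# model, store the outputs, and afford a label recommendation.
from typing import Callable, List

def recommend_label(
    incomplete_model,                                    # received at 204
    extract_sequences: Callable[[object], List[list]],   # 206
    verbalize: Callable[[list], str],                     # 208
    run_fine_tuned_model: Callable[[str], List[str]],     # 210
    store: Callable[[List[str]], None],                   # 212
) -> List[str]:
    node_sequences = extract_sequences(incomplete_model)
    input_sequences = [verbalize(seq) for seq in node_sequences]
    output_sequences: List[str] = []
    for text in input_sequences:
        output_sequences.extend(run_fine_tuned_model(text))
    store(output_sequences)
    return output_sequences                               # afforded to the user at 214
```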
Further details regarding process modeling according to various embodiments are now provided in connection with the following example. In this particular example, process modeling is implemented based upon Business Process Model and Notation (BPMN).
The BPMN model in
Following this decision point, the model merges the two branches using an XOR-join gateway. After this gateway, a new activity has been inserted.
The activity-recommendation task is to suggest one or more suitable labels for that new activity. As shown in
This recommended label is appropriate here, because the preceding nodes indicate that the outcome of handling the claim has now been determined. Following this outcome, it is natural and appropriate to seek to inform the insurance claimant.
Modeling of this insurance claim process may be complicated by one or more factors. For example, in cross-departmental settings (such as is shown in
Another factor complicating accurate formation of a process model is that an individual possessing specialized knowledge of the process domain (e.g., insurance claim handling) may be unfamiliar with even the general outline of how to create a model of such a process.
Accordingly, this example offers an approach for activity recommendation which uses a transformer-based, sequence-to-sequence language model (e.g., T5). This transformer-based, sequence-to-sequence language model extends recommendation capabilities to models and activities above and beyond those specifically available in training data (e.g., an existing repository of known process models).
Sequence-to-sequence models may call for ordered, textual sequences as input, whereas process model nodes can be only partially ordered. Thus, a first phase may be to lift activity recommendation to the format of sequence-to-sequence tasks.
Sequence-to-sequence tasks are concerned with finding a model that maps a sequence of inputs (x1, . . . , xT) to a sequence of outputs (y1, . . . , yT′). The output length T′ is unknown a priori, and may differ from the input length T.
One example of a sequence-to-sequence problem in NLP, is machine translation. There, the input sequence is given by text in a source language. The output sequence is the translated text in a target language.
In the context of activity recommendation, the output sequence corresponds to the activity label λ(n̂) to be recommended for node n̂. This may comprise one or more words, e.g., “notify about outcome” as shown in
Defining the input sequence can be more complex. This is because the input to an activity-recommendation task comprises an incomplete process model M1, whose nodes may be partially ordered rather than forming a single sequence.
Thus, embodiments convert a single activity-recommendation task into one or more sequence-to-sequence tasks. To accomplish this, embodiments first extract multiple node sequences from M1 that each end in n̂.
Formally, we write that Sl^n̂=(n1, . . . , nl) is a node sequence of length l ending in node n̂ (i.e., nl=n̂), for which it must hold that ni is a predecessor of ni+1 for all i=1, . . . , l−1.
Then, since an input sequence should comprise text (rather than model nodes), embodiments apply verbalization to the node sequence. This verbalization strings together the types and (cleaned) labels of the nodes in Sl^n̂, i.e., τ(n1) λ(n1) . . . τ(nl−1) λ(nl−1) τ(n̂).
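For illustration only, one possible implementation of this sequence extraction and verbalization is sketched below in Python. The data representation (node identifiers mapped to (type, label) pairs, plus an edge list) and the function names are assumptions made for this sketch, not a required implementation.

```python
# Illustrative sketch: enumerate all length-l node sequences ending in the
# unlabeled node n_hat, then verbalize each sequence as a text string.
from typing import Dict, List, Tuple

def extract_sequences(nodes: Dict[str, Tuple[str, str]],   # id -> (type, label)
                      edges: List[Tuple[str, str]],         # (source, target)
                      n_hat: str, l: int) -> List[List[str]]:
    preds = {nid: [s for (s, t) in edges if t == nid] for nid in nodes}
    def walk(seq: List[str]) -> List[List[str]]:
        if len(seq) == l:
            return [seq]
        out = []
        for p in preds[seq[0]]:
            out.extend(walk([p] + seq))   # extend the sequence backwards
        return out
    return walk([n_hat])

def verbalize(nodes: Dict[str, Tuple[str, str]], seq: List[str]) -> str:
    # tau(n1) lambda(n1) ... tau(n_{l-1}) lambda(n_{l-1}) tau(n_hat)
    parts = []
    for nid in seq[:-1]:
        node_type, label = nodes[nid]
        parts.extend([node_type, label])
    parts.append(nodes[seq[-1]][0])       # type only for the final (unlabeled) node
    return " ".join(p for p in parts if p)  # drop empty labels (e.g., gateways)
```

Applied to an unlabeled node n̂ with l=4, this sketch yields one verbalized input sequence per predecessor path of length four ending in n̂.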
For example, using sequences of length four (4), we obtain the following two (2) verbalized input sequences for the recommendation problem in
This sequence extraction and verbalization is used to fine-tune a transformer-based sequence-to-sequence model (e.g., T5) for activity recommendation. Specifically, having lifted activity recommendation to the format of sequence-to-sequence tasks, embodiments may then fine-tune a transformer-based sequence-to-sequence model for activity recommendation based on process knowledge encoded in a process model repository.
In one example, we fine-tune a transformer-based sequence-to-sequence model (such as T5) for activity recommendation. This is done by extracting a large number of sequence-to-sequence tasks from the models in an available process model repository. Specifically, for each model M in the repository, we extract all possible sequences of a certain length l that end in an activity node, i.e., (n1, . . . , nl).
Afterwards, we apply verbalization to this node sequence to obtain the textual input sequence, as described above. The corresponding output sequence is the label of nl, i.e., λ(nl).
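Continuing the sketch above (and reusing its hypothetical extract_sequences and verbalize helpers), the following Python fragment illustrates how (input, output) fine-tuning pairs might be assembled from a repository. The dictionary layout of each model and the "Task" node type are assumptions for this example.

```python
# Hypothetical sketch of building (input, output) fine-tuning pairs from a
# repository. For each labeled activity node of each model, every length-l
# sequence ending in that node yields one training example whose target is
# the node's label.
from typing import Dict, List, Tuple

def build_training_pairs(repository: List[Dict], l: int = 4) -> List[Tuple[str, str]]:
    pairs = []
    for model in repository:
        nodes: Dict[str, Tuple[str, str]] = model["nodes"]   # id -> (type, label)
        edges: List[Tuple[str, str]] = model["edges"]
        for node_id, (node_type, label) in nodes.items():
            if node_type != "Task" or not label:             # activity nodes only
                continue
            for seq in extract_sequences(nodes, edges, node_id, l):
                pairs.append((verbalize(nodes, seq), label))
    return pairs
```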
As one possible example, consider the exemplary training process model (stored within a repository) that is depicted in
Setting l=4, the training process model comprises nine (9) sequences of length four (4) that end in an activity node. Following verbalization, these nine sequences result in the following textual (input, output) sequences:
These can be used to fine-tune a transformer-based sequence-to-sequence language model (e.g., T5) for use in activity recommendation.
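One possible fine-tuning sketch, using the Hugging Face transformers library with a small T5 checkpoint, is shown below. The checkpoint name, hyperparameters, and training loop are illustrative assumptions rather than values prescribed by the embodiments.

```python
# Minimal fine-tuning sketch: train a T5 checkpoint on (input, output)
# pairs such as those produced by build_training_pairs above.
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, T5ForConditionalGeneration

def fine_tune(pairs, model_name="t5-small", epochs=3, lr=3e-4, batch_size=8):
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    loader = DataLoader(pairs, batch_size=batch_size, shuffle=True)

    model.train()
    for _ in range(epochs):
        for inputs, targets in loader:
            enc = tokenizer(list(inputs), padding=True, truncation=True,
                            return_tensors="pt")
            labels = tokenizer(list(targets), padding=True, truncation=True,
                               return_tensors="pt").input_ids
            labels[labels == tokenizer.pad_token_id] = -100   # ignore padding in loss
            loss = model(**enc, labels=labels).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return tokenizer, model
```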
Once fine-tuned, the language model may be used to solve instances of the activity-recommendation problem. We first solve multiple sequence-to-sequence tasks, whose results are then aggregated in order to return one or more label recommendations.
Label recommendations may be generated as follows. Given an incomplete process model M1 with an unlabeled activity node n̂, for which we want to provide label recommendations, we first extract all sequences of length l that end in n̂.
We then verbalize these sequences and feed the resulting input sequences as sequence-to-sequence tasks into the fine-tuned sequence-to-sequence model.
Looking again at the insurance claim process example of
We solve the individual sequence-to-sequence tasks, by feeding each input sequence into the fine-tuned sequence-to-sequence model. This generates ten (10) alternative output sequences (i.e., 10 possible labels) per input.
To do this, we use beam search as a decoding method, with beam width w=10. The beam search procedure uses conditional probabilities to track the w most likely output sequences at each generation step.
The beam search procedure can lead to output sequences that repeat words or even short sequences, i.e., n-grams. Accordingly, following activity labeling convention, we favor the suggestion of short labels that do not contain any recurring terms. For example, rather than suggesting labels such as:
In order to achieve this, we apply n-gram penalties during beam search. Specifically, we penalize the repetition of n-grams of any size (including single words) by setting the probability of next words that are already included in the output sequence to zero. The tables shown in
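For illustration, the following Python sketch uses the transformers generate() API to perform the beam-search decoding described above (beam width w=10, ten returned sequences per input) together with an n-gram penalty. Setting no_repeat_ngram_size=1 zeroes the probability of any token already present in the output, which is a token-level approximation of the word-level penalty described above; the conversion of beam scores to probabilities at the end is likewise illustrative.

```python
# Illustrative decoding sketch: beam search with w = 10 beams, ten returned
# sequences per input, and a repetition penalty on previously emitted tokens.
import torch

def propose_labels(tokenizer, model, input_sequence, w=10):
    enc = tokenizer(input_sequence, return_tensors="pt")
    model.eval()
    with torch.no_grad():
        out = model.generate(
            **enc,
            num_beams=w,
            num_return_sequences=w,
            no_repeat_ngram_size=1,     # forbid repeating any already-used token
            max_new_tokens=16,
            output_scores=True,
            return_dict_in_generate=True,
        )
    labels = tokenizer.batch_decode(out.sequences, skip_special_tokens=True)
    # sequences_scores are length-normalized log-probabilities of each beam
    probs = out.sequences_scores.exp().tolist()
    return list(zip(labels, probs))
```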
Finally, we aggregate the different lists of output sequences obtained by using beam search to solve individual sequence-to-sequence tasks. We end up with a single list of ranked recommended activity labels.
The contents of the lists may be aggregated using a maximum strategy. This maximum strategy may be employed by rule-based methods to rank proposed entities according to the different confidence values of the rules that suggested them.
To apply the maximum strategy, we establish an aggregated recommendation list, sorted according to the maximal probability score that a recommended label received. For example, the “notify about outcome” label receives an aggregated score of 0.64 (its score from the output sequences generated for I1), even though that label also appears in I2's list with a lower score of 0.42.
If two recommendations have the same maximum probability, we may sort them based on their second-highest probability, if available. Analogously, if two recommendations share both the maximum and the second-highest probability, we continue down the lists until we find a probability that differs.
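A minimal Python sketch of this maximum-strategy aggregation and tie-breaking is given below. The data layout (one ranked (label, probability) list per input sequence) is an assumption for this example, and the scores shown simply reuse the 0.64 and 0.42 values from the running example.

```python
# Illustrative maximum-strategy aggregation: rank labels by their highest
# score, then by their second-highest score, and so on.
from collections import defaultdict
from typing import Dict, List, Tuple

def aggregate(per_input_lists: List[List[Tuple[str, float]]], top_k: int = 10):
    scores: Dict[str, List[float]] = defaultdict(list)
    for ranked_list in per_input_lists:
        for label, prob in ranked_list:
            scores[label].append(prob)
    # Sorting by the descending list of scores compares the maximum first,
    # then the second-highest, and so on, as the tie-break requires.
    ranked = sorted(scores.items(),
                    key=lambda item: sorted(item[1], reverse=True),
                    reverse=True)
    return [(label, max(probs)) for label, probs in ranked][:top_k]

# Example: "notify about outcome" scores 0.64 for I1 and 0.42 for I2, so it
# is ranked according to its maximum, 0.64.
example = [[("notify about outcome", 0.64)], [("notify about outcome", 0.42)]]
print(aggregate(example))   # [('notify about outcome', 0.64)]
```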
In the end, this example embodiment thus provides a list of ten (10) ranked label recommendations for the unlabeled node n̂ that are the most probable candidates. These recommendations are arrived at according to:
The final list obtained for the running example, is shown in the table of
It is noted that the instant example relates to process models that are in the Business Process Model and Notation (BPMN) format. However, process modeling is not limited to this or any other specific process model notation format.
For example, embodiments could be applied to a process model in the Petri Nets graph notation, or in Unified Modeling Language (UML). Or, embodiments could be applied to any repository storing process models in abstracted form as directed attributed graphs.
And while the instant example describes beam search, this is not required. Other forms of decoding methods for language generation with transformer-based models, including random search involving sampling, could be employed.
Moreover, while the instant example describes aggregation using a maximum strategy, this is also not required. Other embodiments could employ other aggregation strategies, including a noisy OR approach.
Returning now to
Rather, alternative embodiments could leverage the processing power of an in-memory database engine (e.g., the in-memory database engine of the HANA in-memory database available from SAP SE), in order to perform one or more various functions as described above.
Thus
In view of the above-described implementations of subject matter this application discloses the following list of examples, wherein one feature of an example in isolation or more than one feature of said example taken in combination and, optionally, in combination with one or more features of one or more further examples are further examples also falling within the disclosure of this application:
Example 1. Computer implemented systems and methods comprising:
Example 2. The computer implemented systems or method of Example 1 further comprising:
Example 3. The computer implemented systems or methods of Examples 1 or 2 wherein the first graph comprises a directed attributed graph.
Example 4. The computer implemented systems or methods of Examples 1, 2, or 3 wherein the first graph is in the Business Process Model and Notation (BPMN) format.
Example 5. The computer implemented systems or methods of Examples 1, 2, 3, or 4 further comprising:
Example 6. The computer implemented systems or methods of Example 5 wherein the ranking comprises aggregating.
Example 7. The computer implemented systems or methods of Examples 5 or 6 wherein the ranking comprises a maximum strategy.
Example 8. The computer implemented systems or methods of Examples 1, 2, 3, 4, 5, 6, or 7 wherein the processing further comprises beam search.
Example 9. The computer implemented systems or methods of Example 8 wherein the processing further comprises calculating an n-gram penalty.
An example computer system 700 is illustrated in
Computer system 710 may be coupled via bus 705 to a display 712, such as a Light Emitting Diode (LED) or liquid crystal display (LCD), for displaying information to a computer user. An input device 711 such as a keyboard and/or mouse is coupled to bus 705 for communicating information and command selections from the user to processor 701. The combination of these components allows the user to communicate with the system. In some systems, bus 705 may be divided into multiple specialized buses.
Computer system 710 also includes a network interface 704 coupled with bus 705. Network interface 704 may provide two-way data communication between computer system 710 and the local network 720. The network interface 704 may be a digital subscriber line (DSL) or a modem to provide data communication connection over a telephone line, for example. Another example of the network interface is a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links are another example. In any such implementation, network interface 704 sends and receives electrical, electromagnetic, or optical signals that carry digital data streams representing various types of information.
Computer system 710 can send and receive information, including messages or other interface actions, through the network interface 704 across a local network 720, an Intranet, or the Internet 730. For a local network, computer system 710 may communicate with a plurality of other computer machines, such as server 715. Accordingly, computer system 710 and server computer systems represented by server 715 may form a cloud computing network, which may be programmed with processes described herein. In the Internet example, software components or services may reside on multiple different computer systems 710 or servers 731-735 across the network. The processes described above may be implemented on one or more servers, for example. A server 731 may transmit actions or messages from one component, through Internet 730, local network 720, and network interface 704 to a component on computer system 710. The software components and processes described above may be implemented on any computer system and send and/or receive information across a network, for example.
The above description illustrates various embodiments of the present invention along with examples of how aspects of the present invention may be implemented. The above examples and embodiments should not be deemed to be the only embodiments, and are presented to illustrate the flexibility and advantages of the present invention as defined by the following claims. Based on the above disclosure and the following claims, other arrangements, embodiments, implementations and equivalents will be evident to those skilled in the art and may be employed without departing from the spirit and scope of the invention as defined by the claims.