AUTOMATED DECISION MODELLING FROM TEXT

Information

  • Patent Application
  • Publication Number
    20230297784
  • Date Filed
    March 17, 2022
  • Date Published
    September 21, 2023
Abstract
The present inventive concept provides for a method for automated decision modelling from text including obtaining a text corpus including a policy. Terms and syntax are identified within the text corpus related to the policy. Sentence similarities and co-references based on the terms and syntax are identified. Discourse and sentence level semantic parsing is performed based on the terms and the sentence similarities and the co-references using machine learning. A decision model template is generated based on the discourse and semantic parsing, and the decision model template is transformed into an automated decision model.
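As an illustration of the end product, the following is a minimal Python sketch of what an automated decision model could look like once a template has been filled in. It is not taken from the application: the `Rule` and `DecisionTable` classes, the first-match evaluation order, and the example eligibility policy are all hypothetical, chosen only to show rules with Boolean conditions being evaluated against input facts.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Rule:
    """One row of a decision table: Boolean conditions plus an outcome."""
    # Maps an input name to a predicate over that input's value (hypothetical layout).
    conditions: Dict[str, Callable[[Any], bool]]
    outcome: str


@dataclass
class DecisionTable:
    """A named list of rules, evaluated top to bottom (assumed first-match order)."""
    name: str
    rules: List[Rule]

    def decide(self, facts: Dict[str, Any]) -> Optional[str]:
        """Return the outcome of the first rule whose conditions all hold."""
        for rule in self.rules:
            if all(
                key in facts and test(facts[key])
                for key, test in rule.conditions.items()
            ):
                return rule.outcome
        return None  # no rule matched


# Hypothetical policy sentence used only for this example:
# "Applicants aged 18 or over with an income of at least 30000 are eligible."
eligibility = DecisionTable(
    name="Eligibility",
    rules=[
        Rule({"age": lambda a: a >= 18, "income": lambda i: i >= 30000}, "eligible"),
        Rule({}, "not eligible"),  # catch-all: a rule with no conditions always matches
    ],
)

if __name__ == "__main__":
    print(eligibility.decide({"age": 25, "income": 40000}))
    print(eligibility.decide({"age": 16, "income": 40000}))
```

In this sketch, the policy text would supply the conditions and outcomes, and the resulting table could then be evaluated automatically against new cases.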
Claims
  • 1. A method for automated decision modelling from text, the method comprising: obtaining a text corpus including a policy; identifying terms and syntax within the text corpus related to the policy; identifying sentence similarities and co-references based on the terms and the syntax; performing discourse and sentence level semantic parsing based on the terms and the sentence similarities and the co-references using machine learning; generating a decision model template based on the discourse and sentence level semantic parsing; and transforming the decision model template into an automated decision model.
  • 2. The method of claim 1, wherein the terms and syntax within the text corpus are identified using natural language processing (NLP), and wherein the identified terms include terms that are related to a decision and criteria therefor.
  • 3. The method of claim 2, wherein the NLP includes named-entity recognition (NER) and a parsing tree.
  • 4. The method of claim 3, wherein the identifying the terms includes the use of a knowledge graph.
  • 5. The method of claim 1, wherein the text corpus includes a decision, and wherein the similar sentences are rules in the decision.
  • 6. The method of claim 5, wherein the identifying sentence similarities and co-references is based upon predetermined thresholds of matching terms, synonyms, semantics, syntax, and ontology.
  • 7. The method of claim 6, wherein anaphora resolution (AR) is used to identify synonyms to identified terms and pronouns, wherein sentence fragments of the sentence similarities and co-references are given text embedding vectors using pretrained language models, wherein a bag of noun-phrases and verb-phrases is extracted from the sentence fragments using an abstract meaning representation (AMR) parser, and wherein similarity between any pair of sentence fragments is calculated as an aggregated similarity of the text embedding vectors and the bags of noun-phrases and verb-phrases using similarity between sentence embeddings.
  • 8. The method of claim 1, further comprising: identifying categories for text spans of the policy related to a decision in the text corpus; and generating a decision model template outlining decision logic and identified categories from the text corpus.
  • 9. The method of claim 8, further comprising: parsing text spans based upon rhetoric structure theory (RST); and using a cluster algorithm to group text spans based on embeddings into the identified categories.
  • 10. The method of claim 9, wherein the decision model includes a decision requirement diagram (DRD) depicting the identified categories and decision logic in the text corpus.
  • 11. The method of claim 10, wherein the DRD includes a hierarchy of decision boxes based upon the categories and decision logic, wherein each decision box includes a decision table with rules expressed in Boolean, wherein the Boolean is extracted from text spans using curated labelled data and AMR parsers with provided syntactical dependency rules for extracting logical expressions, and wherein the rules expressed inside the decision tables are in the standardized S-FEEL language.
  • 12. A computer program product for automated decision modelling from text, the computer program comprising: one or more computer-readable storage media and program instructions stored on the one or more computer-readable storage media, the program instructions including a method for the automated decision modelling from text, the method comprising: obtaining a text corpus including a policy; identifying terms and syntax within the text corpus related to the policy; identifying sentence similarities and co-references based on the terms and the syntax; performing discourse and sentence level semantic parsing based on the terms and the sentence similarities and the co-references using machine learning; generating a decision model template based on the discourse and sentence level semantic parsing; and transforming the decision model template into an automated decision model.
  • 13. The method of claim 12, wherein the terms and syntax within the text corpus are identified using natural language processing (NLP), and wherein the identified terms include terms that are related to a decision and criteria therefor.
  • 14. The method of claim 13, wherein the NLP includes named-entity recognition (NER) and a parsing tree.
  • 15. The method of claim 14, wherein the identifying the terms includes the use of a knowledge graph.
  • 16. The method of claim 12, wherein the text corpus includes a decision, and wherein the similar sentences are rules in the decision.
  • 17. The method of claim 16, wherein the identifying sentence similarities and co-references is based upon predetermined thresholds of matching terms, synonyms, semantics, syntax, and ontology.
  • 18. The method of claim 17, wherein anaphora resolution (AR) is used to identify synonyms to identified terms and pronouns, wherein sentence fragments of the sentence similarities and co-references are given text embedding vectors using pretrained language models, wherein a bag of noun-phrases and verb-phrases is extracted from the sentence fragments using an abstract meaning representation (AMR) parser, and wherein similarity between any pair of sentence fragments is calculated as an aggregated similarity of the text embedding vectors and the bags of noun-phrases and verb-phrases using similarity between sentence embeddings.
  • 19. The method of claim 12, further comprising: identifying categories for text spans of the policy related to a decision in the text corpus; and generating a decision model template outlining decision logic and identified categories from the text corpus.
  • 20. A computer system for automated decision modelling from text, the system comprising: one or more computer processors, one or more computer-readable storage media, and program instructions stored on the one or more of the computer-readable storage media for execution by at least one of the one or more processors, the program instructions including a method for the automated decision modelling from text, the method comprising: obtaining a text corpus including a policy; identifying terms and syntax within the text corpus related to the policy; identifying sentence similarities and co-references based on the terms and the syntax; performing discourse and sentence level semantic parsing based on the terms and the sentence similarities and the co-references using machine learning; generating a decision model template based on the discourse and sentence level semantic parsing; and transforming the decision model template into an automated decision model.
  • 21. The method of claim 20, wherein the terms and syntax within the text corpus are identified using natural language processing (NLP), and wherein the identified terms include terms that are related to a decision and criteria therefor.
  • 22. The method of claim 21, wherein the NLP includes named-entity recognition (NER) and a parsing tree.
  • 23. The method of claim 22, wherein the identifying the terms includes the use of a knowledge graph.
  • 24. The method of claim 20, wherein the text corpus includes a decision, and wherein the similar sentences are rules in the decision.
  • 25. The method of claim 24, wherein the identifying sentence similarities and co-references is based upon predetermined thresholds of matching terms, synonyms, semantics, syntax, and ontology.
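The aggregated similarity described in claims 7 and 18 can be sketched in Python as follows. The claims do not specify how the two signals are combined, so this sketch makes several assumptions: cosine similarity for the text embedding vectors, Jaccard overlap for the bags of noun-phrases and verb-phrases, a weighted average as the aggregation, and an arbitrary threshold of 0.7 standing in for the "predetermined thresholds" of claims 6, 17, and 25. The embeddings and phrase bags are taken as given inputs; producing them with a pretrained language model and an AMR parser is outside the scope of this example.

```python
import math
from typing import Sequence, Set


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two bags of phrases, treated here as sets (an assumption)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def aggregated_similarity(
    emb_a: Sequence[float],
    emb_b: Sequence[float],
    phrases_a: Set[str],
    phrases_b: Set[str],
    embedding_weight: float = 0.5,  # hypothetical weight, not from the source
) -> float:
    """Weighted average of embedding similarity and phrase-bag overlap."""
    return (
        embedding_weight * cosine_similarity(emb_a, emb_b)
        + (1.0 - embedding_weight) * jaccard(phrases_a, phrases_b)
    )


def are_similar(
    emb_a: Sequence[float],
    emb_b: Sequence[float],
    phrases_a: Set[str],
    phrases_b: Set[str],
    threshold: float = 0.7,  # hypothetical stand-in for a predetermined threshold
) -> bool:
    """Flag two sentence fragments as similar when the aggregate meets the threshold."""
    return aggregated_similarity(emb_a, emb_b, phrases_a, phrases_b) >= threshold
```

A weighted average is only one plausible aggregation; a product or a maximum of the two scores would fit the claim language equally well.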