Aspects of the present disclosure relate to decision processing using machine learning models. More specifically, aspects of the present disclosure relate to ambiguity resolution in decision processing using machine learning models.
Machine learning models are well adapted to processing large quantities of data to find patterns of interest. Some data processing tasks at which machine learning models excel include cybersecurity breach detection, fraud detection, and identification of investment trends. However, outlier or ambiguous data can confuse the machine learning model. Ambiguous data may prevent a machine learning model from arriving at a definitive decision. As a result, human intervention by an expert may be necessary to overcome the ambiguous data. However, the expert is typically not aware of the progress made by the machine learning model in evaluating the data. Moreover, feedback provided by the expert to the machine learning model may not provide adequate information for the ambiguous data to be resolved by the machine learning model.
Certain aspects provide a method including identifying an ambiguity during decision processing of first input data by a decision machine learning (ML) model. The method may also include conveying the ambiguity to an expert agent for evaluation. The method may furthermore include receiving, by a large language model (LLM), feedback regarding the ambiguity from the expert agent. The method may in addition include determining, by the LLM, that the feedback, received from the expert agent, resolves the ambiguity. The method may moreover include generating second input data by the LLM, the second input data having the first input data and the feedback determined to resolve the ambiguity. The method may also include processing the second input data by the decision ML model to generate a decision based on processing of the second input data. The method may furthermore include outputting, by the LLM, the decision received from the ML model. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
Certain aspects provide a processing system including a memory having computer-executable instructions and a processor configured to execute the computer-executable instructions and cause the processing system to: identify an ambiguity during decision processing of first input data by a decision machine learning (ML) model; convey the ambiguity to an expert agent for evaluation; receive, by a large language model (LLM), feedback regarding the ambiguity from the expert agent; determine, by the LLM, that the feedback, received from the expert agent, resolves the ambiguity; generate second input data by the LLM, the second input data having the first input data and the feedback determined to resolve the ambiguity; process the second input data by the decision ML model to generate a decision based on processing of the second input data; and output, by the LLM, the decision received from the ML model. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
Certain aspects provide a method for providing a risk assessment of customer activity. For example, a method for providing a risk assessment of customer activity may include receiving customer activity as first input data. The method may also include evaluating the first input data by a decision machine learning (ML) model. The method may furthermore include identifying, by the decision ML model, an ambiguity preventing the decision ML model from satisfying a confidence threshold condition. The method may in addition include transmitting information relating to the ambiguity to an expert agent. The method may moreover include providing, to the decision ML model, second input data determined to resolve the ambiguity from the expert agent. The method may also include applying, by the decision ML model, the second input data determined to resolve the ambiguity to the first input data to complete evaluation of the first input data. The method may furthermore include outputting the completed evaluation of the first input data as a risk assessment decision. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
Other aspects provide processing systems configured to perform the aforementioned methods as well as those described herein; non-transitory, computer-readable media comprising instructions that, when executed by one or more processors of a processing system, cause the processing system to perform the aforementioned methods as well as those described herein; a computer program product embodied on a computer-readable storage medium comprising code for performing the aforementioned methods as well as those further described herein; and a processing system comprising means for performing the aforementioned methods as well as those further described herein.
The following description and the related drawings set forth in detail certain illustrative features of one or more aspects.
The appended figures depict certain aspects and are therefore not to be considered limiting of the scope of this disclosure.
To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the drawings. It is contemplated that elements and features of one embodiment may be beneficially incorporated in other embodiments without further recitation.
Aspects of the present disclosure provide apparatuses, methods, processing systems, and computer-readable mediums for resolving ambiguities identified by a decision machine learning (ML) model.
Decision ML models can be used for processing data to provide a binary (e.g., Yes/No) risk assessment for activities ranging from cybersecurity to credit card fraud. However, some data that the decision ML model processes to arrive at a risk assessment may not neatly match the training scenarios, giving rise to an ambiguity. Such anomalies can prevent the decision ML model from generating an accurate decision. An ambiguity occurs when data is encountered that under some circumstances could signify a high risk, for example, but under other conditions could indicate a low-risk situation. Ambiguous data is data that results in an ambiguity.
When a decision ML model detects ambiguous data, human intervention from an expert in the field (also referred to herein as an “expert agent”) may be necessary to resolve the ambiguity. When an expert agent is called upon to resolve the ambiguity, the expert agent may receive the particular data causing the ambiguity from the decision ML model, but generally very little additional information is provided. The expert agent may not be aware of the processing already performed on the input data by the decision ML model up to the point where the ambiguity was encountered, nor of the preliminary assessment made by the decision ML model up to that point. Consequently, the expert agent is left having to reevaluate all the data of the case, not just the ambiguous data, in order to render an accurate analysis (also referred to herein as “feedback” or “expert insight”) of the ambiguous data. Moreover, the expert agent may not have properly or fully addressed the issues causing the data to be considered ambiguous by the decision ML model. Thus, the evaluation of the ambiguous data by the expert agent still may not allow the decision ML model to continue processing the data to arrive at a final decision.
Aspects of the present disclosure provide for a decision ML model that is trained to provide details of the preceding processing, preliminary decisions based on the preceding processing, the ambiguous data, and any other pertinent information to an expert agent.
In some aspects of the present disclosure, a large language model (LLM) orchestrator is provided as an intermediary between the decision ML model and the expert agent that arranges the information provided by the decision ML model in a format that enhances readability by the expert agent. Additionally, the LLM orchestrator may be trained to evaluate the analysis of the ambiguous data provided by the expert agent to determine whether the feedback fully and appropriately addresses the ambiguity so that the decision ML model can complete processing of the data and arrive at an accurate final decision. Feedback from the expert agent that does not address the ambiguity is returned to the expert agent by the LLM orchestrator for further evaluation. The evaluation performed by the LLM orchestrator, to determine whether the feedback is adequate for the decision ML model to resolve the ambiguity, requires complex analysis of the data and the feedback. Such an analysis would be impractical for an individual to perform manually within the time frame in which a decision would need to be rendered.
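By way of a non-limiting illustration, the adequacy check performed by the LLM orchestrator may be sketched as follows. The llm_client.generate() interface, the prompt wording, and the expert_queue.resubmit() call are hypothetical placeholders chosen only for illustration, not a required implementation of the LLM orchestrator.

```python
# Illustrative sketch only: llm_client.generate(), the prompt wording, and
# expert_queue.resubmit() are hypothetical placeholders, not a mandated design.

def feedback_resolves_ambiguity(llm_client, ambiguity_report: str, feedback: str) -> bool:
    """Ask the LLM to judge whether the expert feedback addresses the ambiguity."""
    prompt = (
        "An expert agent has reviewed the following ambiguity report:\n"
        f"{ambiguity_report}\n\n"
        f"Expert feedback:\n{feedback}\n\n"
        "Does the feedback fully and appropriately address the cause of the "
        "ambiguity so that the decision model can resume processing? "
        "Answer YES or NO."
    )
    answer = llm_client.generate(prompt)        # hypothetical text-completion call
    return answer.strip().upper().startswith("YES")


def review_feedback(llm_client, ambiguity_report: str, feedback: str, expert_queue):
    """Forward adequate feedback toward the decision ML model; otherwise resubmit."""
    if feedback_resolves_ambiguity(llm_client, ambiguity_report, feedback):
        return feedback                          # passed on to the decision ML model
    expert_queue.resubmit(ambiguity_report)      # returned to the expert agent for reevaluation
    return None
```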
Aspects of the present disclosure are not limited to risk determinations, but rather can be applied to any decision data processing where the data is processed to obtain a binary result.
As shown in
The decision ML model 104 may obtain data 110 from multiple sources, such as a datastore 112, the Internet 114, a user terminal 116 (e.g., a laptop computer, desktop computer, mobile device, or point-of-sale (POS) device), electronic locks, video feeds, and other data sources depending on the particular application of the decision ML model 104. For example, in embodiments where the decision ML model 104 is tasked with deciding whether an account activity is fraudulent, the data 110 may be financial data, such as customer credit card activity, bank account activity, POS activities, and other relevant financial activities.
The decision ML model 104 processes the data 110 to identify, for example, an activity that may be fraudulent. When a decision (for example, Yes the activity is fraudulent, or No the activity is not fraudulent) is reached by the decision ML model 104, the result is provided to the LLM orchestrator 108 (represented by data transmission 1). The LLM orchestrator 108 formats the result from the decision ML model 104 as a decision output 118.
In some circumstances, the data being processed may include data that cannot be easily characterized by the decision ML model 104; such data, identified by the ambiguity detector 106, is termed “ambiguous data.” Because the decision ML model 104 is unable to continue processing other data until the ambiguous data is characterized, the ambiguous data, along with additional supporting information as described below, is provided to the LLM orchestrator 108 (data transmission 1). The LLM orchestrator 108 is configured to prepare, as an ambiguity report 120, the ambiguous data and supporting information for review by an expert agent. The ambiguity report 120 is transmitted to a workstation 122 of the expert agent. A completed evaluation of the ambiguous data is transmitted, as feedback 124, from the workstation 122 to the LLM orchestrator 108. The LLM orchestrator 108 analyzes the feedback 124 and determines whether the feedback 124 resolves the ambiguity. Feedback 124 that resolves the ambiguity is provided to the decision ML model 104 (data transmission 2) so that the remaining data can be processed to arrive at a result. The decision ML model 104 can use the feedback 124 as labeled data for subsequent training, and the feedback 124 can be provided to other decision ML models as well. The result (including the data processed with the feedback) from the decision ML model 104 is provided to the LLM orchestrator 108 (data transmission 3) to be formatted for output as the decision output 118.
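By way of a non-limiting illustration, the sequence of data transmissions 1-3 described above may be sketched as follows. Every method shown (score, build_report, evaluate, validate_feedback, add_feedback, format_decision) is a hypothetical placeholder rather than a mandated interface of the decision ML model 104, the LLM orchestrator 108, or the workstation 122.

```python
# Non-limiting sketch of the data flow among the decision ML model, the LLM
# orchestrator, and the expert agent's workstation. All method names below are
# hypothetical placeholders chosen only for illustration.

def run_decision_pipeline(decision_model, orchestrator, expert_workstation, data):
    # Data transmission 1: result or detected ambiguity from the decision ML model.
    result, ambiguity = decision_model.score(data)
    while ambiguity is not None:
        report = orchestrator.build_report(ambiguity)       # ambiguity report 120
        feedback = expert_workstation.evaluate(report)      # feedback 124
        if not orchestrator.validate_feedback(ambiguity, feedback):
            continue                                        # report is resubmitted on the next pass
        decision_model.add_feedback(feedback)               # data transmission 2
        result, ambiguity = decision_model.score(data)      # data transmission 3 when complete
    return orchestrator.format_decision(result)             # decision output 118
```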
While the majority of data received by the decision ML model 202 can be readily characterized, the decision ML model 202 may encounter data (e.g., ambiguous data) that cannot be characterized by the decision ML model 202 as trained. In such a case, the decision ML model 202 may not be able to continue evaluating the remaining data to arrive at a final decision. The ambiguity identifier 204 identifies this ambiguous data and collects additional information useful for evaluating the ambiguous data. Additionally, the ambiguity identifier 204 generates a record of the previous processing performed by the decision ML model 202 up to the point when the ambiguous data is encountered. The ambiguous data, the additional information, and the record of previous processing performed on the input data are referred to herein as “ambiguity evaluation information”.
The ML system 200 also includes an LLM orchestrator 206. The LLM orchestrator 206 receives outputs from the decision ML model 202 and provides the decision ML model 202 with expert insight based on the ambiguity evaluation information. For example, the decision ML model 202 may provide an initial decision to the LLM orchestrator 206 upon completing an evaluation. When the decision ML model 202 encounters ambiguous data, ambiguity evaluation information (also referred to as “ambiguity information”) is provided to the LLM orchestrator 206. Also, once the decision ML model 202 has received expert insights and evaluated the ambiguous data using the expert insights, the decision ML model 202 sends a decision reassessment to the LLM orchestrator 206. The LLM orchestrator 206 is trained to receive the initial decision, the ambiguity evaluation information, and the decision reassessment as input prompts. The LLM orchestrator 206 responds to the input prompts by generating an output in an appropriate format. For example, an initial decision or decision reassessment may be output as a final decision 210 formatted as human-readable output. In other embodiments, the initial decision or decision reassessment may be output as a final decision 210 formatted as computer-readable output.
In some embodiments, the final decision 210 may be output to a fraud and risk services system. In cases where the decision ML model is evaluating consumer financial fraud, for example, the final decision 210 may be used by the fraud and risk services system to either allow or deny a financial transaction. In a case where the decision ML model is evaluating cybersecurity breaches, for example, the final decision 210 may be used by the fraud and risk services system to lock an account and/or notify a security specialist for further action. Aspects of the present disclosure are not limited to the examples described herein, nor are aspects of the present disclosure limited to fraud and risk determinations. Other applications of aspects of the present disclosure can be realized without deviating from the scope of the present disclosure.
The LLM orchestrator 206 may be trained to generate an ambiguity report from received ambiguity evaluation information. The ambiguity report may be formatted in a human-accessible form that is evaluated by an expert agent 212. The expert agent 212 evaluates the ambiguity report to generate expert feedback on the ambiguous data as an input prompt to the LLM orchestrator 206. The LLM orchestrator 206 evaluates the expert feedback to determine whether the expert feedback adequately addresses the ambiguous data. In some embodiments, the LLM orchestrator 206 may be prompted to compare the expert feedback against the ambiguity identified by the decision ML model 202. Expert feedback that addresses the ambiguous data is formatted as an expert insight output to the decision ML model 202. As described above, the expert insight is applied by the decision ML model 202 to evaluate the ambiguous data and complete processing of the data to arrive at a decision (e.g., a decision reassessment). In some embodiments, the expert insight is formatted by the LLM orchestrator 206 as labeled data 208a that may be provided as training data to other decision ML models to further refine those decision ML models.
At step 304, the method 300 processes the input data 208b. Specifically, the input data 208b is characterized by the decision ML model 202. For example, a decision ML model 202 trained to detect credit card fraud evaluates the input data 208b to determine if a current transaction is a potential fraud event or an authorized event.
At step 306, the method 300 determines whether an ambiguity is encountered. Step 306 of the method 300 may, in some embodiments, be performed by the decision ML model 202 simultaneously with step 304. Thus, during the processing of the input data 208b, if the decision ML model 202 identifies an ambiguity by operation of the ambiguity identifier 204, then the method 300 proceeds to step 310. However, if processing of the input data 208b is completed without encountering an ambiguity, then the method 300 proceeds to step 308. An ambiguity may be identified using reason codes created from Shapley values and counterfactual explanations. Counterfactual explanations and Shapley values represent “what if” scenarios that provide insight into the minimal changes to the model inputs that would be required to reverse a model decision (e.g., from funds being held to funds being released to the customer). In some embodiments, the method 300 may consider factors such as days since a transaction decline was observed for a particular merchant, processing volume for the merchant, and how long the merchant has been in business. In other embodiments, the method 300 may consider factors such as high payment-volume growth anomalies and the total amount of declined transactions in the last 30 days. Each factor may be assigned a different weighting.
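By way of a non-limiting illustration, such a weighted-factor ambiguity check may be sketched as follows. The factor names, weights, and thresholds are hypothetical values chosen only for illustration; in practice, the per-factor contributions may instead be derived from Shapley values or counterfactual explanations as noted above.

```python
# Illustrative sketch only: factor names, weights, and thresholds are hypothetical.
# In practice, per-factor contributions could come from Shapley values or
# counterfactual explanations produced for the decision ML model.

FACTOR_WEIGHTS = {
    "days_since_merchant_decline": 0.30,
    "merchant_processing_volume": 0.20,
    "merchant_age_days": 0.15,
    "payment_volume_growth_anomaly": 0.20,
    "declined_amount_last_30_days": 0.15,
}

def ambiguity_score(normalized_factors: dict[str, float]) -> float:
    """Combine normalized factor values (0.0-1.0) into a single weighted score."""
    return sum(FACTOR_WEIGHTS[name] * normalized_factors.get(name, 0.0)
               for name in FACTOR_WEIGHTS)

def is_ambiguous(normalized_factors: dict[str, float],
                 low: float = 0.4, high: float = 0.6) -> bool:
    """Flag scores that fall between the low- and high-risk thresholds as ambiguous."""
    return low <= ambiguity_score(normalized_factors) < high

# Example: a score in the indeterminate band triggers escalation to an expert agent.
example = {"days_since_merchant_decline": 0.5, "merchant_processing_volume": 0.4,
           "merchant_age_days": 0.6, "payment_volume_growth_anomaly": 0.5,
           "declined_amount_last_30_days": 0.3}
assert is_ambiguous(example)   # weighted score of 0.465 falls in the ambiguous band
```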
At step 308, the method 300 sends the results of the completed processing of the input data 208b to an LLM orchestrator (e.g., 206 in
As described above, the method 300 proceeds to step 310 when an ambiguity is identified by the ambiguity identifier 204 of the decision ML model 202. At step 310 the method 300 collects ambiguity information. Once the ambiguity information is collected, the method 300 proceeds to step 312.
At step 312 the method 300 sends the collected ambiguity information to the LLM orchestrator 206. The ambiguity information may include ambiguous data, additional information related to the ambiguous data, and a record of previous processing performed by the decision ML model 202.
Method 300 proceeds from either step 308 or step 312 to step 402 of method 400, which is performed by the LLM orchestrator 206.
Turning now to
At step 404, the method 400 generates an output decision (e.g., final decision 210 at
When the LLM orchestrator 206 determines that the input prompt is ambiguity information, the method 400 proceeds to step 406. At step 406, the method 400 prepares an ambiguity report (e.g., 120 at
At step 408, the method 400 sends the ambiguity report 120 to the expert agent 212. Once the expert agent 212 has completed evaluation of the ambiguity report 120, the expert agent 212 prepares feedback (e.g., 124 at
At step 410, the method 400 receives the feedback 124 from the expert agent 212.
At step 412, the method 400 determines whether the feedback 124 received from the expert agent 212 addresses the ambiguity sufficiently to allow the decision ML model 202 to resolve the ambiguity and arrive at a decision. Determining that the feedback 124 does not adequately address the ambiguity causes the method 400 to return to step 410, where the LLM orchestrator 206 resubmits the ambiguity report 120 to the expert agent 212 for reevaluation. Determining that the feedback 124 does adequately address the ambiguity causes the method 400 to proceed to step 414.
At step 414, the method 400 sends the feedback 124 as an input to the decision ML model 202. Returning to
In some embodiments, methods 300 and 400 may be performed by an apparatus, such as processing system 700 of
At step 502, the method 500 identifies an ambiguity during decision processing of first input data by the decision ML model 202. The ambiguity may be identified using reason codes and counterfactual explanations (e.g., identifying the minimal changes to the model inputs that would reverse a model decision, such as from funds being held to funds being released to the customer). In some embodiments, the method 500 may consider factors such as days since a transaction decline was observed for a particular merchant, processing volume for the merchant, and how long the merchant has been in business. In other embodiments, the method 500 may consider factors such as high payment-volume growth anomalies and the total amount of declined transactions in the last 30 days. Each factor may be assigned a different weighting.
At step 504, the method 500 conveys the ambiguity to an expert agent (e.g., 212 in
At step 506, the method 500 receives, by the LLM, feedback (e.g., 124 in
At step 508, the method 500 determines, by the LLM, that the feedback, received from the expert agent, resolves the ambiguity. In some embodiments, at step 508, the method 500 instructs the LLM to verify that the feedback provides data that resolves the ambiguity. In other embodiments, at step 508, the method 500 instructs the LLM to determine that the feedback fails to resolve the ambiguity and to notify the expert agent that additional insight is needed.
At step 510, the method 500 generates second input data (e.g., expert insight in
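For purposes of illustration only, one non-limiting way the second input data may be assembled is sketched below. The SecondInputData container and its field names are hypothetical and chosen solely for illustration; the second input data simply carries the first input data together with the feedback that the LLM determined to resolve the ambiguity.

```python
# Illustrative sketch only: the SecondInputData container and field names are
# hypothetical placeholders, not part of any required data format.

from dataclasses import dataclass

@dataclass
class SecondInputData:
    first_input: dict          # the original first input data, unchanged
    resolved_feedback: str     # expert feedback validated by the LLM
    ambiguity_id: str = ""     # optional reference to the ambiguity that was resolved

def build_second_input(first_input: dict, resolved_feedback: str,
                       ambiguity_id: str = "") -> SecondInputData:
    """Bundle the first input data with the validated expert feedback."""
    return SecondInputData(first_input=dict(first_input),
                           resolved_feedback=resolved_feedback,
                           ambiguity_id=ambiguity_id)
```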
At step 512, the method 500 processes the second input data by the decision ML model to generate a decision (e.g., decision reassessment) based on processing of the second input data.
At step 514, the method 500 outputs the decision (e.g., 210 in
In some embodiments, method 500 may be performed by an apparatus, such as processing system 700 of
At step 602, the method 600 receives customer activity as first input data.
At step 604, the method 600 evaluates the first input data (e.g., 208b in
At step 606, the method 600 identifies an ambiguity preventing the decision ML model from satisfying a confidence threshold condition. The ambiguity may be identified using reason codes and counterfactual explanations. In some embodiments, the method 600 may consider factors such as days since a transaction decline was observed for a particular merchant, processing volume for the merchant, and how long the merchant has been in business. In other embodiments, the method 600 may consider factors such as high payment-volume growth anomalies and the total amount of declined transactions in the last 30 days. Each factor may be assigned a different weighting.
At step 608, the method 600 transmits information relating to the ambiguity to an expert agent (e.g., 212 in
At step 610, the method 600 provides, from the LLM to the decision ML model, second input data determined to resolve the ambiguity from the expert agent. In some embodiments, the method 600, at step 610, receives, by the LLM, feedback (e.g., 124 in
At step 612, the method 600 applies the second input data determined to resolve the ambiguity to the first input data to complete evaluation of the first input data.
At step 614, the method 600 outputs the completed evaluation of the first input data as a risk assessment decision (e.g., final decision 210 in
In some embodiments, method 600 may be performed by an apparatus, such as processing system 700 of
Processing system 700 is generally an example of an electronic device configured to execute computer-executable instructions, such as those derived from compiled computer code, including without limitation personal computers, tablet computers, servers, smart phones, smart devices, wearable devices, augmented and/or virtual reality devices, and others.
In the depicted example, processing system 700 includes one or more processors 702, one or more input/output devices 704, one or more display devices 706, and one or more network interfaces 708 through which processing system 700 is connected to one or more networks (e.g., a local network, an intranet, the Internet, or any other group of processing systems communicatively connected to each other), and computer-readable medium 712.
In the depicted example, the aforementioned components are coupled by a bus 710, which may generally be configured for data and/or power exchange amongst the components. Bus 710 may be representative of multiple buses, while only one is depicted for simplicity.
Processor(s) 702 are generally configured to retrieve and execute instructions stored in one or more memories, including local memories like the computer-readable medium 712, as well as remote memories and data stores. Similarly, processor(s) 702 are configured to retrieve and store application data residing in local memories like the computer-readable medium 712, as well as remote memories and data stores. More generally, bus 710 is configured to transmit programming instructions and application data among the processor(s) 702, display device(s) 706, network interface(s) 708, and computer-readable medium 712. In certain embodiments, processor(s) 702 are included to be representative of one or more central processing units (CPUs), graphics processing units (GPUs), tensor processing units (TPUs), accelerators, and other processing devices.
Input/output device(s) 704 may include any device, mechanism, system, interactive display, and/or various other hardware components for communicating information between processing system 700 and a user of processing system 700. For example, input/output device(s) 704 may include input hardware, such as a keyboard, touch screen, button, microphone, and/or other device for receiving inputs from the user. Input/output device(s) 704 may further include display hardware, such as, for example, a monitor, a video card, and/or another device for sending and/or presenting visual data to the user. In certain embodiments, input/output device(s) 704 is or includes a graphical user interface.
Display device(s) 706 may generally include any sort of device configured to display data, information, graphics, user interface elements, and the like to a user. For example, display device(s) 706 may include internal and external displays such as an internal display of a tablet computer or an external display for a server computer or a projector. Display device(s) 706 may further include displays for devices, such as augmented, virtual, and/or extended reality devices.
Network interface(s) 708 provide processing system 700 with access to external networks and thereby to external processing systems. Network interface(s) 708 can generally be any device capable of transmitting and/or receiving data via a wired or wireless network connection. Accordingly, network interface(s) 708 can include a communication transceiver for sending and/or receiving any wired and/or wireless communication. For example, network interface(s) 708 may include an antenna, a modem, a LAN port, a Wi-Fi card, a WiMAX card, cellular communications hardware, near-field communication (NFC) hardware, satellite communication hardware, and/or any wired or wireless hardware for communicating with other networks and/or devices/systems. In certain embodiments, network interface(s) 708 includes hardware configured to operate in accordance with the Bluetooth® wireless communication protocol.
Computer-readable medium 712 may be a volatile memory, such as a random access memory (RAM), or a nonvolatile memory, such as nonvolatile random access memory, phase change random access memory, or the like. In this example, computer-readable medium 712 includes a decision ML model component 714 (e.g., decision ML model 202 in
In certain embodiments, the decision ML model component 714 is configured to receive input data, such as live data 208b in
The conveying component 718 is configured to transmit ambiguity-related information from the decision ML model component 714 to the LLM orchestrator component 716 as described above with respect to the method 300 shown in
The identifying component 720 is configured to identify ambiguities occurring during processing of the input data 208b. The identifying component 720 may be implemented as an ambiguity identifier (e.g., 204 in
The receiving component 722 is configured to receive, by the LLM orchestrator component 716, feedback 124 from the expert agent 212, as described above with respect to the method 400 shown in
The verifying component 724 is configured to verify that feedback 124 received from the expert agent 212 resolves the ambiguity, as in method 400 described above with respect to
The generating component 726 is configured to generate second input data (e.g., expert insight) by the LLM orchestrator component 716, as in method 400 described above with respect to
The processing component 728 is configured as a functionality of the decision ML model 202. The processing component 728 processes the second input data to generate a decision (e.g., decision reassessment) based on processing the second input data, as in method 300 described above with respect to
The outputting component 730 is configured to output a final decision 210 by the LLM orchestrator 206, as in method 400 described above with respect to
Note that
Implementation examples are described in the following numbered clauses:
The preceding description is provided to enable any person skilled in the art to practice the various embodiments described herein. The examples discussed herein are not limiting of the scope, applicability, or embodiments set forth in the claims. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments. For example, changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as appropriate. For instance, the methods described may be performed in an order different from that described, and various steps may be added, omitted, or combined. Also, features described with respect to some examples may be combined in some other examples. For example, an apparatus may be implemented or a method may be practiced using any number of the aspects set forth herein. In addition, the scope of the disclosure is intended to cover such an apparatus or method that is practiced using other structure, functionality, or structure and functionality in addition to, or other than, the various aspects of the disclosure set forth herein. It should be understood that any aspect of the disclosure disclosed herein may be embodied by one or more elements of a claim.
As used herein, a phrase referring to “at least one of” a list of items refers to any combination of those items, including single members. As an example, “at least one of: a, b, or c” is intended to cover a, b, c, a-b, a-c, b-c, and a-b-c, as well as any combination with multiples of the same element (e.g., a-a, a-a-a, a-a-b, a-a-c, a-b-b, a-c-c, b-b, b-b-b, b-b-c, c-c, and c-c-c or any other ordering of a, b, and c).
As used herein, the term “determining” encompasses a wide variety of actions. For example, “determining” may include calculating, computing, processing, deriving, investigating, looking up (e.g., looking up in a table, a database or another data structure), ascertaining and the like. Also, “determining” may include receiving (e.g., receiving information), accessing (e.g., accessing data in a memory) and the like. Also, “determining” may include resolving, selecting, choosing, establishing and the like.
The methods disclosed herein comprise one or more steps or actions for achieving the methods. The method steps and/or actions may be interchanged with one another without departing from the scope of the claims. In other words, unless a specific order of steps or actions is specified, the order and/or use of specific steps and/or actions may be modified without departing from the scope of the claims. Further, the various operations of methods described above may be performed by any suitable means capable of performing the corresponding functions. The means may include various hardware and/or software component(s) and/or module(s), including, but not limited to a circuit, an application specific integrated circuit (ASIC), or processor. Generally, where there are operations illustrated in figures, those operations may have corresponding counterpart means-plus-function components with similar numbering.
The following claims are not intended to be limited to the embodiments shown herein, but are to be accorded the full scope consistent with the language of the claims. Within a claim, reference to an element in the singular is not intended to mean “one and only one” unless specifically so stated, but rather “one or more.” Unless specifically stated otherwise, the term “some” refers to one or more. No claim element is to be construed under the provisions of 35 U.S.C. § 112 (f) unless the element is expressly recited using the phrase “means for” or, in the case of a method claim, the element is recited using the phrase “step for.” All structural and functional equivalents to the elements of the various aspects described throughout this disclosure that are known or later come to be known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the claims. Moreover, nothing disclosed herein is intended to be dedicated to the public regardless of whether such disclosure is explicitly recited in the claims.