This invention relates generally to systems for detecting misuse and abuse of commercial card transactions and, in one particular embodiment, to a system, method, and apparatus for self-adaptive scoring to detect misuse or abuse of commercial cards.
Employee misuse and abuse of commercial credit cards is a persistent problem. According to the Association of Certified Fraud Examiners (ACFE), billions are lost each year to employee misuse and abuse. As a result, corporations are seeking new ways to keep misuse and abuse under control and to minimize the significant financial risks accompanying such improper use.
Unlike fraud, misuse and abuse are not usually reported by the cardholders themselves, who are the bad actors. Therefore, misuse and abuse must be detected independently of the cardholders. Moreover, bad actors continually devise new schemes for misusing and abusing commercial cards, and these new schemes may go unnoticed when adequate investigative and detection resources are not available.
System modeling for detecting misuse or abuse of commercial cards is very difficult. Misuse and abuse detection with analytic processing is important for detecting previously undetected anomalies in company credit card transactional data. However, traditional approaches to misuse and abuse prevention are not particularly efficient. For example, improper payments are often managed by analysts auditing what amounts to only a very small sample of transactions.
Existing commercial card misuse and abuse detection systems and methods employ fixed sets of rules and are limited to a data-intensive task of sifting through a multitude of attributes to find new and evolving patterns. In addition, validation of scores is very difficult. Existing models use static rule sets to score cases once a subset of features has been identified.
Further, existing spend management systems have provided travel managers, purchasing managers, finance managers, and card program managers with access to online systems to control commercial card purchases. In addition to purchase administration, these systems provide traditional procurement management functions, such as accounting structure support, default coding, split coding, workflow, and direct integration to accounting systems. For example, managers can review purchases for personal use and for compliance with company policy and procedure, and approve transactions. Existing systems also offer basic reporting, full-feature expense reporting, multinational rollup reporting, and white-labeled solutions. For travel accounts, systems include detailed travel data, central travel account support, and full-feature expense reporting with receipt imaging, policy alerts, and approval options.
Accordingly, there is a need in the technological arts for systems and methods for updating data models capable of capturing new patterns of misuse and abuse. Additionally, there exists a need in the technological arts for systems providing improved spend management, out-of-compliance commercial card transaction annotations, past-due account and overspend monitoring, approval threshold triggers, preferred supplier designation and monitoring, and enhanced regulatory reporting. Finally, a need exists for compliance management using critical intelligence assistance for optimal card program management.
Accordingly, it is an object of the present invention to provide a system, method, and apparatus for a self-adaptive scoring process that detects misuse or abuse of commercial cards automatically, using supervised feedback as well as unsupervised anomaly detection to refine the machine learning anomaly detection algorithms.
According to a non-limiting embodiment, provided is a computer-implemented method for detecting non-compliant commercial card transactions from a plurality of transactions associated with a plurality of merchants, comprising: receiving, with at least one processor, a plurality of settled transactions for commercial cardholder accounts; generating, with at least one processor, at least one score for each settled transaction of the plurality of settled transactions as each settled transaction is received based at least partially on at least one scoring model; determining, with at least one processor, whether each settled transaction is compliant or non-compliant based at least partially on the at least one score for each settled transaction; receiving, with at least one processor from at least one user, case disposition data corresponding to at least one settled transaction of the plurality of settled transactions; and automatically modifying, at predefined intervals, the scoring model based at least partially on heuristics, anomaly scoring and case disposition data.
According to a non-limiting embodiment, provided is a system for detecting at least one non-compliant commercial card transaction from a plurality of transactions associated with a plurality of merchants, comprising at least one transaction processing server having at least one processor programmed or configured to: receive, from a merchant, a plurality of settled transactions for commercial cardholder accounts; generate at least one score for each settled transaction of the plurality of settled transactions as each settled transaction is received based at least partially on at least one scoring model; determine whether each settled transaction is compliant or non-compliant based at least partially on the at least one score for each settled transaction; receive, from at least one user, score influencing heuristics corresponding to at least one settled transaction of the plurality of settled transactions; receive, from at least one user, case disposition data corresponding to at least one settled transaction of the plurality of settled transactions; and automatically modify, at predefined intervals, the scoring model based at least partially on the heuristics, anomaly detection and case disposition data.
According to a further non-limiting embodiment, provided is a computer program product for processing non-compliant commercial card transactions from a plurality of transactions associated with a plurality of merchants, comprising at least one non-transitory computer-readable medium including program instructions that, when executed by at least one processor, cause the at least one processor to: receive, from a merchant point of sale system, a plurality of settled transactions for commercial cardholder accounts; generate at least one score for each settled transaction of the plurality of settled transactions as each settled transaction is received based at least partially on at least one scoring model; determine whether each settled transaction is compliant or non-compliant based at least partially on the at least one score for each settled transaction; receive, from at least one user, score influencing heuristics corresponding to at least one settled transaction of the plurality of settled transactions; receive, from at least one user, case disposition data corresponding to at least one settled transaction of the plurality of settled transactions; and automatically modify, at predefined intervals, the scoring model based at least partially on the heuristics and case disposition data.
Further embodiments or aspects are set forth in the following numbered clauses:
Clause 1: A computer-implemented method for detecting non-compliant commercial card transactions from a plurality of transactions associated with a plurality of merchants, comprising: receiving, with at least one processor, a plurality of settled transactions for commercial cardholder accounts; generating, with at least one processor, at least one score for each settled transaction of the plurality of settled transactions as each settled transaction is received based at least partially on at least one scoring model; determining, with at least one processor, whether each settled transaction is compliant or non-compliant based at least partially on the at least one score for each settled transaction; receiving, with at least one processor from at least one user, case disposition data corresponding to at least one settled transaction of the plurality of settled transactions; and automatically modifying, at predefined intervals, the scoring model based at least partially on heuristics and case disposition data.
Clause 2: The computer-implemented method of clause 1, wherein the at least one scoring model is based at least partially on at least one of a probability-based outlier detection algorithm and a clustering algorithm.
Clause 3: The computer-implemented method of clauses 1 and 2, wherein receiving the case disposition data comprises: generating at least one graphical user interface comprising at least a subset of the plurality of settled transactions; and receiving user input through the at least one graphical user interface, the user input comprising the case disposition data.
Clause 4: The computer-implemented method of clauses 1-3, wherein generating the at least one score for each settled transaction of the plurality of settled transactions as each settled transaction is received comprises generating the at least one score for a subset of settled transactions on a daily basis or on a real-time basis.
Clause 5: The computer-implemented method of clauses 1-4, further comprising receiving, with at least one processor from the at least one user, at least one score influencing rule corresponding to at least one settled transaction of the plurality of settled transactions, wherein the scoring model is modified based at least partially on the at least one score influencing rule.
Clause 6: The computer-implemented method of clauses 1-5, further comprising receiving, by a case presentation server, the score influencing rule, wherein the score influencing rule is assigned to a first company.
Clause 7: The computer-implemented method of clauses 1-6, further comprising, in response to generating the at least one score for each settled transaction, determining, with at least one processor, reason codes that communicate information about a particular scored feature.
Clause 8: The computer-implemented method of clauses 1-7, further comprising, in response to generating the at least one score for each settled transaction, determining, with at least one processor, reason codes that communicate information about a particular scored feature, wherein a contribution to the score is indicated by the reason code.
Clause 9: The computer-implemented method of clauses 1-8, wherein the clustering algorithm is processed first, providing at least one scored settled transaction, before the at least one probability-based outlier detection algorithm is processed.
Clause 10: The computer-implemented method of clauses 1-9, further comprising receiving feedback for model scoring, the feedback including at least one of score influencing rules, case dispositive data, old model scores, and new historical data.
Clause 11: The computer-implemented method of clauses 1-10, wherein the feedback updates at least one attribute associated with a scored transaction.
Clause 12: A system for detecting at least one non-compliant commercial card transaction from a plurality of transactions associated with a plurality of merchants, comprising at least one transaction processing server having at least one processor programmed or configured to: receive, from a merchant, a plurality of settled transactions for commercial cardholder accounts; generate at least one score for each settled transaction of the plurality of settled transactions as each settled transaction is received based at least partially on at least one scoring model; determine whether each settled transaction is compliant or non-compliant based at least partially on the at least one score for each settled transaction; receive, from at least one user, score influencing heuristics corresponding to at least one settled transaction of the plurality of settled transactions; receive, from at least one user, case disposition data corresponding to at least one settled transaction of the plurality of settled transactions; and automatically modify, at predefined intervals, the scoring model based at least partially on the heuristics and case disposition data.
Clause 13: The system of clause 12, wherein the at least one scoring model is based at least partially on at least one of a probability-based outlier detection algorithm and a clustering algorithm.
Clause 14: The system of clauses 12 and 13, wherein the at least one processor is further programmed or configured to: generate at least one graphical user interface comprising at least a subset of the plurality of settled transactions; and receive user input through the at least one graphical user interface, the user input comprising the case disposition data.
Clause 15: The system of clauses 12-14, wherein the at least one processor is further programmed or configured to generate at least one score for each settled transaction of the plurality of settled transactions as each settled transaction is received, comprising generating the at least one score for a subset of settled transactions on a daily basis or on a real-time basis.
Clause 16: The system of clauses 12-15, wherein the at least one processor is further programmed or configured to receive, with at least one processor from the at least one user, at least one score influencing rule corresponding to at least one settled transaction of the plurality of settled transactions, wherein the scoring model is modified based at least partially on the at least one score influencing rule.
Clause 17: The system of clauses 12-16, wherein the score influencing rule is assigned to a first company.
Clause 18: The system of clauses 12-17, wherein the at least one processor is further programmed or configured to, in response to generating the at least one score for each settled transaction, determine, with at least one processor, reason codes that communicate information about a particular scored feature, wherein a contribution to the score is indicated by the reason code.
Clause 19: The system of clauses 12-18, wherein the at least one processor is further programmed or configured to process the clustering algorithm first, providing at least one scored settled transaction, before at least one probability-based outlier detection algorithm is processed.
Clause 20: The system of clauses 12-19, wherein the at least one processor is further programmed or configured to receive feedback including at least one of score influencing rules, case dispositive data, old model scores, and new historical data.
Clause 21: The system of clauses 12-20, wherein the feedback updates at least one attribute associated with a scored transaction.
Clause 22: A computer program product for processing non-compliant commercial card transactions from a plurality of transactions associated with a plurality of merchants, comprising at least one non-transitory computer-readable medium including program instructions that, when executed by at least one processor, cause the at least one processor to: receive, from a merchant point of sale system, a plurality of settled transactions for commercial cardholder accounts; generate at least one score for each settled transaction of the plurality of settled transactions as each settled transaction is received based at least partially on at least one scoring model; determine whether each settled transaction is compliant or non-compliant based at least partially on the at least one score for each settled transaction; receive, from at least one user, score influencing heuristics corresponding to at least one settled transaction of the plurality of settled transactions; receive, from at least one user, case disposition data corresponding to at least one settled transaction of the plurality of settled transactions; and automatically modify, at predefined intervals, the scoring model based at least partially on the heuristics and case disposition data.
Various ones of the appended drawings merely illustrate example embodiments of the present disclosure and cannot be considered as limiting its scope.
For purposes of the description hereinafter, the terms “end,” “upper,” “lower,” “right,” “left,” “vertical,” “horizontal,” “top,” “bottom,” “lateral,” “longitudinal,” and derivatives thereof shall relate to the invention as it is oriented in the drawing figures. However, it is to be understood that the invention may assume various alternative variations and step sequences, except where expressly specified to the contrary. It is also to be understood that the specific devices and processes illustrated in the attached drawings, and described in the following specification, are simply exemplary embodiments or aspects of the invention. Hence, specific dimensions and other physical characteristics related to the embodiments or aspects disclosed herein are not to be considered as limiting.
Non-limiting embodiments of the present invention are directed to a system, method, and computer program product for detecting at least one misuse or abuse of a commercial card during a commercial card transaction associated with a company or institution. Embodiments of the invention allow for a self-adaptive refinement of scoring rules defined using feedback provided by supervised learning from account owners, supervised scoring rules, and dispositive data. In a non-limiting embodiment of the invention, the system makes use of the known and available misuse and abuse data to learn using machine learning algorithms to find new patterns and generate more accurate reason codes. The scores and codes become more accurate when the available data is used to make new determinations. Rather than waiting for human intervention to update the rules gradually, non-limiting embodiments may include supervised learning, comprising case information, score influencing rules, and transactional updates, some based on previous score models, to form new scoring models at a predetermined time. The self-adaptive refresh causes the scoring algorithm to predict new anomalies by eliminating old cases that could unduly influence new rules or contain false-positive commercial card transactions.
As used herein, the term “commercial card” refers to a portable financial device issued to employees or agents of a company or institution to conduct business-related transactions. A commercial card may include a physical payment card, such as a credit or debit card, or an electronic portable financial device, such as a mobile device and/or an electronic wallet application. It will be appreciated that a commercial card may refer to any instrument or mechanism used to conduct a transaction with an account identifier tied to an individual and a company or institution.
As used herein, the terms “misuse” and “abuse” refer to the characterization or classification of a transaction based on predictions using attributes of the associated data to determine the nature of a transaction. Abuse may refer to intentionally or unintentionally violating policies and procedures for personal gain. Misuse may refer to the unauthorized purchasing activity by an employee or agent to whom a commercial card is issued. Misuse may comprise a wide range of violations, varying in the degree of severity, from buying a higher quality good than what is deemed appropriate to using non-preferred suppliers. The term “fraud” may refer to the unauthorized use of a card, resulting in an acquisition whereby the end-user organization does not benefit. Fraud may be committed by the cardholder, other employees of the end-user organization, individuals employed by the supplier, or persons unknown to any of the parties involved in the transaction.
As used herein, the terms “communication” and “communicate” refer to the receipt or transfer of one or more signals, messages, commands, or other type of data. For one unit (e.g., any device, system, or component thereof) to be in communication with another unit means that the one unit is able to directly or indirectly receive data from and/or transmit data to the other unit. This may refer to a direct or indirect connection that is wired and/or wireless in nature. Additionally, two units may be in communication with each other even though the data transmitted may be modified, processed, relayed, and/or routed between the first and second unit. For example, a first unit may be in communication with a second unit even though the first unit passively receives data and does not actively transmit data to the second unit. As another example, a first unit may be in communication with a second unit if an intermediary unit processes data from one unit and transmits processed data to the second unit. It will be appreciated that numerous other arrangements are possible.
As used herein, the term “merchant” may refer to an individual or entity that provides goods and/or services, or access to goods and/or services, to customers based on a transaction, such as a payment transaction. The term “merchant” or “merchant system” may also refer to one or more computer systems operated by or on behalf of a merchant, such as a server computer executing one or more software applications. A “merchant point-of-sale (POS) system,” as used herein, may refer to one or more computers and/or peripheral devices used by a merchant to engage in payment transactions with customers, including one or more card readers, near-field communication (NFC) receivers, RFID receivers, and/or other contactless transceivers or receivers, contact-based receivers, payment terminals, computers, servers, input devices, and/or other like devices that can be used to initiate a payment transaction. A merchant POS system may also include one or more server computers programmed or configured to process online payment transactions through webpages, mobile applications, and/or the like.
As used herein, the term “supervised learning” may refer to one or more machine learning algorithms that start with known input variables (x) and an output variable (y), and learn the mapping function from the input to the output. The goal of supervised learning is to approximate the mapping function so that predictions can be made about new input variables (x) that can be used to predict the output variables (y) for that data. The process of a supervised algorithm learning from the training dataset can be thought of as a teacher supervising the learning process. The correct answers are known. The algorithm iteratively makes predictions on the training data and is corrected by the teacher. Learning stops when the algorithm achieves an acceptable level of performance. Supervised learning problems can be further grouped into regression problems and classification problems. Supervised learning techniques can use labeled (e.g., classified) training data with normal and outlier data, but are not as reliable because of the lack of labeled outlier data. For example, multivariate probability distribution based systems are likely to score the data points with lower probabilities as outliers. A regression problem is when the output variable is a real value, such as “dollars” or “weight”. A classification problem is when the output variable is a category, such as “red” and “blue,” or “compliant” and “non-compliant”.
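By way of illustration only, and not limitation, the supervised case may be sketched as follows. The feature layout (amount, hour of day, MCC rarity), the label values, and the use of the scikit-learn library are assumptions made solely for this example and are not part of the disclosed system.

    # Illustrative sketch only: a supervised classifier learns the mapping from
    # transaction features (x) to a compliance label (y). The feature layout and
    # the use of scikit-learn are assumptions made for illustration.
    from sklearn.tree import DecisionTreeClassifier

    # Hypothetical labeled training data: [amount, hour of day, MCC rarity]
    X_train = [[4.50, 12, 0.02], [250.00, 2, 0.85], [27.00, 13, 0.02], [900.00, 23, 0.90]]
    y_train = ["compliant", "non-compliant", "compliant", "non-compliant"]

    model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

    # Predict the label for a new, unlabeled transaction.
    print(model.predict([[310.00, 1, 0.80]]))  # expected: ['non-compliant']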
As used herein, the term “unsupervised learning” may refer to an algorithm that has input variables (x) and no corresponding output variables. The goal of unsupervised learning is to model the underlying structure or distribution in the data in order to learn more about the data. Unlike supervised learning, in unsupervised learning there are no correct answers and there is no teacher. Unsupervised learning algorithms are used to discover and present the interesting structure in the data. Unsupervised learning problems can be further grouped into clustering and association problems. A clustering problem is one in which modeling is used to discover the inherent groupings in a dataset, such as grouping customers by purchasing behavior. An association rule learning problem is one in which the goal is to discover rules that describe large portions of the data, such as that people who buy A also tend to buy B. Some examples of unsupervised learning algorithms are clustering and likelihood modeling.
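As a non-limiting illustration of the unsupervised case, the following sketch clusters unlabeled transactions by purchasing behavior. The two-dimensional feature layout and the use of scikit-learn's KMeans are assumptions made for brevity.

    # Illustrative sketch only: unsupervised clustering of unlabeled transactions.
    from sklearn.cluster import KMeans

    # Hypothetical unlabeled data: [transaction amount, hour of day]
    X = [[4.50, 12], [6.00, 13], [5.25, 12], [480.00, 2], [510.00, 3]]

    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

    # Transactions assigned to the smaller or more distant cluster can be
    # reviewed as potential anomalies.
    print(kmeans.labels_)           # cluster membership for each transaction
    print(kmeans.cluster_centers_)  # the discovered group centers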
Referring now to
In a non-limiting embodiment of the scoring system 100 shown in
With continued reference to
The commercial card transaction data 104 may refer to standard transaction data and may include, for example, transaction date, transaction time, supplier, merchant, total transaction value, customer-defined reference number (e.g., a purchase order number), separate sales tax amount, and/or line-item detail, such as the item purchased. The stored commercial data 106 may include data that can be associated with a transaction by comparing key identifying fields, such as one or more of name, cardholder ID, merchant ID, or Merchant Category Code (MCC). In non-limiting embodiments, such matching may incorporate data from existing tables and may include, for example, one or more of lodging data, case data, car rental data, and/or account balance data. Heuristics and dispositive data 108 may refer to rules based on user inputs during a review; each company in the system has the capability to create such rules to influence score values based on certain criteria. For example, if the MCC has a value of 5812 (fast food) and the amount is less than $5, the score may be in the low range (indicating a proper transaction) across most commercial systems. If the amount is over $100, the transaction may be considered abnormal for a lunchtime fast-food purchase. Such a rule, and others of similar and increasing complexity, may be stored in the system 100 and may characterize transactions as they are processed. The rules are statements that include one or more identifying clauses of what, where, who, when, and why a certain transaction should be influenced.
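By way of illustration only, a score influencing rule of the kind described above (MCC 5812 with a small amount scoring low, and a large amount scoring as abnormal) might be expressed as follows. The field names, return values, and thresholds are assumptions chosen to mirror the fast-food example.

    # Illustrative sketch of a score influencing rule; thresholds and field
    # names are assumptions chosen to mirror the fast-food example above.
    def lunch_rule(transaction):
        """Return a score adjustment hint for MCC 5812 (fast food) transactions."""
        if transaction.get("mcc") != "5812":
            return None                      # rule does not apply
        amount = transaction["amount"]
        if amount < 5.00:
            return "low"                     # likely a proper transaction
        if amount > 100.00:
            return "abnormal"                # unusual for a lunchtime purchase
        return None

    print(lunch_rule({"mcc": "5812", "amount": 3.75}))    # low
    print(lunch_rule({"mcc": "5812", "amount": 245.00}))  # abnormal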
The score influencing rules may also further refine or adjust the scores of the dataset. Parameters of an old score model may be added to the model data. The old unsupervised scoring model may be used to score elements of the dataset, to assign score rules to features of the data, and to create more attributes in the data. A query processor may be configured to update historical data with provisions about cases based on dispositive tagging by an end-user and on score influencing rules for tagging records. The system includes a case presentation application for receiving communications for entering, updating, copying, and changing rules and for tagging or scoring records. Case dispositive data, or a decision matrix, indicates information about a case, such as a tag explicitly showing that a case is ‘good,’ ‘misuse,’ ‘abuse,’ and/or ‘fraud.’ The labels can be used to remove abusive transactions from the model data before the unsupervised algorithms are run.
In one non-limiting embodiment, the scoring state feedback 110 may refer to a process of dynamically shaping the scores based on feedback from the data and input sources. The state of the dynamic scoring system 100 is based on a collection of variables or attributes that permit detection of new anomalies. Incremental changes in the system are entered into the scoring algorithms, and incremental changes in these attributes, defined by differences introduced in the state of the system, can have powerful effects during the training of new model scores. The incremental changes may refer to changes in commercial data, updated or new case dispositive or influencing rules, and new transaction data. The feedback may affect or influence the features of the model.
The scoring model 102, in response to receiving a model data set, generates predictions on new raw data for which the target is not known. For example, to train a model to predict if a commercial card transaction is a misuse or abuse, training data is used that contains transactions for which the target is known (e.g., a label that indicates whether a commercial card transaction is abused or not abused). Training of a model is accomplished by using this data, resulting in a model that attempts to predict whether new data will be abuse/misuse or not.
Referring now to
With continued reference to
With continued reference to
Still referring to
Utility processing 204 includes the training process, which fits the scoring model with data to create the scoring algorithms. Data training server 220 generates score rules defined by the scoring model using training data, includes one or more feature values for entity classification, and associates each entity with one or more classifiers. The data training server 220 may build the model scores using a gradient boosting system that applies a machine learning process to build scoring models including one or more sub-models. For example, each of the one or more sub-models can be a decision tree. Candidate features of the trees are defined by normalized transactional data, lodging data, case data, rules data, account level aggregates, transaction history, and/or balance data. The training data includes compliant transactions and/or one or more raw non-compliant transactions. The features of the data are determined using processes for unsupervised machine learning. The final model being delivered is a decision tree. The model scoring training builds a scoring algorithm using gradient boosting trees. In addition, reason codes may be determined by estimating feature importance in each tree; the estimated feature contribution in the scores of each terminal node is used to generate the reason codes. A clustering method and a likelihood model are built using the training data, and a record's outlier-ness is tested against them. In a non-limiting embodiment, the machine learning algorithms are run in sequence, with the clustering run twice and the likelihood modeling run after the clustering training.
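A highly simplified sketch of such a training step, using gradient boosting trees and per-feature importances as a stand-in for reason codes, is shown below. The feature set, the library choice, and the mapping from feature importance to reason codes are assumptions made for illustration; the actual system may derive reason codes from per-terminal-node contributions as described above.

    # Illustrative sketch: train a gradient boosting model on candidate features
    # and derive simple reason-code hints from feature importances. This is an
    # assumption-laden stand-in for the training process, not the actual system.
    from sklearn.ensemble import GradientBoostingClassifier

    FEATURES = ["amount", "hour_of_day", "mcc_rarity", "account_balance_ratio"]
    X_train = [
        [12.00, 12, 0.01, 0.10],
        [640.00, 1, 0.92, 0.85],
        [30.00, 13, 0.02, 0.12],
        [880.00, 23, 0.88, 0.90],
    ]
    y_train = [0, 1, 0, 1]  # 0 = compliant, 1 = non-compliant (hypothetical tags)

    model = GradientBoostingClassifier(n_estimators=50).fit(X_train, y_train)

    # Rank features by estimated importance; the top contributors for a scored
    # transaction could be reported as reason codes.
    ranked = sorted(zip(FEATURES, model.feature_importances_),
                    key=lambda pair: pair[1], reverse=True)
    for name, importance in ranked:
        print(f"{name}: {importance:.3f}")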
During the implementation phase, the score rules are used to process incoming transactions for detection of misuse and abuse. Monitor reports 222 can be used to transfer analytic knowledge. A second set of queries 224, similar to the queries 214, is used to generate a dataset 226. The dataset 226 may be scored by one or more of a decision matrix 234 and preconfigured rules 232. A scoring engine 228 processes the dataset 226 using the score influencing rules and the decision matrix 234 to produce the scored dataset 236. As cases are scored, they are communicated to a case management server.
Unlike fraud detection for regular consumer credit cards, not all misuses and abuses can be easily detected. Unsupervised machine learning techniques have been adopted to capture new and undetected trends automatically. Prediction systems provide predictive analysis that utilizes past and present data to detect questionable transactions. The system uses advanced analytic techniques, such as machine learning, to identify new areas of risk and vulnerability.
Machine learning may refer to a variety of different computer-implemented processes that build models based on a population of input data by determining features of the entities within the population and the relationships between the entities. To build the model, the machine learning process can measure a variety of features of each entity within the population, and the features of different entities are compared to determine segmentations. For example, a machine learning process can be used to cluster entities together according to their features and the relationships between the entities.
As used herein, the terms “classifier” and “classification label” refer to a label (e.g., tag) describing an attribute of an entity. A classifier may be determined by a human or dynamically by a computer. For example, a person may classify a particular transaction as ‘good,’ ‘misuse,’ ‘abuse,’ and/or ‘fraud.’ In another example, transactions may be classified based on what type of goods or services are purchased (e.g., “food” or “hotel”) or other details of the transactions. One or more classification labels may be applied to each entity. Entities having the same classification label may have one or more features having similar values.
As used herein, the term “features” refers to the set of measurements for different characteristics or attributes of an entity as determined by a machine learning process. As such, the features of an entity are characteristic of that entity such that similar entities will have similar features depending on the accuracy of the machine learning process. For example, the “features” of a transaction may include the time of the transaction, the parties involved in the transaction, or the transaction value. In addition, the features of a transaction can be more complex, including a feature indicating the patterns of transactions conducted by a first party or patterns of the other parties involved in a transaction with the first party. The features determined by complex machine learning algorithms may not be able to be interpreted by humans. The features can be stored as an array of integer values. For example, the features for two different entities may be represented by the following arrays: [0.2, 0.3, 0.1, . . . ] for the first entity and [0.3, 0.4, 0.1, . . . ] for the second entity. Features such as bench-marking statistics (e.g., mean dollar per MCC) may be calculated for the company or institution and/or card-type.
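As one non-limiting illustration of such a bench-marking feature, the mean dollar amount per MCC might be computed along the following lines; the record layout and field names are assumptions.

    # Illustrative sketch: compute a bench-marking statistic (mean dollar per MCC)
    # that can be attached to each transaction as a feature. Field names are assumed.
    from collections import defaultdict

    transactions = [
        {"mcc": "5812", "amount": 4.50},
        {"mcc": "5812", "amount": 27.00},
        {"mcc": "7011", "amount": 310.00},
    ]

    totals, counts = defaultdict(float), defaultdict(int)
    for t in transactions:
        totals[t["mcc"]] += t["amount"]
        counts[t["mcc"]] += 1

    mean_per_mcc = {mcc: totals[mcc] / counts[mcc] for mcc in totals}
    print(mean_per_mcc)  # {'5812': 15.75, '7011': 310.0}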
The data services 202 include, for example, at least one or more volumes of data that are related to a transaction. Once in the system, the data is stored and used in the normal course of business. In addition, the data services 202 are able to match records with transactions. Data points that do not conform to the normal and expected patterns are called outliers. Outliers can involve a wide range of commercial transactions involving various aspects of a purchase transaction. The system stores large amounts of data, which may be unstructured, creating the opportunity to utilize big data processing technologies. Unstructured data may refer to raw data that has not been tagged.
The modeling approach segments data into groups based on attributes of the data. The groups are defined by attributes and differing combinations of attributes, such as card-type (e.g., purchase card or travel card), transaction type, or company type. In addition, the transactions may be segmented based on MCG, MCC, airline, hotel chain, car rental, demographic information, business unit, supplier location, cardholder state, cardholder country, transaction type, amount, supplier country, and/or supplier country and city.
As an example, detection may determine, for company A, that most commercial card users pay approximately $25.00 for lunch. The determination may be used to detect lunch transactions that are outliers relative to typical lunch transactions by calculating the mean and standard deviation. Transactions diverging from the standard deviation could be determined to be an instance of abuse or possible abuse. In one aspect of the invention, a rule could be programmed to compare records that deviate and report them as possible abuse. A transaction time combined with an MCC may be used to determine that the transaction is for lunch, and therefore that the transaction should be compared with typical lunch transactions.
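A minimal sketch of this lunch example follows, flagging transactions that deviate from the mean by more than a chosen number of standard deviations. The two-standard-deviation threshold and the sample amounts are assumptions made for illustration.

    # Illustrative sketch of the lunch example: flag lunch transactions that
    # deviate from the company mean by more than k standard deviations.
    # The 2-standard-deviation threshold is an assumption for illustration.
    import statistics

    lunch_amounts = [24.00, 26.50, 25.25, 23.75, 27.00, 98.00]  # hypothetical data
    mean = statistics.mean(lunch_amounts)
    stdev = statistics.pstdev(lunch_amounts)

    for amount in lunch_amounts:
        if abs(amount - mean) > 2 * stdev:
            print(f"${amount:.2f} flagged as possible abuse")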
A location attribute may indicate a location from which a transaction originates. For example, the attribute “City” may indicate “Paris” or “New York.” Other dimensions available include one or more of MCC occurrence rate, lodging data, case data, car rental data, and/or account balance data. Each transaction processed by the data scoring system 200 is assigned an MCC, a four-digit number that denotes the type of business providing a service or selling merchandise. The MCC for dating and escort services is 7273, and for massage parlors it is 7297. The table below shows several exemplary MCC codes which are used in the system:
The MCC may be used, for example, to monitor one or more aspects of and restrict spending on commercial cards. The MCCs, along with the name of the merchant, give card issuers an indication of cardholders' spending. The system can use MCCs for many different rules. In embodiments, a rating of MCCs could distinguish between common and rare merchant categories, or any range between. Rare MCCs may be scored as possible misuse and abuse.
With continued reference to
Still referring to
With continued reference to
At step 310 in
With reference to
Transaction groups are formed by attribute and then compared for finding anomalies. In a non-limiting embodiment, the MCC, which is an attribute of all transactions, is used to categorize the transactions. For example, Table 2 illustrates the transactions arranged in MCC groupings, the membership count for each MCC group, and a probability of occurrence for each MCC category. Of the total transactions, 1,145,225 are associated with an MCC of 5812. In another example, Table 3 shows the transaction records arranged as categories based on the amount billed. For example, 3,464,982 had transactions in the spending range of $25 or less.
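A brief sketch of forming such category probabilities from raw transactions is shown below; the counting approach and the small hypothetical sample are assumptions, and the actual groupings may be far larger, as in Tables 2 and 3.

    # Illustrative sketch: group transactions by MCC and compute each category's
    # probability of occurrence, as in Table 2. The counts here are hypothetical.
    from collections import Counter

    mcc_per_transaction = ["5812", "5812", "7011", "5812", "4111", "7011"]

    counts = Counter(mcc_per_transaction)
    total = sum(counts.values())
    probabilities = {mcc: n / total for mcc, n in counts.items()}
    print(probabilities)  # e.g., {'5812': 0.5, '7011': 0.333..., '4111': 0.166...}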
Still referring to
Still referring to
P(Xi) = P(X = i).
qval(Xi) = Σx∈X P(x), where X = {x : P(x) ≤ P(Xi)}
At step 408, it is determined whether rval<α or qval<β. In a non-limiting embodiment, the threshold values (α=0.01, β=0.0001) are provided to compare with the rval and qval of a transaction. Transaction 1 is not an outlier because the threshold value is not met:
At step 410, if the threshold comparison is true, then the matching record(s) is tagged as an outlier, or scored according to the determination. If not, the system returns to the next record for processing until rval and qval are calculated for each record.
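The outlier test described above might be sketched as follows. Because rval is not fully defined in this excerpt, it is assumed here to be the raw category probability P(Xi); the category probabilities, the counts behind them, and the dictionary layout are likewise assumptions, while the thresholds follow the text (α=0.01, β=0.0001).

    # Illustrative sketch of the probability-based outlier test. ASSUMPTION: rval
    # is taken to be the raw category probability P(Xi); the excerpt does not
    # fully define it. qval(Xi) sums P(x) over all categories whose probability
    # is less than or equal to P(Xi). Thresholds follow the text.
    ALPHA, BETA = 0.01, 0.0001

    category_prob = {"5812": 0.45, "7011": 0.30, "4111": 0.2495, "7273": 0.0005}

    def rval(category):
        return category_prob[category]                      # assumed definition

    def qval(category):
        p_i = category_prob[category]
        return sum(p for p in category_prob.values() if p <= p_i)

    def is_outlier(category):
        return rval(category) < ALPHA or qval(category) < BETA

    for mcc in category_prob:
        print(mcc, "outlier" if is_outlier(mcc) else "not an outlier")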
With reference to
Still referring to
The compliance management processor 542, which audits and presents non-compliant transactions, presents the scored non-compliant cases for tagging after they are scored with the dynamic score rules, compliance workflow, and self-adaptive feedback. The compliance system adds a layer of protection and control for commercial card programs. In one aspect of the invention, the compliance management processor 542 includes a dashboard that is used to provide metrics, e.g., a macro view of certain performance factors. Compliance management processor 542 also includes displays for the selection and updating of records during auditing. For example, an audit of non-compliant transactions can be sorted by at least one or more of consumer demographic details, merchant details, or supplier details. In a non-limiting embodiment, fields used to perform an audit may include one or more of MCG, MCC, airline identifier, hotel chain identifier, car rental identifier, supplier address, cardholder country, transaction type, amount, total spend, percent of spend, transaction counts, delinquency dollars, count, amounts, misused case count, type, and/or spend. In addition, non-compliant cases may be audited by a threshold percent, such as the top ten MCCs by spend or some other threshold. The merchant profile may be defined by frequency of transactions across the company or other groupings. Transaction geography may capture purchases at locations never or infrequently visited by any employee, which may identify, or help to identify, a non-compliant settled transaction. Transaction values may also define deviation measures for evaluating whether a transaction is anomalous at the card program level. Transaction velocity and splitting may include, for example, a high value purchase that is split into multiple transactions to game the system, or high velocity ATM withdrawals. Detailed level data may define lodging transactions, with a detailed breakdown into levels and/or subcategories within lodging transactions, such as gift store, movie, telephone, minibar, or cash advance purchases.
The compliance management processor 542 provides an interface for scored commercial transaction case review. The case presentation system communicates existing case dispositions (B) and score influencing rules (C) to the compliance management processor 542 which further communicates the feedback to the data repository for storage until refinement of the score rules. In an embodiment of the invention, the compliance management processor 542 provides additional data manipulation on the interface 550 for activating at least one new or updated score influencing rule, sampling, or prediction processes to identify questionable transactions to be processed through the compliance management processor 542. Sampling statistics may refer to a sampling of results to define conditions for handling a case. The score influencing rules may refer to stored logic for comparing a transaction against criteria set in one or more standard rules, set of rules, or customizable rules to identify potential out-of-policy spend. Case disposition data may define a transaction or grouping of transactions, for example, including at least one of misuse, abuse, fraud, or valid.
The compliance management processor 542 receives input including, for example, one or more non-compliant scored cases for constant surveillance, to help identify misuse and abuse updates and to feed those updates into the rules in the dynamic scoring system. The compliance processor also provides an intervention algorithm to automatically monitor specified card programs and provide suggestions for updates to move a program closer to, or back into, compliance. In an aspect of the invention, the interface 550 may be a web-based, flexible application for commercial payment programs that maximizes savings and benefits by operating according to a company's policies.
The processed data flows may be displayed or presented in the case presentation interface 550. The review is initiated in a first step by a manager in the compliance case management system 538. Next, appropriate personnel may respond to the initiated case to clarify aspects of the case; for example, receipts may be required for a questioned transaction. The case is then reviewed and accepted or rejected in response. Final disposition information is provided when the case is closed and placed into a configuration file.
The supervised learning may leverage attributes to influence scores. For example, the score influencing rules can include one or more attributes or influencing adjustments. Card profile characteristics may determine the expected transaction behavior defined by related historical transactions. Score influencing may be defined using attributes of the record, including by company title and hierarchy level adjustments (e.g., CEO, VP, and engineer).
With reference to
In non-limiting embodiments, at least six months of historical data is used to perform the model scoring. Some of the data may be labeled with classification labels, comprising features, disposition data, heuristic logic, case data, and unsupervised score rules. Other data may be in a raw format, with no tagging or classification. The anomalies are derived from the datasets, which include compliant cases and one or more non-compliant cases.
In addition to historical data, other sources of data are used for anomaly detection. Case data is defined by and associated with supervised learning about each company or institution. In an aspect of the invention, each company or institution will have the capability for including score values based on certain criteria. For example, the case data may indicate a low score for an MCC of 5812 and an amount less than $5. In another example, a commercial card associated with a CEO of a commercial cardholder company may be configured to suppress any amount less than $50k. In another non-limiting example, when a company that does business across industries identifies commercial card holders purchasing from an ecommerce company, the transaction may be scored to indicate it as misuse. To detect this type of probable misuse, a rule can be added to flag all such transactions based on the MCC of the transaction under a supervised learning model. Alternatively, machine learning algorithms may be used to detect such anomalies. In yet another example, any adult entertainment commercial transaction during a hotel stay may be identified as misuse.
In a non-limiting embodiment, the transactions are each tagged (e.g., labeled) as ‘good,’ ‘misuse,’ ‘abuse,’ and/or ‘fraud.’ Commercial cards that are used to make weekend purchases may be tagged as probable abuse and/or misuse. Scoring rules are stored in configuration files and processed in association with the model data. The configuration file may be executed when the data services are provisioning the modeling data, before the performance tagging using machine learning, or on each transaction as it arrives. In this way, obsolete data is removed from the system before the machine learning algorithms are run, which limits the effect that known old cases could otherwise have on the learning process. Such rules can be used to eliminate transactions from the modeling dataset or to adjust their impact, influencing the score of cases before the performance tagging acts on the data.
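A sketch of that pre-modeling filter follows, under the assumption that tags are stored alongside each record in a configuration-driven step; the tag values mirror the text, while the record layout is assumed.

    # Illustrative sketch: drop transactions already tagged as non-compliant
    # before the unsupervised algorithms run, so known old cases do not skew
    # the learned notion of "normal". The record layout is an assumption.
    EXCLUDE_TAGS = {"misuse", "abuse", "fraud"}

    records = [
        {"id": 1, "amount": 25.00, "tag": "good"},
        {"id": 2, "amount": 950.00, "tag": "abuse"},
        {"id": 3, "amount": 27.50, "tag": None},       # untagged raw data
    ]

    modeling_data = [r for r in records if r["tag"] not in EXCLUDE_TAGS]
    print([r["id"] for r in modeling_data])  # [1, 3]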
In a non-limiting embodiment, and with continued reference to
Still referring to the non-limiting embodiment in
The tables below show the results of comparing a legacy system with non-limiting embodiments of the new self-adaptive dynamic scoring system described herein. The system-wide quantitative results illustrate the significant increase in accuracy. The cross-company aggregated data shows much higher detection in both the top 5% and the top 10%. The “Bads” are the cases that are ultimately labeled as ‘misuse,’ ‘abuse,’ and/or ‘fraud.’
Tables 4 and 5 show the difference in results between two scoring systems, Table 4 using the new scoring model generation and Table 5 not using such scoring methods. Table 4 shows the accuracy increasing significantly as risk increases among the riskiest account groups, as compared to the same groups in the old system. For example, the bad-rate in the top 5% of riskiest accounts is 5× better using the new scoring than using the old scores. These rates are increased for a high percentage of the riskiest cases based on the unsupervised learning algorithms. Tables 6 and 7 further divide the riskiest 1% to exemplify coverage, the probability that the scoring will produce an interval containing a bad case. Coverage is a property of the intervals. Table 6 shows probabilities with coverages for the top 1%, with a further division of this group in Table 7. The coverage in the top 5% is 4× better with the new scoring than with the old scoring.
Referring now to
At step 704 of
At step 706 of
Still referring to
With continued reference to
The resulting features are then stored and compared with a training dataset to form a scoring model.
With continued reference to
The system is then configured to repeat the model steps at step 718, as the old scoring model is used at least once a month to refine, rebuild, or refresh the score rules with self-adaptive learning from the supervised state of the system. The feedback eliminates non-compliant cases from the normal cases and influences future unsupervised rule scores. The refreshed dataset includes at least one previously undetected anomaly and excludes at least one previously detected anomaly, thereby increasing the probability of spotting an abusive trend in the remaining cases.
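In broad strokes, and as a non-limiting illustration only, the periodic refresh might resemble the sketch below. The helper functions, record fields, and the placement of the monthly schedule outside the function are hypothetical placeholders standing in for the steps described in the text.

    # Illustrative sketch of the self-adaptive refresh loop. The helper functions
    # are hypothetical placeholders for the steps described in the text.
    def refresh_scoring_model(history, dispositions, influence_rules, old_model):
        # 1. Remove previously detected non-compliant cases using case dispositions.
        normal_cases = [t for t in history
                        if dispositions.get(t["id"]) not in {"misuse", "abuse", "fraud"}]
        # 2. Apply score influencing rules to adjust or tag the remaining records.
        adjusted = [apply_rules(t, influence_rules) for t in normal_cases]
        # 3. Retrain the unsupervised scoring model on the refreshed dataset,
        #    optionally seeded with the old model's scores as extra attributes.
        return train_unsupervised_model(adjusted, seed_scores=old_model)

    def apply_rules(transaction, rules):                    # placeholder
        return transaction

    def train_unsupervised_model(data, seed_scores=None):   # placeholder
        return {"trained_on": len(data)}

    print(refresh_scoring_model(history=[{"id": 1}, {"id": 2}],
                                dispositions={2: "abuse"},
                                influence_rules=[],
                                old_model=None))            # {'trained_on': 1}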
Referring now to
Next, and still referring to
In the scoring step 806, a supervised machine learning process can use a set of population data and associated tags for each object in the training data to generate a set of logic for determining tags for unlabeled data. For example, a person may report that a particular transaction is “fraudulent” or “not-fraudulent.” The score influencing rules can include one or more attributes or influencing adjustments related to card profile characteristics that may determine the expected transaction behavior defined by related historical transactions. Score influencing may be defined using attributes of the record, including company title and hierarchy level adjustments (e.g., CEO, VP, and engineer). Scoring step 806 also includes performance or automatic tagging (e.g., labeling) of the raw data based on detected anomalies in an unsupervised machine learning process. Performance tagging may be defined as automatic machine or computer-implemented tagging of records without human intervention. Performance tagging may further transform the attributes of transaction records into categorical values. For example, in a first transaction a record is determined not to be an outlier because the threshold value is not met; accordingly, a score or disposition can be assigned for categorizing the record based on the identified feature score. Alternatively, when a threshold value is met in one or a combination of a record's attributes, a field in the record may be labeled as an outlier, further characterizing the record. If a case is scored high by performance tagging, an administrator may review the case and mark the performance tag as incorrect to lower the score and affect the unsupervised scoring in the next update of the scoring model.
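A compact sketch of performance tagging with a later supervised override follows, under the assumption that tags are stored as fields on each record; the threshold value and field names are assumptions.

    # Illustrative sketch: automatic (performance) tagging against a threshold,
    # followed by a supervised override that feeds the next model update.
    # The threshold and field names are assumptions.
    THRESHOLD = 0.8

    def performance_tag(record):
        record["outlier"] = record["anomaly_score"] >= THRESHOLD
        return record

    def administrator_override(record, is_correct):
        # If a reviewer marks a high score as incorrect, clear the outlier tag so
        # the next scoring-model update treats the case as normal.
        if not is_correct:
            record["outlier"] = False
        return record

    record = performance_tag({"id": 42, "anomaly_score": 0.91})
    print(record["outlier"])                      # True
    record = administrator_override(record, is_correct=False)
    print(record["outlier"])                      # False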
With continued reference to
At step 810, the system automatically modifies the scoring model. In a non-limiting embodiment, the system makes use of the known and available misuse and abuse data to learn using unsupervised machine learning algorithms to find new patterns and generate more accurate reason codes. The scores and codes become more accurate when the self-adapting feedback is used to make new determinations by identifying categories of good and bad cases with case dispositive data and influencing scoring with new rules. The self-adaptive refresh causes the scoring algorithm to predict new anomalies.
Although the invention has been described in detail for the purpose of illustration based on what is currently considered to be the most practical and preferred embodiments, it is to be understood that such detail is solely for that purpose and that the invention is not limited to the disclosed embodiments, but, on the contrary, is intended to cover modifications and equivalent arrangements that are within the spirit and scope of the appended claims. For example, it is to be understood that the present invention contemplates that, to the extent possible, one or more features of any embodiment can be combined with one or more features of any other embodiment.