1. Field
This disclosure is generally related to fraud detection. More specifically, this disclosure is related to generating a fraud-detection score that is weighted based on an inverse frequency of the fraud type.
2. Related Art
Governing organizations routinely audit certain entities, such as people or companies, to ensure that these entities are following the organization's policies, and thus are not committing fraudulent acts. The policies are typically written as documents that specify rules which need to be enforced by agents of the governing organization. The Internal Revenue Service (IRS), for example, employs agents to audit the tax filings from tax payers and companies to ensure that these tax payers have not omitted revenue from their tax filings, either intentionally or accidentally.
As a further example, pharmacies typically dispense controlled drugs, such as narcotics, which can only be handled by licensed pharmacists and doctors, and should only be made available to patients with a proper prescription. The pharmacy or the drug enforcement agency (DEA) may routinely audit how the controlled drugs are being dispensed to ensure that pharmacists are not committing fraud by dispensing controlled drugs in an illegal manner. Also, a health-insurance agency may audit the insurance claims filed by a pharmacy to ensure that the pharmacy is not committing fraud, for example, by dispensing or refilling drugs for which a patient has not submitted a prescription.
However, to detect fraudulent entities, an organization typically has to audit these entities individually, which can be a time-consuming and resource-consuming effort. The organization may instead opt for auditing a randomly selected batch of entities at a time, and/or may audit entities that are suspected of having committed fraud in some way.
One embodiment provides a fraud-detection system that facilitates detecting fraudulent entities. During operation, the system can obtain fraud warnings for a plurality of entities, and for a plurality of fraud types. The system computes, for a respective entity, a fraud-detection score which indicates a normalized cost of fraudulent transactions from the respective entity. The system then determines, from the plurality of entities, one or more anomalous entities whose fraud-detection score indicates anomalous behavior. The system can determine an entity that is likely to be fraudulent by comparing the entity's fraud-detection score to fraud-detection scores for other entities.
In some embodiments, an entity can include one or more of: a pharmacy; a health clinic; a pharmacy patient; a merchant; and a credit card holder.
In some embodiments, a cost of the fraudulent transactions for a fraud type a can indicate a number of transactions associated with fraud type a, or an aggregate price for the transactions associated with fraud type a.
In some embodiments, the system processes transactions associated with the respective entity, using a set of fraud-detecting rules, and generates a set of fraud warnings for the respective entity based on the fraud-detecting rules. A respective fraud warning indicates a transaction which may be associated with a fraud type for a corresponding fraud-detecting rule.
In some embodiments, while computing a fraud-detection score for the respective entity, the system computes a fraud weight, fraud_weight(a), for the respective fraud type a. The system also computes a weighted fraud cost, wfc(a,p), for the respective entity p and fraud type a:
wfc(a,p)=N(a,p)*fraud_weight(a)
wherein N(a,p) indicates an aggregate cost for transactions from entity p that are associated with fraud type a. The system then computes a fraud-detection score for the respective entity p by aggregating weighted fraud costs for the plurality of fraud types.
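By way of illustration only, the weighted-fraud-cost computation described above can be sketched as follows, assuming a base-10 logarithm for the inverse-frequency weight; the function names and data layout are assumptions for the example, not part of the disclosure:

```python
import math

def fraud_weight(total_entities, flagged_entities):
    # Inverse-frequency weight: fraud types flagged for fewer
    # entities receive a larger weight (base-10 log assumed).
    return math.log10(total_entities / flagged_entities)

def fraud_score(costs_by_type, flagged_by_type, total_entities):
    # costs_by_type[a]   -> N(a, p): aggregate cost for entity p, fraud type a
    # flagged_by_type[a] -> T(a): number of entities flagged for fraud type a
    # Aggregates the weighted fraud costs wfc(a, p) over all fraud types.
    return sum(
        cost * fraud_weight(total_entities, flagged_by_type[a])
        for a, cost in costs_by_type.items()
    )
```

For example, a fraud type flagged for 10 of 1,000 entities receives a weight of 2.0, so an entity with an aggregate cost of 5.0 for that fraud type contributes a weighted fraud cost of 10.0 to its score.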
In some embodiments, while computing the fraud weight for the respective fraud type, the system computes:
fraud_weight(a)=log(T/T(a))
Here, T indicates a total number of entities, a indicates the fraud type, and T(a) indicates a total number of entities that have at least a predetermined number of fraud warnings associated with fraud type a.
In some embodiments, while computing the fraud-detection score for the respective entity p, the system computes:
fraud_score(p)=Σa∈A wfc(a,p)
Here, A indicates the plurality of fraud types.
In some variations, while computing the fraud-detection score for the respective entity p, the system computes:
fraud_score(p)=Σa∈A (N(a,p)/T)*fraud_weight(a)
Here, N(a,p) indicates an aggregate cost for transactions that are associated with fraud type a from entity p, and T indicates an aggregate cost for all transactions from all entities.
In some embodiments, the system computes the fraud-detection score for the respective entity p based on N(A,p), an aggregate cost for transactions from entity p that are associated with any fraud type in set A, normalized by T(p), an aggregate cost for all transactions from entity p, and weighted based on r(a), an average violation rate for fraud type a.
In the figures, like reference numerals refer to the same figure elements.
The following description is presented to enable any person skilled in the art to make and use the embodiments, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present disclosure. Thus, the present invention is not limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
Embodiments of the present invention provide a fraud detection system that solves the problem of processing fraud warnings for a plurality of entities to determine which entities are likely to be committing fraudulent transactions. For example, an organization such as a medical-insurance agency can generate rules for detecting possibly fraudulent activity. The system can process transactions, such as insurance claims from pharmacies, using these fraud-detecting rules to generate a fraud warning for a transaction that violates a rule.
However, not all fraud warnings indicate that a fraudulent transaction has occurred, or that an entity is intentionally fraudulent. In some embodiments, the system analyzes the set of fraud warnings to detect an anomaly in the fraud warnings. For example, the medical-insurance agency may have a policy that restricts pharmacies from performing early refills. The system thus generates a fraud warning related to early refills whenever the system detects that a pharmacy has performed such an early refill for a patient prior to receiving the prescription from the patient.
Some pharmacies may perform an early refill from time to time, for example, to accommodate a request from a patient. Other pharmacies may routinely perform early refills to file more insurance claims, which is against the insurance agency's policies. The system can distinguish fraudulent entities from others that violated a rule unintentionally by determining whether the type of fraud warning is uncommon across a population of entities, and whether the number or cost of the fraud violations from a given entity is greater than that of other entities. A fraudulent pharmacy that routinely commits fraudulent transactions may incur a high “cost” associated with a given fraud type, even when this type of fraudulent transaction has a low frequency across the population of pharmacies (e.g., the transaction is not a common transaction).
Fraud-detection server 102 can include a storage device 120, which stores fraud-detecting rules 122, fraud warnings 124, and fraud-detection scores 126. During operation, fraud-detection server 102 can receive fraud-detection rules 114 from insurance agency server 104, which configures fraud-detection server 102 to generate fraudulent entity report 128 for the insurance agency. Fraud-detection server 102 can also periodically receive entity information 116 and transaction information 118 from insurance agency server 104. Fraud-detection server 102 can process this information using fraud-detecting rules 122 to generate fraud warnings 124, and to compute fraud-detection scores 126 for the entities under investigation. Fraud-detection server 102 can also analyze fraud-detection scores 126 to generate fraudulent entity report 128, which can indicate a set of entities that are likely to have committed fraud.
In some embodiments, fraud-detection server 102 can receive fraud warnings 124 from the organization's computer system (e.g., from insurance agency server 104), which fraud-detection server 102 can use to generate fraud-detection scores 126 and fraudulent entity report 128 without having to process sensitive information from the entities under investigation.
Fraud-detecting rules 122 and fraud warnings 124 can correspond to a variety of possible fraud types. For example, some of fraud-detecting rules 122 generate fraud warnings based on DEA violations against pharmacies under investigation. A fraud-detecting rule can include a condition for generating a fraud warning when a pharmacy has received at least a threshold number of warnings or violations from the DEA. Another fraud-detecting rule can include a condition for generating a fraud warning for a pharmacy when a total amount of money owed and/or paid to the DEA in fines against this pharmacy is greater than a threshold fine amount.
As a further example, some of fraud-detecting rules 122 generate fraud warnings based on a pharmacy's transactions that violate certain operating procedures and/or policies (e.g., policies or procedures instituted by the DEA, an insurance agency, and/or the pharmacy's corporate organization), such as by performing an early fill or refill, regardless of whether these transactions have resulted in a DEA violation. A pharmacy is said to have performed an “early fill” when the pharmacy refills a prescription prior to the patient consuming at least a percentage of an earlier fill. In many cases, a pharmacy may perform an early fill during slow work hours to lessen the number of prescriptions that may need to be filled in the near future, or to accommodate a patient that may not be able to pick up the refill in the near future (e.g., due to travel arrangements).
This practice is not ideal, however, because it results in patients getting access to additional controlled substances that they may not use or may abuse, and because it results in additional costs to the medical-insurance agency. Hence, some of the fraud-detecting rules may include a condition for generating a fraud warning based on a number or cost of early fills performed by a pharmacy. For example, a fraud-detecting rule may define that an “early fill” has occurred when a pharmacy fills a prescription before a predetermined percentage of the previous fill is consumed (e.g., before the patient consumes at least 75% of the previous fill). The fraud-detecting rule may generate a fraud warning when a pharmacy has completed at least a threshold number of transactions associated with an early fill, based on a total amount of money associated with the early-fill transactions, or based on a total amount of money associated with the unused portion of the previous fill (e.g., as determined based on a per-pill cost).
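One possible form of such an early-fill rule is sketched below; the 75% consumption threshold, the threshold count of three early fills, and the record field names are assumptions for the example:

```python
def is_early_fill(days_supplied, days_elapsed, consumed_threshold=0.75):
    # A refill is "early" when less than `consumed_threshold` of the
    # previous fill has been consumed at the time of the refill.
    return (days_elapsed / days_supplied) < consumed_threshold

def early_fill_warning(refills, min_early_fills=3):
    # Emit a fraud warning once a pharmacy reaches the threshold
    # number of early-fill refill transactions.
    early = sum(
        1 for r in refills
        if is_early_fill(r["days_supplied"], r["days_elapsed"])
    )
    return early >= min_early_fills
```

A rule based on the cost of the early-fill transactions, rather than their count, would follow the same shape with a monetary aggregate in place of the counter.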
The fraud-detecting rule may also generate fraud warnings based on other metrics for detecting early refills. For example, an organization's server (e.g., server 104) or fraud-detection server 102 may keep track of a patient's amount of unused medication (e.g., a number of pills) when a pharmacy performs a refill transaction, and may compute an overall “unused-medication ratio” that indicates an aggregate percentage of unused medication associated with the pharmacy's refill transactions. A pharmacy that typically refills 30-day prescriptions two days before a patient's prior fill runs out may incur an unused-medication ratio of approximately 6.7%. On the other hand, a pharmacy that typically refills such prescriptions one week early may incur an unused-medication ratio of approximately 23.3%. Hence, a fraud-detecting rule may generate a fraud warning when a pharmacy's overall unused-medication ratio reaches a predetermined threshold level (e.g., 20%).
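The unused-medication ratio described above can be computed as in the sketch below, assuming each refill transaction records the pills supplied by the prior fill and the pills still unused when the refill occurs; the data layout is an assumption for the example:

```python
def unused_medication_ratio(refill_transactions):
    # refill_transactions: (pills_supplied, pills_unused_at_refill) pairs,
    # one per refill performed by the pharmacy.
    # Returns the aggregate fraction of medication left unused.
    supplied = sum(s for s, _ in refill_transactions)
    unused = sum(u for _, u in refill_transactions)
    return unused / supplied
```

For a pharmacy that refills 30-pill prescriptions two days early (at one pill per day), each transaction leaves 2 of 30 pills unused, giving a ratio of about 6.7%.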
In some embodiments, fraud-detection server 102 can generate fraudulent entity report 128 for a pharmacy by receiving, from a pharmacy server 106, fraud-detection rules and transaction information (e.g., prescription claims) related to patients that are under investigation. Fraudulent entity report 128 can identify doctors, patients, and/or pharmacists that may be committing fraud, for example, to get illegitimate access to controlled drugs.
Fraud-detection server 102 can generate fraudulent entity report 128 for a credit institution by receiving, from a credit institution server 108, fraud-detection rules and transaction information (e.g., loan transactions and/or credit-card purchases) related to customers that are under investigation. Fraudulent entity report 128 can identify customers that are committing credit fraud, or identify legitimate accounts that may have become compromised.
Further, fraud-detection server 102 can generate fraudulent entity report 128 for a merchant by receiving, from a merchant server 110, fraud-detection rules and transaction information (e.g., purchases, returns, and/or exchange transactions) related to customers that are under investigation. Fraudulent entity report 128 can identify customers that may be abusing the merchant's returns policy.
In some embodiments, the system can also receive a set of fraud warnings from a third-party organization desiring to expose fraudulent entities. The third-party organization may identify the fraud warnings in-house, and may generate fraud warnings that do not reveal personal information about the individual entities being investigated. For example, the third-party organization can generate a different unique entity identifier for each entity, and can associate a fraud warning with an entity using the entity's unique identifier.
Recall that not all fraud warnings indicate actual fraudulent behavior. In some embodiments, a fraud warning may indicate that a certain entity has performed a transaction in a way that violates the organization's preferred procedures, or using a procedure that has been previously exploited by others to commit fraud. Once the system has obtained fraud warnings for the suspicious transactions, the system analyzes the fraud warnings to identify a set of entities that are likely to be committing fraud (operation 206).
Otherwise, if there are no more entities to investigate, the system analyzes the plurality of fraud-detection scores to determine a set of entities whose fraud-detection score indicates anomalous behavior (operation 310). In some embodiments, a high-fraud-detection score for a certain entity indicates that a pattern of fraud warnings associated with this entity does not follow a typical pattern of fraud warnings for other typical entities.
fraud_weight(a)=log(T/T(a))  (1)
In equation (1), T indicates a total number of entities, a indicates the fraud type, and T(a) indicates a total number of entities that have at least a predetermined number of fraud warnings associated with fraud type a.
The system then computes a weighted fraud cost, wfc(a,p) (operation 406). The weighted fraud cost indicates a cost due to an entity p committing fraudulent transactions associated with fraud type a, such that the cost is weighted by fraud_weight(a). In some embodiments, the system computes the weighted fraud cost using:
wfc(a,p)=N(a,p)*fraud_weight(a) (2)
In equation (2), N(a,p) indicates an aggregate cost for transactions from entity p that are associated with fraud type a.
The system then determines whether there are more fraud types to consider for entity p (operation 408). If so, the system returns to operation 402 to select another fraud type. Otherwise, the system proceeds to compute a fraud-detection score for the respective entity p by aggregating weighted fraud costs for the plurality of fraud types (operation 410). For example, the system can aggregate the weighted fraud costs using:
fraud_score(p)=Σa∈A wfc(a,p)  (3)
In equation (3), A indicates a plurality of fraud types. The system processes equation (3) to compute the fraud-detection score for entity p by adding the weighted fraud costs for entity p.
In other words, the system computes the fraud-detection score by computing:
fraud_score(p)=Σi N(ai,p)*log(T/T(ai))  (4)
Equation (4) shows how the cost for each fraud type ai is weighted by the inverse frequency of its fraud warnings across an entity population T (e.g., an indication of how infrequent the fraud warning is across T). For example, T(ai) may indicate a number of entities that have received at least one fraud warning of type ai. However, if a large percentage of entities have received such a fraud warning (e.g., T(ai)/T=0.75), then fraud type ai receives a small weight (e.g., weight=log(1.33)=0.12), making fraud type ai less significant than other fraud types that occur less frequently. In contrast, a fraud type detected from approximately 2% of entities (e.g., T(ai)/T=0.02) receives a relatively large weight (e.g., weight=log(50)=1.70).
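Assuming a base-10 logarithm, the two example weights above can be checked numerically; the function name is an assumption for the illustration:

```python
import math

def inverse_frequency_weight(flagged_fraction):
    # weight = log10(T / T(a)), written in terms of the flagged
    # fraction T(a) / T of the entity population.
    return math.log10(1.0 / flagged_fraction)

# A common fraud type (flagged for ~75% of entities) gets a small
# weight, while a rare one (~2% of entities) gets a large weight.
common_weight = inverse_frequency_weight(0.75)  # ~0.12
rare_weight = inverse_frequency_weight(0.02)    # ~1.70
```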
Hence, entity p can receive a larger weighted fraud cost wfc(ai,p), for a fraud type ai, than other entities when: i) fraud type ai occurs infrequently across entity population T; and/or ii) entity p is associated with a fraud count N(ai,p) that is significantly larger than other entities in population T.
In some embodiments, the system can analyze histogram 500 to identify entities whose fraud-detection scores do not fit within a trend of histogram 500. For example, the score interval [0, 0.55] follows a normal decay pattern, and is associated with a set of “normal” entities that may not be engaged in fraudulent transactions. However, bars 508 and 510 indicate that a small number of entities have an anomalous fraud-detection score within the intervals [0.55, 0.6] and [0.65, 0.7], respectively, which does not fit within the normal decay pattern of histogram 500. The system can label these entities within the interval [0.55, 0.7] as “anomalous,” which allows an organization to investigate these entities further to determine whether they are committing fraudulent transactions intentionally.
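The disclosure analyzes the histogram's decay pattern to find outliers; as a simpler illustrative stand-in (an assumption, not the disclosure's method), the sketch below flags entities whose score lies far above the population mean:

```python
import statistics

def anomalous_entities(scores, k=3.0):
    # scores: {entity_id: fraud_detection_score}
    # Flag entities whose score lies more than k population standard
    # deviations above the mean of all scores.
    mean = statistics.mean(scores.values())
    spread = statistics.pstdev(scores.values())
    return sorted(e for e, s in scores.items() if s > mean + k * spread)
```

With twenty entities scoring near 0.1 and one scoring 0.9, only the high-scoring entity is flagged for further investigation.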
In some embodiments, communication module 602 can communicate with an organization's computer system to obtain entity information, transaction information, or any other information that facilitates detecting fraud (e.g., fraud warnings regarding a plurality of entities). Fraud-detecting module 604 can process transactions associated with one or more entities, and can generate a set of fraud warnings based on the fraud-detecting rules. Fraud-warning repository 606 can obtain and/or store fraud warnings for a plurality of entities, and for a plurality of fraud types. A fraud warning indicates a transaction performed by a certain entity, such that the transaction may be associated with a fraud type for a corresponding fraud-detecting rule.
Score-computing module 608 can compute, for a respective entity, a fraud-detection score which indicates a normalized cost of fraudulent transactions from the respective entity. Fraudulent-entity-detecting module 610 can determine, from the plurality of entities, one or more entities whose fraud-detection score indicates anomalous behavior.
Fraud detection system 718 can include instructions, which when executed by computer system 702, can cause computer system 702 to perform methods and/or processes described in this disclosure. Specifically, fraud detection system 718 may include instructions for communicating with an organization's computer system to obtain entity information, transaction information, or any other information that facilitates detecting fraud (communication module 720). Further, fraud detection system 718 can include instructions for processing transactions associated with one or more entities, and for generating a set of fraud warnings based on the fraud-detecting rules (fraud-detecting module 722). Fraud detection system 718 can also include instructions for obtaining and/or storing fraud warnings for a plurality of entities, and for a plurality of fraud types (fraud-warning-manager module 724).
Fraud detection system 718 can include instructions for computing, for a respective entity, a fraud-detection score which indicates a normalized cost of fraudulent transactions from the respective entity (score-computing module 726). Fraud detection system 718 can also include instructions for determining, from the plurality of entities, one or more entities whose fraud-detection score indicates anomalous behavior (fraudulent-entity-detecting module 728).
Data 730 can include any data that is required as input or that is generated as output by the methods and/or processes described in this disclosure. Specifically, data 730 can store at least entity information, transaction information, a fraud-detecting rule, a fraud warning, a fraud-detection score, and/or a fraudulent entity report.
The data structures and code described in this detailed description are typically stored on a computer-readable storage medium, which may be any device or medium that can store code and/or data for use by a computer system. The computer-readable storage medium includes, but is not limited to, volatile memory, non-volatile memory, magnetic and optical storage devices such as disk drives, magnetic tape, CDs (compact discs), DVDs (digital versatile discs or digital video discs), or other media capable of storing code and/or data now known or later developed.
The methods and processes described in the detailed description section can be embodied as code and/or data, which can be stored in a computer-readable storage medium as described above. When a computer system reads and executes the code and/or data stored on the computer-readable storage medium, the computer system performs the methods and processes embodied as data structures and code and stored within the computer-readable storage medium.
Furthermore, the methods and processes described above can be included in hardware modules. For example, the hardware modules can include, but are not limited to, application-specific integrated circuit (ASIC) chips, field-programmable gate arrays (FPGAs), and other programmable-logic devices now known or later developed. When the hardware modules are activated, the hardware modules perform the methods and processes included within the hardware modules.
The foregoing descriptions of embodiments of the present invention have been presented for purposes of illustration and description only. They are not intended to be exhaustive or to limit the present invention to the forms disclosed. Accordingly, many modifications and variations will be apparent to practitioners skilled in the art. Additionally, the above disclosure is not intended to limit the present invention. The scope of the present invention is defined by the appended claims.