The present disclosure relates generally to graph-based relationships and more specifically, but not exclusively, to enhanced detection of fraudulent electronic transactions based on relationship graphs.
Both commercial and law-enforcement organizations have a vital interest in determining whether some attempted financial transactions are fraudulent. In this context, fraudulent means that a purchaser does not have legitimate authority to use funds involved in the transaction. As an example, the purchaser may be using a stolen identity and/or using funds (or privileges) that properly belong to the entity whose identity has been stolen. Such funds include a credit card account, a debit card account, an online banking account, and an e-commerce site-issued online account.
It is difficult to detect all fraudulent transactions using conventional systems and methods. Not only are there several types of fraud, but fraudulent activity is, by its nature, disguised. For example, purchase fraud includes using stolen accounts, creating new accounts, using a false identity, and committing chargeback fraud (e.g., demanding a refund). In other cases, the merchant is the fraud perpetrator: billing excess charges, failing to deliver products, or not existing as a legitimate business. Accordingly, conventional systems for detecting fraudulent transactions typically use an assemblage of methods, each of which detects and assesses some attribute that distinguishes transactions that are likely valid from those that are likely fraudulent.
Various schemes have been used to detect or block fraudulent transactions. These schemes include using a secret password, a biometric identifier, a monetary limit, and examining the pattern of transactions. The first three methods (i.e., using secret passwords, biometric identifiers, and monetary limits) provide simple pass-fail tests. However, passwords and biometrics often are not used for credit card transactions because legitimate customers and vendors find them to be too troublesome or unpleasant. Setting a maximum monetary limit is flawed because it can allow many small fraudulent transactions while also blocking large but legitimate transactions.
A particular problem for current fraud detection systems is adequately following and modeling the complex patterns of online commerce. As online shopping and other transactional activity become easier, more common, and more global, users are engaging in ever more complex transactional patterns with regard to which merchants receive their business. The transactional patterns amount to a social network, with some shoppers referring peer shoppers to particular merchants, and with some shoppers intentionally (or unintentionally) emulating the shopping characteristics of like-minded shoppers. Two merchants can be related, not necessarily because they sell similar products, but because they are both used by many of the same shoppers.
Many common fraud detection schemes perform a time-consuming analysis of their full set of transactional data, to try to define global rules, so that the same rules would apply to all shoppers or financial accounts. This approach not only misses fast-moving changes in shopping behavior, but also fails to allow for the legitimate differences in shopper behavior. If the fraud detection rules are individualized, they typically only include parameterized versions of a single filtering or rule model.
In view of the foregoing, a need exists for an improved system that leverages the constantly changing social network and social role behavior of electronic transactions to better measure the likelihood that a legitimate user would submit a transaction to the specified merchant in an effort to overcome the aforementioned obstacles and deficiencies of conventional fraud detection systems.
It should be noted that the figures are not drawn to scale and that elements of similar structures or functions are generally represented by like reference numerals for illustrative purposes throughout the figures. It also should be noted that the figures are only intended to facilitate the description of the preferred embodiments. The figures do not illustrate every aspect of the described embodiments and do not limit the scope of the present disclosure.
Since currently-available fraud detection systems are deficient because they fail to capture the rapidly changing and complex social network-like nature of purchaser-merchant behavior, a fraud detection system that provides a dynamically evolving merchant relationship graph can prove desirable and provide a basis for a wide range of fraud detection applications, such as the ability to predict the likelihood that a given account would make a transaction with a given merchant. This result can be achieved, according to one embodiment disclosed herein, by a fraud detection system 1000 as illustrated in
Turning to
The fraud detection system 1000 further includes a transaction validator 103 in communication with the transaction server 102. In some embodiments, the transaction server 102 and the transaction validator 103 are components within the same computer. In other embodiments, the transaction server 102 and the transaction validator 103 are disposed within separate computers and communicate with one another via a network connection (not shown).
In some embodiments, the transaction request 100 contains information equivalent to a Financial Account ID (FID), a Merchant ID (MID), and a monetary amount (AMT). In some embodiments, the transaction request 100 contains additional information, such as a timestamp (TIME) and a description of any goods to be exchanged (DESC).
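As an illustration only, the fields of the transaction request 100 could be held in a small record type. The following is a minimal sketch; the class name, the field names, and the choice of integer cents for AMT are assumptions made for this example and are not part of the disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TransactionRequest:
    """Hypothetical container for the fields of transaction request 100."""
    fid: str                      # Financial Account ID (FID)
    mid: str                      # Merchant ID (MID)
    amt_cents: int                # monetary amount (AMT), in cents (assumption)
    time: Optional[float] = None  # optional timestamp (TIME), e.g. epoch seconds
    desc: Optional[str] = None    # optional description of goods (DESC)


# Example: a request carrying only the three required fields.
request = TransactionRequest(fid="f1", mid="m4", amt_cents=2500)
```

Making the record immutable (frozen) reflects that a request is forwarded unchanged from the transaction server 102 to the transaction validator 103; that design choice is likewise an assumption.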
The transaction server 102 receives the transaction request 100 and forwards the transaction request 100 to the transaction validator 103. The transaction validator 103 performs a real-time analysis of the transaction request 100 to predict the likelihood that the transaction request 100 is fraudulent. The transaction validator 103 uses this predicted likelihood to generate a validity assessment 104, which the transaction validator 103 sends back to the transaction server 102. In some embodiments, the validity assessment 104 can have one of two values: “Valid” or “Invalid.” If the validity assessment 104 is “Valid,” then the transaction server 102 proceeds with executing the transaction request 100. If the validity assessment 104 is “Invalid,” then the transaction server 102 will not execute the transaction request 100. If the transaction of transaction request 100 already has been started, the transaction is rolled back. Here, rolling back refers to the reversal of any parts of the transaction operations requested in transaction request 100 that may have already taken place. In either case, the transaction server 102 replies to the client device 101, indicating whether the transaction succeeded or failed.
Turning now to
A validation supervisor 201 is responsible for coordinating the merchant relationship validator 202 with any other transaction validator modules 203 to generate a single response to the transaction server 102. The validation supervisor 201 will forward a copy of the transaction request 100 to the merchant relationship validator 202 and to the other transaction validator modules 203.
The merchant relationship validator 202 creates, maintains, and analyzes the account-merchant relationship table 204 and the merchant relationship table 205 in order to provide enhanced assessment of the fraud risk of transaction request 100. The merchant relationship validator 202 reports its assessment in the form of a merchant relatedness score 200.
The account-merchant relationship table 204 stores a summary of the transactions between financial accounts and merchants. For example, a sample data entry 300 that can be stored in the account-merchant relationship table 204 is shown in
Stated in another way, the collection of one or more sample data entries 300 that can be stored in the account-merchant relationship table 204 is analogous to the account-merchant bipartite graph 304.
Turning to
The merchant relationship table 205 can be a data table having entries that store one or more relationship attributes between two merchants. For example, a sample data entry 500 that can be stored in the merchant relationship table 205 is shown in
Each data entry 500 in merchant relationship table 205 can be interpreted as an edge in a merchant relationship graph 305, as shown in
When the transaction server 102 receives the transaction request 100, the transaction server 102 can perform a full fraud-detecting and transaction-servicing method according to any method described herein, including a fraud detection method 7000 shown in
Turning to
In step 702, the transaction server 102 requests that the transaction validator 103 assess the validity of transaction request 100.
When the transaction validator 103 receives the transaction request 100 from the transaction server 102, the transaction validator 103 begins the validation test 710 (which will be described further below). The transaction validator 103 concludes the validation test 710 by sending its validity assessment 104 to the transaction server 102.
After performing the validation test 710, the transaction server 102 conditionally executes the transaction request (step 720).
Also after performing the validation test 710, the merchant relationship validator 202 then uses the data from the transaction request 100 to update the statistics recorded in the account-merchant relationship table 204 and the merchant relationship table 205 (step 730).
With reference now to
Turning to
If there are other transaction validator modules 203, the validation supervisor 201 issues requests to some or all of the other optional transaction validator modules 203 (step 704). Each of the other transaction validator modules 203 reports its result as a validity score 213. The data type of the validity score 213 may be a Boolean (True or False value), or the data type may be numerical. In some embodiments, each of the validity scores 213 and the merchant relatedness score 200 has two possible values. In some embodiments, the validation supervisor 201 prioritizes the other transaction validator modules 203 such that the merchant relationship validator 202 and some transaction validator modules 203 are always used, and some other transaction validator modules 203 are used subsequently, if and only if the higher priority validation tests do not produce conclusive results. For example, if the validator prioritization aspect of the validation supervisor 201 is implemented in an imperative programming language such as C or Java, then the prioritization can be implemented by using conditional IF statements in sequence. An inconclusive validation test result could be, for example, noticing that the transaction request 100 is with a merchant in a foreign country. The cardholder may be traveling, or the card number may have been stolen.
As discussed above, the optional transaction validator modules 203 verify any aspects of the transaction request 100 other than the account-merchant relationship. Accordingly, the validation test in step 710 continues when the other optional transaction validator modules 203 perform their other validation tests and send validity scores 213 to the validation supervisor 201 (step 714). Other validation tests may include, for example, checking credit card numbers and purchaser identities against lists of known stolen cards and stolen identities.
Subsequently, in step 705, the validation supervisor 201 evaluates the received validity scores 200 and 213 and tries to reach a decision on the validity of the transaction request 100. One embodiment of the decision on the validity of the transaction request 100 in step 705 is shown in
With reference now to
If neither of the two previous cases (from decision block 750 or decision block 752) is true, then at least one of the validity scores 200 and 213 indicates an uncertain risk. In that case, if there are still other transaction validators 203 that have not yet returned a validity score 213 (decision block 754), then the validation supervisor 201 authorizes some or all of the remaining transaction validators 203 (step 766). However, if there are no more remaining transaction validators 203 (and there are some uncertain risk scores), then the validation supervisor 201 must use the available validity scores 200 and 213 to determine whether the transaction request 100 is “Valid” or “Invalid” (step 764). The validation supervisor 201 may use any viable decision method. Methods include having step 764 send a positive validity assessment 104 (optimistic), having step 764 send a negative validity assessment 104 (pessimistic), and/or having step 764 generate a random validity assessment 104 (probabilistic).
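The decision loop described above can be sketched as follows. This is a hedged illustration, not the disclosed implementation: it assumes that each score is one of three values (low risk, high risk, uncertain), that any high-risk score yields “Invalid,” and that all low-risk scores yield “Valid.” The names Risk and decide and the policy strings are hypothetical.

```python
import random
from enum import Enum


class Risk(Enum):
    LOW = "low"
    HIGH = "high"
    UNCERTAIN = "uncertain"


def decide(scores, remaining_validators, request, policy="pessimistic", rng=None):
    """Return "Valid" or "Invalid" from the scores gathered so far.

    remaining_validators is a list of callables, each taking the request and
    returning a Risk; they are consulted only while the result is inconclusive.
    """
    scores = list(scores)
    pending = list(remaining_validators)
    while True:
        if any(s is Risk.HIGH for s in scores):   # assumed "Invalid" case
            return "Invalid"
        if all(s is Risk.LOW for s in scores):    # assumed "Valid" case
            return "Valid"
        if pending:                               # step 766: run another validator
            scores.append(pending.pop(0)(request))
            continue
        # Step 764: no validators left and some score is still uncertain.
        if policy == "optimistic":
            return "Valid"
        if policy == "pessimistic":
            return "Invalid"
        if policy == "probabilistic":
            return (rng or random).choice(["Valid", "Invalid"])
        raise ValueError(f"unknown policy: {policy}")
```

For example, decide([Risk.UNCERTAIN], [], None, policy="optimistic") returns "Valid", while the same call with policy="pessimistic" returns "Invalid".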
Returning to
If the validity assessment 104 is positive (decision block 707), the transaction request 100 is executed (step 708). If the validity assessment 104 is negative (decision block 707), the transaction request 100 is aborted and rolled back if necessary (step 709). As stated earlier, rolling back refers to the reversal of any parts of the transaction operations requested in transaction request 100 that may have already taken place.
In step 730, the merchant relationship validator 202 uses data from the transaction request 100 to update statistics in the account-merchant relationship table 204 and merchant relationship table 205. In some embodiments, the step 730 may occur in parallel with step 720.
As discussed above, the merchant relationship validator 202 can use the edge-weighted merchant relationship graph 305 to compute the cumulative relatedness between the merchant mx and the set of merchants previously used by customer account fi in step 703. In some embodiments, the merchant relationship table 205 stores the commonality 501 between merchants for the cases of one degree of separation. The commonality score 501 takes into account only first-degree connections between merchants. That is, a first-degree connection between merchants mx and my occurs when an account has transacted with both mx and my. However, more distant relationships between merchants are possible. For example, while no single account may have transacted with both merchants m1 and m4, there may be some accounts that have transacted with both m1 and m2, and a disjoint set of accounts which have transacted with both m2 and m4. Therefore, there exists some transitive relatedness between m1 and m4. With reference to
m1→m2→m4 (path 1)
m1→m3→m4 (path 2)
m1→m2→m3→m4 (path 3)
The merchant relationship validator 202 extracts all eligible paths from the customer Account fi's past merchants to the current merchant mx in the merchant relationship graph 305 and applies a path aggregation method, which combines all the eligible paths to produce a merchant relatedness score 200. Any path aggregation method can be used. For example, in one embodiment, the path aggregation method is as follows:
The path aggregation method comprises mathematical rules which specify, for each individual path or set of paths P between merchants mx and my, a function value f(mx, my, P). The merchant relatedness score 200 for (mx, my) is equal to the function value for the collection of all eligible paths from mx to my.
To compute the merchant relatedness score 200 between merchants mx and my, the merchant relationship validator 202 starts by computing the function values for the individual edges which comprise the full set of eligible paths from mx to my. The merchant relationship validator 202 continues by computing the function values for longer paths and for paths in parallel with one another, until the merchant relationship validator 202 has computed the function value for the set of all eligible paths between mx and my. To compute the function values for individual edges and for larger groupings of edges, the merchant relationship validator 202 applies the following four rules:
1. Single edge: The function value of a single edge is the edge's merchant commonality score 501. A single edge is a path collection containing one path of length 1.
2. Series aggregation: If a path collection P1(mx, my) is appended to a path collection P2(my, mz) to make a longer path collection Q(mx, mz), the function value of Q(mx, mz) is less than or equal to either the function value P1(mx, my) or the function value P2(my, mz). For example, one particular rule is Q(mx, mz) = min(P1(mx, my), P2(my, mz)) − 1.
3. Parallel aggregation: If a path collection P1(mx, my) is merged with another path collection P2(mx, my) to make a larger collection P3(mx, my), then the function value P3(mx, my) is greater than or equal to either the function value P1(mx, my) or the function value P2(mx, my). For example, one particular rule is P3(mx, my) = max(P1(mx, my), P2(mx, my)) + 1.
4. Maximum value for path of length 0: If a starting vertex (e.g., a past merchant of fi) is the same as the destination vertex (e.g., the current merchant mx), then this is a path with length 0. This path's function value is greater than that of any path with nonzero length. In one embodiment, the merchant relationship validator 202 first computes the function value for the aggregation of all the nonzero length paths. Then, if there is a zero length path, the merchant relatedness score 200 is set to be even higher than the function value of the set of non-zero length paths. Alternately, in another embodiment, if an account fi has previously transacted with the merchant mx, the merchant relationship validator 202 can automatically assign the transaction request 100 a very high relatedness score 200. The merchant relationship validator 202 would not need to consider any non-zero paths to compute the relatedness score 200.
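The four rules above can be illustrated with the example rules min(...) − 1 for series aggregation and max(...) + 1 for parallel aggregation. The sketch below applies them to paths m1→m2→m4 and m1→m3→m4. The edge weights are invented for the example, and representing the zero-length path by positive infinity is an assumption.

```python
import math


def series(p1: float, p2: float) -> float:
    """Rule 2 (example form): the result is no greater than either input."""
    return min(p1, p2) - 1


def parallel(p1: float, p2: float) -> float:
    """Rule 3 (example form): the result is no less than either input."""
    return max(p1, p2) + 1


# Rule 4: a zero-length path outranks every nonzero-length path (assumed encoding).
ZERO_LENGTH_VALUE = math.inf

# Rule 1: each edge's function value is its commonality score 501.
# These weights are hypothetical.
weights = {("m1", "m2"): 5, ("m2", "m4"): 3, ("m1", "m3"): 4, ("m3", "m4"): 4}

path_1 = series(weights[("m1", "m2")], weights[("m2", "m4")])  # min(5, 3) - 1
path_2 = series(weights[("m1", "m3")], weights[("m3", "m4")])  # min(4, 4) - 1
relatedness = parallel(path_1, path_2)                         # max(2, 3) + 1
```

With these weights, path_1 is 2, path_2 is 3, and the aggregate relatedness is 4. Note that with these particular example rules, the result of folding three or more values can depend on the order in which they are combined, so a fuller implementation would need to fix an order.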
As previously discussed, the merchant relationship validator 202 determines whether a path between two vertices is an eligible path. In the preferred embodiment, all paths that are shorter than a predefined limit are eligible. Using a predefined limit advantageously avoids an excessive number of paths. In another embodiment, only shortest paths from the start vertex to the end vertex are eligible. Longer paths are not eligible. To better understand these two embodiments for limiting the eligible paths, consider a 5-vertex graph in which every vertex has a direct connection to every other vertex. The vertices in this example are A, B, C, D, and E. Because the graph is fully connected, any sequence of these letters which begins with A, ends with B, and does not repeat a letter corresponds to a path between A and B. In an embodiment in which only shortest paths are eligible, AB would be the only eligible path. In an embodiment in which all paths with up to 2 edges are eligible, AB, ACB, ADB, and AEB are the eligible paths.
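The two eligibility rules can be checked on the 5-vertex example with a short enumeration. This is a minimal sketch that assumes a fully connected graph and considers only paths that do not revisit a vertex; the function names are hypothetical.

```python
from itertools import permutations


def paths_up_to(vertices, start, end, max_edges):
    """Simple paths from start to end in a complete graph, using at most max_edges edges."""
    inner = [v for v in vertices if v not in (start, end)]
    found = []
    for k in range(max_edges):            # k intermediate vertices means k + 1 edges
        for middle in permutations(inner, k):
            found.append(start + "".join(middle) + end)
    return found


def shortest_only(paths):
    """Keep only the paths with the fewest edges."""
    fewest = min(len(p) for p in paths)
    return [p for p in paths if len(p) == fewest]


vertices = "ABCDE"
limited = paths_up_to(vertices, "A", "B", max_edges=2)
shortest = shortest_only(paths_up_to(vertices, "A", "B", max_edges=4))
```

Here limited is ["AB", "ACB", "ADB", "AEB"] and shortest is ["AB"], matching the example. Without any length limit there are 16 such paths from A to B in this graph, which shows why a limit helps.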
An example of the path aggregation method is illustrated in
The overall path aggregation computation is analogous to determining the conductance in an electrical network. Electrical resistance—the inverse of electrical conductance—is easy to calculate and handles the special case of a zero-length path (e.g., when the account fi has transacted with the merchant mx before) by simply assigning a resistance of 0. In particular, electrical resistance follows these rules:
1. The resistance of a merchant-merchant edge is the inverse of the edge's weight.
2. When there are two parallel paths, the net resistance is the inverse of the sum of the inverses of the individual path resistances.
3. When a path consists of two subpaths in series, the net resistance is the sum of the resistances of the subpaths.
4. The resistance from a point to itself is 0.
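The resistance analogy can be sketched using the standard circuit rules, in which series resistances add and parallel conductances add. The edge weights below are invented for the example, and the function names are hypothetical.

```python
def edge_resistance(weight: float) -> float:
    """Resistance of a merchant-merchant edge is the inverse of its weight."""
    return 1.0 / weight


def series(*resistances: float) -> float:
    """Subpaths in series: resistances add."""
    return sum(resistances)


def parallel(*resistances: float) -> float:
    """Parallel paths: inverse of the sum of the inverses."""
    if any(r == 0 for r in resistances):
        return 0.0                        # a zero-resistance branch dominates
    return 1.0 / sum(1.0 / r for r in resistances)


# Hypothetical weights: m1-m2 and m2-m4 have weight 4; m1-m3 and m3-m4 have weight 2.
path_1 = series(edge_resistance(4), edge_resistance(4))   # 0.25 + 0.25
path_2 = series(edge_resistance(2), edge_resistance(2))   # 0.5 + 0.5
net = parallel(path_1, path_2)                            # 1 / (2 + 1)

# A zero-length path (the account has used this merchant before) has resistance 0.
with_direct = parallel(0.0, net)
```

In this example the net resistance is 1/3, lower than either individual path, and adding a zero-length path drives it to 0, the strongest possible relatedness under this interpretation.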
In step 703 shown in
In some embodiments, the merchant relatedness score 200 can have a fixed range of values (e.g., from 1 to 10) or be open-ended (i.e., with either no upper limit, no lower limit, or both). One possible interpretation of the value of the merchant relatedness score 200 is that higher scores indicate a stronger relationship and lower risk. Any scoring system is acceptable, as long as the merchant relationship validator 202 and the validation supervisor 201 are based on the same interpretation.
In an alternative embodiment (e.g., employing the electrical resistance analogy discussed above), a score of 0 can be interpreted as maximum relatedness while increasing values can be interpreted as weaker relatedness.
The strongest possible relationship is when the account fi in question has transacted with the merchant mx in question several times before. In the resistance network embodiment, this strong direct relationship corresponds to a resistance of 0, which can be directly assigned a merchant relatedness score 200 of 0.
The aforesaid method for computing a path aggregation value provides a method for computing the merchant relatedness score 200 between any two merchants in the merchant relationship graph 305. When the client device 101 submits a transaction request 100, what the client needs to know, however, is not a relatedness between two merchants but a fraud risk between an account and a merchant. To complete step 703, the merchant relationship validator 202 computes not just one merchant relatedness score 200 but a merchant relatedness score 200 between the merchant mx of the transaction request 100 and each of the merchants previously used by the account fi of the transaction request 100. If fi has previously transacted with fifteen merchants, then the merchant relationship validator 202 computes up to fifteen merchant relatedness scores 200.
The merchant relationship validator 202 may not need to compute all fifteen scores. If the score for one past merchant is sufficient for a positive validity assessment 104, then the validation supervisor 201 can terminate the validation test 710.
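The early termination described above might be sketched as a loop over the past merchants of an account that stops at the first sufficiently related merchant. The relatedness function, the threshold, and all names here are assumptions made for illustration, with higher scores taken to mean stronger relatedness.

```python
def first_sufficient_score(past_merchants, current_merchant, relatedness, threshold):
    """Return (found, scores_computed), stopping at the first score >= threshold.

    relatedness is a caller-supplied function (past, current) -> score.
    """
    computed = 0
    for past in past_merchants:
        computed += 1
        if relatedness(past, current_merchant) >= threshold:
            return True, computed         # no need to score the remaining merchants
    return False, computed


# Hypothetical precomputed scores against the current merchant "mx".
example_scores = {"m1": 0.2, "m2": 0.9, "m3": 0.95}
found, computed = first_sufficient_score(
    ["m1", "m2", "m3"], "mx", lambda past, current: example_scores[past], threshold=0.8
)
```

In this example the loop stops after two of the three possible scores, because the second past merchant already meets the threshold.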
Returning to
The merchant relationship validator 202 next updates the merchant relationship table 205. A transaction with mx potentially affects the commonality score 501 between mx and each other past merchant of fi.
The following section describes several embodiments for the commonality score 501 (e.g., the edge weight) between two merchants. The commonality between two merchants mx and my (as distinguished from the cumulative, transitive relatedness) can be the sum of the commonalities contributed by each financial account:
sim(mx, my)=Σ{all accounts fi} C_fi (mx, my)
where C_fi (mx, my) is the commonality score 501 contributed by the account fi.
The number of transactions between fi and mx is notated as n(fi, mx). This number can be read directly from the account-merchant relationship table 204.
A simple commonality score 501 for one account fi is the lesser of n(fi, mx) and n(fi, my). That is, the relationship score is the lesser of the number of fi's transactions with mx and the number of fi's transactions with my. The net commonality score 501 sim(mx, my) is the sum of the account-specific relationship scores, taken over all accounts. This scoring scheme has the advantage of being simple.
The contribution from account fi is C1_fi (mx, my)=min(n(fi, mx), n(fi, my))
In an alternative embodiment, the relationship can represent an average. Specifically, whether the value n(fi, mx) is large or not is relative. One way to account for this relativity is to compare n(fi, mx) to the total number of transactions by user fi. Accordingly, the relatedness is the commonality score 501 determined above divided by the total number of transactions enacted by fi:
C2_fi (mx, my)=C1_fi (mx, my)/n(fi) where n(fi)=total number of transactions enacted by fi.
In yet another alternative, the relative importance of the number of transactions considers the total number of transactions with a given merchant. If one merchant is extremely popular, then the fact that an account has transacted with that merchant several times should not carry much significance. In this embodiment, the number of transactions with each merchant is divided by the logarithm of the total number of transactions with that merchant by any account, n(m). The logarithm is used because the range of values for n(m) can span many orders of magnitude, and the logarithm will compress the range. However, another compression function or no compression function at all can be used.
C3_fi (mx, my)=min(r(fi, mx), r(fi, my))
where r(f, m)=n(f, m)/log n(m)
and n(m)=total number of transactions with merchant m.
Each of the commonality scores 501 above, C1_fi, C2_fi, and C3_fi, is the contribution of a single financial account. The total direct relatedness between two merchants is computed by adding the scores from each individual financial account.
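The three per-account contributions and their sum over accounts can be sketched as follows. The transaction counts are invented for the example, the natural logarithm is an assumption (the text does not fix a base), and all function names are hypothetical.

```python
import math

# Hypothetical account-merchant counts n(f, m).
counts = {
    ("f1", "m1"): 3, ("f1", "m2"): 1,
    ("f2", "m1"): 2, ("f2", "m2"): 5, ("f2", "m3"): 1,
    ("f3", "m3"): 4,
}
accounts = sorted({f for f, _ in counts})


def n(f, m):
    return counts.get((f, m), 0)


def n_account(f):
    """Total number of transactions enacted by account f."""
    return sum(c for (a, _), c in counts.items() if a == f)


def n_merchant(m):
    """Total number of transactions with merchant m by any account."""
    return sum(c for (_, b), c in counts.items() if b == m)


def c1(f, mx, my):
    return min(n(f, mx), n(f, my))


def c2(f, mx, my):
    return c1(f, mx, my) / n_account(f)


def r(f, m):
    # Caveat: log n(m) is 0 when n(m) == 1, so such merchants need special handling.
    return n(f, m) / math.log(n_merchant(m))


def c3(f, mx, my):
    return min(r(f, mx), r(f, my))


def sim(contribution, mx, my):
    """Sum one per-account contribution over all accounts."""
    return sum(contribution(f, mx, my) for f in accounts)


sim_c1 = sim(c1, "m1", "m2")   # 1 + 2 + 0
sim_c2 = sim(c2, "m1", "m2")   # 1/4 + 2/8 + 0
sim_c3 = sim(c3, "m1", "m2")   # 1/log(6) + 2/log(5) + 0
```

With these counts, sim_c1 is 3 and sim_c2 is 0.5, which shows how normalizing by each account's total activity changes the scale of the result.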
In yet another alternative embodiment, the commonality score 501 between two merchants can consider each merchant as possessing a set of accounts. Accordingly, the commonality score 501 is determined by measuring the degree that these two sets overlap. For example, if F(mx) is the set of accounts which have transacted with a merchant mx, the relationship score is the number of accounts which mx and my have in common, divided by the number of accounts that mx and my each have when considered separately:
sim(mx, my)=|F(mx)∩F(my)|/[|F(mx)|+|F(my)|]
In this embodiment, sim(mx, my) is not simply the sum of contributions from each account. Instead, it is necessary to compute the F(m) sets, to count their members, and to perform a set intersection operation. F(m) can be computed by selecting and combining the records in the account-merchant relationship table 204 which reference a particular merchant m. When the number of merchants and accounts is large, a preferred embodiment performs the computation efficiently through distributed computation. Stated in another way, the merchant relationship validator 202 may contain multiple processing units, each responsible for a subset of the merchants or accounts.
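A single-process sketch of the set-overlap score follows. It takes the accounts that two merchants have in common to be the set intersection, per the prose description, and uses an invented account-merchant table; the function names are hypothetical.

```python
def accounts_of(table, merchant):
    """F(m): the set of accounts with at least one transaction at merchant."""
    return {f for (f, m), count in table.items() if m == merchant and count > 0}


def sim_overlap(table, mx, my):
    """Accounts in common divided by the two account-set sizes added together."""
    fx, fy = accounts_of(table, mx), accounts_of(table, my)
    denominator = len(fx) + len(fy)
    return len(fx & fy) / denominator if denominator else 0.0


# Hypothetical contents of the account-merchant relationship table 204.
table = {
    ("f1", "m1"): 3, ("f1", "m2"): 1,
    ("f2", "m1"): 2, ("f2", "m2"): 5, ("f2", "m3"): 1,
    ("f3", "m3"): 4,
}

overlap_m1_m2 = sim_overlap(table, "m1", "m2")   # 2 shared accounts / (2 + 2)
overlap_m1_m3 = sim_overlap(table, "m1", "m3")   # 1 shared account / (2 + 2)
```

Under this formula the score reaches its maximum of 0.5 when the two account sets are identical, as for m1 and m2 here.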
Relationship Scores Using Merchant and Transaction Attributes
In addition to the number of transactions which share the same FID, the monetary size and the recency of shared transactions can also be useful contributors to risk assessment. In some embodiments, as shown in
As an example, in one embodiment, the commonality score 501 between merchants mx and my is the total dollar amount transacted by the common accounts with those two merchants, divided by the total amount transacted by any accounts with these merchants. If amt(fi, mx) is the total amount transacted by account fi with merchant mx, and amt(mx) is the total amount transacted with merchant mx (by any account), then
sim(mx, my)=[Σ{each account fi in (F(mx)∩F(my))} (amt(fi, mx)+amt(fi, my))]/[amt(mx)+amt(my)]
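The amount-based score can be sketched in the same way. The amounts below are invented, the common accounts are taken to be those that have transacted with both merchants, and the function names are hypothetical.

```python
# Hypothetical totals amt(f, m), in whole currency units.
amounts = {
    ("f1", "m1"): 100, ("f1", "m2"): 50,
    ("f2", "m1"): 300,
    ("f3", "m2"): 150,
}


def amt(f, m):
    return amounts.get((f, m), 0)


def amt_merchant(m):
    """Total amount transacted with merchant m by any account."""
    return sum(v for (_, b), v in amounts.items() if b == m)


def sim_amount(mx, my):
    fx = {f for (f, m) in amounts if m == mx}
    fy = {f for (f, m) in amounts if m == my}
    shared = sum(amt(f, mx) + amt(f, my) for f in fx & fy)
    total = amt_merchant(mx) + amt_merchant(my)
    return shared / total if total else 0.0


amount_score = sim_amount("m1", "m2")   # (100 + 50) / (400 + 200)
```

Only f1 has used both merchants, so the score is 150/600, or 0.25.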
Any suitable method to record and measure transaction age or recency can be used as desired. In one embodiment, transactions are assigned a weight that decreases with age. In other embodiments, a strict time limit is specified: transactions older than a set duration are not considered at all in computing the statistics in the account-merchant relationship table 204 and in the merchant relatedness score 200. The gradual aging and the strict time limit can be applied together or independently.
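One way to combine gradual aging with a strict time limit is sketched below. Exponential decay with a half-life is only one possible decreasing weight and is an assumption here, as are the parameter names.

```python
def age_weight(age_days, half_life_days=None, max_age_days=None):
    """Weight of a transaction given its age.

    max_age_days applies the strict limit (older transactions count as 0);
    half_life_days applies gradual aging; either may be used alone.
    """
    if max_age_days is not None and age_days > max_age_days:
        return 0.0
    if half_life_days is None:
        return 1.0
    return 0.5 ** (age_days / half_life_days)


recent = age_weight(0, half_life_days=30, max_age_days=365)      # full weight
one_half_life = age_weight(30, half_life_days=30)                # half weight
expired = age_weight(400, half_life_days=30, max_age_days=365)   # past the limit
```

A weighted transaction count could then replace n(fi, mx) in the commonality scores by summing these weights instead of counting each transaction as 1.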
In some embodiments, the relative location of merchants may be included in the risk assessment. Physical proximity increases the strength of the relationship between two merchants. If an account has previously transacted with many merchants that are physically close to the proposed merchant, then the risk may be deemed lower.
Lower and Upper Score Thresholds
If an account fi has previously transacted with N merchants and then transacts with one additional merchant, this potentially adds N new entries to the merchant relationship table 205 and to its analogous merchant relationship graph 305. If N is a large number, then this is a large increase in table entries in response to one new transaction. To limit this increase, in some embodiments, a minimum threshold is set for the value of the commonality score 501. The value is only recorded in the merchant relationship table 205 if the value is at least as large as the threshold.
In some embodiments, the fraud detection system 1000 may define an upper threshold for commonality score 501, meaning that if the score is higher than this level, then the risk is considered negligible and no further computation is needed. The merchant relationship validator 202 may define a special value to represent the upper threshold. Once a merchant-merchant pair (mx, my) achieves this score, additional nonfraudulent transactions with either mx or my will not affect this score.
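The lower and upper thresholds might be applied when writing to the merchant relationship table 205 as sketched below. Using positive infinity as the special value, keying the table by a sorted merchant pair, and the function name are all assumptions.

```python
SATURATED = float("inf")   # hypothetical special value for the upper threshold


def record_commonality(table, mx, my, score, lower, upper):
    """Store score for the pair (mx, my), honoring both thresholds."""
    key = tuple(sorted((mx, my)))
    if table.get(key) == SATURATED:
        return                    # pair already at the cap; later scores are ignored
    if score > upper:
        table[key] = SATURATED    # risk considered negligible from now on
    elif score >= lower:
        table[key] = score        # large enough to be worth an entry
    # Scores below the lower threshold are not recorded at all.


table = {}
record_commonality(table, "m2", "m1", 0.5, lower=1.0, upper=10.0)
after_small = dict(table)
record_commonality(table, "m2", "m1", 2.0, lower=1.0, upper=10.0)
after_medium = dict(table)
record_commonality(table, "m1", "m2", 11.0, lower=1.0, upper=10.0)
record_commonality(table, "m1", "m2", 3.0, lower=1.0, upper=10.0)
```

After these calls, the first score creates no entry, the second is stored as 2.0, and the pair then saturates and stays saturated.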
Alternative Method for Updating Statistics in Relationship Tables
In the embodiment disclosed with reference to
Turning to
In an alternative embodiment, there is not a separate executed transaction queue 210. Instead, the transaction server 102 is coupled to or contains a transaction log 150, as shown in
Rather than having and maintaining a separate queue, this embodiment requires a single memory value, for example, an update pointer 152. The update pointer 152 records the location within the transaction log 150 of the last transaction that was used to update the account-merchant relationship table 204 and the merchant relationship table 205. Periodically, the update pointer 152 is accessed, and the sequence of transactions from the location of the update pointer 152 to the most current is read and used to perform several updates to the relationship tables (tables 204 and 205). The update pointer 152 is then relocated to the end of the transaction log 150.
For example, when the fraud detection system 1000 is used for the first time, the update pointer 152 points to item 0 in the transaction log 150, because no transactions have been recorded in the relationship tables (tables 204 and 205). Suppose that after fifty transactions transpire, the validation supervisor 201 and the transaction server 102 agree to update the account-merchant relationship table 204 and the merchant relationship table 205. The fifty transaction requests are sent from the transaction log 150 to the merchant relationship validator 202. The merchant relationship validator 202 uses these transaction requests from the transaction log 150 in order to update the tables 204 and 205. The update pointer 152 is repositioned to point at item fifty in the transaction log, that is, at the point in the sequential log between those transactions that have gone through tables update in step 730 and those that have not yet.
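The update pointer mechanism can be sketched with an in-memory list standing in for the transaction log 150. The class name, the method name, and the reduction of the table update to a simple count are assumptions made for illustration.

```python
from collections import Counter


class RelationshipUpdater:
    """Hypothetical holder of the update pointer 152 and a stand-in for table 204."""

    def __init__(self):
        self.update_pointer = 0            # index of the first unprocessed log item
        self.account_merchant = Counter()  # counts n(f, m)

    def catch_up(self, transaction_log):
        """Apply all log entries at or after the pointer, then advance it."""
        pending = transaction_log[self.update_pointer:]
        for fid, mid in pending:
            self.account_merchant[(fid, mid)] += 1
        self.update_pointer = len(transaction_log)
        return len(pending)


log = [("f1", "m1")] * 50                 # fifty executed transactions
updater = RelationshipUpdater()
first_batch = updater.catch_up(log)       # processes items 0 through 49
pointer_after_first = updater.update_pointer
log.extend([("f2", "m1"), ("f2", "m2"), ("f1", "m1")])
second_batch = updater.catch_up(log)      # processes only the three new items
```

After the first call the pointer sits at item fifty, as in the example above, and the second call reads only the transactions logged since then.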
The described embodiments are susceptible to various modifications and alternative forms, and specific examples thereof have been shown by way of example in the drawings and are herein described in detail. It should be understood, however, that the described embodiments are not to be limited to the particular forms or methods disclosed, but to the contrary, the present disclosure is to cover all modifications, equivalents, and alternatives.
This application claims priority to United States provisional patent application, Ser. No. 62/018,250, filed on Jun. 27, 2014. Priority to the provisional application is expressly claimed, and the disclosure of the application is hereby incorporated herein by reference in its entirety and for all purposes.
Number | Date | Country
---|---|---
62018250 | Jun 2014 | US