Law enforcement authorities may sometimes lack the name of an individual suspected of wrongdoing, yet have some information concerning where the individual was located at particular times in the past. In such situations, it would be desirable for the authorities to be able to gain leads as to the individual's identity.
In a situation that differs from the above in some ways but is analogous in others, commercial enterprises may have information about where some or all of their customers were located at certain times. The commercial enterprises may find it desirable to learn more about those customers in order to enhance or focus their marketing efforts.
The present inventors have now recognized that payment card transaction data may be analyzed relative to a location profile of a known or unknown individual. Such an analysis may lead to possible or likely identities for an unknown individual by comparing the individual's known or suspected past whereabouts with the geographic and temporal patterns of purchase transactions with particular payment account cards. Such an analysis may also produce information about merchants' customers that may augment what the merchants already know about their customers.
Features and advantages of some embodiments of the present disclosure, and the manner in which the same are accomplished, will become more readily apparent upon consideration of the following detailed description of the disclosure taken in conjunction with the accompanying drawings, which illustrate preferred and exemplary embodiments and which are not necessarily drawn to scale, wherein:
Embodiments of the present invention relate to systems and methods for analyzing transaction data and data indicative of individuals' locations at various times. More particularly, embodiments relate to systems and methods for comparing location and time profiles for individuals with transaction profiles for individual holders of payment card accounts, in order to potentially match transaction profiles with location and time profiles.
A number of terms are used herein. For example, the term “location and time profile” refers to a set of data in which locations are paired with points in time to indicate where an individual (known or unknown) was located at those points in time.
The term “transaction profile” refers to a set of data that reflects payment card account transactions consummated by use of a particular payment card account.
The terms “de-identified data” or “de-identified data sets” are used to refer to data or data sets which have been processed or filtered to remove any personally identifiable information (“PII”). The de-identification may be performed in any of a number of ways, although in some embodiments, the de-identified data may be generated using a filtering process which removes PII and associates a de-identified unique identifier (or de-identified unique “ID”) with each record (as will be described further below).
The term “non-identified data” is used to refer to data that has never been associated with PII for an individual to which it pertains. One example of non-identified data may be data indicative of the whereabouts or possible whereabouts of an unknown individual at certain points in time in the past.
The term “payment card network” or “payment network” is used to refer to a payment network or payment system such as the systems operated by MasterCard International Incorporated (which is the assignee hereof), or other networks which process payment transactions on behalf of a number of merchants, issuers and cardholders. The terms “payment card network data” or “network transaction data” are used to refer to transaction data associated with payment transactions that have been processed over a payment network. For example, network transaction data may include a number of data records associated with individual payment transactions that have been processed over a payment card network. In some embodiments, network transaction data may include information identifying a payment device or account, transaction date and time, transaction amount, information identifying a merchant or merchant category, and a location at which the transaction occurred. Additional transaction details may be available in some embodiments.
The data analysis system 100 may include a source 102 of network transaction data produced in and stored by a conventional payment network (not shown) in connection with payment card account transactions handled by the payment network. The transaction data may be in the form of transaction profiles, or may be processed so as to be in that form. Each transaction profile may represent transactions performed using a particular payment card account.
Also shown in
Block 106 in
Block 108 in
Also shown in
Features of some embodiments of the present invention will now be described with reference to
The data analysis system 100 includes a matching/probabilistic engine 202 to generate reports and analyses associated with data matched by the matching/probabilistic engine 202. In some embodiments, the matching/probabilistic engine 202 receives or analyzes data from several data sources, including transaction data 204 (which may come from the transaction data source 102 shown in
Furthermore, at block 212, the transaction data 204 may be anonymized by removing any PII therefrom. For example, the anonymizing block 212 may substitute a de-identified unique identifier code for the PII that was associated with each transaction profile before anonymization. In some embodiments, the PII may be a PAN (primary account number) for the corresponding payment card account, and the de-identified unique identifier code may be generated by applying a function to the PAN. The function may be, for example, a hash function or the like. The anonymizing block 212 may generate a lookup table 214 to link the de-identified unique identifier for each transaction profile to the PAN or other PII originally associated with the transaction profile before it was anonymized. Consequently, in some embodiments, the transaction data as provided to the matching/probabilistic engine 202 may be de-identified data.
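By way of illustration only, the anonymizing step described above might be sketched as follows. The sketch assumes a keyed SHA-256 hash (HMAC) as the function applied to the PAN; the disclosure says only "a hash function or the like", so the choice of HMAC, the function names `deidentify_pan` and `anonymize_profiles`, and the dictionary layout are all hypothetical.

```python
import hashlib
import hmac


def deidentify_pan(pan: str, secret_key: bytes) -> str:
    """Derive a de-identified unique ID from a PAN.

    Assumption: a keyed SHA-256 hash (HMAC) stands in for the
    "hash function or the like" mentioned in the description.
    """
    return hmac.new(secret_key, pan.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize_profiles(profiles: dict, secret_key: bytes):
    """Replace each PAN key with a de-identified ID.

    profiles maps PAN -> list of transaction records (hypothetical layout).
    Returns (deidentified_profiles, lookup_table), where lookup_table
    links each de-identified ID back to the original PAN.
    """
    deidentified = {}
    lookup_table = {}
    for pan, transactions in profiles.items():
        uid = deidentify_pan(pan, secret_key)
        deidentified[uid] = list(transactions)  # copy; no PII retained here
        lookup_table[uid] = pan                 # kept separately from the engine
    return deidentified, lookup_table
```

In such a sketch, only `deidentified` would be passed to the matching analysis, while `lookup_table` would be held apart and consulted only if re-identification of a matched profile is permitted.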
At block 216, the location/time data may be pre-processed to place it in a correct format for the matching/probabilistic engine 202 and/or to remove unnecessary data elements.
In some embodiments, the matching/probabilistic engine 202 may operate to perform an inferred match analysis to assess an inferred linkage between the location/time data and the transaction data. The inferred match analysis may be based in part on the portion of the transaction data that indicates the dates/times/locations of the transactions.
As used herein, a module of executable code could be a single instruction, or many instructions, and may even be distributed over several different code segments, among different programs, and across several memory devices. Similarly, operational data may be identified and illustrated herein within modules, and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices, and may exist, at least partially, merely as electronic signals on a system or network. In addition, entire modules, or portions thereof, may also be implemented in programmable hardware devices such as field programmable gate arrays, programmable array logic, programmable logic devices or the like or as hardwired integrated circuits.
In some embodiments, the modules of
The matching/probabilistic engine 202 may be operated to establish a linkage between the location/time data and the transaction data. In some embodiments, the linkage may be a probability score or other scoring measure that indicates, as between a location and time profile and a transaction profile, how likely it is that the two profiles correspond to the same individual. Examples of suitable analytic techniques will be discussed below in connection with
Location and time profiles and transaction profiles may be linked in a many-to-many fashion and given some level of probability and/or a score and/or a classification label for each pattern match (e.g., 100 location and time profiles and 100 transaction profiles could result in 10,000 probabilities or the like).
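The many-to-many linkage may be pictured with a short sketch in which every location and time profile is scored against every transaction profile. The scorer used here (a Jaccard overlap of location/time pairs) is only a hypothetical placeholder and is not a technique named in this description; the function names and the dictionary layout are likewise assumptions.

```python
from itertools import product


def overlap_score(location_time_pairs, transaction_pairs):
    """Hypothetical placeholder scorer: Jaccard overlap of two sets of
    (location, time) pairs. Any other scoring measure could be substituted."""
    a, b = set(location_time_pairs), set(transaction_pairs)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_all_pairs(location_profiles, transaction_profiles, scorer=overlap_score):
    """Score every combination of location/time profile and transaction profile.

    Both arguments map a profile ID to a list of (location, time) pairs.
    Returns {(location_profile_id, transaction_profile_id): score}.
    """
    return {
        (lp_id, tp_id): scorer(lp_pairs, tp_pairs)
        for (lp_id, lp_pairs), (tp_id, tp_pairs) in product(
            location_profiles.items(), transaction_profiles.items()
        )
    }
```

With 100 profiles on each side, `score_all_pairs` returns 10,000 entries, one per pairing.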
The profile matching computer 302 may be conventional in its hardware aspects but may be controlled by software to cause it to function as described herein. In some embodiments, functionality disclosed herein may be distributed among two or more computers having a hardware architecture similar to that described below.
The profile matching computer 302 may include a computer processor 300 operatively coupled to a communication device 301, a storage device 304, an input device 306 and an output device 308.
The computer processor 300 may be constituted by one or more conventional processors. Processor 300 operates to execute processor-executable steps, contained in program instructions described below, so as to control the profile matching computer 302 to provide desired functionality.
Communication device 301 may be used to facilitate communication with, for example, other devices (such as sources of location/time data and transaction data). For example, communication device 301 may comprise one or more communication ports (not separately shown), to allow the profile matching computer 302 to communicate with other computers and other devices.
Input device 306 may comprise one or more of any type of peripheral device typically used to input data into a computer. For example, the input device 306 may include a keyboard and a mouse. Output device 308 may comprise, for example, a display and/or a printer.
Storage device 304 may comprise any appropriate information storage device, including combinations of magnetic storage devices (e.g., hard disk drives), optical storage devices such as CDs and/or DVDs, and/or semiconductor memory devices such as Random Access Memory (RAM) devices and Read Only Memory (ROM) devices, as well as so-called flash memory. Any one or more of such information storage devices may be considered to be a computer-readable storage medium or a computer usable medium or a memory.
Storage device 304 stores one or more programs for controlling processor 300. The programs comprise program instructions (which may be referred to as computer readable program code means) that contain processor-executable process steps of the profile matching computer 302, executed by the processor 300 to cause the profile matching computer 302 to function as described herein.
The programs may include one or more conventional operating systems (not shown) that control the processor 300 so as to manage and coordinate activities and sharing of resources in the profile matching computer 302, and to serve as a host for application programs (described below) that run on the profile matching computer 302.
The programs stored in the storage device 304 may also include, in some embodiments, a data preparation program 310 that controls the processor 300 to enable the profile matching computer 302 to perform operations on data received by the profile matching computer 302 to place the data in an appropriate condition for subsequent analysis and profile matching. For example, and as will be described in more detail below, location and/or geographic information may be translated into a standard format to facilitate detection of possible matches between location/time profiles and transaction profiles.
Another program that may be stored in the storage device 304 is profile matching and scoring application program 312 that controls the processor 300 to enable the profile matching computer 302 to perform analysis with respect to the profile data, to detect potential matches between location/time profiles and transaction profiles and to calculate scores that may be applied to the potential matches to indicate the degree of confidence that the potential matches are correct.
In addition, the storage device 304 may store a reporting application program 314 that controls the processor 300 to enable the profile matching computer 302 to report results of the matching and scoring analysis performed by the matching/scoring application program 312.
The storage device 304 may also store, and the profile matching computer 302 may also execute, other programs, which are not shown. For example, such programs may include communication software, device drivers, etc.
The storage device 304 may also store one or more databases 316 required for operation of the profile matching computer 302. Such databases, for example, may store at least temporarily the location/time profiles and the transaction profiles to be analyzed and matched by the profile matching computer 302.
At 402 in
As another example, the request may come from a commercial enterprise such as a merchant or service company. The commercial enterprise may have location and time information about its customers, but may wish to increase its understanding of its customer base by learning about the customers' spending behavior as tied in with the customers' movements from place to place.
It is also possible that the commercial enterprise may be a social network that has “check-in” information about its users, and would like to learn about the users' spending habits to improve targeting of advertising to the social network's users. In some embodiments, the location and time information may be obtained based on interactions with customers' mobile phones and/or may be supplied by a mobile network operator (MNO).
For convenience of reference, the entity requesting the analysis, whether law-enforcement related or commercial or otherwise, will hereinafter be referred to as the “client”.
In addition to the foregoing, other types of requests from clients are also possible.
At 404 in
The time information in the location/time data pairs may also be presented with varying degrees of granularity. In some embodiments the indicated time is a date, such that the granularity of the time information is daily. Other granularities are also possible, such as weekly, monthly or hourly.
At 406 in
In some embodiments, for the purpose of the intended profile matching analysis, each profile may be formatted as a collection of location/time data pairs (with a unique identifier appended to each profile), with each such pair indicative of the time and location at which the payment card account in question was used for a respective transaction. Both the granularity of the location information and of the time information in the transaction profile data set may match the granularities of the data set received from the client at 404.
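One possible way to put transaction records into this form is sketched below. The record layout (a dictionary with `location` and `timestamp` keys), the function names, and the string formats chosen for each granularity are assumptions made for illustration; location granularity is assumed to have been normalized already.

```python
from datetime import datetime


def truncate_time(ts: datetime, granularity: str) -> str:
    """Reduce a timestamp to the requested granularity (hypothetical formats)."""
    if granularity == "hourly":
        return ts.strftime("%Y-%m-%dT%H")
    if granularity == "daily":
        return ts.strftime("%Y-%m-%d")
    if granularity == "weekly":
        iso = ts.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if granularity == "monthly":
        return ts.strftime("%Y-%m")
    raise ValueError(f"unsupported granularity: {granularity}")


def build_profile(profile_id: str, records, granularity: str = "daily") -> dict:
    """Format one profile as a unique identifier plus location/time pairs.

    records: iterable of dicts with 'location' and 'timestamp' keys
    (assumed layout). Duplicate pairs at the chosen granularity collapse.
    """
    pairs = {
        (rec["location"], truncate_time(rec["timestamp"], granularity))
        for rec in records
    }
    return {"id": profile_id, "pairs": sorted(pairs)}
```

Applying the same `granularity` value to both the client-supplied data and the transaction data keeps the two sets of pairs directly comparable.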
In some embodiments, the data as received from the client, and/or the transaction data, may not be in an optimal format for the intended matching analysis. In such situations, the profile matching computer 302 may reformat the data from the client and/or the transaction data to facilitate the matching analysis.
At 408, the profile matching computer 302 performs an analysis to detect matches or potential matches between the location/time profiles in the client-supplied data set and the profiles in the transaction data set. One or more of a considerable number of matching techniques may be used, including one or more measurements of correlation, linear or logistic regression, variable reduction analysis, distance statistics, clustering analysis, and/or decision tree analysis. Other matching analysis techniques may also or alternatively be used. In one particular embodiment, as described below in connection with
In some embodiments, test data sets may be generated using volunteers to travel from place to place and engage in payment card transactions, and the suitability and/or effectiveness of matching analysis techniques may be evaluated by applying the techniques to the test data sets.
A location data indicator in a client-supplied data pair need not necessarily be identical to a location data indicator in a transaction data pair for matching or partial or tentative matching to be declared or detected. For example, a distance threshold, or more than one threshold, may be applied such that location indicators within the threshold distance(s) relative to each other may be deemed to be matching or partially matching. For example, and assuming that both the client data set location indicators and the transaction data set location indicators are in the form of highly granular longitude/latitude information, in some embodiments locations within five miles of each other could be deemed matching; i.e., five miles could be the threshold distance for declaring a match in this embodiment. Other threshold distances may alternatively be employed. In some embodiments, the threshold distances may be adjusted by population density in the relevant area, such that two locations in an urban area must be closer to each other to be deemed “matching” as compared to locations in a rural area.
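The threshold-distance test may be sketched as follows, assuming latitude/longitude indicators in decimal degrees and the haversine great-circle formula as the distance measure (the description does not specify one). The five-mile default comes from the example above; the density cutoff and the one-mile urban threshold in `threshold_for_density` are purely hypothetical values.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def threshold_for_density(people_per_sq_mile, urban_cutoff=1000.0,
                          urban_miles=1.0, rural_miles=5.0):
    """Hypothetical rule: use a tighter threshold in densely populated areas."""
    return urban_miles if people_per_sq_mile >= urban_cutoff else rural_miles


def locations_match(a, b, threshold_miles=5.0):
    """Deem two (lat, lon) indicators matching if within the threshold distance."""
    return haversine_miles(a[0], a[1], b[0], b[1]) <= threshold_miles
```

For instance, two points roughly 2.6 miles apart would match under the five-mile threshold but not under the tighter urban threshold returned by `threshold_for_density` for a dense area.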
At 410, the profile matching computer 302 may generate or apply scores to the proposed matches of client-provided profiles with transaction data profiles (unless the matching analysis technique(s) employed at 408 inherently provided confidence/probability scoring). In addition or alternatively at 410, the profile matching computer 302 may classify the profile matches. For example, as noted above, thresholds may be applied to classify the profile matches as “close”, “moderate”, “loose” or “no match”. Other sets of classifications may alternatively be used in other embodiments.
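A threshold-based classification of this kind might look like the following sketch. It assumes scores normalized to the range 0 to 1; the cutoff values 0.8, 0.5 and 0.2 and the function name `classify_match` are illustrative assumptions, not values taken from this description.

```python
def classify_match(score, close=0.8, moderate=0.5, loose=0.2):
    """Map a match score in [0, 1] to a classification label.

    The three cutoffs are hypothetical and would be tuned in practice.
    """
    if not (loose <= moderate <= close):
        raise ValueError("cutoffs must satisfy loose <= moderate <= close")
    if score >= close:
        return "close"
    if score >= moderate:
        return "moderate"
    if score >= loose:
        return "loose"
    return "no match"
```

Other label sets could be supported by replacing the fixed cutoffs with an ordered list of (cutoff, label) entries.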
At 412, the profile matching computer 302 may report the result of steps 408 and 410 to an operator and/or to the client.
Referring to
At 504, for the current location/time profile, each transaction data profile is considered in turn. For the transaction data profile under consideration, a percentage is calculated as the percentage of location/time data pairs in the current location/time profile that are deemed to have at least one matching transaction in the transaction data profile under consideration (i.e., that are deemed to be represented in the transaction data profile under consideration).
In some embodiments, a decision block 506 may follow block 504. At decision block 506 a threshold may be applied to the percentages calculated at 504, and only for percentages in excess of the threshold will the corresponding transaction data profile be deemed a match or potential match for the current location/time profile.
Block 508 may follow decision block 506 if one or more positive determinations are made at decision block 506 (or block 508 may directly follow block 504 if decision block 506 is not present in the process of
For each proposed matching transaction data profile selected at 508, block 510 may be performed. At block 510, for the proposed matching transaction data profile in question, a percentage is calculated as the percentage of transactions in the transaction data profile that are deemed to have at least one matching location/time data pair in the current location/time profile (i.e., that are deemed to be represented in the current location/time profile). The percentage calculated at 504 for the proposed match of the current location/time profile and the transaction profile selected at 508 and considered at 510 may be referred to as the “first direction representation percentage”. The percentage calculated at 510 may be referred to as the “second direction representation percentage”.
Block 512 may follow block 510. At block 512, the profile matching computer 302 may calculate a score for each proposed match of the current location/time profile with a transaction data profile selected at 508. For example, the score at block 512 may be calculated using a formula that has the first direction representation percentage and the second direction representation percentage as inputs. Such a formula may, for example, be a linear combination of the first direction representation percentage and the second direction representation percentage, with respective weighting factors being applied to the two percentages. Other approaches are also possible, including nonlinear combinations of the two percentages. Other types of scoring may also be performed at 512, including scoring that does not utilize one or both of the first direction representation percentage and the second direction representation percentage.
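The two representation percentages and a linear-combination score may be illustrated with the sketch below. It assumes each profile is a list of (location, time) pairs and that the caller supplies a matching predicate; the equal weights of 0.5, the exact-equality predicate, and all function names are assumptions for illustration only.

```python
def representation_percentage(source_pairs, target_pairs, is_match):
    """Percentage of source pairs having at least one match among target pairs."""
    if not source_pairs:
        return 0.0
    hits = sum(
        1 for s in source_pairs if any(is_match(s, t) for t in target_pairs)
    )
    return 100.0 * hits / len(source_pairs)


def exact_match(pair_a, pair_b):
    """Simplest possible predicate: identical location and identical time."""
    return pair_a == pair_b


def match_score(location_profile, transaction_profile, is_match=exact_match,
                w_first=0.5, w_second=0.5):
    """Linear combination of the two directional percentages.

    first:  share of location/time pairs represented in the transactions
    second: share of transactions represented in the location/time pairs
    The weights are hypothetical; other combinations are possible.
    """
    first = representation_percentage(location_profile, transaction_profile, is_match)
    second = representation_percentage(
        transaction_profile, location_profile, lambda t, p: is_match(p, t)
    )
    return w_first * first + w_second * second
```

As a worked case, a location/time profile of four pairs, two of which appear among a transaction profile's three transactions, gives a first direction percentage of 50 and a second direction percentage of about 66.7, for an equally weighted score of about 58.3.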
In the process illustrated in
An inferred match analysis as described in connection with
In cases where the client is a commercial enterprise, the operator of the profile matching computer 302 may be less likely to provide PII to the client. However, the matches or potential matches developed with the processes of
Although a number of “assumptions” are provided herein, the assumptions are provided as illustrative but not limiting examples of one particular embodiment—those skilled in the art will appreciate that other embodiments may have different rules or assumptions.
As used herein and in the appended claims, the term “computer” should be understood to encompass a single computer or two or more computers in communication with each other.
As used herein and in the appended claims, the term “processor” should be understood to encompass a single processor or two or more processors in communication with each other.
As used herein and in the appended claims, the term “memory” should be understood to encompass a single memory or storage device or two or more memories or storage devices.
The flow charts and descriptions thereof herein should not be understood to prescribe a fixed order of performing the method steps described therein. Rather the method steps may be performed in any order that is practicable.
As used herein and in the appended claims, the term “payment card system account” includes a credit card account, a deposit account that the account holder may access using a debit card, a prepaid card account, or any other type of account from which payment transactions may be consummated. The terms “payment card system account” and “payment card account” are used interchangeably herein. The term “payment card account number” includes a number that identifies a payment card system account or a number carried by a payment card, or a number that is used to route a transaction in a payment system that handles debit card and/or credit card transactions. The term “payment card” includes a credit card, debit card, prepaid card, or other type of payment instrument, whether an actual physical card or virtual.
Although the present disclosure has been described in connection with specific exemplary embodiments, it should be understood that various changes, substitutions, and alterations apparent to those skilled in the art can be made to the disclosed embodiments without departing from the spirit and scope of the disclosure as set forth in the appended claims.
This application claims the benefit of U.S. Provisional Patent Application No. 61/974,670 filed on Apr. 3, 2014, the contents of which are hereby incorporated by reference for all purposes.
Number | Date | Country
---|---|---
61974670 | Apr 2014 | US