Customers of financial institutions often find it difficult to keep track of their account activities. These customers may be unaware of the details of their transactions, account balances, and account policies and may miss potential opportunities and susceptibilities associated with their accounts. For example, a customer may not realize that they are eligible for an upgraded service because they are unfamiliar with their bank's policies and products. Moreover, financial institutions usually have large volumes of data to organize and maintain, and may not have the resources to easily analyze the data and keep customers informed. Such financial institutions may miss opportunities for growth by failing to inform their customers of possible issues, offers, and product updates at the most opportune times. For example, a financial institution may fail to timely notify a customer of an investment offer and may miss an opportunity to strengthen their relationship with the customer as a consequence.
The embodiments presented herein are directed to systems and methods for transforming historical transaction data, collected in response to certain triggers, in order that it may better be used to classify business names and other textual values. For example, in the transaction data, payer and payee names may be populated with many possible permutations. The invention creates “cluster ID”s to assign patterns to specific clusters of data and evaluates whether the textual values exist in an assigned cluster. If such data cannot be assigned to a specific cluster, then a secondary process may search the Internet for the most relevant names, and depending on the accuracy and precision desired by the application, the algorithm assigns a confidence level to the matches between the names found in the Internet search and the unclassified textual value. Thus, a cluster may be assigned to the data. This process may be leveraged for other databases to identify patterns, similarities and clustering to correct data and/or augment information value.
According to embodiments of the invention, a system for transforming historical data collected in response to one or more triggering events, in order to classify textual values, includes a computer apparatus including a processor and a memory; and a software module stored in the memory, comprising executable instructions that when executed by the processor cause the processor to access a plurality of textual values from historical transaction data; identify one or more distinct patterns within the plurality of textual values; group the textual values based on the one or more distinct patterns, thereby forming one or more clusters; apply a similarity gauge to the textual values of each of the clusters to determine similarity or dissimilarity among the textual values of each cluster; and filter the textual values of each cluster to determine which textual values belong in each cluster and which textual values do not belong in each cluster, wherein the textual values that belong are cluster values.
In some embodiments, the instructions, when executed, further cause the processor to remove undesired characters from the textual values.
In some embodiments, identifying one or more distinct patterns within the plurality of textual values comprises comparing pronunciations and/or phonetics of the textual values. In some embodiments, comparing pronunciations and/or phonetics of the textual values comprises applying a double metaphone algorithm to the textual values.
In some embodiments, applying a similarity gauge to the textual values comprises determining a Jaccard distance score among the textual values of each cluster.
In some embodiments, the instructions when executed further cause the processor to connect the textual values that belong in each cluster; and remove the textual values that do not belong in each cluster. In some such embodiments, connecting the textual values that belong in each cluster comprises applying an OPTNET algorithm to the textual values of each cluster.
In some embodiments, filtering the textual values of each cluster comprises determining a Jaccard distance score threshold; comparing the Jaccard distance score to the Jaccard distance score threshold for each of the textual values of each cluster, thereby filtering textual values based on their similarity and/or dissimilarity.
In some embodiments, the instructions when executed further cause the processor to apply a standardized value aggregate to the cluster values of each cluster.
According to embodiments of the invention, a computer program product for transforming historical data collected in response to one or more triggering events, in order to classify textual values, includes a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising computer readable program code configured to access a plurality of textual values from historical transaction data; computer readable program code configured to identify one or more distinct patterns within the plurality of textual values; computer readable program code configured to group the textual values based on the one or more distinct patterns, thereby forming one or more clusters; computer readable program code configured to apply a similarity gauge to the textual values of each of the clusters to determine similarity or dissimilarity among the textual values of each cluster; and computer readable program code configured to filter the textual values of each cluster to determine which textual values belong in each cluster and which textual values do not belong in each cluster, wherein the textual values that belong are cluster values.
In some embodiments, the computer readable program code further comprises computer readable program code configured to remove undesired characters from the textual values.
In some embodiments, identifying one or more distinct patterns within the plurality of textual values comprises comparing pronunciations and/or phonetics of the textual values.
In some embodiments, comparing pronunciations and/or phonetics of the textual values comprises applying a double metaphone algorithm to the textual values.
In some embodiments, applying a similarity gauge to the textual values comprises determining a Jaccard distance score among the textual values of each cluster.
In some embodiments, the computer readable program code includes computer readable program code configured to connect the textual values that belong in each cluster; and computer readable program code configured to remove the textual values that do not belong in each cluster. In some such embodiments, the computer readable program code includes computer readable program code configured to apply an OPTNET algorithm to the textual values of each cluster.
In some embodiments, the computer readable program code includes computer readable program code configured to determine a Jaccard distance score threshold; and computer readable program code configured to compare the Jaccard distance score to the Jaccard distance score threshold for each of the textual values of each cluster, thereby filtering textual values based on their similarity and/or dissimilarity.
In some embodiments, the computer readable program code includes computer readable program code configured to apply a standardized value aggregate to the cluster values of each cluster.
According to embodiments of the invention, a method for transforming historical data collected in response to one or more triggering events, in order to classify textual values includes accessing a plurality of textual values from historical transaction data; identifying one or more distinct patterns within the plurality of textual values; grouping the textual values based on the one or more distinct patterns, thereby forming one or more clusters; applying a similarity gauge to the textual values of each of the clusters to determine similarity or dissimilarity among the textual values of each cluster; and filtering the textual values of each cluster to determine which textual values belong in each cluster and which textual values do not belong in each cluster, wherein the textual values that belong are cluster values.
In some embodiments, the method also includes removing undesired characters from the textual values.
The features, functions, and advantages that have been discussed may be achieved independently in various embodiments of the present invention or may be combined with yet other embodiments, further details of which can be seen with reference to the following description and drawings.
The present embodiments are further described in the detailed description which follows in reference to the noted plurality of drawings by way of non-limiting examples of the present embodiments in which like reference numerals represent similar parts throughout the several views of the drawings and wherein:
Introduction
As discussed above, the embodiments presented herein are directed to systems and methods for transforming historical transaction data, collected in response to certain triggers, in order that it may better be used to classify business names and other textual values. For example, in the transaction data, payer and payee names may be populated with many possible permutations. The invention creates “cluster ID”s to assign patterns to specific clusters of data and evaluates whether the textual values exist in an assigned cluster. If such data cannot be assigned to a specific cluster, then a secondary process may search the Internet for the most relevant names, and depending on the accuracy and precision desired by the application, the algorithm assigns a confidence level to the matches between the names found in the Internet search and the unclassified textual value. Thus, a cluster may be assigned to the data. This process may be leveraged for other databases to identify patterns, similarities and clustering to correct data and/or augment information value.
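By way of non-limiting illustration, the confidence-level step of the secondary process might be sketched as follows. The sketch assumes the candidate names have already been retrieved by some search step, which is not shown. The function name `best_match`, the `min_confidence` parameter, and the use of Python's `difflib.SequenceMatcher` as the scoring measure are all assumptions made for the example, not details taken from the embodiments.

```python
from difflib import SequenceMatcher


def best_match(unclassified, candidates, min_confidence=0.8):
    """Return (candidate, confidence) for the closest candidate name,
    or None when no candidate reaches min_confidence.

    The confidence is a 0..1 similarity ratio; the cutoff is a
    hypothetical, application-chosen value.
    """
    best_name, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(
            None, unclassified.upper(), candidate.upper()
        ).ratio()
        if score > best_score:
            best_name, best_score = candidate, score
    if best_score >= min_confidence:
        return best_name, best_score
    return None


# Hypothetical candidate names standing in for search results.
print(best_match("ACME HARDWRE", ["Acme Hardware", "Acme Bakery"]))
```

An application that needs higher precision could raise `min_confidence`, at the cost of leaving more textual values unclassified.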
As an input, the system may receive a list of distinct business names. The data may be cleaned by removing numerals, special characters, and the like. Then the data, e.g., business names, can undergo a three-step clustering process. First, pronunciations and phonetics of the names are compared. An example of the first step is the double metaphone process, which is an algorithm that codes words phonetically by reducing them to a combination of consonant sounds. The process returns two codes if a word has two plausible pronunciations, thereby reducing matching problems caused by misspellings. Once the distinct patterns are identified, the names are grouped into clusters according to those patterns.
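By way of non-limiting illustration, the cleaning and first clustering step might be sketched as follows. The function `phonetic_key` is a deliberately crude consonant-skeleton code used only as a stand-in; it is not the double metaphone algorithm, and it returns a single code rather than two. All function names are hypothetical.

```python
import re
from collections import defaultdict


def clean_name(name):
    """Uppercase the name and drop numerals and special characters."""
    letters_only = re.sub(r"[^A-Za-z\s]", " ", name)
    return " ".join(letters_only.upper().split())


def phonetic_key(name):
    """Simplified stand-in for a phonetic code (NOT double metaphone).

    Keeps the first letter, skips vowels and H/W/Y afterwards, and
    skips any letter equal to the last letter already kept.
    """
    word = clean_name(name).replace(" ", "")
    if not word:
        return ""
    key = word[0]
    for ch in word[1:]:
        if ch in "AEIOUHWY":
            continue
        if ch != key[-1]:
            key += ch
    return key


def group_by_key(names):
    """Group the original names into clusters keyed by phonetic_key."""
    clusters = defaultdict(list)
    for name in names:
        clusters[phonetic_key(name)].append(name)
    return dict(clusters)


print(group_by_key(["Acme Inc", "ACMEE INC 001", "Globex LLC"]))
```

Because the stand-in key is based on letters rather than sounds, it would not group spellings such as "Akme" with "Acme"; a true phonetic algorithm is intended to handle such cases.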
Second, for each cluster, the system identifies the similar patterns in the data. In other words, the system determines how similar or dissimilar multiple strings are to one another. An example of the second step is the Jaccard distance process. This measures dissimilarity between sample strings.
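By way of non-limiting illustration, a Jaccard distance between two strings might be computed as one minus the ratio of shared elements to total distinct elements. The sketch below assumes each string is represented as its set of character bigrams; the embodiments do not specify the representation, so that choice (and the function names) are assumptions.

```python
def bigrams(text):
    """Set of adjacent character pairs, ignoring case and spaces."""
    s = text.upper().replace(" ", "")
    return {s[i:i + 2] for i in range(len(s) - 1)}


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B|, where A and B are the bigram sets.

    Returns 0.0 for identical sets and 1.0 for disjoint sets. Two
    strings with no bigrams at all are treated as identical.
    """
    set_a, set_b = bigrams(a), bigrams(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


# "ACME" and "ACMA" share 2 of 4 distinct bigrams.
print(jaccard_distance("ACME", "ACMA"))
```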
Third, a function that connects similar components is applied. In other words, a cutoff may be applied in order to separate the business names that belong to the cluster from those that do not. An example of the third step is PROC OPTNET, which helps in grouping similar entities, where the input is pairs of common entities.
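By way of non-limiting illustration, connecting similar components from scored pairs might be sketched with a union-find structure, as below. This is a generic connected-components routine offered as a stand-in, not the PROC OPTNET procedure itself; the function name, the pair format `(name_a, name_b, distance)`, and the example cutoff are assumptions.

```python
def connected_groups(scored_pairs, cutoff):
    """Group names linked by pairs whose distance is at or below cutoff.

    scored_pairs is an iterable of (name_a, name_b, distance) tuples.
    Names with no qualifying link end up in a group of their own.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, distance in scored_pairs:
        root_a, root_b = find(a), find(b)
        if distance <= cutoff and root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for name in parent:
        groups.setdefault(find(name), set()).add(name)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


pairs = [
    ("ACME INC", "ACME INCORP", 0.3),
    ("ACME INCORP", "ACME INCORPORATED", 0.4),
    ("ACME INC", "ACORN MEDIA", 0.9),
]
print(connected_groups(pairs, cutoff=0.5))
```

Note that linkage is transitive in this sketch: two names may land in the same group through an intermediate name even when their own pairwise distance exceeds the cutoff.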
Triggers and Data Gathering
The embodiments presented herein are directed to systems and methods for the creation, institution, and management of account related triggers. In some embodiments, a system that supports ideation, sizing, design, production, and maintenance of triggers is provided. The system develops effective communication routines to aid in trigger delivery.
As will be appreciated by one skilled in the art, aspects of the present embodiments of the invention may be embodied as a system, method, or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present embodiments of the invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing. Computer program code for carrying out operations for aspects of the present embodiments of the invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present embodiments of the invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
As presented herein, embodiments that enhance and maintain customer relationships with a financial institution via financial account related triggers are provided. As used herein, the term “trigger” refers to, but is not limited to, account activity, transactional data, account costs, account terms and conditions associated with one or more financial accounts, and non-financial data such as online data. Exemplary triggers include transactions and/or events associated with various accounts, such as a checking account, savings account, credit card account, retirement account, investment vehicle, or other type of account. Non-financial exemplary triggers include referrals from an online domain and online cookies. Specific events or trends in account or online activity are used to accomplish various objectives in the support and maintenance of user accounts to thereby increase user satisfaction and account profitability.
Referring now to the figures,
The computing device 200 is configured to communicate over a network 150 with a financial institution's banking system 300 and, in some cases, a third party system 170, such as one or more other financial institution systems, a vendor's system, an online domain, a POS (point of sales) device, and the like. The user's computing device 200, the financial institution's banking system 300, and a trigger repository 400 are each described in greater detail below with reference to
In general, the computing device 200 is configured to connect with the network 150 to log the user 110 into the financial institution's banking system 300, such as an online banking system. The computing device 200 is also configured to connect with the network 150 to allow the user 110 to access the third party system 170, such as an online domain. The banking system 300 involves authentication of a user in order to access the user's account on the banking system 300. For example, the banking system 300 is a system where a user 110 logs into his/her account such that the user 110 or other entity can access data that is associated with the user 110. For example, in one embodiment of the invention, the banking system 300 is an online banking system maintained by a financial institution. In such an embodiment, the user 110 can use the computing device 200 to log into the banking system 300 to access the user's online banking account. Logging into the banking system 300 generally requires that the user 110 authenticate his/her identity using a user name, a passcode, a cookie, a biometric identifier, a private key, a token, and/or another authentication mechanism that is provided by the user 110 to the banking system 300 via the computing device 200. The financial institution's banking system 300 is in network communication with other devices, such as the third party system 170 and the trigger repository 400.
In some embodiments of the invention, the trigger repository 400 is configured to be controlled and managed by one or more third-party data providers (not shown in
Referring now to
As used herein, a “processing device,” such as the processing device 220 or the processing device 320, generally refers to a device or combination of devices having circuitry used for implementing the communication and/or logic functions of a particular system. For example, a processing device may include a digital signal processor device, a microprocessor device, and various analog-to-digital converters, digital-to-analog converters, and other support circuits and/or combinations of the foregoing. Control and signal processing functions of the system are allocated between these processing devices according to their respective capabilities. The processing device 220 or 320 may further include functionality to operate one or more software programs based on computer-executable program code thereof, which may be stored in a memory. As the phrase is used herein, a processing device 220 or 320 may be “configured to” perform a certain function in a variety of ways, including, for example, by having one or more general-purpose circuits perform the function by executing particular computer-executable program code embodied in computer-readable medium, and/or by having one or more application-specific circuits perform the function.
As used herein, a “user interface” 230 generally includes a plurality of interface devices that allow a customer to input commands and data to direct the processing device to execute instructions. As such, the user interface 230 employs certain input and output devices to input data received from the user 110 or output data to the user 110. These input and output devices may include a display, mouse, keyboard, button, touchpad, touch screen, microphone, speaker, LED, light, joystick, switch, buzzer, bell, and/or other customer input/output device for communicating with one or more customers.
As used herein, a “memory device” 250 or 350 generally refers to a device or combination of devices that store one or more forms of computer-readable media and/or computer-executable program code/instructions. Computer-readable media is defined in greater detail below. For example, in one embodiment, the memory device 250 or 350 includes any computer memory that provides an actual or virtual space to temporarily or permanently store data and/or commands provided to the processing device 220 when it carries out its functions described herein.
It should be understood that the memory device 350 may include one or more databases or other data structures/repositories. The memory device 350 also includes computer-executable program code that instructs the processing device 320 to operate the network communication interface 310 to perform certain communication functions of the banking system 300 described herein. For example, in one embodiment of the banking system 300, the memory device 350 includes, but is not limited to, a network server application 370, an authentication application 360, a user account data repository 380, which includes user authentication data 382 and user account information 384, and a banking system application 390, which includes a trigger repository interface 392 and other computer-executable instructions or other data such as a trigger software module. The computer-executable program code of the network server application 370, the authentication application 360, or the banking system application 390 may instruct the processing device 320 to perform certain logic, data-processing, and data-storing functions of the online system 700 described herein, as well as communication functions of the banking system 300.
In one embodiment, the user account data repository 380 includes user authentication data 382 and user account information 384. The network server application 370, the authentication application 360, and the banking system application 390 are configured to implement user account information 384 and the trigger repository interface 392 when monitoring the trigger data associated with a user account. The banking system application 390 includes a trigger software module for performing the steps of methods and systems 500-1100.
As used herein, a “communication interface” generally includes a modem, server, transceiver, and/or other device for communicating with other devices on a network, and/or a user interface for communicating with one or more customers. Referring again to
The network communication interface 410 is a communication interface having one or more communication devices configured to communicate with one or more other devices on the network 150. The processing device 420 is configured to use the network communication interface 410 to receive information from and/or provide information and commands to the user's computing device 200, the third party system 170, the trigger repository 400, the banking system 300 and/or other devices via the network 150. In some embodiments, the processing device 420 also uses the network communication interface 410 to access other devices on the network 150, such as one or more web servers of one or more third-party data providers. In some embodiments, one or more of the devices described herein may be operated by a second entity so that the second entity controls the various functions involving the trigger repository 400. For example, in one embodiment of the invention, although the banking system 300 is operated by a first entity (e.g., a financial institution), a second entity operates the trigger repository 400 that stores the trigger details for the customer's financial institution accounts and other information about users.
As described above, the processing device 420 is configured to use the network communication interface 410 to gather data from the various data sources. The processing device 420 stores the data that it receives in the memory device 450. In this regard, in one embodiment of the invention, the memory device 450 includes datastores that include, for example: (1) triggers associated with a user's financial institution account numbers and routing information, (2) information about sending and receiving users' mobile device numbers, email addresses, or other contact information, which may have been received from the banking system 300, and (3) online data such as browser cookies associated with the user's computing device 200.
Turning now to the production of triggers, in some embodiments, trigger ideas are formulated and undergo a preliminary review. The ideas may be formulated internally, such as by a team of analysts of a financial institution, or the ideas may be formulated externally by segment, channel, and marketing partners of a financial institution. The ideas are prioritized based on an opportunity analysis. For example, transaction channels, transaction categories, business names, amount thresholds, stability, and violation frequencies are selected to determine and quantify opportunities that can be generated from the trigger ideas. These opportunities, such as customer retention and policy education, may be analyzed in view of preferred, retail, and small business demographics. Based on the opportunity review, triggers are developed through rigorous testing. For example, tests may be conducted on transactions associated with a specific account or user. Further, triggers that are similar in scope and that overlap over the same time period may be monitored to further develop the trigger. The results of the testing may then be reviewed to finalize the triggers. In some embodiments, the triggers are modified for automation. For example, the code for automating the triggers may be embellished and specific parameters provided. In further embodiments, the automated triggers are monitored. For example, content and process quality trigger checks can be run on a daily, weekly, bi-weekly, and/or monthly basis.
Trigger End to End 5 Step Process
As shown in
In block 504, patterns of account activity are determined based on the account data. The account activity, in some embodiments, is specifically linked to a transaction category, transaction type, transaction amount, or transaction channel. For example, algorithms may be used to detect upward or downward trends in the number of transactions, the amount of transactions, the occurrence of account costs, or other account activity over a period of time. Deposit amounts for a particular account, for example, may increase during the month of April for several years in a row and provide an indication that the account user has received a tax refund.
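By way of non-limiting illustration, detecting an upward or downward trend in a series of periodic values (for example, monthly transaction counts) might be sketched with a least-squares slope, as below. The function names and the `tolerance` parameter are assumptions; the embodiments do not specify which algorithm is used.

```python
def trend_slope(values):
    """Least-squares slope of values against their index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def classify_trend(values, tolerance=0.0):
    """Label a series 'upward', 'downward', or 'flat' by its slope.

    Slopes whose magnitude does not exceed tolerance count as flat.
    """
    slope = trend_slope(values)
    if slope > tolerance:
        return "upward"
    if slope < -tolerance:
        return "downward"
    return "flat"


# Hypothetical monthly counts, oldest first.
print(classify_trend([6, 5, 4, 3, 2, 1, 1]))
```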
In block 506, parameters associated with the patterns are identified, where the parameters include transaction channels, transaction categories, amount thresholds, business names, stability, and violation frequencies. The parameters are identified, in some embodiments, by using algorithms, keywords, Boolean operators, transaction channel codes, transaction amount calculations, and threshold amounts to search the account data related to the patterns of account activity. The keywords include business names, merchant names, third party financial institution names, web addresses, transaction dates, transaction amounts, user identification, account identification, and the like.
Transaction channels include transaction processes such as electronic funds transfers, automatic deposits and withdrawals, ATM withdrawals and deposits, point-of-sale (POS) purchases, and the like. For example, triggers directed to deposit transactions may include transaction channel parameters such as teller deposits, ATM deposits, ACH deposits, internal transfers, automatic transfers, and payroll transfers.
Transaction categories include transactions that are grouped according to a desired outcome or purpose. Exemplary transaction categories include user retention, increasing a user's transactional depth or account breadth, timely identification of outside transactions, new products, risk mitigation, policy education, and the like.
The amount thresholds include predetermined amounts associated with one or more transactions such as minimum and/or maximum percent, total, average, or median limits for quantities or values associated with one or more transactions. For example, some parameters may require that all purchases be over a minimum $100 limit and/or under a $10,000 limit. The stability parameters provide an indication of transactions that perform consistently over time, or an indication of transactions that have been adjusted to remove variations in activity over time. For example, the stability parameters may include a range of percentages, ratios, transaction amounts, and frequencies that fall within specific tolerances and that are linked to specific transactions that are tracked over time. Parameters of violation frequencies indicate the frequency of outliers, unexpected events, and negative results in account activity. For example, if the number of ATM withdrawals for a particular account has gradually decreased from six per month to one per month over the last seven months, seven ATM withdrawals on the same day of the current month would indicate a reversal in the trend and would be a violation of the trigger. The violation frequency can indicate an isolated occurrence which can be deleted or ignored from the data, or it can indicate a negative trend. Based on the violation frequency, the parameters of the triggers can be adjusted accordingly.
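By way of non-limiting illustration, an amount threshold check and a simple trend-reversal check for violations might be sketched as follows. The default limits mirror the $100 and $10,000 figures in the example above; the function names, the `tolerance` parameter, and the rule used to decide what counts as a reversal are assumptions.

```python
def within_amount_thresholds(amount, minimum=100.0, maximum=10_000.0):
    """True when amount is over the minimum and under the maximum."""
    return minimum < amount < maximum


def reverses_trend(history, current, tolerance=1):
    """Flag current when it moves against the direction of history.

    history holds per-period counts, oldest first. A falling history
    is violated by a count more than tolerance above the latest value,
    and a rising history by one more than tolerance below it.
    """
    if len(history) < 2:
        return False
    change = history[-1] - history[0]
    if change < 0:
        return current > history[-1] + tolerance
    if change > 0:
        return current < history[-1] - tolerance
    return False


def violation_frequency(flags):
    """Share of observed periods that were flagged as violations."""
    return sum(flags) / len(flags) if flags else 0.0


# ATM withdrawals fell from six to one per month; seven is a reversal.
print(reverses_trend([6, 5, 4, 3, 2, 1, 1], current=7))
```

A low violation frequency might suggest an isolated occurrence, while a high one might suggest that the trigger parameters need adjusting.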
In block 508, triggers are formed based on the patterns of account activity and the parameters. In some embodiments, the patterns of account activity and the parameters are used to define the triggers. For example, a trigger may be defined by the total monthly number of ATM deposits that occur over a three month period. Further, the patterns of account activity provide the expected trend for transactions defined by the parameters. In the previous example, the trigger may be further defined by requiring that the total monthly number of ATM deposits decrease over the three month period. The patterns of account activity and parameters selected for each trigger may be based on the objective of the trigger. Triggers directed to cross selling investment products to a user, for example, may include a pattern of increasing direct deposits in a savings account over a two week period. The triggers, and the patterns and parameters that define the triggers, may take on any number of variations. Specific exemplary triggers are described in more detail below with reference to
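By way of non-limiting illustration, the example trigger of a decreasing monthly number of ATM deposits over a three month period might be sketched as follows. The transaction format `(date, channel)`, the channel label `"ATM_DEPOSIT"`, the function names, and the choice of a strictly decreasing count are assumptions made for the example.

```python
from collections import Counter
from datetime import date


def monthly_counts(transactions, channel):
    """Count transactions on the given channel per (year, month)."""
    counts = Counter()
    for tx_date, tx_channel in transactions:
        if tx_channel == channel:
            counts[(tx_date.year, tx_date.month)] += 1
    return counts


def trigger_fires(transactions, channel, months):
    """True when the monthly count strictly decreases across months.

    months is an ordered list of (year, month) tuples; months with no
    matching transactions count as zero.
    """
    counts = monthly_counts(transactions, channel)
    series = [counts[m] for m in months]
    if len(series) < 2:
        return False
    return all(later < earlier for earlier, later in zip(series, series[1:]))


# Hypothetical account data: 3, 2, then 1 ATM deposits per month.
transactions = [
    (date(2024, 1, 3), "ATM_DEPOSIT"),
    (date(2024, 1, 10), "ATM_DEPOSIT"),
    (date(2024, 1, 24), "ATM_DEPOSIT"),
    (date(2024, 2, 7), "ATM_DEPOSIT"),
    (date(2024, 2, 21), "ATM_DEPOSIT"),
    (date(2024, 3, 6), "ATM_DEPOSIT"),
    (date(2024, 3, 8), "TELLER_DEPOSIT"),
]
print(trigger_fires(transactions, "ATM_DEPOSIT", [(2024, 1), (2024, 2), (2024, 3)]))
```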
The method 500 is further illustrated in
In block 512, one or more of the similar triggers are evaluated over the same period of time. The evaluation of the similar triggers over the same time periods strengthens the trigger data such that any potential flaws, improvements, or strengths in the data are highlighted. In one example, electronic fund transfers associated with multiple accounts are monitored every day over the same six month period. In this way, the number of times the trigger should be run in a week or month, the days of the week for running the trigger, and any discrepancies in the data that occur during particular days of the week, weeks of the month, and months of the year are determined. In some embodiments, a first group of similar triggers is compared to a second group of similar triggers. For example, a group of similar outbound transaction triggers may be compared to a group of similar inbound transaction triggers. In another example, automatic deposits that occur on Mondays may be compared to automatic deposits that occur on Fridays.
In block 514, the parameters associated with the similar triggers are modified in response to the evaluation of the one or more of the similar triggers over the same period of time. One or more of the parameters for a particular trigger can be added or removed and/or the terms of the parameters can be adjusted. Holidays and weekends, for example, may cause discrepancies in the preliminary trigger data and may be taken into account when defining the trigger. Even after the triggers are preliminarily established, the triggers may be continuously monitored on a regular basis as discussed in more detail below with regard to
In block 516, the triggers are categorized based on at least one of a desired objective, a type of transaction, a type of account, an amount threshold, and/or a period of time. In some embodiments, a first group of similar triggers and a different second group of similar triggers are categorized based on the desired objective. For example, ATM deposits may be categorized with payments for education if the purpose of the triggers is to offer the user a loan with a lower interest rate. The triggers categorized according to the desired objective are further categorized according to the type of transaction, the type of account, the amount threshold, and the period of time. In the example above, the ATM deposits used as triggers for the purpose of loan offers may be further categorized according to the amounts of the deposits. In block 518, the categorized triggers are monitored on a periodic basis, as discussed in further detail below with regard to
Real Time Monitor for Trigger Data Quality
Referring now to
In block 602 of
In block 606, triggers associated with the one or more periods of time are identified based on at least one of a transaction, a transaction amount, a type of transaction, and a type of account. In some embodiments, each set of triggers corresponding to transactions of a certain amount, and/or type are identified first and then the triggers are segregated into time periods. The triggers may be further identified based on a category corresponding to a desired objective. In some embodiments, the triggers are identified based on transactions that occur during the one or more periods of time. For example, a trigger may include all inbound transactions that have values that are greater than a threshold amount and that occur during the month of July.
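The July example above can be sketched as a simple filter. The Transaction record, its field names, the "inbound" label, and the $500 threshold are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Transaction:
    """Hypothetical transaction record; field names are assumptions."""
    posted: date
    amount: float
    direction: str  # "inbound" or "outbound"


def matches_trigger(tx, threshold, month):
    """True for inbound transactions above `threshold` that post in `month`."""
    return (tx.direction == "inbound"
            and tx.amount > threshold
            and tx.posted.month == month)


sample = [
    Transaction(date(2015, 7, 3), 1200.0, "inbound"),
    Transaction(date(2015, 7, 9), 80.0, "inbound"),     # below the threshold
    Transaction(date(2015, 7, 15), 2500.0, "outbound"),  # wrong direction
    Transaction(date(2015, 8, 1), 3000.0, "inbound"),    # wrong month
]
july_hits = [tx for tx in sample if matches_trigger(tx, threshold=500.0, month=7)]
```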
In block 608, a total transaction count for each of the triggers is calculated. The transaction counts include value amounts for certain transactions associated with one or more accounts or the total number of certain transactions associated with the one or more accounts. In some embodiments, the transaction count is the total number of transactions that occur during the one or more periods of time and that are associated with a particular trigger.
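One way the per-period count might be tallied is shown below. This is a sketch under the assumption that each trigger's transactions arrive as (date, amount) pairs and that the period of interest is a single day; the function name count_by_day is hypothetical.

```python
from collections import defaultdict
from datetime import date


def count_by_day(transactions):
    """Map each posting date to (number of transactions, total value) for one trigger."""
    buckets = defaultdict(lambda: [0, 0.0])
    for posted, amount in transactions:
        buckets[posted][0] += 1       # transaction count for the day
        buckets[posted][1] += amount  # value amount for the day
    return {day: (count, total) for day, (count, total) in buckets.items()}


daily = count_by_day([
    (date(2015, 4, 17), 100.0),
    (date(2015, 4, 17), 50.0),
    (date(2015, 4, 18), 75.0),
])
```

Both the count and the value total are kept because the paragraph above allows either to serve as the transaction count.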
Exemplary graphical charts of total counts for a tax refund trigger are illustrated in
In block 610, control limits based on the transaction count for each of the triggers are determined. The control limits are calculated based on the trimmed mean and the standard deviation. The trimmed mean is calculated by removing a certain percentage of the lowest values and an equal percentage of the highest values in a given data series before calculating the mean. In calculating the trimmed mean, some of the lower numbers of the transaction count and some of the higher numbers of the transaction count are removed before the mean is calculated. For example, tax refund transactions that occur on a Friday and that have a value that is a certain percent higher or lower than the median for all tax refunds that occur on the same Friday are deleted before the mean is calculated.
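The trimmed-mean calculation can be sketched as follows. The text does not state the trim percentage, the multiple of the standard deviation used for the limits, or whether the standard deviation is taken before or after trimming; the 10 percent trim, the three-standard-deviation width, and the use of the trimmed series for both statistics are assumptions made for illustration, as are the function names.

```python
import statistics


def trim(values, proportion):
    """Drop the lowest and highest `proportion` of the sorted values."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)
    return ordered[k:len(ordered) - k]


def control_limits(counts, proportion=0.10, width=3.0):
    """Return (lower, upper) limits as trimmed mean -/+ width * standard deviation."""
    kept = trim(counts, proportion)
    center = statistics.mean(kept)
    spread = statistics.stdev(kept)  # sample standard deviation of the trimmed series
    return center - width * spread, center + width * spread


def tag_counts(counts, limits):
    """Tag each count as "outlier" when it falls outside the limits, else "normal"."""
    lower, upper = limits
    return ["outlier" if c < lower or c > upper else "normal" for c in counts]


counts = [100, 102, 98, 101, 99, 103, 97, 100, 250, 5]  # illustrative daily counts
limits = control_limits(counts)
tags = tag_counts(counts, limits)
```

With these illustrative counts, the trim discards 5 and 250 before the mean is taken, so the two extreme days do not widen the limits that are later used to judge them.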
An exemplary table illustrating the transaction count and control limits is shown in
In block 618, the outliers are tagged. The outliers may be tagged as “outlier” as illustrated in the exemplary table of
In block 620, the cause of the outliers is determined. Periods of time around holidays, cyclic considerations such as tax season, days of the week, weeks of the month, certain historical trends, data obtained from the user, and external data can indicate the cause for the outliers. For example, historical trends may indicate that the number of mortgage payments is higher at the end of the month than at the beginning of the month and the number of ATM withdrawals may be higher on Fridays than it is on Tuesdays. As another example, triggers that include transactions having a specific threshold amount of $10 or greater may have a higher number of transactions during a particular period because a greater number of low end transactions (e.g., transactions of $10 to $12) occur during that period. Based on the cause of the skewed data, appropriate action can be taken. For example, the threshold amount or some other parameter associated with the trigger may be modified or certain triggers associated with a particular day of the week or other period may be tagged as normal even though these certain triggers would appear to be abnormal. In the $10 or greater trigger example described above, the threshold amount for that trigger may be increased during the particular period, or the trigger may be marked as normal. If the cause of the outliers is not easily explained or if the cause is unexpected, then further investigation may be required.
Although the triggers described herein generally include financial transactions associated with one or more accounts, such as the triggers illustrated in
Mending Through Automated Processes
As discussed above, the embodiments presented herein are directed to systems and methods for transforming historical transaction data, collected in response to certain triggers, in order that it may better be used to classify business names and other textual values. For example, in the transaction data, payer and payee names may be populated with many possible permutations. The invention creates “cluster ID”s to assign patterns to specific clusters of data and evaluates whether the textual values exist in an assigned cluster. If such data cannot be assigned to a specific cluster, then a secondary process may search the Internet for the most relevant names, and depending on the accuracy and precision desired by the application, the algorithm assigns a confidence level to the matches between the names found in the Internet search and the unclassified textual value. Thus, a cluster may be assigned to the data. This process may be leveraged for other databases to identify patterns, similarities and clustering to correct data and/or augment information value.
As an input, the system may receive a list of distinct business names. The data may be cleaned by removing numerals, special characters and the like. Then, the data, e.g., business names, can undergo a three-step clustering process. First, pronunciations and phonetics of the names are compared. An example of the first step is the Double Metaphone process, which is an algorithm to code words phonetically by reducing them to a combination of consonant sounds. The process returns two codes if a word has two plausible pronunciations, thereby reducing matching problems from wrong spellings. Once the distinct patterns are identified, they are then grouped into clusters.
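The cleaning and grouping flow of the first step can be sketched as below. The phonetic key used here is a deliberately crude stand-in and is not Double Metaphone: it keeps the first letter, drops later vowels and the letters H, W and Y, and skips a letter equal to the one kept just before it. All function names are hypothetical, and a real phonetic coder would be passed in through the key_fn parameter.

```python
import re
from collections import defaultdict


def clean_name(raw):
    """Remove numerals and special characters, collapse whitespace, and uppercase."""
    letters_only = re.sub(r"[^A-Za-z\s]", " ", raw)
    return " ".join(letters_only.upper().split())


def crude_phonetic_key(name):
    """Simplified consonant skeleton (a stand-in, NOT Double Metaphone)."""
    key = []
    for ch in name.replace(" ", ""):
        if key and ch in "AEIOUYHW":
            continue  # drop vowels and weak consonants after the first letter
        if key and key[-1] == ch:
            continue  # skip a letter equal to the previously kept one
        key.append(ch)
    return "".join(key)


def group_by_key(names, key_fn=crude_phonetic_key):
    """Group raw names whose cleaned form produces the same key."""
    groups = defaultdict(list)
    for raw in names:
        groups[key_fn(clean_name(raw))].append(raw)
    return dict(groups)


groups = group_by_key(["Acme Supply #12", "ACME SUPPLIE", "Akme Supply", "Zenith Corp."])
```

In this sketch "Akme Supply" lands in a different group from the two "Acme" spellings because the stand-in key does not treat C and K as the same sound, which is the kind of variation a true phonetic algorithm is intended to absorb.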
Second, for each cluster, the system identifies the similar patterns in the data. In other words, the system determines how similar or dissimilar multiple strings are to one another. An example of the second step is the Jaccard distance process. This measures dissimilarity between sample strings.
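A minimal sketch of the Jaccard distance between two names follows. The text does not say how a string is turned into a set, so the use of character bigrams with spaces removed is an assumption; word tokens would be an equally plausible choice.

```python
def bigrams(text):
    """Set of adjacent character pairs, ignoring case and spaces (an assumed tokenization)."""
    compact = text.upper().replace(" ", "")
    return {compact[i:i + 2] for i in range(len(compact) - 1)}


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| over the bigram sets of the two strings."""
    set_a, set_b = bigrams(a), bigrams(b)
    union = set_a | set_b
    if not union:
        return 0.0  # two strings with no bigrams are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)


distance = jaccard_distance("ACME", "ACMY")  # shares AC and CM out of AC, CM, ME, MY
```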
Third, a function that connects similar components is applied. In other words, a cutoff may be applied in order to separate the business names that belong to the cluster from those that do not. An example of the third step is the SAS procedure PROC OPTNET, which helps in grouping similar entities, where the input will be pairs of common entities.
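The third step can be illustrated with a small union-find routine that links any pair whose distance is at or below a cutoff and then reads off the connected groups. This is a Python sketch of the general connected-components idea, not a reproduction of PROC OPTNET; the function name, the input shape of (name, name, distance) triples, and the 0.6 cutoff are assumptions.

```python
def connected_clusters(names, scored_pairs, cutoff):
    """Group names linked by any chain of pairs with distance <= cutoff."""
    parent = {name: name for name in names}

    def find(x):
        # Walk to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, distance in scored_pairs:
        if distance <= cutoff:
            parent[find(a)] = find(b)  # merge the two components

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())


names = ["ACME", "ACMY", "AKME", "ZENITH"]
pairs = [("ACME", "ACMY", 0.5), ("ACME", "AKME", 0.6), ("ACME", "ZENITH", 1.0)]
clusters = connected_clusters(names, pairs, cutoff=0.6)
```

Because membership is transitive, two names can share a cluster without being directly similar, so the cutoff needs to be chosen conservatively.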
In some embodiments, a standardized value aggregate is applied to the cluster values of each cluster. This may be done, for example, by searching the Internet for the names most relevant to the clusters. Depending on the accuracy and precision desired, the algorithm assigns a confidence level to the matches between the names found in the Internet search and the unclassified text value. The cluster may be assigned a business name, which may also be referred to herein as a business name aggregate.
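The confidence-level assignment might look like the following sketch, which scores an unclassified value against a list of candidate names and accepts the best candidate only above a minimum confidence. The candidate list stands in for names returned by an Internet search, which is not performed here. The use of difflib.SequenceMatcher as the similarity measure, the 0.8 default, and the function name best_match are assumptions; the text does not specify how confidence is computed.

```python
from difflib import SequenceMatcher


def best_match(unclassified, candidates, min_confidence=0.8):
    """Return (name, confidence); name is None when no candidate is confident enough."""
    if not candidates:
        return None, 0.0
    scored = [
        (SequenceMatcher(None, unclassified.upper(), candidate.upper()).ratio(), candidate)
        for candidate in candidates
    ]
    confidence, name = max(scored)
    if confidence < min_confidence:
        return None, confidence
    return name, confidence


# Candidates would come from a search step; they are hard-coded for illustration.
name, confidence = best_match("ACME SUPLY", ["Acme Supply", "Zenith Corp"])
```

Raising min_confidence trades coverage for precision, which corresponds to the "accuracy and precision desired" mentioned above.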
The flowcharts and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of embodiments of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to embodiments of the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of embodiments of the invention. The embodiment was chosen and described in order to best explain the principles of embodiments of the invention and the practical application, and to enable others of ordinary skill in the art to understand embodiments of the invention for various embodiments with various modifications as are suited to the particular use contemplated.
Although specific embodiments have been illustrated and described herein, those of ordinary skill in the art appreciate that any arrangement which is calculated to achieve the same purpose may be substituted for the specific embodiments shown and that embodiments of the invention have other applications in other environments. This application is intended to cover any adaptations or variations of the present invention. The following claims are in no way intended to limit the scope of embodiments of the invention to the specific embodiments described herein.
Number | Date | Country | |
---|---|---|---|
20170206272 A1 | Jul 2017 | US |