This document relates generally to computer-implemented clustering systems and more particularly to performing clustering operations for determining action(s) to be taken with respect to an entity.
The financial industry processes an inordinate number of transactions for its current or prospective customers. Many of these transactions demand that some action be taken on the part of a financial company in order to more completely handle a transaction. As an example, an individual working for a credit card company may be tasked with determining which credit card transactions require an investigation or inquiry into whether a transaction may be fraudulent. The problem may be further compounded if there are multiple possible actions that can be taken with respect to the transactions. Current methods have difficulty in providing an automated or semi-automated mechanism for determining what action, if any, should be taken for a particular individual or group of individuals.
In accordance with the teachings provided herein, systems and methods for operation upon data processing devices are provided for determining one or more actions to be taken with respect to a first entity. As an example, a computer-implemented method and system can be configured to receive data that is related to characteristics of the first entity as well as data that is related to a plurality of segments. Assignments are determined between the first entity and the segments based upon the characteristics of the first entity and the characteristics associated with the segments. A determined assignment includes a membership probability that is indicative of how probable is membership of the first entity with respect to a segment. One or more actions are determined for the first entity based upon the membership probabilities and action information associated with the assigned segments.
As another example, a computer-implemented method and system can be configured to receive data that is related to characteristics of the first entity as well as data that is related to a plurality of segments. A segment identifies entities having one or more similar characteristics. A segment is associated with action information. Assignments are determined between the first entity and the segments based upon the characteristics of the first entity and the characteristics associated with the segments. A determined assignment includes a membership probability that is indicative of how probable is membership of the first entity with respect to a segment. One or more actions are determined for the first entity based upon the membership probabilities and the action information associated with the assigned segments.
System 30 determines what action or actions 80 should be performed with respect to an entity 40. In system 30, process 50 creates assignments 60 between segments and the entity 40. Each of the segments is associated with action information (e.g., decision information) so that if an assignment 60 is made between the entity 40 and a particular segment, that particular segment's associated action information is also made part of the assignment. The action information of a selected segment 62 is then used by process 70 to determine what action 80 (if any) should be performed with respect to the entity 40.
The types of candidate segments in pool 52 depend upon the situation at hand. For example, within a credit card transaction processing environment, segments can include, among others, a revolver segment and a transactor segment. In this example, a revolver segment includes segment entity members who roll over part of the bill to the next month, instead of paying off the balance in full each month. A transactor segment includes segment entity members who typically pay off the balance in full each month. Accordingly, if an entity is assigned to the revolver segment, then the action information (e.g., raise the entity's credit limit, etc.) associated with the revolver segment is used to determine what action should be taken for the entity.
It should be understood that many different types of entities can be processed by the system 30. As an illustration, entities can be an individual or a collection of individuals (e.g., companies) that are conducting financial transactions. Moreover, an entity does not have to constitute individuals; rather, an entity may be representative of another aspect of the process for which determination of an action needs to be performed.
With reference to
The assignments 60 include an assignment between the entity and the second segment from the pool of segments 52. There is also an assignment between the entity and the fourth segment. By evaluating the characteristics 110 of the entity 40 with the segments' associated characteristics, process 50 has determined that the assignment between the second segment and the entity has a membership probability value of 0.3. Process 50 has also determined that the assignment between the fourth segment and the entity has a membership probability value of 0.65.
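The selection step in this example can be sketched in a few lines of Python. This is a minimal illustration only, assuming that the segment with the highest membership probability is the one whose action information is used; the function name `select_action`, the segment keys, and the action labels are hypothetical and do not come from the description.

```python
# Hypothetical action information keyed by segment; labels are illustrative only.
SEGMENT_ACTIONS = {
    "segment_2": "action_for_segment_2",
    "segment_4": "action_for_segment_4",
}


def select_action(assignments, segment_actions):
    """Return (segment, probability, action) for the most probable assignment.

    `assignments` maps a segment name to its membership probability.
    Returns None when there are no assignments.
    """
    if not assignments:
        return None
    segment, probability = max(assignments.items(), key=lambda item: item[1])
    return segment, probability, segment_actions.get(segment)


# Membership probabilities taken from the example above.
assignments = {"segment_2": 0.3, "segment_4": 0.65}
selected = select_action(assignments, SEGMENT_ACTIONS)
print(selected)
```

Other selection policies (for example, blending the action information of several assigned segments in proportion to their probabilities) would fit the same data structure.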
As indicated by the higher value in this example, process 50 of
Process 70 of
It should be understood that similar to the other processing flows described herein, the steps and the order of the steps in this example may be altered, modified, removed and/or augmented and still achieve the desired outcome. As an example, a multiprocessing or multitasking environment can allow two or more steps to be executed concurrently.
A process for determining what segment should be assigned to which entities can take many forms. For example, the segments can be designed to optimize a predefined utility function, such as credit risk, attrition risk, profitability, etc.
Process 310 then assigns actions to each of the segments. This assignment at process 310 can include analyzing historical data of a segment to determine which actions were more effective in handling entities contained within the segment. The more effective actions can then be assigned to the segment.
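One way to picture this analysis of historical data is the following Python sketch. It assumes, purely for illustration, that "more effective" means the highest observed success rate among past records for a segment; the record layout, the function name `assign_actions`, and the toy records are hypothetical and are not data from the description.

```python
from collections import defaultdict


def assign_actions(history):
    """Pick, for each segment, the action with the highest observed success rate.

    `history` is an iterable of (segment, action, succeeded) records.
    """
    counts = defaultdict(lambda: [0, 0])  # (segment, action) -> [successes, trials]
    for segment, action, succeeded in history:
        tally = counts[(segment, action)]
        tally[0] += 1 if succeeded else 0
        tally[1] += 1

    best = {}  # segment -> (action, success rate)
    for (segment, action), (wins, trials) in counts.items():
        rate = wins / trials
        if segment not in best or rate > best[segment][1]:
            best[segment] = (action, rate)
    return {segment: action for segment, (action, _) in best.items()}


# Made-up toy records, for illustration only.
history = [
    ("revolver", "raise_credit_limit", True),
    ("revolver", "raise_credit_limit", True),
    ("revolver", "offer_convenience_checks", False),
    ("transactor", "offer_convenience_checks", True),
    ("transactor", "raise_credit_limit", False),
]
segment_actions = assign_actions(history)
print(segment_actions)
```

A production system would likely also account for sample size and ties; this sketch ignores both.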
It should be understood that assignments can be created in other ways, such as by performing a design of experiments using the characteristics of the first entity and the characteristics associated with the segments. An example of using a design of experiments for this purpose includes identifying the effect of credit line increase on credit risk, profitability etc.
As a description of each of these categories, situational cash revolvers can be defined as a customer who has carried a cash revolving balance at least 2 consecutive months out of the last 6 months; situational revolvers can be defined as a customer who is not a situational cash revolver but has carried a revolving balance at least 2 consecutive months out of the last 6 months; and situational transactors can be defined as any customer who does not fall into the aforementioned two categories. For example, segment 1 can contain, to an extent as specified by a membership probability, accounts having statement balances greater than $1000, credit limit less than $5000, late fees less than $50, and delinquency amount less than $100.
As an illustration, for the first account (i.e., account number 5490098403730050), the membership probability for segment 1 is 0.00 since the first account did not share to any significant extent the characteristics that are used to describe segment 1. The first account has a membership probability of 0.42 for the second segment and has a membership probability of 0.58 for the third segment. These membership probabilities show to what extent the first account can be considered a member of a particular segment (e.g., to what extent an account can be considered a situational revolver, a situational transactor, and a situational cash revolver) and are determined based upon how well the account's characteristics compare to characteristics that define the segments. Different comparison algorithms can be used to determine to what extent an account should be clustered with a particular segment, such as a standard k-means clustering method.
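The description does not specify how a comparison against segment characteristics is turned into a probability. As one hedged possibility, the Python sketch below converts distances from an account to segment centroids (such as those a k-means run might produce) into memberships that sum to one, using a softmax over negative squared distances. The softmax conversion, the `temperature` parameter, and the toy feature values are all assumptions made for illustration, not the method described here.

```python
import math


def membership_probabilities(account, centroids, temperature=1.0):
    """Soft memberships from squared Euclidean distances to segment centroids.

    Closer centroids receive higher probability; the results sum to 1.
    """
    scores = []
    for centroid in centroids:
        dist_sq = sum((a - c) ** 2 for a, c in zip(account, centroid))
        scores.append(-dist_sq / temperature)
    top = max(scores)  # subtract the max before exponentiating, for stability
    weights = [math.exp(score - top) for score in scores]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical, already-scaled feature vectors (two features per account).
account = [0.2, 0.8]
centroids = [[5.0, 5.0], [0.0, 1.0], [0.5, 0.5]]
probs = membership_probabilities(account, centroids)
print([round(p, 2) for p in probs])
```

In practice the features would need to be placed on comparable scales first, since raw dollar amounts would otherwise dominate the distance.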
Without a membership probability approach, another approach (e.g., a hard segmentation approach) that is used for model building can misclassify customers that for example have revolved 3 out of the last 6 months, but not consecutively, such as every other month. With membership probabilities, these customers are likely to have a higher probability of membership in the “situational revolver” segment and their data will be used appropriately. With the hard segmentation approach, these customers would be classified as “situational transactors” and the models are not likely to be as predictive.
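The hard segmentation rule and the misclassification it can produce can be sketched as follows. The category definitions follow the ones given above (at least 2 consecutive months out of the last 6); the function names and the boolean-list representation of monthly behavior are hypothetical choices made for this illustration.

```python
def has_consecutive(flags, run_length=2):
    """Return True if `flags` contains at least `run_length` consecutive True values."""
    run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        if run >= run_length:
            return True
    return False


def hard_segment(cash_revolved, revolved):
    """Assign exactly one category from six months of history.

    Each argument is a list of six booleans, one per month.
    """
    if has_consecutive(cash_revolved):
        return "situational cash revolver"
    if has_consecutive(revolved):
        return "situational revolver"
    return "situational transactor"


# A customer who revolved 3 of the last 6 months, but never consecutively.
every_other_month = [True, False, True, False, True, False]
no_cash = [False] * 6
label = hard_segment(no_cash, every_other_month)
print(label)
```

Under this rule the every-other-month customer falls through to the transactor category, which is the case a membership probability approach is intended to handle more gracefully.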
The action-effect models can be used to determine what action should be taken with respect to an entity. In this example, an action can be what product (e.g., credit life insurance, magazine, convenience checks, and a free gift) should be offered to the customer holding the account. Probabilities of offer acceptance are determined based upon the action-effect models. Additional information (e.g., the derived variable data of
AE5=(i.e., the first account's probability of accepting credit life insurance)=0.04=$S5*(0.01+0.06*$W5/100+0.03*$Y5/100+0.03*$AA5/100)+$T5*(0.01+0.03*$W5/100+0.02*$X5/100+0.01*$Y5/100+0.02*$Z5+0.01*$AA5/100+0.02*$AB5)+$U5*(0.01+0.06*$W5/100+0.03*$Y5/100+0.03*$AA5/100)
AF5=0.05=S5*(0.02+0.07*W5/100+0.04*Y5/100)+T5*(0.04+0.05*W5/100+0.04*Y5/100+0.03*Z5+0.03*AA5/100)+U5*(0.06+0.04*W5/100+0.05*Y5/100)
AG5=0.10=$S5*(0.01+0.03*$Z5/100)+$T5*(0.01+0.01*$W5/100+0.01*$Y5/100+0.01*$Z5+0.03*$AA5/100+0.04*$AB5)+$U5*(0.01+0.07*$W5/100+0.06*$X5+0.06*$Y5/100)
AH5=0.11=$S5*(0.01+0.01*$W5/100+0.03*$X5/100+0.03*$AB5/100)+$T5*(0.02+0.02*$W5+0.02*$X5/100+0.04*$AB5)+$U5*(0.04+0.06*$W5/100+0.07*$X5/100+0.03*$AB5/100)
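The spreadsheet formulas above share one structure: each segment has its own linear model, and the segment outputs are weighted by the account's membership probabilities. The Python sketch below shows that structure for the credit life insurance formula. The coefficients are copied from that formula with the `/100` scaling folded in; treating S5, T5 and U5 as the three membership probabilities (0.00, 0.42, 0.58) is an assumption, and the values given for columns W through AB are hypothetical placeholders, so the result is not expected to reproduce the 0.04 shown above.

```python
def segment_score(intercept, coefficients, variables):
    """Evaluate one segment's linear model: intercept + sum(coef * variable)."""
    return intercept + sum(coef * variables[name] for name, coef in coefficients.items())


def mixed_probability(memberships, segment_models, variables):
    """Membership-weighted sum of the per-segment model outputs."""
    return sum(
        membership * segment_score(intercept, coefficients, variables)
        for membership, (intercept, coefficients) in zip(memberships, segment_models)
    )


# Coefficients from the credit life insurance formula (the /100 scaling is folded in).
credit_life_models = [
    (0.01, {"W": 0.06 / 100, "Y": 0.03 / 100, "AA": 0.03 / 100}),
    (0.01, {"W": 0.03 / 100, "X": 0.02 / 100, "Y": 0.01 / 100,
            "Z": 0.02, "AA": 0.01 / 100, "AB": 0.02}),
    (0.01, {"W": 0.06 / 100, "Y": 0.03 / 100, "AA": 0.03 / 100}),
]

memberships = [0.00, 0.42, 0.58]  # assumed to correspond to S5, T5, U5
# Hypothetical column values; the actual account data is not reproduced here.
variables = {"W": 10.0, "X": 5.0, "Y": 20.0, "Z": 0.5, "AA": 15.0, "AB": 0.0}

p_accept = mixed_probability(memberships, credit_life_models, variables)
print(round(p_accept, 5))
```

Because the weights are probabilities, an account that belongs entirely to one segment is scored by that segment's model alone.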
The membership probabilities can also be used to predict revenue:
AJ5=(i.e., the first account's predicted revenue with respect to credit life insurance)=$8.73=AE5*($S5*(50+12.6*$Z5/100+14.3*$AB5)+$T5*(66.1+13.4*$Z5/100+23.9*$AB5)+$U5*(81.9+14.7*$Z5/100+34.1*$AB5))
AK5=$5.64=AF5*($S5*(20.1+7.6*$Z5/100+4.3*$AB5)+$T5*(26.1+7.4*$Z5/100+12.4*$AB5)+$U5*(31.9+6.4*$Z5/100+24.1*$AB5))
AL5=$10.64=AG5*($S5*(30.4+91*$Z5/100+8.3*$AB5)+$T5*(34.1+8.9*$Z5/100+11.4*$AB5)+$U5*(41.3+8.7*$Z5/100+21.6*$AB5))
AM5=$5.25=AH5*($S5*(6.1+4.3*$Z5/100+2.1*$AB5)+$T5*(9.3+6.9*$Z5/100+4.6*$AB5)+$U5*(14.6+7.9*$Z5/100+12.5*$AB5))
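The revenue formulas multiply an offer's acceptance probability by a membership-weighted revenue estimate. A minimal Python sketch of that structure follows, using the coefficients from the credit life insurance revenue formula. The function name `expected_revenue` is hypothetical, treating S5, T5 and U5 as the memberships 0.00, 0.42 and 0.58 is an assumption, and the Z and AB inputs are set to zero only as placeholders because the account's actual values are not reproduced here; the result therefore does not match the $8.73 shown above.

```python
def expected_revenue(p_accept, memberships, revenue_models, z, ab):
    """Acceptance probability times the membership-weighted revenue if accepted.

    Each entry of `revenue_models` is (base, z_coef, ab_coef) for one segment,
    mirroring the terms base + z_coef * Z / 100 + ab_coef * AB in the formulas.
    """
    revenue_if_accepted = sum(
        membership * (base + z_coef * z / 100 + ab_coef * ab)
        for membership, (base, z_coef, ab_coef) in zip(memberships, revenue_models)
    )
    return p_accept * revenue_if_accepted


# Coefficients from the credit life insurance revenue formula.
credit_life_revenue = [
    (50.0, 12.6, 14.3),
    (66.1, 13.4, 23.9),
    (81.9, 14.7, 34.1),
]

# Z and AB are placeholder values, not the account's real data.
revenue = expected_revenue(0.04, [0.00, 0.42, 0.58], credit_life_revenue, z=0.0, ab=0.0)
print(round(revenue, 2))
```

Computing this figure for each candidate offer gives one basis for comparing actions for the same account.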
While examples have been used to disclose the invention, including the best mode, and also to enable any person skilled in the art to make and use the invention, the patentable scope of the invention is defined by claims, and may include other examples that occur to those skilled in the art. Accordingly the examples disclosed herein are to be considered non-limiting. As an illustration, the systems and methods may be implemented on various types of computer architectures, such as for example on a networked system, on a single general purpose computer, etc.
As an illustration,
A server 438 accessible through the network(s) 436 can host the clustering system 434. The same server or different servers can contain the various software instructions 435 (e.g., instructions for creating segment assignments, instructions for determining which actions should be taken, etc.) or modules of the clustering system 434. Data store(s) 440 can store the data to be analyzed as well as any intermediate or final data calculations and data results.
The clustering system 434 can be a web-based analysis and reporting tool that provides users flexibility and functionality for performing action determination for one or many entities. Moreover, the clustering system 434 can be used separately or in conjunction with other software programs, such as with other decision making software techniques.
It should be understood that the clustering system 434 can be implemented in many different ways, such as on a stand-alone computer for access by a user as shown in
It is further noted that the systems and methods may include data signals conveyed via networks (e.g., local area network, wide area network, internet, combinations thereof, etc.), fiber optic medium, carrier waves, wireless networks, etc. for communication with one or more data processing devices. The data signals can carry any or all of the data disclosed herein that is provided to or from a device.
Additionally, the methods and systems described herein may be implemented on many different types of processing devices by program code comprising program instructions that are executable by the device processing subsystem. The software program instructions may include source code, object code, machine code, or any other stored data that is operable to cause a processing system to perform methods described herein. Other implementations may also be used, however, such as firmware or even appropriately designed hardware configured to carry out the methods and systems described herein.
The systems' and methods' data (e.g., associations, mappings, etc.) may be stored and implemented in one or more different types of computer-implemented ways, such as different types of storage devices and programming constructs (e.g., data stores, RAM, ROM, Flash memory, flat files, databases, programming data structures, programming variables, IF-THEN (or similar type) statement constructs, etc.). It is noted that data structures describe formats for use in organizing and storing data in databases, programs, memory, or other computer-readable media for use by a computer program.
The systems and methods may be provided on many different types of computer-readable media including computer storage mechanisms (e.g., CD-ROM, diskette, RAM, flash memory, computer's hard drive, etc.) that contain instructions (e.g., software) for use in execution by a processor to perform the methods' operations and implement the systems described herein.
The computer components, software modules, functions, data stores and data structures described herein may be connected directly or indirectly to each other in order to allow the flow of data needed for their operations. It is also noted that a module or processor includes but is not limited to a unit of code that performs a software operation, and can be implemented for example as a subroutine unit of code, or as a software function unit of code, or as an object (as in an object-oriented paradigm), or as an applet, or in a computer script language, or as another type of computer code. The software components and/or functionality may be located on a single computer or distributed across multiple computers depending upon the situation at hand.
It should be understood that as used in the description herein and throughout the claims that follow, the meaning of “a,” “an,” and “the” includes plural reference unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise. Finally, as used in the description herein and throughout the claims that follow, the meanings of “and” and “or” include both the conjunctive and disjunctive and may be used interchangeably unless the context expressly dictates otherwise; the phrase “exclusive or” may be used to indicate a situation where only the disjunctive meaning may apply.
This application claims priority to and the benefit of U.S. Application Ser. No. 60/902,379, (entitled “Computer-Implemented Systems and Methods For Action Determination” and filed on Feb. 20, 2007), of which the entire disclosure (including any and all figures) is incorporated herein by reference. This application contains subject matter that may be considered related to subject matter disclosed in: U.S. Application Ser. No. 60/902,378, (entitled “Computer-Implemented Modeling Systems and Methods for analyzing Computer Network Intrusions” and filed on Feb. 20, 2007); U.S. Application Ser. No. 60/902,380, (entitled “Computer-Implemented Semi-supervised Learning Systems And Methods” and filed on Feb. 20, 2007); U.S. Application Ser. No. 60/902,381, (entitled “Computer-Implemented Guided Learning Systems and Methods for Constructing Predictive Models” and filed on Feb. 20, 2007); U.S. Application Ser. No. 60/786,039 (entitled “Computer-Implemented Predictive Model Generation Systems And Methods” and filed on Mar. 24, 2006); U.S. Application Ser. No. 60/786,038 (entitled “Computer-Implemented Data Storage For Predictive Model Systems” and filed on Mar. 24, 2006); and to U.S. Provisional Application Ser. No. 60/786,040 (entitled “Computer-Implemented Predictive Model Scoring Systems And Methods” and filed on Mar. 24, 2006); of which the entire disclosures (including any and all figures) of all of these applications are incorporated herein by reference.
Number | Date | Country |
---|---|---|
60902379 | Feb 2007 | US |