The foregoing and other objects, aspects and advantages will be better understood from the following detailed description of a preferred embodiment of the invention with reference to the drawings, in which:
Referring now to the drawings, and more particularly to
The method described here uses several relationships to evaluate the models and related model data. These relationships utilize the following variables:
A database 101 contains customer data (x, y), where x are different properties of customers and y is the quantity of interest (e.g., the revenue generated by the customer). It should be noted that y is a vector of length n and x is a matrix of length n and width equal to the number of different customer properties (also called features). The database also contains k models (m1, . . . , mk) that can predict, for each customer i, the quantity of interest given xi, where ŷi=ml(xi) for model l.
In function block 110, the customer data (x, y) are sorted in increasing order of y and stored in the database 101. The resulting sorted customer data (xi, yi) has the property yi>yj if i>j for i, j=1, . . . , n customers.
In function block 111, all models are applied to the sorted customer properties x to obtain predictions ŷi=ml(xi) for all customers i from all models l. In function block 112, the respective predicted rank rl of the predictions ŷl is calculated for each model l. Note that each rl is a vector of length n and that the order of the entries in vector rl still reflects the order of the true value y. For example, if there are three customers with ordered revenue values $3, $15, and $57 for which model l predicts revenue values of $5, $100, and $0, the predicted ranking would be r=2, 3, and 1. Formally, the predicted rank of customer and/or potential customer i, written si in the equations that follow, is:
$$s_i = \left|\{\, j \le n \mid \hat{y}_j \le \hat{y}_i \,\}\right| \qquad (1)$$
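The predicted rank can be illustrated with a minimal sketch that counts, for each customer, how many predictions are less than or equal to that customer's prediction, which reproduces the ranking 2, 3, 1 of the example above. The function name `predicted_ranks` is hypothetical and is not part of the described system; tied predictions, which the text does not address, simply share the larger rank here.

```python
def predicted_ranks(predictions):
    """Return, for each prediction, the number of predictions <= it.

    `predictions` is assumed to be listed in order of increasing true value y.
    """
    return [sum(1 for other in predictions if other <= p) for p in predictions]


# Example from the text: true revenues $3, $15, $57 (already sorted),
# with model predictions $5, $100 and $0.
ranks = predicted_ranks([5.0, 100.0, 0.0])
print(ranks)  # [2, 3, 1]
```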
The invention considers two ranking-based evaluation measures and their interpretations (e.g., ranking order entries in model ranking table, etc.). Function block 113 calculates for each model the number of ranking order switches:
$$T = \sum_{i<j} \mathbf{1}\{s_i > s_j\} \qquad (2)$$
and the weighted sum of order switches:
$$R = \sum_{i<j} (j-i)\,\mathbf{1}\{s_i > s_j\} \qquad (3)$$
The first measure simply counts how many of the pairs in the test data are ordered incorrectly by the model m(x). The second measure also considers these incorrect orderings, but weighs each of them by j−i, the difference between the positions of the two observations in the true ordering, that is, a measure of the magnitude of the error being committed. The results of each of these steps are stored electronically in the system database 101.
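The two measures can be sketched as a direct double loop over all pairs, assuming the predicted ranks are supplied in order of increasing true value. The function name `order_switches` is hypothetical, and the quadratic loop is chosen for clarity rather than efficiency.

```python
def order_switches(s):
    """Return (T, R) for predicted ranks s listed by increasing true value.

    T counts the pairs i < j with s[i] > s[j]; R weights each such pair by j - i.
    """
    n = len(s)
    T = 0
    R = 0
    for i in range(n):
        for j in range(i + 1, n):
            if s[i] > s[j]:  # the pair (i, j) is ordered incorrectly
                T += 1
                R += j - i
    return T, R


# Ranks from the three-customer example: two switched pairs, weighted sum 3.
print(order_switches([2, 3, 1]))  # (2, 3)
```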
In function block 114, the ranking measures are transformed using rescaling equations to put them into the range [−1, 1], where 1 corresponds to perfect model performance (T,R=0) and −1 corresponds to making all possible errors, thus attaining perfect reverse ranking. It is easy to verify that max(T)=n(n−1)/2 and max(R)=n(n−1)(n+1)/6. The rescaling equations are:
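The rescaling equations themselves are not reproduced in this text. As a minimal sketch, and assuming the rescaling is linear (an assumption, not a statement of the original equations), the requirement that a measure of 0 map to 1 and the maximum map to −1 gives 1 − 2T/max(T) and 1 − 2R/max(R). The names `rescale`, `tau_hat` and `rho_hat` are hypothetical.

```python
def rescale(T, R, n):
    """Linearly map T and R into [-1, 1]; requires n >= 2.

    Assumes a linear rescaling: 0 maps to 1 and the maximum maps to -1.
    """
    max_T = n * (n - 1) / 2            # every pair switched
    max_R = n * (n - 1) * (n + 1) / 6  # every pair switched, weighted by j - i
    tau_hat = 1 - 2 * T / max_T
    rho_hat = 1 - 2 * R / max_R
    return tau_hat, rho_hat


print(rescale(0, 0, 3))  # perfect ranking: (1.0, 1.0)
print(rescale(3, 4, 3))  # perfect reverse ranking: (-1.0, -1.0)
```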
These values are similar to Kendall's τ, which measures the strength of the relationship between two variables, and to Spearman's rank correlation ρ. The moments of τ̂ and ρ̂ under the relevant null assumptions (τ=0 and ρ=0, respectively) are calculated, and a normal approximation gives a hypothesis testing methodology for the assumption of no correlation. For residual-based measures, it is typically not possible to build confidence intervals without parametric assumptions and/or variance estimation. The non-parametric nature of τ̂ allows a general expression for its variance to be written as:
where πc=E(τ̂)=½+½τ and πcc are two properties of the ranking function. This is then replaced with the sample means to obtain:
is the number of observations that are “concordant” with observation i, that is, observations whose ranking relative to i in the model data agrees with the ranking by model scores (as plotted in
In function block 115, three graphical representations of the ordered switches are constructed: (1) percent of correctly ranked pairs involving a particular prediction, (2) AUC as a function of cutoff position, and (3) Lift-curve of the cumulative rank. Examples of these graphical representations are shown respectively in
Starting from the largest model prediction m(xn), function block 115 calculates in decreasing order for each observation xi the percentage of correctly ranked pairs (yi, yj) over all j≠i. The percentage of correct pairings as a function of the inverse rank is shown in
The performance in a particular region of the graph is characterized by two properties of the plot, 1) the distance of the local optimum from the 100% line, and 2) the distance of the actual performance from the local optimum. A particular region with a performance that on average remains very close to the local optimum has a nearly perfect ranking and is only disturbed by bad predictions that were either larger or smaller than the predictions of the region.
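The per-observation percentage of correctly ranked pairs described above can be sketched as follows, assuming the predicted ranks are distinct and are supplied in order of increasing true value. The function name `percent_correct_pairs` is hypothetical, and the local optimum discussed above is not computed here.

```python
def percent_correct_pairs(s):
    """For each observation i, the percentage of pairs (i, j), j != i, for
    which the predicted order agrees with the true order.

    Assumes distinct predicted ranks s, listed by increasing true value.
    """
    n = len(s)
    result = []
    for i in range(n):
        correct = sum(
            1
            for j in range(n)
            if j != i and (j > i) == (s[j] > s[i])
        )
        result.append(100.0 * correct / (n - 1))
    return result


s = [2, 3, 1]
pct = percent_correct_pairs(s)
# List the percentages starting from the largest prediction, in decreasing order.
by_decreasing_prediction = [pct[i] for i in sorted(range(len(s)), key=lambda i: -s[i])]
print(by_decreasing_prediction)  # [50.0, 50.0, 0.0]
```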
For
For each classification c(i) the model performance is evaluated using the area under the ROC curve (AUCi) (
(1-AUCi)i·(n−i).
In function block 115 of
Since each AUCi is rescaled by a different factor i·(n−i), a graph of the AUCi as a function of the cutoff i would not have an area equal to ρ̃. In order to achieve a direct correspondence with ρ̃, i·(n−i) units are allocated to AUCi by rescaling the x-axis accordingly.
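The role of the factor i·(n−i) can be illustrated with a minimal sketch. It assumes, since the definition of the classification c(i) is not reproduced in this text, that cutoff i labels the i observations with the smallest true values as negatives and the remaining n−i as positives, and that the predicted ranks are distinct. AUCi is then the fraction of (negative, positive) pairs in which the positive has the higher predicted rank. The function name `auc_by_cutoff` is hypothetical.

```python
def auc_by_cutoff(s):
    """Return [(i, AUC_i)] for cutoffs i = 1, ..., n - 1.

    s holds distinct predicted ranks listed by increasing true value; the
    first i observations are treated as negatives and the rest as positives
    (an assumed reading of the classification c(i)).
    """
    n = len(s)
    out = []
    for i in range(1, n):
        low, high = s[:i], s[i:]
        correct = sum(1 for a in low for b in high if a < b)
        out.append((i, correct / (i * (n - i))))
    return out


s = [2, 3, 1]
n = len(s)
# Number of misordered pairs separated by each cutoff: (1 - AUC_i) * i * (n - i).
weighted_errors = [(1 - auc) * i * (n - i) for i, auc in auc_by_cutoff(s)]
print(sum(weighted_errors))  # 3.0
```

Each incorrectly ordered pair (i, j) with i<j is separated by exactly j−i cutoffs, so the weighted errors summed over all cutoffs equal the weighted sum of order switches R of equation (3), which is 3 for this example.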
The plot in
is plotted for increasing cutoffs i in percent. Using the inverse rank emphasizes the model performance on the largest predictions, which is shown in the bottom left of the graph. The model performance is bounded above by the optimal ranks
and below by the cumulative worst (inverse) ranking
shown to be equal to
Once all models have been evaluated through function blocks 111-115, function block 116 stores the results and graphical representations in the system database 101. This information is then displayed for the analyst at display block 117. This display can be provided in any format (e.g., printed report, electronic display on a computer monitor, etc.) specified by the user. The analyst then evaluates the model based on the provided information in function block 118 and selects the model that can provide the suggested “best” list of potential customers for targeted sales and marketing efforts. The database is updated with these recommendations at function block 119 and a final report containing the optimal model and customer rankings is provided as the output 102 to be used by company personnel for targeted sales and marketing efforts.
While the invention has been described in terms of its preferred embodiment, those skilled in the art will recognize that the invention can be practiced with modification within the spirit and scope of the appended claims.