INFORMATION SECURITY SYSTEM WITH RISK ASSESSMENT BASED ON MULTI-LEVEL AGGREGATIONS OF RISK PREDICTORS

Information

  • Patent Application
  • 20190325451
  • Publication Number
    20190325451
  • Date Filed
    June 28, 2019
  • Date Published
    October 24, 2019
Abstract
A method of protecting a computer system from fraudulent use includes collecting and aggregating sets of risk predictor values for user-initiated events into user-specific aggregations and organization-wide aggregations, and in response to a current event initiated by a user, generating a risk indicator as a combination of a user-specific indicator and an organization-wide indicator based on current event parameters and the user-specific and organization-wide aggregations. Based on the risk indicator indicating that the current event may be a fraudulent use, a protective control action is taken (such as denying or modifying a requested access) to protect the computer system.
Description
BACKGROUND

The invention relates to the field of information security systems, and in particular to information security systems employing explicit risk assessment as a factor in controlling computer system operation.


An information security system may employ a specialized component referred to as a “risk engine”, which is based on machine-learning technology and utilizes a potentially large number of indicators to evaluate the risk of an activity in real-time. In known systems adaptive behavior of a risk engine relies on reliable fraud and authentication feedback and is leveraged to its full extent when some of the feedback is provided by professional analysts. In such a setting, a risk engine is capable of assessing risk level of a transaction based on the risk history (including outcomes) of the organization that deploys the system. The risk engine employs a risk model based on information specific to users, devices, cookies or other instances, which is kept and maintained by profile databases and used to calculate risk scores. Model tuning is performed periodically based on explicit fraud and authentication feedback.


SUMMARY

At least three problems may arise in relation to current use of risk engine technology. First, manual feedback (case management) may be impractical in many use cases, limiting the deployment of risk engine systems in valuable market areas. One example of such unsupervised environments is enterprise authentication. This is in contrast to certain higher-risk and perhaps higher-value use cases such as online banking, for which there is normally a cadre of security professionals analyzing cases of suspected fraud and providing valuable feedback to risk assessment algorithms being used by a risk engine. Second, there can be a non-trivial balance between organization-specific and user-specific patterns, as well as patterns known from previous knowledge. For example, in a multi-national company it is normal to observe user log-in sessions performed from many geographical locations. However, most individual users have their own log-in patterns, which are not taken into consideration. A third problem with the existing approach is the update frequency: the speed at which the model reacts to real-life changes is limited by the time period between model updates.


A new method of risk assessment is disclosed that does not depend on explicit feedback (i.e., it is an unsupervised approach), is instantly self-updating (online), and is based on multiple levels of behavioral history. In one arrangement, a two-level history is used, organization-wide and user-specific, to reflect both aspects in a risk assessment. Deeper structures are possible, such as a four-level history for Organization, Division, Department, User. This method can use the same kinds of input facts and infrastructure used by conventional supervised risk engine implementations, including for example user and device profiles. Also, it may be usable either as a stand-alone risk scoring component or as an adjunct providing additional predictor inputs to a conventional supervised technique.


In particular, a method is disclosed of protecting a computer system from fraudulent use, which includes continually collecting and aggregating sets of risk predictor values for corresponding user-initiated events (e.g., user logins) occurring in the computer system. The risk predictor values are respective parameters of the events (e.g., user location or device type) and are aggregated into user-specific aggregations for events initiated by respective users of a user population and organization-wide aggregations for events initiated by all users of the user population. In response to a current event initiated by a user, a risk indicator is generated as a combination of a user-specific indicator and an organization-wide indicator, the user-specific indicator being generated based on parameters of the current event and the user-specific aggregations, the organization-wide indicator being generated based on the parameters of the current event and the organization-wide aggregations. Then based on the risk indicator indicating that the current event may be part of a fraudulent use of the computer system, a protective control action is taken (e.g., denying access) to protect the computer system against the fraudulent use.





BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other objects, features and advantages will be apparent from the following description of particular embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views.



FIG. 1 is a block diagram of a computer system;



FIG. 2 is a schematic diagram of a risk model;



FIG. 3 is a block diagram of a risk engine from a software perspective;



FIG. 4 is a schematic diagram of a set of counters;



FIG. 5 is a flow diagram of high-level operation of the risk engine; and



FIG. 6 is a block diagram of a computer from a hardware perspective.





DETAILED DESCRIPTION

The following description uses an adaptive authentication process as a usage example. However, the method can be easily deployed to other scenarios. More generally, a probabilistic framework for risk assessment is described that assigns anomaly (risk) scores to events in time: the higher the score, the more anomalous (less probable) the event is.



FIG. 1 shows a computer system 10 including a risk engine computer or “risk engine” 12 coupled to a protected computer system 14, which may be an individual computer or a collection of computers such as a server farm of an organization. The risk engine 12 includes computer hardware executing a risk engine application (software). The protected system 14 includes applications and data (APPS/DATA) 16 and a separate risk agent (AGENT) 18 in communication with the risk engine 12.


The applications and data 16 are typically providing some type of information service, such as an online service, to a population of users such as a set of private subscribers or the employees of a company for example. The risk engine 12 and agent 18 provide security based on explicit assessment of risk levels of activities/events. Thus in one embodiment, for example, upon an initial access (e.g., login) to a service provided by the apps/data 16, the agent 18 is invoked with parameters of the login as well as other information to obtain a risk assessment as a condition of granting access. The agent 18 in turn makes a request for the risk assessment to the risk engine 12, which performs a risk assessment operation and returns a response. The response may be a so-called “risk score” identifying a level of risk as a numerical value, and the returned value is further processed by the agent 18 and/or the service to arrive at a final decision whether to grant or deny access. In some cases a soft decision might be made, and more information obtained as a condition of granting access. Alternatively, the response from the risk engine 12 may reflect further processing of a raw risk score, and could itself convey a hard access decision.
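The way an agent might turn a returned risk score into a hard or soft access decision can be sketched as follows. This is only an illustration: the `Decision` names and the two threshold values are hypothetical and do not come from the description above.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    CHALLENGE = "challenge"  # soft decision: obtain more information first
    DENY = "deny"


def decide(risk_score: float,
           challenge_at: float = 0.5,
           deny_at: float = 0.8) -> Decision:
    """Map a numerical risk score in [0, 1] to an access decision.

    Both thresholds are hypothetical policy values chosen for illustration.
    """
    if risk_score >= deny_at:
        return Decision.DENY
    if risk_score >= challenge_at:
        return Decision.CHALLENGE
    return Decision.ALLOW
```

In such a sketch the thresholds would be part of the agent's or the service's policy, while the risk engine only supplies the score.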


The focus of this description is operation of the risk engine 12. Very generally, it operates by calculating and combining risk assessments based on the following three questions:

    • 1. How unlikely is the current event to happen in the entire population (for example: an employee of a USA-based company tries to log in from Europe)?
    • 2. How unlikely is the current event to happen for a particular user (for example: George, who usually uses a PC, logs in using a Mac)?
    • 3. How risky is the current event in a more general way, regardless of the specific history of the population (for example: new users may generally be considered riskier than long-existing users)?


Components that are related to the first two questions are dynamic ones, meaning that they react to the event history, while the last component is static and is a part of general policy. The relative weight of each of the three components can be configurable according to needs. Risk associated with each dynamic component is calculated as a ratio between two respective counters. In the case of the first component (comparison to entire population) this is the ratio between the number of times that a current event occurred in the population divided by the total number of occurrences of all events in a category. For example: number of log-ins from a specific ISP divided by the total number of log-ins (for all ISPs). In the case of the second component (individual pattern) the ratio is between the number of times the event has been observed for this user and the total number of all events for this user, for the given category. For example: number of times George logged-in using a Mac divided by the total number of log-ins for George. The last (static) component of the model can be calculated using historical data on normal behavior or data on known fraud patterns, or it can be set in a more arbitrary or ad hoc manner as a part of security policy.
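As a rough numerical illustration of the three components, the sketch below computes the two counter ratios and mixes them with a static term. All counter values, the static risk value and the weights are made up, and treating the static component as a simple additive term is a simplification; the formal description later in the text folds it into the population probability instead.

```python
def frequency(hits: float, total: float) -> float:
    """Ratio between two counters; 0.0 when nothing has been counted yet."""
    return hits / total if total > 0 else 0.0


def combined_risk(population_freq: float, user_freq: float, static_risk: float,
                  w_population: float, w_user: float, w_static: float) -> float:
    """Weighted mix of the three components; weights are assumed to sum to 1.

    A rare value (low frequency) yields a high anomaly contribution.
    """
    return (w_population * (1.0 - population_freq)
            + w_user * (1.0 - user_freq)
            + w_static * static_risk)


# Hypothetical counter values, for illustration only.
population_mac = frequency(150, 1000)  # Mac log-ins / all log-ins, all users
george_mac = frequency(2, 50)          # George's Mac log-ins / all his log-ins
risk = combined_risk(population_mac, george_mac, static_risk=0.3,
                     w_population=0.4, w_user=0.4, w_static=0.2)
```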



FIG. 2 depicts a risk model 20 used by the risk engine 12. The model 20 can be viewed as a layered model in which the nodes belonging to each layer compute a function of the inputs carried on the in-edges and send the output on the out-edges. The bottom layer is the input data for “predictors” P, and the node function at this level is to assign the input data to counts of predictor values Vx, also referred to herein as “buckets”. Predictor buckets are combined into categories C at a next higher level. In principle the category bucket set is the Cartesian product of the predictors that make it up. In practice it may be a sparse set, although it is not always a sparse set. A risk score is calculated for each category based on the counter values for each bucket. An additional computational step may be applied to the resulting ratio value, for example normalization to the number of buckets that build up a category. At a next higher group (G) level, categories are grouped and explicit group scores are calculated and used. A group score may be determined, for example, by computing a maximum of the separate risk scores for the constituent categories. At the top (TL) level, overall risk is calculated from the group scores, for example by calculating a weighted average thereof.
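The two upper layers of the model can be sketched in a few lines, using the maximum and weighted-average examples mentioned above. The group names, category scores and weights below are hypothetical.

```python
def group_score(category_scores: list[float]) -> float:
    """Group (G) level: maximum of the constituent category scores."""
    return max(category_scores)


def top_level_score(group_scores: dict[str, float],
                    weights: dict[str, float]) -> float:
    """Top (TL) level: weighted average of the group scores."""
    total_weight = sum(weights[g] for g in group_scores)
    return sum(weights[g] * s for g, s in group_scores.items()) / total_weight


# Hypothetical per-category risk scores, arranged by group.
category_scores_by_group = {"location": [0.2, 0.7], "device": [0.1, 0.4]}
group_weights = {"location": 2.0, "device": 1.0}

groups = {g: group_score(s) for g, s in category_scores_by_group.items()}
overall = top_level_score(groups, group_weights)
```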


The specific arrangement in FIG. 2 is not exclusive of other possible arrangements. In particular, in some embodiments there may be value in supporting a separate entity (E) level between the G and TL levels. This would represent aggregation/calculation across each of a set of organizations or other entities however defined.


Below is presented a mathematical description of aspects of operation of the risk engine 12 according to the risk model 20. In this description, the variable “p” refers to an individual distinct predictor P. Predictors P may be assigned to multiple categories C. Each category C has a set of predictors P. A vector ci is defined as a specific realization of category C, where i is an index associated with each predictor combination. In order to simplify the notation it is denoted simply “ci”. Additionally, the users are referred to by the variable “u”.



FIG. 3 shows organization of the risk engine 12. It includes a risk score calculator 30 and a set of counters 32. The risk score calculator 30 includes a set of category calculators 34, a group score calculator (GROUP CALC) 36, and a top-level or “transaction” score calculator (TRANSACTION CALC) 38. As shown, both the risk score calculator 30 and the counters 32 receive event information (EVENT INFO) 40, and the counters 32 also receive authentication results (AUTH RESULTS) 42. The event information 40 is information pertaining to a risk score calculation being performed by the risk engine 12, also referred to as a “risk assessment operation” herein. The event info includes, among other things, input data usable to calculate current values Vx for predictors P. For example, one predictor may be the number of unsuccessful login attempts by a user over a recent period. In this case, the event information 40 includes current information about unsuccessful recent login attempts by the particular user involved in the transaction for which the current risk assessment is being performed. It is noted that the authentication results 42 are not used in current calculations, nor are they tracked over time as part of the learning or updating aspect of operation. They may be used in some embodiments to indicate whether the information for a current operation should be used to update the counters 32, reflecting a view that information for failed authentications should not be used because it may diminish accuracy of risk score calculations.
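The optional use of the authentication results 42 as a gate on counter updates might look like the sketch below. The function name and the bucket key layout are assumptions made for illustration.

```python
from collections import Counter


def maybe_update(counters: Counter, bucket: tuple, auth_succeeded: bool) -> bool:
    """Count the event only if authentication succeeded.

    Returns True when the counter was updated, False when it was skipped.
    """
    if not auth_succeeded:
        return False
    counters[bucket] += 1
    return True


counts = Counter()
maybe_update(counts, ("george", "device", "mac"), auth_succeeded=True)
maybe_update(counts, ("george", "device", "linux"), auth_succeeded=False)
```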


At a high level, the risk engine 12 basically tracks statistics via the counters 32 and uses the risk score calculator 30 to make probability calculations based on those statistics. Operation has an inherent learning aspect just by tracking detailed statistics for predictors P over a long period of system operation. One benefit of this approach, as mentioned above, is that there is no need for explicit feedback about the actual outcomes of authentications or other higher-level security operations for which risk scores have been calculated. The presently disclosed “unsupervised” technique may be used as an adjunct with the known “supervised” techniques that rely on receiving and incorporating such explicit feedback for future operation.



FIG. 4 is a schematic diagram of the counters 32. It includes an array of “primary” counters 50, which may be identified by labels um-ci where m and i are index values. A value M corresponds to the number of users in the population. The index i ranges according to the number of category buckets as explained more below. As shown, the primary counters 50 are arranged according to the categories C (shown as CA, CB, . . . CN).


Also shown in FIG. 4 are what are referred to as secondary or “derived” counters 52 that are used to track certain aggregate counts. A first type of derived counter 52-1 is identified as Smn, and there is one of these counts per category-user pair. Each tracks the total number across all category buckets of the respective category for the respective user. A second type 52-2 is identified as Tci, and there is one of these per category bucket. Each tracks the total number for the category bucket across all users. A third type 52-3 is identified as TSx, and there is one of these per category. It tracks the total number across all users and all category buckets of the respective category.
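One possible in-memory layout for the primary and derived counters is sketched below. The class name, the dictionary keys and the choice to update all four tables on every event are assumptions; the description only requires that the derived counts equal the corresponding sums of primary counts.

```python
from collections import defaultdict


class CounterSet:
    """Primary counters plus the three kinds of derived counters."""

    def __init__(self) -> None:
        self.primary = defaultdict(float)  # (user, category, bucket) -> N
        self.s = defaultdict(float)        # (user, category) -> S
        self.t = defaultdict(float)        # (category, bucket) -> T
        self.ts = defaultdict(float)       # category -> TS

    def record(self, user: str, category: str, bucket: tuple,
               weight: float = 1.0) -> None:
        """Count one event in a category bucket and keep the totals in sync."""
        self.primary[(user, category, bucket)] += weight
        self.s[(user, category)] += weight
        self.t[(category, bucket)] += weight
        self.ts[category] += weight


counters = CounterSet()
counters.record("george", "device", ("pc",))
counters.record("george", "device", ("pc",))
counters.record("george", "device", ("mac",))
counters.record("anna", "device", ("mac",))
```

Keeping the derived counts up to date incrementally avoids summing over all users or buckets at scoring time.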


Formal Description for Per-Category Risk Score Calculations


The risk score of the i-th value of category C for user u is given by






$$R_{c_i}^{u} = \beta_c\,P(\bar{c}_i) + (1-\beta_c)\,P(\bar{c}_i \mid u) \qquad (1)$$


where $\beta_c \in [0, 1]$ is a fraction parameter (potentially configurable) that provides the relative weight of the “global” and “user” anomaly values; $P(\bar{c}_i)$ is the probability of not seeing the i-th value of category C in the overall population, and $P(\bar{c}_i \mid u)$ is the probability of not seeing the i-th value of category C for user u:






$$P(\bar{c}_i) = 1 - P(c_i) \qquad (2)$$

$$P(\bar{c}_i \mid u) = 1 - P(c_i \mid u) \qquad (3)$$


where $P(c_i \mid u)$ is the probability of $c_i$ occurring given user u, and $P(c_i)$ is the probability of $c_i$ occurring across the entire user population.


$P(c_i)$ may be assumed to be of the form:






$$P(c_i) = \gamma_c\,P_A(c_i) + (1-\gamma_c)\,P^{0}_{c_i} \qquad (4)$$


where $P_A(c_i)$ is a dynamic or “adaptive” probability value for $c_i$ and $P^{0}_{c_i}$ is a static a priori probability for $c_i$. The relative weight between $P^{0}_{c_i}$ and the adaptive probability of $c_i$ is controlled by the parameter $\gamma_c$. The $P^{0}_{c_i}$ term is associated with question #3 above.


The probabilities in eq. 2 and eq. 3 are estimated from frequencies associated with each of them:










$$P(c_i \mid u) = \frac{N_{c_i}^{u}}{S_{C}^{u}} \qquad (5)$$

$$P_A(c_i) = \frac{T_{c_i}}{TS_{C}} \qquad (6)$$







where $N_{c_i}^{u}$ is an estimate of the number of events for which user u hit the i-th value of category C in a given time window (tracked by the primary counters 50), and S, T and TS are the derived counts 52 described above:






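$$S_{C}^{u} = \sum_{c_i \in C} N_{c_i}^{u} \qquad (7)$$

$$T_{c_i} = \sum_{u} N_{c_i}^{u} \qquad (8)$$

$$TS_{C} = \sum_{c_i \in C} \sum_{u} N_{c_i}^{u} \qquad (9)$$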


Using expressions 2, 3 and 4, the score (1) can be re-written as






$$R_{c_i}^{u} = \beta_c\left[1 - \gamma_c\,P_A(c_i) - (1-\gamma_c)\,P^{0}_{c_i}\right] + (1-\beta_c)\left(1 - P(c_i \mid u)\right) \qquad (10)$$


where the term $\beta_c\left[1 - \gamma_c\,P_A(c_i) - (1-\gamma_c)\,P^{0}_{c_i}\right]$ is the global part of the model, while the term $(1-\beta_c)\left(1 - P(c_i \mid u)\right)$ is the individual (user-specific) part.
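Expression (10) translates directly into a small function once the counter values are available. In the sketch below the parameter names, the default weights and the handling of empty counters (a zero total yields a zero frequency) are assumptions; the counter values in the example are made up.

```python
def category_risk(n_user_bucket: float, s_user: float,
                  t_bucket: float, ts_category: float,
                  prior: float, beta: float = 0.5, gamma: float = 0.5) -> float:
    """Per-category risk score following expression (10).

    n_user_bucket / s_user estimates P(c_i | u), expression (5);
    t_bucket / ts_category estimates P_A(c_i), expression (6);
    prior is the static a priori probability of the bucket.
    """
    p_user = n_user_bucket / s_user if s_user > 0 else 0.0
    p_adaptive = t_bucket / ts_category if ts_category > 0 else 0.0
    p_global = gamma * p_adaptive + (1.0 - gamma) * prior   # expression (4)
    return beta * (1.0 - p_global) + (1.0 - beta) * (1.0 - p_user)


# Hypothetical counts: the bucket was seen 2 times in 50 events for this user
# and 40 times in 1000 events across the population.
score = category_risk(n_user_bucket=2, s_user=50, t_bucket=40, ts_category=1000,
                      prior=0.1, beta=0.5, gamma=0.8)
```

With `beta=1.0` the score depends only on the population and the prior, and with `beta=0.0` only on the user's own history.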


A specific example below provides illustration for the above description. As mentioned above, for a given transaction involving a given user, the individual $R_{c_i}^{u}$ scores are grouped and explicit group (G) scores are calculated and used. A group score may be determined by computing a maximum of the separate risk scores for the constituent categories. At the top (TL) level, overall risk is calculated from the group scores, for example by calculating a weighted average thereof.


Aging of Event Counts


It is desirable to apply aging to the event counts to allow the system to adapt over time. Aging effectively provides a higher weighting of more recent events than more distant events in the past. In one embodiment an exponential decay may be employed for aging. The estimation of the total number of events in a window-time d at time t is given by the recursive relation






$$N(t) = \eta(t_k) + \alpha^{\Delta t}\,N(t-\Delta t) \qquad (11)$$


where N(t−Δt) denotes the value of the last update of the event-count estimate, which took place at time (t−Δt), and η(t_k) is the event weight of the k-th transaction. In the simple counting scenario, all η values equal 1. The decay factor is given by









$$\alpha = \frac{1}{\sqrt[d]{e}} = e^{-1/d} \qquad (12)$$







This expression is valid under the assumption of continuous time. In the discrete-time case, the decay factor becomes









$$\alpha = \frac{d-1}{d} \qquad (13)$$







Initial Values


Theoretically, the initial counter value should be zero (i.e., N(t=0) = 0). At a constant event rate (i.e., Δt = const) with exponential weighting, the counter value approaches an asymptote:











$$\lim_{t \to \infty} N(t) = \lim_{t \to \infty} \frac{\left(\alpha^{\Delta t}\right)^{t} - 1}{\alpha^{\Delta t} - 1} = \frac{1}{1 - \alpha^{\Delta t}} \qquad (14)$$







The time it takes to get to the asymptote depends on the event rate and on α. It can be shortened by setting $N_0 \leftarrow N(t=0) = (1-\alpha^{\Delta t})^{-1}$. The source of this parameter can be a result of some preliminary study (for example: average number of logins in a certain period of time), theoretical analysis of a suggested model (for example: expected event rate at equilibrium) or a corresponding N value from the same model applied to a different instance (for example: values from a model acting on Bank A transferred to Bank B). Thus, a model that is based on these counters is expected to show reasonable performance immediately, without waiting for counter values to reach operational values. At the same time, the use of aging also ensures model accommodation to user and corporate history and that any incorrectly assigned initial values will eventually fade out and not interfere with model performance.
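The aging recursion, the two decay factors and the effect of a warm start can be tried out numerically. The sketch below assumes that the continuous-time decay factor of expression (12) reads as e^(−1/d); the 30-step window and the unit event spacing are hypothetical.

```python
import math


def decay_continuous(d: float) -> float:
    """Assumed reading of expression (12): alpha = e^(-1/d)."""
    return math.exp(-1.0 / d)


def decay_discrete(d: float) -> float:
    """Expression (13): alpha = (d - 1) / d."""
    return (d - 1.0) / d


def aged_update(previous: float, dt: float, alpha: float,
                weight: float = 1.0) -> float:
    """Recursion (11): decay the old estimate over dt, then add the new event."""
    return weight + (alpha ** dt) * previous


def asymptote(alpha: float, dt: float) -> float:
    """Expression (14): limiting count at a constant event spacing dt."""
    return 1.0 / (1.0 - alpha ** dt)


alpha = decay_discrete(30.0)   # hypothetical 30-step window
limit = asymptote(alpha, dt=1.0)

cold = 0.0                     # theoretical initial value N(t=0) = 0
warm = limit                   # warm start at the asymptote
for _ in range(1000):
    cold = aged_update(cold, 1.0, alpha)
    warm = aged_update(warm, 1.0, alpha)
```

With the discrete decay factor and unit spacing the asymptote equals the window length d, so a counter that is warm-started at that value stays there, while a counter started at zero needs many events to approach it.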


Explicit Evolution Equations


The recursive expression (11) can be written explicitly as











$$N(t) = \sum_{k \ge 1} \alpha^{t - t_k}\,\eta(t_k) + N_0\,\alpha^{t - t_1}, \qquad (15)$$







where $t_k$ is the time stamp of the k-th transaction and $t_1$ is the time stamp of the first appearance.
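A quick way to check that the explicit form agrees with the recursion is to evaluate both on the same event sequence. The sketch below uses unit event weights and made-up time stamps, and it assumes that the initial value N0 is taken at the time of the first event.

```python
def recursive_count(timestamps: list[float], alpha: float,
                    n0: float = 0.0) -> float:
    """Apply recursion (11) event by event; result refers to the last event."""
    count = n0
    previous_time = timestamps[0]
    for t in timestamps:
        count = 1.0 + alpha ** (t - previous_time) * count
        previous_time = t
    return count


def explicit_count(timestamps: list[float], alpha: float, now: float,
                   n0: float = 0.0) -> float:
    """Explicit form (15), evaluated at time `now` with unit event weights."""
    decayed_events = sum(alpha ** (now - tk) for tk in timestamps)
    return decayed_events + n0 * alpha ** (now - timestamps[0])


events = [0.0, 1.5, 4.0, 4.5]   # hypothetical time stamps, sorted
by_recursion = recursive_count(events, alpha=0.9, n0=2.0)
by_formula = explicit_count(events, alpha=0.9, now=events[-1], n0=2.0)
```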


Asynchronous Calculation


A possible scenario that should be addressed is batch learning, where a batch file is used to train the system post factum. The update in this case is as follows:











$$N(t_l) = \sum_{k \ge 1} \alpha^{t_l - t_k}\,\eta(t_k) + \alpha^{t_l - t}\,N(t_k), \qquad (16)$$







where $t_l$ is the learning time, i.e., the maximum time among all the known transactions.
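A batch update in this spirit can be sketched as below. The sketch assumes that the second term of expression (16) carries the previously stored estimate forward from its last update time to the learning time; the function name, argument layout and sample values are hypothetical.

```python
def batch_update(previous: float, last_update: float,
                 events: list[tuple[float, float]],
                 alpha: float) -> tuple[float, float]:
    """Fold a batch of (timestamp, weight) events into an aged count.

    Returns the new estimate and the learning time it refers to.
    """
    # Learning time: the latest time among all known transactions.
    learning_time = max([last_update] + [ts for ts, _ in events])
    fresh = sum(alpha ** (learning_time - ts) * w for ts, w in events)
    carried = alpha ** (learning_time - last_update) * previous
    return fresh + carried, learning_time


# Hypothetical batch; the events need not be sorted by time.
batch = [(2.0, 1.0), (0.0, 1.0), (1.0, 1.0)]
count, learned_at = batch_update(previous=4.0, last_update=0.0,
                                 events=batch, alpha=0.5)
```

Because every event is decayed relative to the same learning time, the order of the records in the batch file does not affect the result.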



FIG. 5 illustrates operation 60 of the risk engine 12 at a high level.


At 62, the set of counters 32 is maintained including the array of primary counters 50 and the set of aggregating counters 52. As described above, the primary counters 50 track occurrences of respective values of risk predictors in connection with risk assessment operations for transactions involving respective users over an operating period, and the aggregating counters 52 track aggregate counts of (i) all risk predictor values in respective categories for respective users, (ii) all risk predictor values for respective risk predictors, and (iii) all risk predictor values in respective categories.


At 64, risk assessment operations are performed. Each includes steps 66, 68 and 70 as shown.


At 66, a request is received for a risk assessment operation from an agent 18 in a protected computer system 14, the request relating to a transaction involving a given user and a given set of current values of risk predictors related to the transaction.


At 68, a risk score is calculated based on the aggregating counters 52 and the respective primary counters 50 for the given user and the given set of current values of the risk predictors. The calculation uses weighted first and second components. The first component is user-independent and reflects first probabilities of occurrence of the current values of the risk predictors across all the users over an operating period. The second component is user-dependent and reflects second probabilities of occurrence of the current values of the risk predictors for the given user over the operating period.


At 70, a risk assessment response is returned to the agent 18. The risk assessment response includes the risk score or a value derived therefrom that is usable by the agent 18 to selectively prevent, allow, or modify the transaction by the user. “Modify” may include allowing the transaction only upon obtaining additional information and making further assessment(s) locally or in further consultation with the risk engine 12 as a condition to allowing the transaction. It may also include allowing the user to perform a more limited version of the original transaction that is inherently less risky.
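Steps 62 through 70 can be put together in a toy end-to-end sketch. It is deliberately simplified and makes several assumptions that are not in the description: no aging, no static prior, one predictor per category, a single group whose score is the maximum category score, a fixed deny threshold, and counters that are updated only when the transaction is allowed.

```python
class RiskEngine:
    """Toy risk engine: counters plus a two-component score per category."""

    def __init__(self, beta: float = 0.5, deny_at: float = 0.9) -> None:
        self.beta = beta          # weight of the population component
        self.deny_at = deny_at    # hypothetical policy threshold
        self.primary = {}         # (user, category, value) -> count
        self.per_user = {}        # (user, category) -> count
        self.per_value = {}       # (category, value) -> count
        self.per_category = {}    # category -> count

    @staticmethod
    def _bump(table: dict, key) -> None:
        table[key] = table.get(key, 0.0) + 1.0

    @staticmethod
    def _ratio(hits: float, total: float) -> float:
        return hits / total if total > 0 else 0.0

    def update(self, user: str, event: dict) -> None:
        """Step 62: maintain primary and aggregating counters."""
        for category, value in event.items():
            self._bump(self.primary, (user, category, value))
            self._bump(self.per_user, (user, category))
            self._bump(self.per_value, (category, value))
            self._bump(self.per_category, category)

    def score(self, user: str, event: dict) -> float:
        """Step 68: weighted population and user components, per category."""
        scores = []
        for category, value in event.items():
            p_pop = self._ratio(self.per_value.get((category, value), 0.0),
                                self.per_category.get(category, 0.0))
            p_user = self._ratio(self.primary.get((user, category, value), 0.0),
                                 self.per_user.get((user, category), 0.0))
            scores.append(self.beta * (1.0 - p_pop)
                          + (1.0 - self.beta) * (1.0 - p_user))
        return max(scores)

    def assess(self, user: str, event: dict) -> dict:
        """Steps 66 to 70: score the request and return a response."""
        risk = self.score(user, event)
        action = "deny" if risk >= self.deny_at else "allow"
        if action == "allow":
            self.update(user, event)
        return {"risk_score": risk, "action": action}


engine = RiskEngine()
for _ in range(20):  # made-up history
    engine.update("alice", {"country": "US", "device": "pc"})
    engine.update("bob", {"country": "US", "device": "mac"})

first = engine.assess("alice", {"country": "US", "device": "pc"})
second = engine.assess("alice", {"country": "FR", "device": "mac"})
```

Note that without preloaded history or initial counter values every first event would score as maximally anomalous in this sketch, which is one reason the initial values discussed earlier matter in practice.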



FIG. 6 shows an example configuration of a physical computer such as the risk engine computer 12 or a computer of the protected system 14 from a computer hardware perspective. The hardware includes one or more processors 80, memory 82, and interface circuitry 84 interconnected by data interconnections 86 such as one or more high-speed data buses. The interface circuitry 84 provides a hardware connection to a network for communicating with other computers and perhaps to other external devices/connections (EXT DEVs). The processor(s) 80 with connected memory 82 may also be referred to as “processing circuitry” herein. There may also be local storage 88 such as a local-attached disk drive or Flash drive. In operation, the memory 82 stores data and instructions of system software (e.g., operating system) and one or more application programs which are executed by the processor(s) 80 to cause the hardware to function in a software-defined manner. Thus the computer hardware executing instructions of a risk engine application, for example, can be referred to as a risk engine circuit or risk engine component, and it will be understood that a collection of such circuits or components can all be realized and interact with each other as one or more sets of computer processing hardware executing different computer programs as generally known in the art. Further, the application software may be stored on a non-transitory computer-readable medium such as an optical or magnetic disk, Flash memory or other non-volatile semiconductor memory, etc., from which it is retrieved for execution by the processing circuitry, as also generally known in the art.


The method described herein allows dynamic learning of patterns in multi-variable time series that does not rely on manual feedback, nor does this method rely on complex sets of policy rules. On the other hand, the modular nature of the method enables breaking-down the overall risk score associated with an event into meaningful components that in turn can be fed into the policy manager. The method may be used by itself or in conjunction with more traditional “supervised” methods to achieve desired performance in detecting fraudulent events.


While various embodiments of the invention have been particularly shown and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims
  • 1. A method of protecting a computer system from fraudulent use, comprising: continually collecting and aggregating sets of risk predictor values for corresponding user-initiated events occurring in the computer system, the risk predictor values being respective parameters of the events and being aggregated into user-specific aggregations for events initiated by respective users of a user population and organization-wide aggregations for events initiated by all users of the user population; in response to a current event initiated by a user, generating a risk indicator as a combination of a user-specific indicator and an organization-wide indicator, the user-specific indicator being generated based on parameters of the current event and the user-specific aggregations, the organization-wide indicator being generated based on the parameters of the current event and the organization-wide aggregations; and based on the risk indicator indicating that the current event may be part of a fraudulent use of the computer system, taking a protective control action to protect the computer system against the fraudulent use.
  • 2. The method according to claim 1, wherein: the aggregating sets of risk predictor values includes maintaining counts of occurrences of risk predictor values for respective risk predictors in connection with risk assessment operations for transactions involving the users, the counts used to calculate multiple sets of frequency values including first and second sets of frequency values, the frequency values of each first set being frequencies of occurrence of the risk predictor values for the population of users, the frequency values of each second set being frequencies of occurrence of the risk predictor values for a corresponding user; and the generating the risk indicator includes calculating a risk score based on the counts for the user and the parameters of the current event, the calculating having explicitly weighted first and second components, the first component being user-independent and reflecting the frequencies of the first set for current values of the risk predictors across the population of users, the second component being user-dependent and reflecting the frequencies of the second set for the current values of the risk predictors for the user.
  • 3. The method according to claim 2, wherein maintaining counts includes maintaining a set of counters including an array of primary counters and a set of aggregating counters, the primary counters tracking occurrences of respective values of risk predictors in connection with risk assessment operations for transactions involving respective users over an operating period, the aggregating counters tracking aggregate counts of (i) all risk predictor values in respective categories for respective users, (ii) all risk predictor values for respective risk predictors, and (iii) all risk predictor values in respective categories.
  • 4. The method according to claim 2, further including maintaining parameters specifying a third set of frequency values, the frequency values of the third set being known or estimated a priori frequencies of occurrence of respective risk predictor values.
  • 5. The method according to claim 4, wherein calculating the risk score includes applying an aging function to the count values to give greater weight to recent updates of the count values than to earlier updates of the count values, the applying of the aging function ensuring model accommodation to user and population history and that any errors in estimated a priori frequencies of the third set of frequency values are removed over time and do not interfere with long-term accuracy of the risk score calculation.
  • 6. The method according to claim 4, wherein calculating the risk score includes applying a weighting for the frequency values of the third set relative to the frequency values of the first and second sets, the weighting representing a desired balance between static and dynamic long-term behavior of the risk calculation.
  • 7. The method according to claim 2, wherein calculating the risk score includes applying an aging function to the count values to give greater weight to recent updates of the count values than to earlier updates of the count values, the applying of the aging function ensuring model accommodation to user and population history.
  • 8. The method according to claim 2, wherein the first and second components are explicitly weighted by a weighting parameter representing a desired balance between respective effects of user-referenced anomalies and population-referenced anomalies in the calculating of the risk score.
  • 9. The method according to claim 2, employing a hierarchical risk model in which the risk predictors are organized into categories at a lowest level and the categories are organized into groups at a next higher level, and wherein calculating the risk score includes (i) calculating respective category risk scores based on the counts, (ii) calculating respective group scores based on the category scores for the categories in each group, and (iii) combining the group scores.
  • 10. The method according to claim 9, wherein calculating the group scores includes selecting respective highest category scores among the respective categories of the groups, and wherein combining the group scores includes calculating an average of the group scores.
  • 11. The method according to claim 2, wherein maintaining the counts includes: generally updating the counts at the time of a transaction based on the occurrence of the current values of the predictors in the transaction; and receiving a transaction result indicating whether a current transaction was prevented, and if so then refraining from updating the counts based on occurrence of the current values of the predictors in the current transaction.
  • 12. The method according to claim 1, wherein the current event is an access to the protected computer system by the user, and wherein the protective control action includes selectively preventing, allowing, or modifying the access.
  • 13. The method according to claim 12, wherein modifying the access includes at least one of (1) allowing the transaction only upon obtaining additional information and making a further assessment based on the additional information, or (2) allowing the user to perform a more limited version of the access that is inherently less risky.
  • 14. A nonvolatile computer-readable medium storing a set of computer program instructions executable by a computer system to protect the computer system from fraudulent use, by steps including: continually collecting and aggregating sets of risk predictor values for corresponding user-initiated events occurring in the computer system, the risk predictor values being respective parameters of the events and being aggregated into user-specific aggregations for events initiated by respective users of a user population and organization-wide aggregations for events initiated by all users of the user population; in response to a current event initiated by a user, generating a risk indicator as a combination of a user-specific indicator and an organization-wide indicator, the user-specific indicator being generated based on parameters of the current event and the user-specific aggregations, the organization-wide indicator being generated based on the parameters of the current event and the organization-wide aggregations; and based on the risk indicator indicating that the current event may be part of a fraudulent use of the computer system, taking a protective control action to protect the computer system against the fraudulent use.
  • 15. The computer-readable medium according to claim 14, wherein: the aggregating sets of risk predictor values includes maintaining counts of occurrences of risk predictor values for respective risk predictors in connection with risk assessment operations for transactions involving the users, the counts used to calculate multiple sets of frequency values including first and second sets of frequency values, the frequency values of each first set being frequencies of occurrence of the risk predictor values for the population of users, the frequency values of each second set being frequencies of occurrence of the risk predictor values for a corresponding user; and the generating the risk indicator includes calculating a risk score based on the counts for the user and the parameters of the current event, the calculating having explicitly weighted first and second components, the first component being user-independent and reflecting the frequencies of the first set for current values of the risk predictors across the population of users, the second component being user-dependent and reflecting the frequencies of the second set for the current values of the risk predictors for the user.
  • 16. The computer-readable medium according to claim 15, wherein maintaining counts includes maintaining a set of counters including an array of primary counters and a set of aggregating counters, the primary counters tracking occurrences of respective values of risk predictors in connection with risk assessment operations for transactions involving respective users over an operating period, the aggregating counters tracking aggregate counts of (i) all risk predictor values in respective categories for respective users, (ii) all risk predictor values for respective risk predictors, and (iii) all risk predictor values in respective categories.
  • 17. The computer-readable medium according to claim 15, wherein the steps further include maintaining parameters specifying a third set of frequency values, the frequency values of the third set being known or estimated a priori frequencies of occurrence of respective risk predictor values.
  • 18. The computer-readable medium according to claim 17, wherein calculating the risk score includes applying an aging function to the count values to give greater weight to recent updates of the count values than to earlier updates of the count values, the applying of the aging function ensuring model accommodation to user and population history and that any errors in estimated a priori frequencies of the third set of frequency values are removed over time and do not interfere with long-term accuracy of the risk score calculation.
  • 19. The computer-readable medium according to claim 17, wherein calculating the risk score includes applying a weighting for the frequency values of the third set relative to the frequency values of the first and second sets, the weighting representing a desired balance between static and dynamic long-term behavior of the risk calculation.
  • 20. The computer-readable medium according to claim 15, wherein calculating the risk score includes applying an aging function to the count values to give greater weight to recent updates of the count values than to earlier updates of the count values, the applying of the aging function ensuring model accommodation to user and population history.
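As an informal illustration only, and not part of the claims, the hierarchical combination recited in claims 9 and 10 and the conditional count update recited in claim 11 might be sketched as follows. The function names, the dictionary layout, the predictor and category names, and the sample scores are all hypothetical; how category scores are derived from the counts is not shown here and is taken as given input.

```python
from collections import Counter

def group_score(category_scores, categories_in_group):
    """Group score = highest category score among the group's categories (claim 10)."""
    return max(category_scores[c] for c in categories_in_group)

def overall_score(category_scores, groups):
    """Combine group scores by averaging them (claim 10)."""
    scores = [group_score(category_scores, cats) for cats in groups.values()]
    return sum(scores) / len(scores)

def update_counts(counts, predictor_values, prevented):
    """Count each (predictor, value) pair unless the transaction was prevented (claim 11)."""
    if prevented:
        return  # refrain from updating counts for a prevented transaction
    for predictor, value in predictor_values.items():
        counts[(predictor, value)] += 1

# Hypothetical hierarchy: two groups, three categories, made-up category scores.
groups = {"device": ["ip", "cookie"], "behavior": ["time_of_day"]}
category_scores = {"ip": 0.25, "cookie": 0.75, "time_of_day": 0.5}
risk = overall_score(category_scores, groups)  # (0.75 + 0.5) / 2

counts = Counter()
update_counts(counts, {"ip": "203.0.113.7"}, prevented=False)  # counted
update_counts(counts, {"ip": "203.0.113.7"}, prevented=True)   # skipped
```

Taking the maximum within a group lets a single strongly anomalous category dominate that group, while averaging across groups keeps any one group from determining the overall score by itself.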
Continuations (1)
Relation  Number    Date      Country
Parent    14227766  Mar 2014  US
Child     16455937            US