The invention relates to the field of information security systems, and in particular to information security systems employing explicit risk assessment as a factor in controlling computer system operation.
An information security system may employ a specialized component referred to as a “risk engine”, which is based on machine-learning technology and utilizes a potentially large number of indicators to evaluate the risk of an activity in real time. In known systems, the adaptive behavior of a risk engine relies on reliable fraud and authentication feedback and is leveraged to its full extent when some of the feedback is provided by professional analysts. In such a setting, a risk engine is capable of assessing the risk level of a transaction based on the risk history (including outcomes) of the organization that deploys the system. The risk engine employs a risk model based on information specific to users, devices, cookies or other instances, which is kept and maintained in profile databases and used to calculate risk scores. Model tuning is performed periodically based on explicit fraud and authentication feedback.
At least three problems may arise in relation to current use of risk engine technology. First, manual feedback (case management) may be impractical in many use cases, limiting the deployment of risk engine systems in valuable market areas. One example of such an unsupervised environment is enterprise authentication. This is in contrast to certain higher-risk and perhaps higher-value use cases such as online banking, for which there is normally a cadre of security professionals analyzing cases of suspected fraud and providing valuable feedback to the risk assessment algorithms used by a risk engine. Second, there can be a non-trivial balance between organization-specific and user-specific patterns, as well as patterns known from previous knowledge. For example, in a multi-national company it is normal to observe user log-in sessions performed from many geographical locations. However, most individual users have their own log-in patterns, which are not taken into consideration. Third, the update frequency of the existing approach is a problem: the speed at which the model reacts to real-life changes is limited by the time period between model updates.
A new method of risk assessment is disclosed that does not depend on explicit feedback (i.e., it is an unsupervised approach), is instantly self-updating (online), and is based on multiple levels of behavioral history. In one arrangement, a two-level history is used, consisting of organization-wide and user-specific history, to reflect both aspects in a risk assessment. Deeper structures are possible, such as a four-level history for Organization, Division, Department, User. This method can use the same kinds of input facts and infrastructure used by conventional supervised risk engine implementations, including for example user and device profiles. Also, it may be usable either as a stand-alone risk scoring component or as an adjunct providing additional predictor inputs to a conventional supervised technique.
In particular, a method is disclosed of protecting a computer system from fraudulent use, which includes continually collecting and aggregating sets of risk predictor values for corresponding user-initiated events (e.g., user logins) occurring in the computer system. The risk predictor values are respective parameters of the events (e.g., user location or device type) and are aggregated into user-specific aggregations for events initiated by respective users of a user population and organization-wide aggregations for events initiated by all users of the user population. In response to a current event initiated by a user, a risk indicator is generated as a combination of a user-specific indicator and an organization-wide indicator, the user-specific indicator being generated based on parameters of the current event and the user-specific aggregations, the organization-wide indicator being generated based on the parameters of the current event and the organization-wide aggregations. Then based on the risk indicator indicating that the current event may be part of a fraudulent use of the computer system, a protective control action is taken (e.g., denying access) to protect the computer system against the fraudulent use.
The foregoing and other objects, features and advantages will be apparent from the following description of particular embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views.
The following description uses an adaptive authentication process as a usage example. However, the method can be easily deployed to other scenarios. More generally, a probabilistic framework for risk assessment is described that assigns anomaly (risk) scores to events in time: the higher the score, the more anomalous (less probable) the event is.
The applications and data 16 typically provide some type of information service, such as an online service, to a population of users such as a set of private subscribers or the employees of a company, for example. The risk engine 12 and agent 18 provide security based on explicit assessment of risk levels of activities/events. Thus in one embodiment, for example, upon an initial access (e.g., login) to a service provided by the apps/data 16, the agent 18 is invoked with parameters of the login as well as other information to obtain a risk assessment as a condition of granting access. The agent 18 in turn makes a request for the risk assessment to the risk engine 12, which performs a risk assessment operation and returns a response. The response may be a so-called “risk score” identifying a level of risk as a numerical value, and the returned value is further processed by the agent 18 and/or the service to arrive at a final decision whether to grant or deny access. In some cases a soft decision might be made, and more information obtained as a condition of granting access. Alternatively, the response from the risk engine 12 may reflect further processing of a raw risk score, and could itself convey a hard access decision.
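As a rough illustration of how an agent might turn a returned risk score into a grant, deny, or soft decision, the following Python sketch maps a score to one of three outcomes. The function name `decide`, the `Decision` enum, the assumption that scores lie in [0, 1], and both threshold values are hypothetical and chosen only for illustration; they are not taken from the described system.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    CHALLENGE = "challenge"  # soft decision: obtain more information first
    DENY = "deny"


def decide(risk_score: float,
           challenge_at: float = 0.5,
           deny_at: float = 0.9) -> Decision:
    """Map a risk score (assumed to lie in [0, 1]) to an access decision.

    Both thresholds are illustrative assumptions; a real deployment would
    set them according to its own security policy.
    """
    if risk_score >= deny_at:
        return Decision.DENY
    if risk_score >= challenge_at:
        return Decision.CHALLENGE
    return Decision.ALLOW
```

In this sketch a mid-range score triggers the soft decision, so a login scored 0.6 would be challenged for additional information rather than denied outright.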
The focus of this description is operation of the risk engine 12. Very generally, it operates by calculating and combining risk assessments based on the following three questions:
Components that are related to the first two questions are dynamic, meaning that they react to the event history, while the last component is static and is part of general policy. The relative weight of each of the three components can be configured according to needs. The risk associated with each dynamic component is calculated as a ratio between two respective counters. In the case of the first component (comparison to the entire population), this is the ratio between the number of times that the current event has occurred in the population and the total number of occurrences of all events in the category. For example: the number of log-ins from a specific ISP divided by the total number of log-ins (for all ISPs). In the case of the second component (individual pattern), the ratio is between the number of times the event has been observed for this user and the total number of all events for this user, for the given category. For example: the number of times George logged in using a Mac divided by the total number of log-ins for George. The last (static) component of the model can be calculated using historical data on normal behavior or data on known fraud patterns, or it can be set in a more arbitrary or ad hoc manner as part of security policy.
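The two dynamic ratios can be illustrated with a few lines of Python. This is a minimal sketch over a made-up list of log-in events; the sample data, the single "device" category, and the helper names `population_ratio` and `user_ratio` are assumptions used only to make the arithmetic concrete.

```python
from collections import Counter

# Hypothetical log-in history: (user, device type) pairs.
logins = [
    ("george", "mac"), ("george", "mac"), ("george", "pc"),
    ("alice", "pc"), ("alice", "pc"),
]

population_counts = Counter(device for _, device in logins)  # per value, all users
user_value_counts = Counter(logins)                          # per (user, value)
user_totals = Counter(user for user, _ in logins)            # per user, all values


def population_ratio(device: str) -> float:
    """Occurrences of this value across all users / all events in the category."""
    return population_counts[device] / sum(population_counts.values())


def user_ratio(user: str, device: str) -> float:
    """Occurrences of this value for this user / all of this user's events."""
    return user_value_counts[(user, device)] / user_totals[user]


mac_in_population = population_ratio("mac")   # 2 of 5 log-ins
mac_for_george = user_ratio("george", "mac")  # 2 of George's 3 log-ins
```

Here a Mac log-in is fairly uncommon organization-wide (2/5) but typical for George (2/3), which is exactly the distinction the two components are meant to capture.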
The specific arrangement in
Below is presented a mathematical description of aspects of operation of the risk engine 12 according to the risk model 20. In this description, the variable “p” refers to an individual distinct predictor P. Predictors P may be assigned to multiple categories C. Each category C has a set of predictors P. A vector ci is defined as a specific realization of category C, where i is an index associated with each predictor combination. In order to simplify the notation it is denoted simply “ci”. Additionally, the users are referred to by the variable “u”.
At a high level, the risk engine 12 basically tracks statistics via the counters 32 and uses the risk score calculator 30 to make probability calculations based on those statistics. Operation has an inherent learning aspect just by tracking detailed statistics for predictors P over a long period of system operation. One benefit of this approach, as mentioned above, is that there is no need for explicit feedback about the actual outcomes of authentications or other higher-level security operations for which risk scores have been calculated. The presently disclosed “unsupervised” technique may be used as an adjunct with the known “supervised” techniques that rely on receiving and incorporating such explicit feedback for future operation.
Also shown in
Formal Description for Per-Category Risk Score Calculations
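The risk score of the i-th value of category C for user u is given by
$$R_{c_i}^{u} = \beta_c\,\bar{P}(c_i) + (1-\beta_c)\,\bar{P}(c_i \mid u) \tag{1}$$
where $\beta_c \in [0, 1]$ is a fraction parameter (potentially configurable) that provides the relative weight of “global” and “user” anomaly values; $\bar{P}(c_i)$ and $\bar{P}(c_i \mid u)$ are the respective anomaly values, given by
$$\bar{P}(c_i) = 1 - P(c_i)$$
$$\bar{P}(c_i \mid u) = 1 - P(c_i \mid u)$$
where $P(c_i \mid u)$ is the probability of $c_i$ occurring given user $u$, and $P(c_i)$ is the probability of $c_i$ occurring across the entire user population.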
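$P(c_i)$ may be assumed to be of the form:
$$P(c_i) = \gamma_c P_A(c_i) + (1-\gamma_c)\,P_{c_i}^{0}$$
where $P_A(c_i)$ is a dynamic or “adaptive” probability value for $c_i$, $P_{c_i}^{0}$ is a static probability value for $c_i$, and $\gamma_c \in [0, 1]$ is the relative weight of the adaptive term.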
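The probabilities in eq. 2 and eq. 3 are estimated from frequencies associated with each of them:
$$P(c_i \mid u) = \frac{N_{c_i}^{u}}{S_{c}^{u}}, \qquad P_A(c_i) = \frac{T_{c_i}}{TS_{c}}$$
where $N_{c_i}^{u}$ is the number of times $c_i$ has been observed for user $u$, and the aggregate counts are
$$S_{c}^{u} = \sum_{c_i} N_{c_i}^{u}$$
$$T_{c_i} = \sum_{u} N_{c_i}^{u}$$
$$TS_{c} = \sum_{c_i} T_{c_i}$$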
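Using expressions 2, 3 and 4, the score (1) can be re-written as
$$R_{c_i}^{u} = \beta_c\left[1 - \gamma_c P_A(c_i) - (1-\gamma_c)\,P_{c_i}^{0}\right] + (1-\beta_c)\left[1 - P(c_i \mid u)\right]$$
where the term $\beta_c\left[1 - \gamma_c P_A(c_i) - (1-\gamma_c)\,P_{c_i}^{0}\right]$ is the user-independent (“global”) contribution and the remaining term is the user-specific contribution.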
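The per-category score can be sketched in Python as follows. This assumes the score is a β-weighted sum of two anomaly values: one minus a global probability (itself a γ-weighted mix of an adaptive frequency and a static value) and one minus the user-specific frequency. The function name, argument names, default weights, and the convention of treating an empty history as probability zero are all assumptions made for this illustration.

```python
def category_risk(n_user_value: float, n_user_total: float,
                  n_value_total: float, n_category_total: float,
                  p_static: float, beta: float = 0.5, gamma: float = 0.8) -> float:
    """Per-category risk score in [0, 1]; higher means more anomalous.

    n_user_value     : count of this value for this user
    n_user_total     : count of all values in the category for this user
    n_value_total    : count of this value across all users
    n_category_total : count of all values in the category across all users
    p_static         : static (policy) probability for this value
    """
    # Adaptive (population-wide) frequency; empty history is treated as 0 (assumption).
    p_adaptive = n_value_total / n_category_total if n_category_total else 0.0
    # User-specific frequency; an unseen user is treated as 0 (assumption).
    p_user = n_user_value / n_user_total if n_user_total else 0.0
    # Global probability mixes the adaptive and static parts.
    p_global = gamma * p_adaptive + (1.0 - gamma) * p_static
    return beta * (1.0 - p_global) + (1.0 - beta) * (1.0 - p_user)


# George: 2 of his 3 log-ins used a Mac; 2 of 5 log-ins overall used a Mac.
george_mac = category_risk(2, 3, 2, 5, p_static=0.5)
```

With these inputs the global probability is 0.8 × 0.4 + 0.2 × 0.5 = 0.42, so the score is 0.5 × 0.58 + 0.5 × (1/3), roughly 0.457.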
A specific example below provides illustration for the above description. As mentioned above, for a given transaction involving a given user, the individual Rc
Aging of Event Counts
It is desirable to apply aging to the event counts to allow the system to adapt over time. Aging effectively provides a higher weighting of more recent events than more distant events in the past. In one embodiment an exponential decay may be employed for aging. The estimation of the total number of events in a window-time d at time t is given by the recursive relation
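$$N(t) = \eta\,N(t-\Delta t) + 1$$
where $N(t-\Delta t)$ denotes the value of the last update of the event-count estimate, which took place at time $t-\Delta t$, and $\eta$ is the decay factor applied over the elapsed interval $\Delta t$.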
This expression is valid under the assumption of continuous time. In the case it is discrete, the decaying factor becomes
Initial Values
Theoretically, the initial counter value should be zero (i.e., $N(t=0)=0$). At a constant event rate (i.e., $\Delta t = \text{const}$) with exponential weighting, the counter value approaches an asymptote:
$$\lim_{t\to\infty} N(t) = \frac{1}{1-\alpha^{\Delta t}}$$
The time it takes to reach the asymptote depends on the event rate and on $\alpha$. It can be shortened by setting $N_0 \leftarrow N(t=0) = (1-\alpha^{\Delta t})^{-1}$. The value of this parameter can come from a preliminary study (for example: the average number of logins in a certain period of time), from theoretical analysis of a suggested model (for example: the expected event rate at equilibrium), or from the corresponding N value of the same model applied to a different instance (for example: values from a model acting on Bank A transferred to Bank B). Thus, a model that is based on these counters is expected to show reasonable performance immediately, without waiting for counter values to reach operational values. At the same time, the use of aging also ensures that the model accommodates user and corporate history and that any incorrectly assigned initial values will eventually fade out and not interfere with model performance.
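The effect of the initial value can be checked numerically. The sketch below assumes an aged counter that is multiplied by α raised to the elapsed time and then incremented by one on each event, which is consistent with the stated asymptote of (1 − α^Δt)^−1; the class name `AgedCounter`, the helper `asymptote`, and the choice α = 0.5 are hypothetical.

```python
class AgedCounter:
    """Event counter with exponential aging (illustrative sketch)."""

    def __init__(self, alpha: float, initial: float = 0.0, t0: float = 0.0):
        self.alpha = alpha      # per-unit-time decay base, 0 < alpha < 1
        self.value = initial    # current aged count
        self.last_time = t0     # time of the last update

    def add_event(self, t: float) -> float:
        """Decay the stored count by alpha**dt, then count the new event."""
        dt = t - self.last_time
        self.value = (self.alpha ** dt) * self.value + 1.0
        self.last_time = t
        return self.value


def asymptote(alpha: float, dt: float) -> float:
    """Steady-state count for events arriving every dt time units."""
    return 1.0 / (1.0 - alpha ** dt)


alpha, dt = 0.5, 1.0
cold = AgedCounter(alpha)                               # starts at zero
warm = AgedCounter(alpha, initial=asymptote(alpha, dt)) # starts at the asymptote

cold_history = [cold.add_event(k * dt) for k in range(1, 6)]
warm_history = [warm.add_event(k * dt) for k in range(1, 6)]
```

The cold counter climbs gradually toward the asymptote of 2, while the warm counter sits at 2 from the first event, which is the shortening of the warm-up period described above.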
Explicit Evolution Equations
The recursive expression (11) can be written explicitly as
where tk is the time-stamp of the kth transaction; t1 is the time stamp of first appearance.
Asynchronous Calculation
A possible scenario that should be addressed is batch learning where a batch file is used to train the system post factum. The update in this case is as follows
where $t_l$ is the learning time, i.e., the maximum time among all the known transactions.
At 62, the set of counters 32 is maintained including the array of primary counters 50 and the set of aggregating counters 52. As described above, the primary counters 50 track occurrences of respective values of risk predictors in connection with risk assessment operations for transactions involving respective users over an operating period, and the aggregating counters 52 track aggregate counts of (i) all risk predictor values in respective categories for respective users, (ii) all risk predictor values for respective risk predictors, and (iii) all risk predictor values in respective categories.
At 64, risk assessment operations are performed. Each includes steps 66, 68 and 70 as shown.
At 66, a request is received for a risk assessment operation from an agent 18 in a protected computer system 14, the request relating to a transaction involving a given user and a given set of current values of risk predictors related to the transaction.
At 68, a risk score is calculated based on the aggregating counters 52 and the respective primary counters 50 for the given user and the given set of current values of the risk predictors. The calculation uses weighted first and second components. The first component is user-independent and reflects first probabilities of occurrence of the current values of the risk predictors across all the users over an operating period. The second component is user-dependent and reflects second probabilities of occurrence of the current values of the risk predictors for the given user over the operating period.
At 70, a risk assessment response is returned to the agent 18. The risk assessment response includes the risk score or a value derived therefrom that is usable by the agent 18 to selectively prevent, allow, or modify the transaction by the user. “Modify” may include allowing the transaction only upon obtaining additional information and making further assessment(s) locally or in further consultation with the risk engine 12 as a condition to allowing the transaction. It may also include allowing the user to perform a more limited version of the original transaction that is inherently less risky.
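The flow of steps 62 through 70 can be sketched as a small Python class that keeps primary and aggregating counters and returns per-category scores for a request. This is a simplified illustration under several assumptions: aging is omitted, the counters are updated immediately after scoring, per-category scores are returned without being combined into one number, and all class, method, and attribute names are hypothetical.

```python
from collections import defaultdict


class RiskEngineSketch:
    """Toy counter-based scorer; not the actual risk engine implementation."""

    def __init__(self, beta=0.5, gamma=1.0, p_static=0.0):
        self.beta, self.gamma, self.p_static = beta, gamma, p_static
        self.primary = defaultdict(float)       # (user, category, value) -> count
        self.per_user = defaultdict(float)      # (user, category) -> count
        self.per_value = defaultdict(float)     # (category, value) -> count
        self.per_category = defaultdict(float)  # category -> count

    def record(self, user, event):
        """Step 62: update primary and aggregating counters for one event."""
        for category, value in event.items():
            self.primary[(user, category, value)] += 1
            self.per_user[(user, category)] += 1
            self.per_value[(category, value)] += 1
            self.per_category[category] += 1

    def score(self, user, event):
        """Step 68: weighted user-independent and user-dependent components."""
        scores = {}
        for category, value in event.items():
            total = self.per_category.get(category, 0.0)
            user_total = self.per_user.get((user, category), 0.0)
            seen_all = self.per_value.get((category, value), 0.0)
            seen_user = self.primary.get((user, category, value), 0.0)
            p_adaptive = seen_all / total if total else 0.0
            p_user = seen_user / user_total if user_total else 0.0
            p_global = self.gamma * p_adaptive + (1 - self.gamma) * self.p_static
            scores[category] = (self.beta * (1 - p_global)
                                + (1 - self.beta) * (1 - p_user))
        return scores

    def assess(self, user, event):
        """Steps 66-70: score the request, then learn from it (assumed order)."""
        scores = self.score(user, event)
        self.record(user, event)
        return scores


engine = RiskEngineSketch()
for _ in range(3):
    engine.record("george", {"device": "mac"})
engine.record("alice", {"device": "pc"})

usual = engine.score("george", {"device": "mac"})
unusual = engine.score("george", {"device": "pc"})
```

In this toy history George's usual Mac log-in scores 0.125, while a PC log-in by George, which is rare organization-wide and unseen for him, scores 0.875.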
The method described herein allows dynamic learning of patterns in multi-variable time series without relying on manual feedback or on complex sets of policy rules. At the same time, the modular nature of the method enables breaking down the overall risk score associated with an event into meaningful components that in turn can be fed into the policy manager. The method may be used by itself or in conjunction with more traditional “supervised” methods to achieve desired performance in detecting fraudulent events.
While various embodiments of the invention have been particularly shown and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Relation | Number | Date | Country |
---|---|---|---|
Parent | 14227766 | Mar 2014 | US |
Child | 16455937 | | US |