A computing environment can include a network of computers and other types of devices. Issues can arise in the computing environment due to behaviors of various entities. Monitoring can be performed to detect such issues, and to take action to address the issues.
Some implementations of the present disclosure are described with respect to the following figures.
Throughout the drawings, identical reference numbers designate similar, but not necessarily identical, elements. The figures are not necessarily to scale, and the size of some parts may be exaggerated to more clearly illustrate the example shown. Moreover, the drawings provide examples and/or implementations consistent with the description; however, the description is not limited to the examples and/or implementations provided in the drawings.
In the present disclosure, use of the term “a,” “an,” or “the” is intended to include the plural forms as well, unless the context clearly indicates otherwise. Also, the terms “includes,” “including,” “comprises,” “comprising,” “have,” and “having,” when used in this disclosure, specify the presence of the stated elements but do not preclude the presence or addition of other elements.
Certain behaviors of entities in a computing environment can be considered anomalous. Examples of entities can include users, machines (physical machines or virtual machines), programs, sites, network addresses, network ports, domain names, organizations, geographical jurisdictions (e.g., countries, states, cities, etc.), or any other identifiable element that can exhibit a behavior including actions in the computing environment. A behavior of an entity can be anomalous if the behavior deviates from an expected rule, criterion, threshold, policy, past behavior of the entity, behavior of other entities, or any other target, which can be predefined or dynamically set. An example of an anomalous behavior of a user involves the user making greater than a threshold number of login attempts into a computer within a specified time interval, or greater than a threshold number of failed login attempts within a specified time interval. An example of an anomalous behavior of a machine involves the machine receiving greater than a threshold number of data packets within a specified time interval, or a number of login attempts by users on the machine exceeding a threshold within a specified time interval.
Analysis can be performed to identify anomalous entities, which may be entities engaging in behavior that presents a risk to a computing environment. In some examples, such analysis can be referred to as User and Entity Behavior Analysis (UEBA). As examples, a UEBA system can use behavioral anomaly detection to detect a compromised user, a malicious insider, a malware-infected device, a malicious domain name or network address (such as an Internet Protocol or IP address), and so forth.
Anomaly detection systems or techniques can be complex and may involve significant input of domain data pertaining to models used in performing detection of anomalous entities. Domain data can refer to data that relates to characteristics of a computing environment, entities of the computing environment, and other aspects that affect whether an entity is considered to be exhibiting anomalous behavior. Such domain data may have to be manually provided by human subject matter experts, which can be a labor-intensive and error-prone process.
In accordance with some implementations of the present disclosure, graph-based detection techniques or systems are provided to detect anomalous entities. A graphical representation of entities associated with a computing environment is generated, and features for the entities represented by the graphical representation are derived, where the features include neighborhood features and link-based features. In other examples, other types of features can be derived. Multiple anomaly detectors based on respective features of the derived features are used to determine whether a given entity is exhibiting anomalous behavior.
FIG. 1 is a block diagram of an example arrangement that includes an analysis system 100 to perform anomaly detection for entities 102 associated with a computing environment. In some examples, the analysis system 100 can include a UEBA system. In other examples, the analysis system 100 can include an Enterprise Security Management (ESM) system, which provides a security management framework that can create and sustain security for a computing infrastructure of an organization. In other examples, other types of analysis systems 100 can be employed.
The analysis system 100 can be implemented as a computer system or as a distributed arrangement of computer systems. More generally, the various components of the analysis system 100 can be integrated into one computer system or can be distributed across various different computer systems.
In some examples, the entities 102 can be part of a computing environment, which can include computers, communication nodes (e.g., switches, routers, etc.), storage devices, servers, and/or other types of electronic devices. The computing environment can also include additional entities, such as programs, users, network addresses assigned to entities, domain names of entities, and so forth. The computing environment can be a data center, an information technology (IT) infrastructure, a cloud system, or any other type of arrangement that includes electronic devices and programs and users associated with such electronic devices and programs.
The analysis system 100 includes event data collectors 104 to collect data relating to events associated with the entities 102 of the computing environment. The event data collectors 104 can include collection agents (in the form of machine-readable instructions such as software or firmware modules, for example) distributed throughout the computing environment, such as on computers, communication nodes, storage devices, servers, and so forth. Alternatively, some of the event data collectors 104 can include hardware event collectors implemented with hardware circuitry.
Examples of events can include login events (e.g., events relating to a number of login attempts and/or devices logged into), events relating to access of resources such as websites, events relating to submission of queries such as Domain Name System (DNS) queries, events relating to sizes and/or locations of data (e.g., files) accessed, events relating to loading of programs, events relating to execution of programs, events relating to accesses made of components of the computing environment, errors reported by machines or programs, events relating to performance monitoring of various characteristics of the computing environment (including monitoring of network communication speeds, execution speeds of programs, etc.), and/or other events.
An event data record can include various attributes, such as a time attribute (to indicate when the event occurred), and further attributes that can depend on the type of event that the event data record represents. For example, if an event data record represents a login event, then the event data record can include a time attribute to indicate when the login occurred, a user identification attribute to identify the user making the login attempt, a resource identification attribute to identify a resource on which the login attempt was made, and so forth.
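For illustration only, a login event data record might look like the following minimal sketch, expressed as a Python dictionary; the field names are hypothetical and not prescribed by this disclosure:

```python
# A hypothetical login event data record. Field names are illustrative only.
login_event = {
    "timestamp": "2024-01-05T14:32:07Z",  # time attribute: when the event occurred
    "event_type": "login",
    "user": "alice",        # user identification attribute
    "resource": "srv-042",  # resource on which the login attempt was made
    "outcome": "failure",   # e.g., success or failure of the login attempt
}
```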
Event data can include network event data and/or host event data. Network event data is collected on a network device such as a router, a switch, or other communication device that is used to transfer data between other devices. An event data collector 104 can reside in the network device, or alternatively, the event data collector can be in the form of a tapping device that is inserted into a network. Examples of network event data include Hypertext Transfer Protocol (HTTP) data, DNS data, Netflow data (which is data collected according to the Netflow protocol), and so forth.
Host event data can include data collected on computers (e.g., desktop computers, notebook computers, tablet computers, server computers, etc.), smartphones, or other types of devices. Host event data can include information of processes, files, applications, operating systems, and so forth.
The event data collectors 104 can produce a stream of event data records 106, which can be provided to a graphical representation generation engine 108 for processing by the graphical representation generation engine 108 in real time. As used here, an “engine” can refer to a hardware processing circuit or a combination of a hardware processing circuit and machine-readable instructions (e.g., software and/or firmware) executable on the hardware processing circuit. The hardware processing circuit can include any or some combination of the following: a microprocessor, a core of a multi-core microprocessor, a microcontroller, a programmable gate array, a programmable integrated circuit device, and so forth.
A “stream” of event data records can refer to any set of event data records that can have some ordering, such as ordering by time of the event data records, ordering by location of the event data records, or some other attribute(s) of the event data records. An event data record can refer to any collection of information that can include information pertaining to a respective event. Processing the stream of event data records 106 in “real time” can refer to processing the stream of event data records 106 as the event data records 106 are received by the graphical representation generation engine 108.
Alternatively or additionally, the event data records produced by the event data collectors 104 can be first stored into a repository 110 of event data records, and the graphical representation generation engine 108 can retrieve the event data records from the repository 110 to process such event data records. The repository 110 can be implemented with a storage medium, which can be provided by disk-based storage device(s), solid state storage device(s), and/or other type(s) of storage or memory device(s).
Based on the stream of event data records 106 and/or based on the event data records retrieved from the repository 110, the graphical representation generation engine 108 can generate a graphical representation 112 of the entities 102 associated with a computing environment. In some examples, a graphical representation of the entities 102 can be in the form of a graph that has nodes (or vertices) representing respective entities. An edge between a pair of the nodes represents a relationship between the nodes in the pair.
The data in the event data records can be used to construct the graphical representation 112 over a given time window of a specified length (e.g., a minute, an hour, a day, a week, etc.). In further examples, multiple time windows can be selected, where each time window of the multiple time windows is of a different time length. For example, a first time window can be a 10-minute time window, a second time window can be a one-hour time window, a third time window can be a six-hour time window, a fourth time window can be a 24-hour time window, and so forth.
Different graphical representations 112 can be generated by the graphical representation generation engine 108 for the different time windows. Choosing multiple time windows can allow for extraction of features that relate to different time periods. Anomaly detection as discussed herein can be applied for the different graphical representations generated for the different time windows of different time lengths.
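As a minimal sketch of this windowing (assuming each event record carries a numeric timestamp in seconds; the helper name bucket_by_window is hypothetical):

```python
from collections import defaultdict

def bucket_by_window(events, window_seconds):
    """Group event records into consecutive time windows of the given length."""
    buckets = defaultdict(list)
    for ev in events:
        window_id = int(ev["timestamp"] // window_seconds)
        buckets[window_id].append(ev)
    return buckets

# A separate set of buckets (and hence a separate graphical representation)
# can be produced per window length: 10 minutes, 1 hour, 6 hours, 24 hours.
events = [{"timestamp": 125.0, "src": "A", "dst": "B"}]
per_length = {w: bucket_by_window(events, w) for w in (600, 3600, 21600, 86400)}
```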
A relationship represented by an edge between nodes of the graphical representation 112 (which represent respective entities) can include any of various different types of relationships, such as: a communication relationship where data (e.g., HTTP data, DNS data, etc.) is exchanged between the respective entities, a functional relationship where the respective entities interact with one another, a physical relationship where one entity is physically associated with another entity (e.g., a program is included in a computer, a first switch is directly connected by a link to a second switch, etc.), or any other type of relationship.
In some examples, each edge between nodes in the graphical representation 112 can be assigned a weight. The weight can vary in value depending upon characteristics of the relationship between entities corresponding to the edge. For example, the value of a weight can be assigned based on any of the following: the number of connections (or sessions) between entities (such as machines or programs), the number of packets or number of bytes transferred between the entities, the number of login attempts by a user on a machine, the number of times an entity accessed a file, a size of a file accessed by an entity, and so forth.
Graphical representations can also be constructed from both network event data and host event data, where such graphical representations can be referred to as heterogeneous graphical representations. In other examples, a first graphical representation can be constructed from network event data, while a second graphical representation can be constructed from host event data.
In some examples, edges in the graphical representation 112 are directed edges. A directed edge is associated with a direction from a first node to a second node in the graphical representation 112, to indicate the direction of interaction (e.g., a first entity represented by the first node sent a packet to a second entity represented by the second node). In such examples, weights are assigned to the directed edges (e.g., a first weight is assigned to a first edge between two nodes to represent a relationship in a first direction between the two nodes, and a second weight is assigned to a second edge between the two nodes to represent a relationship in a second direction between the two nodes).
In further examples, an edge between nodes can be direction-less. Such an edge can be referred to as a non-directional edge. For example, multiple edges between nodes can be consolidated into one edge, where weights assigned to the multiple edges are combined (e.g., summed, averaged, etc.) to produce a weight for the consolidated edge. A direction-less edge can be used in various scenarios, such as any of the following, for example: there is no natural direction, e.g., the edge corresponds to the nodes/entities being physically connected, or the edge was created due to similarity between the nodes; a direction is not important or obvious, e.g., when the nodes represent a user and a file, and the edge relates to the user accessing the file; and so forth.
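The following sketch shows one possible realization of such a graphical representation using the networkx library (an assumption of this sketch, not something the disclosure requires); here the weight of a directed edge counts packets sent, and consolidation sums the weights of the directed edges between a pair of nodes:

```python
import networkx as nx

def build_directed_graph(events):
    """Directed, weighted graph from event records with src/dst/packets fields."""
    g = nx.DiGraph()
    for ev in events:
        src, dst, w = ev["src"], ev["dst"], ev.get("packets", 1)
        if g.has_edge(src, dst):
            g[src][dst]["weight"] += w  # accumulate weight per directed relationship
        else:
            g.add_edge(src, dst, weight=w)
    return g

def consolidate_to_undirected(g):
    """Consolidate directed edges into non-directional edges by summing weights."""
    u = nx.Graph()
    for a, b, data in g.edges(data=True):
        if u.has_edge(a, b):
            u[a][b]["weight"] += data["weight"]
        else:
            u.add_edge(a, b, weight=data["weight"])
    return u
```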
The graphical representation 112 (or multiple graphical representations 112) produced by the graphical representation generation engine 108 can be provided to a feature derivation engine 114. The feature derivation engine 114 derives features for the entities represented by the graphical representation 112.
A “feature” can refer to any attribute associated with an entity. A “derived feature” can refer to an attribute that is computed by the feature derivation engine 114 based on other information, including information in the graphical representation 112 and/or information computed using the information in the graphical representation 112.
The derived features generated by the feature derivation engine 114 can include neighborhood features and link-based features, where a neighborhood feature for a given entity is derived based on entities that are neighbors of the given entity in the graphical representation 112, and a link-based feature for the given entity is derived based on relationships of other entities in the graphical representation 112 with the given entity.
Neighborhood features and link-based features are discussed further below. In other examples, other types of features can be derived.
The derived features produced by the feature derivation engine 114 based on the graphical representation 112 (or based on multiple graphical representations 112) are output as graph-based features 116 from the feature derivation engine 114 to an anomaly detection engine 118.
The anomaly detection engine 118 is able to determine whether an entity is exhibiting anomalous behavior using the graph-based features 116 from the feature derivation engine 114. The anomaly detection engine 118 can produce measures based on the graph-based features 116, where the measures can include parametric measures or non-parametric measures as discussed further below.
The anomaly detection engine 118 includes multiple anomaly detectors 120 that are applied to respective different features of the graph-based features 116. For example, a first anomaly detector 120 can base its anomaly detection on a first graph-based feature 116 (or a first subset of graph-based features), a second anomaly detector 120 can base its anomaly detection on a second graph-based feature 116 (or a second subset of graph-based features), and so forth.
Based on the detection performed by the anomaly detectors 120, the anomaly detectors 120 provide respective anomaly scores. An anomaly score includes information that indicates whether or not an entity is exhibiting anomalous behavior. An anomaly score can include a binary value, such as in the form of a flag or other type of indicator, that when set to a first state (e.g., “1”) indicates an anomalous behavior, and when set to a second state (e.g., “0”) indicates normal (i.e., non-anomalous) behavior. In further examples, an anomaly score can include a numerical value that indicates a likelihood of anomalous behavior. For example, the anomaly score can range in value between 0 and 1, where 0 indicates with certainty that the entity is not exhibiting anomalous behavior, and 1 indicates with certainty that the entity is exhibiting anomalous behavior. Any value greater than 0 and less than 1 provides an indication of the likelihood of anomalous behavior, based on the confidence of the respective anomaly detector 120 that produced the anomaly score. Such an anomaly score can also be referred to as a likelihood score. In other examples, instead of ranging between 0 and 1, an anomaly score can have a different range of values to provide indications of different confidence amounts of the respective anomaly detector 120 in producing the anomaly score. In further examples, an anomaly score can be a categorical value that is assigned to one of several categories (e.g., low, medium, high).
The anomaly scores from the multiple anomaly detectors 120 can be combined to produce an anomaly detection output 122, where the anomaly detection output 122 can indicate whether or not a respective entity is an anomalous entity that is exhibiting anomalous behavior. The combining of the anomaly scores from the multiple anomaly detectors 120 can be a sum or other mathematical aggregate of the anomaly scores, such as an average, a weighted sum, a weighted average, a maximum, a harmonic mean, and so forth. A weighted aggregate (e.g., a weighted sum, a weighted average, etc.) is computed by multiplying a weight by each anomaly score, and then aggregating the products.
The anomaly detection output 122 can include the aggregate anomaly score produced from combining the anomaly scores from the multiple anomaly detectors 120, or some other indication of whether or not an entity is exhibiting an anomalous behavior.
In further examples, the anomaly detectors 120 can be ranked to identify a specified number of top-ranked anomaly detectors. Each anomaly detector 120 can produce a confidence score indicating its confidence in producing a respective anomaly score. The ranking of the anomaly detectors 120 can be based on the confidence scores. Instead of using all of the anomaly detectors 120 to identify an anomalous entity, just a subset (less than all) of the anomaly detectors 120 can be selected, where the selected anomaly detectors 120 can be the M top-ranked anomaly detectors 120 (where M ≥ 1).
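A minimal sketch of the weighted combination and top-M selection just described; the detector objects and their score/confidence methods are hypothetical stand-ins for the anomaly detectors 120:

```python
def weighted_aggregate(scores, weights):
    """Weighted average of per-detector anomaly scores."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

def top_m_detector_scores(detectors, entity, m):
    """Keep only the anomaly scores of the M detectors with the highest confidence."""
    ranked = sorted(detectors, key=lambda d: d.confidence(entity), reverse=True)
    return [d.score(entity) for d in ranked[:m]]
```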
FIG. 2 is a flow diagram of a process according to some implementations. The process includes generating (at 202), such as by the graphical representation generation engine 108, a graphical representation of entities associated with a computing environment.
The process further includes deriving (at 204), such as by the feature derivation engine 114, features for the entities represented by corresponding nodes of the graphical representation, where an edge between a pair of the nodes represents a relationship between the nodes in the pair, and the features include neighborhood features and link-based features. A neighborhood feature for a given entity is derived based on entities that are neighbors of the given entity in the graphical representation, and a link-based feature for the given entity is derived based on relationships of other entities throughout the graphical representation with the given entity.
The process further includes determining (at 206), using multiple anomaly detectors (e.g., 120) based on respective features of the derived features, whether the given entity is exhibiting anomalous behavior.
Although just one edge is shown between each pair of nodes in the graph 300, it is noted that in further examples, multiple edges can be present between a pair of nodes. Moreover, edges are shown as directed edges in FIG. 3; in other examples, some or all of the edges can be non-directional edges.
The graph 300 can be generated by the graphical representation generation engine 108 of FIG. 1.
The graph-based features can include neighborhood features and link-based features. In other examples, other types of features can be derived. More generally, the graph-based features are according to the structure and attributes of the graph 300.
Neighborhood Features
A neighborhood feature (also referred to as a local feature) for a given entity is derived based on entities that are neighbors of the given entity in the graph 300. In FIG. 3, for example, the local neighborhood of the node E includes the nodes that are connected to the node E by an edge.
Although a specific example of a local neighborhood of the node E is shown in FIG. 3, it is noted that a local neighborhood of a given node can more generally include the other nodes that are within a specified proximity of the given node, such as nodes within a specified number of edges (hops) of the given node.
In other examples, the specified proximity can be based on whether the other nodes are in a specified physical proximity of the given node (e.g., the other nodes are on the same rack as the given node, the other nodes are in the same building as the given node, the other nodes are in the same city as the given node, etc.). In further examples, the specified proximity can be based on whether the other nodes have a specified logical relationship to the given node (e.g., the other nodes are able to interact or communicate with the given node). In alternative examples, the local neighborhood of the given node can be defined in a different manner.
Examples of neighborhood features that can be derived from the structure and attributes of the local neighborhood of the node E in the graph 300 can include the in-degree of the node E (a number of edges directed into the node E), the out-degree of the node E (a number of edges directed out of the node E), and aggregate weights of the incoming and outgoing edges of the node E.
In other examples, other neighborhood features can be derived.
In a more specific example, a k-step egonet can be computed for each of the nodes of the graph 300. A k-step (k≥1) egonet of a given node includes the given node, all of the given node's k-step neighbors, and all edges between any of the given node's k-step neighbors or the given node.
In FIG. 3, for example, the 1-step egonet of the node E includes the node E, the 1-step (immediate) neighbors of the node E, and all edges between any of these nodes.
Once a k-step egonet of a given node is computed, neighborhood features can be derived based on the structure and attributes of the k-step egonet, such as a number of nodes in the egonet, a number of edges in the egonet, and an aggregate weight of the edges in the egonet.
In other examples, other neighborhood features can be derived from the k-step egonet.
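A sketch of the egonet computation using networkx (assumed here as in the earlier sketch); nx.ego_graph returns the given node, its neighbors within radius k, and the edges among them:

```python
import networkx as nx

def egonet_features(g, node, k=1):
    """Derive simple neighborhood features from the k-step egonet of `node`."""
    # For directed graphs, nx.ego_graph follows outgoing edges by default.
    ego = nx.ego_graph(g, node, radius=k)
    return {
        "egonet_nodes": ego.number_of_nodes(),
        "egonet_edges": ego.number_of_edges(),
        "egonet_weight": sum(d.get("weight", 1) for _, _, d in ego.edges(data=True)),
        "in_degree": g.in_degree(node),   # assumes a directed graph
        "out_degree": g.out_degree(node),
    }
```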
Link-Based Features
A link-based feature (also referred to as a global feature) for a given entity is derived based on relationships of other entities in the graph 300 with the given entity.
Generally, link-based features for a node of the graph 300 are derived based on the global structural properties of the graph 300.
Examples of link-based features include a PageRank, a Reverse PageRank, a hub score using the Hyperlink-Induced Topic Search (HITS) technique, and an authority score using the HITS technique. In other examples, other link-based features can be derived.
The computation of a PageRank is based on a link analysis that assigns numerical weighting to each node of the graph 300 to measure the relative importance of the node within the set of nodes of the graph 300. The measure of the relative importance of a node (such as the node E in FIG. 3) is based on the edges directed to the node: a node receives a higher PageRank if it is linked to by many other nodes, or by nodes that themselves have high PageRank values.
A reverse PageRank is computed by first reversing the direction of the edges in the graph 300, and then computing PageRank for each node using the PageRank computation discussed above.
The HITS technique (also referred to as a hubs and authorities technique) is a link analysis technique that can be used to rate nodes of a graph, based on the notion that certain nodes, referred to as hubs, serve as large directories that are not themselves authoritative in the information they hold, but that compile a broad catalog of information leading to other, authoritative nodes. In other words, a hub represents a node that points to a relatively large number of other nodes, and an authority represents a node that is linked to by a relatively large number of different hubs. The HITS technique assigns two scores to each node: an authority score, which estimates the value of the content of the node, and a hub score, which estimates the value of its links to other nodes. The HITS technique used in examples of the present disclosure is similar to that used for a web graph. The input to the HITS technique is the graph, and the authority score and hub score of a node depend on its in-degree and out-degree.
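These link-based features correspond to standard library routines; a sketch with networkx (assumed as above):

```python
import networkx as nx

def link_based_features(g):
    """Per-node global features: PageRank, reverse PageRank, HITS hub/authority."""
    pr = nx.pagerank(g, weight="weight")
    rev_pr = nx.pagerank(g.reverse(), weight="weight")  # reverse edges, then PageRank
    hubs, authorities = nx.hits(g)
    return {
        n: {"pagerank": pr[n], "reverse_pagerank": rev_pr[n],
            "hub": hubs[n], "authority": authorities[n]}
        for n in g.nodes
    }
```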
Parametric Anomaly Detection
Detection of anomalous entities can be based on probability distributions (also referred to as densities) computed for respective graph-based features derived by the feature derivation engine 114 of FIG. 1.
A probability distribution of a given graph-based feature can refer to a distribution of observed values of the given graph-based feature (e.g., the in-degree of the node E in the graph 300), where for each value of the given graph-based feature, the number of occurrences of the value is indicated in the distribution. A distribution of the given graph-based feature is a parametric distribution if the distribution is parameterized by certain parameters, such as the mean and standard deviation of the distribution. An example of a parametric distribution parameterized by a mean and a standard deviation is a normal distribution, such as the normal distribution 400 shown in FIG. 4.
In another example, a parametric distribution can be a power law distribution. A power law is a functional relationship between two quantities, where a relative change in one quantity results in a proportional relative change in the other quantity; in other words, one quantity varies as a power of the other.
An example of a power law distribution 500 is shown in FIG. 5. The probability density of a power law distribution can be expressed as

p(x; xmin, α) = ((α − 1) / xmin) · (x / xmin)^(−α), for x ≥ xmin,

where x is an input quantity (represented by the horizontal axis), and p(x; xmin, α) is the probability density (represented by the vertical axis) that is a power of the input quantity, x. The input quantity, x, can be a graph-based feature as discussed above. For the power law distribution, the parameters xmin and α parameterize the power law distribution.
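For this form of the power law, the exponent α has a well-known closed-form maximum-likelihood estimate; the sketch below assumes xmin is given:

```python
import math

def fit_power_law_alpha(values, xmin):
    """MLE of the exponent: alpha = 1 + n / sum(ln(x / xmin)) over x >= xmin."""
    xs = [x for x in values if x >= xmin]
    return 1.0 + len(xs) / sum(math.log(x / xmin) for x in xs)

def power_law_loglik(x, xmin, alpha):
    """Log of p(x; xmin, alpha) = ((alpha - 1) / xmin) * (x / xmin) ** (-alpha)."""
    return math.log(alpha - 1) - math.log(xmin) - alpha * math.log(x / xmin)
```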
In other examples, other types of parametric distributions can be characterized by other parameters. Other examples include a gamma distribution parameterized by a shape parameter k and a scale parameter θ; a t-distribution parameterized by a degrees-of-freedom parameter; and so forth.
For each parametric distribution (e.g., normal distribution, power law distribution, etc.), the parameters that parameterize the parametric distribution can be estimated based on “normal” event data, i.e., event data known to not include data of anomalous entities. Such event data can be referred to as training data.
In some examples, multiple parametric distributions can be computed for each graph-based feature individually. Given values of a respective graph-based feature (such as values of the respective graph-based feature computed based on historical event data records), multiple parametric distributions (including those noted above) can be generated for the respective graph-based feature.
An anomaly detector 120 in the anomaly detection engine 118 (FIG. 1) can use such parametric distributions to determine whether a data point (or set of data points) indicates an anomalous entity, as discussed below.
Two phases can be performed by the anomaly detector 120. A first phase (a training phase) uses historical data to determine which of the multiple parametric distributions to use, by comparing the likelihoods of the historical data given each parametric distribution. A computed likelihood represents the probability of observing a data point (or set of data points) given a respective parametric distribution. The parameters of each parametric distribution are estimated, and the distribution with the maximum likelihood is selected. Once a distribution is selected, a validation data set can be used to determine a threshold for the selected parametric distribution. A validation data set includes data points, some of which are known to not represent anomalous entities, and others of which are known to represent anomalous entities. Using the validation data set, a threshold in the parametric distribution can be selected, namely a threshold that divides the data points that are known to not represent anomalous entities from the data points that are known to represent anomalous entities. The threshold can be set by a human analyst, or by a machine or program based on a machine learning process, for example.
Once the parametric distribution is selected and the corresponding threshold is known, a second phase (an anomalous entity detection phase) can be performed, where the anomaly detector 120 is ready to detect anomalous data points. Given a new data point or set of data points (i.e., feature values), the anomaly detector 120 computes its likelihood based on the selected distribution and selected parameters, and the anomaly detector 120 uses the threshold to determine if the data point or set of data points corresponds to an anomalous entity.
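A condensed sketch of both phases for a normal distribution; the threshold here is taken as given, whereas in practice it would be chosen using the labeled validation data set described above:

```python
import math

def fit_normal(train_values):
    """Training phase: estimate the mean and standard deviation parameters."""
    n = len(train_values)
    mu = sum(train_values) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in train_values) / n)
    return mu, sigma

def normal_loglik(x, mu, sigma):
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def is_anomalous(x, mu, sigma, threshold):
    """Detection phase: flag the data point when its likelihood falls below the threshold."""
    return normal_loglik(x, mu, sigma) < threshold
```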
The above procedure can be used for individual features or for joint (multivariate) features, as discussed further below.
Each respective parametric distribution is associated with a likelihood function. For example, for the normal distribution, a log likelihood function can be used to compute the likelihood of a data point occurring given the normal distribution. Similarly, a power law distribution has a log likelihood function that can be used to compute the likelihood of a data point occurring given the power law distribution.
The computed likelihood is then compared to a threshold of the selected parametric distribution. If the likelihood is less than the threshold (or has some other specified relationship to the threshold, such as greater than, within a range of, etc.), then the currently considered data point (or set of data points) is marked as indicating an anomalous entity.
For example, in FIG. 4, a data point whose value falls in a tail of the normal distribution 400, such that the likelihood of the data point is below the threshold, can be marked as indicating an anomalous entity.
In the foregoing, reference is made to computing parametric distributions for each graph-based feature individually. In further examples, each parametric distribution can be computed for a subset of multiple graph-based features, such as a pair of graph-based features or a subset of more than two graph-based features. A parametric distribution computed based on a subset of multiple graph-based features can be referred to as a multivariate or joint parametric distribution.
For example, a multivariate normal distribution can have multiple different horizontal axes representing respective different graph-based features of the subset of graph-based features. Similarly, a multivariate power law distribution can have multiple different horizontal axes representing respective different graph-based features of the subset of graph-based features.
Thresholds can be determined for each multivariate parametric distribution, and such thresholds can be used to determine whether a currently considered data point (or set of data points) indicates an anomalous entity.
More generally, a first anomaly detector 120 can compute a first parametric distribution of a first subset of the graph-based features (where the first subset can include just one graph-based feature, a pair of graph-based features, or more than two graph-based features), and determine whether a given entity is exhibiting anomalous behavior based on the first parametric distribution. The first anomaly detector 120 determines whether the given entity is exhibiting anomalous behavior based on a threshold for the first parametric distribution.
A second anomaly detector 120 can compute a second parametric distribution of a different, second subset of the graph-based features, and determine whether the given entity is exhibiting anomalous behavior based on the second parametric distribution.
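A sketch of a joint (multivariate) normal distribution over a subset of graph-based features, using scipy (an assumption of this sketch); each row of the training matrix holds one entity's feature values:

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_joint_normal(train_points):
    """Fit a multivariate normal over a subset of graph-based features."""
    pts = np.asarray(train_points)  # shape: (num_entities, num_features)
    return multivariate_normal(mean=pts.mean(axis=0), cov=np.cov(pts, rowvar=False))

# dist.logpdf(x) yields the joint log-likelihood of a new data point x,
# which is then compared against the threshold for this distribution.
dist = fit_joint_normal([[1.0, 2.0], [1.2, 1.9], [0.9, 2.2], [1.1, 2.1]])
score = dist.logpdf([5.0, 0.1])
```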
Non-Parametric Anomaly Detection
In alternative examples, instead of performing anomaly detection using parametric distributions, non-parametric anomaly detection for detecting anomalous entities can be performed.
For example, an anomaly detector 120 can explore pair-wise relationships between graph-based features (two graph-based features, or more than two graph-based features). Instead of fitting a parametric function (that represents a parametric distribution), the anomaly detector 120 can estimate a density of data points in a neighborhood of a currently considered data point (that represents the graph-based features for a currently considered entity). Essentially, given the currently considered data point, the anomaly detector 120 can retrieve the K (K ≥ 1) nearest neighbors to the currently considered data point, and estimate the density of the currently considered data point based on the distances of the currently considered data point to the K nearest neighbors.
This computed density is then used to estimate an anomaly score for the currently considered entity.
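A pure-numpy sketch of the K-nearest-neighbor distance aggregation just described; it assumes the query point is not already among the stored points (otherwise its zero self-distance would be dropped first):

```python
import numpy as np

def knn_aggregate_distance(points, query, k):
    """Mean distance from `query` to its K nearest neighbors among `points`."""
    pts = np.asarray(points, dtype=float)
    dists = np.linalg.norm(pts - np.asarray(query, dtype=float), axis=1)
    return float(np.sort(dists)[:k].mean())  # large value => isolated => likely anomalous
```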
The vertical axis of the plot of FIG. 6 represents values of a first graph-based feature, and the horizontal axis represents values of a second graph-based feature; each data point 602 in the plot represents an entity.
The position of a given data point 602 on the plot is based on the value of the first graph-based feature and the value of the second graph-based feature in the given data point 602.
In the example of FIG. 6, two data points 604 and 606 are considered for anomaly detection. The given anomaly detector 120 determines the distances of the data point 604 to its K nearest neighbors (the K data points nearest the data point 604 in the plot), and aggregates (e.g., sums, averages, etc.) the distances to produce an aggregate distance for the data point 604.
Similarly, the given anomaly detector 120 determines the distances of the data point 606 to its K nearest neighbors (the K data points nearest the data point 606 in the plot shown in FIG. 6), and aggregates the distances to produce an aggregate distance for the data point 606.
The aggregate distance of the data point 604 and the aggregate distance of the data point 606 are compared to a specified threshold distance. If the aggregate distance is greater than the specified threshold distance (or has some other specified relationship to the specified threshold distance), then the corresponding data point is indicated as representing an anomalous entity. In the example of FIG. 6, whichever of the data points 604 and 606 has an aggregate distance greater than the specified threshold distance is indicated as representing an anomalous entity.
Effectively, with the non-parametric detection technique discussed above, the given anomaly detector 120 looks for an isolated data point in the plot of FIG. 6, i.e., a data point that is relatively far from its nearest neighbors.
In the example of FIG. 6, a given anomaly detector 120 identifies anomalous entities based on the graph-based features (two or more) of one subset of graph-based features.
Another anomaly detector can be used to identify anomalous entities based on graph-based features (two or more) of another subset of graph-based features. Further anomaly detectors can be used to identify anomalous entities based on graph-based features (two or more) of respective further subsets of graph-based features.
More generally, an anomaly detector 120 computes a density measure for a given data point based on relationships of the given data point to other data points. The anomaly detector 120 uses the density measure to determine whether an entity represented by the given data point is exhibiting anomalous behavior.
In examples according to FIG. 6, the K nearest neighbors of each new data point are searched for as the new data point is received for consideration.
For large data sets including a large number of data points, searching for the K-nearest neighbors can be expensive from a processing perspective. In alternative implementations of the present disclosure, instead of searching for the K nearest neighbors as new data points are received for consideration, the anomaly detection engine 118 can construct a grid of data points for each subset of graph-based features, identify multiple cells in the grid, and pre-compute the density in each of the cells of the grid. A “grid” can refer to any arrangement of data points where one axis represents one graph-based feature, and another axis represents another graph-based feature. More generally, a grid can be a multi-dimensional grid that has two or more axes that represent respective different graph-based features.
As part of a pre-computation phase for the grid of FIG. 7, a density is pre-computed for a given cell of the grid based on relationships (e.g., distances) between the data points in the given cell and other data points in the grid.
A similar process is performed for the other cells of the grid of FIG. 7 to pre-compute a respective density for each cell.
Once all of the cell densities are computed, the pre-computation phase is completed.
Next, an anomaly detection phase is performed for a new data point. In response to receiving the new data point, the K-nearest neighbors of the new data point do not have to be identified. Instead, an anomaly detector 120 locates the cell (of the multiple cells in the grid of FIG. 7) into which the new data point falls, and retrieves the pre-computed density of the located cell for use as the density of the new data point.
The density of the new data point is used as the estimated anomaly score.
In some examples, an index can be used to map the values of the first and second graph-based features of the new data point to a corresponding cell to retrieve the cell density of the corresponding cell. The index correlates ranges of values of the first and second graph-based features to respective cells.
The grid of FIG. 7 is a two-dimensional grid for a respective subset of two graph-based features.
More generally, a given anomaly detector pre-computes density measures for respective cells in a multi-dimensional grid that associates the features of a subset of the derived features. The given anomaly detector determines which given cell of the cells a data point corresponding to an entity falls into, and uses the density measure of the given cell as the computed density measure for the entity, where the computed density measure is used as an anomaly score.
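A sketch of the grid-based pre-computation and constant-time lookup for a two-feature subset; numpy's histogram2d plays the role of the pre-computed cell densities (here, simply the fraction of points per cell), and digitize plays the role of the index described above:

```python
import numpy as np

def precompute_grid(points, bins=20):
    """Pre-computation phase: bin historical data points into a 2-D grid of cell densities."""
    pts = np.asarray(points, dtype=float)
    counts, x_edges, y_edges = np.histogram2d(pts[:, 0], pts[:, 1], bins=bins)
    return counts / counts.sum(), x_edges, y_edges  # fraction of points per cell

def lookup_density(x, y, density, x_edges, y_edges):
    """Detection phase: index the cell the new data point falls into; no K-NN search."""
    i = int(np.clip(np.digitize(x, x_edges) - 1, 0, density.shape[0] - 1))
    j = int(np.clip(np.digitize(y, y_edges) - 1, 0, density.shape[1] - 1))
    return density[i, j]  # a low cell density suggests an anomalous entity
```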
Example Systems
FIG. 8 is a block diagram of a system 800 that includes a processor (or multiple processors) 802. The system 800 further includes a non-transitory machine-readable or computer-readable storage medium 804 storing machine-readable instructions executable on the processor 802 to perform various tasks. Machine-readable instructions executable on a processor can refer to machine-readable instructions executable on one processor or on multiple processors.
The machine-readable instructions include cell density computing instructions 806 to, for a subset of features of entities associated with a computing environment, pre-compute densities of cells within a multi-dimensional grid (e.g., cells in the grid shown in FIG. 7).
The density pre-computed for a respective cell of the cells is based on relationships between data points in the respective cell and other data points in the multi-dimensional grid.
The machine-readable instructions further include cell identifying instructions 808 to, in response to receiving a data point for a particular entity, identify a cell corresponding to the data point for the particular entity. The machine-readable instructions further include anomaly detecting instructions 810 to use the pre-computed density of the identified cell in determining whether the particular entity is anomalous.
The storage medium 804 (FIG. 8) can be implemented with disk-based storage device(s), solid state storage device(s), and/or other type(s) of storage or memory device(s).
In the foregoing description, numerous details are set forth to provide an understanding of the subject disclosed herein. However, implementations may be practiced without some of these details. Other implementations may include modifications and variations from the details discussed above. It is intended that the appended claims cover such modifications and variations.