The present invention concerns the detection of anomalous behavior in systems displaying a typified complex behavior, encoded in a digital signal, through the study of a computational model (the artificial system) of interacting agents defined using information contained in the digital signal and forcing agents to engage in a maximally frustrated dynamics.
The detection of anomalous behavior in systems with a typical complex behavior is a difficult task, given the wide variety of behaviors that can characterize the system's typical behavior as well as the wide variety of possible anomalous behaviors. The main obstacle is developing methods that are autonomous, precise and applicable to a large range of problems.
There are essentially two types of methods: 1) detection of anomalies using statistical or spectral analysis of the system's behavior; 2) intrusion detection through the detection of unfamiliar elements.
The book [2] presents methods of type 1. These methods require that anomalies have an impact on the statistics characterizing the system's behavior. They use, in general, the past behavior of the system to establish a profile of normality. Given the statistical nature of these methods, the number of undetected anomalies (false negatives) is large, because these methods require time to react. These methods also have difficulty distinguishing whether statistical fluctuations represent anomalies or legitimate behavior, so the number of false positives is also large. These methods nevertheless have the advantage of being able to detect intrusions that share features with the system's legitimate behavior. For instance, they can detect that a sequence which is usually present occurs with an unusually high frequency. However, these analyses have to be implemented specifically for each case. The present invention produces detections of this type using self-organized mechanisms, which allows a considerably larger number of correlations to be analyzed without requiring input information from a user. Under the same circumstances, the method produces a small number of errors.
Another possible approach consists in using non-parametric tests (for instance, Kolmogorov-Smirnov or Anderson-Darling) to evaluate whether a sample from the signal deviates significantly from the behavior that could be predicted from the probability distribution characterizing the system's typical behavior. Relative to these methods, the present invention has the advantage of detecting spatial correlations (within samples) and dynamical correlations (evolving in time), which cannot be accessed with the previous methods.
Type 2 methods can be divided into two types: methods for the detection of anomalous behavior already experienced and registered in a database (based on ‘signatures’), and methods for detecting behavior never registered before. The first type of method is commonly used in commercial antivirus software. However, their databases tend to grow fast, because the number of possible intrusion variants grows exponentially with the number of small changes that could be present. For this reason, these methods require considerable resources, not only in memory but also in computational time. Besides, as databases must be kept within reasonably limited sizes, these methods require continuous updates to consider new threats at the cost of neglecting older ones. These methods are necessarily vulnerable and they require prior knowledge of possible anomalies. Hence, they cannot avoid damage from anomalies that have never been registered before.
By contrast, the present invention does not require prior knowledge of potential anomalies and for that reason it does not have this type of vulnerability. Moreover, the number of potential intrusions that it can detect is considerably larger.
The present invention is closer to the second type of methods [1]. Documents [3-8] discuss the most recent developments of this class of methods, designated negative selection methods. Like the present invention, negative selection algorithms assume that the system's normal behavior can be encoded in a digital signal. From this signal, smaller sequences are defined. The set of these sequences defines the profile of the system's normal functioning (designated ‘self’).
Negative selection methods use a so-called negative selection trial and error process to define detection domains that do not contain sequences from the original data signal. A detector is associated to each detection domain. The goal is to define a set of detection domains covering the whole set of sequences not present in the original signal.
When a new data series is tested, the algorithm detects anomalies if a sequence extracted from the new data series belongs to a detection domain defined before. In that case the algorithm signals the detection of a foreign sequence or an abnormal behavior. These algorithms lead to two types of errors. The first results from the difficulty in defining detection domains that cover the whole space of foreign sequences. Errors of this type cannot be avoided with these methods. However, some techniques have been developed to guarantee that errors of this type do not exceed 5% of the possible foreign sequences. This requires increasing the total number of detectors. It has nevertheless been noted that this difficulty cannot be overcome when sequences use a large number of digits. In that case, the number of required domains diverges, making these methods unfeasible. On the contrary, the present invention performs perfect detection even in that case.
The second type of error results from the fact that these methods are blind to the presence of sequences that occur in the original signal but appear with an ordering or frequency different from that in the original signal. By contrast, the present invention detects anomalies of this kind. This is due to the fact that the method of detection in the present invention is based on conceptually different mechanisms. It should be mentioned that it is likely that the majority of successful intrusions in computer systems belong to anomalies of this type, and for this reason the present invention is particularly relevant.
The present invention belongs to the previous type of methods because it uses a series of digital signals to build a set of sequences that define the normal behavior of the system to protect. Moreover, it also uses a negative selection procedure to define detectors. However, this invention works in a conceptually different way, as will become apparent below.
The goal of the present invention is to detect anomalies using sequences from a data set describing a system's behavior.
The present invention uses sequences defined from a digital signal to build the profile of a system's typical behavior. It can be used to detect unfamiliar sequences or unfamiliar combinations of familiar sequences.
Thus it can be used to develop systems dedicated to the detection of anomalies and intrusions in computer systems, the detection of coding errors in DNA or abnormal protein sequences, the detection of tumors using medical imaging, and the detection of unfamiliar substances, quality control or authenticity checking using spectroscopy, among others.
The invention operates in three stages: education of a repertoire of detectors, calibration and detection. During the repertoire education stage, the invention uses sequences characterizing the normal behavior of the system under study (called the target system) to define interacting agents in a computational model (artificial model). The calibration stage uses agents defined as in the education stage to compute parameters defining the normal behavior of the target system. During the detection stage, changes in the target system behavior produce measurable changes in the agents' behavior in the artificial system, which in this way signals the presence of anomalies.
Compared with the prior art, instead of triggering responses depending on whether sequences fall within detectors' domains, the present invention establishes an interaction dynamics involving detectors and uses their dynamical properties to signal anomalies.
Besides a negative selection process, the present method also uses a positive selection process during the repertoire education stage, in which detectors that cannot establish contacts are eliminated.
The invention describes a system and a general method of detection of anomalous behaviors for systems with a typical behavior described by a digital signal.
It is described how to detect deviations from the target system normal behavior by studying a computational model (artificial system) of interacting agents. The computational model is a kind of cellular automaton that uses a new type of rules to update the system's agent (or cell) states.
In the present invention, agents in the artificial system are defined using the information contained in the digital signal and by imposing that they engage in a maximally frustrated dynamics. In that way, changes in the target system behavior lead to a measurable decrease in frustration in the artificial system dynamics. The method detects perfectly any sequence that has never been produced by the target system normal behavior. It also detects combinations of already presented sequences that had not been presented before.
The invention works as a sophisticated non-parametric statistical test that is capable of detecting deviations from an arbitrary probability distribution, as well as spatial correlations (i.e., within sequences from the signal) and dynamical correlations (that can evolve in time).
The invention can be applied to intrusion detection in computer security, to the analysis of data in genomics and proteomics, in spectroscopy, in image processing, in medicine and economics.
For an easier understanding of the invention, figures are attached which represent preferred embodiments of the invention and which are not intended to limit the object of this invention.
in the detection stage, sequences are obtained from a signal from the complex system to be tested (possibly with anomalies) and agents are defined (steps 1D and 2D), agents engage in the detection dynamics with anergy (steps 4D and 5D), the number of long contacts is calculated (step 6D), and the whole process is repeated until the predefined number of iterations is reached (stopping criterion C), after which the computed dynamical parameters are compared with the corresponding parameters registered during the calibration stage (step 8D), possibly triggering an alarm signal (stopping criterion D);
A: the mapping of the complex behavior of a system, or of a digital signal, in sequences where (001011100111010) represents the original digital signal,
(A) represents a sequence,
(B) represents a sequence, and
(C) represents a sequence.
B: the definition of the initial population of agents where (presenters) represent presenter agents;
(detetors) represent detector agents;
(A), (B) and (C) represent presenter agents presenting respectively sequences (A), (B) and (C);
(ligand) represents each agent's ligand;
(receptor) represents each agent's receptor;
(cluster1) and (cluster2) represent groups, clusters of agents, and
the dashed line represents presenter agent A connectivity list.
C: agents interaction dynamics where
(decisions) represents the creation of a new pair, even if the agent was already paired, provided that it improves its preference.
D: the definition of successive populations of detector agents after positive and negative selection and the creation of the repertoire.
A: the mapping of the complex behavior of a target training system, or of a digital signal, in sequences where (001011100111010) represents the original digital signal,
(A) represents a sequence,
(B) represents a sequence, and
(C) represents a sequence.
B: the definition of the initial population of agents where (presenters) represent presenter agents;
(detetors) represent detector agents;
(A), (B) and (C) represent presenter agents presenting respectively sequences (A), (B) and (C);
(ligand) represents an agent's ligand;
(receptor) represents an agent's receptor;
(cluster1) and (cluster2) represent groups, clusters of agents, and
the dashed line represents presenter agent A connectivity list.
C: the agents interaction dynamics where
(decisions) represents the creation of a new pair, even if either of the agents was already paired, provided it improves its preference;
(anergy) represents the substitution of a detector agent with another equivalent agent in population X from the repertoire, whenever it terminates a pairing that lasted for a time τ larger than a pre-established time τa, without forming new pairings.
D: the evaluation of parameters that define the dynamics of the system in the absence of anomalies (normality), namely
Tdet(I), the duration of the longest contacts in which presenter agent I participated;
Tinat(I), the duration of the longest periods of time during which agent I did not establish any new pairing;
ndet(I), the number of pairings established by a presenter agent I and lasting Tdet(I) in a given time interval;
ninat(I), the number of periods of time with a duration equal or larger than Tinat(I) and during which presenter agent I could not form a new pairing;
A: the mapping of the complex behavior of a system to test, or of a test digital signal, in sequences where (001011100111010) represents the original digital signal,
(A) represents a sequence,
(B) represents a sequence, and
(C) represents a sequence.
B: the definition of the initial population of agents where (presenters) represent presenter agents;
(detetors) represent detector agents;
(A), (B) and (C) represent presenter agents presenting respectively sequences (A), (B) and (C);
(ligand) represents an agent's ligand;
(receptor) represents an agent's receptor;
(cluster1) and (cluster2) represent groups, clusters of agents, and
the dashed line represents presenter agent A connectivity list.
C: the agents' interaction dynamics where
(decisions) represents the creation of a new pair, even if either of the agents was already paired, provided it prefers the new pairing;
(anergy) represents the substitution of a detector agent with another equivalent agent in population X from the repertoire, whenever it terminates a pairing that lasted for a time τ larger than a pre-established time τa, without forming new pairings.
D: the evaluation of parameters which define the dynamics of the system to be tested and comparison with the parameters found for the system in the absence of anomalies, and activation or not of an alarm system, and where namely
ncos(I) is the number of pairings established by presenter agent I and lasting Tdet(I) during the detection stage;
ndet(I) is the number of pairings established by presenter agent I and lasting Tdet(I) during the training stage;
naus(I) is the number of periods of time with a duration larger than or equal to Tinat(I) during which presenter agent I was not able to establish a pairing during the detection stage;
ninat(I) is the number of periods of time with a duration larger than or equal to Tinat(I) during which presenter agent I was not able to establish a pairing during the training stage.
The present invention detects anomalies in the behavior of a target system using digital data series. It is assumed that a data series or a data set is available with the necessary information about the typical complex behavior of the system. From them, sets of sequences with a fixed number of digits can be defined. The method relates the information contained in the set of sequences to the behavior of a computational model (artificial system) of interacting agents. The computational model is a new type of cellular automaton in which agents' states evolve dynamically following rules that make use of a temporal component associated with agents' states.
Agents in the artificial system are defined using sequences from the original data series and in such a way as to engage in a maximally frustrated dynamics. Changes in the behavior of the target system decrease the frustration in the artificial system which can be measured and used to trigger the detection system. The way in which sequences can be defined from the original data can be diverse. The method establishes how these sequences can be used to detect sequences that were absent from the original (training) signal or to detect sequences that had already been present but appear in new combinations.
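As an illustration of how fixed-length sequences might be defined from a digital signal, the following is a minimal sketch and not the method prescribed by the invention. The function names `extract_sequences` and `sequence_to_index`, the window length and the step are all assumptions chosen for the example; the text leaves the way sequences are defined open.

```python
def extract_sequences(signal: str, length: int, step: int = 1) -> list[str]:
    """Slide a window of `length` digits over `signal`, advancing by `step`.

    Both `length` and `step` are illustrative parameters (assumptions).
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = len(signal) - length
    return [signal[i:i + length] for i in range(0, last_start + 1, step)]


def sequence_to_index(sequence: str, base: int = 2) -> int:
    """Map a sequence of digits onto a natural number (here: its base-2 value)."""
    return int(sequence, base)


# The example signal used in the figure legends, cut into three 5-digit sequences.
signal = "001011100111010"
sequences = extract_sequences(signal, length=5, step=5)
indices = [sequence_to_index(s) for s in sequences]
print(sequences)  # ['00101', '11001', '11010']
print(indices)    # [5, 25, 26]
```

With `step=1` the same function yields overlapping sequences, which is one of several possible choices.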
The method operates in three stages, education, calibration and detection (
The education stage uses a trial and error procedure to replace detector agents that have not been able to establish a pairing with a presenter agent (a mechanism designated by positive selection), or that establish pairings with presenter agents that are too stable (a mechanism designated by negative selection). These detector agents are replaced by others with a set of random features, as it will be defined below.
Positive selection increases the global interactivity of the several agents, making the dynamics more homogeneous. In that way maximal pairing lifetimes become representative of the population dynamics, so that negative selection will act over all agents simultaneously and not only over a subset. Negative selection maximizes frustration with respect to the information presented by reducing pairing lifetimes between the two types of agents. The information related to sequences or combinations of sequences not present in the original signal does not influence the artificial system dynamics. As such, their appearance during the detection stage disturbs the dynamics, leading to long pairing lifetimes which signal the presence of anomalies.
During the calibration stage the artificial system dynamics with educated detectors is analyzed to determine parameters that characterize the dynamics in the absence of anomalies. During the detection stage, the repertoire of educated detectors is used to build a population of detectors that interacts with presenter agents. Presenter agents are defined using the sequences of the digital signal to be tested. Given that the number of diverse detectors leading to the same maximally frustrated dynamics can be extremely large, detector agents are continuously replaced anytime they terminate pairings that are not necessarily too stable. This mechanism, called anergy, allows detector agents to be replaced by other equivalent detectors contained in the repertoire. These detectors nevertheless produce a different dynamics in the presence of sequences, or combinations of sequences, that were not presented during the training (education) stage. In particular, a finite number of detectors can establish stable pairings. The number of stable pairings involving a presenter agent is called costimulation. Costimulation and anergy are used simultaneously to determine whether a presenter agent can establish stable pairings with many different detector agents, in which case the presence of anomalies is signaled, or whether the number of stable pairings it forms is small and derives from interactions with a small number of badly educated detectors in the population.
The computational model establishes a set of interaction rules that each agent follows and which change its dynamical state. An agent is an element from a population of agents with the following attributes:
Each agent can be associated to a state that registers whether the agent is paired or not. In case it is paired, the agent's state records the agent it is paired with.
All agents, presenters and detectors alike, use the same interaction rules: each tries to pair with agents of the other type that present ligands placed as high as possible in its receptor list.
Anytime two agents of opposite type interact, a new pair is formed if each agent prefers the other's ligand to the ligand displayed by the agent it is currently paired with (in case it was previously paired). In that case, the previous pairings are terminated. An agent that is not paired will form a pair with any agent of the opposite type listed in its connectivity list, provided its own ligand is preferred by the other agent's receptor. These interaction rules assume that each agent can only establish a stable pair with one agent at a time.
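The pairing rule just described can be sketched as follows. This is an illustrative reading under stated assumptions, not the reference implementation: the `Agent` class, its field names and the `try_pair` function are hypothetical, receptor lists are taken to be explicit lists ordered from most to least preferred ligand, every ligand is assumed to appear in the receptor lists that rank it, and the connectivity check is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Agent:
    ligand: int
    receptor: list[int]  # ligands ordered from most to least preferred
    partner: Optional["Agent"] = field(default=None, repr=False)

    def rank(self, ligand: int) -> int:
        """Position of `ligand` in the receptor list (lower means preferred)."""
        return self.receptor.index(ligand)

    def prefers(self, candidate: "Agent") -> bool:
        """An unpaired agent accepts anyone; a paired one must improve its rank."""
        if self.partner is None:
            return True
        return self.rank(candidate.ligand) < self.rank(self.partner.ligand)


def try_pair(a: Agent, b: Agent) -> bool:
    """Form the pair (a, b) only if both agents prefer it to their current state."""
    if a.partner is b or not (a.prefers(b) and b.prefers(a)):
        return False
    for agent in (a, b):
        if agent.partner is not None:
            agent.partner.partner = None  # the abandoned partner becomes unpaired
    a.partner, b.partner = b, a
    return True
```

Because an agent leaves its partner as soon as a better-ranked ligand comes along, pairs are repeatedly broken, which is the source of the frustration the method relies on.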
Ordered receptor lists can be defined implicitly or explicitly. Implicit orderings can be established using one-parameter functions, such as random number generators, that assign a score to each sequence. The explicit ordered list can be obtained by ordering the scores associated with each sequence. Different and diverse lists can be obtained by changing the function parameter, for instance the seed of the random number generator. The invention achieves qualitatively equivalent results using either an explicit or an implicit definition of the lists.
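A minimal sketch of the implicit and explicit definitions might look as follows. The function names `implicit_score` and `explicit_receptor_list` are hypothetical, and combining the seed with the sequence index into a string seed is merely one convenient way (an assumption of this example) to obtain a reproducible score per sequence.

```python
import random


def implicit_score(seed: int, sequence_index: int) -> float:
    """Deterministic pseudo-random score of a sequence for a given seed.

    The seed plays the role of the single function parameter; the string
    used to seed the generator is an illustrative choice.
    """
    return random.Random(f"{seed}:{sequence_index}").random()


def explicit_receptor_list(seed: int, sequence_indices: list[int]) -> list[int]:
    """Explicit ordered list obtained by sorting sequences by their score."""
    return sorted(sequence_indices, key=lambda s: implicit_score(seed, s))


sequence_indices = [5, 25, 26, 7, 12]
receptor_a = explicit_receptor_list(seed=1, sequence_indices=sequence_indices)
receptor_b = explicit_receptor_list(seed=2, sequence_indices=sequence_indices)
print(receptor_a, receptor_b)
```

The implicit form only needs the seed to be stored, since the rank of any sequence can be recomputed on demand, whereas the explicit form stores the whole ordering.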
The implementation of the computational algorithm assumes that N sequences obtained from a digital signal describing the behavior of the target system are provided, and that any sequence can be mapped onto a finite subset of natural numbers (
The algorithm is divided in three stages: repertoire education (
The calibration stage uses the populations of detector agents obtained at the end of the education process to establish parameters characterizing the usual dynamics of the interacting agents.
During the detection stage, characteristic pairing lifetimes increase considerably relative to the values found in the calibration stage whenever a sequence is presented that was never presented during the education stage.
To each detector agent is associated:
To each presenter agent is associated:
Registers storing the duration of pairings for each agent, and the time each agent spent without establishing pairings, are set to zero.
Pairs of presenter and detector agents with ligands in their corresponding connectivity lists are put in interaction. Denote by i and j their identifier indices and by p(j,i) the rank of the ligand presented by agent j in the receptor list of agent i. Agents i and j form a new pair:
The connectivity lists, receptor lists and cluster indices of detector agents that have not formed pairs for a number of iterations larger than τpos (designated the positive selection time) are replaced by new randomly drawn items (for example, the connectivity list K and the cluster index C), and the agent's state E is set to zero.
In case no detector agents satisfy the previous condition, the positive selection threshold time τpos is updated to the largest duration time a detector agent remained without establishing pairings in the last W iterations.
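The positive selection step and the update of τpos could be sketched as below. This is a simplified illustration under assumptions: the `Detector` class and its field names, the helper `random_detector`, and the parameters `n_ligands`, `n_clusters` and `k` (the connectivity list length) are all hypothetical, and the longest idle stretch over the last W iterations is assumed to be tracked elsewhere and stored in `max_idle_recent`.

```python
import random
from dataclasses import dataclass


@dataclass
class Detector:
    connectivity: list[int]   # ligands this detector may interact with (K)
    receptor: list[int]       # ligands ordered by preference (R)
    cluster: int              # cluster index (C)
    idle_time: int = 0        # iterations since the last pairing
    max_idle_recent: int = 0  # longest idle stretch in the last W iterations


def random_detector(rng: random.Random, n_ligands: int, n_clusters: int, k: int) -> Detector:
    """Draw a fresh, unpaired detector with random lists and cluster index."""
    return Detector(
        connectivity=rng.sample(range(n_ligands), k),
        receptor=rng.sample(range(n_ligands), n_ligands),  # random permutation
        cluster=rng.randrange(n_clusters),
    )


def positive_selection(detectors, tau_pos, rng, n_ligands, n_clusters, k):
    """Replace detectors idle for more than tau_pos; otherwise tighten tau_pos."""
    replaced = 0
    for i, detector in enumerate(detectors):
        if detector.idle_time > tau_pos:
            detectors[i] = random_detector(rng, n_ligands, n_clusters, k)
            replaced += 1
    if replaced == 0:
        # No detector exceeded the threshold: lower it to the longest
        # idle period observed in the last W iterations.
        tau_pos = max(d.max_idle_recent for d in detectors)
    return tau_pos, replaced
```

Updating τpos only when no detector is replaced lets the threshold shrink gradually as the population becomes more interactive.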
Connectivity and receptor lists of detector agents remaining paired for a number of iterations larger than τneg—designated negative selection time—are replaced by new randomly drawn lists, as for example the receptor list R, and the states of the paired agents are set to zero.
In case no detector agents satisfy the previous condition, the negative selection threshold time τneg is updated to the largest duration time a detector agent remained paired. The population of detector agents is recorded.
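Negative selection can be sketched in the same spirit. Again this is an illustration under assumptions rather than the definitive procedure: the `PairedDetector` class, the `presenter_partner` list (mapping each presenter index to its current detector index, or `None`) and the parameters `n_ligands` and `k` are hypothetical names introduced for the example.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class PairedDetector:
    connectivity: list[int]
    receptor: list[int]
    partner: Optional[int] = None  # index of the paired presenter, if any
    paired_time: int = 0           # iterations spent in the current pairing


def negative_selection(detectors, presenter_partner, tau_neg, rng, n_ligands, k):
    """Redraw detectors paired for more than tau_neg; otherwise tighten tau_neg."""
    longest = max((d.paired_time for d in detectors), default=0)
    replaced = 0
    for detector in detectors:
        if detector.partner is not None and detector.paired_time > tau_neg:
            presenter_partner[detector.partner] = None  # free the presenter
            detector.partner = None
            detector.paired_time = 0
            detector.receptor = rng.sample(range(n_ligands), n_ligands)
            detector.connectivity = rng.sample(range(n_ligands), k)
            replaced += 1
    if replaced == 0:
        # No pairing was too stable: lower the threshold to the longest
        # pairing duration currently observed.
        tau_neg = longest
    return tau_neg, replaced
```

Keeping the connectivity list and cluster index fixed and redrawing only `receptor` would give the reshuffling variant of this step.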
Increment the iteration number by one and, in case it does not exceed a maximum value, return to step 4E. Otherwise, terminate the education process. Register the last selected population of detectors and add it to the repertoire of educated detectors.
The repertoire of educated detector agents should be enlarged by repeating the previous procedure several times (typically more than 10 times) using different random number generations. Steps 5E and 6E can be modified in order to take into account the definition of the first population of educated agents. In that case step 6E* should be used instead:
Receptor lists of detector agents remaining paired for a number of iterations larger than τneg (designated the negative selection time) are randomly reshuffled: for example, the cluster index C is maintained and the receptor list R is replaced by a random permutation. The states of the paired agents are set to zero.
In case no detector agents satisfy the previous condition, the negative selection time τneg is updated to the largest duration time a detector agent remained paired. The population of detector agents is recorded.
At the end of the education process several educated populations are registered. The detector agents with the same identifier index I in each population have the same index C and the same connectivity list K.
An important modification consists in changing the sequences presented by presenter agents every J iterations. In that case, before reaching the final iteration, step 7E calls step 2E. All registers are reinitialized and the population of presenter agents is defined according to the new sequences to be presented. Detector agents are kept, so that those remaining in the population maximize frustration for several sets of sequences presented by presenter agents.
Another modification that improves the convergence of the algorithm consists in ceasing to change connectivity lists after a given iteration (for instance, during the second half of the average number of iterations required to educate a population), in which case step 5E is omitted, as well as the modification of the connectivity list in step 6E.
To perform monitoring and anomaly detection, the algorithm requires two additional stages after education, namely calibration and detection. The calibration stage is needed to find parameters characterizing the normal behavior of the system. In this stage the same sequences that were presented during the education stage are used, to produce operating conditions in the absence of anomalies. The detection stage uses sequences taken from a signal to be tested.
A population of detector agents in the repertoire is selected and the agents' states are set to zero.
A population of presenter agents is defined as in step 2E, using sequences from a training signal.
Proceed as in step 3E.
Proceed as in step 4E.
Whenever a detector agent terminates a pairing without starting a new one, the detector agent is replaced by another agent with the same identifier in another randomly drawn population in the repertoire.
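The anergy replacement can be illustrated with the following sketch. The representation is an assumption made for the example: the repertoire is a list of educated populations, each a list of detectors indexed by identifier, `sources[identifier]` records which population the current detector came from, and the function name `apply_anergy` is hypothetical.

```python
import copy
import random


def apply_anergy(population, sources, repertoire, identifier, rng):
    """Swap in the detector with the same identifier from another population.

    Intended to be called when the detector `identifier` ends a pairing
    without starting a new one. Returns the index of the population drawn.
    """
    others = [p for p in range(len(repertoire)) if p != sources[identifier]]
    if not others:
        return sources[identifier]  # repertoire holds a single population
    drawn = rng.choice(others)
    # Copy so that later state changes do not alter the stored repertoire.
    population[identifier] = copy.deepcopy(repertoire[drawn][identifier])
    sources[identifier] = drawn
    return drawn


# Toy repertoire: 3 populations of 2 detectors, tagged with their origin.
repertoire = [[{"origin": p, "id": i} for i in range(2)] for p in range(3)]
population = copy.deepcopy(repertoire[0])
sources = [0, 0]
drawn = apply_anergy(population, sources, repertoire, identifier=1, rng=random.Random(0))
print(drawn, population[1])
```

Because detectors sharing an identifier also share the cluster index and connectivity list, the replacement is equivalent with respect to the training sequences.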
The iterative process is repeated after step 3C (or 2C, as in the alternative algorithm described above where presented sequences change) until the final iteration is reached.
For each presenter agent with identifier I, the time duration τdet(I) is calculated, for which a fraction p (for instance, p=99%) of pairings lasted a shorter time. The time duration τinat(I) is also calculated, for which a fraction p (for example, p=99%) of the periods of time in which the agent did not establish a pairing lasted a shorter time. The numbers of events with these time durations are also registered, respectively ndet(I) and ninat(I).
The previous procedure is repeated starting from step 1C a statistically significant number of times, nr (for example, 20). All time durations, τdet(I) and τinat(I), and the corresponding numbers of events, ndet(I) and ninat(I), are successively recorded.
For each agent, the characteristic time durations, Tdet(I) and Tinat(I), are defined, for which a percentage q (for example q=5%) of the values τdet(I) and τinat(I) are larger. Record as well the number of occurrences ndet(I) and ninat(I) for those values.
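The calibration thresholds described in the preceding steps can be sketched for a single presenter agent as follows. The sketch makes several assumptions that the text does not fix: a nearest-rank quantile is used, events are counted when they last at least the threshold, and the count associated with the characteristic time is taken from the run (or the largest of the runs) that produced it. All function names are hypothetical.

```python
import math


def nearest_rank_quantile(values, fraction):
    """Smallest value with at least `fraction` of the values at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]


def run_threshold(durations, p=0.99):
    """Per-run threshold tau and the number of events lasting at least tau."""
    tau = nearest_rank_quantile(durations, p)
    n_events = sum(1 for d in durations if d >= tau)
    return tau, n_events


def characteristic_threshold(run_results, q=0.05):
    """Characteristic time T exceeded by roughly a fraction q of the run values."""
    taus = [tau for tau, _ in run_results]
    T = nearest_rank_quantile(taus, 1.0 - q)
    n = max(count for tau, count in run_results if tau == T)
    return T, n


# Four calibration runs for one presenter agent (toy pairing durations).
runs = [[1, 2, 3, 4], [2, 2, 5, 9], [1, 1, 4, 4], [3, 3, 3, 8]]
results = [run_threshold(durations, p=0.5) for durations in runs]
print(results, characteristic_threshold(results, q=0.25))
```

The same two functions apply unchanged to the unpaired periods, yielding τinat(I), Tinat(I) and ninat(I).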
Proceed as in step 1C.
Proceed as in step 2E, with sequences taken from the signal to be tested.
Proceed as in step 3E.
Proceed as in step 4E.
Proceed as in step 5C.
Increment ncos(I) whenever a presenter agent I is involved in a pairing lasting longer than Tdet(I) iterations.
Whenever a detector agent stays without forming new pairings for Tinat(I) iterations, an inactivity counter naus(I) (absence of contacts) of the presenter agent is incremented.
The iterative process is repeated starting from step 3D (or from step 2D, if the sequences presented change as assumed in the alternative algorithm described above) until the maximum predefined iteration is reached (equal to the final iteration mentioned in step 6C). An agent is considered to be activated by other agents if ncos(I)−ndet(I)−ε>0. An agent is considered to be activated due to a lack of interactions with other agents if naus(I)−ninat(I)−ε′>0. ε and ε′ are constants greater than or equal to zero, which can be used to decrease the impact of stochastic fluctuations on the activation of agents in the absence of anomalies (false positive errors).
The alarm system is activated when one or more agents are activated.
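The two activation criteria and the alarm condition translate directly into code. The following is a minimal sketch in which the dictionary layout (presenter identifier mapped to a pair of counters) and the function names are assumptions made for illustration.

```python
def activated_by_costimulation(n_cos, n_det, eps=0.0):
    """True if n_cos(I) - n_det(I) - eps > 0 (too many long pairings)."""
    return n_cos - n_det - eps > 0


def activated_by_neglect(n_aus, n_inat, eps_prime=0.0):
    """True if n_aus(I) - n_inat(I) - eps' > 0 (too many long idle periods)."""
    return n_aus - n_inat - eps_prime > 0


def alarm(detection, calibration, eps=0.0, eps_prime=0.0):
    """Return the identifiers of activated presenter agents.

    detection[I]   = (n_cos, n_aus) counted during the detection stage
    calibration[I] = (n_det, n_inat) recorded during the calibration stage
    The alarm is raised when the returned list is not empty.
    """
    activated = []
    for identifier, (n_cos, n_aus) in detection.items():
        n_det, n_inat = calibration[identifier]
        if activated_by_costimulation(n_cos, n_det, eps) or activated_by_neglect(
            n_aus, n_inat, eps_prime
        ):
            activated.append(identifier)
    return activated


calibration = {0: (2, 1), 1: (3, 2)}
detection = {0: (2, 1), 1: (7, 2)}
print(alarm(detection, calibration, eps=1.0, eps_prime=1.0))  # [1]
```

Larger values of `eps` and `eps_prime` make the test more tolerant of fluctuations, at the cost of requiring a stronger deviation before an agent is activated.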
The present invention can find applications in all areas where an anomalous behavior in systems with high complexity needs to be detected. The following areas of application are possible:
In a realization of the present invention, the invention is characterized by always using a computational algorithm to detect anomalies in the presentation of a plurality of sequences describing the typical behavior of the system to be monitored. These anomalies are detected through a decrease in frustration in the dynamics of the computational system, and may occur when a sequence was never observed in the typical behavior of the system, or when sequences that are each used during the typical behavior of the system have never been observed together.
In a realization of the present invention the formulation of the computational algorithm is characterized by the definition of a frustrated dynamics among agents, with one set of agents presenting the sequences from the signal describing the system's behavior, and the other set using this information to decide to which agent it will remain paired.
In a realization of the present invention the anomaly detection mechanism is characterized by using as abnormality criterion the pairing duration times of the computational agents and also the time durations during which they cannot form pairs.
In a realization of the present invention the formulation of the computational algorithm is characterized by defining interaction rules among agents such that all agents attempt to form pairs with agents randomly selected from a list defining their connectivity, forming a new pair whenever they are not paired or whenever the new agent with which they interact is placed higher in the list defining their receptor, and provided the other agent they interact with acts in the same way.
In a realization of the present invention the formulation of the computational algorithm is characterized by using an education stage to build a repertoire of detector agents, during which detector agents are eliminated and replaced by new ones whenever they remain unpaired for a time larger than a continuously optimized characteristic positive selection time, or whenever they establish pairings that last longer than a continuously optimized characteristic negative selection time.
In a realization of the present invention the formulation of the computational algorithm is characterized by using, during the education stage, presenter agents that present sequences characterizing the typical behavior of the system under analysis.
In a realization of the present invention the formulation of the computational algorithm is characterized by defining, after the education stage and for each agent, a profile of the number of pairings the agent formed as a function of their time duration.
In a realization of the present invention the formulation of the computational algorithm is characterized by defining, after the education stage and for each agent, a profile of the number of periods of time the agent remained unpaired as a function of their time duration.
In a realization of the present invention the formulation of the computational algorithm is characterized by using, during the monitoring phase, presenter agents that present sequences characterizing the typical behavior of the system to monitor.
In a realization of the present invention the formulation of the computational algorithm is characterized by using a mechanism of anergy in the monitoring stage, where paired detector agents that are abandoned as a result of the frustrated dynamics of the computational system are replaced by another equivalent detector agent from the repertoire formed during the education stage.
In a realization of the present invention the formulation of the computational algorithm is characterized by using a costimulation mechanism establishing that presenter agents which, during a time interval, establish a number of long pairings greater than a certain typical number, defined above, are activated.
In a realization of the present invention the formulation of the computational algorithm is characterized by using a mechanism of neglect establishing that presenter agents which, during a time interval, fail to establish pairings a number of times greater than a certain typical number, defined above, are activated.
In a realization of the present invention the formulation of the computational algorithm establishes, after the education stage, the number of agents that can be activated by the presentation of sequences obtained from samples encoding the typical behavior of the system.
In a realization of the present invention the formulation of the computational algorithm establishes that a sample exhibits an anomaly if the number of activated presenter agents is greater than established before.
The preferred embodiments described above can obviously be combined. The following claims define additional preferred embodiments of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
106185 | Mar 2012 | PT | national |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/IB2013/051706 | 3/4/2013 | WO | 00 |