The present disclosure relates to the protection of electronic devices. In particular, it relates to real-time endpoint security protection using a data model that predicts security actions in response to security events.
Today, there are many endpoint security issues such as viruses, ransomware, phishing ware, stolen identity, stolen device, etc. It is both important and challenging to protect sensitive information stored on and transmitted by endpoints such as smartphones, tablets, laptops and other mobile devices.
There are many endpoint security providers on the market, many of which provide similar solutions to solve security issues. One of them is Microsoft Defender Advanced Threat Protection™ (Microsoft Defender ATP). The main strategy of this solution is the implementation of rules based on knowledge. The typical workflow for rule-based implementations is as follows: after the information related to a security issue is collected from the endpoint, the security provider analyzes the information and determines a solution. After this, an application is deployed to the endpoint to fix the security issue. In this sense, the solution is a rule-based application.
There are some disadvantages with a rules-based strategy. The solution from a given provider to fix a security issue is not necessarily standard, and may not always be relied upon to be the best, because it depends on the specific provider's ability to analyze the issue and build the corresponding rules for its remediation. In addition, the process needs a number of manual steps. Furthermore, there can be some delay in fixing a newly emerged security issue, as the provider needs to collect information and analyze the issue before a fix can be applied to the endpoints.
Other solutions include the use of artificial intelligence (AI) to identify threats, by classifying detected events as either a threat or not a threat. The output of the AI models may be a risk score or whether a pattern of events is an anomaly. However, once the threat or anomaly is identified, other techniques, such as rules-based techniques, are used to determine what response to take. The remedial action is then based on the risk score, the prediction of a threat, or the identification of an anomaly. Remedial actions may be automated or referred to an administrator. Administrators can review the prediction of the AI model as to its correctness, which can be fed back into the model.
Patent application US20190068627 to Thampy analyzes the risk of user behavior when using cloud services. Patent US9609011 to Muddu et al. discloses the detection of anomalies within a network using machine learning. Patent application US20190260804 to Beck et al. uses machine learning to detect a threat in a network entity. A score is assigned to the threat, and an automatic response may be made based on the score. Patent US10200389 to Rostamabadi et al. discloses looking at log files to identify malware. Patent application US20190230100 to Dwyer et al. is a rules-based solution for analyzing events on endpoints. The remedial action may be decided at the endpoint or at a connected server.
An AI data model directly predicts remedial actions to take in response to detected security events, bypassing the intermediate step of determining the risk or threat level of the events. The data model is trained with security events and corresponding security actions. The data model is trained with the data from multiple users' actions, which may result in the action the data model predicts being considered to be best practice.
Once the data model is mature, i.e. after a machine learning technique has been used with enough data to train the data model, the data model has the ability to predict what to do if similar security event patterns later occur on an endpoint. The result of the prediction is the security action or actions that need to be applied to the endpoint. In cases where the data model is present in the endpoint, the endpoint can be protected in real time.
The endpoint can also be protected when a brand new security issue occurs. If the new security issue triggers a set of security events known to the data model, or close to those in the data model, then the data model has the ability to predict an appropriate security action or actions, even though the specific security issue may at this point be still unknown.
The specific AI model disclosed is a multi-label classification of sets of events directly into sets of actions, omitting the step of determining the threat level. By omitting the step of determining the threat level, greater efficiency may be obtained.
Disclosed herein is a method of protecting an electronic device comprising the steps of: generating a multi-label classification data model comprising security event groups labeled with security actions; detecting one or more security events; predicting, using the multi-label classification data model, one or more security actions based on the detected one or more security events; and implementing the predicted one or more security actions on the electronic device.
Also disclosed herein is a system for protecting an electronic device comprising a processor and computer readable memory storing computer readable instructions that, when executed by the processor, cause the processor to: generate a multi-label classification data model comprising security event groups labeled with security actions; receive one or more security events that are detected in relation to the electronic device; predict, using the multi-label classification data model, one or more security actions based on the detected one or more security events; and instruct the electronic device to implement the predicted one or more security actions.
Data model, or AI model, AI data model or machine learning model: an algorithm that takes complex inputs that it may or may not have seen before, and predicts the output that most correctly corresponds to the input. The prediction is based on input and output data sets used for training the data model, in which the outputs for specific inputs are identified as correct or incorrect.
Endpoint, or device: This is any electronic device or any computing device to be protected. Non-limiting examples of a device include a laptop, cell phone, personal digital assistant, smart phone, memory stick, personal media device, gaming device, personal computer, tablet computer, electronic book, camera with a network interface, and netbook. Most devices protected by the invention will be mobile devices, but static devices, such as desktop computers, projectors, televisions, photocopiers and household appliances may also be protected. Many other kinds of electronic devices may be included, such as hi-fi equipment, cameras, bicycles, cars, barbecues and toys, if they include memory and a processor. Devices are configured to communicate with a remote server, and they may initiate the communications and/or the communications may be initiated by the server. Communications may be via Wi-Fi™, SMS, cellular data or satellite, for example, or may use another communications protocol. While the invention is often explained in relation to laptops, it is to be understood that it applies equally to other electronic and computing devices.
Security event: A security event is a change or abnormal behavior on the endpoint that is a security concern, e.g. a software change, a hardware change, a configuration change, abnormal web/network usage, abnormal software usage, abnormal hardware usage, abnormal device usage, or abnormal data file usage. Security events may be specific or general, and may include multiple constituent security events. A security event formed of two constituent events in one order may be different to a security event formed of the same two constituent events in a different order. A security event may depend on the state of the endpoint, such as whether a user is logged in, whether it is connected to a network, or its location.
Security issue: This is a high-level description of a problem related to an endpoint, e.g. viruses, ransomware, phishing ware, a stolen identity or a stolen device. A security issue may be the cause of one or multiple security events.
Security action: A measure applied to an endpoint to protect it against a security issue or one or more security events. For example, a security action may be to stop an application, stop a service, display a warning message, log out a user, lock a screen, uninstall an application, wipe data, wipe the operating system (OS), or freeze the endpoint. One or multiple security actions may be implemented in response to a security event or security issue.
System: Unless otherwise qualified, this refers to the subject of the invention. It refers to a combination of one or more physical devices, including hardware, firmware and software, configured to protect one or more endpoints. The system uses a multi-label classification data model to predict one or more security actions based on one or more detected security events, and implements the predicted actions on the endpoints.
The embodiments described below allow for the prediction of security actions directly from detected security events using an AI data model.
A security issue may be the result of a poor measure applied to an endpoint, and can trigger a series of security events on the endpoint. For example, when ransomware impacts an endpoint, one or more of the following security events may occur: an unauthorized application is downloaded to the endpoint; an unauthorized application runs in the background; an unauthorized application runs at an irregular time compared to a normal endpoint working time; an unauthorized application uses a high processor, memory or input/output resource; or an unauthorized application accesses sensitive data files.
The strategy used in the disclosed solution is facts based. Given that a particular security issue results in a common or near common set of security events, and that a majority of administrative users will apply the same, specific security response when a particular group of security events occur, then this specific security response will be deemed best practice for fixing the particular security issue. The specific security response involves applying one or more security actions to the endpoint.
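The facts-based strategy described above can be illustrated with a minimal sketch. It assumes that each administrator's response to a given group of security events is recorded as a set of action names, and that a response is deemed best practice when more than half of the administrators chose exactly that set. The function name, the action names and the 50% share are hypothetical and are chosen only for illustration.

```python
from collections import Counter

def best_practice_response(responses, min_share=0.5):
    """Return the action set chosen by a majority of administrators, or None.

    responses: list of sets of action names (hypothetical names), one set per
    administrator, all recorded for the same group of security events.
    min_share: assumed majority threshold; not specified in the disclosure.
    """
    if not responses:
        return None
    # Count identical responses; frozenset makes each action set hashable.
    counts = Counter(frozenset(r) for r in responses)
    actions, votes = counts.most_common(1)[0]
    # Only a response applied by a strict majority qualifies.
    return set(actions) if votes / len(responses) > min_share else None

responses = [
    {"stop_application", "display_warning"},
    {"stop_application", "display_warning"},
    {"lock_screen"},
]
print(best_practice_response(responses))
```

In the disclosed system this consolidation is performed implicitly by training the data model on many administrators' actions; the explicit vote above is only a simplified picture of the same idea.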
Examples of security events are shown in the second column of TABLE 1. Each specific security event is shown as belonging to a particular type of security event, which may be referred to as a general security event. However, the generalization is not necessary for building the data model. By analyzing specific events rather than the type of event or general event, the data model may be more discerning and more accurate.
Security actions may have different levels of impact on the endpoint, with the higher impact security actions in general being the response required for correspondingly greater threats. Some examples of security actions are shown in TABLE 2, together with their impact level. However, it is not necessary to determine the level of the threat, nor to determine the level of action required in response to the threat. This is because the security events are labelled directly with the actions in the data model, which is therefore able to predict security actions directly from security events.
Whether something represents abnormal behavior is based on a comparison of the endpoint's current behavior with normal behavior. Normal behavior is defined based on normal usage of the device on an ongoing basis, or on usage over a period of time when the device is known to be secure, or based on usage of similar devices by similar users, etc. Abnormal behavior is determined from an analysis of current behavior using the normal behavior as a baseline.
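One possible way of comparing current behavior against a baseline is sketched below. The disclosure does not prescribe a particular statistical test, so the approach shown (flagging a measurement that lies more than k standard deviations from the baseline mean) is an assumption, as are the function name, the parameter k and the sample values.

```python
from statistics import mean, pstdev

def is_abnormal(baseline, current, k=3.0):
    """Flag a measurement that deviates strongly from normal behavior.

    baseline: measurements taken while the device was known to be secure,
              e.g. daily network usage in gigabytes (hypothetical unit).
    current:  the measurement being assessed.
    k:        assumed tolerance, in standard deviations.
    """
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A perfectly constant baseline: any change is a deviation.
        return current != mu
    return abs(current - mu) > k * sigma

normal_usage_gb = [1.0, 1.2, 0.9, 1.1, 1.0]
print(is_abnormal(normal_usage_gb, 5.0))
print(is_abnormal(normal_usage_gb, 1.1))
```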
Machine learning is the chosen technique to build a data model to construct the relations between security events and security actions. Specifically, the method described in this solution can be treated as a multi-label classification case. Inputs to the data model are security events that occur on the endpoints. The outputs of the data model are security actions, rather than a determination of the security issue. As such, a response to a threat on an endpoint may be determined in fewer steps than if the security issue were first to be determined and then rated with a threat risk level.
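To illustrate what is meant by multi-label classification, the following sketch predicts a set of actions, rather than a single class, from a vector of security event counts. It uses a simple k-nearest-neighbor vote per label, which is an assumed stand-in for whichever learning technique a given embodiment uses. The attribute order, action names, k and the 0.5 voting threshold are hypothetical.

```python
import math

def knn_multilabel(train, query, k=3, threshold=0.5):
    """Predict a set of actions for an event vector.

    train:     list of (event_vector, set_of_actions) pairs.
    query:     event vector with the same length as the training vectors.
    threshold: assumed minimum share of neighbors that must carry a label.
    Returns (predicted_actions, per_action_scores).
    """
    # Take the k training scenarios closest to the query (Euclidean distance).
    neighbours = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    labels = set().union(*(actions for _, actions in neighbours))
    # Each label is scored independently, so several may be predicted at once.
    scores = {
        a: sum(a in actions for _, actions in neighbours) / len(neighbours)
        for a in labels
    }
    return {a for a, s in scores.items() if s >= threshold}, scores

train = [
    ([1, 0, 3, 0], {"stop_application", "display_warning"}),
    ([1, 0, 2, 0], {"stop_application"}),
    ([0, 2, 0, 1], {"lock_screen"}),
    ([0, 3, 0, 1], {"lock_screen", "log_out_user"}),
]
actions, scores = knn_multilabel(train, [1, 0, 3, 1], k=2)
print(sorted(actions))
```

Note that no risk level or security issue is computed at any point: the event vector is mapped directly to the actions.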
The security events occurring on the endpoints 10 and the security actions applied by the users 12 to the endpoints are fed into a machine learning (ML) application 14 on a server, for example a server in the cloud, to build the action prediction model 16. The action prediction model 16 is the data model that predicts the security actions in response to the security events.
The use case diagram in
The action prediction model 16, which is a multi-label classification data model, may use the definitions of security events listed in TABLE 3, for example. An event scenario, which may include one or more security events that are detected in a predetermined period of time, may be described by these attributes. The attributes, or their IDs, may be used in both the machine learning process as well as during operation of the action prediction model 16 after it has been trained. The examples of attributes that are given are non-limiting, and other attributes may also be included in the list. The attributes listed may also be modified. For example, the time period may be set to less than 1 day or more than one day depending on the particular embodiment. Different attributes may have different time periods. Some of the attributes may be combined into a single attribute, for example using the OR disjunction. Other attributes may be divided into multiple individual attributes, such as the abnormal resource usage case.
Examples of the labels that the machine learning application can use to label the security event scenarios or sets of security events may include those that are defined in TABLE 4. These labels represent the security actions to be taken if predicted by the action prediction model. Again, these are non-limiting examples, which may be added to. These labels will also be used in the action prediction model when in use to protect the endpoints. Labels relating to a common subset of security events may differ depending on the other security events or attributes that are present. For example, events that are detected while there is no user logged on may be considered to be more serious than the same security events if they occur while the user is logged on.
Sample data used by the machine learning application to train the data model is shown in TABLE 5. Each line represents the detection of one or more security events. As such, each line may be said to represent a security event scenario. Each scenario may represent a particular time period over which one or more security events are detected. In some lines, the individual security events, i.e. the attributes, are shown to have been detected 0, 1 or 3 times.
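A training row of the kind described above may be pictured as a vector of per-attribute detection counts paired with a multi-hot vector of labels. The following sketch assumes hypothetical attribute IDs (E1 to E4) and label IDs (A1 to A3); the actual IDs are those of the attribute and label tables and are not reproduced here.

```python
from collections import Counter

# Hypothetical identifiers, used only for illustration.
ATTRIBUTES = ["E1", "E2", "E3", "E4"]
ACTIONS = ["A1", "A2", "A3"]

def encode_events(detected):
    """Count how often each known attribute was detected in the time period."""
    counts = Counter(detected)
    # Attributes outside the known list are ignored in this sketch.
    return [counts.get(attribute, 0) for attribute in ATTRIBUTES]

def encode_actions(applied):
    """Mark each known action with 1 if it was applied, otherwise 0."""
    return [1 if action in applied else 0 for action in ACTIONS]

# One scenario: E1 detected once, E3 detected three times; actions A1 and A3 applied.
row = (encode_events(["E1", "E3", "E3", "E3"]), encode_actions({"A1", "A3"}))
print(row)
```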
In the fully developed case, an adequate set of security events should be captured for every security issue, or every type of security issue, and used for training the action prediction model. However, this is not necessary before the system is put to use. If the action prediction model is not yet mature enough, one option is to not initially deploy the action prediction model 16 to the endpoint to predict security actions, but instead collect security events from the endpoint and send them to the server side for analysis and selection of the most appropriate security action or actions. After the action prediction model 16 has been trained, it can then be deployed on the endpoint and used to predict security actions. This may be, however, in an initial mode in which the action prediction model suggests to the administrative user which of the security actions should be applied to the endpoint, rather than automatically applying the security actions to the endpoint. This is a semi-automatic solution in which verification of the predicted action(s) is requested of an administrator before they are implemented.
The endpoint 30 has an endpoint side application 36 to monitor and collect security events and report the events to the server 50 on the server side of the system. The endpoint 30 also has a set of one or more endpoint side applications 38 to apply the security actions determined by the action prediction model 16 or 16A when security events occur.
The endpoint 30, and other similar endpoints 40, 42 are connected via a network 44 such as the internet to the server 50. The server 50 has a set of server side applications 56 to receive and process events from the endpoints 30, 40, 42. The server 50 also hosts the machine learning application 14 for processing the security event data and the security action data, analyzing the security events and the corresponding security actions taken by both the endpoints autonomously and the administrators, and using machine learning to build the action prediction model 16.
Also present is an administrator's computer 60, connected via the network 44 to the endpoints 30, 40, 42. The administrator's computer 60 has a set of applications to display the security events and security actions and allow the administrators to analyze the security events and choose security actions that are applied or to be applied to the endpoints 30, 40, 42. For example, the display screen 66 of the administrator's computer 60 may display a user interface with a tabulated list of event scenarios (or incidents) 70, where each scenario may be caused by a different security issue, or multiple similar or dissimilar scenarios may be caused by the same security issue. Also displayed in the user interface is a series of one or more security events 72 that make up each scenario, a series of one or more predicted security actions 74 for each scenario, and a list of other optional security actions 76 that may be taken to potentially help resolve the security issue. The predicted security actions 74 and the other security actions 76 may be individually deleted by the administrator, or further security actions may be added to the list of other security actions. Once the administrator is ready to implement the security actions 74, 76 for a given security event scenario, then a selection box 80 in the selection column 78 may then be checked and an “Implement” button 82 clicked. As is expected, there are many other different forms the user interface may take in order to permit the administrator to observe the predicted actions, implement the predicted actions and amend the list of security actions to be applied to the endpoint.
Once the data model 94 is trained, a security event that is detected in step 86 is passed to the data model 94 directly, bypassing the analysis step 92. The data model 94 then predicts, in step 96, what security action or actions to take. The security action may be applied directly, in step 88, under control of the data model 94, or it may first be verified in step 98 by the administrative user 91 before being applied.
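The routing of a detected security event through these steps can be sketched as follows. The model and the verification step are represented by hypothetical callables, and the returned route names are labels invented for this illustration; only the step numbers are taken from the description.

```python
def handle_scenario(events, model=None, verify=None):
    """Route a detected scenario and return (actions, route).

    model:  hypothetical callable mapping events to a set of actions, or
            None while the data model is not yet trained.
    verify: hypothetical callable (events, actions) -> actions standing in
            for verification by the administrative user (step 98).
    """
    if model is None:
        # Untrained: the events go to the analysis step 92 instead.
        return None, "analysis"
    actions = model(events)  # prediction, step 96
    if verify is not None:
        # The administrator may confirm or amend the predicted actions.
        return verify(events, actions), "verified"
    return actions, "automatic"  # applied directly, step 88

def toy_model(events):
    # Assumed toy rule standing in for a trained model.
    return {"lock_screen"} if "E2" in events else {"display_warning"}

print(handle_scenario(["E2"], model=toy_model))
```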
The predicted actions taken on an ongoing basis by the application when running on the individual endpoints may be used to continually train, evolve and reinforce the action prediction model. Likewise, the actions taken by the administrators may also be used to continually train, evolve and reinforce the action prediction model. For example, whenever a new security issue arises, the administrators may be given the opportunity to either approve the security action(s) predicted by the action prediction model, or suggest a more appropriate set of security action(s).
If a brand new security issue occurs, which causes a pattern or scenario of security events that has not been seen before, then the predicted action made by the action prediction model and applied in real time may in some cases not be optimum, but it is expected to be close to optimum. If a security action automatically taken in response to a set of one or more new security events is not optimum, an administrator may be more likely to analyze the problem and choose appropriate action(s), via the verification step 98, before a centralized security provider may decide upon the most appropriate action. This is because it is likely that several different administrators around the globe may be exposed to the same, brand new issue, whereas an existing security provider may have limited staff/hours and an existing workload, and may not be able to get to dealing with the new issue as quickly. As the predicted action is reinforced by multiple administrators, or as it is modified and then invoked by multiple administrators, then it may effectively become the optimum action.
If the predicted security action, in step 96, is optimum, then it is likely to be verified, in step 98, by one of the administrators before a centralized security provider may do so, for the same reason as above. One of the reasons for using a machine learning data model rather than a rules engine is that a predicted response is more likely to be closer to a human response than a response determined by a rules engine. As the model is regularly evolved as more and more new security issues occur, in time it may achieve an ability to provide an optimum response for each new security issue.
Referring to
Event groups 120, 122 that are similar to event group 1 (100) are also labeled with the same actions as event group 1. Event groups 100, 120, 122 can be said to belong to pattern 1 (124) of security events. Event groups 130, 132 that are similar to event group 2 (102) are labeled with the same actions as event group 2. Event groups 102, 130, 132 can be said to belong to pattern 2 (134) of security events. Event groups 140, 142 that are similar to event group N (104) are labeled with the same actions as event group N. Event groups 104, 140, 142 can be said to belong to pattern N (144) of security events. Depending on the data model, the variation between event groups within the same pattern may be wider or narrower than within other patterns, and in some cases there may be no variation. What is notable about the action prediction model is that it does not explicitly output a risk level, nor does it identify a particular security issue. Instead, it jumps directly to predicting the required security actions.
As a consequence of the above, a new set of events that is not identical to any prior event group may be deemed by the model to be within a range of a known pattern, and therefore labeled with the actions corresponding to the pattern. Alternately, the new set of events may be determined to be closer to one pattern than to any other patterns, and therefore labeled with the actions corresponding to the nearest pattern.
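The two alternatives above, namely labeling a new set of events when it falls within a range of a known pattern, or labeling it with the actions of the nearest pattern, can be sketched as below. Representing each pattern by a single representative event vector and measuring closeness by Euclidean distance are assumptions made for illustration, as are the vectors, the action names and the max_range parameter.

```python
import math

def predict_from_patterns(events, patterns, max_range=None):
    """Label a new event vector with the actions of the nearest pattern.

    patterns:  list of (representative_vector, set_of_actions) pairs.
    max_range: if given, the nearest pattern is used only when the new
               events lie within this distance of it (assumed metric).
    Returns (actions_or_None, distance_to_nearest_pattern).
    """
    nearest = min(patterns, key=lambda p: math.dist(events, p[0]))
    distance = math.dist(events, nearest[0])
    if max_range is not None and distance > max_range:
        return None, distance  # outside the range of every known pattern
    return nearest[1], distance

patterns = [
    ([1, 0, 3, 0], {"stop_application", "display_warning"}),
    ([0, 2, 0, 1], {"lock_screen"}),
]
print(predict_from_patterns([1, 0, 2, 0], patterns))
```

With max_range left unset the function always returns the actions of the closest pattern; with a range, a set of events that is far from every pattern yields no prediction and can be handled separately.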
By generalizing the security events as in TABLE 1, the data model becomes simpler, as it does not need to discern between the individual, specific security events that are similar to each other.
Other labels, besides those listed above, may be applied to the events. For example, labels may include track the device, take photos, record videos, capture keystrokes, and quarantine files. These labels correspond to security actions that may be taken by the endpoint to recover it, while protecting data, if the security events suggest that it has been stolen.
Other labels may include amounts in their definitions. For example, abnormal internet usage may be defined as being above a threshold number of gigabytes.
The order in which two or more security events occur may be defined as a separate security event, to which an attribute can be ascribed. The time period during which security events are captured may be changed in other embodiments, and the time period may be variable. The interval of time between two security events may in itself be a security event to which an attribute can be ascribed.
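As a minimal sketch of how order and interval might be turned into attributes, the following function derives two binary values from timestamped events. The tuple layout, the attribute IDs, the output keys and the gap limit in seconds are all hypothetical.

```python
def order_and_interval(events, first, second, max_gap_s):
    """Derive order and interval attributes from timestamped events.

    events:    list of (timestamp_seconds, attribute_id) tuples.
    first,
    second:    the two attribute IDs whose order is of interest.
    max_gap_s: assumed limit on the interval between the two events.
    """
    earliest = {}
    for timestamp, attribute in sorted(events):
        earliest.setdefault(attribute, timestamp)  # keep first occurrence only
    if first not in earliest or second not in earliest:
        return {"ordered": 0, "close_in_time": 0}
    gap = earliest[second] - earliest[first]
    return {
        "ordered": int(gap > 0),                   # `first` preceded `second`
        "close_in_time": int(abs(gap) <= max_gap_s),
    }

events = [(100, "E2"), (40, "E1")]
print(order_and_interval(events, "E1", "E2", max_gap_s=120))
```

The derived values can then be appended to the event vector in the same way as any other attribute.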
A confidence level may be attached to each set of security events that are detected, the confidence level being indicative of how sure the data model is that the detected set of security events lies within a known pattern of events. If the confidence level is high, then it may be assumed that the detected set of events closely matches a known pattern of events for which the labels (i.e. security actions) are well defined, and have stood the test of time. If the confidence level is high, then the set of actions may be implemented automatically, without necessarily alerting an administrator.
If, however, the confidence level is low, then the data model is less certain as to which of at least two patterns the detected set of security events belongs to. In this situation, an administrator may be alerted and a decision of the administrator requested. In another embodiment, the data model may default to choose the safest set of security actions to apply. Alternately, the data model may automatically invoke all actions that would be predicted if the set of security events could fall within two or more known patterns. This would mean that the data model is acting on the side of caution. If the administrator is prompted for a response, but does not reply within a set time, then the data model may automatically invoke all the predicted actions.
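The cautious fallback described above, in which all actions of the candidate patterns are invoked when the administrator does not reply, can be sketched as follows. The function name and action names are hypothetical, and the absence of a reply within the set time is represented simply by passing None.

```python
def resolve_low_confidence(candidate_action_sets, admin_reply=None):
    """Choose actions when the events could belong to several patterns.

    candidate_action_sets: one set of actions per candidate pattern.
    admin_reply: the administrator's chosen actions, or None if no reply
                 arrived within the set time (timeout handling is assumed
                 to happen outside this sketch).
    """
    if admin_reply is not None:
        return set(admin_reply)
    # No decision: err on the side of caution and invoke every action
    # predicted for any of the candidate patterns.
    return set().union(*candidate_action_sets)

candidates = [{"stop_application", "display_warning"}, {"lock_screen", "display_warning"}]
print(sorted(resolve_low_confidence(candidates)))
```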
An administrator may set a rule to instruct the data model how to behave if the confidence level is below a threshold value. The administrator may set the level of the threshold. For example, the threshold may be set relatively high during the initial deployment of the data model, and, after the data model has matured and the administrator has developed confidence in it, then the threshold may be set to a relatively lower level. Administrators may instead set a percentage defining how many of the predicted security actions they are to receive notifications for during a set time period.
When the data model is used for generating a prediction, a score is created for each action after the security events are processed. The score represents a probability that relates to the suitability of each action, and its value may range, for example, from 0 to 1. The confidence level may be defined from this score. If there are multiple actions predicted, each action will have its own score and the overall confidence level for the set of actions may be the average of the individual scores. The threshold value may then be based on the overall confidence level.
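The averaging of per-action scores and the comparison against a threshold can be written out as a short sketch. The action names, score values and the returned decision strings are hypothetical; the threshold is the value set by the administrator.

```python
from statistics import mean

def overall_confidence(scores):
    """Average the per-action scores (each assumed to lie between 0 and 1)."""
    return mean(scores.values())

def decide(scores, threshold):
    """Return (decision, confidence) for a set of predicted actions."""
    confidence = overall_confidence(scores)
    if confidence >= threshold:
        return "apply_automatically", confidence
    return "ask_administrator", confidence

scores = {"stop_application": 0.9, "display_warning": 0.7}
decision, confidence = decide(scores, threshold=0.75)
print(decision, round(confidence, 2))
```

Raising the threshold, for example during initial deployment, causes more predictions to be referred to the administrator.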
If the pattern of security events detected is significantly different from any known pattern, then the data model may default to shutting down the endpoint and notifying the administrator.
Different administrators could be notified of predicted and implemented security actions depending on which administrator is on duty.
The data model may be trained or reinforced with simulated events and replicated historical events as well as actual, current or real-time events.
The system may automatically correlate similar patterns of security events that are detected across multiple endpoints, and alert an administrator that multiple endpoints are being affected in a similar way.
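A simplified sketch of such correlation is given below. It treats two scenarios as similar only when they contain exactly the same set of attribute IDs, which is a deliberately narrow assumption; an embodiment could use any similarity measure. The endpoint IDs, attribute IDs and the minimum number of endpoints are hypothetical.

```python
from collections import defaultdict

def correlated_endpoints(reports, min_endpoints=2):
    """Group endpoints that reported the same set of security events.

    reports: iterable of (endpoint_id, iterable_of_attribute_ids).
    Returns {event_signature: set_of_endpoint_ids} for every signature seen
    on at least `min_endpoints` endpoints, i.e. the groups worth an alert.
    """
    groups = defaultdict(set)
    for endpoint_id, events in reports:
        # Order and repetition are ignored in this simplified signature.
        groups[frozenset(events)].add(endpoint_id)
    return {sig: eps for sig, eps in groups.items() if len(eps) >= min_endpoints}

reports = [("ep1", ["E1", "E3"]), ("ep2", ["E3", "E1"]), ("ep3", ["E2"])]
for signature, endpoints in correlated_endpoints(reports).items():
    print(sorted(signature), sorted(endpoints))
```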
The application may include a bot, for example for communication with an administrator, learning what security actions the administrator applies, and learning how the administrator verifies sets of predicted security actions.
Some embodiments may include assigning scores for the one or more actions that are predicted in response to a set of detected events. The scores may be related to the frequency at which the administrators employ the actions. Some embodiments may incorporate rules engines to determine what to do based on the scores.
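One way of relating scores to the frequency at which administrators employ actions is sketched below, under the assumption that the score of an action is simply the fraction of recorded administrator responses that included it. The function name and action names are hypothetical.

```python
from collections import Counter

def action_scores(admin_choices):
    """Score each action by how often administrators employed it.

    admin_choices: list of action collections, one per administrator
                   response to the same kind of event scenario.
    Returns {action: fraction of responses that included the action}.
    """
    total = len(admin_choices)
    if total == 0:
        return {}
    # set() guards against an action listed twice within one response.
    counts = Counter(a for choice in admin_choices for a in set(choice))
    return {action: count / total for action, count in counts.items()}

choices = [{"stop_application", "lock_screen"}, {"stop_application"},
           {"stop_application", "wipe_data"}, {"lock_screen"}]
print(action_scores(choices))
```

A rules engine, where one is incorporated, could then act on these scores, for example by applying only the actions whose score exceeds a chosen value.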
Events may be processed differently, i.e. some in real time and some not.
Where a processor has been described, it may include two or more constituent processors. Computer readable memories may be divided into multiple constituent memories, of the same or a different type. Steps in the flowcharts and other diagrams may be performed in a different order, steps may be eliminated or additional steps may be included, without departing from the invention.
The description is made for the purpose of illustrating the general principles of the subject matter and is not to be taken in a limiting sense; the subject matter can find utility in a variety of implementations without departing from the scope of the disclosure made, as will be apparent to those of skill in the art from an understanding of the principles that underlie the subject matter.
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/CA2021/050393 | 3/25/2021 | WO | |