Although the Internet has had great successes in facilitating communications between computer systems and enabling electronic commerce, the computer systems connected to the Internet have been under almost constant attack by hackers seeking to disrupt their operation. Many of the attacks seek to exploit vulnerabilities of the application programs or other computer programs executing on those computer systems. Different vulnerabilities can be exploited in different ways, such as by sending network packets, streaming data, accessing a file system, modifying registry or configuration data, and so on, which are referred to as security events. Developers of applications and administrators of enterprise networks commonly go to great effort and expense to identify and remove vulnerabilities because a vulnerability that a hacker identifies and exploits can often result in significant negative consequences.
This Background is provided to introduce a brief context for the Summary and Detailed Description that follow. This Background is not intended to be an aid in determining the scope of the claimed subject matter nor be viewed as limiting the claimed subject matter to implementations that solve any or all of the disadvantages or problems presented above.
Semantic networks are generated to model the operational behavior of an enterprise network to provide contextual interpretation of an event or a sequence of events that is observed in a specific enterprise network. A semantic network is a form of knowledge representation using a directed graph comprising vertices which represent concepts, and edges which represent relationships between the concepts. In various illustrative examples, different semantic networks may be generated to model different behavior scenarios in the enterprise network. Without the context provided by these semantic networks, malicious events may be interpreted as benign events because there is almost always a scenario in which such events could be part of the normal operations of an enterprise network. The present semantic networks instead enable interpretation of events for a specific enterprise network. Such interpretation enables the conclusion that an event or sequence of events that could possibly be part of normal operations in a theoretical enterprise network is, in fact, abnormal for a specific enterprise network.
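By way of a non-limiting illustration, a semantic network of the kind described above may be sketched as a directed graph with labeled edges. The class name, method names, concept names, and relationship labels below are hypothetical and are chosen only for this example; they do not come from the described implementation.

```python
from collections import defaultdict


class SemanticNetwork:
    """Directed graph: vertices are concepts, labeled edges are relationships."""

    def __init__(self):
        # Maps a source concept to a list of (relationship, target concept) pairs.
        self._edges = defaultdict(list)

    def add_relationship(self, source, relationship, target):
        """Add a directed, labeled edge from `source` to `target`."""
        self._edges[source].append((relationship, target))

    def related(self, source, relationship):
        """Return the concepts reachable from `source` via `relationship`."""
        return [t for r, t in self._edges.get(source, []) if r == relationship]


network = SemanticNetwork()
# Hypothetical concepts and relationship labels, for illustration only.
network.add_relationship("user_a", "reports_to", "manager_b")
network.add_relationship("user_a", "owns", "desktop_a")
print(network.related("user_a", "owns"))
```

Separate instances of such a structure could hold the different behavior scenarios, for example one network per kind of relationship being modeled.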
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Like reference numerals indicate like elements in the drawings.
The enterprise network 105 is coupled to external networks to enable the users 112 and machines 116 to connect to various external resources 121-1, 121-2 ... 121-N that may include web sites, databases, e-mail services, and the like. A firewall 125 and network intrusion detection system (“NIDS”) 131 are utilized, in this example, to provide security protection for the users 112 and machines 116 in the enterprise network 105. The firewall 125 is typically located on the perimeter of the enterprise network 105 and monitors traffic flowing between the enterprise network and the external resources 121. The firewall 125 will commonly permit or block traffic in accordance with a rule set or policies.
The NIDS 131, if conventionally arranged, would perform intrusion detection to identify actions or events occurring in the enterprise network 105 that may be associated with a malicious attempt to compromise the confidentiality, integrity, or availability of a user 112 or machine 116 in the enterprise. While intrusion detection is shown as being performed at the network level (i.e., network-based intrusion detection) in
As shown in
By comparison to the conventional use of flat, fixed attributes, the present semantic networks for intrusion detection use an enriched set of attributes. A semantic network is often used as a form of knowledge representation. It is typically a directed graph consisting of vertices which represent concepts and edges which represent semantic relationships between the concepts. A generalized example of a semantic network 400 is shown in
In the present case,
Generally, application of this method may take into account events that occur at different levels of the enterprise. For example, a machine might indeed be suspicious if it sends out a lot of data from the network 105 to the external resources 121 and i) the machine is a desktop; ii) the desktop belongs to a software developer who typically should not be sending data outside the network 105; iii) the origin of the data is a folder that contains sensitive information (e.g., program code of an upcoming product release); and iv) the destination for the data is a public e-mail account. By comparison, a machine that sends out a lot of data will not be deemed suspicious if i) the machine is a server; ii) the e-mail destination is at a legitimate business partner; and iii) the same data was sent to other partners as well.
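The two contrasting scenarios above may be expressed, purely as an illustrative sketch, as a function over the contextual attributes of a large outbound transfer. The type name, field names, and string values below are assumptions introduced for this example, and any case not covered by the two scenarios is reported as undetermined rather than guessed.

```python
from dataclasses import dataclass


@dataclass
class TransferContext:
    """Context for a large outbound data transfer (all fields are hypothetical)."""
    machine_type: str             # e.g. "desktop" or "server"
    owner_role: str               # e.g. "software_developer"
    source_is_sensitive: bool     # data originates from a sensitive folder
    destination_kind: str         # e.g. "public_email" or "business_partner"
    sent_to_other_partners: bool  # same data also went to other partners


def assess_bulk_transfer(ctx: TransferContext) -> str:
    """Classify a large outbound transfer using the two scenarios in the text."""
    # A server sending the same data to several legitimate partners.
    if (ctx.machine_type == "server"
            and ctx.destination_kind == "business_partner"
            and ctx.sent_to_other_partners):
        return "not suspicious"
    # A developer's desktop sending sensitive data to a public e-mail account.
    if (ctx.machine_type == "desktop"
            and ctx.owner_role == "software_developer"
            and ctx.source_is_sensitive
            and ctx.destination_kind == "public_email"):
        return "suspicious"
    # The text does not describe any other combination.
    return "undetermined"


developer_desktop = TransferContext(
    "desktop", "software_developer", True, "public_email", False)
partner_server = TransferContext(
    "server", "it_operations", False, "business_partner", True)
print(assess_bulk_transfer(developer_desktop))
print(assess_bulk_transfer(partner_server))
```

In practice, each attribute would be looked up in a semantic network rather than supplied by hand; the sketch only shows how the same observed event can be classified differently depending on its context.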
The method shown in
The method 700 may be further illustrated using the scenarios described below.
The reporting semantic network 900 may be used, for example, to identify abnormal e-mail which is assumed to have potential for spreading malicious software (i.e., “malware”). Generally, once a suspicious e-mail is identified, it can be examined more closely to determine whether it in fact contains malware.
By comparison, an e-mail from shair to ryanh is considered abnormal as the reporting distance between these users is 4. E-mails that span such a large reporting distance are extremely rare in the specific case of enterprise 105. Accordingly, the e-mail from shair to ryanh is suspicious and can be further examined as a source of potential malware, for example.
An e-mail from shair to alomn will be somewhat suspicious. The common vertex shared by these users is rakeshn at a reporting distance of 3. Shair and alomn are also separated from each other by a reporting distance of 3. Due to this distance, it is unlikely that shair and alomn will have many things in common. However, the e-mail communication is less suspicious than the e-mail from shair to ryanh described above because both users are at the same level (level 1) in the reporting hierarchy. As a result, there is some expectation that shair and alomn might collaborate from time to time.
The above reasoning as to why one e-mail is normal but another is suspicious is intended only to be illustrative. It is emphasized that representing the reporting hierarchy as a graph to build the reporting semantic network enables this type of reasoning to be formalized and extended, for example, using a computer program. Such programs can apply various algorithms to enable semantic networks to be built and used to provide contextual information in an automated manner for a wide variety of intrusion detection scenarios.
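One way such reasoning might be formalized is sketched below. Two assumptions are made solely for illustration. First, the reporting distance between two users is taken to be the number of steps from the farther of the two users up to their closest common vertex, which is consistent with the shair and alomn example above but is not stated as a definition in this description. Second, the hierarchy shown is hypothetical: only the names shair, alomn, ryanh, and rakeshn come from the examples above, while the intermediate vertices (mgr_a through mgr_e and top) are invented so that the sketch can run.

```python
# Hypothetical reporting hierarchy: each key reports to its value.
MANAGER_OF = {
    "shair": "mgr_a", "mgr_a": "mgr_b", "mgr_b": "rakeshn",
    "alomn": "mgr_c", "mgr_c": "mgr_d", "mgr_d": "rakeshn",
    "rakeshn": "top",
    "ryanh": "mgr_e", "mgr_e": "top",
}


def chain_of_command(user, manager_of):
    """Return [user, manager, manager's manager, ...] up to the top vertex."""
    chain = [user]
    while chain[-1] in manager_of:
        chain.append(manager_of[chain[-1]])
    return chain


def reporting_distance(a, b, manager_of):
    """Steps from the farther user up to the closest common vertex (assumed definition)."""
    steps_from_b = {
        user: steps
        for steps, user in enumerate(chain_of_command(b, manager_of))
    }
    for steps_from_a, user in enumerate(chain_of_command(a, manager_of)):
        if user in steps_from_b:
            return max(steps_from_a, steps_from_b[user])
    return None  # the two users share no common vertex


print(reporting_distance("shair", "alomn", MANAGER_OF))  # 3
print(reporting_distance("shair", "ryanh", MANAGER_OF))  # 4
```

The sketch assumes the hierarchy is a tree with no cycles. A fuller treatment might also compare the levels of the two users, since the description above treats users at the same level as somewhat more likely to collaborate.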
The likelihood of an event (such as an e-mail message being sent between users) being deemed abnormal may also be expressed using a probability. For example, as shown in
Generally, the detection of abnormal behavior in the enterprise network 105 may be enhanced by cross-referencing between several semantic networks. That is, semantic networks can provide contextual meaning for a variety of behaviors and organizational characteristics of a given enterprise. For example, and not by way of limitation, semantic networks can cover geographic organization (i.e., where users, machines, subnets, domains, etc. are physically located), project team organization (i.e., which users, development groups, support organizations, etc. are involved with a particular project), time-based plans (i.e., what is planned to occur in an enterprise and when), and so on.
Several more illustrative examples of other semantic networks are discussed below.
Table 1300 in
Table 1600 in
If shair repeatedly fails to logon to the shai-desk desktop machine, then suspicion that such events are abnormal increases. The repeated logon failures could potentially indicate that the user attempting to logon is not, in fact, shair and/or that shair's identity has been compromised in some way. As shown in table 1600, the characterization of the failed logon events uses the term “probably” (or may be expressed with a Low Fidelity) to reflect the likelihood that shair's logons are abnormal.
Using similar reasoning, the probability that the logons are abnormal and merit further investigation increases when shair repeatedly fails to logon to the ronkar-desk machine. Suspicion is increased, for example to Medium Fidelity, because the repeated failures occur on a machine that is not the user's own.
If shair attempts a logon to savasgdev, then the likelihood that this event is abnormal is even higher as shair and savasgdev are in two different domains where the usual interaction is very low. The view that such a logon event is abnormal is reflected with High Fidelity, for example, as shown in Table 1600.
If shair fails to logon to savasgdev after 10 attempts, then it may be very likely (i.e., Very High Fidelity) that such an event is abnormal and could be malicious. Not only are the user and machine in different domains which normally have low interaction, but the inability to logon suggests that the user does not have the correct credentials.
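The four logon scenarios above may be summarized, as a minimal sketch only, by a function that maps the context of a logon event to a fidelity label. The function name and parameters are hypothetical, and the caller is assumed to decide what counts as repeated failure (the description mentions 10 attempts only for the cross-domain case). Combinations that the description does not address return no label.

```python
def logon_fidelity(own_machine, same_domain, repeated_failures):
    """Map a logon event's context to the fidelity labels used in the scenarios.

    own_machine:       the target machine belongs to the user
    same_domain:       the user and the target machine are in the same domain
    repeated_failures: the user has repeatedly failed to logon
    """
    if not same_domain:
        # A cross-domain logon is already unusual; repeated failure raises it further.
        return "Very High" if repeated_failures else "High"
    if repeated_failures:
        # Repeated failure on the user's own machine is less suspicious
        # than repeated failure on a machine belonging to someone else.
        return "Low" if own_machine else "Medium"
    return None  # not covered by the scenarios in the description


print(logon_fidelity(own_machine=True, same_domain=True, repeated_failures=True))
print(logon_fidelity(own_machine=False, same_domain=False, repeated_failures=True))
```

The domain and machine ownership inputs would, in the arrangement described, be derived from the semantic networks rather than passed in directly.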
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.