This application claims priority based on a Japanese Patent Application No. 2008-215989 filed on Aug. 25, 2008, the disclosure of which is incorporated herein by reference.
The present invention relates to a technique of determining a node set which conducts an unauthorized activity, and controlling an access from the node set.
A number of computers are coupled to the Internet. The computers are subject to unauthorized accesses. For example, a person who does not have an authorized access right to a computer exploits a security hole of software in the computer, or creates a downloadable program infected by a computer virus to intentionally produce a backdoor so as to make the computer freely available without authentication. Further, there has been a rapid increase in DDoS (Distributed Denial of Service) attacks and other cyber attacks launched in a distributed manner from multiple points, using a botnet, which is a network of computers controlled by those who do not have authorized access rights.
To cope with such problems, there have been known the Intrusion Detection System (hereinafter referred to as IDS) for detecting unauthorized accesses, a firewall for maintaining security of a specific computer network from unauthorized accesses, or the like. The IDS utilizes a previously-registered information pattern of a packet used for an unauthorized access, monitors a packet having the information pattern, and detects an unauthorized access. The firewall detects whether a packet is authorized or unauthorized, based on previously-set information in which whether or not an access is permitted is determined by the IP address or the port number.
Japanese Laid-Open Patent Application, Publication No. 2005-197823 (to be referred to as Reference 1 hereinafter) discloses a technique of blocking an unauthorized access, in which, if a firewall detects an unauthorized access, the firewall identifies an IP address of a source of the unauthorized access, sets a drop of the IP address using a filtering function of a router installed in a LAN, and drops a packet related to the IP address of the unauthorized access source (see paragraph 0019).
However, DDoS attacks and cyber attacks have become increasingly sophisticated and complicated. An attack node launching an attack quickly comes and goes and is soon followed by others.
Therefore, it is inefficient to individually deal with each IP address of a large number of unauthorized access sources, as disclosed in Reference 1. Such an individualized countermeasure also has the problem of not being capable of detecting a newly-launched attack having a characteristic that is not the same as, but similar to, that of an attack launched before. This is because the countermeasure only detects an attack having an IP address identical to one previously registered, thus leading to a belated countermeasure.
The disclosed system provides a technique of grouping a plurality of attack nodes each having a similar characteristic into an attack node set and conducting a countermeasure against the attack node set.
An attack node set determination apparatus: collects event logs; extracts basic item information from the collected event logs; creates attribute information by processing the basic item information or checking a targeted node based on a basic item; performs a clustering on the attribute information; computes events each having a similar characteristic; and sets clusters as a result of the computation in an information processing device. After the setting, if an unauthorized access is detected, the information processing device identifies a cluster including an event related to the unauthorized access and conducts a previously-set countermeasure against the unauthorized access on the whole identified cluster.
A plurality of attack nodes having similar characteristics are made into clusters. A countermeasure is taken on a whole target cluster. This can improve efficiency of countermeasure operations and prevent a newly-attempted attack from an attack node having a characteristic similar to that of a node which has previously launched an attack.
According to the teaching herein, it becomes possible to determine a node set which conducts an unauthorized activity and control an access from the node set.
These and other benefits are described throughout the present specification. A further understanding of the nature and advantages of the invention may be realized by reference to the remaining portions of the specification and the attached drawings.
Next is described a configuration example of an attack node set detection system in which a characteristic of an unauthorized access is extracted, and a plurality of attack nodes each having a characteristic similar to the extracted characteristic of the unauthorized access are grouped, with reference to
The firewall 11 (which may also be referred to as an information processing device) maintains security of the terminal 17 or the like coupled to a network (for example, an intranet) configured on an inward side of the firewall 11. This is achieved by permitting only a packet having an authorized communication to pass through, from among packets transmitted from an external network 50 coupled to an outward side of the firewall 11. For example, the firewall 11 includes a DMZ (DeMilitarized Zone) 20 and allows access from the external network 50 to the Web server 14, mail server 15, and proxy server 16 which are installed in the DMZ 20. The firewall 11 implements a prescribed processing of an unauthorized packet using an access control program 111. For example, the firewall 11 drops the unauthorized packet or reports an unauthorized access to an administrator. The firewall 11 then stores a log regarding the unauthorized access as an event log.
The IDS 13 monitors a packet flowing in the external network 50 using an intrusion detection program and detects an unauthorized packet. The IDS 13 stores therein a log concerning the detected unauthorized packet as an event log.
The Web server 14 offers a Web service using a Web server program. Upon offering the service, the Web server 14 stores therein a log concerning an access to a Web page and an authentication each as an event log.
The mail server 15 offers a service related to e-mailing using a mail server program. Upon offering the service, the mail server 15 stores a log concerning mail delivery, mail reception, authentication, detection of a virus-containing mail, or detection of a spam mail each as an event log.
The proxy server 16 performs communications, in place of the terminal 17, if the terminal 17 coupled to the network on the inward side of the firewall 11 uses a service such as the Web, FTP (File Transfer Protocol), Telnet, and the like offered by a server coupled to the external network 50. Upon the communication, the proxy server 16 stores therein a log concerning access and authentication as an event log.
The terminal 17 is embodied by, for example, a personal computer (PC). The terminal 17 monitors an unauthorized access using an intrusion detection program, an antivirus program, an antispam program, or the like and stores therein a log concerning the unauthorized access as an event log.
The attack node set determination apparatus 12 collects event logs from the firewall 11, IDS 13, Web server 14, mail server 15, proxy server 16, and terminal 17; performs a clustering of events in the collected event logs using an analysis program; and groups attack nodes having similar characteristics to each other into clusters. The attack node set determination apparatus 12 transmits information on the clustering and a countermeasure against attack nodes to the firewall 11, based on results of the clustering and using an access control instruction program. Details of such a processing performed by the attack node set determination apparatus 12 will be described later.
A configuration of each of the firewall 11, IDS 13, Web server 14, mail server 15, proxy server 16, and terminal 17 is not specifically shown herein. However, each of those components includes a computing unit for performing various computation processings using an application program and generating an event log, an input unit for inputting information, a display unit for screen-displaying a computation result and an instruction, a communication unit for controlling communications with other units, and a storage unit for storing the application program and computation result. Details of a configuration of the attack node set determination apparatus 12 will be described later.
Next is described an outline of this embodiment with reference to
The Comparative Example of
First, an IP address identified as related to an unauthorized access and a countermeasure to deal with the IP address are set for each IP address in the firewall 11 (see
That is, the Comparative Example sets an IP address of a packet related to an unauthorized access and a countermeasure against the packet for each IP address in the firewall 11 and performs the set countermeasure by the firewall 11 for each set IP address.
Next is described the outline of this embodiment shown in
First, the attack node set determination apparatus 12 collects event logs from the firewall 11, IDS 13, Web server 14, mail server 15, proxy server 16, and terminal 17. The attack node set determination apparatus 12 then performs a clustering of the collected event logs and computes a cluster. The cluster is a class of events having characteristics similar to each other, and is used to deal with a botnet. For example, in the case of a DDoS attack or a cyber attack attempted when people living in one country protest against another country, the similar characteristic is a specific country or a specific IP address. The attack node set determination apparatus 12 sets the computed cluster in the firewall 11.
After the setting, if the IDS 13 or firewall 11 detects an unauthorized access, the firewall 11 identifies a cluster including an event related to the unauthorized access. The firewall 11 then performs a previously-set countermeasure against a packet related to the whole identified cluster.
That is, in this embodiment, a cluster and a countermeasure thereagainst are set in advance in the firewall 11, and the firewall 11 performs the set countermeasure against a whole cluster related to an unauthorized access.
Next is described creation of a cluster taking a source IP address included in an event log as an example, with reference to
In k-means, six plots are arbitrarily assigned to three clusters as initial values. Centroids of the clusters are computed. Plots having the shortest distances to the same centroid are grouped into the same cluster. Then, the centroids of respective clusters are re-computed, and distances between the re-computed centroids and plots are further computed to determine which plots are the closest. Such computations are repeated until the clusters and ranges (sizes) thereof reach convergence.
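The iteration described above can be illustrated with a minimal Python sketch. It assumes two-dimensional plots, an ordinary Euclidean distance, and a fixed initial assignment; the function names `centroid` and `k_means` and the sample coordinates are hypothetical and are not taken from the embodiment.

```python
import math

def centroid(points):
    """Return the component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def k_means(points, assignment, k, max_iter=100):
    """Cluster `points` starting from an arbitrary initial `assignment`.

    assignment[i] is the initial cluster index (0..k-1) of points[i].
    Returns the assignment reached when it no longer changes.
    """
    for _ in range(max_iter):
        # Compute the centroid of every non-empty cluster.
        centroids = {}
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = centroid(members)
        # Regroup each plot with the centroid closest to it.
        new_assignment = [
            min(centroids, key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:  # convergence reached
            break
        assignment = new_assignment
    return assignment

# Six plots arbitrarily assigned to three clusters (hypothetical values).
plots = [(1, 1), (2, 1), (5, 6), (6, 6), (9, 1), (10, 2)]
labels = k_means(plots, [0, 0, 0, 1, 2, 2], k=3)
```

In this sample, the third plot is initially grouped with the first two and is moved to the second cluster after the first re-computation of the centroids, after which the assignment no longer changes.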
Next is described a configuration of the attack node set determination apparatus 12 with reference to
The attack node set determination apparatus 12 includes a computing unit 121, a memory 122, an input unit 123, a display unit 124, a communication unit 125, and a storage unit 131.
The computing unit 121 provides control on respective units 122 to 125, 131 of the attack node set determination apparatus 12 and manages information transmission among the units 122 to 125, 131. The computing unit 121 is, for example, a CPU (Central Processing Unit) for performing computation processings. The CPU develops an application program in the memory 122 as a main storage and executes the application program, to thereby realize various computation processings. The memory 122 is embodied by a RAM (Random Access Memory). Note that the application program is stored in the storage unit 131.
The input unit 123 is, for example, a keyboard or a mouse. The input unit 123 receives an input of information by an administrator who operates the attack node set determination apparatus 12, or the like.
The display unit 124 is, for example, a CRT (Cathode Ray Tube) or an LCD (Liquid Crystal Display). The display unit 124 displays a screen for prompting a user to input information, or a screen for confirming results of computation.
The communication unit 125 transmits and receives information to and from respective units 11, 13 to 17 (see
The storage unit 131 stores therein an analysis program 132, an access control instruction program 133, an event database 134, a policy database 135, and a distance function database 136. The analysis program 132 and access control instruction program 133 are developed as application programs in the memory 122 and are executed by the computing unit 121.
The analysis program 132 performs a clustering on collected event logs, using information stored in the event database 134, policy database 135, and distance function database 136 and determines a collective behavior cluster. The collective behavior cluster used herein means a cluster which has events strongly related to each other (that is, a cluster having a high evaluation value, which is described hereinafter), among all clusters. The analysis program 132 sets a countermeasure against an unauthorized access with respect to a collective behavior cluster. For example, if a collective behavior cluster includes an event related to an unauthorized access, the analysis program 132 sets a countermeasure of blocking a communication (that is, dropping a packet). If too many packets are transmitted to a specific node (for example, the Web server 14), the analysis program 132 sets a countermeasure of controlling bandwidth. Details of operations of the analysis program 132 will be described later.
The access control instruction program 133 transmits information on the collective behavior cluster determined by the analysis program 132 and a countermeasure thereto, to the firewall 11.
Next is described the event database 134 with reference to
The event database 134 includes, for each event, event log basic parameter information (basic item information) and attribute information. The event log basic parameter information is information on an item extractable from an event log. The attribute information is information on items obtained by processing an item in the event log basic parameter information or checking a node related to an IP address included in the event log basic parameter information.
The event log basic parameter information includes items as follows.
A detection time and date is a time and date when an event is detected.
A log type is information for identifying a unit which transmits an event log.
A source IP address is an IP address set in a node responsible for a recorded event. For example, if such an event occurs in the firewall 11, the source IP address is an IP address of an access source to the firewall 11. If the event occurs in the IDS 13, the source IP address is an IP address of an attack source. If in the Web server 14, it is an IP address of a client. If in the mail server 15, it is an IP address of a transmitter of SMTP (Simple Mail Transfer Protocol) or POP (Post Office Protocol). If in the proxy server 16, it is an IP address of a proxy user. Note that, if an event occurs in the terminal 17, the source IP address is defined according to software installed on the terminal 17.
A source port number is a port number of a node responsible for a recorded event. The source port number is defined according to the log type, like the source IP address.
A destination IP address is an IP address of a destination related to an event of delivering a packet. For example, if such an event occurs in the firewall 11, the destination IP address is an IP address of an access destination to the firewall 11. If the event occurs in the IDS 13, the destination IP address is an IP address of an attack destination. If in the Web server 14, it is an IP address of the Web server 14 itself. If in the mail server 15, it is an IP address of a mail destination. If in the proxy server 16, it is an IP address of a proxy access destination. Note that, if an event occurs in the terminal 17, the destination IP address is defined according to software installed on the terminal 17.
A protocol is a protocol used in a communication related to an event. For example, the protocol may be TCP (Transmission Control Protocol), UDP (User Datagram Protocol), ICMP (Internet Control Message Protocol), or the like.
Note that the items of the event log basic parameter are not limited to those as described above. The items may also include, for example, a virus name in an antivirus software, according to a configuration of a unit from which an event log is transmitted.
The attribute information includes items as follows. An n-th octet of a source IP address is the n-th octet obtained by breaking down the source IP address octet by octet, where n=1 to 4 in IPv4 and n=1 to 16 in IPv6.
A source country, a source city, a source latitude, a source longitude, a source AS (Autonomous System) number, a source line type, and a source time zone difference are derived from the source IP address, which is an event log basic parameter. They are, respectively: the country, city, latitude, and longitude in which the node assigned the IP address is located; the AS number and the line type of the network to which the node belongs; and the time difference of the time zone in which the node is located. Those items are obtained either by referencing a table, stored in the storage unit 131 in advance, in which the items are associated with IP addresses, or by using an outside service providing information similar to that in the table.
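As a minimal sketch of how the octet attributes might be derived, the following Python fragment uses the standard `ipaddress` module. The helper name `octets` and the sample addresses are hypothetical.

```python
import ipaddress

def octets(address):
    """Split an IP address string into its octets (4 for IPv4, 16 for IPv6)."""
    return list(ipaddress.ip_address(address).packed)

v4 = octets("192.168.1.10")   # four values, one per octet
v6 = octets("2001:db8::1")    # sixteen values, one per octet
```

Each element of the returned list can then serve as one attribute item (the first octet, the second octet, and so on).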
The line type includes, for example, dial-up, ISDN (Integrated Services Digital Network), ADSL (Asymmetric Digital Subscriber Line), Cable TV, and FTTH (Fiber To The Home).
A source line speed is information on a network environment of the source IP address. The source line speed is, for example, a response time, TTL (Time To Live), or the like obtained by checking the source IP address by the analysis program 132 (see
A source active OS is information obtained by actively checking the network environment of the source IP address by the analysis program 132 (see
Items having names starting with “destination” in the attribute information are derived from a destination IP address as an event log basic parameter, as in the case of the items having names starting with “source”. Description of the following items is thus omitted herefrom: an n-th octet of a destination IP address, a destination country, a destination city, a destination latitude, a destination longitude, a destination AS (Autonomous System) number, a destination line type, a destination time zone difference, a destination line speed, and a destination active OS.
A destination active service is a service name determined by a destination port number or a service name written in an item constituting an event log. For example, if the destination port number is 80, the destination active service is Web. If the destination active service is an event obtained when the IDS 13 detects a packet attacking SQLServer (registered trademark), the service name is SQLServer (registered trademark).
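The determination of the destination active service may be sketched in Python as follows. Only the mapping of port 80 to Web comes from the description above; the other table entries, the fallback value "unknown", and the rule of preferring a service name written in the event log over the port number are assumptions made for illustration.

```python
# Hypothetical port-to-service table; only 80 -> "Web" is from the description.
PORT_TO_SERVICE = {80: "Web", 25: "SMTP", 110: "POP"}

def destination_active_service(destination_port, event_service_name=None):
    """Return a service name for an event.

    Assumption: a service name written in the event log (for example, by
    the IDS) takes precedence over the name looked up from the port number.
    """
    if event_service_name:
        return event_service_name
    return PORT_TO_SERVICE.get(destination_port, "unknown")

web = destination_active_service(80)
sql = destination_active_service(1433, "SQLServer")
```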
Next is described the policy database 135 shown in
As shown in
Next is explained a distance function. Generally, if a cluster is obtained using a clustering technique, a distance function between data is defined.
In a two-dimensional Euclidean space, a distance between data A (xa, ya) and data B (xb, yb) is usually calculated as follows. A distance between data A and data B on the x-axis is an absolute value of a difference between “xa” and “xb”. A distance between data A and data B on the y-axis is an absolute value of a difference between “ya” and “yb”. The distance between data A and data B is obtained by calculating a square root of the sum of the squares of the two absolute values. The x-axis and the y-axis herein are equally scaled.
In this embodiment, the item of each event is assumed to be an axis. A distance between two points is calculated differently according to a characteristic of an axis used. In other words, the axes used are differently weighed.
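The idea of giving each axis its own distance function can be sketched in Python as follows. The description does not state how the per-axis distances are combined, so the square root of the sum of squares used here is an assumption, as are the function names, the key `protocol`, and the sample events.

```python
import math

def numeric_distance(a, b):
    """Absolute difference, as on an ordinary Euclidean axis."""
    return abs(a - b)

def category_distance(a, b):
    """Hypothetical categorical axis: 0 if the values match, 1 otherwise."""
    return 0 if a == b else 1

def event_distance(event_a, event_b, axis_functions):
    """Combine per-axis distances (assumption: root of the sum of squares)."""
    total = 0.0
    for axis, fn in axis_functions.items():
        total += fn(event_a[axis], event_b[axis]) ** 2
    return math.sqrt(total)

# Each axis is paired with the distance function suited to its characteristic.
axes = {"A1": numeric_distance, "A2": numeric_distance,
        "protocol": category_distance}
e1 = {"A1": 192, "A2": 168, "protocol": "TCP"}
e2 = {"A1": 195, "A2": 172, "protocol": "TCP"}
d = event_distance(e1, e2, axes)
```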
For example, as shown in
Next is explained an action policy of the policy database 135 shown in
The action policy includes an action number, a filter condition, an evaluation formula, a threshold, and a countermeasure.
The action number is a number for identification.
The filter condition is a condition used when data of an event log is screened through a filter before a clustering is performed. For example, if the action number is 3, the filter condition is that L2 is WEB. L2 represents an ID shown in
The evaluation formula is A1+A2+A3+A4 in a case in which the action number is 3, wherein A1, A2, A3, and A4 indicate the IDs shown in
For a cluster obtained by a clustering, an evaluation value (which may also be referred to as an evaluation value of the cluster) is obtained. The evaluation value is a quantified combination of: a ratio of the number of events included in the cluster to the total number of events; the number of events; a variance value; and an average of the distances between a centroid of the cluster and the events included in the cluster. In this embodiment, the larger the evaluation value is, the higher the correlation between events is.
The filter condition and the evaluation formula may also be referred to as filter information.
The threshold is used in determining whether an evaluation value of a cluster is larger or smaller than the threshold. If the evaluation value of the cluster is larger than the threshold, the cluster is determined to have events highly related to each other (having similar characteristics), that is, a collective behavior cluster.
The countermeasure represents contents of a processing performed against a collective behavior cluster, for example, a “warning notice” for informing an administrator of warning information, a “bandwidth control” for limiting bandwidth use on transmission, a “packet filter” for blocking a transmission, or the like.
Next is described the distance function database 136 shown in
The distance function definition stores therein a distance function and an algorithm for defining the distance function. For example, a Euclidean distance function returns a Euclidean distance between two points as a return value of the distance function. A country distance function assigned to a source country (ID=A5) and a destination country (ID=18) shown in the distance function assignment policy (see
The port number distance function is assigned to a destination port number (ID=L6) (see
The protocol ranging matrix is used for defining a protocol distance function of the protocol distance function definition of the distance function database 136. For example, it is usually assumed that a relation between different protocols is low, and thus, a value as large as 255 is returned if the protocol numbers are different. However, since ICMP(1) for IPv4 and IPv6-ICMP(58) for IPv6 are both protocols concerning ICMP, a distance therebetween is defined to be as small as 1 (one).
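A protocol distance function backed by such a ranging matrix may be sketched in Python as follows. The values 255 and 1 and the protocol numbers 1 and 58 come from the description above; returning 0 for identical protocols and storing only the exceptional pairs in a dictionary are assumptions made for illustration.

```python
# Exceptional pairs of protocol numbers and their distances.
# ICMP (1) and IPv6-ICMP (58) both concern ICMP, so they are close.
PROTOCOL_RANGING = {frozenset({1, 58}): 1}
DEFAULT_DISTANCE = 255  # returned for unrelated, different protocols

def protocol_distance(p, q):
    """Return the distance between two protocol numbers."""
    if p == q:
        return 0  # assumption: identical protocols have zero distance
    return PROTOCOL_RANGING.get(frozenset({p, q}), DEFAULT_DISTANCE)

icmp_pair = protocol_distance(1, 58)   # ICMP vs IPv6-ICMP
tcp_udp = protocol_distance(6, 17)     # TCP vs UDP
```

Using an unordered pair as the key keeps the matrix symmetric without storing both orderings.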
The line type ranging matrix shown in
A service ranging matrix is used for defining a service distance function of the distance function definition of the distance function database 136. For example, since both a mail delivery service (SMTP) and a mail reception service (POP) are services concerning e-mails, the service ranging matrix defines a distance therebetween as 1 (one). Further, since applications providing services, such as Winny, Winnyp, and WinMX, are all P2P file sharing software, the service ranging matrix defines a distance therebetween to be short.
An OS ranging matrix shown in
Referring back to
As shown in
Then, “L2=WEB”, “A1+A2+A3+A4”, “0.9”, and “packet filter” are set as the filter condition, evaluation formula, threshold, and countermeasure, respectively (steps S1102 to S1105).
The distance function assignment policy (see
Event logs are read from the event database 134 (see
Out of the read event logs, data having L2=WEB is extracted based on the filter condition (step S1108).
The source IP address is broken down into 4 octets as attribute information, based on the evaluation formula (step S1109).
Each event is projected onto a four-dimensional space having A1, A2, A3, and A4 as axes (step S1110).
Respective distance functions corresponding to A1, A2, A3, and A4 are read from the distance function database 136 (see
A clustering is performed using the respective distance functions of the axes A1, A2, A3, and A4 (step S1112).
An evaluation value of the created cluster is computed (step S1113).
It is determined whether or not the cluster is a collective behavior cluster, that is, the computed evaluation value of the cluster is compared to the threshold (step S1114).
If the evaluation value of the cluster is equal to or more than the threshold (if Yes in step S1114), the cluster determined as a collective behavior cluster and a countermeasure thereagainst are transferred to the access control instruction program 133 (step S1115).
If the evaluation value of the cluster is less than the threshold (if No in step S1114), the processing is terminated.
Though not shown, steps S1114 to S1115 are performed for each cluster.
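The control flow of steps S1108 to S1115 can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis program 132 itself: the clustering step is replaced by a placeholder that groups events sharing the first three octets, the evaluation value uses only the ratio of events in the cluster to the filtered events, and the field names `log_type` and `src_ip`, the function name `analyze`, and the sample threshold of 0.7 are hypothetical.

```python
import ipaddress
from collections import defaultdict

def analyze(events, log_type, threshold, countermeasure):
    """Return (cluster key, countermeasure) for each collective behavior cluster."""
    # S1108: keep only the events matching the filter condition.
    selected = [e for e in events if e["log_type"] == log_type]
    # S1109-S1110: project each event onto the axes A1..A4 (source IP octets).
    points = [tuple(ipaddress.ip_address(e["src_ip"]).packed) for e in selected]
    # S1112 (placeholder): group by the first three octets instead of
    # running a real distance-based clustering.
    clusters = defaultdict(list)
    for p in points:
        clusters[p[:3]].append(p)
    results = []
    for key, members in clusters.items():
        # S1113 (assumption): evaluation value = share of the filtered events.
        value = len(members) / len(points)
        # S1114-S1115: hand over clusters whose value reaches the threshold.
        if value >= threshold:
            results.append((key, countermeasure))
    return results

events = [
    {"log_type": "WEB", "src_ip": "192.168.1.10"},
    {"log_type": "WEB", "src_ip": "192.168.1.20"},
    {"log_type": "WEB", "src_ip": "192.168.1.30"},
    {"log_type": "WEB", "src_ip": "10.0.0.5"},
    {"log_type": "MAIL", "src_ip": "192.168.1.40"},
]
collective = analyze(events, "WEB", 0.7, "packet filter")
```

In this sample, three of the four Web events share the first three octets, so that group has a value of 0.75 and is the only one handed over with the countermeasure.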
Next is described an example of a collective behavior cluster determined by the analysis program 132.
Next are described operations of the attack node set detection system according to this embodiment with reference to
First, the attack node set 60 sends an attack packet to the Web server 14 (step S101).
The IDS 13 detects the attack packet sent from the attack node set 60 as an attack and records the attack as an event concerning the attack packet (step S102). The firewall 11 detects the attack packet as a passing packet and records the passing packet in an event log (step S103). The Web server 14 records the attack packet as an access record and also records the attack packet as an event log (step S104).
The attack node set determination apparatus 12 obtains respective event logs from the IDS 13, firewall 11, Web server 14, mail server 15, proxy server 16, and terminal 17 at prescribed intervals or by an operation of an administrator (steps S105, S106, and S107).
The attack node set determination apparatus 12 extracts the event log basic parameters (see
(for example, the source IP address); checks the attribute information (for example, the line speed and active OS); and obtains the checked attribute information (which may also be referred to as first attribute information) (see
The attack node set determination apparatus 12 adds the obtained attribute information to the information on the event and stores the information on the event in the event database 134. Further, the attack node set determination apparatus 12 adds the attribute information created by processing the event log basic parameters (which may also be referred to as second attribute information) to the information on the event and stores the information on the event in the event database 134 (step S109).
After adding the attribute information to the event database 134 (see
A specific example of steps S111 to S112 is described below assuming a case in which, for example, an event of an event log in the Web server 14 is subjected to a clustering, using the conditions shown in Action No. 3 of the action policy (see
The attack node set determination apparatus 12 transmits the cluster information on the collective behavior cluster including Cluster A and the countermeasure against the collective behavior cluster to the firewall 11 (step S113). The firewall 11 stores the received collective behavior cluster and the countermeasure in a storage unit thereof not shown. Upon receiving a new attack packet from the attack node set 60, the firewall 11 extracts an event log basic parameter of an event related to the packet, using the access control program 111; transmits a check packet to an attack node of interest; and obtains information on the attack node, based on the checked result (step S114). If the firewall 11 determines using the access control program 111 that Cluster A includes the packet, the firewall 11 implements the countermeasure targeting all nodes included in Cluster A (step S115).
<Variation>
In a variation, the present invention can be carried out with an existing firewall 11 which implements a countermeasure only against an IP address as a target.
That is, in a step corresponding to step S112 of
In a step corresponding to step S114, the firewall 11 obtains attribute information on a node (which may also be referred to as first attribute information). Upon receiving a new attack packet from the attack node set 60, in a step corresponding to step S115, the firewall 11 compares an event log basic parameter of an event related to the packet and the attribute information, to the event log basic parameter and the attribute information stored in the storage unit, to thereby identify a cluster related to the attack packet. The firewall 11 references the storage unit; extracts an IP address related to the whole cluster; and implements a countermeasure against all IP addresses included in the whole cluster, based on the extracted IP address.
For example, if an IP address of a whole collective behavior cluster is represented as 192.168.1.0/24, all packets corresponding to the IP address are subjected to the same countermeasure.
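A minimal Python sketch of this range-based check, using the standard `ipaddress` module, is shown below. The function name `apply_countermeasure` and the choice of "packet filter" as the default action are hypothetical; an existing firewall would express the same rule in its own filtering syntax.

```python
import ipaddress

# Address range representing the whole collective behavior cluster.
cluster_network = ipaddress.ip_network("192.168.1.0/24")

def apply_countermeasure(source_ip, network=cluster_network,
                         action="packet filter"):
    """Return the action if the source address falls in the cluster's range."""
    if ipaddress.ip_address(source_ip) in network:
        return action
    return None  # the packet does not belong to the cluster

inside = apply_countermeasure("192.168.1.77")
outside = apply_countermeasure("192.168.2.1")
```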
As described above, the attack node set determination apparatus 12 (see
The embodiment and variation according to the present invention have been explained above. However, the present invention is not limited to those explanations, and those skilled in the art can ascertain the essential characteristics of the present invention and can make various modifications and variations to the present invention to adapt it to various usages and conditions without departing from the spirit and scope of the claims.
For example, in this embodiment, as shown in
The firewall 11, attack node set determination apparatus 12, IDS 13, Web server 14, mail server 15, proxy server 16, and terminal 17 may or may not be installed as different pieces of hardware. Those components 11 to 17 may be installed as virtually separate units on a single piece of hardware using a technique of software aggregation or virtualization.
A method of a clustering is not limited to k-means.
The attack node set determination apparatus 12 may operate also as the firewall 11. The Web server 14, mail server 15, proxy server 16, and terminal 17 may be designed to be capable of executing the functions of the attack node set determination apparatus 12 and the access control program 111.
In the embodiment and variation, upon receiving an attack packet, the firewall 11 conducts a countermeasure to deal with the attack packet in step S115. However, the present invention is not limited to this. The firewall 11 may constantly conduct the countermeasure received in step S113, even when the firewall 11 has not yet received an attack packet. This allows the firewall 11 to conduct the countermeasure even when the firewall 11 has not yet determined whether or not a packet having a specific event is an unauthorized access.
The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that various modifications and changes may be made thereto without departing from the spirit and scope of the invention as set forth in the claims.
Number | Date | Country | Kind |
---|---|---|---|
2008-215989 | Aug 2008 | JP | national |