This application claims priority to and the benefit of Korean Patent Application No. 10-2007-0132622 filed in the Korean Intellectual Property Office on Dec. 17, 2007, the entire contents of which are incorporated herein by reference.
(a) Field of the Invention
The present invention relates to a traceback method. Particularly, the present invention relates to a method based on a Markov chain model.
The present invention was supported by the IT R&D program of MIC/IITA [2006-S-009-02, Development of WiBro Service and Operation Standard].
(b) Description of the Related Art
Tracebacks in an IP (Internet protocol) layer that deal with the transmission of packets over a network are classified into a proactive IP traceback and a reactive IP traceback. In addition, the tracebacks are classified into a router-based traceback, a technique for implementing a management system for packet information, a traceback based on a specific network, and a traceback based on a management technique.
The proactive IP traceback includes two representative methods, that is, a probabilistic packet marking method and an Internet control message protocol (ICMP) traceback method.
In the probabilistic packet marking method, two routers adjacent to a path of packets mark their information on the packets with a predetermined probability, and find an attack source on the basis of the information marked on the packets when a distributed denial of service (DDoS) attack occurs.
The probabilistic packet marking method probabilistically marks information on the packets to reduce the overhead of the router and to minimize a marking size. Therefore, the probabilistic packet marking method can solve the problems of the traceback due to fragmentation.
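By way of illustration only, the probabilistic marking described above can be sketched as follows. The sketch uses a simplified variant in which each router on the path overwrites a single mark field with a fixed probability; it does not model the two-router (edge) marking or the fragment encoding. The names `forward_with_marking` and `collect_marks` and all parameters are hypothetical and do not appear in the related art.

```python
import random


def forward_with_marking(path, mark_prob, rng):
    """Send one packet along `path` (attacker side first).

    Each router overwrites the packet's single mark field with its own ID
    with probability `mark_prob`. Returns the mark seen by the victim,
    or None if no router marked the packet.
    """
    mark = None
    for router_id in path:
        if rng.random() < mark_prob:
            mark = router_id
    return mark


def collect_marks(path, mark_prob, n_packets, seed=0):
    """Count how often each router's mark reaches the victim."""
    rng = random.Random(seed)  # seeded so that a run is repeatable
    counts = {}
    for _ in range(n_packets):
        mark = forward_with_marking(path, mark_prob, rng)
        if mark is not None:
            counts[mark] = counts.get(mark, 0) + 1
    return counts


if __name__ == "__main__":
    # Hypothetical three-router path; routers nearer the victim are seen more often.
    print(collect_marks(["R1", "R2", "R3"], mark_prob=0.3, n_packets=10000))
```

Because a later router overwrites an earlier mark, marks from routers far from the victim arrive less often, which is why a victim needs many packets before a distant router can be identified.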
The ICMP traceback method copies the content of a specific ICMP traceback message and forwards the copied message to all the routers. The ICMP traceback method can efficiently access the routers, but has a disadvantage in that an attacker can transmit a fraudulent ICMP traceback message to a victim host.
A hash-based traceback method is a representative example of the reactive IP traceback. In the hash-based traceback method, a source path isolation engine (SPIE)-based traceback server is provided, the entire network is classified into sub-groups, and an agent is provided for each of the sub-groups, thereby managing the network. Each router has a data generation agent (DGA) function. The DGA function applies a hash function to packet information transmitted to each router to hash the packet information. That is, the hash-based traceback method stores and manages IP header information and payload information, and generates a database using a Bloom filter having a hash-based data structure.
If a destination intrusion detection system detects hacking and an illegal act, the agent managing the network group compares information stored in a DGA router in the group with hacking packet information, analyzes the comparison result, and transmits the analyzed result to an SPIE system, thereby reconstructing a transmission path of the packet related to the hacking.
The above information disclosed in this Background section is only for enhancement of understanding of the background of the invention and therefore it may contain information that does not form the prior art that is already known in this country to a person of ordinary skill in the art.
The present invention has been made in an effort to provide a traceback method having an advanced traceback performance, which is a combination of a proactive traceback method and a reactive traceback method.
According to an aspect of the present invention, a traceback method includes: receiving data including router information according to the path of an attacker; filtering the data to hash the data, and storing the hashed information; determining whether the data is normally received on the basis of the hashed information; and predicting a path loss on the basis of the determination result.
The router information may be included in the data by probabilistic packet marking.
The router information may be marked on the data by a transition probability corresponding to a router.
The router information of a plurality of routers may include the results obtained by performing an exclusive OR operation on IDs of the plurality of routers.
The filtering and storing of the information may include separating an Internet protocol header and query information from the data using a Bloom filter, and storing the Internet protocol header and the query information.
The determination of whether the data is normally received on the basis of the hashed information may include examining the Internet protocol header to determine whether the data is normally received.
The determination of whether the data is normally received on the basis of the hashed information may include, when it is determined that the data is abnormally received, predicting the path loss.
The predicting of the path loss may include setting the plurality of routers as nodes, generating a transition probability matrix on the basis of the transition probabilities of the nodes, generating the incidence of each of the nodes on the basis of the transition probability matrix, and determining priorities of the nodes on the basis of the incidences.
The determination of whether the data is normally received may include determining whether there is router information.
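By way of illustration only, the exclusive OR operation on the IDs of a plurality of routers mentioned above can be sketched as follows. The sketch assumes that router IDs are small binary integers; the function names and the example ID values are hypothetical.

```python
from functools import reduce
from operator import xor


def combine_router_ids(router_ids):
    """Fold a sequence of binary router IDs into one value with exclusive OR."""
    return reduce(xor, router_ids, 0)


def recover_missing_id(combined, known_ids):
    """Recover the one unknown ID from the combined value and the known IDs.

    This works because x ^ x == 0, so XOR-ing the known IDs back out of the
    combined value leaves only the remaining ID.
    """
    return reduce(xor, known_ids, combined)


if __name__ == "__main__":
    acr3, acr6 = 0b0011, 0b0110  # hypothetical 4-bit router IDs
    combined = combine_router_ids([acr3, acr6])
    print(format(combined, "04b"))                              # 0101
    print(format(recover_missing_id(combined, [acr3]), "04b"))  # 0110
```

The combined value occupies the same number of bits as a single ID, which is one reason an exclusive OR is attractive when the marking field is small.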
According to another aspect of the present invention, a signal receiving apparatus includes: a receiver that receives data including router information according to the path of an attacker; a filter that groups the data and classifies acknowledgement information of the groups; a storage unit that stores the acknowledgement information; and a determining unit that determines whether the data is normally received on the basis of the acknowledgement information and predicts the path of the attacker.
The acknowledgement information may include mobile router information of the attacker.
The mobile router information may be included in the data according to Markov chain-based probabilistic packet marking.
The router information may include a transition probability corresponding to a router.
The router information of a plurality of routers may be generated by performing an exclusive OR operation on IDs of the plurality of routers.
The acknowledgement information may include an Internet protocol header and query information.
The determining unit may examine the Internet protocol header to determine whether the data is normally received.
When it is determined that the data is abnormally received, the determining unit may predict the path loss.
The determining unit may calculate the incidence of each of the routers on the basis of a transition probability matrix for the plurality of routers and determine priorities of the routers on the basis of the incidences.
The determining unit may determine whether the data is normally received on the basis of whether there is the router information.
According to the above-mentioned aspects of the present invention, it is possible to perform an accurate IP traceback using a probabilistic packet marking method and a hash-based traceback method.
In the following detailed description, only certain exemplary embodiments of the present invention have been shown and described, simply by way of illustration. As those skilled in the art would realize, the described embodiments may be modified in various different ways, all without departing from the spirit or scope of the present invention. Accordingly, the drawings and description are to be regarded as illustrative in nature and not restrictive. Like reference numerals designate like elements throughout the specification.
In the specification, unless explicitly described to the contrary, the word “comprise” and variations such as “comprises” or “comprising” will be understood to imply the inclusion of stated elements but not the exclusion of any other elements. In addition, the terms “-er”, “-or”, and “module” described in the specification mean units for processing at least one function and operation and can be implemented by hardware components or software components and combinations thereof.
In the specification, a terminal may be referred to as a mobile station (MS), a mobile terminal (MT), a subscriber station (SS), a portable subscriber station (PSS), user equipment (UE), or an access terminal (AT). The terminal may include some or all of the functions of the mobile terminal, the subscriber station, the portable subscriber station, and the user equipment.
In the specification, a node may be referred to as a base station (BS), an access point (AP), a radio access station (RAS), a node B, a base transceiver station (BTS), or a mobile multihop relay (MMR)-BS. The node may include some or all of the functions of the access point, the radio access station, the node B, the base transceiver station, and the MMR-BS.
Hereinafter, a traceback method using a Markov chain model will be described.
Referring to
The router (ACR1) 30 is for connecting separated networks using the same transmission protocol. The router 30 connects network layers, and has functions of packet switching, packet forwarding, packet filtering, and routing.
The radio access station 20 transmits signals generated by the mobile station 10, and registers positional information for checking the position of the mobile station 10 existing in the access network 100 controlled by the radio access station 20.
The router 30 of the radio access station 20 controlling the access network 100 including the mobile station 10 generates a binary router ID to perform marking.
That is, the router 30 stores router information of received request packet data, marks the router ID on the router information of response packet data, and transmits the response packet data.
Meanwhile, as shown in
As shown in
The router ID is represented by an arbitrary binary value, as shown in
In this case, the routers ACR3 and ACR6 on the path perform probabilistic packet marking using the Markov chain on the router IDs.
The state of each of the routers through which the mobile station passes for probabilistic packet marking may be represented by the following set:
{∅, (ACR3), (ACR6), (V), (ACR3, ACR6), (ACR6, V), (ACR3, V), (ACR3, ACR6, V)}.
In this case, each state has a transition probability, and a transition probability matrix may be formed on the basis of the transition probability and a total number of transitions.
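By way of illustration only, the set of marking states and the formation of a transition probability matrix from a number of observed transitions can be sketched as follows. The sketch assumes that each matrix entry is the number of observed transitions from one state to another divided by the total number of transitions leaving the first state; the function names and the sample observations are hypothetical.

```python
from itertools import combinations


def marking_states(elements):
    """Enumerate every subset of `elements` as a tuple, the empty subset first."""
    states = []
    for size in range(len(elements) + 1):
        states.extend(combinations(elements, size))
    return states


def transition_matrix(states, transitions):
    """Build a row-normalized transition probability matrix.

    `transitions` is an iterable of (source_state, destination_state) pairs.
    A state with no outgoing observations keeps an all-zero row.
    """
    index = {state: i for i, state in enumerate(states)}
    n = len(states)
    counts = [[0] * n for _ in range(n)]
    for src, dst in transitions:
        counts[index[src]][index[dst]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


if __name__ == "__main__":
    states = marking_states(["ACR3", "ACR6", "V"])
    print(len(states))  # 8 states, matching the set listed above
    observed = [((), ("ACR3",)), ((), ("ACR6",)), (("ACR3",), ("ACR3", "ACR6"))]
    for row in transition_matrix(states, observed):
        print(row)
```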
The transition probability between the router to which the attacker belongs first and the third router ACR3 and the transition probability between the sixth router ACR6 and the router V of the victim host are calculated.
The calculation of the transition probabilities satisfies Equation 1 given below:
$$P\big(T(G) = ACR_i\big) = \frac{\text{number of sources that reached } ACR_i}{\text{total number of sources}} \times P_m \,(1 - P_m)^{\,d(ACR_i,\, v) - 1} \qquad \text{[Equation 1]}$$
In addition, the calculation satisfies
(where T(G) indicates a packet type in a network graph G, ACRi indicates an i-th router in the network graph G, Pm indicates the marking probability of all routers (1/d), d indicates the distance between the router and a victim host that is most distant from the router, and d(ACRi, v) indicates the distance between the victim host V and ACRi).
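By way of illustration only, Equation 1 can be evaluated as follows. The sketch reads Equation 1 as the fraction of sources that reached the router multiplied by Pm(1 − Pm) raised to the power d(ACRi, v) − 1, which is an interpretation of the formula as printed; the function name, the parameter names, and the numerical values are hypothetical.

```python
def marking_probability(sources_reaching, total_sources, p_m, distance):
    """Evaluate Equation 1 for one router.

    sources_reaching: number of sources whose packets reached the router
    total_sources:    total number of sources
    p_m:              marking probability shared by all routers
    distance:         d(ACRi, v), hops between the router and the victim host
    """
    if total_sources <= 0:
        raise ValueError("total_sources must be positive")
    if distance < 1:
        raise ValueError("distance must be at least 1")
    if not 0.0 <= p_m <= 1.0:
        raise ValueError("p_m must lie in [0, 1]")
    reach_fraction = sources_reaching / total_sources
    return reach_fraction * p_m * (1.0 - p_m) ** (distance - 1)


if __name__ == "__main__":
    # Hypothetical case: half of the sources reach a router two hops from the victim.
    print(marking_probability(1, 2, p_m=0.5, distance=2))  # 0.125
```

The factor (1 − Pm) raised to d(ACRi, v) − 1 expresses that a mark survives only if none of the routers between ACRi and the victim host overwrites it.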
When the mobile station 10 of the attacker performs a plurality of handovers and reaches the router V of the final victim host by way of the first router ACR1, the third router ACR3, and the sixth router ACR6, the router V of the victim host traces back the IP of the attacker.
Referring to
When a victim host is defined, the router 400 of the victim host receives data packets using the receiver 410, filters the data packets using the Bloom filter 420, and hashes the filtered data packets (S301). Then, the router 400 stores the hashed data in the database 430 (S303).
The Bloom filter 420 allows a predetermined amount of false positives to make up for the defects of the hash function, so it is important to keep the false positives low. Accordingly, only whether a router ID is present is determined, and the router ID is not stored in its original form, which makes it possible to store a large amount of data information using a small database 430.
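By way of illustration only, a Bloom filter that records only whether a router ID has been seen, without storing the ID itself, can be sketched as follows. The sketch is a generic Bloom filter and is not the specific structure of the Bloom filter 420; the bit-array size, the number of hash functions, the use of SHA-256 with a one-byte salt, and the class name are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array that answers 'possibly present' or 'definitely absent'."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes  # must not exceed 256 because of the one-byte salt
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, item):
        # Derive n_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        """Set the bits for `item` (a bytes object); the item itself is not kept."""
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        """True if every bit for `item` is set; false positives are possible."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


if __name__ == "__main__":
    seen = BloomFilter()
    seen.add(b"ACR3")
    print(b"ACR3" in seen)  # True: an added item is always reported as present
    print(len(seen.bits))   # 128 bytes, regardless of how many IDs are added
```

An added item is never reported as absent, whereas an item that was never added may occasionally be reported as present; enlarging the bit array lowers that false positive rate at the cost of storage.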
Then, the determining unit 440 searches the stored data for query information of interest in order to identify the packet type and the storage format of the stored data, and uses them to generate information for the IP traceback (S305).
Then, the determining unit 440 examines the IP header of the stored data to determine whether the data is normally transmitted (S307).
When it is determined that the data is normally transmitted, the determining unit 440 immediately performs the IP traceback (S311). When it is determined that a transmission loss occurs, the determining unit 440 finds a lost portion using a prediction module and then performs a traceback (S309).
In order to find the lost portion, the determining unit sets each router in the network graph G shown in
As shown in
When the second to sixth routers between the first router of the attacker and the router of the victim host are set as nodes and the incidence of each node is calculated, (0.2260, 0.0904, 0.2203, 0.1243, 0.2203, 0.1186)T shown in
When the incidences are arranged in ascending order, the priorities of the nodes are obtained in the same order, and it is possible to perform a traceback by taking the nodes in priority order as the path of the attacker.
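By way of illustration only, the calculation of the incidences from a transition probability matrix and the ordering of the nodes can be sketched as follows. The sketch assumes that the incidence of each node corresponds to the stationary distribution of the Markov chain, which is consistent with the six values given above summing to approximately one but is not stated in the specification. The power-iteration routine, the function names, and the two-node matrix are hypothetical; the nodes are identified only by their position in the vector.

```python
def stationary_distribution(matrix, iterations=1000, tol=1e-12):
    """Approximate the stationary distribution of a row-stochastic matrix
    by repeatedly multiplying a probability vector by the matrix."""
    n = len(matrix)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, dist)) < tol:
            return nxt
        dist = nxt
    return dist


def rank_ascending(incidences):
    """Return node positions ordered by ascending incidence (ties keep input order)."""
    return sorted(range(len(incidences)), key=lambda i: incidences[i])


if __name__ == "__main__":
    # Hypothetical two-node chain; its stationary distribution is (1/3, 2/3).
    print(stationary_distribution([[0.5, 0.5], [0.25, 0.75]]))

    # Incidence values quoted in the description, by position in the vector.
    incidences = [0.2260, 0.0904, 0.2203, 0.1243, 0.2203, 0.1186]
    print(rank_ascending(incidences))  # [1, 5, 3, 2, 4, 0]
```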
When the IP traceback is actually implemented as shown in
Therefore, if marking is not performed due to the packet loss of the router ACR6, the router ACR5 may also be considered to have the highest probability of a packet loss. Therefore, it is possible to exclude other routes from the traceback.
As such, it is possible to reconstruct a transmission path in consideration of both whether a transmission loss occurs and whether packets are normally transmitted. Therefore, this embodiment is more effective than the traceback method according to the related art.
The above-described exemplary embodiment of the present invention can be applied to programs that allow computers to execute functions corresponding to the configurations of the exemplary embodiments of the invention or recording media including the programs as well as the method and apparatus. Those skilled in the art can easily implement the applications from the above-described exemplary embodiments of the present invention.
While this invention has been described in connection with what is presently considered to be practical exemplary embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, but, on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2007-0132622 | Dec 2007 | KR | national |