The present invention relates to an apparatus and method for countering Denial-of-Service attacks in Communication Appliances and specifically for appliances which deploy Voice over Internet Protocol.
Voice over Internet Protocol (VoIP) relates to the transmission of voice or speech over data-style packet-switched networks, i.e., the Internet. An advantage of VoIP is that a user making a call is typically not charged beyond the Internet access charge, thereby making VoIP an attractive option for long distance calls. A typical VoIP deployment includes media gateways, media gateway controllers, end-user communication devices and many other support servers such as, for example, DNS, DHCP, and FTP. Media gateways, media gateway controllers and VoIP end-devices exchange the VoIP signaling/control and media packets. Many different types of end-user communication appliances implement VoIP including traditional telephone handsets, conferencing units, mobile phones, Personal Digital Assistants (PDAs) and desktop and mobile computers.
Denial-of-Service attacks are becoming a concern in viable VoIP deployments. Non-specific viruses, worms and Trojans as well as targeted VoIP Denial-of-Service (DoS) attacks can disrupt the service by either degrading the performance of IP end-points and/or media servers and gateways or by bringing them down altogether. The malicious packet flood, upon reaching these VoIP infrastructure elements, consumes network and/or host resources such as central processing unit (CPU) cycles and memory to the extent that the host device is unable to process legitimate packets, resulting in service disruption. Phones deploying VoIP (“IP-phones”) and other lightweight devices are especially susceptible to such attacks because of the inherent imbalance between their network and processor resources. For instance, in an IP-phone, the network interface is typically 10/100 Mbps Ethernet whereas the CPU is an Advanced RISC Machine (ARM) or Microprocessor without Interlocked Pipeline Stages (MIPS) type processor meant for embedded systems. To contain costs, the CPUs have fairly low processing power.
Currently, firewalls are used in the network infrastructure, mostly at the periphery of the network (the technique is called perimeter protection) to prevent and/or rate-limit malicious packets from reaching servers and the end-points. However, this alone is not sufficient to prevent DoS attacks on VoIP, as it takes very little network traffic to disrupt a VoIP end-point. Setting bandwidth limits at very low levels at the perimeter of the network also prevents legitimate traffic from reaching the devices. Therefore, a complementary and viable approach is to filter illegitimate traffic at the device itself in addition to the network perimeter. In other words, each device needs an efficient embedded firewall to be resilient against flooding based DoS attacks. The core of any firewall is the packet-classification engine. There are two conflicting dimensions to the performance of a packet classifier, time and space. A large body of research has been devoted to understanding the trade-offs between time and space complexity of the packet classification problem. Typical packet classification is done on a limited set of header fields in a packet. The fields, associated values and the firewall action (drop/forward) are specified as rules, which the classification engine takes as input.
Packet classification is a core technology used in infrastructure elements such as routers, switches, and firewalls. The goal of these elements is to process/forward as much traffic as they can at wire speeds up to the backplane capacity. In other words, the efficiency of packet classification should be such that it should not be the bottleneck in packet forwarding while still being able to support a large number of rules.
Accordingly, the primary design goals for packet classifiers have been scalability and speed traded off against memory space needed. While a simple linear search takes O(n) storage, its time complexity is also O(n), which is not appropriate for efficient processing of a large number of rules. The techniques for efficient packet classification can be divided broadly into four categories: algorithmic; heuristic; hardware assisted; and special cases.
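The linear search mentioned above can be illustrated with a minimal sketch. It assumes a rule is a tuple of header-field values (with None as a wildcard) plus an action, and that the first matching rule wins. The class name, field names, addresses and ports are hypothetical and chosen only for illustration; they are not taken from the described embodiment.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rule:
    # None acts as a wildcard: the field is not checked.
    src_ip: Optional[str]
    protocol: Optional[str]
    dst_port: Optional[int]
    action: str  # "forward" or "drop"

    def matches(self, packet: dict) -> bool:
        return all(
            wanted is None or packet.get(field) == wanted
            for field, wanted in (
                ("src_ip", self.src_ip),
                ("protocol", self.protocol),
                ("dst_port", self.dst_port),
            )
        )


def classify(packet: dict, rules: Sequence[Rule], default: str = "drop") -> str:
    """Linear search: O(n) time and O(n) storage in the number of rules."""
    for rule in rules:
        if rule.matches(packet):
            return rule.action
    return default


# Hypothetical rule base: one media server allowed for RTP and call signaling.
RULES = [
    Rule("192.0.2.10", "udp", 5004, "forward"),
    Rule("192.0.2.10", "tcp", 1720, "forward"),
]
```

With only a handful of rules, as in this sketch, the O(n) scan is short; the cost becomes a concern when n is large, which is the case the four categories of techniques address.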
An object of the present invention is to provide an apparatus and method for protecting a communication appliance against Denial-of-Service attacks.
Assume that the communication appliance is a unit-cycle device, i.e., its processing capacity is normalized to 1, and that the device needs X cycles to process the peak valid load. Then 1-X cycles are left to classify and filter all arriving packets. If the classification and filtering mechanism can correctly handle, within 1-X cycles, any packet rate up to the upper bound on the ingress packet rate, then the communication appliance can sustain any flooding-based DoS attack up to the limit of the network pipe. Therefore, another object of the invention is to design a device-based efficient firewall which meets the above condition for withstanding a flooding-based DoS attack.
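The condition above can be checked numerically. The following is a minimal sketch, assuming that the per-packet classification cost is expressed as a fraction of the normalized unit of CPU capacity and that the worst-case ingress rate is known. The function name, its parameters and the example figures are hypothetical; the worst-case rate of roughly 14,881 packets per second corresponds to minimum-size frames on a 10 Mbps Ethernet link.

```python
def can_sustain_flood(valid_load_x: float,
                      cost_per_packet: float,
                      max_ingress_pps: float) -> bool:
    """Return True if classification fits in the 1 - X budget.

    valid_load_x     -- X, fraction of CPU needed for the peak valid load
    cost_per_packet  -- CPU fraction (of one second) to classify one packet
    max_ingress_pps  -- upper bound on packets per second at the ingress port
    """
    budget = 1.0 - valid_load_x                    # cycles left for filtering
    needed = cost_per_packet * max_ingress_pps     # cycles the flood would cost
    return needed <= budget


# Assumed worst case: 10 Mbps link, 84 bytes on the wire per minimum frame
# (64-byte frame plus preamble and inter-frame gap).
WORST_CASE_PPS = 10_000_000 / (84 * 8)
```

For example, with X = 0.6 and an assumed cost of 20 microseconds per packet, can_sustain_flood(0.6, 20e-6, WORST_CASE_PPS) holds, whereas at 30 microseconds per packet it does not.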
The object is met by a method for preventing or limiting the effects of Denial-of-Service attacks in a communication appliance having a packet-classification rule base which allows all legitimate packets to be forwarded to the communication appliance, wherein the method includes monitoring incoming packets to the communication appliance to determine whether conditions indicating a Denial-of-Service attack are present and selecting a rule base subset of the packet-classification rule base from a plurality of rule base subsets based on a current one of a plurality of operating states of the communication appliance when the conditions indicating a Denial-of-Service attack are determined to be present.
The determination of whether conditions indicating a Denial-of-Service attack are present includes determining whether a rate of ingress exceeds a threshold rate. The communication appliance may have a plurality of operating states having different maximum legitimate packet ingress rates. The threshold rate may be varied based on a current operating state of the communication appliance. The threshold rate may be further dependent on whether the received traffic is periodic, features used by the communication appliance, an inherent packet rate transmitted by the sender, and/or network latency and jitter.
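The state-dependent selection described above can be sketched as a lookup keyed by operating state. This is a minimal illustration only: the state names, threshold values and protocol labels below are assumptions made for the example and are not taken from the described embodiment.

```python
# Full rule base, represented here only by hypothetical protocol labels.
FULL_RULE_BASE = ("dhcp", "tftp", "h225", "rtp", "rtcp", "snmp", "http")

# Hypothetical per-state profiles: a threshold (packets per second) and the
# reduced rule base subset to apply once that threshold is exceeded.
STATE_PROFILES = {
    "dhcp":    {"threshold_pps": 10,  "subset": ("dhcp",)},
    "idle":    {"threshold_pps": 20,  "subset": ("h225",)},
    "in_call": {"threshold_pps": 100, "subset": ("rtp", "rtcp", "h225")},
}


def active_rules(state: str, ingress_pps: float) -> tuple:
    """Return the rule set in force for the given state and ingress rate."""
    profile = STATE_PROFILES[state]
    if ingress_pps > profile["threshold_pps"]:
        # Attack conditions present: fall back to the state's subset.
        return profile["subset"]
    return FULL_RULE_BASE
```

Keeping the threshold and the subset in one per-state record reflects that both depend on the current operating state; other factors named above, such as jitter, would adjust the threshold values.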
The object of the present invention is also met by a method for preventing or limiting the effects of Denial-of-Service attacks in a communication appliance having a packet-classification rule base which allows all legitimate packets to be forwarded to the communication appliance, the method comprising the step of rejecting a packet including a gratuitous reply.
A firewall may be arranged in the communication appliance and configured for performing the above described method. The communication appliance may be an IP-phone, conference unit, computer, or any other appliance capable of VoIP communications.
Other objects and features of the present invention will become apparent from the following detailed description considered in conjunction with the accompanying drawings. It is to be understood, however, that the drawings are designed solely for purposes of illustration and not as a definition of the limits of the invention, for which reference should be made to the appended claims. It should be further understood that the drawings are not necessarily drawn to scale and that, unless otherwise indicated, they are merely intended to conceptually illustrate the structures and procedures described herein.
In the drawings:
Current VoIP systems use either a proprietary protocol or one of two standards, H.323 and Session Initiation Protocol (SIP). The implementation of the present invention is described below using an H.323 based IP phone example. However, the generic solution described below may be implemented in communication appliances in any of the different VoIP systems.
The H.323 standard is specified by the International Telecommunication Union, Telecommunication Standardization Sector (ITU-T). An example of an H.323 network 10 is shown in
The H.323 network 10 is also connected to a gateway 14 which connects the H.323 network to a non-H.323 network 16 such as, for example, an ISDN or PSTN. A gatekeeper 18 provides address translation and bandwidth control for the appliances 12a-12n connected to the H.323 network 10. A Back End Service (BES) 20 is connected to the gatekeeper and comprises a database which maintains data about the appliances 12a-12n, including permissions, services, and configurations. A Multi-point Control Unit (MCU) 22 is an optional element of the H.323 network which facilitates communication between more than two terminals.
The gateway 14 may be decomposed into one or more media gateways (MGs) 14a and a media gateway controller (MGC) 14b. The MGC 14b handles the signaling data between the MGs 14a and other network components such as the gatekeeper 18 or towards SS7 signaling gateways. The MGs 14a focus on the audio signal translation function.
A firewall 24 is embedded in a communication appliance 12a-12n to prevent Denial-of-Service (DoS) attacks to that appliance. A firewall is also embedded in the gateway 14 to similarly prevent DoS attacks to the gateway 14. Each firewall 24 includes a packet classification engine for filtering packets received at the communication appliances. The firewall 24 utilizes rules in a packet-classification rule base 26 to determine whether an incoming packet should be forwarded to the respective communication appliance 12a-12n. The firewall thus prevents malicious packets from reaching and/or limits the rate at which malicious packets reach the communication appliances.
As discussed in more detail below, the packet classification rule base 26 is administered by a network administrator. The packet classification rule base 26 may be connected directly to each communication appliance 12a-12n as shown in
The overall method for efficient filtering of packets for communication appliances according to the present invention is shown in
The parameters of the detection conditions in steps S102 and S106 may be based on a simple heuristic, an example of which is described below for the normal, in-call state of an H.323 based IP-phone. During the in-call state of an H.323 based IP-phone, Real-time Transport Protocol (RTP), RTP Control Protocol (RTCP) and H.225 (call signaling) heartbeats between the IP-phone and the server constitute the periodic traffic received at the IP-phone, of which the RTP flow consumes the most bandwidth. Assuming a 20 millisecond audio payload per packet, a periodic stream of packets using the G.711 codec (the ITU-T recommendation for audio coding at 64 kbps) produces a receive rate of 50 packets per second. It is possible for these packets to be queued in transit and then released, causing the rate to artificially exceed 50 packets per second for a limited period. This must be taken into account when determining the granularity of measuring the ingress rate and when determining the point at which the filtering policy is to be put into effect by updating the rule base. While calculating the ingress rate each time a packet is received is possible, this frequency of calculation is a large drain on CPU availability. Calculating the ingress rate only once every certain number of received packets, on the other hand, results in the loss of some accuracy. The QoS guidelines for VoIP traffic help determine the optimal interval for calculating the ingress rate. The G.711 codec can typically deliver acceptable call quality with less than 150 milliseconds delay and less than 2% loss. This implies that any intrusion that lasts less than 150 milliseconds will not have a perceptible effect on call quality. In other words, a suitable rate monitoring interval is less than 150 milliseconds. Similarly, once the packet ingress rate is determined to be higher than the established threshold, the filtering policy may take up to 150 milliseconds (from the start of the attack) to take effect before call quality suffers.
Therefore, a reasonable heuristic may be to monitor every 50 milliseconds and have three consecutive measurements exceed the threshold before the firewall rule-base is updated.
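The heuristic of three consecutive 50 millisecond measurements can be sketched as a small windowed counter. This is a minimal sketch under stated assumptions: timestamps are integer milliseconds supplied by the caller, the class and method names are hypothetical, and the threshold of 100 packets per second used in the usage note is an assumed value that leaves headroom above the nominal 50 packets per second for bursts caused by queuing in transit.

```python
class IngressMonitor:
    """Flags an attack after `required` consecutive windows over threshold."""

    def __init__(self, threshold_pps: float, interval_ms: int = 50,
                 required: int = 3):
        # Packets allowed per measurement window at the threshold rate.
        self.limit = threshold_pps * interval_ms / 1000
        self.interval_ms = interval_ms
        self.required = required
        self.window_start = None   # start time of the current window (ms)
        self.count = 0             # packets seen in the current window
        self.streak = 0            # consecutive windows over the limit
        self.under_attack = False  # latched once the streak is reached

    def on_packet(self, now_ms: int) -> bool:
        if self.window_start is None:
            self.window_start = now_ms
        # Close every window that ended before this packet arrived; empty
        # windows count as below threshold and reset the streak.
        while now_ms >= self.window_start + self.interval_ms:
            self._close_window()
        self.count += 1
        return self.under_attack

    def _close_window(self) -> None:
        self.streak = self.streak + 1 if self.count > self.limit else 0
        if self.streak >= self.required:
            self.under_attack = True
        self.count = 0
        self.window_start += self.interval_ms
```

Counting packets per window, rather than recomputing a rate on every packet, keeps the per-packet work to one comparison and one increment. With a threshold of 100 packets per second, a legitimate stream of one packet every 20 milliseconds never trips the monitor, while a flood of one packet per millisecond is flagged on the packet arriving 150 milliseconds after the flood starts. Clearing the attack state once the flood subsides is not modeled here.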
In the above example, step S102 of the method in
The present invention recognizes differences between the design requirements for packet classification and filtering in communication appliances and classification and filtering design requirements in large network elements such as routers and switches.
The first difference is that a communication appliance legitimately sends and receives traffic to and from only a small set of IP addresses. For example, an IP-phone 12a in the H.323 network shown in
Another difference between communication appliances and network elements is that, while a communication appliance has a full-fledged IP stack, the number of protocols that the communication appliance uses is smaller than the number of protocols used by a network element. For example, the H.323 based IP-phone 12a uses the RTP, RTCP, H.323 suite, SNMP, TFTP, and HTTP protocols. The IP-phone 12a is never expected to receive IP datagrams, arbitrary UDP packets, or TCP packets not belonging to the above protocols.
Yet a further difference between communication appliances and network elements is that communication appliances have a plurality of distinct operating states. Network elements such as routers and switches do not have such distinct operating states. The IP-phone, for example, has four distinct operating states once it has booted as shown in
Another important difference between the properties of communication appliances and those of other network elements, which may be used to defend against a flooding-based DoS attack in a communication appliance, is that the legitimate traffic rate that a communication appliance ever receives is upper bounded by a known value which is significantly less than the capacity of the network connection of the communication appliance. Continuing with the example of an IP-phone, the RTP stream is the most rate-intensive traffic received by the IP-phone in normal operation. Using the G.711 codec, packets are received at the rate of 50 packets per second assuming a 20 millisecond audio payload per packet. Accordingly, packets received at a higher rate can be determined to be illegitimate. In some cases, certain peculiarities that might be induced by network latency and jitter need to be taken into account in determining the allowable packet ingress rate.
The above-described differences between communication appliances and particularly IP-phones and network elements may be used in the implementation of the general method for preventing DoS attacks at a communication appliance as disclosed by
After the IP-phone boots and once the network stack is initialized, the IP-phone does not have an IP address (for static configured IP addresses, the DHCP phase may be bypassed). It initiates a DHCP broadcast request. In this state, the reduced rule base in step S108 of
The determination of the threshold for the ingress packet rate in step S104 of
Continuing with the example of an IP-phone, the most network intensive traffic the IP-phone receives during normal operation is RTP packets during a call. If the G.711 codec is being used with 160 byte packets, the data bandwidth of the RTP stream is 64 kbps. Assuming 20 millisecond audio per packet, this translates to 87.2 kbps at the Ethernet layer. This is more than two orders of magnitude lower than the capacity of the full-duplex 10 Mbps Ethernet port of the IP-phone. Therefore, if the packet ingress rate at the IP-phone exceeds the 87.2 kbps rate, some packets have to be illegitimate. In other words, packet ingress rate monitoring is a strong measure for intrusion detection.
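The 87.2 kbps figure can be reproduced with a short calculation. The sketch below assumes a per-packet overhead of 12 bytes of RTP header, 8 bytes of UDP header, 20 bytes of IPv4 header and 18 bytes of Ethernet header and frame check sequence; this breakdown is an assumption that happens to reproduce the stated figure and is not given in the text. The function name and parameters are hypothetical.

```python
def rtp_ethernet_rate_bps(payload_bytes: int = 160,
                          packet_interval_ms: int = 20,
                          rtp_hdr: int = 12,
                          udp_hdr: int = 8,
                          ip_hdr: int = 20,
                          eth_overhead: int = 18) -> float:
    """Bit rate of one RTP stream as seen at the Ethernet layer."""
    packets_per_second = 1000 / packet_interval_ms           # 50 for 20 ms
    frame_bytes = (payload_bytes + rtp_hdr + udp_hdr
                   + ip_hdr + eth_overhead)                  # 218 bytes
    return frame_bytes * 8 * packets_per_second


rate = rtp_ethernet_rate_bps()      # 87200.0 bits per second
headroom = 10_000_000 / rate        # about 115 times below a 10 Mbps port
```

The payload alone accounts for 160 x 8 x 50 = 64,000 bits per second; the remaining 23.2 kbps is header overhead.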
The table in
As explained above, the upper bound used for comparison purposes needs to be carefully determined by the system administrators. The factors which affect this value include the state of the appliance, whether the traffic is periodic or non-periodic, whether features such as silence suppression are used, the inherent packet rate as transmitted by the sender, and network (in-transit) latency and jitter. All but the last of these factors have a deterministic effect on the traffic volume as seen at the ingress port of the communication appliance. Network latency, jitter and loss, on the other hand, can introduce random queuing and loss at various points while the packets are in transit, resulting in a variable arrival rate of otherwise periodic traffic such as RTP. It is possible that substantial congestion in transit leads to excessive queuing. Under pathological conditions, it is also possible that quick clearing of these queued packets leads to their arrival at the end-point within a short duration, creating an artificially high arrival rate.
The following describes an additional method for reducing the effects of DoS attacks. As described above, a communication appliance (i.e., terminal or end-point) performs specific specialized tasks carried out by a small set of protocols. The message exchanges between the appliance and media gateways and servers and the end-points themselves often involve “request-reply” type of messages, which are characterized by one-to-one pairing. A specific class of DoS attack involves sending gratuitous replies when no request has been issued. In many cases, the behavior of the communication appliance upon such a reply is unspecified and is implementation dependent. A classic exploit against VoIP systems is the sending of “gratuitous address resolution protocol (ARP) replies”, where the Media Access Control (MAC) address for any IP address is changed to the one of the attacker. Upon receiving the “ARP reply”, the communication appliance updates its ARP tables resulting in call-hijacking or the phone not being able to communicate with the media-server.
The present invention implements a message pairing rule, in which the communication appliance effectively ignores any gratuitous reply for which it did not issue a corresponding request. Other examples of request-reply messages include DHCP request-reply, and gatekeeper request-gatekeeper confirmation (GRQ-GCF) in the H.323 suite. According to this embodiment, the firewall of the communication appliance stores a list of requests which are unanswered. This list may be stored as part of the packet classification rule base. Upon receiving a packet containing a reply, the communication appliance determines whether the reply corresponds to any of the unanswered requests. If the reply corresponds to one of the unanswered requests, the reply is forwarded to the communication appliance. Otherwise, the reply is discarded.
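The message pairing rule can be sketched as a table of outstanding requests. This is a minimal illustration, assuming each request can be identified by a protocol label and a matching key (for instance the queried IP address for ARP, or the transaction identifier for DHCP). The class name, method names and key choices are hypothetical, and expiry of stale requests is not modeled.

```python
class PendingRequests:
    """Tracks unanswered requests so that gratuitous replies can be dropped."""

    def __init__(self):
        self._pending = set()

    def note_request(self, protocol: str, key) -> None:
        # Called when the appliance sends a request.
        self._pending.add((protocol, key))

    def accept_reply(self, protocol: str, key) -> bool:
        """Return True (forward) only if a matching request is outstanding."""
        try:
            # One-to-one pairing: a request is consumed by its first reply.
            self._pending.remove((protocol, key))
        except KeyError:
            return False   # gratuitous reply: discard
        return True
```

For example, after note_request("arp", "192.0.2.1"), the first accept_reply("arp", "192.0.2.1") returns True and any further reply for the same address returns False, so an unsolicited ARP reply cannot overwrite the ARP table.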
Thus, while there have been shown and described and pointed out fundamental novel features of the invention as applied to a preferred embodiment thereof, it will be understood that various omissions and substitutions and changes in the form and details of the devices illustrated, and in their operation, may be made by those skilled in the art without departing from the spirit of the invention. For example, it is expressly intended that all combinations of those elements and/or method steps which perform substantially the same function in substantially the same way to achieve the same results are within the scope of the invention. Moreover, it should be recognized that structures and/or elements and/or method steps shown and/or described in connection with any disclosed form or embodiment of the invention may be incorporated in any other disclosed or described or suggested form or embodiment as a general matter of design choice. It is the intention, therefore, to be limited only as indicated by the scope of the claims appended hereto.