The present invention generally relates to overload protection for a control plane processor inside a network node, e.g. a DSLAM or Digital Subscriber Line Access Multiplexer, a BRAS or Broadband Remote Access Server, an IP edge router, etc. Such a network node typically has one or more data plane processors that handle, at wire speed, incoming data packets that need no special treatment. This is called the fast path. Data packets that need special treatment are redirected by the data plane processor(s) towards the control plane processor, also known as the slow path.
Note that there is no fixed relationship between the number of data plane processors and the number of control plane processors. This relationship may vary from 1:1 (a dedicated control plane processor for each data plane processor), to N:M or even N:1 (all N data plane processors are served by a single control plane processor). Special treatment could be any type of complex protocol handling, for instance parameter checking for quality of experience purposes on RTP (Real-time Transport Protocol) messages, fragmenting/de-fragmenting IP (Internet Protocol) packets, text parsing on SIP (Session Initiation Protocol) messages, etc. Because no single general purpose or communication processor is capable of handling the packets that need special treatment at wire speed, adequate overload protection measures are needed for the control plane processor.
Known solutions for overload protection of control plane processors in network nodes generally can be classified in two categories: the first category is based on static rate-limiting inside the data plane, the second category is based on low level packet dropping inside the control plane.
Static rate-limiting inside the data plane requires policers in the data plane that determine the maximum amount of data traffic that can be accepted and treated by the slow path, i.e. the control plane processor. The maximum is pre-configured to be a static value that the control plane processor is always able to handle. Static rate-limiting protection where the data plane processor's policing engines are used to protect the control plane from denial of service attacks is for instance suggested in the publication “Networking Systems Require Tight Control/Data Plane Integration” from author Hemant Triveldi. This publication of 29 May 2002 can be downloaded from the Internet via the URL: http://www.commsdesign.com/design_corner/showArticle.jhtml?articleID=16504831
A drawback of static rate-limiting solutions is that they require knowledge in advance of the amount of traffic that each service running in the control plane can handle. These amounts are hard to predict and usually require empirical measurements. Once a maximum value has been determined for the amount of data packets that can be redirected to the control plane processor per time unit, the static rate-limiting solution is rather inflexible. Additional services may be installed to run on the control plane processor, impacting the amount of traffic that existing services can handle. An upgrade of the control plane processor may be executed, requiring a complete re-evaluation of all policing parameters used in the data plane. Static rate-limiting solutions require re-evaluations, new empirical measurements and eventually manual interventions to reconfigure the policers in the data plane each time upgrades or changes in the control plane services take place. Moreover, in case the data plane is equipped with multiple elements each redirecting data packets to the same control plane processor, the determination of maximum traffic rate values for the policers becomes extremely difficult, and the system becomes even less flexible in case of upgrades.
As opposed to static rate-limiting, low level packet drop solutions do not require any precautions inside the data plane. All traffic that needs special treatment is redirected towards the control plane processor, where low level software will start to drop data packets when the load on the control plane processor is becoming too high. Eventually, the control plane starts dropping packets belonging to certain classes or services. An example software package that implements low level packet dropping for certain pre-configured classes of packets is Cisco's Control Plane Policing (CPP) software described in the white paper “Deploying Control Plane Policing”. The May 2005 update of this white paper can be extracted from the Internet via URL: http://www.cisco.com/en/US/products/sw/iosswrel/ps1838/products_white_paper09186a0080211f39.shtml
A disadvantage of low level packet drop solutions inside the control plane is that the software algorithm dropping the packets already consumes processing power in the control plane. Because the control plane processor itself has to decide on the dropping of packets, its performance will decrease, in particular when the processor is near overload conditions. This shortcoming of low level packet drop solutions opens the door to Denial of Service (DoS) attacks on the slow path, where an unlimited amount of malicious data packets that need special treatment are sent to the network node, resulting in business-impacting control plane processor outages. Further, low level packet drop solutions are blind regarding the different services running on the control plane and/or blind regarding the flows handled by the different services running on the control plane. In case only one service is receiving an excessive amount of data packets, the low level packet drop software will not only drop packets destined to the service in trouble, but for instance also packets destined to other services running on the control plane. More advanced implementations of low level packet dropping like Cisco's CPP solution, already cited above, distinguish between services or classes of packets. These implementations however not only drop data packets from the flow(s) that cause a service running on the control plane to suffer from overload, but also impact other users making use of the same service or flows belonging to the same class.
The object of the present invention is to provide an alternative solution for overload protection of control plane processors inside network nodes that does not suffer from the shortcomings of the prior art static rate-limiting and low level packet drop solutions. It is an object to add flexibility to the static rate-limiting solution and to add more detailed flow control to a CPP-like solution.
The above drawbacks are overcome and the object of the current invention is realised through a network node as defined in claim 1 having an overload protection function at the control plane able to identify individual flows causing the overload such that the data plane processor can apply increased rate-limiting on those individual flows.
Thus, a software process (or alternatively a hardware implemented version of the overload protection function) monitors the load on services running on the control plane and, in case of overload, produces a detailed indication of which flow (or which user) is causing the overload condition. This indication is sent back to the data plane, enabling dynamic rate-limiting of a single stream of packets (called a flow). The basic idea underlying the invention in other words is to implement flow control or user based flow control inside network nodes (user based because, at least for access nodes, a single flow can always be mapped to a single user) through a load protection function in the control plane providing detailed feedback to the data plane processor. The use of the flow-ID as rate-limiting granularity is the finest level of flow control achievable between the control plane and data plane. The current invention therefore enables the best possible control and flexibility. It provides DoS attack prevention for the control plane since the control plane processor no longer has to drop the packets itself, and does not require static pre-configuration.
The object of the current invention is further realised through a method for overload protection of a control plane processor as defined in claim 10.
An optional implementation of the current invention based on load monitoring means, flow identification means and instruction means, is defined in claim 2. Indeed, the load monitoring means may monitor services running on the control plane in order to identify the service(s) that reach an unacceptable CPU load level. Next, the flows will be identified that cause the service(s) to suffer from overload. Thereupon, instructions will be issued to the data plane to intensify the rate-limiting for the identified flows causing overload.
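The three cooperating means can be sketched as three small functions. This is only an illustrative sketch under stated assumptions, not the claimed implementation: the function names, the 80% load threshold, the choice of the highest-rate flow as the offender and the halving of its limit are all hypothetical.

```python
from typing import Dict, List, Tuple

LOAD_THRESHOLD = 0.80  # assumed: maximum acceptable CPU share per service
TIGHTEN_FACTOR = 0.5   # assumed: new limit = old limit * factor


def find_overloaded_services(cpu_load: Dict[str, float]) -> List[str]:
    """Load monitoring: services whose CPU load exceeds the threshold."""
    return [svc for svc, load in cpu_load.items() if load > LOAD_THRESHOLD]


def find_offending_flows(overloaded: List[str],
                         flow_service: Dict[int, str],
                         flow_rate: Dict[int, float]) -> List[int]:
    """Flow identification: per overloaded service, the flow with the highest rate."""
    offenders = []
    for svc in overloaded:
        flows = [f for f, s in flow_service.items() if s == svc]
        if flows:
            offenders.append(max(flows, key=lambda f: flow_rate.get(f, 0.0)))
    return offenders


def build_instructions(offenders: List[int],
                       current_limit: Dict[int, float]) -> List[Tuple[int, float]]:
    """Instruction: (flow-ID, tightened limit) pairs for the data plane rate limiter."""
    return [(f, current_limit[f] * TIGHTEN_FACTOR) for f in offenders]


# Hypothetical snapshot: flow 2 floods the de-fragmentation service.
cpu_load = {"defrag": 0.95, "igmp_proxy": 0.20}
flow_service = {1: "defrag", 2: "defrag", 3: "igmp_proxy"}
flow_rate = {1: 50.0, 2: 900.0, 3: 10.0}          # packets per second
current_limit = {1: 1000.0, 2: 1000.0, 3: 1000.0}  # packets per second

overloaded = find_overloaded_services(cpu_load)
offenders = find_offending_flows(overloaded, flow_service, flow_rate)
instructions = build_instructions(offenders, current_limit)  # [(2, 500.0)]
```

Only flow 2 is throttled in this snapshot; flow 1, which uses the same service, and flow 3 keep their limits.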
A possible way to monitor the load based on queue lengths is covered by claim 3. In this way, when data packets are sent as messages from the data plane to the control plane, it is sufficient to monitor the length of the message queues for different services and to determine which queues exceed a certain threshold in order to identify the services that have reached an unacceptable CPU load level.
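A queue-length check of this kind might look as follows. The queue names, the use of `collections.deque` as message queue and the threshold of 100 messages are assumptions made for illustration only.

```python
from collections import deque
from typing import Deque, Dict, List

QUEUE_THRESHOLD = 100  # assumed: maximum acceptable backlog per service


def overloaded_by_queue_length(queues: Dict[str, Deque[bytes]],
                               threshold: int = QUEUE_THRESHOLD) -> List[str]:
    """Return the services whose message queue is longer than the threshold."""
    return sorted(name for name, q in queues.items() if len(q) > threshold)


# Hypothetical state: 150 messages wait for de-fragmentation, 3 for the IGMP proxy.
queues = {
    "defrag": deque([b"pkt"] * 150),
    "igmp_proxy": deque([b"pkt"] * 3),
}
overloaded = overloaded_by_queue_length(queues)  # ["defrag"]
```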
An alternative way to monitor the load, based on processing time, is covered by claim 4. Thus, if the services keep track of the processing time they consume within the control plane, simply interrogating the services and comparing the reported processing times to a threshold will enable the load monitor to identify the services that have reached an unacceptable CPU load level.
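The interrogation of services for their consumed processing time can be sketched as below. The `ServiceTimer` class, the one-second monitoring interval and the 60% share threshold are hypothetical; the text does not prescribe how services account for their processing time.

```python
from typing import Dict, List


class ServiceTimer:
    """Hypothetical per-service accumulator of consumed processing time."""

    def __init__(self) -> None:
        self._busy_seconds = 0.0

    def record(self, seconds: float) -> None:
        """Called by the service after handling a packet."""
        self._busy_seconds += seconds

    def report_and_reset(self) -> float:
        """Called by the load monitor once per monitoring interval."""
        busy, self._busy_seconds = self._busy_seconds, 0.0
        return busy


def overloaded_by_processing_time(timers: Dict[str, ServiceTimer],
                                  interval_s: float,
                                  max_share: float = 0.6) -> List[str]:
    """Services that used more than `max_share` of the last interval."""
    overloaded = []
    for name, timer in timers.items():
        if timer.report_and_reset() / interval_s > max_share:
            overloaded.append(name)
    return overloaded


timers = {"defrag": ServiceTimer(), "igmp_proxy": ServiceTimer()}
timers["defrag"].record(0.7)      # 0.7 s busy in a 1 s interval
timers["igmp_proxy"].record(0.1)  # 0.1 s busy in a 1 s interval
overloaded = overloaded_by_processing_time(timers, interval_s=1.0)  # ["defrag"]
```

Resetting the accumulator on each interrogation keeps the comparison tied to the most recent interval rather than to the lifetime of the service.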
Yet another optional aspect of the current invention is that the identification of individual flows causing overload of the control plane processor might be based on interrogation of a dispatching function inside the control plane. This is defined by claim 5. The dispatching function in the control plane dispatches data packets that are redirected by the data plane to the different services running on the control plane. Once the services that suffer from overload are identified, the dispatcher may be consulted by the overload protection software to map individual flows onto the services in overload state, such that the flows responsible for the overload situation can be identified.
A further optional feature of the present invention is that the overload protection unit in the control plane might be able to identify individual flows for which the rate-limiting conditions in the data plane can be decreased or relaxed. This is covered by claim 6. In other words, the load protection function may open the throttle again and instruct the rate-limiting function in the data plane to return to a higher rate level as soon as the service in trouble settles down to normal operation load.
As indicated by claims 7, 8 and 9, the invention could find its way into different types of network nodes. Examples are access multiplexers like DSLAMs (Digital Subscriber Line Access Multiplexers), fibre aggregators, DLCs (Digital Loop Carriers); server nodes like BRASs (Broadband Remote Access Servers); routing/switching nodes like IP edge routers, ATM switches, etc.
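A dispatcher that can be interrogated in this way might be sketched as follows. The `Dispatcher` class, its method names and the rule that a flow is held responsible when it contributes more than half of the packets of an overloaded service are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List


class Dispatcher:
    """Hypothetical dispatcher holding the flow-to-service mapping."""

    def __init__(self, flow_to_service: Dict[int, str]) -> None:
        self.flow_to_service = dict(flow_to_service)
        self.packet_counts: Counter = Counter()

    def dispatch(self, flow_id: int) -> str:
        """Count the packet and return the service it is delivered to."""
        self.packet_counts[flow_id] += 1
        return self.flow_to_service[flow_id]

    def flows_causing_overload(self, service: str, share: float = 0.5) -> List[int]:
        """Flows contributing more than `share` of the packets sent to `service`."""
        counts = {f: n for f, n in self.packet_counts.items()
                  if self.flow_to_service[f] == service}
        total = sum(counts.values())
        if total == 0:
            return []
        return sorted(f for f, n in counts.items() if n / total > share)


dispatcher = Dispatcher({1: "defrag", 2: "defrag", 3: "igmp_proxy"})
for _ in range(5):
    dispatcher.dispatch(1)
for _ in range(95):
    dispatcher.dispatch(2)
dispatcher.dispatch(3)

culprits = dispatcher.flows_causing_overload("defrag")  # [2]
```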
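One possible policy for tightening and later relaxing a flow's limit is sketched below. The text only states that the limit returns to a higher level once the service settles down; the halving on overload and the stepwise recovery in increments of 25% of the default limit are assumptions of this sketch.

```python
def next_limit(current: float, default: float, overloaded: bool,
               decrease: float = 0.5, step: float = 0.25) -> float:
    """Return the next rate limit for a flow.

    While the service is overloaded the limit is multiplied by `decrease`;
    otherwise it climbs back by `step * default`, never above `default`.
    """
    if overloaded:
        return current * decrease
    return min(default, current + step * default)


default = 1000.0                             # packets per second
limit = next_limit(default, default, True)   # overload: 500.0
limit = next_limit(limit, default, False)    # recovering: 750.0
limit = next_limit(limit, default, False)    # recovered: 1000.0
```

A gradual recovery of this kind avoids immediately re-admitting the full rate of a flow that has only just stopped causing overload.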
The functional block diagram in
Data packets received by the DSLAM, e.g. in upstream direction from a DSL CPE, enter the data plane processor 101 via port 141. The data packets pass through the packet classifier 111 which determines the flow-ID through inspection of the data packet and classifies the packet for further processing. The flow-ID is a unique identification of the stream of packets originating from a certain user and destined to a certain service. In general, the packet classifier 111 distinguishes between a first type of data packets 144 that can be processed entirely in the data plane and a second type of data packets 142 that need dedicated processing that cannot be performed at wire speed in the data plane. The first type of data packets 144 along with their flow-IDs 145 can be handled at wire speed in the data plane and consequently are forwarded to the other data plane functional blocks 113 for being processed there, and forwarded through outbound port 146. These other functional blocks 113 inside the data plane, following the packet classifier 111, can be of numerous types. They can for instance include a packet mangling block (e.g. to do NAT or Network Address Translation), a traffic shaping/scheduling block, etc. The second type of data packets 142 along with their flow-IDs 143 are redirected to the control plane via the rate limiter 112. Typically, they belong to flows that require complex operations, like extensive parsing, encryption/decryption, editing and scheduling, fragmentation and de-fragmentation, validation, etc. Examples are SIP (Session Initiation Protocol) messages that require extensive text parsing, RTP (Real-time Transport Protocol) messages that require extensive monitoring of parameters for quality of service and quality of experience purposes, IP packets that require de-fragmentation, etc.
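The classification step can be illustrated with a short sketch. The `Packet` fields, the representation of the flow-ID as a (user port, protocol) pair and the set of slow path protocols are hypothetical; a real classifier would inspect packet headers.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed set of protocols that need special treatment in the control plane.
SLOW_PATH_PROTOCOLS = {"SIP", "RTP", "IGMP"}


@dataclass(frozen=True)
class Packet:
    user_port: int            # hypothetical: identifies the originating user
    protocol: str             # hypothetical: identifies the destination service
    fragmented: bool = False  # fragmented IP packets need de-fragmentation


def classify(packet: Packet) -> Tuple[Tuple[int, str], str]:
    """Return (flow-ID, path), where path is "fast" or "slow"."""
    flow_id = (packet.user_port, packet.protocol)
    needs_slow_path = packet.fragmented or packet.protocol in SLOW_PATH_PROTOCOLS
    return flow_id, ("slow" if needs_slow_path else "fast")


flow_id, path = classify(Packet(user_port=7, protocol="SIP"))
# flow_id == (7, "SIP"), path == "slow": redirected via the rate limiter
```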
Data packets of such flows are redirected to the control plane processor 102 via connection 147, after it has been checked by the rate limiter 112 in the data plane that the flow does not exceed the maximum acceptable rate (this maximum rate might be expressed as a maximum amount of bits or bytes per time unit, or alternatively as a maximum amount of packets per time unit). In the control plane processor 102, a dispatching function 121 dispatches the redirected data packets to different services, e.g. 122 and 123, that run on the control plane. For instance, IP packets that need to be de-fragmented (like for instance SIP packets) are dispatched via connection 151 to the de-fragmentation service 122, while IGMP packets may be dispatched to the IGMP proxy server 123. Additional services may be included in the control plane, like for instance an RTP quality measurement service, but they are not drawn to avoid overloading the drawing. The dispatcher 121 in other words has the knowledge of the flow-to-service mapping and ensures that all redirected flows are delivered to the correct service running on the control plane. Additionally, the dispatcher 121 monitors the data packet rate of all the redirected flows. This information is essential in relation to the current invention and will be used by the overload protection function 124 as will be explained later on. Key to the current invention is the presence of the overload protection function 124 in the control plane. This function consists of three pieces of software, i.e. load monitoring software 131, flow identification software 132, and rate-limiting instruction software 133. The load monitor 131 basically monitors the CPU load of each of the services, e.g. 122 and 123, running on the control plane. This is represented by the dashed lines 153 and 154 in
Thanks to the explicit feedback from the control plane to the data plane, specifying the flow(s) that actually cause the service(s) on the control plane to suffer from overload, the system illustrated by
Later on, when the de-fragmentation service 122 that was in trouble settles down to normal operation again, the load monitor and flow identification software will open the throttle again: the instruction software 133 issues instructions for the rate limiter 112 to increase the maximum allowable packet rate (or bit rate) for those flows, in order to relax the rate limitations imposed on flows that temporarily caused overload of the control plane processor 102.
Typically, the rate monitoring of flows in the control plane according to the current invention will consume 1 to 2% of the CPU capacity in the control plane processor 102, which is negligible.
Although the present invention has been illustrated by reference to specific embodiments, it will be apparent to those skilled in the art that various changes and modifications may be made within the spirit and scope of the invention. It is therefore contemplated to cover any and all modifications, variations or equivalents that fall within the spirit and scope of the basic underlying principles disclosed and claimed in this patent application. For example,
The maximum values used for static rate-limiting could for instance be made dynamically adjustable on the basis of feedback of the overload monitoring function in the control plane. Existing systems that apply static rate limiting in other words would be upgraded to apply dynamic user based flow control according to the present invention if the control plane is extended with an overload monitor and protection function according to the current invention and the static rate limits in the data plane are made adjustable on the basis of feedback received from the control plane.
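A static per-flow rate limiter made adjustable by feedback could be sketched as follows. The `AdjustableRateLimiter` class, its fixed one-second counting window and the `set_limit` call standing in for the feedback from the control plane are assumptions, not part of the described system.

```python
from typing import Dict


class AdjustableRateLimiter:
    """Hypothetical per-flow packet rate limiter with adjustable limits."""

    def __init__(self, default_pps: int) -> None:
        self.default_pps = default_pps     # formerly static limit
        self.limits: Dict[int, int] = {}   # per-flow overrides from feedback
        self.window: Dict[int, int] = {}   # flow -> second being counted
        self.count: Dict[int, int] = {}    # flow -> packets seen in that second

    def set_limit(self, flow_id: int, pps: int) -> None:
        """Feedback from the control plane: change the limit of one flow."""
        self.limits[flow_id] = pps

    def allow(self, flow_id: int, now_s: float) -> bool:
        """Return True if the packet may be redirected to the control plane."""
        second = int(now_s)
        if self.window.get(flow_id) != second:
            self.window[flow_id] = second
            self.count[flow_id] = 0
        if self.count[flow_id] >= self.limits.get(flow_id, self.default_pps):
            return False
        self.count[flow_id] += 1
        return True


limiter = AdjustableRateLimiter(default_pps=3)
before = sum(limiter.allow(1, 0.0) for _ in range(5))  # 3 of 5 packets pass
limiter.set_limit(1, 1)                                # control plane tightens flow 1
after = sum(limiter.allow(1, 1.0) for _ in range(5))   # 1 of 5 packets passes
```

Other flows keep the default limit, so only the flow named in the feedback is throttled.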
Number | Date | Country | Kind |
---|---|---|---|
05292477.6 | Nov 2005 | EP | regional |