1. Field of the Invention
This invention relates generally to network processor-based devices, and more specifically to an improved equal cost multipath routing and recovery mechanism that enables the routing system to recover more quickly than the routing protocol.
2. Discussion of the Prior Art
In today's networked world, bandwidth is a critical resource. Increasing network traffic, driven by the Internet and other emerging applications, is straining the capacity of network infrastructures. To keep pace, organizations are looking for better technologies and methodologies to support and manage traffic growth and the convergence of voice with data.
The convergence of voice and data will play a large role in defining tomorrow's network environment. Because voice communications will naturally follow the path of lowest cost, voice will inevitably converge with data. Technologies such as Voice over IP (VoIP), Voice over ATM (VoATM), and Voice over Frame Relay (VoFR) are cost-effective alternatives in this changing market. However, to make migration to these technologies possible, the industry has to ensure quality of service (QoS) for voice and determine how to charge for voice transfer over data lines.
Integrating legacy systems is also a crucial concern for organizations as new products and capabilities become available. To preserve their investments in existing equipment and software, organizations demand solutions that allow them to migrate to new technologies without disrupting their current operations.
Eliminating network bottlenecks continues to be a top priority for service providers. Routers are often the source of these bottlenecks. However, network congestion in general is often misdiagnosed as a bandwidth problem and is addressed by seeking higher-bandwidth solutions. Today, manufacturers are recognizing this difficulty. They are turning to network processor technologies to manage bandwidth resources more efficiently and to provide the advanced data services, at wire speed, that are commonly found in routers and network application servers. These services include load balancing, QoS, gateways, firewalls, security, and web caching.
For remote access applications, performance, bandwidth-on-demand, security, and authentication rank as top priorities. The demand for integration of QoS and CoS, integrated voice handling, and more sophisticated security solutions will also shape the designs of future remote access network switches. Further, remote access will have to accommodate an increasing number of physical media, such as ISDN, T1, E1, OC-3 through OC-48, cable, and xDSL modems.
A network processor (herein also referred to as an “NP”) has been defined as a programmable communications integrated circuit capable of performing one or more of the following functions: (1) packet classification, i.e., identifying a packet based on known characteristics such as address or protocol; (2) packet modification, i.e., modifying the packet to comply with IP, ATM, or other protocols; (3) queue/policy management, i.e., the queuing, de-queuing, and scheduling of packets for specific applications; and (4) packet forwarding, i.e., the transmission and receipt of data over a switch fabric and the forwarding or routing of the packet to the appropriate address.
For exemplary purposes, reference is made to FIG. 1.
The CP code base provides support for the Layer 2 and Layer 3 topology protocols and for Layer 4 and Layer 5 network applications and systems management. Examples are protocol support for VLAN, IP, and the Multiprotocol Label Switching (MPLS) standard, and the supporting address- and route-learning algorithms used to maintain topology information.
With particular reference to FIG. 1.
Traditional frame routing capability provided in network processor devices typically utilizes a network routing table having entries which provide a single next hop for each table entry. Commonly-owned U.S. Pat. No. 6,721,800, entitled SYSTEM USING WEIGHTED NEXT HOP OPTION IN ROUTING TABLE TO INCLUDE PROBABILITY OF ROUTING A PACKET FOR PROVIDING EQUAL COST MULTIPATH FORWARDING PACKETS, the whole content and disclosure of which is incorporated by reference as if fully set forth herein, describes a system and method for providing the ability for a network processor to select from multiple next hop options for a single forwarding entry.
FIG. 2(a) depicts an example network processor frame routing scenario 40 and FIG. 2(b) illustrates an example Equal Cost Multipath Forwarding (ECMP) table 50 that may be used to provide a lookup of a next hop address for forwarding packets as described in commonly-owned, co-pending U.S. patent application Ser. No. 09/546,702. Preferably, such a table is employed in a Network Processor (NP) device having packet routing functions such as described in commonly-owned, co-pending U.S. patent application Ser. No. 09/384,691.
Thus, the example ECMP forwarding table 50 illustrated in FIG. 2(b) is particularly implemented in a frame forwarding context for network processor operations. In the example ECMP forwarding table 50, there are provided subnet destination address fields 52, with each forwarding entry including multiple next hop routing information comprising multiple next hop address fields, e.g., fields 60a-60c. Additionally provided in the ECMP routing table is cumulative probability data for each corresponding next hop, such as depicted in action data field 70. Particularly, in the exemplary illustration of the ECMP packet forwarding table 50 of FIG. 2(b), there are included three (3) next hop fields pointing to addresses 9.1.1.1, 8.1.1.1, and 6.1.1.1 associated with a destination subnet address 7.*.*.*. An action data field 70 includes threshold values used to weight the probability of each next hop and is used to determine which next hop will be chosen. In the action field 72 shown in FIG. 2(b), these values are stored as cumulative percentages, with the first cumulative percentage (30%) corresponding to next hop 0, the second cumulative percentage value (80%) corresponding to next hop 1, etc. This means that the likelihood of routing a packet through next hop 0 is 30% (i.e., approximately 30% of traffic for the specified table entry should be routed to next hop 0), and the likelihood of routing a packet through next hop 1 is 50% (i.e., approximately 50% of traffic for the specified table entry should be routed to next hop 1). This technique may be extended to offer as many next hops as desired or feasible.
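By way of illustration only, the following sketch (in C, with hypothetical structure and function names, since the forwarding picocode itself is not reproduced here) shows how such cumulative-percentage thresholds might drive next hop selection:

    #include <stdint.h>
    #include <stdlib.h>

    /* Hypothetical ECMP entry mirroring FIG. 2(b): three next hops with
     * cumulative-percentage thresholds 30, 80, and 100. */
    struct ecmp_entry {
        uint32_t next_hop[3];  /* next hop addresses                  */
        uint8_t  cum_pct[3];   /* cumulative percentages; last is 100 */
        uint8_t  num_hops;
    };

    /* Compare a uniform draw in [0, 100) against the cumulative
     * thresholds: roughly 30% of packets select hop 0, 50% select
     * hop 1, and the remaining 20% select hop 2. */
    static uint32_t choose_next_hop(const struct ecmp_entry *e)
    {
        int draw = rand() % 100;
        for (uint8_t i = 0; i < e->num_hops; i++)
            if (draw < e->cum_pct[i])
                return e->next_hop[i];
        return e->next_hop[e->num_hops - 1]; /* defensive fallback */
    }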
Currently, in such network processing systems, if a destination NP device (hereinafter referred to as a Targetblade or blade) or an interface (such as a port or TargetPort) associated with the target blade and capable of handling the frame type fails, the packet or frame cannot be routed to the correct destination set forth in the ECMP forwarding table. Moreover, it is often the case that the other Network Processors (NPs) in the system will continue to attempt to forward frames through the failed interface/blade until the routing protocol detects the failed link and downloads a new forwarding entry that avoids the failed interface/blade. One such protocol is the Open Shortest Path First (OSPF) protocol, which enables routers to understand the internal network architecture, i.e., within an autonomous network, and to calculate the shortest path from an IP Source Address (SA) to an IP Destination Address (DA). The time for the routing protocol to detect the failed link could be relatively long, and during this period all the data packets routed through the failed interface/blade may be lost.
Consequently, it would be highly desirable to provide a methodology that would enable a routing system to recover more quickly than the routing protocol so as to significantly reduce the occurrence of data packets lost to a failed target interface/blade, with minimal performance penalty.
Accordingly, it is an object of the present invention to provide a network processor with a system that would enable a routing system to recover more quickly than the routing protocol so as to significantly reduce the occurrence of data packets lost to a failed target interface/blade.
It is another object of the present invention to provide, in a network processor system, a method of maintaining the operational status of all the network processors (blades) in the routing system so that packet forwarding issues resulting from a failed interface/blade may be quickly resolved, with minimal performance penalty, and without the loss of data packets routed in the system.
In accordance with the preferred embodiment of the invention, there is provided, for a networking environment including one or more network processing (NP) devices and implementing a routing protocol for routing data packets from source NP devices to destination NP devices via a switch fabric, with each network processing device supporting a number of interface ports, a system and method for enabling a routing system to recover more quickly than the routing protocol so as to significantly reduce the occurrence of data packets lost to a failed target interface/blade. The routing system is enabled to track the operational status of each network processor device, and the operational status of the destination ports supported by each network processor device in the system, and maintains that operational status as a data structure at each network processing device.
Prior to routing packets, an expedient logical determination is made as to the operational status of the target network processing device and target interface port of a current packet to be routed, as represented in the data structure maintained at the source NP device. In this manner, correct routing of packets is ensured with reduced occurrence of lost data packets due to failed target NP devices/ports.
Further features, aspects and advantages of the apparatus and methods of the present invention will become better understood with regard to the following description, appended claims, and accompanying drawings where:
FIG. 2(a) depicts an example network processing scenario 40 including network processors (routers) employing a packet routing table such as an ECMP forwarding table.
FIG. 2(b) illustrates an example ECMP forwarding table for use in a network processor, router or packet switching device according to the example network processing scenario of FIG. 2(a).
A first method of maintaining operational status at the blade/NP level involves implementation of a data structure (hereinafter referred to as opStatus) that is maintained by each NP device. This opStatus data structure includes information representing the operational status of all the network processors (blades) in the routing system and, for example, may comprise a bit vector sixty-four (64) bits in length (in an example system employing 64 NP devices). If the ith bit is set, for instance, then the ith NP/blade is indicated as operational.
In operation, after choosing the next hop according to ECMP rules, the layer-3 forwarding picocode will check the operational status of the NP/blade through which the chosen next hop is reachable. If that NP is not operational, then a different equal-cost next hop (the next hop with the smallest index) that is reachable through an operational NP/blade will be chosen.
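A minimal sketch of this blade-level check (again in C with hypothetical names, as the actual picocode is not reproduced in the text) might be:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical blade-level status vector: bit i set means blade i
     * is operational. */
    static uint64_t opStatus;

    static bool blade_operational(unsigned blade)
    {
        return (opStatus >> blade) & 1u;
    }

    /* After the ECMP rules select hop 'chosen', fall back to the
     * equal-cost next hop with the smallest index whose blade is
     * operational. Returns -1 if no equal-cost hop is reachable. */
    static int select_hop(const unsigned hop_blade[], unsigned num_hops,
                          unsigned chosen)
    {
        if (blade_operational(hop_blade[chosen]))
            return (int)chosen;
        for (unsigned i = 0; i < num_hops; i++)
            if (blade_operational(hop_blade[i]))
                return (int)i;
        return -1;
    }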
This first solution essentially maintains the operational status at the TB (blade)/NP level. In order to extend this solution to an interface/port (TB/TP) level, there needs to be maintained a data structure that is 64×16 bits long, assuming each blade in the example system supports sixteen (16) ports, for instance. Since the opStatus data structure is consulted in the main forwarding path, it must be stored in a fast, expensive memory.
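Such a port-level table might be sketched as follows, assuming one 16-bit word per blade and one bit per port (names hypothetical):

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical port-level status table: bit tp of word tb set means
     * port tp on blade tb is operational (64 x 16 bits of fast memory). */
    static uint16_t portStatus[64];

    static bool port_operational(unsigned tb, unsigned tp)
    {
        return (portStatus[tb] >> tp) & 1u;
    }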
Another solution relies on the assumption that interface/blade failures are rare and that it is unlikely that more than one blade will fail at the same time. The advantage of tracking a single failure is the reduction of the size of the opStatus data structure: this solution requires only 48 bits of expensive high-speed memory, whereas the previous solution required 64×16 bits of such memory. Thus, the following data structure may be maintained in each NP device in the routing system.
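The structure itself is not reproduced in the text; a plausible 48-bit reconstruction, using field names consistent with the discussion below, is:

    #include <stdint.h>

    /* Hypothetical 48-bit single-failure record: three 16-bit fields,
     * consistent with the 0xffff sentinel and the 0x0003/0x0002 DMU
     * example discussed below. */
    struct op_status {
        uint16_t failedBlade;     /* 0xffff means all blades operational */
        uint16_t failedPortMask;  /* mask ANDed with the target port     */
        uint16_t failedPortValue; /* masked port value marking a failure */
    };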
According to this embodiment, the following algorithm is invoked to check whether a given TB, TP is operational:
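The algorithm is likewise not reproduced; based on the behavior described in the next paragraph, and using the structure sketched above, it might read:

    #include <stdbool.h>

    /* Returns FALSE only when the Target Blade matches failedBlade AND
     * the masked Target Port matches failedPortValue; the caller then
     * invokes the ECMP table to re-route. Otherwise the TB/TP is
     * treated as operational. */
    static bool tb_tp_operational(const struct op_status *s,
                                  uint16_t tb, uint16_t tp)
    {
        if (tb == s->failedBlade &&
            (tp & s->failedPortMask) == s->failedPortValue)
            return false;
        return true;
    }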
According to this algorithm, if all blades are operational, the routing of packets throughout the system will continue and no ECMP re-routing is necessary. Only if both the Target Blade is a failed blade AND the result of the bitwise AND operation between the Target Port and failedPortMask is equal to the failedPortValue is a FALSE returned and the ECMP table invoked for re-routing subsequent packets to another TB or TP. If a TRUE is returned, i.e., either the Target Blade is not a failed blade or the result of the bitwise AND operation between the Target Port and failedPortMask is not equal to the failedPortValue, then the packet will still be routed to the destination TB/TP.
It should be understood that this solution may handle individual failures at the port, data move unit (DMU), and blade levels. However, multiple blade failures cannot be handled by this solution. As an example, if all the interfaces in all the blades are operational, then failedBlade will contain the value 0xffff and the values of failedPortMask and failedPortValue will be ignored. If the blade numbered bladeNum is not operational (i.e., all the ports in that blade have failed), then failedBlade will contain bladeNum, failedPortMask will contain the value 0, and failedPortValue will contain the value 0. If the port numbered portNum in the blade numbered bladeNum is not operational, then failedBlade will contain bladeNum, failedPortMask will contain the value 0xffff, and failedPortValue will contain the value portNum. Assuming a blade having four data move units (DMUs) of four ports each, the ports in DMU A have their last (least significant) 2 bits set to 00, the ports in DMU B have their last 2 bits set to 01, the ports in DMU C have their last 2 bits set to 10, and the ports in DMU D have their last 2 bits set to 11. If DMU C were to fail in the blade numbered bladeNum, failedBlade will contain the value bladeNum, failedPortMask will contain the value 0x0003, and failedPortValue will contain the value 0x0002.
In the preferred embodiment, a range is used to represent the failed blades and a mask on the port number is used to represent the set of failed ports. This solution requires only 32 bits of high-speed memory. The following data structure will be maintained in all of the NPs in the preferred embodiment:
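Again, the structure is not reproduced in the text; a plausible 32-bit reconstruction, assuming a blade range plus an 8-bit port mask and value (field names assumed), is:

    #include <stdint.h>

    /* Hypothetical 32-bit record: a range of failed blades plus a
     * mask/value pair over the 8-bit port number. */
    struct op_status32 {
        uint8_t firstFailedBlade; /* low end of the failed-blade range  */
        uint8_t lastFailedBlade;  /* high end of the failed-blade range */
        uint8_t failedPortMask;   /* mask and value both 0xff means     */
        uint8_t failedPortValue;  /* all blades operational (see below) */
    };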
According to this data structure, if failedPortMask and failedPortValue are both 0xff, then all blades will be considered operational. This convention is founded on the assumption that no port is numbered 0xff.
According to this embodiment, the following algorithm is invoked to check whether a given TB, TP is operational:
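A sketch of this range-based check, using the 32-bit structure assumed above, might be:

    #include <stdbool.h>

    /* When failedPortMask and failedPortValue are both 0xff the port
     * test can never match (no port is numbered 0xff), so every TB/TP
     * tests as operational; otherwise FALSE is returned only for a
     * blade in the failed range whose masked port matches
     * failedPortValue. */
    static bool tb_tp_operational32(const struct op_status32 *s,
                                    uint8_t tb, uint8_t tp)
    {
        if (tb >= s->firstFailedBlade && tb <= s->lastFailedBlade &&
            (tp & s->failedPortMask) == s->failedPortValue)
            return false; /* re-route via the ECMP table */
        return true;
    }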
According to this algorithm, if all blades are operational, then both failedPortMask and failedPortValue are set to 0xff and the values of the other fields are ignored. This is a simple test that may be performed in one machine cycle. If the blade numbered bladeNum is not operational (i.e., all the ports in that blade have failed), then, according to this algorithm, the fields are set as follows (a reconstruction, using the field names assumed above):
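    /* Reconstructed assignments: blade bladeNum wholly failed. A mask
     * of 0 matches every port number on that blade. */
    firstFailedBlade = bladeNum;
    lastFailedBlade  = bladeNum;
    failedPortMask   = 0x00;
    failedPortValue  = 0x00;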
However, if the blades numbered, for example, 8, 9, and 10 are not operational, then the fields are set as follows (again a reconstruction):
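    /* Reconstructed assignments: blades 8 through 10 wholly failed. */
    firstFailedBlade = 8;
    lastFailedBlade  = 10;
    failedPortMask   = 0x00;
    failedPortValue  = 0x00;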
If the port numbered portNum in the blade numbered bladeNum is not operational, then, according to this algorithm, the fields are set as follows (a reconstruction):
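    /* Reconstructed assignments: only port portNum on blade bladeNum
     * has failed, so the full port number is compared. */
    firstFailedBlade = bladeNum;
    lastFailedBlade  = bladeNum;
    failedPortMask   = 0xff;
    failedPortValue  = portNum;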
The ports in DMU A have their last (least significant) 2 bits set to 00, the ports in DMU B have their last 2 bits set to 01, the ports in DMU C have their last 2 bits set to 10, and the ports in DMU D have their last 2 bits set to 11. In an example scenario where all the ports in DMU C fail in the blade numbered bladeNum, the fields are set, according to this algorithm, as follows (a reconstruction):
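    /* Reconstructed assignments: all four ports of DMU C (low-order
     * bits 10) on blade bladeNum have failed. */
    firstFailedBlade = bladeNum;
    lastFailedBlade  = bladeNum;
    failedPortMask   = 0x03; /* examine only the two low-order bits */
    failedPortValue  = 0x02; /* binary 10 selects DMU C             */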
While the invention has been particularly shown and described with respect to illustrative and preferred embodiments thereof, it will be understood by those skilled in the art that the foregoing and other changes in form and details may be made therein without departing from the spirit and scope of the invention, which should be limited only by the scope of the appended claims.